1 | Tokenizing |
2 | ---------- |
3 | |
4 | Functions: javaTok, wordTok, javaTokWithBrackets, javaTokNoQuotes, ... |
5 | |
6 | All of these functions take a string and return a list of strings in "CNC" format. |
7 | Counting from 1, a CNC list has whitespace in the odd positions and "real" tokens in the even |
8 | positions. |
9 | |
10 | Add a "C" to any function name to drop the whitespace elements, e.g. "javaTokC". |
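
To make the CNC layout concrete, here is a minimal sketch in plain Java. It is not the library's implementation: cncTok and dropWhitespace are hypothetical names introduced only for this illustration, and the sketch splits on whitespace alone, whereas a real tokenizer such as javaTok presumably also handles quotes, comments and operators.

```java
import java.util.ArrayList;
import java.util.List;

class CncTokSketch {
    // Hypothetical helper (not a library function): splits a string into
    // alternating whitespace / non-whitespace runs. The result always starts
    // and ends with a (possibly empty) whitespace element, so counting from 1,
    // whitespace sits in odd positions and tokens in even positions.
    static List<String> cncTok(String s) {
        List<String> out = new ArrayList<>();
        int i = 0, n = s.length();
        while (true) {
            int j = i;
            while (j < n && Character.isWhitespace(s.charAt(j))) j++;
            out.add(s.substring(i, j));   // whitespace run, may be ""
            if (j >= n) break;
            i = j;
            while (j < n && !Character.isWhitespace(s.charAt(j))) j++;
            out.add(s.substring(i, j));   // "real" token
            i = j;
        }
        return out;
    }

    // Hypothetical counterpart of the "C" variants: keeps only the tokens,
    // which live at 0-based indices 1, 3, 5, ...
    static List<String> dropWhitespace(List<String> cnc) {
        List<String> out = new ArrayList<>();
        for (int k = 1; k < cnc.size(); k += 2) out.add(cnc.get(k));
        return out;
    }

    public static void main(String[] args) {
        List<String> cnc = cncTok("hello  world");
        System.out.println(cnc.size());          // 5 elements: ws, token, ws, token, ws
        System.out.println(dropWhitespace(cnc)); // tokens only
    }
}
```

One consequence of this layout is that joining all elements of a CNC list reproduces the input string exactly, so tokens can be edited in place without losing the original spacing.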
11 | |
12 | Nouns |
13 | ----- |
14 | |
15 | Mech Lists: "Nouns" |
16 | Functions: isNoun |
17 | |
18 | Adjectives |
19 | ---------- |
20 | |
21 | Mech Lists: "Adjectives", "Adjectives and comparatives" |
22 | Functions: isAdjective, adjectiveToAdverb |
23 | Snippets: #1011077 (List of adjectives) |
24 | |
25 | Adverbs |
26 | ------- |
27 | |
28 | Mech Lists: "Adverbs", "Adverbial phrases" |
29 | Functions: ai_dropLeadingAdverbs, adjectiveToAdverb, isAdverb |
30 | |
31 | Determiners |
32 | ----------- |
33 | |
34 | Mech Lists: "Determiners" |
35 | Functions: isDeterminer, ai_groupSimplestNounPhrases |
36 | |
37 | Fillers |
38 | ------- |
39 | |
40 | ("like", "uh", "maybe", ...) |
41 | |
42 | Mech Lists: "Fillers", "Not fillers" (exceptions to the first list) |
43 | Functions: ai_dropFillers, ai_dropFillers_* |
Snippet ID: | #1024082 |
Snippet name: | AI Functions [Overview Document] |
Eternal ID of this version: | #1024082/1 |
Text MD5: | 083f5e6350616af802674b02705024ce |
Author: | stefan |
Category: | javax |
Type: | Document |
Public (visible to everyone): | Yes |
Archived (hidden from active list): | No |
Created/modified: | 2019-07-20 14:39:30 |
Source code size: | 983 bytes / 43 lines |