Tokenizing
----------

Functions: javaTok, wordTok, javaTokWithBrackets, javaTokNoQuotes, ...

All functions take a string and return a string list in "CNC" format. CNC, when counting from 1, has whitespace in odd positions and "real" tokens in even positions.

Add a "C" to any function name to drop the whitespace elements, e.g. "javaTokC".

Nouns
-----

Mech Lists: "Nouns"
Functions: isNoun

Adjectives
----------

Mech Lists: "Adjectives", "Adjectives and comparatives"
Functions: isAdjective, adjectiveToAdverb
Snippets: #1011077 (List of adjectives)

Adverbs
-------

Mech Lists: "Adverbs", "Adverbial phrases"
Functions: ai_dropLeadingAdverbs, adjectiveToAdverb, isAdverb

Determiners
-----------

Mech Lists: "Determiners"
Functions: isDeterminer, ai_groupSimplestNounPhrases

Fillers
-------

("like", "uh", "maybe", ...)

Mech Lists: "Fillers", "Not fillers" (exceptions to the first list)
Functions: ai_dropFillers, ai_dropFillers_*
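
The CNC layout described under Tokenizing can be illustrated with a small sketch. This is not the real javaTok or wordTok: simpleTok and simpleTokC are hypothetical names, and the splitting rule (whitespace vs. non-whitespace only) is an assumption chosen to keep the example short. It only shows where whitespace and "real" tokens end up in the list, and what a "C" variant would drop.

```java
import java.util.ArrayList;
import java.util.List;

class CncSketch {
    // Hypothetical, simplified tokenizer (NOT the real javaTok).
    // Returns a list in CNC layout: whitespace at positions 1, 3, 5, ...
    // (counting from 1) and real tokens at positions 2, 4, 6, ...
    static List<String> simpleTok(String s) {
        List<String> tok = new ArrayList<>();
        int i = 0, n = s.length();
        while (true) {
            // Whitespace element: a possibly empty run of whitespace
            int j = i;
            while (j < n && Character.isWhitespace(s.charAt(j))) j++;
            tok.add(s.substring(i, j));
            i = j;
            if (i >= n) break;
            // Real token: a run of non-whitespace characters
            while (j < n && !Character.isWhitespace(s.charAt(j))) j++;
            tok.add(s.substring(i, j));
            i = j;
        }
        return tok;
    }

    // Hypothetical "C" variant: keeps only the real tokens,
    // i.e. the elements at even positions when counting from 1.
    static List<String> simpleTokC(String s) {
        List<String> tok = simpleTok(s);
        List<String> out = new ArrayList<>();
        for (int k = 1; k < tok.size(); k += 2) out.add(tok.get(k));
        return out;
    }

    public static void main(String[] args) {
        System.out.println(simpleTok("hello  world"));
        System.out.println(simpleTokC("hello  world"));
    }
}
```

In this sketch, simpleTok("hello  world") yields the five elements "", "hello", "  ", "world", "" and simpleTokC yields "hello", "world". The list always starts and ends with a (possibly empty) whitespace element, so joining all elements reproduces the input string.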
Snippet ID: | #1024082 |
Snippet name: | AI Functions [Overview Document] |
Eternal ID of this version: | #1024082/1 |
Text MD5: | 083f5e6350616af802674b02705024ce |
Author: | stefan |
Category: | javax |
Type: | Document |
Public (visible to everyone): | Yes |
Archived (hidden from active list): | No |
Created/modified: | 2019-07-20 14:39:30 |
Source code size: | 983 bytes / 43 lines |