BotCompany Repo | #1024521 - dropPunctuation3 [experimental]

JavaX fragment (include) [tags: use-pretranspiled]

Libraryless. A Pure Java version is available (2326 lines / 15K).

scope dropPunctuation3.

// Single-character tokens that are kept even though they are not letters or digits
static LS #keep = ll("*", "<", ">");
// Cache for the string variant: input string -> result
static SS #cache = defaultSizeMRUCache();

static LS dropPunctuation3(LS tok) {
  tok = new ArrayList<S>(tok); // work on a copy, leave the caller's list untouched
  // Code tokens sit at odd indices; even indices hold the spacing between them
  for (int i = 1; i < tok.size(); i += 2) {
    S t = tok.get(i);
    if (t.length() == 1 && !Character.isLetter(t.charAt(0)) && !Character.isDigit(t.charAt(0)) && !dropPunctuation3_keep.contains(t)) {
      // merge spacing and make sure it's not completely empty
      tok.set(i-1, or2(tok.get(i-1) + tok.get(i+1), " "));
      tok.remove(i); // the punctuation token
      tok.remove(i); // the spacing that followed it (already merged above)
      i -= 2; // step back so the token that moved into position i is examined next
    }
  }
  return tok;
}

sS dropPunctuation3(S s) {
  ret getOrCreate_f0(cache, s,
    () -> join(dropPunctuation3(javaTokNoQuotes(s))));
}

end scope
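
The merging loop above can be tried outside JavaX with a plain Java sketch. It is only an approximation under stated assumptions: simpleTok is a hypothetical stand-in and not javaTokNoQuotes, the inline empty-string check is assumed to match what or2 does, and the MRU cache is left out. The class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class DropPunctuationSketch {
  // Single-character tokens that survive although they are not letters or digits.
  static final List<String> KEEP = Arrays.asList("*", "<", ">");

  // Hypothetical stand-in tokenizer (NOT javaTokNoQuotes). It returns an odd-length
  // list: spacing, token, spacing, ..., spacing. Tokens are runs of letters/digits
  // or one single other character; spacing entries may be empty.
  static List<String> simpleTok(String s) {
    List<String> tok = new ArrayList<>();
    int i = 0, n = s.length();
    while (true) {
      int j = i;
      while (j < n && Character.isWhitespace(s.charAt(j))) j++;
      tok.add(s.substring(i, j)); // spacing slot, possibly ""
      if (j >= n) break;
      i = j;
      if (Character.isLetterOrDigit(s.charAt(j))) {
        while (j < n && Character.isLetterOrDigit(s.charAt(j))) j++;
      } else {
        j++; // any other character becomes a one-character token
      }
      tok.add(s.substring(i, j));
      i = j;
    }
    return tok;
  }

  // Same loop shape as the snippet: drop one-character punctuation tokens
  // and merge the spacing on both sides of each dropped token.
  static List<String> dropPunctuation(List<String> tokens) {
    List<String> tok = new ArrayList<>(tokens);
    for (int i = 1; i < tok.size(); i += 2) {
      String t = tok.get(i);
      if (t.length() == 1 && !Character.isLetterOrDigit(t.charAt(0)) && !KEEP.contains(t)) {
        String merged = tok.get(i - 1) + tok.get(i + 1);
        // Assumed or2 behaviour: fall back to " " when the merged spacing is empty.
        tok.set(i - 1, merged.isEmpty() ? " " : merged);
        tok.remove(i); // the punctuation token
        tok.remove(i); // the spacing after it
        i -= 2;        // re-check the token that moved into position i
      }
    }
    return tok;
  }

  static String dropPunctuation(String s) {
    return String.join("", dropPunctuation(simpleTok(s)));
  }

  public static void main(String[] args) {
    System.out.println("[" + dropPunctuation("Hello, world!") + "]"); // [Hello world ]
    System.out.println("[" + dropPunctuation("x-y") + "]");           // [x y]
    System.out.println("[" + dropPunctuation("a<b>*c") + "]");        // [a<b>*c]
  }
}
```

The "x-y" case shows why the fallback to a single space matters: without it the two neighbouring words would fuse into "xy" once the hyphen is removed. A dropped token at the end of the input leaves a trailing space for the same reason.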

Author comment

Began life as a copy of #1000814

Travelled to 2 computer(s): mqqgnosmbjvj, tvejysmllsmz

Snippet ID: #1024521
Snippet name: dropPunctuation3 [experimental]
Eternal ID of this version: #1024521/6
Text MD5: 25f30e08967e15026c63101ac92c444d
Transpilation MD5: 26bdc9c46e39aa7ad3cb8ccc3f975194
Author: stefan
Category:
Type: JavaX fragment (include)
Public (visible to everyone): Yes
Archived (hidden from active list): No
Created/modified: 2020-01-12 01:16:53
Source code size: 729 bytes / 26 lines
Pitched / IR pitched: No / No
Views / Downloads: 29 / 58
Version history: 5 change(s)

Formerly at http://tinybrain.de/1024521 & http://1024521.tinybrain.de