BotCompany Repo | #1003077 // nlTok2

JavaX fragment (include)

// reduced NL parsing without quoted strings and some other stuff
// Returns alternating tokens: even indices hold whitespace (possibly ""),
// odd indices hold words, digit runs or single other characters.

static List<String> nlTok2(String s) {
  List<String> tok = new ArrayList<String>();
  int l = s.length();
  
  int i = 0;
  while (i < l) {
    int j = i;
    char c;
    
    // scan for whitespace
    while (j < l) {
      c = s.charAt(j);
      if (c == ' ' || c == '\t' || c == '\r' || c == '\n')
        ++j;
      else
        break;
    }
    
    // whitespace token (empty if none was found)
    tok.add(s.substring(i, j));
    i = j;
    if (i >= l) break;
    c = s.charAt(i);

    // scan for non-whitespace
    if (Character.isJavaIdentifierStart(c))
      // word: letters, digits, '_' and '$' (apostrophes are not included)
      do ++j; while (j < l && (Character.isJavaIdentifierPart(s.charAt(j)) /*|| s.charAt(j) == '\''*/));
    else if (Character.isDigit(c))
      // run of digits
      do ++j; while (j < l && Character.isDigit(s.charAt(j)));
    else
      // any other character becomes a token of its own
      ++j;

    tok.add(s.substring(i, j));
    i = j;
  }
  
  // pad so the list always ends with a whitespace token (odd size)
  if ((tok.size() % 2) == 0) tok.add("");
  return tok;
}
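
To illustrate the token layout the function produces, here is a minimal sketch that wraps a copy of nlTok2 in a plain Java class. The class name NlTok2Demo and the helper wordsOnly are hypothetical and not part of the snippet; the explicit imports are only needed because this is plain Java rather than a JavaX fragment.

```java
import java.util.ArrayList;
import java.util.List;

public class NlTok2Demo {
  // Copy of the snippet's function so the demo compiles as plain Java.
  static List<String> nlTok2(String s) {
    List<String> tok = new ArrayList<String>();
    int l = s.length();
    int i = 0;
    while (i < l) {
      int j = i;
      char c;
      // scan for whitespace
      while (j < l) {
        c = s.charAt(j);
        if (c == ' ' || c == '\t' || c == '\r' || c == '\n') ++j;
        else break;
      }
      tok.add(s.substring(i, j));
      i = j;
      if (i >= l) break;
      c = s.charAt(i);
      // scan for non-whitespace
      if (Character.isJavaIdentifierStart(c))
        do ++j; while (j < l && Character.isJavaIdentifierPart(s.charAt(j)));
      else if (Character.isDigit(c))
        do ++j; while (j < l && Character.isDigit(s.charAt(j)));
      else
        ++j;
      tok.add(s.substring(i, j));
      i = j;
    }
    if ((tok.size() % 2) == 0) tok.add("");
    return tok;
  }

  // Hypothetical helper (not in the snippet): keeps only the odd-indexed
  // entries, i.e. the non-whitespace tokens.
  static List<String> wordsOnly(List<String> tok) {
    List<String> words = new ArrayList<String>();
    for (int i = 1; i < tok.size(); i += 2) words.add(tok.get(i));
    return words;
  }

  public static void main(String[] args) {
    String input = "hello world 42!";
    List<String> tok = nlTok2(input);
    System.out.println(wordsOnly(tok));                       // prints [hello, world, 42, !]
    System.out.println(String.join("", tok).equals(input));   // prints true
  }
}
```

Because every character of the input lands in exactly one token, joining the list with an empty separator gives back the original string. Note that an apostrophe is split off as its own token, so "don't" yields the words don, ' and t.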

Author comment

Began life as a copy of #1000769


Travelled to 13 computer(s): aoiabmzegqzx, bhatertpkbcr, cbybwowwnfue, cfunsshuasjs, gwrvuhgaqvyk, ishqpsrjomds, lpdgvwnxivlt, mqqgnosmbjvj, pyentgdyhuwx, pzhvpgtvlbxg, tslmcundralx, tvejysmllsmz, vouqrxazstgt


Snippet ID: #1003077
Snippet name: nlTok2
Eternal ID of this version: #1003077/1
Text MD5: 7eaf24efc9dee3d1dba4c681eb077098
Author: stefan
Category:
Type: JavaX fragment (include)
Public (visible to everyone): Yes
Archived (hidden from active list): No
Created/modified: 2016-04-27 21:48:32
Source code size: 954 bytes / 40 lines
Pitched / IR pitched: No / No
Views / Downloads: 512 / 626