BotCompany Repo | #1034538 // singlePredTok - tokenizer based on a single predicate

JavaX fragment (include) [tags: use-pretranspiled]

Transpiled version (103L) is out of date.

static LS lambdaMapLike singlePredTok(ICharPred isPartOfToken, S s) {
  new ArrayList<S> tok;
  int l = s == null ? 0 : s.length();
  
  int i = 0;
  while (i < l) {
    int j = i;
    char c;
    
    // scan the run of characters that are NOT part of a token (may be empty)
    while (j < l) {
      c = s.charAt(j);
      if (isPartOfToken.get(c))
        break;
      else
        ++j;
    }
    
    tok.add(substring(s, i, j));
    i = j;
    if (i >= l) break;

    // scan the token itself; s.charAt(i) is known to satisfy the predicate here
    do
      ++j;
    while (j < l && isPartOfToken.get(s.charAt(j)));

    tok.add(substring(s, i, j));
    i = j;
  }
  
  // keep the list odd-sized so that it both starts and ends with a non-token entry
  if ((tok.size() % 2) == 0) tok.add("");
  ret tok;
}
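
The following is a rough plain-Java sketch of the same loop, included only to illustrate the shape of the returned list. It is not the transpiled output of this snippet. CharPred is a hypothetical stand-in for JavaX's ICharPred, SinglePredTokDemo is an illustrative class name, and String.substring is assumed to behave like the substring helper for in-range indices.

```java
import java.util.ArrayList;
import java.util.List;

class SinglePredTokDemo {
    /** Hypothetical stand-in for JavaX's ICharPred (assumption, not the real interface). */
    interface CharPred {
        boolean get(char c);
    }

    /**
     * Splits s into alternating non-token / token strings.
     * Even indices hold non-token runs (possibly empty), odd indices hold tokens.
     */
    static List<String> singlePredTok(CharPred isPartOfToken, String s) {
        List<String> tok = new ArrayList<>();
        int l = s == null ? 0 : s.length();
        int i = 0;
        while (i < l) {
            int j = i;

            // skip characters that are not part of a token
            while (j < l && !isPartOfToken.get(s.charAt(j))) ++j;
            tok.add(s.substring(i, j));
            i = j;
            if (i >= l) break;

            // consume the token; the character at i already satisfies the predicate
            do ++j; while (j < l && isPartOfToken.get(s.charAt(j)));
            tok.add(s.substring(i, j));
            i = j;
        }

        // pad so the list always has odd size
        if (tok.size() % 2 == 0) tok.add("");
        return tok;
    }

    public static void main(String[] args) {
        // Letters and digits count as token characters in this example.
        List<String> tok = singlePredTok(Character::isLetterOrDigit, "hello, world!");
        for (int k = 0; k < tok.size(); k++)
            System.out.println(k + ": \"" + tok.get(k) + "\"");
    }
}
```

With this predicate, "hello, world!" yields five entries: an empty leading entry, "hello", ", ", "world", and "!". A null or empty input yields a single empty string.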

Author comment

Began life as a copy of #1034375

Snippet ID: #1034538
Snippet name: singlePredTok - tokenizer based on a single predicate
Eternal ID of this version: #1034538/3
Text MD5: fb138ae4faae2065aa0d2726f5767ed4
Author: stefan
Category: javax
Type: JavaX fragment (include)
Public (visible to everyone): Yes
Archived (hidden from active list): No
Created/modified: 2022-02-11 21:38:04
Source code size: 672 bytes / 35 lines
Pitched / IR pitched: No / No
Views / Downloads: 155 / 215
Version history: 2 change(s)