
BotCompany Repo | #1006107 // ungroupCharacters - superseded by ocr_ungroupCharacters

JavaX fragment (include)

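// Splits s into a list of units: a plain or backslash-escaped character,
// the contents of a [group] (brackets dropped), or a {symbol} (braces kept).
// TODO: return groups including the brackets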
static L<S> ungroupCharacters(S s) {
  new L<S> l;
  for i over s:
    if (s.charAt(i) == '\\' && i+1 < l(s)) { // parse escaped character
      ++i;
      l.add(substring(s, i, i+1));
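    } else if (s.charAt(i) == '[') { // parse group
      int j = i+1; // start after '[' so the opening bracket is not stored
      new StringBuilder buf;
      while (j < l(s) && s.charAt(j) != ']') {
        if (s.charAt(j) == '\\' && j+1 < l(s)) ++j; // skip the backslash, keep the escaped character
        buf.append(s.charAt(j++));
      }
      l.add(str(buf));
      i = j; // j is at ']' (or at the end of s); the loop increment steps past it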
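    } else if (s.charAt(i) == '{') { // parse symbol
      int j = i; // start at '{' so the opening brace is stored
      new StringBuilder buf;
      while (j < l(s) && s.charAt(j) != '}') {
        if (s.charAt(j) == '\\' && j+1 < l(s)) ++j; // handle escaped characters
        buf.append(s.charAt(j++));
      }
      buf.append('}'); // appended even if s ended before a closing '}'
      l.add(str(buf));
      i = j; // j is at '}' (or at the end of s); the loop increment steps past it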
    } else
      l.add(substring(s, i, i+1));
  ret l;
}
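
For readers unfamiliar with JavaX shorthand, the fragment above can be sketched in plain Java roughly as follows. This is an illustrative rendering, not the transpiler's actual output: the class name UngroupSketch and the variable names are hypothetical, and it assumes that `l(s)` is the string length, `str(buf)` is `buf.toString()`, and `for i over s` iterates over indices 0 to length-1.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical class name; a plain-Java sketch of the JavaX fragment's logic.
class UngroupSketch {
  static List<String> ungroupCharacters(String s) {
    List<String> out = new ArrayList<>();
    int n = s.length();
    for (int i = 0; i < n; i++) {
      char c = s.charAt(i);
      if (c == '\\' && i + 1 < n) {
        // Escaped character: drop the backslash, keep the next character.
        out.add(String.valueOf(s.charAt(++i)));
      } else if (c == '[') {
        // Group: collect everything up to ']' without the brackets.
        int j = i + 1;
        StringBuilder buf = new StringBuilder();
        while (j < n && s.charAt(j) != ']') {
          if (s.charAt(j) == '\\' && j + 1 < n) ++j;
          buf.append(s.charAt(j++));
        }
        out.add(buf.toString());
        i = j;
      } else if (c == '{') {
        // Symbol: collect everything up to '}' and keep both braces.
        int j = i;
        StringBuilder buf = new StringBuilder();
        while (j < n && s.charAt(j) != '}') {
          if (s.charAt(j) == '\\' && j + 1 < n) ++j;
          buf.append(s.charAt(j++));
        }
        buf.append('}');
        out.add(buf.toString());
        i = j;
      } else {
        // Plain character.
        out.add(String.valueOf(c));
      }
    }
    return out;
  }

  public static void main(String[] args) {
    // Input characters: a [ b c ] { s y m } \ [ d
    System.out.println(ungroupCharacters("a[bc]{sym}\\[d"));
    // prints [a, bc, {sym}, [, d]
  }
}
```

Note the asymmetry the TODO refers to: a group is returned without its brackets, while a symbol is returned with its braces.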




Snippet ID: #1006107
Snippet name: ungroupCharacters - superseded by ocr_ungroupCharacters
Eternal ID of this version: #1006107/1
Text MD5: a44b41b881ddb193f39940794cec9beb
Author: stefan
Category: javax / ocr
Type: JavaX fragment (include)
Public (visible to everyone): Yes
Archived (hidden from active list): No
Created/modified: 2016-12-26 00:11:53
Source code size: 923 bytes / 30 lines
Pitched / IR pitched: No / No
Views / Downloads: 529 / 532
Referenced in: #1006254 - ocr_parseGroups - see: ocr_escapeMeaning