static L<S> splitIntoSentences(S s) {
  new L<S> sentences;
  for (S sentence : splitIntoSentences_split(s)) {
    char first = sentence.charAt(0);
    if (Character.isLowerCase(first) || ",;:=".indexOf(first) >= 0) continue;
    if (!hasCharacters(sentence)) continue;
    sentences.add(sentence);
  }
  ret sentences;
}

static L<S> splitIntoSentences_split(S s) {
  L<S> tok = javaTok(s); // To parse quoted things
  simpleSpaces(tok);
  new L<S> list;
  int i = 0;
  while (true) {
    int j = i;
    do {
      j = indexOfAny(tok, j+1, ".", "?");
      if (j < 0) return list;
    } while (j+1 < tok.size()-1 && tok.get(j+1).equals("")); // matches stuff like "9.5"
    S sentence = join(tok.subList(i, j+1)).trim();
    if (sentence.length() > 1) list.add(sentence);
    i = j+1;
  }
}
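The snippet relies on JavaX shorthand (L<S>, S, ret) and on helper functions (javaTok, simpleSpaces, indexOfAny, hasCharacters, join) that are not defined on this page. For readers without the JavaX toolchain, here is a rough plain-Java sketch of the same idea. It is an approximation, not the author's implementation: it scans characters instead of Java tokens, so it does not protect periods inside quoted strings, and the hypothetical helper hasLetter is only a guess at what hasCharacters checks. The class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class SentenceSplitSketch {

  // Cut after every '.' or '?' that is followed by whitespace or end of input.
  // A terminator directly followed by another character (as in "9.5") is skipped.
  // Like the original, trailing text without a terminator is dropped.
  public static List<String> split(String s) {
    List<String> out = new ArrayList<>();
    int start = 0;
    for (int i = 0; i < s.length(); i++) {
      char c = s.charAt(i);
      if (c != '.' && c != '?') continue;
      boolean atEnd = i + 1 == s.length();
      if (!atEnd && !Character.isWhitespace(s.charAt(i + 1))) continue;
      String sentence = s.substring(start, i + 1).trim();
      if (sentence.length() > 1) out.add(sentence);
      start = i + 1;
    }
    return out;
  }

  // Assumption: stands in for JavaX's hasCharacters, whose exact
  // semantics are not shown in the snippet.
  static boolean hasLetter(String s) {
    return s.chars().anyMatch(Character::isLetter);
  }

  // Keep only fragments that look like real sentences: they must not start
  // with a lowercase letter or one of ",;:=" and must contain a letter.
  public static List<String> splitIntoSentences(String s) {
    List<String> sentences = new ArrayList<>();
    for (String sentence : split(s)) {
      char first = sentence.charAt(0);
      if (Character.isLowerCase(first) || ",;:=".indexOf(first) >= 0) continue;
      if (!hasLetter(sentence)) continue;
      sentences.add(sentence);
    }
    return sentences;
  }

  public static void main(String[] args) {
    // prints [It costs 9.5 dollars., Yes.]
    System.out.println(splitIntoSentences("It costs 9.5 dollars. is that ok? Yes."));
  }
}
```

In the example call, "9.5" is not split because the period is followed directly by a digit, and "is that ok?" is discarded because it starts with a lowercase letter.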
Snippet ID: #1007652
Snippet name: splitIntoSentences
Eternal ID of this version: #1007652/2
Text MD5: a6f583af3e90778edadc0da42d9796f2
Author: stefan
Category: javax / parsing
Type: JavaX fragment (include)
Public (visible to everyone): Yes
Archived (hidden from active list): No
Created/modified: 2017-03-30 19:46:59
Source code size: 830 bytes / 29 lines
Pitched / IR pitched: No / No
Views / Downloads: 572 / 595
Version history: 1 change(s)