Uses 865K of libraries. A Pure Java version is also available (7845L/54K/184K).
!7

p {
  maxConsoleChars(10000);
  new LinkedHashMap<U, S> sentences; // sentence => title
  int n = 0, max = 1000;
  for (WikiPage page : indexedSimpleWikipedia_allPages()) {
    for (S s : sentencesAfterLines(page.text))
      putIfNotThere(sentences, toU(s), page.title);
    if ((++n % 1000) == 0) print(n + " / " + l(sentences));
    if (n >= max) break;
  }

  Pair<L<U>> p = splitAccordingToPredicate(f mechList_isEnglishSentence_u, keys(sentences));

  printAsciiHeading("Non-sentences");
  pnl(keysToLinkedHashMap(sentences, p.a));

  printAsciiHeading("Sentences");
  pnl(keysToLinkedHashMap(sentences, p.b));
}
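For readers without the JavaX helper library, the general shape of the snippet (collect each sentence once, remember the first page it came from, then split the collection by a predicate) can be sketched in plain Java. This is only an illustration under stated assumptions, not the Pure Java transpilation: `Page`, `collect`, `partition` and `looksLikeSentence` are hypothetical names, the regex-based sentence splitter is a naive stand-in for `sentencesAfterLines`, and the capital-letter/end-punctuation check is a stand-in for `mechList_isEnglishSentence_u`, whose real logic is not shown here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

class SentenceCollector {
  // Hypothetical stand-in for a wiki page: just a title and its text.
  static class Page {
    final String title, text;
    Page(String title, String text) { this.title = title; this.text = text; }
  }

  // Maps each distinct sentence to the title of the first page containing it.
  // LinkedHashMap keeps insertion order; at most maxPages pages are read.
  static Map<String, String> collect(List<Page> pages, int maxPages) {
    Map<String, String> sentences = new LinkedHashMap<>();
    int n = 0;
    for (Page page : pages) {
      // Naive splitter (assumption): break after . ! ? followed by whitespace, or at line breaks.
      for (String s : page.text.split("(?<=[.!?])\\s+|\\n+")) {
        s = s.trim();
        if (!s.isEmpty()) sentences.putIfAbsent(s, page.title);
      }
      if (++n >= maxPages) break;
    }
    return sentences;
  }

  // Placeholder heuristic (assumption): starts with an uppercase letter and ends with . ! or ?
  static boolean looksLikeSentence(String s) {
    if (s.isEmpty()) return false;
    char last = s.charAt(s.length() - 1);
    return Character.isUpperCase(s.charAt(0)) && (last == '.' || last == '!' || last == '?');
  }

  // Splits the map in two: index 0 holds entries whose key fails the predicate,
  // index 1 holds entries whose key passes it. Order is preserved in both.
  static List<Map<String, String>> partition(Map<String, String> sentences, Predicate<String> pred) {
    Map<String, String> rejected = new LinkedHashMap<>();
    Map<String, String> accepted = new LinkedHashMap<>();
    for (Map.Entry<String, String> e : sentences.entrySet())
      (pred.test(e.getKey()) ? accepted : rejected).put(e.getKey(), e.getValue());
    List<Map<String, String>> out = new ArrayList<>();
    out.add(rejected);
    out.add(accepted);
    return out;
  }

  public static void main(String[] args) {
    List<Page> pages = Arrays.asList(
        new Page("Cat", "Cats purr. See also"),
        new Page("Dog", "Dogs bark. Cats purr."));
    Map<String, String> all = collect(pages, 1000);
    List<Map<String, String>> parts = partition(all, SentenceCollector::looksLikeSentence);
    System.out.println("Non-sentences: " + parts.get(0));
    System.out.println("Sentences: " + parts.get(1));
  }
}
```

Using `putIfAbsent` on a `LinkedHashMap` mirrors the role `putIfNotThere` appears to play in the snippet: a sentence that occurs on several pages keeps the title of the first page where it was seen.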
Began life as a copy of #1014173
Travelled to 13 computer(s): aoiabmzegqzx, bhatertpkbcr, cbybwowwnfue, cfunsshuasjs, gwrvuhgaqvyk, ishqpsrjomds, lpdgvwnxivlt, mqqgnosmbjvj, pyentgdyhuwx, pzhvpgtvlbxg, tslmcundralx, tvejysmllsmz, vouqrxazstgt
Snippet ID: | #1014177 |
Snippet name: | Collect all sentences from 1000 Simple Wikipedia pages |
Eternal ID of this version: | #1014177/9 |
Text MD5: | 13e3469e6546cd4c7bb3ab9e08966571 |
Transpilation MD5: | 19cb66ab8562ac17bfaca3c3e6479c43 |
Author: | stefan |
Category: | javax / a.i. / networking |
Type: | JavaX source code |
Public (visible to everyone): | Yes |
Archived (hidden from active list): | No |
Created/modified: | 2018-04-15 23:37:56 |
Source code size: | 654 bytes / 22 lines |
Pitched / IR pitched: | No / No |
Views / Downloads: | 374 / 495 |
Version history: | 8 change(s) |
Referenced in: | -