Libraryless. A Pure Java version is also available (1706L/13K/41K).
!752

p {
  File odt = new File(userHome(), "Documents/super-state.odt");

  // unwrapContainerTags makes editable lists
  L<L<S>> paragraphs = unwrapContainerTags(paragraphsFromODT(odt));

  // drop leading empty strings and soft page breaks from each paragraph
  paragraphs = map(paragraphs, func(L<S> p) { dropListPrefix(p, "", "<text:soft-page-break/>") });

  // split paragraphs at line breaks: the head stays in p,
  // the tail becomes a new paragraph right after it
  // (the tail is then scanned in the next loop iteration)
  for i over paragraphs: {
    L<S> p = paragraphs.get(i);
    int idx;
    while ((idx = p.indexOf("<text:line-break/>")) >= 0) {
      paragraphs.add(i+1, newSubList(p, idx+1));
      removeSubList(p, idx, p.size()); // keep the head; drop the break and the tail
    }
  }

  // nb: we might need to remove formatting if user used any
  // (not doing that yet)

  L<S> lines = map(paragraphs, func(L<S> p) { join(" # ", p) });
  lines = map(lines, func(S s) { trim(htmldecode(s)) });
  paragraphs = groupParagraphs(lines);
  lines = map(paragraphs, func(L<S> p) { fromLines(p) });
  psl(lines);
}

// consecutive non-empty lines form one paragraph
static L<L<S>> groupParagraphs(L<S> lines) {
  ret groupNonEmpty(lines, func(S line) { empty(line) });
}
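For readers not familiar with JavaX, the two list-processing steps of the snippet (splitting a paragraph at `<text:line-break/>` and grouping non-empty lines into paragraphs) can be sketched in plain Java. This is only an illustration, not the transpiled output: the class name `OdtTextSketch` and both method names are hypothetical, and it assumes that `groupNonEmpty` treats empty lines as separators between groups.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class OdtTextSketch {
    static final String LINE_BREAK = "<text:line-break/>";

    // Splits one paragraph (a list of tokens) into several paragraphs,
    // starting a new one at every line-break token.
    static List<List<String>> splitAtLineBreaks(List<String> tokens) {
        List<List<String>> out = new ArrayList<>();
        List<String> current = new ArrayList<>();
        for (String t : tokens) {
            if (t.equals(LINE_BREAK)) {
                out.add(current);
                current = new ArrayList<>();
            } else {
                current.add(t);
            }
        }
        out.add(current);
        return out;
    }

    // Groups consecutive non-empty lines; empty lines only separate groups
    // (assumed behaviour of groupNonEmpty in the snippet).
    static List<List<String>> groupNonEmptyLines(List<String> lines) {
        List<List<String>> groups = new ArrayList<>();
        List<String> current = new ArrayList<>();
        for (String line : lines) {
            if (line.isEmpty()) {
                if (!current.isEmpty()) {
                    groups.add(current);
                    current = new ArrayList<>();
                }
            } else {
                current.add(line);
            }
        }
        if (!current.isEmpty()) groups.add(current);
        return groups;
    }

    public static void main(String[] args) {
        // prints [[a], [b, c]]
        System.out.println(splitAtLineBreaks(Arrays.asList("a", LINE_BREAK, "b", "c")));
        // prints [[one, two], [three]]
        System.out.println(groupNonEmptyLines(Arrays.asList("one", "two", "", "three")));
    }
}
```

Building new lists instead of modifying the input while iterating avoids index bookkeeping, at the cost of copying the tokens once.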
Began life as a copy of #1005089
| Snippet ID: | #1005092 | 
| Snippet name: | Extract raw text from .odt | 
| Eternal ID of this version: | #1005092/1 | 
| Text MD5: | b93af5d191c85759438c48cb1a36bf42 | 
| Transpilation MD5: | 1711e685cb367b3381bdc4328a98c839 | 
| Author: | stefan | 
| Category: | javax / loading | 
| Type: | JavaX source code | 
| Public (visible to everyone): | Yes | 
| Archived (hidden from active list): | No | 
| Created/modified: | 2016-10-16 17:23:14 | 
| Source code size: | 996 bytes / 35 lines | 
| Pitched / IR pitched: | No / No | 
| Views / Downloads: | 830 / 960 | 