!752

p {
  File odt = new File(userHome(), "Documents/super-state.odt");

  // unwrapContainerTags makes editable lists
  L<L<S>> paragraphs = unwrapContainerTags(paragraphsFromODT(odt));

  // drop leading empty strings and soft page breaks from each paragraph
  paragraphs = map(paragraphs, func(L<S> p) {
    dropListPrefix(p, "", "<text:soft-page-break/>")
  });

  // split paragraphs at line breaks
  for i over paragraphs: {
    L<S> p = paragraphs.get(i);
    int idx;
    while ((idx = p.indexOf("<text:line-break/>")) >= 0) {
      // the part after the line break becomes a new paragraph...
      paragraphs.add(i+1, newSubList(p, idx+1));
      // ...and p keeps only the part before it
      removeSubList(p, idx, p.size());
    }
  }

  // nb: we might need to remove formatting if user used any
  // (not doing that yet)
  L<S> lines = map(paragraphs, func(L<S> p) { join(" # ", p) });
  lines = map(lines, func(S s) { trim(htmldecode(s)) });

  paragraphs = groupParagraphs(lines);
  lines = map(paragraphs, func(L<S> p) { fromLines(p) });

  psl(lines);
}

// groups consecutive non-empty lines; empty lines separate groups
static L<L<S>> groupParagraphs(L<S> lines) {
  ret groupNonEmpty(lines, func(S line) { empty(line) });
}
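
The two list-manipulation steps above (splitting paragraphs at line breaks, then grouping non-empty lines) can be sketched in plain Java. This is only an illustration under assumptions: the class name OdtTextSketch and both method bodies are hypothetical, and groupNonEmpty here is a guess at what the JavaX library function of the same name does (empty lines act as separators), since its source is not part of this snippet.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical plain-Java sketch; not the JavaX library implementation.
public class OdtTextSketch {
    static final String LINE_BREAK = "<text:line-break/>";

    // Splits every paragraph at each line-break token; the token itself is dropped.
    static List<List<String>> splitAtLineBreaks(List<List<String>> paragraphs) {
        List<List<String>> out = new ArrayList<>();
        for (List<String> p : paragraphs) {
            List<String> current = new ArrayList<>();
            for (String token : p) {
                if (token.equals(LINE_BREAK)) {
                    out.add(current);
                    current = new ArrayList<>();
                } else {
                    current.add(token);
                }
            }
            out.add(current);
        }
        return out;
    }

    // Assumed behavior: consecutive non-empty lines form one group,
    // and empty lines only separate groups.
    static List<List<String>> groupNonEmpty(List<String> lines) {
        List<List<String>> groups = new ArrayList<>();
        List<String> current = new ArrayList<>();
        for (String line : lines) {
            if (line.isEmpty()) {
                if (!current.isEmpty()) {
                    groups.add(current);
                    current = new ArrayList<>();
                }
            } else {
                current.add(line);
            }
        }
        if (!current.isEmpty()) groups.add(current);
        return groups;
    }

    public static void main(String[] args) {
        List<List<String>> paragraphs = new ArrayList<>();
        paragraphs.add(Arrays.asList("a", LINE_BREAK, "b"));
        System.out.println(splitAtLineBreaks(paragraphs));                  // prints [[a], [b]]
        System.out.println(groupNonEmpty(Arrays.asList("x", "y", "", "z"))); // prints [[x, y], [z]]
    }
}
```

Building a fresh output list avoids inserting into the list being iterated, which is where the in-place loop is easiest to get wrong.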
Began life as a copy of #1005089
Snippet ID: #1005092
Snippet name: Extract raw text from .odt
Eternal ID of this version: #1005092/1
Text MD5: b93af5d191c85759438c48cb1a36bf42
Transpilation MD5: 1711e685cb367b3381bdc4328a98c839
Author: stefan
Category: javax / loading
Type: JavaX source code
Public (visible to everyone): Yes
Archived (hidden from active list): No
Created/modified: 2016-10-16 17:23:14
Source code size: 996 bytes / 35 lines
Pitched / IR pitched: No / No
Views / Downloads: 610 / 701