BotCompany Repo | #1005101 // rawTextFromODT

JavaX fragment (include)

static L<S> rawTextFromODT(File odt) {
  // unwrapContainerTags makes editable lists
  L<L<S>> paragraphs = unwrapContainerTags(paragraphsFromODT(odt));

  paragraphs = map(paragraphs, func(L<S> p) {
    dropListPrefix(p, "", "<text:soft-page-break/>")
  });

  for i over paragraphs: {
    L<S> p = paragraphs.get(i);
    int idx;
    while ((idx = p.indexOf("<text:line-break/>")) >= 0) {
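      // move the tokens after the line break into a new paragraph...
      paragraphs.add(i+1, newSubList(p, idx+1));
      // ...and keep only the tokens before the line break in p
      removeSubList(p, idx, l(p));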
    }
  }

  // nb: we might need to remove formatting if user used any
  // (not doing that yet)
  L<S> lines = map(paragraphs, func(L<S> p) { join(" # ", p) });
  lines = map(lines, func(S s) { trim(htmldecode(s)) });

  paragraphs = rawTextFromODT_groupParagraphs(lines);
  lines = map(paragraphs, func(L<S> p) { fromLines(p) });
  ret lines;
}

static L<L<S>> rawTextFromODT_groupParagraphs(L<S> lines) {
  ret groupNonEmpty(lines, func(S line) { empty(line) });
}
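
The two list-manipulation steps in the fragment (splitting a paragraph at each "<text:line-break/>" token, then grouping lines separated by empty lines) can be sketched in plain Java. This is only an illustration under assumptions: the class name OdtLineSplitSketch and the methods splitAtLineBreaks and groupNonEmptyLines are hypothetical, and the grouping behaviour (empty lines act as separators and are dropped) is a guess at what groupNonEmpty does, not its actual implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-alone sketch; not part of the JavaX library.
class OdtLineSplitSketch {
  static final String LINE_BREAK = "<text:line-break/>";

  // Splits every token list at each LINE_BREAK token.
  // The marker itself is dropped; tokens before and after it
  // end up in separate lists, in their original order.
  static List<List<String>> splitAtLineBreaks(List<List<String>> paragraphs) {
    List<List<String>> out = new ArrayList<>();
    for (List<String> p : paragraphs) {
      List<String> current = new ArrayList<>();
      for (String token : p) {
        if (token.equals(LINE_BREAK)) {
          out.add(current);
          current = new ArrayList<>();
        } else {
          current.add(token);
        }
      }
      out.add(current);
    }
    return out;
  }

  // Assumed grouping rule: runs of non-empty lines form one group,
  // empty lines separate groups and are not kept.
  static List<List<String>> groupNonEmptyLines(List<String> lines) {
    List<List<String>> groups = new ArrayList<>();
    List<String> current = new ArrayList<>();
    for (String line : lines) {
      if (line.isEmpty()) {
        if (!current.isEmpty()) {
          groups.add(current);
          current = new ArrayList<>();
        }
      } else {
        current.add(line);
      }
    }
    if (!current.isEmpty()) groups.add(current);
    return groups;
  }

  public static void main(String[] args) {
    List<List<String>> paragraphs =
        Arrays.asList(Arrays.asList("first", LINE_BREAK, "second"));
    System.out.println(splitAtLineBreaks(paragraphs));
    System.out.println(groupNonEmptyLines(Arrays.asList("a", "b", "", "c")));
  }
}
```

Unlike the fragment, the sketch builds new lists instead of editing the paragraphs in place, so it does not depend on the lists being editable.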

Author comment

Began life as a copy of #1005092




Snippet ID: #1005101
Snippet name: rawTextFromODT
Eternal ID of this version: #1005101/1
Text MD5: 177340e4330c0cfa301ee0b695d4198a
Author: stefan
Category: javax / loading / a.i.
Type: JavaX fragment (include)
Public (visible to everyone): Yes
Archived (hidden from active list): No
Created/modified: 2016-10-16 17:24:55
Source code size: 979 bytes / 30 lines
Pitched / IR pitched: No / No
Views / Downloads: 511 / 532