BotCompany Repo | #1000670 // htmlcoarsetok (function)

JavaX fragment (include) [tags: use-pretranspiled]

Transpiled version (84L) is out of date.

// TODO: process CDATA, scripts

static LS htmlcoarsetok(S s) {
  new LS tok;
  int l = s == null ? 0 : s.length();
  int i = 0;
  while (i < l) {
    int j = i;
    char c;
    // scan for non-tags
    while (j < l) {
      if (s.charAt(j) != '<')
        // regular character
        ++j;
      else if (s.substring(j, Math.min(j+4, l)).equals("<!--")) {
        // HTML comment (kept as part of the non-tag content)
        j = j+3;
        do ++j; while (j < l && !s.substring(j, Math.min(j+3, l)).equals("-->"));
        j = Math.min(j+3, l);
      } else {
        char d = charAt(s, j+1); // character after <
        if (d == '/' || isLetter(d))
          // it's a tag
          break;
        else
          ++j; // stray '<', not a tag
      }
    }

    tok.add(s.substring(i, j)); // add non-tag content
    i = j;
    if (i >= l) break;
    c = s.charAt(i);

    // scan over tag
    if (c == '<') {
      while (j < l && s.charAt(j) != '>') ++j; // TODO: strings in tag?
      if (j < l) ++j;
    }

    tok.add(s.substring(i, j)); // add tag
    i = j;
  }

  // ensure an odd token count: non-tag, tag, ..., non-tag
  if ((tok.size() & 1) == 0) tok.add("");
  return tok;
}
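
The following is a minimal plain-Java sketch of the same tokenizer, included only to illustrate the output contract: even indices hold non-tag text (comments included), odd indices hold tags, and the list always has an odd length. It makes several assumptions that are not confirmed by the snippet itself: that the JavaX shorthands `LS` and `S` stand for `List<String>` and `String`, that `charAt(s, i)` is a bounds-safe helper returning `'\0'` when out of range, and that `isLetter` behaves like `Character.isLetter`. The class name `HtmlCoarseTokDemo` is hypothetical, and the comment scan uses `indexOf` in place of the original `substring` loop.

```java
import java.util.ArrayList;
import java.util.List;

public class HtmlCoarseTokDemo {
  // Assumed stand-in for the JavaX charAt(S, int) helper: '\0' when out of range.
  static char charAt(String s, int i) {
    return s != null && i >= 0 && i < s.length() ? s.charAt(i) : '\0';
  }

  static List<String> htmlcoarsetok(String s) {
    List<String> tok = new ArrayList<>();
    int l = s == null ? 0 : s.length();
    int i = 0;
    while (i < l) {
      int j = i;
      // scan over non-tag content; comments count as non-tag content
      while (j < l) {
        if (s.charAt(j) != '<') {
          ++j; // regular character
        } else if (s.startsWith("<!--", j)) {
          // search for the terminator starting just past "<!--"
          int end = s.indexOf("-->", j + 4);
          j = end < 0 ? l : end + 3;
        } else {
          char d = charAt(s, j + 1); // character after '<'
          if (d == '/' || Character.isLetter(d)) break; // start of a tag
          ++j; // stray '<', not a tag
        }
      }
      tok.add(s.substring(i, j)); // non-tag content (possibly empty)
      i = j;
      if (i >= l) break;

      // scan over the tag up to and including '>'
      while (j < l && s.charAt(j) != '>') ++j;
      if (j < l) ++j;
      tok.add(s.substring(i, j));
      i = j;
    }
    if ((tok.size() & 1) == 0) tok.add(""); // keep the token count odd
    return tok;
  }

  public static void main(String[] args) {
    List<String> tok = htmlcoarsetok("Hi <b>there</b> <!-- <i> --> 1 < 2");
    for (int k = 0; k < tok.size(); k++)
      System.out.println(k + (k % 2 == 0 ? " text: [" : " tag:  [") + tok.get(k) + "]");
  }
}
```

For the sample input, the tokens are `"Hi "`, `"<b>"`, `"there"`, `"</b>"` and `" <!-- <i> --> 1 < 2"`: the `<i>` inside the comment and the stray `<` before `2` stay in the final text token. Concatenating all tokens reproduces the input unchanged.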

Snippet ID: #1000670
Snippet name: htmlcoarsetok (function)
Eternal ID of this version: #1000670/6
Text MD5: f782f73d604810e4c861182426c30c61
Author: stefan
Type: JavaX fragment (include)
Public (visible to everyone): Yes
Archived (hidden from active list): No
Created/modified: 2021-09-24 20:29:17
Source code size: 1151 bytes / 51 lines
Version history: 5 change(s)