BotCompany Repo | #1029089 // Snippets Chunked Deep BitSet Word Index

JavaX source code (Dynamic Module) [tags: use-pretranspiled] - run with: Stefan's OS

Uses 911K of libraries. Pure Java version: 5880L/30K.

!7

cprint SnippetsDeepBitSetWordIndex {
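  // Built in start-thread on every module start; transient, so not persisted
  transient ChunkedDeepBitSetWordIndex<S> wordIndex; // string = snippet ID
  switchable S regexp = "\\w+"; // pattern defining what counts as a word
  switchable int chunkSize = 16384; // passed to the index as chunkLength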

  start-thread {
    // Reload the module (and thus rebuild the index) when a setting changes
    dm_reloadOnFieldChange("regexp", "chunkSize");
    time "Make bit-set word index" {
      print("Making index");
      new ChunkedDeepBitSetWordIndex<S> wordIndex;
      wordIndex.chunkLength = chunkSize;
      wordIndex.regexp = regexp;
      for (virtual CSnippet sn : dm_allSnippets()) {
        S snippetID = (S) rcall snippetID(sn);
        S text = cast rcall text(sn);
        wordIndex.add(snippetID, text);
      }
      wordIndex.doneAdding();
      setField(+wordIndex);
    }
    infoBox("Indexed " + nWords(wordIndex.numWords()));
  }
  
  // API
  
  // Returns null while the index has not been built yet
  Iterable<S> snippetPreSearch(S query, O... _) {
    ret wordIndex == null ? null : wordIndex.preSearch(query, _);
  }
}
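
The source of ChunkedDeepBitSetWordIndex is not shown on this page, so the following plain-Java sketch only illustrates the general idea of a bit-set word index used for pre-search: each word maps to a BitSet of the documents containing it, and a query is answered by AND-ing the sets of its words. The class name BitSetWordIndexSketch, its constructor, and the lower-casing of words are assumptions made for illustration; chunking is left out entirely, and nothing here claims to match the real implementation.

```java
import java.util.*;
import java.util.regex.*;

// Hypothetical, simplified stand-in for a bit-set word index (not the real class).
class BitSetWordIndexSketch {
  private final Pattern wordPattern;
  private final List<String> ids = new ArrayList<>();            // bit position -> document ID
  private final Map<String, BitSet> postings = new HashMap<>();  // word -> documents containing it

  BitSetWordIndexSketch(String regexp) {
    wordPattern = Pattern.compile(regexp);
  }

  // Extract lower-cased words from a text using the configured pattern.
  private List<String> words(String text) {
    List<String> out = new ArrayList<>();
    Matcher m = wordPattern.matcher(text);
    while (m.find()) out.add(m.group().toLowerCase(Locale.ROOT));
    return out;
  }

  // Register a document: assign it the next bit and set that bit for each of its words.
  void add(String id, String text) {
    int bit = ids.size();
    ids.add(id);
    for (String w : words(text))
      postings.computeIfAbsent(w, k -> new BitSet()).set(bit);
  }

  // Number of distinct words seen so far.
  int numWords() { return postings.size(); }

  // Returns IDs of documents that contain every word of the query.
  // A query without any words yields an empty list.
  List<String> preSearch(String query) {
    BitSet hits = null;
    for (String w : words(query)) {
      BitSet b = postings.get(w);
      if (b == null) return new ArrayList<>();   // unknown word: no document can match
      if (hits == null) hits = (BitSet) b.clone();
      else hits.and(b);
    }
    List<String> result = new ArrayList<>();
    if (hits == null) return result;
    for (int i = hits.nextSetBit(0); i >= 0; i = hits.nextSetBit(i + 1))
      result.add(ids.get(i));
    return result;
  }
}
```

In this sketch, add plays the role of wordIndex.add(snippetID, text) in the module above, and preSearch returns only candidates: a caller would still run the actual full-text match on the returned documents.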

Author comment

Began life as a copy of #1029084

Travelled to 7 computer(s): bhatertpkbcr, mqqgnosmbjvj, pyentgdyhuwx, pzhvpgtvlbxg, tvejysmllsmz, vouqrxazstgt, xrpafgyirdlv

Snippet ID: #1029089
Snippet name: Snippets Chunked Deep BitSet Word Index
Eternal ID of this version: #1029089/5
Text MD5: 73fb311e260dd974bad0243c7886c7ee
Transpilation MD5: 70a8952c080c474d52e4023bcfc216fa
Author: stefan
Category: javax
Type: JavaX source code (Dynamic Module)
Public (visible to everyone): Yes
Archived (hidden from active list): No
Created/modified: 2020-07-19 03:20:41
Source code size: 1071 bytes / 37 lines
Pitched / IR pitched: No / No
Views / Downloads: 163 / 358
Version history: 4 change(s)
Referenced in: #1029090 - Instant Full-Text Snippet Search v7 [using deep chunked bit-set word index, dev.]