BotCompany Repo | #1027972 // constructWordFromCISet_withConnectors

JavaX fragment (include) [tags: use-pretranspiled]

Libraryless. A Pure Java version is available (2717L/17K).

// dictionary and connectors have to be ciSets (case-insensitive sets)
// Returns a chain of dictionary words (and connectors) that spell out word,
// or null if word cannot be decomposed.
static Chain<S> constructWordFromCISet_withConnectors(S word, Set<S> dictionary, Set<S> connectors) {
  ifdef constructWordFromCISet_withConnectors_debug
  print("checking: " + word);
  endifdef
  // base case: the whole word is in the dictionary
  if (contains(dictionary, word)) ret Chain<S>(word);
  // try every proper prefix, longest first
  for ping (int i = l(word)-1; i > 0; i--) {
    S prefix = takeFirst(i, word);
    if (contains(dictionary, prefix)) {
      S rest = substring(word, i);
      
      // try connector: each connector that is a prefix of rest (ignoring case),
      // taken in the reverse of the order prefixesOfIC returns them
      LS matchingConnectors = reversed(prefixesOfIC(connectors, rest));
      for (S conn : matchingConnectors) {
        Chain<S> chain = constructWordFromCISet_withConnectors(dropPrefixIC(conn, rest), dictionary, connectors);
        if (chain != null)
          ret Chain<S>(prefix, Chain<S>(conn, chain));
      }
      
      // try no connector
      Chain<S> chain = constructWordFromCISet_withConnectors(rest, dictionary, connectors);
      if (chain != null)
        ret Chain<S>(prefix, chain);
    }
  }
  // no decomposition found
  null;
}
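
For readers without the JavaX helpers at hand, the recursion above can be sketched in plain Java. This is only an illustration under stated assumptions, not the transpiled output: the class name WordSplitSketch and the helpers ciSet, split and prepend are hypothetical, a List<String> stands in for Chain<S>, a TreeSet with String.CASE_INSENSITIVE_ORDER stands in for a ciSet, and connectors are tried longest first, which is an assumption about what reversed(prefixesOfIC(...)) yields.

```java
import java.util.*;

class WordSplitSketch {
  // Hypothetical stand-in for a JavaX ciSet: a case-insensitive sorted set.
  static Set<String> ciSet(String... items) {
    Set<String> set = new TreeSet<>(String.CASE_INSENSITIVE_ORDER);
    set.addAll(Arrays.asList(items));
    return set;
  }

  // Returns the parts (dictionary words and connectors) spelling out word,
  // or null if no decomposition exists.
  static List<String> split(String word, Set<String> dictionary, Set<String> connectors) {
    // Base case: the whole word is a dictionary entry.
    if (dictionary.contains(word))
      return new ArrayList<>(Collections.singletonList(word));

    // Try every proper prefix, longest first.
    for (int i = word.length() - 1; i > 0; i--) {
      String prefix = word.substring(0, i);
      if (!dictionary.contains(prefix)) continue;
      String rest = word.substring(i);

      // Collect connectors that prefix the rest, ignoring case.
      List<String> matching = new ArrayList<>();
      for (String conn : connectors)
        if (!conn.isEmpty() && rest.regionMatches(true, 0, conn, 0, conn.length()))
          matching.add(conn);
      // Assumption: longer connectors are tried first.
      matching.sort(Comparator.comparingInt(String::length).reversed());

      for (String conn : matching) {
        List<String> tail = split(rest.substring(conn.length()), dictionary, connectors);
        if (tail != null) return prepend(tail, prefix, conn);
      }

      // Try again without a connector.
      List<String> tail = split(rest, dictionary, connectors);
      if (tail != null) return prepend(tail, prefix);
    }
    return null; // no decomposition found
  }

  // Builds a new list consisting of head followed by tail.
  static List<String> prepend(List<String> tail, String... head) {
    List<String> out = new ArrayList<>(Arrays.asList(head));
    out.addAll(tail);
    return out;
  }

  public static void main(String[] args) {
    Set<String> dictionary = ciSet("Arbeit", "Zeit", "Geber");
    Set<String> connectors = ciSet("s");
    System.out.println(split("Arbeitszeit", dictionary, connectors)); // prints [Arbeit, s, zeit]
    System.out.println(split("Arbeitgeber", dictionary, connectors)); // prints [Arbeit, geber]
  }
}
```

As in the original, a connector only counts when a decomposable remainder follows it, so a word that ends in a bare connector is not split.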

Author comment

Began life as a copy of #1027969


Travelled to 7 computer(s): bhatertpkbcr, mqqgnosmbjvj, pyentgdyhuwx, pzhvpgtvlbxg, tvejysmllsmz, vouqrxazstgt, xrpafgyirdlv


Snippet ID: #1027972
Snippet name: constructWordFromCISet_withConnectors
Eternal ID of this version: #1027972/3
Text MD5: abe1dd6d6b479478392fbb972825cee7
Transpilation MD5: 945c475ba665eac774e0458bd8ecc7a8
Author: stefan
Category: javax / nlp
Type: JavaX fragment (include)
Public (visible to everyone): Yes
Archived (hidden from active list): No
Created/modified: 2020-04-21 11:11:10
Source code size: 1031 bytes / 27 lines
Pitched / IR pitched: No / No
Views / Downloads: 129 / 197
Version history: 2 change(s)