
BotCompany Repo | #1000987 // Multiset matching for prediction

JavaX source code [tags: use-pretranspiled] - run with: x30.jar

Libraryless. A Pure Java version is also available (427L/4K/12K).

!747

m {
  static S sentence1 = "Yes is the opposite of no. No is the opposite of yes.";
  static S sentence2 = "Green is the opposite of white.";
  
  !include #1000988 // MultiSet
  
  p {
    L<S> tok1 = parse(sentence1);
    L<S> tok2 = parse(sentence2);
    
    Map<S, MultiSet<S>> map = makeMMapPrefix(tok1, tok2);
    if (map == null)
      print("No match.");
    else
      print(structure(map));
      
    print(complete(sentence1, sentence2));
  }
  
  static L<S> parse(S s) {
    return tokensToLowerCase(javaTok(s));
  }
  
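  // Completes sentence2 by continuing it along the pattern of sentence1:
  // every remaining token of sentence1 is replaced by the token it was
  // most often aligned with in sentence2.
  static S complete(S sentence1, S sentence2) {
    L<S> tok1 = parse(sentence1);
    L<S> tok2 = parse(sentence2);
    Map<S, MultiSet<S>> map = makeMMapPrefix(tok1, tok2);
    if (map == null) return null; // sentence2 has more tokens than sentence1
    new L<S> tok;
    // start with all of sentence2 except its last token
    tok.addAll(tok2.subList(0, tok2.size()-1));
    // append the rest of sentence1, substituting mapped tokens
    for (int i = tok2.size()-1; i < tok1.size(); i++) {
      S t = tok1.get(i);
      MultiSet<S> set = map.get(t);
      // tokens without a mapping are copied unchanged
      tok.add(set == null ? t : set.getMostPopularEntry());
    }
    return join(tok);
  }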
  
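  // Aligns tok2 with the beginning of tok1 and records, for each token of tok1,
  // which tokens of tok2 appeared at the same position and how often.
  // Returns null if tok2 has more tokens than tok1.
  static Map<S, MultiSet<S>> makeMMapPrefix(L<S> tok1, L<S> tok2) {
    if (tok1.size() < tok2.size()) return null;
    
    Map<S, MultiSet<S>> map = new TreeMap<S, MultiSet<S>>();
    // visit every second token, starting at index 1
    for (int i = 1; i < tok2.size(); i += 2) {
      S t1 = tok1.get(i), t2 = tok2.get(i);
      MultiSet<S> set = map.get(t1);
      if (set == null)
        map.put(t1, new MultiSet<S>(t2));
      else
        set.add(t2);
    }
    
    // no further checks: any tok2 that fits into tok1 counts as a match
    return map;
  }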
}
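
For readers without the JavaX toolchain, the same idea can be sketched in plain Java. This is only an illustration under stated assumptions, not the Pure Java transpilation mentioned above: the class name `MultisetPredictionSketch`, the helpers `tokenize`, `buildMap` and `mostPopular`, and the regex tokenizer are all hypothetical stand-ins. In particular, the tokenizer keeps no whitespace tokens (unlike `javaTok`), so every index is aligned and the result is joined with single spaces, and a `Map<String, Integer>` of counts stands in for `MultiSet`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch only; every name in this class is hypothetical.
class MultisetPredictionSketch {
    // Assumption: words and single punctuation marks are tokens; whitespace is dropped.
    private static final Pattern TOKEN = Pattern.compile("\\w+|[^\\w\\s]");

    static List<String> tokenize(String s) {
        List<String> out = new ArrayList<>();
        Matcher m = TOKEN.matcher(s.toLowerCase());
        while (m.find()) out.add(m.group());
        return out;
    }

    // For each pattern token, count which prefix tokens sat at the same position.
    // Returns null if the prefix is longer than the pattern.
    static Map<String, Map<String, Integer>> buildMap(List<String> pattern, List<String> prefix) {
        if (pattern.size() < prefix.size()) return null;
        Map<String, Map<String, Integer>> map = new TreeMap<>();
        for (int i = 0; i < prefix.size(); i++) {
            map.computeIfAbsent(pattern.get(i), k -> new TreeMap<>())
               .merge(prefix.get(i), 1, Integer::sum);
        }
        return map;
    }

    // Entry with the highest count; ties go to the alphabetically first key.
    static String mostPopular(Map<String, Integer> counts) {
        String best = null;
        int bestCount = 0;
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            if (e.getValue() > bestCount) {
                best = e.getKey();
                bestCount = e.getValue();
            }
        }
        return best;
    }

    // Continue the prefix along the pattern, substituting mapped tokens.
    static String complete(String patternSentence, String prefixSentence) {
        List<String> pattern = tokenize(patternSentence);
        List<String> prefix = tokenize(prefixSentence);
        Map<String, Map<String, Integer>> map = buildMap(pattern, prefix);
        if (map == null) return null;
        List<String> out = new ArrayList<>(prefix);
        for (int i = prefix.size(); i < pattern.size(); i++) {
            String t = pattern.get(i);
            Map<String, Integer> counts = map.get(t);
            out.add(counts == null ? t : mostPopular(counts));
        }
        return String.join(" ", out);
    }

    public static void main(String[] args) {
        // prints: green is the opposite of white . white is the opposite of green .
        System.out.println(complete(
            "Yes is the opposite of no. No is the opposite of yes.",
            "Green is the opposite of white."));
    }
}
```

With the two sentences from the snippet, "yes" is aligned with "green" and "no" with "white", so the second half of the pattern is rewritten with those substitutions.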

Author comment

Began life as a copy of #1000986




Snippet ID: #1000987
Snippet name: Multiset matching for prediction
Eternal ID of this version: #1000987/1
Text MD5: ed7856c0508dde97310f3017d6c3a51b
Transpilation MD5: dfad3438c063fe55b3d79df6839131d6
Author: stefan
Category: javax
Type: JavaX source code
Public (visible to everyone): Yes
Archived (hidden from active list): No
Created/modified: 2015-09-14 02:07:56
Source code size: 1544 bytes / 57 lines
Pitched / IR pitched: No / Yes
Views / Downloads: 697 / 723