BotCompany Repo | #1002009 // Language detector - find shortest code (developing)

JavaX source code - run with: x30.jar

!752

p {
  // load Python, JavaX example code
  S text1 = loadSnippet("#1002004");
  S lang1 = "python";
  S text2 = loadSnippet("#1001747");
  S lang2 = "javax";
  
  // make in/out examples (source line -> language name);
  // currently every line is used, not just a selection
  L<S[]> examples = concatLists(
    makeExamples(text1, lang1),
    makeExamples(text2, lang2));
 
  // optionally: remove lines appearing in both languages
  
  print("Got " + l(examples) + " example lines.");
  
  // run solver :)
  
  O solver = hotwire("#738"); // the master solver!
  
  // The solver usually looks for exact (100%) solutions.
  // Here we instead look for the solutions that work in
  // as many cases as possible.
  
  L learners = cast call(solver, "makeLearners");
  print("Got " + l(learners) + " learners.");
  
  O _case = call(solver, "produceCase", unrollExamples(examples));
  print("Full examples: " + structure(get(_case, "fullExamples")));
  
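  // the case has not been split yet (split() still needs to be called);
  // this prints the size of examples1 as it stands before the split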
  print("Split point: " + l((L) get(_case, "examples1")));
}

static S[] unrollExamples(L<S[]> examples) {
  S[] x = new S[l(examples)*2];
  for (int i = 0; i < l(examples); i++) {
    x[i*2] = examples.get(i)[0];
    x[i*2+1] = examples.get(i)[1];
  }
  ret x;
}

// two-element string arrays (source line, language name)
static L<S[]> makeExamples(S text, S lang) {
  new L<S[]> l;
  for (S line : toLinesFullTrim(text))
    l.add(new S[] {line, lang});
  ret l;
}

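// total length of the code tokens in a piece of Java source
// (a size measure for comparing solutions; not called yet)
static int codeSize(S java) {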
  ret totalLengthOfCodeTokens(javaTok(java));
}


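The example-building steps in the program, including the optional removal of lines that occur in both languages (which the source leaves as a comment), can be sketched in plain Java roughly as follows. This is not the JavaX code itself: the class name `ExampleBuilder`, the helper `removeSharedLines`, and the sample inputs are hypothetical, and `makeExamples` assumes that `toLinesFullTrim` trims each line and skips empty ones.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class ExampleBuilder {
    // One (source line, language name) pair per non-empty line.
    // Assumption: lines are trimmed and blank lines are skipped.
    static List<String[]> makeExamples(String text, String lang) {
        List<String[]> out = new ArrayList<>();
        for (String line : text.split("\r?\n")) {
            String t = line.trim();
            if (!t.isEmpty()) out.add(new String[] {t, lang});
        }
        return out;
    }

    // Hypothetical helper: drop every line that occurs under more than
    // one language, since such a line cannot identify either language.
    static List<String[]> removeSharedLines(List<String[]> examples) {
        Map<String, Set<String>> langsByLine = new HashMap<>();
        for (String[] e : examples)
            langsByLine.computeIfAbsent(e[0], k -> new HashSet<>()).add(e[1]);
        List<String[]> out = new ArrayList<>();
        for (String[] e : examples)
            if (langsByLine.get(e[0]).size() == 1) out.add(e);
        return out;
    }

    // Flatten the pairs into [in0, out0, in1, out1, ...].
    static String[] unrollExamples(List<String[]> examples) {
        String[] x = new String[examples.size() * 2];
        for (int i = 0; i < examples.size(); i++) {
            x[i * 2] = examples.get(i)[0];
            x[i * 2 + 1] = examples.get(i)[1];
        }
        return x;
    }

    public static void main(String[] args) {
        List<String[]> examples = new ArrayList<>();
        examples.addAll(makeExamples("import os\n# done\n", "python"));
        examples.addAll(makeExamples("# done\nret x;\n", "javax"));
        String[] flat = unrollExamples(removeSharedLines(examples));
        System.out.println(String.join(" | ", flat));
    }
}
```

Filtering before unrolling keeps the flat array strictly alternating between input and output, which is the layout `unrollExamples` hands to the solver in the program.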

Author comment

Began life as a copy of #1002005
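
The comments in the program describe looking for solutions that work in as many cases as possible rather than demanding a 100% fit, and the snippet name asks for the shortest code. A minimal plain-Java sketch of such a selection rule follows. The `Candidate` class, its `code` and `predict` fields, and the crude whitespace-ignoring size measure are all hypothetical stand-ins; they are not the API of the #738 solver, nor of `javaTok` or `totalLengthOfCodeTokens`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

final class BestCoverage {
    // Hypothetical candidate: its source text plus the function it computes.
    static final class Candidate {
        final String code;
        final Function<String, String> predict;

        Candidate(String code, Function<String, String> predict) {
            this.code = code;
            this.predict = predict;
        }
    }

    // Number of (input, expected output) pairs the candidate gets right.
    static int hits(Candidate c, List<String[]> examples) {
        int n = 0;
        for (String[] e : examples)
            if (e[1].equals(c.predict.apply(e[0]))) n++;
        return n;
    }

    // Crude size measure (assumption): count of non-whitespace characters.
    static int codeSize(String code) {
        return code.replaceAll("\\s+", "").length();
    }

    // Most hits wins; ties go to the smaller code.
    // Returns null when there are no candidates.
    static Candidate best(List<Candidate> candidates, List<String[]> examples) {
        Candidate best = null;
        int bestHits = -1;
        for (Candidate c : candidates) {
            int h = hits(c, examples);
            if (h > bestHits
                    || (h == bestHits && codeSize(c.code) < codeSize(best.code))) {
                best = c;
                bestHits = h;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<String[]> examples = new ArrayList<>();
        examples.add(new String[] {"import os", "python"});
        examples.add(new String[] {"ret x;", "javax"});

        List<Candidate> candidates = new ArrayList<>();
        candidates.add(new Candidate("\"python\"", s -> "python"));
        candidates.add(new Candidate("s.endsWith(\";\") ? \"javax\" : \"python\"",
                s -> s.endsWith(";") ? "javax" : "python"));

        System.out.println(best(candidates, examples).code);
    }
}
```

Ranking by hit count first and size second means a short rule never beats a longer one that classifies more lines correctly; size only breaks ties.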

Travelled to 13 computer(s): aoiabmzegqzx, bhatertpkbcr, cbybwowwnfue, cfunsshuasjs, gwrvuhgaqvyk, ishqpsrjomds, lpdgvwnxivlt, mqqgnosmbjvj, pyentgdyhuwx, pzhvpgtvlbxg, tslmcundralx, tvejysmllsmz, vouqrxazstgt

Snippet ID: #1002009
Snippet name: Language detector - find shortest code (developing)
Eternal ID of this version: #1002009/1
Text MD5: 3681357a848478a692b3ecbd1f8d821a
Author: stefan
Category: python/javax
Type: JavaX source code
Public (visible to everyone): Yes
Archived (hidden from active list): No
Created/modified: 2015-12-13 17:58:37
Source code size: 1569 bytes / 59 lines
Pitched / IR pitched: No / Yes
Views / Downloads: 668 / 599