svoid ai_makeRegexpLanguageDetectorsFromRandomNGrams(S lang1, S lang2, int n) {
  LanguageDetectionTask task = dm_languageDetectionTask(lang1, lang2);
  print(task.task());

  Set<S> seen = ciSet(); // case-insensitive set of regexps already tried
  new DynamicTopTen<S> tt; // keeps the best-scoring regexps

  repeat 1000 {
    S re = firstNotSeen_nAttempts(1000, seen, () -> regexpQuote_useBackslashes(randomNGram(n, random(task.pos)))); // quoted n-gram from a random element of task.pos
    if (re == null) break with print("Can't find any new regexps");
    tt.add(re, scoreRegexpIC(re, task.pos, task.neg));
  }

  pnl(tt.withScores());
  dm_saveLanguageDetectionRegexps(task, tt);
}
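For readers without the JavaX helper library, the loop above can be approximated in plain Java. This is only a sketch under stated assumptions, not the transpiled version: the class `NGramRegexpSketch` and its methods are hypothetical names, `score` is a stand-in (plain accuracy with case-insensitive matching) because the definition of `scoreRegexpIC` is not shown here, `Pattern.quote` (which wraps the text in `\Q...\E`) replaces `regexpQuote_useBackslashes`, and a sorted list replaces `DynamicTopTen`.

```java
import java.util.*;
import java.util.regex.Pattern;

class NGramRegexpSketch {
    // Pick a random substring of length n (the whole text if it is not longer than n).
    static String randomNGram(int n, String text, Random rng) {
        if (text.length() <= n) return text;
        int start = rng.nextInt(text.length() - n + 1);
        return text.substring(start, start + n);
    }

    // ASSUMPTION: stand-in for scoreRegexpIC. Returns the fraction of examples
    // classified correctly when "regexp is found, ignoring case" means "positive".
    static double score(String regexp, List<String> pos, List<String> neg) {
        Pattern p = Pattern.compile(regexp, Pattern.CASE_INSENSITIVE);
        int correct = 0;
        for (String s : pos) if (p.matcher(s).find()) correct++;
        for (String s : neg) if (!p.matcher(s).find()) correct++;
        return (double) correct / (pos.size() + neg.size());
    }

    // Try up to `rounds` new quoted n-grams and return the `keep` best, highest score first.
    static List<Map.Entry<String, Double>> topDetectors(
            List<String> pos, List<String> neg, int n, int rounds, int keep, Random rng) {
        List<Map.Entry<String, Double>> scored = new ArrayList<>();
        if (pos.isEmpty()) return scored;
        // Case-insensitive "already tried" set, mirroring ciSet().
        Set<String> seen = new TreeSet<>(String.CASE_INSENSITIVE_ORDER);
        for (int i = 0; i < rounds; i++) {
            String re = null;
            // Up to 1000 attempts to draw an n-gram that has not been tried yet.
            for (int attempt = 0; attempt < 1000 && re == null; attempt++) {
                String example = pos.get(rng.nextInt(pos.size()));
                String candidate = Pattern.quote(randomNGram(n, example, rng));
                if (seen.add(candidate)) re = candidate;
            }
            if (re == null) break; // no new regexps left to try
            scored.add(new AbstractMap.SimpleEntry<>(re, score(re, pos, neg)));
        }
        scored.sort((a, b) -> Double.compare(b.getValue(), a.getValue()));
        return new ArrayList<>(scored.subList(0, Math.min(keep, scored.size())));
    }

    public static void main(String[] args) {
        // Toy data, invented for illustration only.
        List<String> pos = Arrays.asList("the cat and the dog", "this is the house");
        List<String> neg = Arrays.asList("der Hund und die Katze", "das ist das Haus");
        for (Map.Entry<String, Double> e : topDetectors(pos, neg, 3, 50, 10, new Random(42)))
            System.out.println(e.getValue() + "  " + e.getKey());
    }
}
```

Passing a seeded `Random` keeps runs reproducible, which makes it easier to compare scoring functions on the same candidate n-grams.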
Snippet ID: | #1027487 |
Snippet name: | ai_makeRegexpLanguageDetectorsFromRandomNGrams |
Eternal ID of this version: | #1027487/1 |
Text MD5: | 54c5d2d8048253126530215ec2b0bc21 |
Transpilation MD5: | 2c53c34e4d41c1c319f6461bf6dfac23 |
Author: | stefan |
Category: | javax |
Type: | JavaX fragment (include) |
Public (visible to everyone): | Yes |
Archived (hidden from active list): | No |
Created/modified: | 2020-03-22 15:00:22 |
Source code size: | 576 bytes / 16 lines |
Pitched / IR pitched: | No / No |
Views / Downloads: | 177 / 249 |