BotCompany Repo | #1021122 // Test Language Detector [OK]

JavaX source code (desktop) [tags: use-pretranspiled] - run with: x30.jar

Uses 81K of libraries. Pure Java version: 5921L/42K.

!7

lib 1400180 // github.com/optimaize/language-detector
lib 1400181 // jsonic
lib 1011966 // slf4j-api-1.7.25.jar
lib 1400182 // guava

import com.optimaize.langdetect.*;
import com.optimaize.langdetect.i18n.*;
import com.optimaize.langdetect.ngram.*;
import com.optimaize.langdetect.profiles.*;
import com.optimaize.langdetect.text.*;

p-exp {
  new LanguageProfileReader profileReader;
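  // Load only the German and English profiles; the commented-out line would load every built-in profile instead.
  //L<LanguageProfile> languageProfiles = profileReader.readAllBuiltIn();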
  L<LanguageProfile> languageProfiles = ll(profileReader.readBuiltIn(LdLocale.fromString("de")), profileReader.readBuiltIn(LdLocale.fromString("en")));
  
  LanguageDetector languageDetector = LanguageDetectorBuilder.create(NgramExtractors.standard())
    .withProfiles(languageProfiles)
    .build();

  // create a text object factory (only needed by the commented-out TextObject variant below)
  TextObjectFactory textObjectFactory = CommonTextObjectFactories.forDetectingOnLargeText();

  // query
  for (S text : ll("hello world", "hallo welt")) {
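    // Alternative: detect a single best language via a TextObject.
    // detect() returns a Guava Optional, so the fallback call is orNull(), not orElse(null).
    //TextObject textObject = textObjectFactory.forText(text);
    //LdLocale lang = languageDetector.detect(textObject).orNull();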
    L<DetectedLanguage> languages = languageDetector.getProbabilities(text);
    print(text + " => " + languages);
  }
}
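
The detector above is built with NgramExtractors.standard(), which means it works on short character sequences (n-grams) rather than whole words. As a rough illustration of that idea only, here is a toy sketch in plain Java that counts the character n-grams of the two test strings. It is not the library's code: the class NgramSketch and the helper ngramCounts are hypothetical names, and the library's real extractor may filter and pad the text differently.

```java
import java.util.*;

public class NgramSketch {
    // Hypothetical helper: count every character n-gram of length 1..maxN.
    // A real detector would compare such counts against per-language profiles.
    static Map<String, Integer> ngramCounts(String text, int maxN) {
        Map<String, Integer> counts = new TreeMap<>();
        String s = text.toLowerCase(Locale.ROOT);
        for (int n = 1; n <= maxN; n++) {
            for (int i = 0; i + n <= s.length(); i++) {
                counts.merge(s.substring(i, i + n), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Same test strings as the snippet above.
        for (String text : List.of("hello world", "hallo welt")) {
            System.out.println(text + " => " + ngramCounts(text, 3));
        }
    }
}
```

The two strings differ in n-grams such as "he" versus "ha" and "or" versus "we". A profile-based detector can weigh this kind of difference when ranking German against English.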

Travelled to 7 computer(s): bhatertpkbcr, cfunsshuasjs, mqqgnosmbjvj, pyentgdyhuwx, pzhvpgtvlbxg, tvejysmllsmz, vouqrxazstgt

Snippet ID: #1021122
Snippet name: Test Language Detector [OK]
Eternal ID of this version: #1021122/9
Text MD5: 4a76bc18dfbcae47825c34d01a3553b1
Transpilation MD5: 473e2c9b6593d2a0a016756c177b7b7a
Author: stefan
Category: javax / nlp
Type: JavaX source code (desktop)
Public (visible to everyone): Yes
Archived (hidden from active list): No
Created/modified: 2019-01-23 18:48:14
Source code size: 1244 bytes / 33 lines
Pitched / IR pitched: No / No
Views / Downloads: 407 / 900
Version history: 8 change(s)