BotCompany Repo | #1021123 // Language Detector (German/English)

JavaX source code (Dynamic Module) [tags: use-pretranspiled] - run with: Stefan's OS

Uses 992K of libraries. A Pure Java version is also available (1035L/7K).

!7

lib 1400180 // github.com/optimaize/language-detector
lib 1400181 // jsonic
lib 1011966 // slf4j-api-1.7.25.jar
lib 1400182 // guava

static LS _stickyLibs_langDetect = ll(#1400180, #1400181, #1011966, #1400182);

import com.optimaize.langdetect.*;
import com.optimaize.langdetect.i18n.*;
import com.optimaize.langdetect.ngram.*;
import com.optimaize.langdetect.profiles.*;
import com.optimaize.langdetect.text.*;

cmodule LanguageDetectorModule {
  // Not persisted with the module state; built lazily by init()
  transient LanguageDetector languageDetector;
  
  // Builds the detector once, under the module lock, from the built-in
  // German and English profiles. Later calls return immediately.
  void init() {
    lock lock;
    if (languageDetector != null) ret;
    final new LanguageProfileReader profileReader;
    L<LanguageProfile> languageProfiles = map(ll("de", "en"), func(S lang) -> LanguageProfile { profileReader.readBuiltIn(LdLocale.fromString(lang)) });
    languageDetector = LanguageDetectorBuilder.create(NgramExtractors.standard())
      .withProfiles(languageProfiles)
      .build();
  }

  // API
  
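  // Returns a two-letter language code ("en" or "de"), or null if no
  // language could be determined. The WordToLanguageCRUD module (#1021121)
  // is asked first; the n-gram detector is used only if that lookup
  // returns null.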
  S detectLanguage(S text) {
    S lang = dm_findAndCallModule("#1021121/WordToLanguageCRUD", 'languageForText, text);
    if (lang != null) ret languageToTwoLetters(lang);
    init();
    L<DetectedLanguage> languages = languageDetector.getProbabilities(text);
    ret empty(languages) ? null : first(languages).getLocale().getLanguage();
  }
}
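
The module above combines two ideas: an explicit lookup that takes priority, and a statistical detector that is built lazily and only consulted as a fallback. The plain Java sketch below illustrates that control flow without any external libraries. Everything in it is an assumption made for illustration: the class name LanguageFallbackSketch, the learn method, the in-memory map standing in for the WordToLanguageCRUD module, the mapping inside languageToTwoLetters, and above all the stop-word counter, which is a crude stand-in for the optimaize n-gram detector and not how that library works.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of "explicit lookup first, lazily built fallback second".
class LanguageFallbackSketch {
    // Stand-in for the WordToLanguageCRUD lookup: exact text -> language name
    private final Map<String, String> knownTexts = new HashMap<>();

    // Stand-in for the statistical detector; null until init() has run
    private Map<String, Set<String>> stopWords;

    private final Object lock = new Object();

    // Registers an explicit answer that overrides the fallback detector
    void learn(String text, String languageName) {
        knownTexts.put(text.toLowerCase(Locale.ROOT), languageName);
    }

    // Builds the fallback data once; later calls return immediately
    private void init() {
        synchronized (lock) {
            if (stopWords != null) return;
            Map<String, Set<String>> m = new LinkedHashMap<>();
            m.put("de", new HashSet<>(Arrays.asList(
                "der", "die", "das", "und", "ist", "nicht", "ich", "ein")));
            m.put("en", new HashSet<>(Arrays.asList(
                "the", "and", "is", "not", "i", "a", "of", "to")));
            stopWords = m;
        }
    }

    // Assumed mapping from language names to two-letter codes
    static String languageToTwoLetters(String name) {
        switch (name.toLowerCase(Locale.ROOT)) {
            case "german": return "de";
            case "english": return "en";
            default: return null;
        }
    }

    // Returns "en", "de", or null if neither source yields an answer
    String detectLanguage(String text) {
        String lower = text.toLowerCase(Locale.ROOT);

        // Step 1: the explicit lookup wins if it knows the text
        String known = knownTexts.get(lower);
        if (known != null) return languageToTwoLetters(known);

        // Step 2: fall back to counting stop-word hits per language
        init();
        String best = null;
        int bestHits = 0;
        for (Map.Entry<String, Set<String>> e : stopWords.entrySet()) {
            int hits = 0;
            for (String token : lower.split("[^\\p{L}]+"))
                if (e.getValue().contains(token)) hits++;
            if (hits > bestHits) { bestHits = hits; best = e.getKey(); }
        }
        return best;
    }
}
```

A caller would create one LanguageFallbackSketch, optionally register overrides with learn("Handy", "German"), and then call detectLanguage(text). Placing the lookup first lets hand-curated entries correct the statistical guess for short or ambiguous inputs, which is presumably why the module asks #1021121 before running the detector.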

Author comment

Began life as a copy of #1021122

Snippet ID: #1021123
Snippet name: Language Detector (German/English)
Eternal ID of this version: #1021123/4
Text MD5: 3540a0300976b4b3706e15f670d7a1a2
Transpilation MD5: 5c185b5ae2f07a0933ff81f0969bb997
Author: stefan
Category: javax / nlp
Type: JavaX source code (Dynamic Module)
Public (visible to everyone): Yes
Archived (hidden from active list): No
Created/modified: 2019-01-23 23:20:52
Source code size: 1346 bytes / 39 lines
Pitched / IR pitched: No / No
Views / Downloads: 386 / 928
Version history: 3 change(s)