Libraryless. Click here for Pure Java version (211L/2K).
1 | // Calculates the fuzzy match of needle in haystack, |
2 | // using a modified version of the Levenshtein distance algorithm. |
3 | // ported from: http://ginstrom.com/scribbles/2007/12/01/fuzzy-substring-matching-with-levenshtein-distance-in-python/ |
4 | |
5 | static int levenSubstringIC(S haystack, S needle) {
|
6 | int m = strL(needle), n = strL(haystack); |
7 | |
8 | // base cases |
9 | if (containsIgnoreCase(haystack, needle)) ret 0; |
10 | |
11 | int[] v0 = new int[n+1], v1 = new int[n+1]; |
12 | for i to m: { // iterate over needle
|
13 | v1[0] = i+1; |
14 | for j to n: { // iterate over haystack
|
15 | int cost = neqic(needle.charAt(i), haystack.charAt(j)) ? 1 : 0; |
16 | v1[j+1] = min3(v1[j]+1, // insertion |
17 | v0[j+1]+1, // deletion |
18 | v0[j]+cost); // substitution |
19 | } |
20 | int[] temp = v0; |
21 | v0 = v1; |
22 | v1 = temp; |
23 | } |
24 | ret minOfIntArray(v0); |
25 | } |
download show line numbers debug dex old transpilations
Travelled to 6 computer(s): bhatertpkbcr, mqqgnosmbjvj, pyentgdyhuwx, pzhvpgtvlbxg, tvejysmllsmz, vouqrxazstgt
No comments. add comment
| Snippet ID: | #1023862 |
| Snippet name: | levenSubstringIC - find leven-closest substring in larger string. complexity O(m*n) |
| Eternal ID of this version: | #1023862/18 |
| Text MD5: | 9b4b6ac905b2e9f6c10602294e1a53f2 |
| Transpilation MD5: | af3329563c64aeeca709f9d60c287c34 |
| Author: | stefan |
| Category: | javax / a.i. |
| Type: | JavaX fragment (include) |
| Public (visible to everyone): | Yes |
| Archived (hidden from active list): | No |
| Created/modified: | 2019-11-11 14:59:59 |
| Source code size: | 878 bytes / 25 lines |
| Pitched / IR pitched: | No / No |
| Views / Downloads: | 581 / 807 |
| Version history: | 17 change(s) |
| Referenced in: | [show references] |