Libraryless. Click here for Pure Java version (302L/3K).
1 | // Calculates the fuzzy match of needle in haystack, |
2 | // using a modified version of the Levenshtein distance algorithm. |
3 | // ported from: http://ginstrom.com/scribbles/2007/12/01/fuzzy-substring-matching-with-levenshtein-distance-in-python/ |
4 | |
5 | static Pair<Int> levenSubstringIC_withIndex(S haystack, S needle) {
|
6 | int m = strL(needle), n = strL(haystack); |
7 | |
8 | // base cases |
9 | if (m == 1) {
|
10 | int idx = indexOfIC(haystack, needle); |
11 | ret idx < 0 ? pair(0, 1) : pair(idx, 0); |
12 | } |
13 | if (m == 0) |
14 | ret pair(0, m); |
15 | |
16 | int[] row1 = new int[n+1], row2 = new int[n+1]; |
17 | for i to m: {
|
18 | row2[0] = i+1; |
19 | for j to n: {
|
20 | int cost = neqic(needle.charAt(i), haystack.charAt(j)) ? 1 : 0; |
21 | row2[j+1] = min3(row1[j+1]+1, // deletion |
22 | row2[j]+1, // insertion |
23 | row1[j]+cost); // substitution |
24 | } |
25 | int[] temp = row1; |
26 | row1 = row2; |
27 | row2 = temp; |
28 | } |
29 | Pair<Int> p = minOfIntArray_withIndex(row1); |
30 | ret pair(max(0, p.a-m), p.b); |
31 | } |
Began life as a copy of #1023862
download show line numbers debug dex old transpilations
Travelled to 6 computer(s): bhatertpkbcr, mqqgnosmbjvj, pyentgdyhuwx, pzhvpgtvlbxg, tvejysmllsmz, vouqrxazstgt
No comments. add comment
| Snippet ID: | #1023873 |
| Snippet name: | levenSubstringIC_withIndex - find leven-closest substring in larger string. complexity O(m*n). return rough index of substring & distance |
| Eternal ID of this version: | #1023873/4 |
| Text MD5: | 8148251608469a0749475c3b3f2bcfab |
| Transpilation MD5: | a05f7293185a2aac2acce6338168ce6d |
| Author: | stefan |
| Category: | javax / a.i. |
| Type: | JavaX fragment (include) |
| Public (visible to everyone): | Yes |
| Archived (hidden from active list): | No |
| Created/modified: | 2019-07-11 00:10:34 |
| Source code size: | 1017 bytes / 31 lines |
| Pitched / IR pitched: | No / No |
| Views / Downloads: | 432 / 579 |
| Version history: | 3 change(s) |
| Referenced in: | [show references] |