static int leven_limitedIC(S left, S right, int threshold) { if (--threshold < 0) ret 0; int n = l(left); // length of left int m = l(right); // length of right // if one string is empty, the edit distance is necessarily the length // of the other if (n == 0) ret m <= threshold ? m : threshold+1; if (m == 0) ret n <= threshold ? n : threshold+1; // if length difference is > threshold, just bail out if (abs(m-n) > threshold) ret threshold+1; if (n > m) { // swap the two strings to consume less memory final S tmp = left; left = right; right = tmp; n = m; m = right.length(); } int[] p = new int[n + 1]; // 'previous' cost array, horizontally int[] d = new int[n + 1]; // cost array, horizontally int[] tempD; // placeholder to assist in swapping p and d // fill in starting table values final int boundary = Math.min(n, threshold) + 1; for (int i = 0; i < boundary; i++) { p[i] = i; } // these fills ensure that the value above the rightmost entry of our // stripe will be ignored in following loop iterations Arrays.fill(p, boundary, p.length, Integer.MAX_VALUE); Arrays.fill(d, Integer.MAX_VALUE); // iterates through t - USE LONG to fix JVM SAFE POINTS. for (long _j = 1; _j <= m; _j++) { int j = (int) _j; final char rightJ = right.charAt(j - 1); // jth character of right d[0] = j; // compute stripe indices, constrain to array size final int min = Math.max(1, j - threshold); final int max = j > Integer.MAX_VALUE - threshold ? n : Math.min( n, j + threshold); // the stripe may lead off of the table if s and t are of different // sizes if (min > max) { return threshold+1; } // ignore entry left of leftmost if (min > 1) { d[min - 1] = Integer.MAX_VALUE; } // iterates through [min, max] in s for (int i = min; i <= max; i++) { if (equalsIgnoreCase(left.charAt(i - 1), rightJ)) { // diagonally left and up d[i] = p[i - 1]; } else { // 1 + minimum of cell to the left, to the top, diagonally // left and up d[i] = 1 + Math.min(Math.min(d[i - 1], p[i]), p[i - 1]); } } // copy current distance counts to 'previous row' distance counts tempD = p; p = d; d = tempD; } // if p[n] is greater than the threshold, there's no guarantee on it // being the correct // distance if (p[n] <= threshold) { return p[n]; } return threshold+1; }
Began life as a copy of #1007596
download show line numbers debug dex old transpilations
Travelled to 13 computer(s): aoiabmzegqzx, bhatertpkbcr, cbybwowwnfue, cfunsshuasjs, gwrvuhgaqvyk, ishqpsrjomds, lpdgvwnxivlt, mqqgnosmbjvj, pyentgdyhuwx, pzhvpgtvlbxg, tslmcundralx, tvejysmllsmz, vouqrxazstgt
No comments. add comment
Snippet ID: | #1008051 |
Snippet name: | leven_limitedIC - limited Levenshtein distance, ignoring case |
Eternal ID of this version: | #1008051/4 |
Text MD5: | c334db1122f818c4d91fdcaf23447d72 |
Author: | stefan |
Category: | javax |
Type: | JavaX fragment (include) |
Public (visible to everyone): | Yes |
Archived (hidden from active list): | No |
Created/modified: | 2019-08-13 14:58:34 |
Source code size: | 2715 bytes / 87 lines |
Pitched / IR pitched: | No / No |
Views / Downloads: | 598 / 626 |
Version history: | 3 change(s) |
Referenced in: | #1006654 - Standard functions list 2 (LIVE, continuation of #761) |