Not logged in.  Login/Logout/Register | List snippets | | Create snippet | Upload image | Upload data

85
LINES

< > BotCompany Repo | #1007596 // leven_limited - limited Levenshtein distance. returns threshold+1 if distance > threshold

JavaX fragment (include)

static int leven_limited(S left, S right, int threshold) {
  if (--threshold < 0) ret 0;

  int n = left.length(); // length of left
  int m = right.length(); // length of right

  // if one string is empty, the edit distance is necessarily the length
  // of the other
  if (n == 0) {
      return m <= threshold ? m : threshold+1;
  } else if (m == 0) {
      return n <= threshold ? n : threshold+1;
  }

  if (n > m) {
      // swap the two strings to consume less memory
      final S tmp = left;
      left = right;
      right = tmp;
      n = m;
      m = right.length();
  }

  int[] p = new int[n + 1]; // 'previous' cost array, horizontally
  int[] d = new int[n + 1]; // cost array, horizontally
  int[] tempD; // placeholder to assist in swapping p and d

  // fill in starting table values
  final int boundary = Math.min(n, threshold) + 1;
  for (int i = 0; i < boundary; i++) {
      p[i] = i;
  }
  // these fills ensure that the value above the rightmost entry of our
  // stripe will be ignored in following loop iterations
  Arrays.fill(p, boundary, p.length, Integer.MAX_VALUE);
  Arrays.fill(d, Integer.MAX_VALUE);

  // iterates through t - USE LONG to fix JVM SAFE POINTS.
  for (long _j = 1; _j <= m; _j++) {
      int j = (int) _j;
      final char rightJ = right.charAt(j - 1); // jth character of right
      d[0] = j;

      // compute stripe indices, constrain to array size
      final int min = Math.max(1, j - threshold);
      final int max = j > Integer.MAX_VALUE - threshold ? n : Math.min(
              n, j + threshold);

      // the stripe may lead off of the table if s and t are of different
      // sizes
      if (min > max) {
          return threshold+1;
      }

      // ignore entry left of leftmost
      if (min > 1) {
          d[min - 1] = Integer.MAX_VALUE;
      }

      // iterates through [min, max] in s
      for (int i = min; i <= max; i++) {
          if (left.charAt(i - 1) == rightJ) {
              // diagonally left and up
              d[i] = p[i - 1];
          } else {
              // 1 + minimum of cell to the left, to the top, diagonally
              // left and up
              d[i] = 1 + Math.min(Math.min(d[i - 1], p[i]), p[i - 1]);
          }
      }

      // copy current distance counts to 'previous row' distance counts
      tempD = p;
      p = d;
      d = tempD;
  }

  // if p[n] is greater than the threshold, there's no guarantee on it
  // being the correct
  // distance
  if (p[n] <= threshold) {
      return p[n];
  }
  return threshold+1;
}

download  show line numbers  debug dex  old transpilations   

Travelled to 14 computer(s): aoiabmzegqzx, bhatertpkbcr, cbybwowwnfue, cfunsshuasjs, gwrvuhgaqvyk, irmadwmeruwu, ishqpsrjomds, lpdgvwnxivlt, mqqgnosmbjvj, pyentgdyhuwx, pzhvpgtvlbxg, tslmcundralx, tvejysmllsmz, vouqrxazstgt

No comments. add comment

Snippet ID: #1007596
Snippet name: leven_limited - limited Levenshtein distance. returns threshold+1 if distance > threshold
Eternal ID of this version: #1007596/7
Text MD5: 150a5f904262dffe602d47114dcf0e51
Author: stefan
Category: javax
Type: JavaX fragment (include)
Public (visible to everyone): Yes
Archived (hidden from active list): No
Created/modified: 2018-09-23 14:03:46
Source code size: 2625 bytes / 85 lines
Pitched / IR pitched: No / No
Views / Downloads: 498 / 533
Version history: 6 change(s)
Referenced in: [show references]