Not logged in.  Login/Logout/Register | List snippets | | Create snippet | Upload image | Upload data

46
LINES

< > BotCompany Repo | #1000559 // Text diff function in JavaX (char-based, based on levenshtein distance, developing [maybe])

JavaX source code - run with: x30.jar

// diff format: array of character numbers out of s used to reconstruct t

static int leven(String s, String t) {
  // degenerate cases
  if (s.equals(t)) return 0;
  if (s.length() == 0) return t.length();
  if (t.length() == 0) return s.length();

  // create two work vectors of integer distances
  L<int>[] v0 = new ArrayList<int>[t.length() + 1];
  L<Int>[] v1 = new ArrayList<int>[t.length() + 1];

  // initialize v0 (the previous row of distances)
  // this row is A[0][i]: edit distance for an empty s
  // the distance is just the number of characters to delete from t
  for (int i = 0; i < v0.length; i++)
      v0[i] = i;

  for (int i = 0; i < s.length(); i++)
  {
      // calculate v1 (current row distances) from the previous row v0

      // first element of v1 is A[i+1][0]
      //   edit distance is delete (i+1) chars from s to match empty t
      v1[0] = i + 1;

      // use formula to fill in the rest of the row
      for (int j = 0; j < t.length(); j++)
      {
          int cost = s.charAt(i) == t.charAt(j) ? 0 : 1;
          v1[j + 1] = Math.min(Math.min(v1[j] + 1, v0[j + 1] + 1), v0[j] + cost);
      }

      // copy v1 (current row) to v0 (previous row) for next iteration
      /*for (int j = 0; j < v0.length; j++)
          v0[j] = v1[j];*/
          
      // swap arrays
      int[] v = v1; v1 = v0; v0 = v;
  }

  // for array copying:
  // return v1[t.length()];
  // for array swapping:
  return v0[t.length()];
}

Author comment

Began life as a copy of #1000388

download  show line numbers  debug dex  old transpilations   

Travelled to 13 computer(s): aoiabmzegqzx, bhatertpkbcr, cbybwowwnfue, cfunsshuasjs, gwrvuhgaqvyk, ishqpsrjomds, lpdgvwnxivlt, mqqgnosmbjvj, pyentgdyhuwx, pzhvpgtvlbxg, tslmcundralx, tvejysmllsmz, vouqrxazstgt

No comments. add comment

Snippet ID: #1000559
Snippet name: Text diff function in JavaX (char-based, based on levenshtein distance, developing [maybe])
Eternal ID of this version: #1000559/1
Text MD5: 7574330524ed3d71c7eaec0d1353156a
Author: stefan
Category: javax
Type: JavaX source code
Public (visible to everyone): Yes
Archived (hidden from active list): No
Created/modified: 2015-08-14 14:10:36
Source code size: 1500 bytes / 46 lines
Pitched / IR pitched: No / Yes
Views / Downloads: 581 / 544
Referenced in: [show references]