Class Levenshtein
java.lang.Object
info.debatty.java.stringsimilarity.Levenshtein
- All Implemented Interfaces:
MetricStringDistance
,StringDistance
,Serializable
The Levenshtein distance between two words is the minimum number of
single-character edits (insertions, deletions or substitutions) required to
change one string into the other.
- See Also:
-
Constructor Summary
Constructors -
Method Summary
Modifier and TypeMethodDescriptionfinal double
Equivalent to distance(s1, s2, Integer.MAX_VALUE).final double
The Levenshtein distance, or edit distance, between two words is the minimum number of single-character edits (insertions, deletions or substitutions) required to change one word into the other.
-
Constructor Details
-
Levenshtein
public Levenshtein()
-
-
Method Details
-
distance
Equivalent to distance(s1, s2, Integer.MAX_VALUE).- Specified by:
distance
in interfaceMetricStringDistance
- Specified by:
distance
in interfaceStringDistance
- Parameters:
s1
-s2
-- Returns:
-
distance
The Levenshtein distance, or edit distance, between two words is the minimum number of single-character edits (insertions, deletions or substitutions) required to change one word into the other. http://en.wikipedia.org/wiki/Levenshtein_distance It is always at least the difference of the sizes of the two strings. It is at most the length of the longer string. It is zero if and only if the strings are equal. If the strings are the same size, the Hamming distance is an upper bound on the Levenshtein distance. The Levenshtein distance verifies the triangle inequality (the distance between two strings is no greater than the sum Levenshtein distances from a third string). Implementation uses dynamic programming (Wagner–Fischer algorithm), with only 2 rows of data. The space requirement is thus O(m) and the algorithm runs in O(mn).- Parameters:
s1
- The first string to compare.s2
- The second string to compare.limit
- The maximum result to compute before stopping. This means that the calculation can terminate early if you only care about strings with a certain similarity. Set this to Integer.MAX_VALUE if you want to run the calculation to completion in every case.- Returns:
- The computed Levenshtein distance.
- Throws:
NullPointerException
- if s1 or s2 is null.
-