Class WeightedLevenshtein

  • All Implemented Interfaces:
    StringDistance, java.io.Serializable

    @Immutable
    public class WeightedLevenshtein
    extends java.lang.Object
    implements StringDistance
    Implementation of Levenshtein distance that allows different weights to be defined for different character substitutions.
    See Also:
    Serialized Form
    • Constructor Detail

      • WeightedLevenshtein

        public WeightedLevenshtein(CharacterSubstitutionInterface charsub)
        Instantiate with provided character substitution.
        Parameters:
        charsub - The strategy to determine character substitution weights.
      • WeightedLevenshtein

        public WeightedLevenshtein(CharacterSubstitutionInterface charsub,
                                   CharacterInsDelInterface charchange)
        Instantiate with provided character substitution, insertion, and deletion weights.
        Parameters:
        charsub - The strategy to determine character substitution weights.
        charchange - The strategy to determine character insertion / deletion weights.
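As a rough illustration of how the two strategies passed to these constructors might be shaped, the sketch below defines hypothetical stand-in interfaces (`SubstitutionCost`, `InsDelCost`) and a plain dynamic-programming distance that consults them. The method names `cost`, `insertionCost`, and `deletionCost` on these stand-ins are assumptions made for this example; this page does not list the methods of `CharacterSubstitutionInterface` or `CharacterInsDelInterface`, so the real signatures may differ. Treating identical characters as a zero-cost match is also an assumption.

```java
// Hypothetical stand-in for a substitution-weight strategy (name and method assumed).
interface SubstitutionCost {
    double cost(char from, char to);
}

// Hypothetical stand-in for an insertion/deletion-weight strategy (name and methods assumed).
interface InsDelCost {
    double insertionCost(char c);
    double deletionCost(char c);
}

final class WeightedEditSketch {
    private WeightedEditSketch() {}

    /** Full-matrix weighted edit distance that turns s1 into s2. */
    static double distance(String s1, String s2, SubstitutionCost sub, InsDelCost insDel) {
        int n = s1.length();
        int m = s2.length();
        double[][] d = new double[n + 1][m + 1];

        // First column: delete every character of the s1 prefix.
        for (int i = 1; i <= n; i++) {
            d[i][0] = d[i - 1][0] + insDel.deletionCost(s1.charAt(i - 1));
        }
        // First row: insert every character of the s2 prefix.
        for (int j = 1; j <= m; j++) {
            d[0][j] = d[0][j - 1] + insDel.insertionCost(s2.charAt(j - 1));
        }

        for (int i = 1; i <= n; i++) {
            char a = s1.charAt(i - 1);
            for (int j = 1; j <= m; j++) {
                char b = s2.charAt(j - 1);
                // Assumption: identical characters cost nothing to "substitute".
                double substitute = d[i - 1][j - 1] + (a == b ? 0.0 : sub.cost(a, b));
                double delete = d[i - 1][j] + insDel.deletionCost(a);
                double insert = d[i][j - 1] + insDel.insertionCost(b);
                d[i][j] = Math.min(substitute, Math.min(delete, insert));
            }
        }
        return d[n][m];
    }

    public static void main(String[] args) {
        // Replacing 't' with 'r' is treated as a cheap, likely typo; everything else costs 1.
        SubstitutionCost sub = (from, to) -> (from == 't' && to == 'r') ? 0.5 : 1.0;
        InsDelCost unit = new InsDelCost() {
            public double insertionCost(char c) { return 1.0; }
            public double deletionCost(char c) { return 1.0; }
        };
        System.out.println(distance("String1", "Srring2", sub, unit));
    }
}
```

With the real class, the equivalent strategies would be supplied as implementations of `CharacterSubstitutionInterface` and `CharacterInsDelInterface` and passed to the two-argument constructor.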
    • Method Detail

      • distance

        public final double distance(java.lang.String s1,
                                     java.lang.String s2)
        Equivalent to distance(s1, s2, Double.MAX_VALUE).
        Specified by:
        distance in interface StringDistance
        Returns:
        The computed weighted Levenshtein distance.
      • distance

        public final double distance(java.lang.String s1,
                                     java.lang.String s2,
                                     double limit)
        Compute the weighted Levenshtein distance between s1 and s2, using the substitution weights (and the insertion / deletion weights, if any) provided at construction.
        Parameters:
        s1 - The first string to compare.
        s2 - The second string to compare.
        limit - The maximum result to compute before stopping. This means that the calculation can terminate early if you only care about strings with a certain similarity. Set this to Double.MAX_VALUE if you want to run the calculation to completion in every case.
        Returns:
        The computed weighted Levenshtein distance.
        Throws:
        java.lang.NullPointerException - if s1 or s2 is null.
      • insertionCost

        private double insertionCost(char c)
      • deletionCost

        private double deletionCost(char c)
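The early termination described for the `limit` parameter can be sketched as follows. This is a minimal two-row illustration, not the library's source: `CharCost` and `BoundedDistanceSketch` are hypothetical names, insertion and deletion are fixed at a cost of 1.0 for brevity, and returning `limit` itself once every entry of a row has reached it is an assumption, since this page does not say which value is returned on early exit. The check relies on all weights being non-negative, in which case the final distance can never be smaller than the minimum of any completed row.

```java
// Hypothetical stand-in for a substitution-weight strategy (name and method assumed).
interface CharCost {
    double cost(char from, char to);
}

final class BoundedDistanceSketch {
    private BoundedDistanceSketch() {}

    /**
     * Weighted edit distance with an early exit. Insertions and deletions
     * cost 1.0 here; only substitutions are weighted.
     */
    static double distance(String s1, String s2, double limit, CharCost sub) {
        if (s1 == null || s2 == null) {
            throw new NullPointerException("s1 and s2 must not be null");
        }
        int m = s2.length();
        double[] prev = new double[m + 1];
        double[] curr = new double[m + 1];

        // Row 0: build each prefix of s2 by insertions alone.
        for (int j = 0; j <= m; j++) {
            prev[j] = j;
        }

        for (int i = 0; i < s1.length(); i++) {
            char a = s1.charAt(i);
            curr[0] = prev[0] + 1.0; // delete a
            double rowMin = curr[0];

            for (int j = 0; j < m; j++) {
                char b = s2.charAt(j);
                // Assumption: identical characters are a zero-cost match.
                double substitute = prev[j] + (a == b ? 0.0 : sub.cost(a, b));
                double delete = prev[j + 1] + 1.0;
                double insert = curr[j] + 1.0;
                curr[j + 1] = Math.min(substitute, Math.min(delete, insert));
                rowMin = Math.min(rowMin, curr[j + 1]);
            }

            // With non-negative weights the result cannot drop below rowMin,
            // so once the whole row has reached the limit we can stop.
            if (rowMin >= limit) {
                return limit; // assumed return value on early exit
            }

            double[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[m];
    }
}
```

Calling the sketch with `Double.MAX_VALUE` as the limit disables the early exit for any realistic input, which mirrors the documented relationship between the two `distance` overloads.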