Class WeightedLevenshtein

java.lang.Object
info.debatty.java.stringsimilarity.WeightedLevenshtein
All Implemented Interfaces:
StringDistance, Serializable

@Immutable public class WeightedLevenshtein extends Object implements StringDistance
Implementation of Levenshtein distance that lets you define different weights for different character substitutions.
  • Field Details

  • Constructor Details

    • WeightedLevenshtein

      public WeightedLevenshtein(CharacterSubstitutionInterface charsub)
      Instantiate with the provided character substitution strategy.
      Parameters:
      charsub - The strategy to determine character substitution weights.
    • WeightedLevenshtein

      public WeightedLevenshtein(CharacterSubstitutionInterface charsub, CharacterInsDelInterface charchange)
      Instantiate with provided character substitution, insertion, and deletion weights.
      Parameters:
      charsub - The strategy to determine character substitution weights.
      charchange - The strategy to determine character insertion / deletion weights.
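      The two strategies described above can be pictured with a small self-contained sketch. It is not the library's code: SubstitutionCost, InsDelCost, and WeightedEditSketch are hypothetical stand-ins whose method names and signatures are assumptions made for illustration, and all costs are assumed to be non-negative.

```java
// Hypothetical stand-ins for the two strategy parameters; the names and
// signatures are assumptions for this sketch, not the library's interfaces.
interface SubstitutionCost {
    double cost(char c1, char c2);
}

interface InsDelCost {
    double insertionCost(char c);
    double deletionCost(char c);
}

final class WeightedEditSketch {
    // Unit cost for every insertion and deletion.
    static final InsDelCost UNIT_INS_DEL = new InsDelCost() {
        public double insertionCost(char c) { return 1.0; }
        public double deletionCost(char c) { return 1.0; }
    };

    // Two-row dynamic program: after processing row i, prev[j] holds the
    // cheapest cost of turning the first i chars of s1 into the first j of s2.
    static double distance(String s1, String s2, SubstitutionCost sub, InsDelCost insDel) {
        double[] prev = new double[s2.length() + 1];
        double[] curr = new double[s2.length() + 1];
        // Row 0: build each prefix of s2 from the empty string by insertions.
        for (int j = 1; j <= s2.length(); j++) {
            prev[j] = prev[j - 1] + insDel.insertionCost(s2.charAt(j - 1));
        }
        for (int i = 1; i <= s1.length(); i++) {
            char c1 = s1.charAt(i - 1);
            // Column 0: delete every character of the s1 prefix.
            curr[0] = prev[0] + insDel.deletionCost(c1);
            for (int j = 1; j <= s2.length(); j++) {
                char c2 = s2.charAt(j - 1);
                double subCost = (c1 == c2) ? 0.0 : sub.cost(c1, c2);
                double best = prev[j - 1] + subCost;                           // substitute or match
                best = Math.min(best, prev[j] + insDel.deletionCost(c1));      // delete c1
                best = Math.min(best, curr[j - 1] + insDel.insertionCost(c2)); // insert c2
                curr[j] = best;
            }
            double[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[s2.length()];
    }

    public static void main(String[] args) {
        // Replacing 't' with 'r' is treated as a cheap, likely typo.
        SubstitutionCost sub = (a, b) -> (a == 't' && b == 'r') ? 0.5 : 1.0;
        System.out.println(distance("String1", "Srring2", sub, UNIT_INS_DEL));
    }
}
```

      In this sketch the substitution strategy is consulted only when the two characters differ; matching characters cost nothing.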
  • Method Details

    • distance

      public final double distance(String s1, String s2)
      Equivalent to distance(s1, s2, Double.MAX_VALUE).
      Specified by:
      distance in interface StringDistance
      Parameters:
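      s1 - The first string to compare.
      s2 - The second string to compare.
      Returns:
      The computed weighted Levenshtein distance.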
    • distance

      public final double distance(String s1, String s2, double limit)
      Compute the Levenshtein distance between s1 and s2 using the provided substitution weights and, if supplied, the insertion and deletion weights.
      Parameters:
      s1 - The first string to compare.
      s2 - The second string to compare.
      limit - The maximum result to compute before stopping. The calculation can terminate early once this value is reached, which helps when you only care about strings within a certain distance of each other. Set this to Double.MAX_VALUE to run the calculation to completion in every case.
      Returns:
      The computed weighted Levenshtein distance.
      Throws:
      NullPointerException - if s1 or s2 is null.
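      One way a limit can cut the computation short is sketched below. This is an illustration under stated assumptions, not the library's implementation: SubCost and LimitedDistanceSketch are hypothetical names, insertions and deletions cost 1.0, and all costs are non-negative. Under those assumptions every path to the final cell passes through each row, so once the smallest value in a row is at or above the limit, the final distance cannot be below it.

```java
// Hypothetical substitution strategy; the name and signature are assumptions.
interface SubCost {
    double cost(char a, char b);
}

final class LimitedDistanceSketch {
    // Weighted edit distance with unit insertion/deletion cost that stops
    // as soon as a whole row of the table is at or above `limit`.
    static double distance(String s1, String s2, SubCost sub, double limit) {
        if (s1 == null || s2 == null) {
            throw new NullPointerException("s1 and s2 must not be null");
        }
        double[] prev = new double[s2.length() + 1];
        double[] curr = new double[s2.length() + 1];
        for (int j = 0; j <= s2.length(); j++) {
            prev[j] = j; // j insertions turn "" into the first j chars of s2
        }
        for (int i = 1; i <= s1.length(); i++) {
            char c1 = s1.charAt(i - 1);
            curr[0] = i; // i deletions turn the first i chars of s1 into ""
            double rowMin = curr[0];
            for (int j = 1; j <= s2.length(); j++) {
                char c2 = s2.charAt(j - 1);
                double subCost = (c1 == c2) ? 0.0 : sub.cost(c1, c2);
                double best = Math.min(prev[j - 1] + subCost,
                        Math.min(prev[j] + 1.0, curr[j - 1] + 1.0));
                curr[j] = best;
                rowMin = Math.min(rowMin, best);
            }
            // With non-negative costs, later rows cannot drop below rowMin,
            // so the remaining rows can be skipped.
            if (rowMin >= limit) {
                return limit;
            }
            double[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[s2.length()];
    }

    public static void main(String[] args) {
        SubCost unit = (a, b) -> 1.0;
        System.out.println(distance("kitten", "sitting", unit, Double.MAX_VALUE));
        System.out.println(distance("abcdef", "uvwxyz", unit, 2.0));
    }
}
```

      Returning the limit itself on early exit is a design choice of this sketch; it signals "at least this far apart" without finishing the table.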
    • insertionCost

      private double insertionCost(char c)
    • deletionCost

      private double deletionCost(char c)