Uses of Interface
info.debatty.java.stringsimilarity.interfaces.StringDistance

Packages that use StringDistance
  • Uses of StringDistance in info.debatty.java.stringsimilarity

    Modifier and Type
    Class
    Description
    class 
    The similarity between the two strings is the cosine of the angle between their two vector representations.
    class 
    Implementation of Damerau-Levenshtein distance with transposition (also sometimes called unrestricted Damerau-Levenshtein distance).
    class 
    Each input string is converted into a set of n-grams; the Jaccard index is then computed as |V1 inter V2| / |V1 union V2|.
    class 
    The Jaro–Winkler distance metric is designed and best suited for short strings such as person names, and to detect typos; it is (roughly) a variation of Damerau-Levenshtein, where the substitution of 2 close characters is considered less important than the substitution of 2 characters that are far from each other.
    class 
    The Levenshtein distance between two words is the minimum number of single-character edits (insertions, deletions or substitutions) required to change one string into the other.
    class 
    The longest common subsequence (LCS) problem consists in finding the longest subsequence common to two (or more) sequences.
    class 
    Distance metric based on Longest Common Subsequence, from the notes "An LCS-based string metric" by Daniel Bakkelund.
    class 
    N-Gram Similarity as defined by Kondrak, "N-Gram Similarity and Distance", String Processing and Information Retrieval, Lecture Notes in Computer Science Volume 3772, 2005, pp 115-126.
    class 
    This distance is computed as the Levenshtein distance divided by the length of the longest string.
    final class 
    Implementation of the Optimal String Alignment (sometimes called the restricted edit distance) variant of the Damerau-Levenshtein distance.
    class 
    Q-gram distance, as defined by Ukkonen in "Approximate string-matching with q-grams and maximal matches".
    class 
    Ratcliff/Obershelp pattern recognition: the Ratcliff/Obershelp algorithm computes the similarity of two strings as the doubled number of matching characters divided by the total number of characters in the two strings.
    class 
    Similar to Jaccard index, but this time the similarity is computed as 2 * |V1 inter V2| / (|V1| + |V2|).
    class 
    Implementation of Levenshtein that allows defining different weights for different character substitutions.
  • Uses of StringDistance in info.debatty.java.stringsimilarity.experimental

    Modifier and Type
    Class
    Description
    class 
    Sift4 - a general purpose string distance algorithm inspired by JaroWinkler and Longest Common Subsequence.
  • Uses of StringDistance in info.debatty.java.stringsimilarity.interfaces

    Modifier and Type
    Interface
    Description
    interface 
    String distances that implement this interface are metrics.
    interface 
    Normalized string similarities return a similarity between 0.0 and 1.0.
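
Several of the descriptions above state their formula directly: the Levenshtein edit count, its normalization by the length of the longest string, and the n-gram set ratios |V1 inter V2| / |V1 union V2| and 2 * |V1 inter V2| / (|V1| + |V2|). The following is a minimal, self-contained sketch of those four formulas, written from the descriptions alone. It is not the library's code: the class name StringDistanceSketch, the method names, and the handling of empty inputs are assumptions made for illustration, and the library's own classes may treat edge cases differently.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration class; not part of info.debatty.java.stringsimilarity.
final class StringDistanceSketch {

    // Minimum number of single-character insertions, deletions or
    // substitutions required to change s1 into s2 (two-row dynamic programming).
    static int levenshtein(String s1, String s2) {
        int[] prev = new int[s2.length() + 1];
        int[] curr = new int[s2.length() + 1];
        for (int j = 0; j <= s2.length(); j++) {
            prev[j] = j; // turning "" into the first j characters of s2 takes j insertions
        }
        for (int i = 1; i <= s1.length(); i++) {
            curr[0] = i; // turning the first i characters of s1 into "" takes i deletions
            for (int j = 1; j <= s2.length(); j++) {
                int cost = s1.charAt(i - 1) == s2.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(
                        Math.min(curr[j - 1] + 1, prev[j] + 1),
                        prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[s2.length()];
    }

    // Levenshtein distance divided by the length of the longest string.
    // Assumption: two empty strings are at distance 0.0.
    static double normalizedLevenshtein(String s1, String s2) {
        int longest = Math.max(s1.length(), s2.length());
        if (longest == 0) {
            return 0.0;
        }
        return (double) levenshtein(s1, s2) / longest;
    }

    // Set of all contiguous substrings of length n.
    static Set<String> ngrams(String s, int n) {
        Set<String> grams = new HashSet<>();
        for (int i = 0; i + n <= s.length(); i++) {
            grams.add(s.substring(i, i + n));
        }
        return grams;
    }

    // Jaccard index: |V1 inter V2| / |V1 union V2|.
    // Assumption: two empty n-gram sets count as identical (1.0).
    static double jaccard(String s1, String s2, int n) {
        Set<String> v1 = ngrams(s1, n);
        Set<String> v2 = ngrams(s2, n);
        Set<String> inter = new HashSet<>(v1);
        inter.retainAll(v2);
        Set<String> union = new HashSet<>(v1);
        union.addAll(v2);
        return union.isEmpty() ? 1.0 : (double) inter.size() / union.size();
    }

    // Sorensen-Dice coefficient: 2 * |V1 inter V2| / (|V1| + |V2|).
    // Assumption: two empty n-gram sets count as identical (1.0).
    static double sorensenDice(String s1, String s2, int n) {
        Set<String> v1 = ngrams(s1, n);
        Set<String> v2 = ngrams(s2, n);
        Set<String> inter = new HashSet<>(v1);
        inter.retainAll(v2);
        int total = v1.size() + v2.size();
        return total == 0 ? 1.0 : 2.0 * inter.size() / total;
    }
}
```

For "kitten" and "sitting" the edit count is 3 (two substitutions and one insertion), so the normalized value is 3/7. For "ABCDE" and "ABCDF" with n = 2, the two bigram sets share 3 of 5 distinct bigrams, giving a Jaccard index of 0.6 and a Sorensen-Dice coefficient of 0.75. Both normalized results stay within 0.0 and 1.0, matching the range stated for normalized similarities above.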