Class Jaccard

    • Constructor Summary

      Constructors 
      Constructor Description
      Jaccard()
      Creates an instance with the default shingle size (k = 3). Strings are first transformed into sets of k-shingles (sequences of k characters); the Jaccard index is then computed as |A ∩ B| / |A ∪ B|.
      Jaccard(int k)
      Creates an instance with shingle size k. Strings are first transformed into sets of k-shingles (sequences of k characters); the Jaccard index is then computed as |A ∩ B| / |A ∪ B|.
    • Method Summary

      Modifier and Type Method Description
      double distance(java.lang.String s1, java.lang.String s2)
      Computes the distance as 1 - similarity.
      double similarity(java.lang.String s1, java.lang.String s2)
      Computes the Jaccard index: |A ∩ B| / |A ∪ B|.
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Constructor Detail

      • Jaccard

        public Jaccard(int k)
        Strings are first transformed into sets of k-shingles (sequences of k characters); the Jaccard index is then computed as |A ∩ B| / |A ∪ B|. The default value of k is 3.
        Parameters:
        k - the shingle size, i.e. the number of characters in each shingle.
      • Jaccard

        public Jaccard()
        Strings are first transformed into sets of k-shingles (sequences of k characters); the Jaccard index is then computed as |A ∩ B| / |A ∪ B|. This constructor uses the default value of k, which is 3.
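
As a rough illustration of the shingling step described above, the following sketch splits a string into its set of k-shingles. The names `ShingleSketch` and `shingles` are hypothetical and not part of this API. The sketch assumes plain overlapping character windows with no normalization, so the library's own preprocessing may differ.

```java
import java.util.LinkedHashSet;
import java.util.Set;

class ShingleSketch {
    // Hypothetical helper (not part of the documented API):
    // returns the set of all overlapping substrings of length k.
    static Set<String> shingles(String s, int k) {
        if (k <= 0) {
            throw new IllegalArgumentException("k must be positive");
        }
        // LinkedHashSet keeps first-seen order, which makes the output easy to read.
        Set<String> result = new LinkedHashSet<>();
        for (int i = 0; i + k <= s.length(); i++) {
            result.add(s.substring(i, i + k));
        }
        return result;
    }

    public static void main(String[] args) {
        // "ana" occurs twice in "banana" but is stored once, because the result is a set.
        System.out.println(shingles("banana", 3)); // prints [ban, ana, nan]
    }
}
```

Note that in this sketch a string shorter than k produces an empty set.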
    • Method Detail

      • similarity

        public final double similarity(java.lang.String s1,
                                       java.lang.String s2)
        Computes the Jaccard index: |A ∩ B| / |A ∪ B|.
        Specified by:
        similarity in interface StringSimilarity
        Parameters:
        s1 - The first string to compare.
        s2 - The second string to compare.
        Returns:
        The Jaccard index, in the range [0, 1].
        Throws:
        java.lang.NullPointerException - if s1 or s2 is null.
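
The formula |A inter B| / |A union B| can be sketched in a few lines of set arithmetic. This is a minimal illustration under stated assumptions, not the library's implementation: `JaccardSketch`, `shingles`, and the three-argument `similarity` are hypothetical names, shingles are plain character windows, and returning 1.0 when both shingle sets are empty is a choice made for this sketch only.

```java
import java.util.HashSet;
import java.util.Set;

class JaccardSketch {
    // Hypothetical helper: set of overlapping k-character substrings of s.
    static Set<String> shingles(String s, int k) {
        Set<String> result = new HashSet<>();
        for (int i = 0; i + k <= s.length(); i++) {
            result.add(s.substring(i, i + k));
        }
        return result;
    }

    // Jaccard index of the two shingle sets: |A inter B| / |A union B|.
    static double similarity(String s1, String s2, int k) {
        Set<String> a = shingles(s1, k); // throws NullPointerException if s1 is null
        Set<String> b = shingles(s2, k); // throws NullPointerException if s2 is null

        Set<String> union = new HashSet<>(a);
        union.addAll(b);
        if (union.isEmpty()) {
            return 1.0; // assumption of this sketch: two empty sets count as identical
        }

        Set<String> intersection = new HashSet<>(a);
        intersection.retainAll(b);

        return (double) intersection.size() / union.size();
    }

    public static void main(String[] args) {
        // A = {ABC, BCD, CDE}, B = {ABC, BCD, CDF}: 2 shared shingles out of 4 distinct.
        System.out.println(similarity("ABCDE", "ABCDF", 3)); // prints 0.5
    }
}
```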
      • distance

        public final double distance(java.lang.String s1,
                                     java.lang.String s2)
        Distance is computed as 1 - similarity.
        Specified by:
        distance in interface MetricStringDistance
        Specified by:
        distance in interface StringDistance
        Parameters:
        s1 - The first string to compare.
        s2 - The second string to compare.
        Returns:
        The Jaccard distance, 1 - similarity, in the range [0, 1].
        Throws:
        java.lang.NullPointerException - if s1 or s2 is null.
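
The relationship distance = 1 - similarity, together with the documented null check, might look like the sketch below. All names (`JaccardDistanceSketch`, `shingles`, the three-argument `distance`) are hypothetical. The explicit `Objects.requireNonNull` calls and the 0.0 result for two empty shingle sets are assumptions of this sketch, not confirmed behavior of the class.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

class JaccardDistanceSketch {
    // Hypothetical helper: set of overlapping k-character substrings of s.
    static Set<String> shingles(String s, int k) {
        Set<String> result = new HashSet<>();
        for (int i = 0; i + k <= s.length(); i++) {
            result.add(s.substring(i, i + k));
        }
        return result;
    }

    // Distance computed as 1 - (|A inter B| / |A union B|).
    static double distance(String s1, String s2, int k) {
        Objects.requireNonNull(s1, "s1 must not be null");
        Objects.requireNonNull(s2, "s2 must not be null");

        Set<String> a = shingles(s1, k);
        Set<String> b = shingles(s2, k);
        Set<String> union = new HashSet<>(a);
        union.addAll(b);
        if (union.isEmpty()) {
            return 0.0; // assumption of this sketch: two empty sets are at distance 0
        }
        Set<String> intersection = new HashSet<>(a);
        intersection.retainAll(b);

        double similarity = (double) intersection.size() / union.size();
        return 1.0 - similarity;
    }

    public static void main(String[] args) {
        System.out.println(distance("ABCDE", "ABCDF", 3)); // prints 0.5
        System.out.println(distance("hello", "hello", 3)); // prints 0.0
        System.out.println(distance("abc", "xyz", 3));     // prints 1.0
    }
}
```

Identical inputs give a distance of 0 and inputs with no shared shingles give 1, which matches the [0, 1] range of the similarity.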