Class Cosine

    • Constructor Summary

      Constructors 
      Constructor Description
      Cosine()
      Implements Cosine Similarity between strings.
      Cosine​(int k)
      Implements Cosine Similarity between strings.
    • Method Summary

      Modifier and Type Method Description
      double distance​(java.lang.String s1, java.lang.String s2)
      Return 1.0 - similarity.
      private static double dotProduct​(java.util.Map<java.lang.String,​java.lang.Integer> profile1, java.util.Map<java.lang.String,​java.lang.Integer> profile2)  
      private static double norm​(java.util.Map<java.lang.String,​java.lang.Integer> profile)
      Compute the L2 norm: sqrt(Sum_i(v_i²)).
      double similarity​(java.lang.String s1, java.lang.String s2)
      Compute the cosine similarity between strings.
      double similarity​(java.util.Map<java.lang.String,​java.lang.Integer> profile1, java.util.Map<java.lang.String,​java.lang.Integer> profile2)
      Compute similarity between precomputed profiles.
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Constructor Detail

      • Cosine

        public Cosine​(int k)
        Implements Cosine Similarity between strings. Each string is first transformed into a vector of occurrence counts of its k-shingles (sequences of k characters). In this n-dimensional space, the similarity between two strings is the cosine of the angle between their vectors.
        Parameters:
        k - The shingle length, i.e. the number of characters in each shingle.
      • Cosine

        public Cosine()
        Implements Cosine Similarity between strings. Each string is first transformed into a vector of occurrence counts of its k-shingles (sequences of k characters). In this n-dimensional space, the similarity between two strings is the cosine of the angle between their vectors. The default k is 3.
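        The shingling step described above can be sketched as follows. This is a minimal illustration, not the library's own code: the class ShingleProfileSketch and its profile method are hypothetical names, and the sketch assumes no whitespace or case normalization is applied before counting.

```java
import java.util.HashMap;
import java.util.Map;

class ShingleProfileSketch {

    // Count every k-character substring (k-shingle) of s.
    // Hypothetical helper: the input is used as-is, without normalization.
    static Map<String, Integer> profile(String s, int k) {
        if (k <= 0) {
            throw new IllegalArgumentException("k must be positive");
        }
        Map<String, Integer> counts = new HashMap<>();
        // A string of length n has n - k + 1 shingles (none if n < k).
        for (int i = 0; i + k <= s.length(); i++) {
            counts.merge(s.substring(i, i + k), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // "banana" with k = 3 yields the shingles ban, ana, nan, ana,
        // so "ana" is counted twice and the other two once each.
        System.out.println(profile("banana", 3));
    }
}
```

        Each distinct shingle is one dimension of the vector space, so the map is a sparse representation of the occurrence vector.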
    • Method Detail

      • similarity

        public final double similarity​(java.lang.String s1,
                                       java.lang.String s2)
        Compute the cosine similarity between strings.
        Specified by:
        similarity in interface StringSimilarity
        Parameters:
        s1 - The first string to compare.
        s2 - The second string to compare.
        Returns:
        The cosine similarity in the range [0, 1]
        Throws:
        java.lang.NullPointerException - if s1 or s2 is null.
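        To make the computation concrete, here is a self-contained sketch of cosine similarity over k-shingle counts. The class CosineSketch and its helpers are hypothetical and only approximate what the documented method does. In particular, returning 0 when a string is shorter than k (empty profile, zero norm) is an assumption of this sketch.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

class CosineSketch {

    // Cosine of the angle between the k-shingle count vectors of s1 and s2.
    static double similarity(String s1, String s2, int k) {
        Objects.requireNonNull(s1, "s1 must not be null");
        Objects.requireNonNull(s2, "s2 must not be null");
        Map<String, Integer> p1 = profile(s1, k);
        Map<String, Integer> p2 = profile(s2, k);
        double norms = norm(p1) * norm(p2);
        // Assumption: an empty profile has norm 0; report similarity 0
        // instead of dividing by zero.
        return norms == 0.0 ? 0.0 : dot(p1, p2) / norms;
    }

    // Occurrence count of every k-character substring of s.
    static Map<String, Integer> profile(String s, int k) {
        Map<String, Integer> counts = new HashMap<>();
        for (int i = 0; i + k <= s.length(); i++) {
            counts.merge(s.substring(i, i + k), 1, Integer::sum);
        }
        return counts;
    }

    // Sum of products of counts for shingles present in both profiles.
    static double dot(Map<String, Integer> a, Map<String, Integer> b) {
        double sum = 0.0;
        for (Map.Entry<String, Integer> e : a.entrySet()) {
            sum += e.getValue().doubleValue() * b.getOrDefault(e.getKey(), 0);
        }
        return sum;
    }

    // L2 norm of the count vector.
    static double norm(Map<String, Integer> p) {
        double sumOfSquares = 0.0;
        for (int count : p.values()) {
            sumOfSquares += (double) count * count;
        }
        return Math.sqrt(sumOfSquares);
    }

    public static void main(String[] args) {
        // "ABC"  -> {AB=1, BC=1}, "ABCE" -> {AB=1, BC=1, CE=1}
        // dot = 2, norms = sqrt(2) * sqrt(3), similarity = 2 / sqrt(6), about 0.8165
        System.out.println(similarity("ABC", "ABCE", 2));
    }
}
```

        Because shingle counts are never negative, the dot product is never negative either, which is why the result stays within [0, 1] up to floating-point rounding.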
      • norm

        private static double norm​(java.util.Map<java.lang.String,​java.lang.Integer> profile)
        Compute the L2 norm: sqrt(Sum_i(v_i²)).
        Parameters:
        profile - Map from each k-shingle to its number of occurrences.
        Returns:
        The L2 norm of the profile.
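        As an illustration of the formula, the following sketch computes the L2 norm of a shingle profile. NormSketch is a hypothetical class written for this example. The real method is private, and its body is not shown in this documentation.

```java
import java.util.Map;

class NormSketch {

    // sqrt of the sum of squared occurrence counts.
    static double norm(Map<String, Integer> profile) {
        double sumOfSquares = 0.0;
        for (int count : profile.values()) {
            // Widen to double before multiplying to avoid int overflow.
            sumOfSquares += (double) count * count;
        }
        return Math.sqrt(sumOfSquares);
    }

    public static void main(String[] args) {
        // Profile of "banana" for k = 3: sqrt(1 + 4 + 1) = sqrt(6), about 2.449
        System.out.println(norm(Map.of("ban", 1, "ana", 2, "nan", 1)));
    }
}
```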
      • dotProduct

        private static double dotProduct​(java.util.Map<java.lang.String,​java.lang.Integer> profile1,
                                         java.util.Map<java.lang.String,​java.lang.Integer> profile2)
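        This method has no description in the generated documentation. Judging by its name and signature, it presumably returns the dot product of two shingle profiles: the sum, over shingles present in both maps, of the product of their counts. The sketch below works under that assumption. DotProductSketch is a hypothetical class, and iterating over the smaller map is a design choice of this example, not a documented behavior.

```java
import java.util.Map;

class DotProductSketch {

    static double dotProduct(Map<String, Integer> profile1,
                             Map<String, Integer> profile2) {
        // Iterate over the smaller map and look up in the larger one,
        // so the cost is proportional to the smaller profile.
        Map<String, Integer> small =
                profile1.size() <= profile2.size() ? profile1 : profile2;
        Map<String, Integer> large = (small == profile1) ? profile2 : profile1;

        double sum = 0.0;
        for (Map.Entry<String, Integer> e : small.entrySet()) {
            Integer other = large.get(e.getKey());
            if (other != null) {
                // Shingles missing from either profile contribute 0.
                sum += e.getValue().doubleValue() * other;
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        // Shared shingles AB and BC contribute 1 * 1 each; CE has no match.
        System.out.println(dotProduct(
                Map.of("AB", 1, "BC", 1),
                Map.of("AB", 1, "BC", 1, "CE", 1)));
    }
}
```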
      • distance

        public final double distance​(java.lang.String s1,
                                     java.lang.String s2)
        Return 1.0 - similarity.
        Specified by:
        distance in interface StringDistance
        Parameters:
        s1 - The first string to compare.
        s2 - The second string to compare.
        Returns:
        1.0 minus the cosine similarity; the result lies in the range [0, 1].
        Throws:
        java.lang.NullPointerException - if s1 or s2 is null.
      • similarity

        public final double similarity​(java.util.Map<java.lang.String,​java.lang.Integer> profile1,
                                       java.util.Map<java.lang.String,​java.lang.Integer> profile2)
        Compute similarity between precomputed profiles.
        Parameters:
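        profile1 - The precomputed shingle profile of the first string.
        profile2 - The precomputed shingle profile of the second string.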
        Returns: