Class TrimSuffixEncoder

  • All Implemented Interfaces:
    ISequenceEncoder

    public class TrimSuffixEncoder
    extends java.lang.Object
    implements ISequenceEncoder
    Encodes dst relative to src by trimming the suffix of src that differs from dst. The encoded output is the byte sequence:
     {K}{suffix}
     
    where K is a single byte such that (K - 'A') bytes are trimmed from the end of src, and suffix is then appended to the remaining bytes to produce dst.

    Examples:

     src: foo
     dst: foobar
     encoded: Abar
     
     src: foo
     dst: bar
     encoded: Dbar
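
    The scheme above can be illustrated with a minimal, standalone sketch. It works on plain byte arrays instead of ByteBuffer, and the class and method names (TrimSuffixSketch, encode, decode) are hypothetical; this is not the library's implementation. It also assumes the trim count fits in a single byte and does not model the REMOVE_EVERYTHING case.

```java
import java.nio.charset.StandardCharsets;

// Illustrative sketch only; names are hypothetical and not part of the documented API.
final class TrimSuffixSketch {
    /** Encodes dst relative to src as {K}{suffix}. */
    static byte[] encode(byte[] src, byte[] dst) {
        // Length of the prefix shared by src and dst.
        int shared = 0;
        int max = Math.min(src.length, dst.length);
        while (shared < max && src[shared] == dst[shared]) {
            shared++;
        }
        // Number of bytes to trim from the end of src.
        int trim = src.length - shared;
        // Assumption: K must fit in one unsigned byte. The documented class has a
        // REMOVE_EVERYTHING constant for large values; that case is not modeled here.
        if ('A' + trim > 0xFF) {
            throw new IllegalArgumentException("trim count too large for this sketch: " + trim);
        }
        byte[] out = new byte[1 + dst.length - shared];
        out[0] = (byte) ('A' + trim);
        System.arraycopy(dst, shared, out, 1, dst.length - shared);
        return out;
    }

    /** Rebuilds dst from src and the {K}{suffix} code. */
    static byte[] decode(byte[] src, byte[] encoded) {
        int trim = (encoded[0] & 0xFF) - 'A';
        int keep = src.length - trim;
        byte[] out = new byte[keep + encoded.length - 1];
        System.arraycopy(src, 0, out, 0, keep);
        System.arraycopy(encoded, 1, out, keep, encoded.length - 1);
        return out;
    }

    public static void main(String[] args) {
        byte[] src = "foo".getBytes(StandardCharsets.US_ASCII);
        byte[] dst = "foobar".getBytes(StandardCharsets.US_ASCII);
        byte[] code = encode(src, dst);
        System.out.println(new String(code, StandardCharsets.US_ASCII));              // Abar
        System.out.println(new String(decode(src, code), StandardCharsets.US_ASCII)); // foobar
    }
}
```

    Both documented examples follow from this: "foo" and "foobar" share three bytes, so nothing is trimmed (K = 'A'), while "foo" and "bar" share none, so all three bytes are trimmed (K = 'D').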
     
    • Field Summary

      Fields 
      Modifier and Type Field Description
      private static int REMOVE_EVERYTHING
      Maximum encodable single-byte code.
    • Method Summary

      All Methods Instance Methods Concrete Methods 
      Modifier and Type Method Description
      java.nio.ByteBuffer decode(java.nio.ByteBuffer reuse, java.nio.ByteBuffer source, java.nio.ByteBuffer encoded)
      Decodes encoded relative to source, optionally reusing the provided ByteBuffer.
      java.nio.ByteBuffer encode(java.nio.ByteBuffer reuse, java.nio.ByteBuffer source, java.nio.ByteBuffer target)
      Encodes target relative to source, optionally reusing the provided ByteBuffer.
      int prefixBytes()
      The number of prefix bytes in the encoded form that should be ignored (needed for separator lookup).
      java.lang.String toString()  
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
    • Field Detail

      • REMOVE_EVERYTHING

        private static final int REMOVE_EVERYTHING
        Maximum encodable single-byte code.
        See Also:
        Constant Field Values
    • Constructor Detail

      • TrimSuffixEncoder

        public TrimSuffixEncoder()
    • Method Detail

      • encode

        public java.nio.ByteBuffer encode(java.nio.ByteBuffer reuse,
                                          java.nio.ByteBuffer source,
                                          java.nio.ByteBuffer target)
        Description copied from interface: ISequenceEncoder
        Encodes target relative to source, optionally reusing the provided ByteBuffer.
        Specified by:
        encode in interface ISequenceEncoder
        Parameters:
        reuse - The ByteBuffer to reuse for the result; a new one is allocated if it does not have enough remaining space.
        source - The source byte sequence.
        target - The target byte sequence to encode relative to source.
        Returns:
        The ByteBuffer containing the encoded target.
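
        The reuse contract described above (use the supplied buffer when it is large enough, otherwise allocate) might be sketched as follows. The helper name clearOrAllocate is hypothetical, the capacity-based check is one possible reading of "enough remaining space", and accepting a null reuse argument is an assumption; the library's actual behavior may differ.

```java
import java.nio.ByteBuffer;

// Illustrative sketch only; not part of the documented API.
final class ReuseBufferSketch {
    /**
     * Returns a cleared buffer with room for at least 'needed' bytes:
     * the supplied one if it is large enough, otherwise a new allocation.
     */
    static ByteBuffer clearOrAllocate(ByteBuffer reuse, int needed) {
        // Assumption: a null reuse argument simply means "allocate a fresh buffer".
        if (reuse == null || reuse.capacity() < needed) {
            return ByteBuffer.allocate(needed);
        }
        reuse.clear(); // position = 0, limit = capacity; contents are overwritten by the caller
        return reuse;
    }
}
```

        Because the returned buffer is not guaranteed to be the one passed in, callers following this pattern should always continue with the returned reference.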
      • prefixBytes

        public int prefixBytes()
        Description copied from interface: ISequenceEncoder
        The number of prefix bytes in the encoded form that should be ignored (needed for separator lookup). This is an ugly workaround for GH-85. The proper fix requires knowing in advance whether the dictionary contains tags, which would allow scanning for the separator right-to-left.
        Specified by:
        prefixBytes in interface ISequenceEncoder
        See Also:
        "https://github.com/morfologik/morfologik-stemming/issues/85"
      • decode

        public java.nio.ByteBuffer decode(java.nio.ByteBuffer reuse,
                                          java.nio.ByteBuffer source,
                                          java.nio.ByteBuffer encoded)
        Description copied from interface: ISequenceEncoder
        Decodes encoded relative to source, optionally reusing the provided ByteBuffer.
        Specified by:
        decode in interface ISequenceEncoder
        Parameters:
        reuse - The ByteBuffer to reuse for the result; a new one is allocated if it does not have enough remaining space.
        source - The source byte sequence.
        encoded - The previously encoded byte sequence.
        Returns:
        The ByteBuffer containing the decoded target.
      • toString

        public java.lang.String toString()
        Overrides:
        toString in class java.lang.Object