Class RawFieldParser


  • public class RawFieldParser
    extends java.lang.Object

    Low level parser for header field elements. The parsing routines of this class are designed to produce near zero intermediate garbage and make no intermediate copies of input data.

    This class is immutable and thread-safe.

    • Constructor Summary

      Constructors 
      Constructor Description
      RawFieldParser()  
    • Method Summary

      Modifier and Type Method Description
      void copyContent​(ByteSequence buf, ParserCursor cursor, java.util.BitSet delimiters, java.lang.StringBuilder dst)
      Transfers content into the destination buffer until a whitespace character, a comment, or any of the given delimiters is encountered.
      void copyQuotedContent​(ByteSequence buf, ParserCursor cursor, java.lang.StringBuilder dst)
      Transfers content enclosed with quote marks into the destination buffer.
      void copyUnquotedContent​(ByteSequence buf, ParserCursor cursor, java.util.BitSet delimiters, java.lang.StringBuilder dst)
      Transfers content into the destination buffer until a whitespace character, a comment, a quote, or any of the given delimiters is encountered.
      static java.util.BitSet INIT_BITSET​(int... b)  
      RawField parseField​(ByteSequence raw)
      Parses the sequence of bytes into RawField.
      NameValuePair parseParameter​(ByteSequence buf, ParserCursor cursor)
      Parses the sequence of bytes containing a field parameter delimited with semicolon into NameValuePair.
      java.util.List<NameValuePair> parseParameters​(ByteSequence buf, ParserCursor cursor)
      Parses the sequence of bytes containing field parameters delimited with semicolon into a list of NameValuePairs.
      RawBody parseRawBody​(RawField field)
      Parses the field body containing a value with parameters into RawBody.
      RawBody parseRawBody​(ByteSequence buf, ParserCursor cursor)
      Parses the sequence of bytes containing a value with parameters into RawBody.
      java.lang.String parseToken​(ByteSequence buf, ParserCursor cursor, java.util.BitSet delimiters)
      Extracts a token from the sequence of bytes, terminated by any of the given delimiters, discarding semantically insignificant whitespace characters and comments.
      private java.lang.String parseUtf8Filename​(ByteSequence buf)
      Special case for parsing a filename attribute in a nonstandard encoding, such as: Content-Disposition: attachment; filename="УПД ОБЩЕСТВО С ОГРАНИЧЕННОЙ ОТВЕТСТВЕННОСТЬЮ "СТАНЦИЯ ВИРТУАЛЬНАЯ" 01-05-21.pdf"
      java.lang.String parseValue​(ByteSequence buf, ParserCursor cursor, java.util.BitSet delimiters)
      Extracts a value from the sequence of bytes, optionally enclosed in quote marks and terminated by any of the given delimiters, discarding semantically insignificant whitespace characters and comments.
      void skipAllWhiteSpace​(ByteSequence buf, ParserCursor cursor)
      Skips semantically insignificant whitespace characters and comments and moves the cursor to the closest semantically significant non-whitespace character.
      void skipComment​(ByteSequence buf, ParserCursor cursor)
      If the cursor is positioned at the beginning of a comment, skips the comment and moves the cursor past its end.
      void skipWhiteSpace​(ByteSequence buf, ParserCursor cursor)
      Skips semantically insignificant whitespace characters and moves the cursor to the closest non-whitespace character.
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Field Detail

      • COLON

        static final java.util.BitSet COLON
      • EQUAL_OR_SEMICOLON

        static final java.util.BitSet EQUAL_OR_SEMICOLON
      • SEMICOLON

        static final java.util.BitSet SEMICOLON
    • Constructor Detail

      • RawFieldParser

        public RawFieldParser()
    • Method Detail

      • INIT_BITSET

        public static java.util.BitSet INIT_BITSET​(int... b)
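        No description is given for this method. Judging by its signature and by the BitSet constants listed under Field Detail, it presumably builds a delimiter set with one bit set per character code passed in. A minimal sketch under that assumption (the class DelimiterSets and the method initBitSet are hypothetical names used for illustration, not part of the library):

```java
import java.util.BitSet;

final class DelimiterSets {
    // Hypothetical stand-in for INIT_BITSET: sets one bit per character code.
    static BitSet initBitSet(int... chars) {
        BitSet bitSet = new BitSet();
        for (int c : chars) {
            bitSet.set(c);
        }
        return bitSet;
    }

    public static void main(String[] args) {
        // A set comparable in spirit to EQUAL_OR_SEMICOLON (assumed contents).
        BitSet equalOrSemicolon = initBitSet('=', ';');
        System.out.println("contains ';': " + equalOrSemicolon.get(';'));
        System.out.println("contains ':': " + equalOrSemicolon.get(':'));
    }
}
```

        A BitSet indexed by character code makes the delimiter test a single bit lookup per byte, which fits the low-allocation goal stated in the class description.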
      • parseRawBody

        public RawBody parseRawBody​(RawField field)
        Parses the field body containing a value with parameters into RawBody.
        Parameters:
        field - unstructured (raw) field
      • parseRawBody

        public RawBody parseRawBody​(ByteSequence buf,
                                    ParserCursor cursor)
        Parses the sequence of bytes containing a value with parameters into RawBody.
        Parameters:
        buf - buffer with the sequence of bytes to be parsed
        cursor - defines the bounds and current position of the buffer
      • parseParameters

        public java.util.List<NameValuePair> parseParameters​(ByteSequence buf,
                                                             ParserCursor cursor)
        Parses the sequence of bytes containing field parameters delimited with semicolon into a list of NameValuePairs.
        Parameters:
        buf - buffer with the sequence of bytes to be parsed
        cursor - defines the bounds and current position of the buffer
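        To illustrate what a semicolon-delimited parameter list looks like once parsed, here is a simplified, self-contained sketch. It works on a String rather than a ByteSequence, ignores comments, and uses a Map in place of NameValuePair; the class ParameterSketch is hypothetical and is not the library's implementation.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class ParameterSketch {
    // Splits "name=value; name2=\"quoted; value\"" into an ordered map.
    // A parameter without '=' is stored with a null value.
    static Map<String, String> parseParameters(String s) {
        Map<String, String> params = new LinkedHashMap<>();
        int pos = 0;
        while (pos < s.length()) {
            // Name: everything up to '=' or ';'.
            int nameEnd = pos;
            while (nameEnd < s.length() && s.charAt(nameEnd) != '=' && s.charAt(nameEnd) != ';') {
                nameEnd++;
            }
            String name = s.substring(pos, nameEnd).trim();
            String value = null;
            pos = nameEnd;
            if (pos < s.length() && s.charAt(pos) == '=') {
                pos++;
                StringBuilder v = new StringBuilder();
                boolean quoted = false;
                while (pos < s.length()) {
                    char ch = s.charAt(pos);
                    if (quoted && ch == '\\' && pos + 1 < s.length()) {
                        v.append(s.charAt(pos + 1)); // escaped character taken literally
                        pos += 2;
                        continue;
                    }
                    if (ch == '"') {
                        quoted = !quoted;            // quote marks themselves are dropped
                    } else if (ch == ';' && !quoted) {
                        break;                       // unquoted semicolon ends the value
                    } else {
                        v.append(ch);
                    }
                    pos++;
                }
                // Simplification: trim() also strips edge whitespace that was quoted.
                value = v.toString().trim();
            }
            if (pos < s.length() && s.charAt(pos) == ';') {
                pos++; // consume the separator
            }
            if (!name.isEmpty()) {
                params.put(name, value);
            }
        }
        return params;
    }
}
```

        For example, under these assumptions the input charset="utf-8"; format=flowed yields the pairs charset=utf-8 and format=flowed, in that order.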
      • parseParameter

        public NameValuePair parseParameter​(ByteSequence buf,
                                            ParserCursor cursor)
        Parses the sequence of bytes containing a field parameter delimited with semicolon into NameValuePair.
        Parameters:
        buf - buffer with the sequence of bytes to be parsed
        cursor - defines the bounds and current position of the buffer
      • parseToken

        public java.lang.String parseToken​(ByteSequence buf,
                                           ParserCursor cursor,
                                           java.util.BitSet delimiters)
        Extracts a token from the sequence of bytes, terminated by any of the given delimiters, discarding semantically insignificant whitespace characters and comments.
        Parameters:
        buf - buffer with the sequence of bytes to be parsed
        cursor - defines the bounds and current position of the buffer
        delimiters - set of delimiting characters. Can be null if the token is not delimited by any character.
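        The interplay of buffer, cursor and delimiter set can be sketched as follows. This is a simplified illustration, not the library's code: it uses a byte[] instead of ByteSequence, a hypothetical Cursor class instead of ParserCursor, omits comment handling, and assumes that runs of whitespace inside the token collapse to a single space and that the cursor is left on the delimiter.

```java
import java.util.BitSet;

final class TokenSketch {
    // Hypothetical stand-in for ParserCursor: a position plus an exclusive upper bound.
    static final class Cursor {
        int pos;
        final int upperBound;

        Cursor(int pos, int upperBound) {
            this.pos = pos;
            this.upperBound = upperBound;
        }
    }

    static boolean isWhitespace(char ch) {
        return ch == ' ' || ch == '\t' || ch == '\r' || ch == '\n';
    }

    // Reads a token up to (not including) the first delimiter or the upper bound.
    static String parseToken(byte[] buf, Cursor cursor, BitSet delimiters) {
        StringBuilder dst = new StringBuilder();
        boolean pendingSpace = false;
        while (cursor.pos < cursor.upperBound) {
            char ch = (char) (buf[cursor.pos] & 0xff);
            if (delimiters != null && delimiters.get(ch)) {
                break; // stop on the delimiter without consuming it
            }
            if (isWhitespace(ch)) {
                pendingSpace = true; // remember the gap, emit it only between words
            } else {
                if (pendingSpace && dst.length() > 0) {
                    dst.append(' ');
                }
                pendingSpace = false;
                dst.append(ch);
            }
            cursor.pos++;
        }
        return dst.toString();
    }
}
```

        Because the cursor is mutated in place, a caller can inspect the byte at the cursor after the call to see which delimiter ended the token, then advance past it before parsing the next element.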
      • parseValue

        public java.lang.String parseValue​(ByteSequence buf,
                                           ParserCursor cursor,
                                           java.util.BitSet delimiters)
        Extracts a value from the sequence of bytes, optionally enclosed in quote marks and terminated by any of the given delimiters, discarding semantically insignificant whitespace characters and comments.
        Parameters:
        buf - buffer with the sequence of bytes to be parsed
        cursor - defines the bounds and current position of the buffer
        delimiters - set of delimiting characters. Can be null if the value is not delimited by any character.
      • parseUtf8Filename

        private java.lang.String parseUtf8Filename​(ByteSequence buf)
        Special case for parsing a filename attribute in a nonstandard encoding, such as: Content-Disposition: attachment; filename="УПД ОБЩЕСТВО С ОГРАНИЧЕННОЙ ОТВЕТСТВЕННОСТЬЮ "СТАНЦИЯ ВИРТУАЛЬНАЯ" 01-05-21.pdf"
        Parameters:
        buf - raw field
        Returns:
        filename value or null.
      • skipWhiteSpace

        public void skipWhiteSpace​(ByteSequence buf,
                                   ParserCursor cursor)
        Skips semantically insignificant whitespace characters and moves the cursor to the closest non-whitespace character.
        Parameters:
        buf - buffer with the sequence of bytes to be parsed
        cursor - defines the bounds and current position of the buffer
      • skipComment

        public void skipComment​(ByteSequence buf,
                                ParserCursor cursor)
        If the cursor is positioned at the beginning of a comment, skips the comment and moves the cursor past its end. Nested comments and escaped characters are recognized and handled appropriately.
        Parameters:
        buf - buffer with the sequence of bytes to be parsed
        cursor - defines the bounds and current position of the buffer
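        Handling nested comments and escapes comes down to tracking a nesting depth and an escape flag. The sketch below shows one way to do that, assuming parenthesised comments with backslash escapes as in Internet message headers. The class CommentSkipper is hypothetical, and for brevity it takes and returns plain indices rather than mutating a ParserCursor.

```java
final class CommentSkipper {
    // Returns the index just past the comment that starts at pos,
    // or pos unchanged if buf[pos] does not open a comment.
    // An unterminated comment consumes everything up to end.
    static int skipComment(byte[] buf, int pos, int end) {
        if (pos >= end || buf[pos] != '(') {
            return pos;
        }
        int depth = 1;
        int i = pos + 1;
        boolean escaped = false;
        while (i < end && depth > 0) {
            char ch = (char) (buf[i] & 0xff);
            i++;
            if (escaped) {
                escaped = false; // the escaped character is taken literally
            } else if (ch == '\\') {
                escaped = true;
            } else if (ch == '(') {
                depth++;         // a nested comment opens
            } else if (ch == ')') {
                depth--;         // closes the innermost open comment
            }
        }
        return i;
    }
}
```

        With this approach, an escaped closing parenthesis inside a comment does not end it, and the outer comment ends only when every nested comment has been closed.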
      • skipAllWhiteSpace

        public void skipAllWhiteSpace​(ByteSequence buf,
                                      ParserCursor cursor)
        Skips semantically insignificant whitespace characters and comments and moves the cursor to the closest semantically significant non-whitespace character. Nested comments and escaped characters are recognized and handled appropriately.
        Parameters:
        buf - buffer with the sequence of bytes to be parsed
        cursor - defines the bounds and current position of the buffer
      • copyContent

        public void copyContent​(ByteSequence buf,
                                ParserCursor cursor,
                                java.util.BitSet delimiters,
                                java.lang.StringBuilder dst)
        Transfers content into the destination buffer until a whitespace character, a comment, or any of the given delimiters is encountered.
        Parameters:
        buf - buffer with the sequence of bytes to be parsed
        cursor - defines the bounds and current position of the buffer
        delimiters - set of delimiting characters. Can be null if the value is delimited by a whitespace or a comment only.
        dst - destination buffer
      • copyUnquotedContent

        public void copyUnquotedContent​(ByteSequence buf,
                                        ParserCursor cursor,
                                        java.util.BitSet delimiters,
                                        java.lang.StringBuilder dst)
        Transfers content into the destination buffer until a whitespace character, a comment, a quote, or any of the given delimiters is encountered.
        Parameters:
        buf - buffer with the sequence of bytes to be parsed
        cursor - defines the bounds and current position of the buffer
        delimiters - set of delimiting characters. Can be null if the value is delimited by a whitespace, a quote or a comment only.
        dst - destination buffer
      • copyQuotedContent

        public void copyQuotedContent​(ByteSequence buf,
                                      ParserCursor cursor,
                                      java.lang.StringBuilder dst)
        Transfers content enclosed with quote marks into the destination buffer.
        Parameters:
        buf - buffer with the sequence of bytes to be parsed
        cursor - defines the bounds and current position of the buffer
        dst - destination buffer
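        As an illustration of transferring quoted content, the following sketch copies the text between a pair of quote marks into a StringBuilder. The escape rules shown (backslash before a quote or backslash yields that character, other escapes are kept verbatim, CR and LF are dropped) are assumptions made for this example, and the class QuotedContentSketch is hypothetical.

```java
final class QuotedContentSketch {
    // Copies the content of the quoted string starting at pos into dst and
    // returns the index just past the closing quote (or end if unterminated).
    // Returns pos unchanged if buf[pos] is not an opening quote.
    static int copyQuotedContent(byte[] buf, int pos, int end, StringBuilder dst) {
        if (pos >= end || buf[pos] != '"') {
            return pos;
        }
        int i = pos + 1;
        boolean escaped = false;
        while (i < end) {
            char ch = (char) (buf[i] & 0xff);
            i++;
            if (escaped) {
                if (ch != '"' && ch != '\\') {
                    dst.append('\\'); // keep unrecognized escapes verbatim
                }
                dst.append(ch);
                escaped = false;
            } else if (ch == '"') {
                break;                // closing quote reached
            } else if (ch == '\\') {
                escaped = true;
            } else if (ch != '\r' && ch != '\n') {
                dst.append(ch);       // line breaks from folding are dropped
            }
        }
        return i;
    }
}
```

        Appending directly into a caller-supplied StringBuilder avoids creating an intermediate String per quoted segment, in keeping with the low-garbage design described at the top of this page.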