Class SimpleTextParser


  • public class SimpleTextParser
    extends java.lang.Object
    Class providing basic text parsing capabilities. The goals of this class are to (1) provide a simple, flexible API for performing common text parsing operations and (2) provide a mechanism for creating consistent and informative parsing errors. This class is not intended as a replacement for grammar-based parsers and/or lexers.
    • Nested Class Summary

      Nested Classes 
      Modifier and Type Class Description
      private class  SimpleTextParser.StringCollector
      Internal class used to collect strings from the character stream while ensuring that the collected strings do not exceed the maximum configured string length.
    • Field Summary

      Fields 
      Modifier and Type Field Description
      private CharReadBuffer buffer
      Character read buffer used to access the character stream.
      private int columnNumber
      Current character column on the current line; column numbers start at 1.
      private static char CR
      Carriage return character.
      private java.lang.String currentToken
      The current token.
      private int currentTokenColumnNumber
      The column number that the current token started on.
      private int currentTokenLineNumber
      The line number that the current token started on.
      private static int DEFAULT_MAX_STRING_LENGTH
      Default value for the max string length property.
      private static int EOF
      Constant indicating that the end of the input has been reached.
      private boolean hasSetToken
      Flag used to indicate that at least one token has been read from the stream.
      private static int INITIAL_TOKEN_POS
      Initial token position number.
      private static char LF
      Line feed character.
      private int lineNumber
      Current line number; line numbers start counting at 1.
      private int maxStringLength
      Maximum length for strings returned by this instance.
      private static java.util.function.IntConsumer NOOP_CONSUMER
      Int consumer that does nothing.
      private static java.lang.String STRING_LENGTH_ERR_MSG
      Error message used when a string exceeds the configured maximum length.
    • Constructor Summary

      Constructors 
      Constructor Description
      SimpleTextParser​(java.io.Reader reader)
      Construct a new instance that reads characters from the given reader.
      SimpleTextParser​(CharReadBuffer buffer)
      Construct a new instance that reads characters from the given character buffer.
    • Method Summary

      Modifier and Type Method Description
      int choose​(java.lang.String... expected)
      Return the index of the argument that exactly matches the current token.
      int choose​(java.util.List<java.lang.String> expected)
      Return the index of the argument that exactly matches the current token.
      int chooseIgnoreCase​(java.lang.String... expected)
      Return the index of the argument that matches the current token, ignoring case.
      int chooseIgnoreCase​(java.util.List<java.lang.String> expected)
      Return the index of the argument that matches the current token, ignoring case.
      private int chooseInternal​(java.util.List<java.lang.String> expected, boolean caseSensitive, boolean throwOnFailure)
      Internal method to compare the current token with a list of possible strings.
      SimpleTextParser consume​(int len, java.util.function.IntConsumer consumer)
      Consume at most len characters from the stream, passing each to the given consumer.
      SimpleTextParser consume​(java.util.function.IntPredicate pred, java.util.function.IntConsumer consumer)
      Consume characters from the stream and pass them to consumer while the given predicate returns true.
      SimpleTextParser consumeWithLineContinuation​(char lineContinuationChar, int len, java.util.function.IntConsumer consumer)
      Consume at most len characters from the stream, passing each to the given consumer.
      SimpleTextParser consumeWithLineContinuation​(char lineContinuationChar, java.util.function.IntPredicate pred, java.util.function.IntConsumer consumer)
      Consume characters from the stream and pass them to consumer while the given predicate returns true.
      SimpleTextParser discard​(int len)
      Discard len number of characters from the character stream.
      SimpleTextParser discard​(java.util.function.IntPredicate pred)
      Discard characters from the stream while the given predicate returns true.
      SimpleTextParser discardLine()
      Discard all remaining characters on the current line, including the terminating newline character sequence.
      SimpleTextParser discardLineWhitespace()
      Discard the next whitespace characters on the current line.
      SimpleTextParser discardNewLineSequence()
      Discard the newline character sequence at the current reader position.
      SimpleTextParser discardWhitespace()
      Discard a sequence of whitespace characters from the character stream starting from the current parser position.
      SimpleTextParser discardWithLineContinuation​(char lineContinuationChar, int len)
      Discard len number of characters from the character stream.
      SimpleTextParser discardWithLineContinuation​(char lineContinuationChar, java.util.function.IntPredicate pred)
      Discard characters from the stream while the given predicate returns true.
      private void ensureHasSetToken()
      Ensure that a token read operation has been performed, throwing an exception if not.
      int getColumnNumber()
      Get the current column number.
      java.lang.String getCurrentToken()
      Get the current token.
      double getCurrentTokenAsDouble()
      Get the current token parsed as a double.
      int getCurrentTokenAsInt()
      Get the current token parsed as an integer.
      int getCurrentTokenColumnNumber()
      Get the column position that the current token started on.
      private java.lang.String getCurrentTokenDescription()
      Get a user-friendly description of the current token.
      int getCurrentTokenLineNumber()
      Get the line number that the current token started on.
      int getLineNumber()
      Get the current line number.
      int getMaxStringLength()
      Get the maximum length for strings returned by this instance.
      boolean hasMoreCharacters()
      Return true if there are more characters to read from this instance.
      boolean hasMoreCharactersOnLine()
      Return true if there are more characters to read on the current line.
      boolean hasNonEmptyToken()
      Return true if the current token is neither null nor empty.
      static boolean isAlphanumeric​(int ch)
      Return true if the given character (Unicode code point) is alphanumeric.
      static boolean isDecimalPart​(int ch)
      Return true if the given character (Unicode code point) can be used as part of the string representation of a decimal number.
      static boolean isIntegerPart​(int ch)
      Return true if the given character (Unicode code point) can be used as part of the string representation of an integer.
      static boolean isLineWhitespace​(int ch)
      Return true if the given character (Unicode code point) is whitespace that is not used in newline sequences (i.e., not '\r' or '\n').
      static boolean isNewLinePart​(int ch)
      Return true if the given character (Unicode code point) is used as part of newline sequences (i.e., is either '\r' or '\n').
      static boolean isNotAlphanumeric​(int ch)
      Return true if the given character (Unicode code point) is not alphanumeric.
      static boolean isNotNewLinePart​(int ch)
      Return true if the given character (Unicode code point) is not used as part of newline sequences (i.e., not '\r' or '\n').
      static boolean isNotWhitespace​(int ch)
      Return true if the given character (Unicode code point) is not whitespace.
      static boolean isWhitespace​(int ch)
      Return true if the given character (Unicode code point) is whitespace.
      SimpleTextParser match​(java.lang.String expected)
      Compare the current token with the argument and throw an exception if they are not equal.
      SimpleTextParser matchIgnoreCase​(java.lang.String expected)
      Compare the current token with the argument, ignoring case, and throw an exception if they are not equal.
      private boolean matchInternal​(java.lang.String expected, boolean caseSensitive, boolean throwOnFailure)
      Internal method to compare the current token with the argument.
      SimpleTextParser next​(int len)
      Read a string containing at most len characters from the stream and set it as the current token.
      SimpleTextParser next​(java.util.function.IntPredicate pred)
      Read characters from the stream while the given predicate returns true and set the result as the current token.
      SimpleTextParser nextAlphanumeric()
      Read a sequence of alphanumeric characters starting from the current parser position and set the result as the current token.
      SimpleTextParser nextLine()
      Read characters from the current parser position to the next new line sequence and set the result as the current token.
      SimpleTextParser nextWithLineContinuation​(char lineContinuationChar, int len)
      Read a string containing at most len characters from the stream and set it as the current token.
      SimpleTextParser nextWithLineContinuation​(char lineContinuationChar, java.util.function.IntPredicate pred)
      Read characters from the stream while the given predicate returns true and set the result as the current token.
      java.lang.IllegalStateException parseError​(int line, int col, java.lang.String msg)
      Return an exception indicating an error during parsing.
      java.lang.IllegalStateException parseError​(int line, int col, java.lang.String msg, java.lang.Throwable cause)
      Return an exception indicating an error during parsing.
      java.lang.IllegalStateException parseError​(java.lang.String msg)
      Return an exception indicating an error occurring at the current parser position.
      java.lang.IllegalStateException parseError​(java.lang.String msg, java.lang.Throwable cause)
      Return an exception indicating an error occurring at the current parser position.
      java.lang.String peek​(int len)
      Return a string containing at most len characters from the stream but without changing the parser position.
      java.lang.String peek​(java.util.function.IntPredicate pred)
      Read characters from the stream while the given predicate returns true but do not change the current token or advance the parser position.
      int peekChar()
      Return the next character in the stream but do not advance the parser position.
      int readChar()
      Read and return the next character in the stream and advance the parser position.
      void setColumnNumber​(int column)
      Set the current column number.
      void setLineNumber​(int lineNumber)
      Set the current line number.
      void setMaxStringLength​(int maxStringLength)
      Set the maximum length for strings returned by this instance.
      private void setToken​(int line, int col, java.lang.String token)
      Set the current token string and position.
      private static boolean stringsEqual​(java.lang.String a, java.lang.String b, boolean caseSensitive)
      Test two strings for equality.
      java.lang.IllegalStateException tokenError​(java.lang.String msg)
      Get an exception indicating an error during parsing at the current token position.
      java.lang.IllegalStateException tokenError​(java.lang.String msg, java.lang.Throwable cause)
      Get an exception indicating an error during parsing at the current token position.
      int tryChoose​(java.lang.String... expected)
      Return the index of the argument that exactly matches the current token or -1 if no match is found.
      int tryChoose​(java.util.List<java.lang.String> expected)
      Return the index of the argument that exactly matches the current token or -1 if no match is found.
      int tryChooseIgnoreCase​(java.lang.String... expected)
      Return the index of the argument that matches the current token, ignoring case, or -1 if no match is found.
      int tryChooseIgnoreCase​(java.util.List<java.lang.String> expected)
      Return the index of the argument that matches the current token, ignoring case, or -1 if no match is found.
      boolean tryMatch​(java.lang.String expected)
      Return true if the current token is equal to the argument.
      boolean tryMatchIgnoreCase​(java.lang.String expected)
      Return true if the current token is equal to the argument, ignoring case.
      java.lang.IllegalStateException unexpectedToken​(java.lang.String expected)
      Get an exception indicating that the current token was unexpected.
      java.lang.IllegalStateException unexpectedToken​(java.lang.String expected, java.lang.Throwable cause)
      Get an exception indicating that the current token was unexpected.
      private void validateRequestedStringLength​(int len)
      Validate the requested string length.
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Field Detail

      • EOF

        private static final int EOF
        Constant indicating that the end of the input has been reached.
        See Also:
        Constant Field Values
      • DEFAULT_MAX_STRING_LENGTH

        private static final int DEFAULT_MAX_STRING_LENGTH
        Default value for the max string length property.
        See Also:
        Constant Field Values
      • STRING_LENGTH_ERR_MSG

        private static final java.lang.String STRING_LENGTH_ERR_MSG
        Error message used when a string exceeds the configured maximum length.
        See Also:
        Constant Field Values
      • INITIAL_TOKEN_POS

        private static final int INITIAL_TOKEN_POS
        Initial token position number.
        See Also:
        Constant Field Values
      • NOOP_CONSUMER

        private static final java.util.function.IntConsumer NOOP_CONSUMER
        Int consumer that does nothing.
      • lineNumber

        private int lineNumber
        Current line number; line numbers start counting at 1.
      • columnNumber

        private int columnNumber
        Current character column on the current line; column numbers start at 1.
      • maxStringLength

        private int maxStringLength
        Maximum length for strings returned by this instance.
      • currentToken

        private java.lang.String currentToken
        The current token.
      • currentTokenLineNumber

        private int currentTokenLineNumber
        The line number that the current token started on.
      • currentTokenColumnNumber

        private int currentTokenColumnNumber
        The column number that the current token started on.
      • hasSetToken

        private boolean hasSetToken
        Flag used to indicate that at least one token has been read from the stream.
      • buffer

        private final CharReadBuffer buffer
        Character read buffer used to access the character stream.
    • Constructor Detail

      • SimpleTextParser

        public SimpleTextParser​(java.io.Reader reader)
        Construct a new instance that reads characters from the given reader. The reader will not be closed.
        Parameters:
        reader - reader instance to read characters from
      • SimpleTextParser

        public SimpleTextParser​(CharReadBuffer buffer)
        Construct a new instance that reads characters from the given character buffer.
        Parameters:
        buffer - read buffer to read characters from
    • Method Detail

      • getLineNumber

        public int getLineNumber()
        Get the current line number. Line numbers start at 1.
        Returns:
        the current line number
      • setLineNumber

        public void setLineNumber​(int lineNumber)
        Set the current line number. This does not affect the character stream position, only the value returned by getLineNumber().
        Parameters:
        lineNumber - line number to set; line numbers start at 1
      • getColumnNumber

        public int getColumnNumber()
        Get the current column number. This indicates the column position of the character that will be returned by the next call to readChar(). The first character of each line has a column number of 1.
        Returns:
        the current column number; column numbers start at 1
      • setColumnNumber

        public void setColumnNumber​(int column)
        Set the current column number. This does not affect the character stream position, only the value returned by getColumnNumber().
        Parameters:
        column - the column number to set; column numbers start at 1
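        The line and column bookkeeping described above can be illustrated with a small stand-alone sketch. This is not the library's implementation: the class PositionTrackingSketch and all of its members are hypothetical names, and the sketch assumes that "\r", "\n", and "\r\n" each count as a single line break.

```java
import java.io.IOException;
import java.io.PushbackReader;
import java.io.StringReader;
import java.io.UncheckedIOException;

/** Hypothetical stand-in that tracks 1-based line and column numbers while reading. */
final class PositionTrackingSketch {
    private final PushbackReader reader;
    private int line = 1;
    private int column = 1;

    PositionTrackingSketch(String text) {
        this.reader = new PushbackReader(new StringReader(text), 1);
    }

    int getLine() {
        return line;
    }

    int getColumn() {
        return column;
    }

    /** Return the next character without consuming it, or -1 at the end of the input. */
    int peekChar() {
        try {
            int ch = reader.read();
            if (ch != -1) {
                reader.unread(ch);
            }
            return ch;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Consume the next character and update the position counters. */
    int readChar() {
        final int ch;
        try {
            ch = reader.read();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        if (ch == '\n' || (ch == '\r' && peekChar() != '\n')) {
            // A line break was completed: move to the start of the next line.
            line++;
            column = 1;
        } else if (ch != -1) {
            // Ordinary character, or the '\r' of a "\r\n" pair.
            column++;
        }
        return ch;
    }

    public static void main(String[] args) {
        PositionTrackingSketch p = new PositionTrackingSketch("ab\r\ncd");
        while (p.readChar() != -1) {
            System.out.println(p.getLine() + ":" + p.getColumn());
        }
    }
}
```

        In this sketch the reported position always refers to the character that the next readChar() call would return, which mirrors the contract stated for getColumnNumber().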
      • getMaxStringLength

        public int getMaxStringLength()
        Get the maximum length for strings returned by this instance. Operations that produce strings longer than this length will throw an exception.
        Returns:
        maximum length for strings returned by this instance
      • setMaxStringLength

        public void setMaxStringLength​(int maxStringLength)
        Set the maximum length for strings returned by this instance. Operations that produce strings longer than this length will throw an exception.
        Parameters:
        maxStringLength - maximum length for strings returned by this instance
        Throws:
        java.lang.IllegalArgumentException - if the argument is less than zero
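        One way to picture the maximum string length guard is a collector that refuses to grow past a configured limit. The sketch below is only an illustration under that assumption: BoundedStringCollectorSketch is a hypothetical class, and the exception messages are made up rather than taken from the library.

```java
import java.util.function.IntConsumer;

/** Hypothetical collector that refuses to grow past a configured maximum length. */
final class BoundedStringCollectorSketch implements IntConsumer {
    private final int maxLength;
    private final StringBuilder sb = new StringBuilder();

    BoundedStringCollectorSketch(int maxLength) {
        if (maxLength < 0) {
            // Mirrors the documented rule that a negative maximum is rejected.
            throw new IllegalArgumentException("Maximum string length cannot be negative: " + maxLength);
        }
        this.maxLength = maxLength;
    }

    @Override
    public void accept(int ch) {
        if (sb.length() >= maxLength) {
            // Message text is an assumption, not the library's actual wording.
            throw new IllegalStateException("String length exceeds maximum of " + maxLength);
        }
        sb.append((char) ch);
    }

    String getString() {
        return sb.toString();
    }

    public static void main(String[] args) {
        BoundedStringCollectorSketch collector = new BoundedStringCollectorSketch(3);
        "abc".chars().forEach(collector);
        System.out.println(collector.getString());
    }
}
```

        Checking the limit while characters are collected, rather than after the full string has been built, keeps memory use bounded even for malformed or very large input.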
      • getCurrentToken

        public java.lang.String getCurrentToken()
        Get the current token. This is the most recent string read by one of the nextXXX() methods. This value will be null if no token has yet been read or if the end of content has been reached.
        Returns:
        the current token
        See Also:
        next(int), next(IntPredicate), nextLine(), nextAlphanumeric()
      • hasNonEmptyToken

        public boolean hasNonEmptyToken()
        Return true if the current token is neither null nor empty.
        Returns:
        true if the current token is neither null nor empty
        See Also:
        getCurrentToken()
      • getCurrentTokenLineNumber

        public int getCurrentTokenLineNumber()
        Get the line number that the current token started on. This value will be -1 if no token has been read yet.
        Returns:
        current token starting line number or -1 if no token has been read yet
        See Also:
        getCurrentToken()
      • getCurrentTokenColumnNumber

        public int getCurrentTokenColumnNumber()
        Get the column position that the current token started on. This value will be -1 if no token has been read yet.
        Returns:
        current token column number or -1 if no token has been read yet
        See Also:
        getCurrentToken()
      • getCurrentTokenAsInt

        public int getCurrentTokenAsInt()
        Get the current token parsed as an integer.
        Returns:
        the current token parsed as an integer
        Throws:
        java.lang.IllegalStateException - if no token has been read or the current token cannot be parsed as an integer
      • getCurrentTokenAsDouble

        public double getCurrentTokenAsDouble()
        Get the current token parsed as a double.
        Returns:
        the current token parsed as a double
        Throws:
        java.lang.IllegalStateException - if no token has been read or the current token cannot be parsed as a double
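        The numeric accessors report every failure as an IllegalStateException. A minimal sketch of that pattern follows. It assumes the conversion is delegated to Integer.parseInt and Double.parseDouble, which this page does not confirm. TokenNumberSketch, its methods, and the message format are hypothetical.

```java
/** Hypothetical helpers that convert a token to a number and report failures uniformly. */
final class TokenNumberSketch {
    private TokenNumberSketch() {
    }

    static int tokenAsInt(String token, int line, int col) {
        requireToken(token);
        try {
            return Integer.parseInt(token);
        } catch (NumberFormatException e) {
            throw new IllegalStateException(describe(token, line, col, "an integer"), e);
        }
    }

    static double tokenAsDouble(String token, int line, int col) {
        requireToken(token);
        try {
            return Double.parseDouble(token);
        } catch (NumberFormatException e) {
            throw new IllegalStateException(describe(token, line, col, "a double"), e);
        }
    }

    private static void requireToken(String token) {
        if (token == null) {
            throw new IllegalStateException("No token has been read");
        }
    }

    // The message layout is an assumption chosen for illustration only.
    private static String describe(String token, int line, int col, String type) {
        return "Parsing failed at line " + line + ", column " + col
                + ": cannot parse [" + token + "] as " + type;
    }

    public static void main(String[] args) {
        System.out.println(tokenAsInt("42", 1, 1));
        System.out.println(tokenAsDouble("1.5e2", 1, 4));
    }
}
```

        Wrapping the NumberFormatException as the cause keeps the original failure available while letting callers handle a single exception type for all parsing problems.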
      • hasMoreCharacters

        public boolean hasMoreCharacters()
        Return true if there are more characters to read from this instance.
        Returns:
        true if there are more characters to read from this instance
        Throws:
        java.io.UncheckedIOException - if an I/O error occurs
      • hasMoreCharactersOnLine

        public boolean hasMoreCharactersOnLine()
        Return true if there are more characters to read on the current line.
        Returns:
        true if there are more characters to read on the current line
        Throws:
        java.io.UncheckedIOException - if an I/O error occurs
      • readChar

        public int readChar()
        Read and return the next character in the stream and advance the parser position. This method updates the current line number and column number but does not set the current token.
        Returns:
        the next character in the stream or -1 if the end of the stream has been reached
        Throws:
        java.io.UncheckedIOException - if an I/O error occurs
        See Also:
        peekChar()
      • next

        public SimpleTextParser next​(int len)
        Read a string containing at most len characters from the stream and set it as the current token. Characters are added to the string until the string has the specified length or the end of the stream is reached. The characters are consumed from the stream. The token is set to null if no more characters are available from the character stream when this method is called.
        Parameters:
        len - the maximum length of the extracted string
        Returns:
        this instance
        Throws:
        java.lang.IllegalArgumentException - if len is less than 0 or greater than the configured maximum string length
        java.io.UncheckedIOException - if an I/O error occurs
        See Also:
        getCurrentToken(), consume(int, IntConsumer)
      • nextWithLineContinuation

        public SimpleTextParser nextWithLineContinuation​(char lineContinuationChar,
                                                         int len)
        Read a string containing at most len characters from the stream and set it as the current token. This is similar to next(int) but with the exception that new line sequences beginning with lineContinuationChar are skipped.
        Parameters:
        lineContinuationChar - character used to indicate skipped new line sequences
        len - the maximum length of the extracted string
        Returns:
        this instance
        Throws:
        java.lang.IllegalArgumentException - if len is less than 0 or greater than the configured maximum string length
        java.io.UncheckedIOException - if an I/O error occurs
        See Also:
        getCurrentToken(), consumeWithLineContinuation(char, int, IntConsumer)
      • next

        public SimpleTextParser next​(java.util.function.IntPredicate pred)
        Read characters from the stream while the given predicate returns true and set the result as the current token. The next call to readChar() will return either a character that fails the predicate test or -1 if the end of the stream has been reached. The token will be null if the end of the stream has been reached prior to the method call.
        Parameters:
        pred - predicate function passed characters read from the input; reading continues until the predicate returns false
        Returns:
        this instance
        Throws:
        java.lang.IllegalStateException - if the length of the produced string exceeds the configured maximum string length
        java.io.UncheckedIOException - if an I/O error occurs
        See Also:
        getCurrentToken(), consume(IntPredicate, IntConsumer)
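        The three possible outcomes of a predicate-driven read (a non-empty string, an empty string, or null at the end of the stream) can be seen in a stand-alone sketch. PredicateReaderSketch is a hypothetical class built on java.io.PushbackReader. It imitates the documented contract only and does not show how the library reads characters.

```java
import java.io.IOException;
import java.io.PushbackReader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.function.IntPredicate;

/** Hypothetical reader that returns runs of characters accepted by a predicate. */
final class PredicateReaderSketch {
    private final PushbackReader reader;

    PredicateReaderSketch(String text) {
        this.reader = new PushbackReader(new StringReader(text), 1);
    }

    /**
     * Read characters while the predicate accepts them.
     * Returns null if the input was already exhausted when the method was called.
     */
    String next(IntPredicate pred) {
        try {
            int ch = reader.read();
            if (ch == -1) {
                return null;
            }
            StringBuilder sb = new StringBuilder();
            while (ch != -1 && pred.test(ch)) {
                sb.append((char) ch);
                ch = reader.read();
            }
            if (ch != -1) {
                // Put back the first rejected character so the next read sees it.
                reader.unread(ch);
            }
            return sb.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        PredicateReaderSketch p = new PredicateReaderSketch("abc123 x");
        System.out.println(p.next(Character::isLetterOrDigit));
    }
}
```

        With the input "abc123 x", the first call with Character::isLetterOrDigit yields "abc123", and an immediate second call yields the empty string because the next character is a space.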
      • nextWithLineContinuation

        public SimpleTextParser nextWithLineContinuation​(char lineContinuationChar,
                                                         java.util.function.IntPredicate pred)
        Read characters from the stream while the given predicate returns true and set the result as the current token. This is similar to next(IntPredicate) but with the exception that new line sequences prefixed with lineContinuationChar are skipped.
        Parameters:
        lineContinuationChar - character used to indicate skipped new line sequences
        pred - predicate function passed characters read from the input; reading continues until the predicate returns false
        Returns:
        this instance
        Throws:
        java.lang.IllegalStateException - if the length of the produced string exceeds the configured maximum string length
        java.io.UncheckedIOException - if an I/O error occurs
        See Also:
        getCurrentToken(), consume(IntPredicate, IntConsumer)
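        Line continuation handling can be sketched as follows. The sketch assumes that a continuation is the lineContinuationChar immediately followed by a newline sequence and that both are dropped from the result. LineContinuationSketch and its methods are hypothetical, and for brevity the sketch works on a String and returns an empty string rather than null at the end of the input.

```java
import java.util.function.IntPredicate;

/** Hypothetical illustration of skipping continuation-marked newline sequences. */
final class LineContinuationSketch {
    private LineContinuationSketch() {
    }

    static String readWhile(String text, char lineContinuationChar, IntPredicate pred) {
        StringBuilder sb = new StringBuilder();
        int i = 0;
        while (i < text.length()) {
            char ch = text.charAt(i);
            if (ch == lineContinuationChar && isNewLineAt(text, i + 1)) {
                // Drop the continuation character and the newline sequence after it.
                i = skipNewLine(text, i + 1);
            } else if (pred.test(ch)) {
                sb.append(ch);
                i++;
            } else {
                break;
            }
        }
        return sb.toString();
    }

    private static boolean isNewLineAt(String text, int i) {
        return i < text.length() && (text.charAt(i) == '\r' || text.charAt(i) == '\n');
    }

    /** Return the index just past the newline sequence starting at i ("\r", "\n", or "\r\n"). */
    private static int skipNewLine(String text, int i) {
        boolean crlf = text.charAt(i) == '\r' && i + 1 < text.length() && text.charAt(i + 1) == '\n';
        return crlf ? i + 2 : i + 1;
    }

    public static void main(String[] args) {
        String joined = readWhile("abc\\\ndef ghi", '\\', ch -> !Character.isWhitespace(ch));
        System.out.println(joined);
    }
}
```

        Note that a continuation character that is not followed by a newline is treated like any other character and is passed to the predicate.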
      • nextLine

        public SimpleTextParser nextLine()
        Read characters from the current parser position to the next new line sequence and set the result as the current token. The newline character sequence ('\r', '\n', or '\r\n') at the end of the line is consumed but is not included in the token. The token will be null if the end of the stream has been reached prior to the method call.
        Returns:
        this instance
        Throws:
        java.lang.IllegalStateException - if the length of the produced string exceeds the configured maximum string length
        java.io.UncheckedIOException - if an I/O error occurs
        See Also:
        getCurrentToken()
      • nextAlphanumeric

        public SimpleTextParser nextAlphanumeric()
        Read a sequence of alphanumeric characters starting from the current parser position and set the result as the current token. The token will be the empty string if the next character in the stream is not alphanumeric and will be null if the end of the stream has been reached prior to the method call.
        Returns:
        this instance
        Throws:
        java.lang.IllegalStateException - if the length of the produced string exceeds the configured maximum string length
        java.io.UncheckedIOException - if an I/O error occurs
        See Also:
        getCurrentToken()
      • discard

        public SimpleTextParser discard​(int len)
        Discard len number of characters from the character stream. The parser position is updated but the current token is not changed.
        Parameters:
        len - number of characters to discard
        Returns:
        this instance
        Throws:
        java.io.UncheckedIOException - if an I/O error occurs
      • discardWithLineContinuation

        public SimpleTextParser discardWithLineContinuation​(char lineContinuationChar,
                                                            int len)
        Discard len number of characters from the character stream. The parser position is updated but the current token is not changed. New line sequences prefixed with lineContinuationChar are skipped.
        Parameters:
        lineContinuationChar - character used to indicate skipped new line sequences
        len - number of characters to discard
        Returns:
        this instance
        Throws:
        java.io.UncheckedIOException - if an I/O error occurs
      • discard

        public SimpleTextParser discard​(java.util.function.IntPredicate pred)
        Discard characters from the stream while the given predicate returns true. The next call to readChar() will return either a character that fails the predicate test or -1 if the end of the stream has been reached. The parser position is updated but the current token is not changed.
        Parameters:
        pred - predicate test for characters to discard
        Returns:
        this instance
        Throws:
        java.io.UncheckedIOException - if an I/O error occurs
      • discardWithLineContinuation

        public SimpleTextParser discardWithLineContinuation​(char lineContinuationChar,
                                                            java.util.function.IntPredicate pred)
        Discard characters from the stream while the given predicate returns true. New line sequences beginning with lineContinuationChar are skipped. The next call to readChar() will return either a character that fails the predicate test or -1 if the end of the stream has been reached. The parser position is updated but the current token is not changed.
        Parameters:
        lineContinuationChar - character used to indicate skipped new line sequences
        pred - predicate test for characters to discard
        Returns:
        this instance
        Throws:
        java.io.UncheckedIOException - if an I/O error occurs
      • discardWhitespace

        public SimpleTextParser discardWhitespace()
        Discard a sequence of whitespace characters from the character stream starting from the current parser position. The next call to readChar() will return either a non-whitespace character or -1 if the end of the stream has been reached. The parser position is updated but the current token is not changed.
        Returns:
        this instance
        Throws:
        java.io.UncheckedIOException - if an I/O error occurs
      • discardLineWhitespace

        public SimpleTextParser discardLineWhitespace()
        Discard the next whitespace characters on the current line. The next call to readChar() will return either a non-whitespace character on the current line, the newline character sequence (indicating the end of the line), or -1 (indicating the end of the stream). The parser position is updated but the current token is not changed.
        Returns:
        this instance
        Throws:
        java.io.UncheckedIOException - if an I/O error occurs
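        The difference between discarding all whitespace and discarding only whitespace on the current line can be sketched with two predicates. The sketch assumes that whitespace is whatever Character.isWhitespace(int) accepts, which this page does not state explicitly. DiscardSketch is a hypothetical class operating on a String.

```java
import java.util.function.IntPredicate;

/** Hypothetical cursor that discards characters matching a predicate. */
final class DiscardSketch {
    private final String text;
    private int pos;

    DiscardSketch(String text) {
        this.text = text;
    }

    /** Whitespace that is not part of a newline sequence. */
    static boolean isLineWhitespace(int ch) {
        return Character.isWhitespace(ch) && ch != '\r' && ch != '\n';
    }

    DiscardSketch discard(IntPredicate pred) {
        while (pos < text.length() && pred.test(text.charAt(pos))) {
            pos++;
        }
        return this;
    }

    DiscardSketch discardWhitespace() {
        return discard(Character::isWhitespace);
    }

    DiscardSketch discardLineWhitespace() {
        return discard(DiscardSketch::isLineWhitespace);
    }

    /** Return the next character without consuming it, or -1 at the end of the input. */
    int peekChar() {
        return pos < text.length() ? text.charAt(pos) : -1;
    }

    public static void main(String[] args) {
        DiscardSketch d = new DiscardSketch(" \t\n  x");
        System.out.println(d.discardLineWhitespace().peekChar() == '\n');
        System.out.println((char) d.discardWhitespace().peekChar());
    }
}
```

        Returning the instance from each discard method allows the same chained call style that the parser's own methods support by returning this instance.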
      • discardNewLineSequence

        public SimpleTextParser discardNewLineSequence()
        Discard the newline character sequence at the current reader position. The sequence is defined as one of "\r", "\n", or "\r\n". Does nothing if the reader is not positioned at a newline sequence. The parser position is updated but the current token is not changed.
        Returns:
        this instance
        Throws:
        java.io.UncheckedIOException - if an I/O error occurs
      • discardLine

        public SimpleTextParser discardLine()
        Discard all remaining characters on the current line, including the terminating newline character sequence. The next call to readChar() will return either the first character on the next line or -1 if the end of the stream has been reached. The parser position is updated but the current token is not changed.
        Returns:
        this instance
        Throws:
        java.io.UncheckedIOException - if an I/O error occurs
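        The newline rules used by discardNewLineSequence() and discardLine() can be expressed as index arithmetic on a string. The following sketch is illustrative only: NewLineSketch and its methods are hypothetical, and they return the new position instead of mutating parser state.

```java
/** Hypothetical helpers that skip newline sequences and whole lines in a string. */
final class NewLineSketch {
    private NewLineSketch() {
    }

    /**
     * Skip one newline sequence ("\r", "\n", or "\r\n") starting at pos.
     * Returns pos unchanged if no newline sequence starts there.
     */
    static int skipNewLineSequence(String text, int pos) {
        if (pos < text.length()) {
            char ch = text.charAt(pos);
            if (ch == '\n') {
                return pos + 1;
            }
            if (ch == '\r') {
                boolean lfNext = pos + 1 < text.length() && text.charAt(pos + 1) == '\n';
                return lfNext ? pos + 2 : pos + 1;
            }
        }
        return pos;
    }

    /** Skip the rest of the current line, including its terminating newline sequence. */
    static int skipLine(String text, int pos) {
        while (pos < text.length() && text.charAt(pos) != '\r' && text.charAt(pos) != '\n') {
            pos++;
        }
        return skipNewLineSequence(text, pos);
    }

    public static void main(String[] args) {
        String text = "ab\r\ncd";
        int next = skipLine(text, 0);
        System.out.println(text.substring(next));
    }
}
```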
      • consume

        public SimpleTextParser consume​(java.util.function.IntPredicate pred,
                                        java.util.function.IntConsumer consumer)
        Consume characters from the stream and pass them to consumer while the given predicate returns true. The operation ends when the predicate returns false or the end of the stream is reached.
        Parameters:
        pred - predicate test for characters to consume
        consumer - object to be passed each consumed character
        Returns:
        this instance
        Throws:
        java.io.UncheckedIOException - if an I/O error occurs
      • consumeWithLineContinuation

        public SimpleTextParser consumeWithLineContinuation​(char lineContinuationChar,
                                                            int len,
                                                            java.util.function.IntConsumer consumer)
        Consume at most len characters from the stream, passing each to the given consumer. This method is similar to consume(int, IntConsumer) with the exception that new line sequences prefixed with lineContinuationChar are skipped.
        Parameters:
        lineContinuationChar - character used to indicate skipped new line sequences
        len - number of characters to consume
        consumer - function to be passed each consumed character
        Returns:
        this instance
        Throws:
        java.io.UncheckedIOException - if an I/O error occurs
      • consume

        public SimpleTextParser consume​(int len,
                                        java.util.function.IntConsumer consumer)
        Consume at most len characters from the stream, passing each to the given consumer. The operation continues until len characters have been read or the end of the stream has been reached.
        Parameters:
        len - maximum number of characters to consume
        consumer - object to be passed each consumed character
        Returns:
        this instance
        Throws:
        java.io.UncheckedIOException - if an I/O error occurs
      • consumeWithLineContinuation

        public SimpleTextParser consumeWithLineContinuation​(char lineContinuationChar,
                                                            java.util.function.IntPredicate pred,
                                                            java.util.function.IntConsumer consumer)
        Consume characters from the stream and pass them to consumer while the given predicate returns true. This method is similar to consume(IntPredicate, IntConsumer) with the exception that new line sequences prefixed with lineContinuationChar are skipped.
        Parameters:
        lineContinuationChar - character used to indicate skipped new line sequences
        pred - predicate test for characters to consume
        consumer - object to be passed each consumed character
        Returns:
        this instance
        Throws:
        java.io.UncheckedIOException - if an I/O error occurs
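        The line continuation rule can be illustrated with a small string-based sketch. It is written under stated assumptions and is not the class's implementation: LineContinuationSketch, collect and newlineLength are hypothetical names, the input is a String rather than a character stream, and a new line sequence is taken to be "\r\n", "\r" or "\n".

```java
import java.util.function.IntPredicate;

public class LineContinuationSketch {

    /** Collect characters accepted by pred, skipping new lines prefixed with cont. */
    public static String collect(String input, char cont, IntPredicate pred) {
        StringBuilder sb = new StringBuilder();
        int i = 0;
        while (i < input.length()) {
            char ch = input.charAt(i);
            int nl = newlineLength(input, i + 1);
            if (ch == cont && nl > 0) {
                // Skip the continuation character and the new line sequence after it.
                i += 1 + nl;
                continue;
            }
            if (!pred.test(ch)) {
                break;
            }
            sb.append(ch);
            i++;
        }
        return sb.toString();
    }

    /** Length of the new line sequence starting at pos, or 0 if there is none. */
    static int newlineLength(String s, int pos) {
        if (pos >= s.length()) {
            return 0;
        }
        char c = s.charAt(pos);
        if (c == '\r') {
            return (pos + 1 < s.length() && s.charAt(pos + 1) == '\n') ? 2 : 1;
        }
        return c == '\n' ? 1 : 0;
    }

    public static void main(String[] args) {
        // A backslash followed by CRLF joins "abc" and "def" into one run.
        System.out.println(collect("abc\\\r\ndef ghi", '\\', ch -> ch != ' '));
    }
}
```

        A continuation character that is not followed by a new line sequence is treated like any other character in this sketch and is passed to the predicate.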
      • peekChar

        public int peekChar()
        Return the next character in the stream but do not advance the parser position.
        Returns:
        the next character in the stream or -1 if the end of the stream has been reached
        Throws:
        java.io.UncheckedIOException - if an I/O error occurs
        See Also:
        readChar()
      • peek

        public java.lang.String peek​(int len)
        Return a string containing at most len characters from the stream without changing the parser position. Characters are added to the string until the string has the specified length or the end of the stream is reached.
        Parameters:
        len - the maximum length of the returned string
        Returns:
        a string containing at most len characters from the stream or null if the parser has already reached the end of the stream
        Throws:
        java.lang.IllegalArgumentException - if len is less than 0 or greater than the configured maximum string length
        java.io.UncheckedIOException - if an I/O error occurs
        See Also:
        next(int)
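        The difference between peeking and reading can be sketched with the mark/reset support of java.io.Reader. This is only an illustration under stated assumptions: PeekSketch is a hypothetical class, the configured maximum string length is not modeled, and a request for zero characters simply yields an empty string here.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

public class PeekSketch {

    private final Reader in;

    public PeekSketch(String text) {
        // StringReader supports mark/reset, which this sketch relies on.
        this.in = new StringReader(text);
    }

    /** Return up to len characters without advancing, or null at end of input. */
    public String peek(int len) {
        if (len < 0) {
            throw new IllegalArgumentException("Invalid length: " + len);
        }
        try {
            in.mark(len);
            char[] buf = new char[len];
            int total = 0;
            int n;
            while (total < len && (n = in.read(buf, total, len - total)) != -1) {
                total += n;
            }
            in.reset(); // restore the position saved by mark
            return (len > 0 && total == 0) ? null : new String(buf, 0, total);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Read one character and advance, returning -1 at end of input. */
    public int readChar() {
        try {
            return in.read();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        PeekSketch p = new PeekSketch("hello");
        System.out.println(p.peek(3));
        System.out.println((char) p.readChar());
    }
}
```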
      • peek

        public java.lang.String peek​(java.util.function.IntPredicate pred)
        Read characters from the stream while the given predicate returns true but do not change the current token or advance the parser position.
        Parameters:
        pred - predicate function passed characters read from the input; reading continues until the predicate returns false
        Returns:
        string containing characters matching pred or null if the parser has already reached the end of the stream
        Throws:
        java.lang.IllegalStateException - if the length of the produced string exceeds the configured maximum string length
        java.io.UncheckedIOException - if an I/O error occurs
        See Also:
        getCurrentToken()
      • match

        public SimpleTextParser match​(java.lang.String expected)
        Compare the current token with the argument and throw an exception if they are not equal. The comparison is case-sensitive.
        Parameters:
        expected - expected token
        Returns:
        this instance
        Throws:
        java.lang.IllegalStateException - if no token has been read or expected does not exactly equal the current token
      • matchIgnoreCase

        public SimpleTextParser matchIgnoreCase​(java.lang.String expected)
        Compare the current token with the argument and throw an exception if they are not equal. The comparison is not case-sensitive.
        Parameters:
        expected - expected token
        Returns:
        this instance
        Throws:
        java.lang.IllegalStateException - if no token has been read or expected does not equal the current token (ignoring case)
      • tryMatch

        public boolean tryMatch​(java.lang.String expected)
        Return true if the current token is equal to the argument. The comparison is case-sensitive.
        Parameters:
        expected - expected token
        Returns:
        true if the argument exactly equals the current token
        Throws:
        java.lang.IllegalStateException - if no token has been read
        java.io.UncheckedIOException - if an I/O error occurs
      • tryMatchIgnoreCase

        public boolean tryMatchIgnoreCase​(java.lang.String expected)
        Return true if the current token is equal to the argument. The comparison is not case-sensitive.
        Parameters:
        expected - expected token
        Returns:
        true if the argument equals the current token (ignoring case)
        Throws:
        java.lang.IllegalStateException - if no token has been read
      • matchInternal

        private boolean matchInternal​(java.lang.String expected,
                                      boolean caseSensitive,
                                      boolean throwOnFailure)
        Internal method to compare the current token with the argument.
        Parameters:
        expected - expected token
        caseSensitive - if the comparison should be case-sensitive
        throwOnFailure - if an exception should be thrown if the argument is not equal to the current token
        Returns:
        true if the argument is equal to the current token
        Throws:
        java.lang.IllegalStateException - if no token has been read or expected does not match the current token and throwOnFailure is true
      • choose

        public int choose​(java.lang.String... expected)
        Return the index of the argument that exactly matches the current token. An exception is thrown if no match is found. String comparisons are case-sensitive.
        Parameters:
        expected - strings to compare with the current token
        Returns:
        index of the argument that exactly matches the current token
        Throws:
        java.lang.IllegalStateException - if no token has been read or no match is found among the arguments
      • choose

        public int choose​(java.util.List<java.lang.String> expected)
        Return the index of the argument that exactly matches the current token. An exception is thrown if no match is found. String comparisons are case-sensitive.
        Parameters:
        expected - strings to compare with the current token
        Returns:
        index of the argument that exactly matches the current token
        Throws:
        java.lang.IllegalStateException - if no token has been read or no match is found among the arguments
      • chooseIgnoreCase

        public int chooseIgnoreCase​(java.lang.String... expected)
        Return the index of the argument that matches the current token, ignoring case. An exception is thrown if no match is found. String comparisons are not case-sensitive.
        Parameters:
        expected - strings to compare with the current token
        Returns:
        index of the argument that matches the current token (ignoring case)
        Throws:
        java.lang.IllegalStateException - if no token has been read or no match is found among the arguments
      • chooseIgnoreCase

        public int chooseIgnoreCase​(java.util.List<java.lang.String> expected)
        Return the index of the argument that matches the current token, ignoring case. An exception is thrown if no match is found. String comparisons are not case-sensitive.
        Parameters:
        expected - strings to compare with the current token
        Returns:
        index of the argument that matches the current token (ignoring case)
        Throws:
        java.lang.IllegalStateException - if no token has been read or no match is found among the arguments
      • tryChoose

        public int tryChoose​(java.lang.String... expected)
        Return the index of the argument that exactly matches the current token or -1 if no match is found. String comparisons are case-sensitive.
        Parameters:
        expected - strings to compare with the current token
        Returns:
        index of the argument that exactly matches the current token or -1 if no match is found
        Throws:
        java.lang.IllegalStateException - if no token has been read
      • tryChoose

        public int tryChoose​(java.util.List<java.lang.String> expected)
        Return the index of the argument that exactly matches the current token or -1 if no match is found. String comparisons are case-sensitive.
        Parameters:
        expected - strings to compare with the current token
        Returns:
        index of the argument that exactly matches the current token or -1 if no match is found
        Throws:
        java.lang.IllegalStateException - if no token has been read
      • tryChooseIgnoreCase

        public int tryChooseIgnoreCase​(java.lang.String... expected)
        Return the index of the argument that matches the current token or -1 if no match is found. String comparisons are not case-sensitive.
        Parameters:
        expected - strings to compare with the current token
        Returns:
        index of the argument that matches the current token (ignoring case) or -1 if no match is found
        Throws:
        java.lang.IllegalStateException - if no token has been read
      • tryChooseIgnoreCase

        public int tryChooseIgnoreCase​(java.util.List<java.lang.String> expected)
        Return the index of the argument that matches the current token or -1 if no match is found. String comparisons are not case-sensitive.
        Parameters:
        expected - strings to compare with the current token
        Returns:
        index of the argument that matches the current token (ignoring case) or -1 if no match is found
        Throws:
        java.lang.IllegalStateException - if no token has been read
      • chooseInternal

        private int chooseInternal​(java.util.List<java.lang.String> expected,
                                   boolean caseSensitive,
                                   boolean throwOnFailure)
        Internal method to compare the current token with a list of possible strings. The index of the matching argument is returned.
        Parameters:
        expected - strings to compare with the current token
        caseSensitive - if the comparisons should be case-sensitive
        throwOnFailure - if an exception should be thrown if no match is found
        Returns:
        the index of the matching argument or -1 if no match is found
        Throws:
        java.lang.IllegalStateException - if no token has been read or no match is found and throwOnFailure is true
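        The shared behavior of the choose and tryChoose variants can be summarized in a short sketch. It is a standalone illustration, not the class's code: ChooseSketch is a hypothetical class, the current token is passed explicitly because the sketch holds no parser state, and the exception message format is an assumption.

```java
import java.util.Arrays;
import java.util.List;

public class ChooseSketch {

    /** Return the index of the first entry equal to token, or -1 / throw if none. */
    public static int choose(String token, List<String> expected,
                             boolean caseSensitive, boolean throwOnFailure) {
        for (int i = 0; i < expected.size(); i++) {
            String candidate = expected.get(i);
            boolean equal = caseSensitive
                    ? token.equals(candidate)
                    : token.equalsIgnoreCase(candidate);
            if (equal) {
                return i;
            }
        }
        if (throwOnFailure) {
            // Hypothetical message format; the real message also carries position details.
            throw new IllegalStateException(
                    "Expected one of " + expected + " but found [" + token + "]");
        }
        return -1;
    }

    public static void main(String[] args) {
        List<String> keywords = Arrays.asList("solid", "facet", "endsolid");
        System.out.println(choose("facet", keywords, true, true));
        System.out.println(choose("FACET", keywords, true, false));
    }
}
```

        Callers would typically switch on the returned index to dispatch to the handler for each keyword.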
      • unexpectedToken

        public java.lang.IllegalStateException unexpectedToken​(java.lang.String expected)
        Get an exception indicating that the current token was unexpected. The returned exception contains a message with the line number and column of the current token and a description of its value.
        Parameters:
        expected - string describing what was expected
        Returns:
        exception indicating that the current token was unexpected
      • unexpectedToken

        public java.lang.IllegalStateException unexpectedToken​(java.lang.String expected,
                                                               java.lang.Throwable cause)
        Get an exception indicating that the current token was unexpected. The returned exception contains a message with the line number and column of the current token and a description of its value.
        Parameters:
        expected - string describing what was expected
        cause - cause of the error
        Returns:
        exception indicating that the current token was unexpected
      • tokenError

        public java.lang.IllegalStateException tokenError​(java.lang.String msg)
        Get an exception indicating an error during parsing at the current token position.
        Parameters:
        msg - error message
        Returns:
        an exception indicating an error during parsing at the current token position
      • tokenError

        public java.lang.IllegalStateException tokenError​(java.lang.String msg,
                                                          java.lang.Throwable cause)
        Get an exception indicating an error during parsing at the current token position.
        Parameters:
        msg - error message
        cause - the cause of the error; may be null
        Returns:
        an exception indicating an error during parsing at the current token position
      • parseError

        public java.lang.IllegalStateException parseError​(java.lang.String msg)
        Return an exception indicating an error occurring at the current parser position.
        Parameters:
        msg - error message
        Returns:
        an exception indicating an error during parsing
      • parseError

        public java.lang.IllegalStateException parseError​(java.lang.String msg,
                                                          java.lang.Throwable cause)
        Return an exception indicating an error occurring at the current parser position.
        Parameters:
        msg - error message
        cause - the cause of the error; may be null
        Returns:
        an exception indicating an error during parsing
      • parseError

        public java.lang.IllegalStateException parseError​(int line,
                                                          int col,
                                                          java.lang.String msg)
        Return an exception indicating an error during parsing.
        Parameters:
        line - line number of the error
        col - column number of the error
        msg - error message
        Returns:
        an exception indicating an error during parsing
      • parseError

        public java.lang.IllegalStateException parseError​(int line,
                                                          int col,
                                                          java.lang.String msg,
                                                          java.lang.Throwable cause)
        Return an exception indicating an error during parsing.
        Parameters:
        line - line number of the error
        col - column number of the error
        msg - error message
        cause - the cause of the error
        Returns:
        an exception indicating an error during parsing
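        Because the error methods return the exception instead of throwing it, callers are expected to write the throw themselves. The sketch below shows one way such a factory could build a positioned message. ParseErrorSketch is a hypothetical class and the message format is an assumption, not the format used by this class.

```java
public class ParseErrorSketch {

    /** Build (but do not throw) an exception describing an error at line/col. */
    public static IllegalStateException parseError(int line, int col,
                                                   String msg, Throwable cause) {
        // Assumed format: position first, then the caller's message.
        String fullMsg = "Parsing failed at line " + line + ", column " + col + ": " + msg;
        return new IllegalStateException(fullMsg, cause);
    }

    public static void main(String[] args) {
        try {
            // The caller decides when to throw the returned exception.
            throw parseError(2, 5, "unexpected character", null);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

        Returning the exception keeps the throw statement visible at the call site, so the compiler can see that control flow ends there.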
      • setToken

        private void setToken​(int line,
                              int col,
                              java.lang.String token)
        Set the current token string and position.
        Parameters:
        line - line number for the start of the token
        col - column number for the start of the token
        token - token to set
      • getCurrentTokenDescription

        private java.lang.String getCurrentTokenDescription()
        Get a user-friendly description of the current token.
        Returns:
        a user-friendly description of the current token
      • validateRequestedStringLength

        private void validateRequestedStringLength​(int len)
        Validate the requested string length.
        Parameters:
        len - requested string length
        Throws:
        java.lang.IllegalArgumentException - if len is less than 0 or greater than maxStringLength
      • ensureHasSetToken

        private void ensureHasSetToken()
        Ensure that a token read operation has been performed, throwing an exception if not.
        Throws:
        java.lang.IllegalStateException - if no token read operation has been performed
      • isWhitespace

        public static boolean isWhitespace​(int ch)
        Return true if the given character (Unicode code point) is whitespace.
        Parameters:
        ch - character (Unicode code point) to test
        Returns:
        true if the given character is whitespace
        See Also:
        Character.isWhitespace(int)
      • isNotWhitespace

        public static boolean isNotWhitespace​(int ch)
        Return true if the given character (Unicode code point) is not whitespace.
        Parameters:
        ch - character (Unicode code point) to test
        Returns:
        true if the given character is not whitespace
        See Also:
        isWhitespace(int)
      • isLineWhitespace

        public static boolean isLineWhitespace​(int ch)
        Return true if the given character (Unicode code point) is whitespace that is not used in newline sequences (i.e., not '\r' or '\n').
        Parameters:
        ch - character (Unicode code point) to test
        Returns:
        true if the given character is a whitespace character not used in newline sequences
      • isNewLinePart

        public static boolean isNewLinePart​(int ch)
        Return true if the given character (Unicode code point) is used as part of newline sequences (i.e., is either '\r' or '\n').
        Parameters:
        ch - character (Unicode code point) to test
        Returns:
        true if the given character is used as part of newline sequences
      • isNotNewLinePart

        public static boolean isNotNewLinePart​(int ch)
        Return true if the given character (Unicode code point) is not used as part of newline sequences (i.e., not '\r' or '\n').
        Parameters:
        ch - character (Unicode code point) to test
        Returns:
        true if the given character is not used as part of newline sequences
        See Also:
        isNewLinePart(int)
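        The whitespace and newline predicates described above can be expressed in a few lines. This is a sketch that follows the descriptions given here, with WhitespaceSketch as a hypothetical class name. It is not a copy of the class's source.

```java
public class WhitespaceSketch {

    /** True for the two characters used in newline sequences. */
    public static boolean isNewLinePart(int ch) {
        return ch == '\r' || ch == '\n';
    }

    /** Negation of isNewLinePart, convenient as a method reference. */
    public static boolean isNotNewLinePart(int ch) {
        return !isNewLinePart(ch);
    }

    /** Whitespace that does not end a line, such as spaces and tabs. */
    public static boolean isLineWhitespace(int ch) {
        return Character.isWhitespace(ch) && isNotNewLinePart(ch);
    }

    public static void main(String[] args) {
        System.out.println(isLineWhitespace('\t'));
        System.out.println(isLineWhitespace('\n'));
    }
}
```

        The negated forms exist so that they can be passed directly as IntPredicate method references, for example to the predicate-based consume and peek methods.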
      • isAlphanumeric

        public static boolean isAlphanumeric​(int ch)
        Return true if the given character (Unicode code point) is alphanumeric.
        Parameters:
        ch - character (Unicode code point) to test
        Returns:
        true if the argument is alphanumeric
        See Also:
        Character.isAlphabetic(int), Character.isDigit(int)
      • isNotAlphanumeric

        public static boolean isNotAlphanumeric​(int ch)
        Return true if the given character (Unicode code point) is not alphanumeric.
        Parameters:
        ch - character (Unicode code point) to test
        Returns:
        true if the argument is not alphanumeric
        See Also:
        isAlphanumeric(int)
      • isIntegerPart

        public static boolean isIntegerPart​(int ch)
        Return true if the given character (Unicode code point) can be used as part of the string representation of an integer. This will be true for the following types of characters:
        • digits
        • the '-' (minus) character
        • the '+' (plus) character
        Parameters:
        ch - character (Unicode code point) to test
        Returns:
        true if the given character can be used as part of an integer string
      • isDecimalPart

        public static boolean isDecimalPart​(int ch)
        Return true if the given character (Unicode code point) can be used as part of the string representation of a decimal number. This will be true for the following types of characters:
        • digits
        • the '-' (minus) character
        • the '+' (plus) character
        • the '.' (period) character
        • the 'e' character
        • the 'E' character
        Parameters:
        ch - character (Unicode code point) to test
        Returns:
        true if the given character can be used as part of a decimal number string
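        The character lists for isIntegerPart and isDecimalPart translate directly into code. The sketch below follows those lists. NumberPartSketch and leadingDecimalChars are hypothetical names, and the use of Character.isDigit for the "digits" entry is an assumption.

```java
public class NumberPartSketch {

    /** Digits plus the sign characters. */
    public static boolean isIntegerPart(int ch) {
        return Character.isDigit(ch) || ch == '-' || ch == '+';
    }

    /** Integer characters plus the decimal point and exponent markers. */
    public static boolean isDecimalPart(int ch) {
        return isIntegerPart(ch) || ch == '.' || ch == 'e' || ch == 'E';
    }

    /** Hypothetical helper: the longest prefix made only of decimal characters. */
    public static String leadingDecimalChars(String s) {
        int i = 0;
        while (i < s.length() && isDecimalPart(s.charAt(i))) {
            i++;
        }
        return s.substring(0, i);
    }

    public static void main(String[] args) {
        String token = leadingDecimalChars("-1.5e3, 2");
        System.out.println(Double.parseDouble(token));
    }
}
```

        These predicates test single characters only. A run such as "+-e." passes the character test without being a valid number, so the collected string still has to be validated, for example with Double.parseDouble.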
      • stringsEqual

        private static boolean stringsEqual​(java.lang.String a,
                                            java.lang.String b,
                                            boolean caseSensitive)
        Test two strings for equality. One or both arguments may be null.
        Parameters:
        a - first string
        b - second string
        caseSensitive - comparison is case-sensitive if set to true
        Returns:
        true if the string arguments are considered equal
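        A null-tolerant comparison of this kind can be sketched as follows. StringsEqualSketch is a hypothetical class, and treating two null arguments as equal is an assumption, since the description above only states that null arguments are permitted.

```java
public class StringsEqualSketch {

    /** Compare two possibly-null strings, optionally ignoring case. */
    public static boolean stringsEqual(String a, String b, boolean caseSensitive) {
        if (a == null) {
            // Assumption: null equals only null.
            return b == null;
        }
        // String.equals and equalsIgnoreCase both return false for a null argument.
        return caseSensitive ? a.equals(b) : a.equalsIgnoreCase(b);
    }

    public static void main(String[] args) {
        System.out.println(stringsEqual("Solid", "solid", false));
        System.out.println(stringsEqual("Solid", "solid", true));
    }
}
```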