Class LexerImpl

java.lang.Object
io.pebbletemplates.pebble.lexer.LexerImpl
All Implemented Interfaces:
Lexer

public final class LexerImpl extends Object implements Lexer
This class reads the template input and builds single items out of it.

This class is not thread safe.

  • Field Details

    • logger

      private final org.slf4j.Logger logger
    • syntax

      private final Syntax syntax
      Syntax
    • unaryOperators

      private final Collection<UnaryOperator> unaryOperators
      Unary operators
    • binaryOperators

      private final Collection<BinaryOperator> binaryOperators
      Binary operators
    • source

      private TemplateSource source
      As we progress through the source we maintain a string which is the text that has yet to be tokenized.
    • tokens

      private ArrayList<Token> tokens
      The list of tokens that we find and use to create a TokenStream
    • brackets

      private LinkedList<Pair<String,Integer>> brackets
      Represents the brackets we are currently inside, ordered by how recently we encountered them (i.e. peek() returns the innermost bracket, getLast() the outermost). Brackets in this case include double quotes. The String value of the pair is the bracket representation, and the Integer is the line number.
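The bracket-tracking scheme described above can be sketched with a standard Deque. The class and method names below are illustrative, not Pebble's own; the real lexer stores Pair<String,Integer> entries in a LinkedList:

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Map;

public class BracketTracker {
    // Innermost bracket sits at the head of the deque; each entry pairs
    // the bracket text with the line number where it was opened.
    private final Deque<Map.Entry<String, Integer>> brackets = new ArrayDeque<>();

    public void open(String bracket, int line) {
        brackets.push(new SimpleEntry<>(bracket, line));
    }

    // Returns true only if the closer matches the innermost open bracket.
    public boolean close(String closer) {
        if (brackets.isEmpty()) return false;
        String open = brackets.peek().getKey();
        boolean matches = ("(".equals(open) && ")".equals(closer))
                || ("[".equals(open) && "]".equals(closer))
                || ("{".equals(open) && "}".equals(closer))
                || ("\"".equals(open) && "\"".equals(closer));
        if (matches) brackets.pop();
        return matches;
    }

    public boolean allClosed() {
        return brackets.isEmpty();
    }

    public static void main(String[] args) {
        BracketTracker t = new BracketTracker();
        t.open("(", 1);
        t.open("[", 1);
        System.out.println(t.close("]")); // innermost bracket is matched first
        System.out.println(t.close(")"));
        System.out.println(t.allClosed());
    }
}
```

Keeping the line number alongside each bracket lets the lexer report where an unclosed bracket was opened, rather than only where tokenizing failed.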
    • lexerStateStack

      private Deque<LexerImpl.State> lexerStateStack
      The state of the lexer is important so that we know what to expect next and to help discover errors in the template (ex. unclosed comments).
    • trimLeadingWhitespaceFromNextData

      private boolean trimLeadingWhitespaceFromNextData
      If we encountered an END delimiter that was preceded with a whitespace trim character (ex. {{ foo -}}) then this boolean is toggled to "true" which tells the lexData() method to trim leading whitespace from the next text token.
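The effect of that flag can be illustrated with a small self-contained sketch; the class and method names here are illustrative, not Pebble's actual implementation:

```java
public class TrimSketch {
    private boolean trimLeadingWhitespaceFromNextData = false;

    // Called when an end delimiter with a trim marker, e.g. "-}}", is seen.
    public void sawTrimmingEndDelimiter() {
        trimLeadingWhitespaceFromNextData = true;
    }

    // Applied to the next raw text token before it is emitted; the flag
    // is consumed so only the immediately following token is trimmed.
    public String lexData(String text) {
        if (trimLeadingWhitespaceFromNextData) {
            trimLeadingWhitespaceFromNextData = false;
            return text.replaceAll("^\\s+", "");
        }
        return text;
    }

    public static void main(String[] args) {
        TrimSketch s = new TrimSketch();
        s.sawTrimmingEndDelimiter();
        System.out.println("[" + s.lexData("   hello") + "]"); // trimmed
        System.out.println("[" + s.lexData("   hello") + "]"); // flag reset, untouched
    }
}
```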
    • REGEX_IDENTIFIER

      private static final Pattern REGEX_IDENTIFIER
      Static regular expressions for identifiers.
    • REGEX_LONG

      private static final Pattern REGEX_LONG
    • REGEX_NUMBER

      private static final Pattern REGEX_NUMBER
    • REGEX_DOUBLEQUOTE

      private static final Pattern REGEX_DOUBLEQUOTE
      Matches a double quote
    • REGEX_STRING_NON_INTERPOLATED_PART

      private static final Pattern REGEX_STRING_NON_INTERPOLATED_PART
      Matches everything up to the first interpolation in a double quoted string
    • REGEX_STRING_PLAIN

      private static final Pattern REGEX_STRING_PLAIN
      Matches single quoted strings and double quoted strings without interpolation. Extra complexity is due to ignoring escaped quotation marks.
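A pattern of the following shape (illustrative only, not Pebble's actual regex) shows the standard technique for matching a quoted string while ignoring escaped quotation marks: runs of non-quote, non-backslash characters alternate with backslash-escaped characters.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class StringRegexSketch {
    // For single quotes: '([^'\\]*(?:\\.[^'\\]*)*)'
    // i.e. plain characters, optionally interleaved with escape pairs.
    // The second alternative applies the same idea to double quotes.
    static final Pattern PLAIN_STRING =
            Pattern.compile("'([^'\\\\]*(?:\\\\.[^'\\\\]*)*)'"
                    + "|\"([^\"\\\\]*(?:\\\\.[^\"\\\\]*)*)\"");

    public static String firstMatch(String input) {
        Matcher m = PLAIN_STRING.matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // The escaped quote inside does not terminate the match.
        System.out.println(firstMatch("'it\\'s fine' trailing"));
    }
}
```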
    • PUNCTUATION

      private static final String PUNCTUATION
    • regexOperators

      private Pattern regexOperators
      Regular expression to find operators
  • Constructor Details

    • LexerImpl

      public LexerImpl(Syntax syntax, Collection<UnaryOperator> unaryOperators, Collection<BinaryOperator> binaryOperators)
      Constructor
      Parameters:
      syntax - The primary syntax
      unaryOperators - The available unary operators
      binaryOperators - The available binary operators
  • Method Details

    • tokenize

      public TokenStream tokenize(Reader reader, String name)
      This is the main method used to tokenize the raw contents of a template.
      Specified by:
      tokenize in interface Lexer
      Parameters:
      reader - The reader provided from the Loader
      name - The name of the template (used for meaningful error messages)
    • tokenizeStringInterpolation

      private void tokenizeStringInterpolation()
    • tokenizeString

      private void tokenizeString()
    • tokenizeData

      private void tokenizeData()
      The DATA state assumes that we are currently NOT between any pair of meaningful delimiters. We are looking for the next "open" or "start" delimiter, ex. the opening comment delimiter or the opening variable delimiter.
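The DATA-state scan can be sketched as a search for the earliest of the opening delimiters. The delimiter strings below follow Pebble's default syntax ({{, {%, {#); the surrounding class is illustrative:

```java
public class DataScanSketch {
    // Default opening delimiters: print, execute, and comment.
    private static final String[] OPENERS = {"{{", "{%", "{#"};

    // Returns the index of the earliest opening delimiter at or after
    // 'from', or -1 if the remainder of the template is plain text.
    public static int nextDelimiter(String source, int from) {
        int best = -1;
        for (String opener : OPENERS) {
            int idx = source.indexOf(opener, from);
            if (idx >= 0 && (best < 0 || idx < best)) {
                best = idx;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        String tpl = "Hello {# comment #} {{ name }}";
        System.out.println(nextDelimiter(tpl, 0)); // the comment opener comes first
    }
}
```

Everything before the returned index is emitted as a raw text token; the lexer then pushes the state corresponding to whichever delimiter was found.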
    • tokenizeBetweenExecuteDelimiters

      private void tokenizeBetweenExecuteDelimiters()
      Tokenizes between execute delimiters.
    • tokenizeBetweenPrintDelimiters

      private void tokenizeBetweenPrintDelimiters()
      Tokenizes between print delimiters.
    • tokenizeComment

      private void tokenizeComment()
      Tokenizes between comment delimiters.

      Simply find the closing delimiter for the comment and move the cursor to that point.

    • tokenizeExpression

      private void tokenizeExpression()
      Tokenizes an expression, which can occur within both execute and print regions.
    • unquoteAndUnescape

      private String unquoteAndUnescape(String str)
      This method assumes the provided str starts with a single or double quote. It removes the wrapping quotes, and un-escapes any quotes within the string.
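A possible implementation matching the contract stated above; the real method may differ in how it handles escapes, so treat this as a sketch:

```java
public class UnquoteSketch {
    // Assumes str starts and ends with the same quote character,
    // as the documented contract guarantees.
    public static String unquoteAndUnescape(String str) {
        char quote = str.charAt(0);
        String inner = str.substring(1, str.length() - 1);
        // Un-escape occurrences of the wrapping quote character.
        return inner.replace("\\" + quote, String.valueOf(quote));
    }

    public static void main(String[] args) {
        System.out.println(unquoteAndUnescape("'it\\'s'")); // wrapping quotes removed
        System.out.println(unquoteAndUnescape("\"hi\""));
    }
}
```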
    • checkForLeadingWhitespaceTrim

      private void checkForLeadingWhitespaceTrim(Token leadingToken)
    • checkForTrailingWhitespaceTrim

      private void checkForTrailingWhitespaceTrim()
    • lexVerbatimData

      private void lexVerbatimData(Matcher verbatimStartMatcher)
      Implementation of the "verbatim" tag
    • pushToken

      private Token pushToken(Token.Type type)
      Creates a Token of the given type, without a value, and pushes it onto the list of tokens that we are maintaining.
      Parameters:
      type - The type of Token we are creating
    • pushToken

      private Token pushToken(Token.Type type, String value)
      Creates a Token of the given type and value and pushes it onto the list of tokens that we are maintaining.
      Parameters:
      type - The type of token we are creating
      value - The value of the new token
    • popState

      private void popState()
      Pop state from the stack
    • buildOperatorRegex

      private void buildOperatorRegex()
      Retrieves the operators (both unary and binary) from the PebbleEngine and dynamically creates one giant regular expression to detect the presence of any of these operators.
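The idea can be sketched as follows: quote each operator symbol so regex metacharacters like + are treated literally, sort longest-first so that e.g. == wins over =, and join the alternatives into a single anchored pattern. This is illustrative, not Pebble's exact construction:

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class OperatorRegexSketch {
    public static Pattern buildOperatorRegex(List<String> symbols) {
        String alternation = symbols.stream()
                // Longest symbols first, so "==" is tried before "=".
                .sorted(Comparator.comparingInt(String::length).reversed())
                // Escape regex metacharacters such as '+' or '*'.
                .map(Pattern::quote)
                .collect(Collectors.joining("|"));
        // Anchored at the start: the lexer always matches at the cursor.
        return Pattern.compile("^(" + alternation + ")");
    }

    public static void main(String[] args) {
        Pattern p = buildOperatorRegex(Arrays.asList("+", "==", "=", "not"));
        System.out.println(p.matcher("== 3").find()); // matches "==", not "="
    }
}
```

Sorting by length before joining is what prevents a shorter operator from shadowing a longer one that shares its prefix.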