Class PHPTokenMaker

  • All Implemented Interfaces:
    TokenMaker

    public class PHPTokenMaker
    extends AbstractMarkupTokenMaker
    Scanner for PHP files. This implementation was created using JFlex 1.4.1; however, the generated file was modified for performance. Memory allocation needs to be almost completely removed to be competitive with the handwritten lexers (subclasses of AbstractTokenMaker), so this class has been modified so that Strings are never allocated (via yytext()), and the scanner never has to worry about refilling its buffer (needlessly copying chars around). We can achieve this because RText always scans exactly 1 line of tokens at a time, and hands the scanner this line as an array of characters (a Segment really). Since tokens contain pointers to char arrays instead of Strings holding their contents, there is no need for allocating new memory for Strings.

    The actual algorithm generated for scanning has, of course, not been modified.

    If you wish to regenerate this file yourself, keep in mind the following:

    • The generated PHPTokenMaker.java file will contain two definitions of both zzRefill and yyreset. You should hand-delete the second of each definition (the ones generated by the lexer), as these generated methods modify the input buffer, which we'll never have to do.
    • You should also change the declaration/definition of zzBuffer to NOT be initialized. This is a needless memory allocation for us since we will be pointing the array somewhere else anyway.
    • You should NOT call yylex() on the generated scanner directly; rather, you should use getTokenList as you would with any other TokenMaker instance.
    Version:
    0.9
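    The zero-allocation idea described above can be illustrated with a small sketch. SpanToken and scanLine are hypothetical names and are not part of this class; the only real API used is javax.swing.text.Segment. Each token records offsets into the line's shared char array instead of copying its text into a String.

```java
import javax.swing.text.Segment;

/** Illustrative only: tokens as views into a shared char[] (all names hypothetical). */
final class SpanTokenSketch {

    /** A token that points into the line's char[] rather than owning a String. */
    static final class SpanToken {
        final char[] text; // shared with the Segment; never copied
        final int start;   // index of the first char, inclusive
        final int end;     // index of the last char, inclusive
        SpanToken next;    // tokens form a singly linked list

        SpanToken(char[] text, int start, int end) {
            this.text = text;
            this.start = start;
            this.end = end;
        }

        int length() {
            return end - start + 1;
        }

        /** Compares against a lexeme without allocating a String for this token. */
        boolean is(String lexeme) {
            if (lexeme.length() != length()) {
                return false;
            }
            for (int i = 0; i < lexeme.length(); i++) {
                if (text[start + i] != lexeme.charAt(i)) {
                    return false;
                }
            }
            return true;
        }
    }

    /** Splits one line into whitespace-delimited tokens; returns the list head, or null. */
    static SpanToken scanLine(Segment line) {
        SpanToken head = null;
        SpanToken tail = null;
        int limit = line.offset + line.count;
        int i = line.offset;
        while (i < limit) {
            if (Character.isWhitespace(line.array[i])) {
                i++;
                continue;
            }
            int tokenStart = i;
            while (i < limit && !Character.isWhitespace(line.array[i])) {
                i++;
            }
            SpanToken token = new SpanToken(line.array, tokenStart, i - 1);
            if (head == null) {
                head = token;
            } else {
                tail.next = token;
            }
            tail = token;
        }
        return head;
    }

    public static void main(String[] args) {
        char[] chars = "<?php echo $x;".toCharArray();
        for (SpanToken t = scanLine(new Segment(chars, 0, chars.length)); t != null; t = t.next) {
            System.out.println(t.start + ".." + t.end);
        }
    }
}
```

    The sketch still allocates one small object per token; what it avoids is copying the token text, which is the cost the description above attributes to yytext().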
    • Field Detail

      • YYEOF

        public static final int YYEOF
        This character denotes the end of file.
        See Also:
        Constant Field Values
      • JS_TEMPLATE_LITERAL_EXPR

        public static final int JS_TEMPLATE_LITERAL_EXPR
        See Also:
        Constant Field Values
      • ZZ_CMAP_PACKED

        private static final java.lang.String ZZ_CMAP_PACKED
        Translates characters to character classes
        See Also:
        Constant Field Values
      • ZZ_CMAP

        private static final char[] ZZ_CMAP
        Translates characters to character classes
      • ZZ_ACTION

        private static final int[] ZZ_ACTION
        Translates DFA states to action switch labels.
      • ZZ_ACTION_PACKED_0

        private static final java.lang.String ZZ_ACTION_PACKED_0
        See Also:
        Constant Field Values
      • ZZ_ROWMAP

        private static final int[] ZZ_ROWMAP
        Translates a state to a row index in the transition table
      • ZZ_ROWMAP_PACKED_0

        private static final java.lang.String ZZ_ROWMAP_PACKED_0
        See Also:
        Constant Field Values
      • ZZ_TRANS

        private static final int[] ZZ_TRANS
        The transition table of the DFA
      • ZZ_TRANS_PACKED_0

        private static final java.lang.String ZZ_TRANS_PACKED_0
        See Also:
        Constant Field Values
      • ZZ_TRANS_PACKED_1

        private static final java.lang.String ZZ_TRANS_PACKED_1
        See Also:
        Constant Field Values
      • ZZ_TRANS_PACKED_2

        private static final java.lang.String ZZ_TRANS_PACKED_2
        See Also:
        Constant Field Values
      • ZZ_TRANS_PACKED_3

        private static final java.lang.String ZZ_TRANS_PACKED_3
        See Also:
        Constant Field Values
      • ZZ_TRANS_PACKED_4

        private static final java.lang.String ZZ_TRANS_PACKED_4
        See Also:
        Constant Field Values
      • ZZ_TRANS_PACKED_5

        private static final java.lang.String ZZ_TRANS_PACKED_5
        See Also:
        Constant Field Values
      • ZZ_TRANS_PACKED_6

        private static final java.lang.String ZZ_TRANS_PACKED_6
        See Also:
        Constant Field Values
      • ZZ_TRANS_PACKED_7

        private static final java.lang.String ZZ_TRANS_PACKED_7
        See Also:
        Constant Field Values
      • ZZ_ERROR_MSG

        private static final java.lang.String[] ZZ_ERROR_MSG
      • ZZ_ATTRIBUTE

        private static final int[] ZZ_ATTRIBUTE
        ZZ_ATTRIBUTE[aState] contains the attributes of state aState
      • ZZ_ATTRIBUTE_PACKED_0

        private static final java.lang.String ZZ_ATTRIBUTE_PACKED_0
        See Also:
        Constant Field Values
      • zzReader

        private java.io.Reader zzReader
        the input device
      • zzState

        private int zzState
        the current state of the DFA
      • zzLexicalState

        private int zzLexicalState
        the current lexical state
      • zzBuffer

        private char[] zzBuffer
        this buffer contains the current text to be matched and is the source of the yytext() string
      • zzMarkedPos

        private int zzMarkedPos
        the text position at the last accepting state
      • zzCurrentPos

        private int zzCurrentPos
        the current text position in the buffer
      • zzStartRead

        private int zzStartRead
        startRead marks the beginning of the yytext() string in the buffer
      • zzEndRead

        private int zzEndRead
        endRead marks the last character in the buffer that has been read from input
      • zzAtEOF

        private boolean zzAtEOF
        zzAtEOF == true <=> the scanner is at the EOF
      • INTERNAL_ATTR_DOUBLE

        static final int INTERNAL_ATTR_DOUBLE
        Type specific to PHPTokenMaker denoting a line ending with an unclosed double-quote attribute.
        See Also:
        Constant Field Values
      • INTERNAL_ATTR_SINGLE

        static final int INTERNAL_ATTR_SINGLE
        Type specific to PHPTokenMaker denoting a line ending with an unclosed single-quote attribute.
        See Also:
        Constant Field Values
      • INTERNAL_INTAG

        static final int INTERNAL_INTAG
        Token type specific to PHPTokenMaker; this signals that the user has ended a line with an unclosed HTML tag; thus a new line is beginning still inside of the tag.
        See Also:
        Constant Field Values
      • INTERNAL_INTAG_SCRIPT

        static final int INTERNAL_INTAG_SCRIPT
        Token type specific to PHPTokenMaker; this signals that the user has ended a line with an unclosed <script> tag.
        See Also:
        Constant Field Values
      • INTERNAL_ATTR_DOUBLE_QUOTE_SCRIPT

        static final int INTERNAL_ATTR_DOUBLE_QUOTE_SCRIPT
        Token type specifying we're in a double-quoted attribute in a script tag.
        See Also:
        Constant Field Values
      • INTERNAL_ATTR_SINGLE_QUOTE_SCRIPT

        static final int INTERNAL_ATTR_SINGLE_QUOTE_SCRIPT
        Token type specifying we're in a single-quoted attribute in a script tag.
        See Also:
        Constant Field Values
      • INTERNAL_INTAG_STYLE

        static final int INTERNAL_INTAG_STYLE
        Token type specifying that the user has ended a line with an unclosed <style> tag.
        See Also:
        Constant Field Values
      • INTERNAL_ATTR_DOUBLE_QUOTE_STYLE

        static final int INTERNAL_ATTR_DOUBLE_QUOTE_STYLE
        Token type specifying we're in a double-quoted attribute in a style tag.
        See Also:
        Constant Field Values
      • INTERNAL_ATTR_SINGLE_QUOTE_STYLE

        static final int INTERNAL_ATTR_SINGLE_QUOTE_STYLE
        Token type specifying we're in a single-quoted attribute in a style tag.
        See Also:
        Constant Field Values
      • INTERNAL_IN_JS

        static final int INTERNAL_IN_JS
        Token type specifying we're in JavaScript.
        See Also:
        Constant Field Values
      • INTERNAL_IN_JS_MLC

        static final int INTERNAL_IN_JS_MLC
        Token type specifying we're in a JavaScript multiline comment.
        See Also:
        Constant Field Values
      • INTERNAL_IN_JS_COMMENT_DOCUMENTATION

        static final int INTERNAL_IN_JS_COMMENT_DOCUMENTATION
        Token type specifying we're in a JavaScript documentation comment.
        See Also:
        Constant Field Values
      • INTERNAL_IN_JS_STRING_INVALID

        static final int INTERNAL_IN_JS_STRING_INVALID
        Token type specifying we're in an invalid multi-line JS string.
        See Also:
        Constant Field Values
      • INTERNAL_IN_JS_STRING_VALID

        static final int INTERNAL_IN_JS_STRING_VALID
        Token type specifying we're in a valid multi-line JS string.
        See Also:
        Constant Field Values
      • INTERNAL_IN_JS_CHAR_INVALID

        static final int INTERNAL_IN_JS_CHAR_INVALID
        Token type specifying we're in an invalid multi-line JS single-quoted string.
        See Also:
        Constant Field Values
      • INTERNAL_IN_JS_CHAR_VALID

        static final int INTERNAL_IN_JS_CHAR_VALID
        Token type specifying we're in a valid multi-line JS single-quoted string.
        See Also:
        Constant Field Values
      • INTERNAL_CSS

        static final int INTERNAL_CSS
        Internal type denoting a line ending in CSS.
        See Also:
        Constant Field Values
      • INTERNAL_CSS_PROPERTY

        static final int INTERNAL_CSS_PROPERTY
        Internal type denoting a line ending in a CSS property.
        See Also:
        Constant Field Values
      • INTERNAL_CSS_VALUE

        static final int INTERNAL_CSS_VALUE
        Internal type denoting a line ending in a CSS property value.
        See Also:
        Constant Field Values
      • INTERNAL_IN_JS_TEMPLATE_LITERAL_VALID

        static final int INTERNAL_IN_JS_TEMPLATE_LITERAL_VALID
        Token type specifying we're in a valid multi-line template literal.
        See Also:
        Constant Field Values
      • INTERNAL_IN_JS_TEMPLATE_LITERAL_INVALID

        static final int INTERNAL_IN_JS_TEMPLATE_LITERAL_INVALID
        Token type specifying we're in an invalid multi-line template literal.
        See Also:
        Constant Field Values
      • INTERNAL_CSS_STRING

        static final int INTERNAL_CSS_STRING
        Internal type denoting line ending in a CSS double-quote string. The state to return to is embedded in the actual end token type.
        See Also:
        Constant Field Values
      • INTERNAL_CSS_CHAR

        static final int INTERNAL_CSS_CHAR
        Internal type denoting line ending in a CSS single-quote string. The state to return to is embedded in the actual end token type.
        See Also:
        Constant Field Values
      • INTERNAL_CSS_MLC

        static final int INTERNAL_CSS_MLC
        Internal type denoting line ending in a CSS multi-line comment. The state to return to is embedded in the actual end token type.
        See Also:
        Constant Field Values
      • INTERNAL_IN_PHP

        static final int INTERNAL_IN_PHP
        Token type specifying we're in PHP. This particular field is public so that we can hack and key off of it for code completion.
        See Also:
        Constant Field Values
      • INTERNAL_IN_PHP_MLC

        static final int INTERNAL_IN_PHP_MLC
        Token type specifying we're in a PHP multiline comment.
        See Also:
        Constant Field Values
      • INTERNAL_IN_PHP_STRING

        static final int INTERNAL_IN_PHP_STRING
        Token type specifying we're in a PHP multiline string.
        See Also:
        Constant Field Values
      • INTERNAL_IN_PHP_CHAR

        static final int INTERNAL_IN_PHP_CHAR
        Token type specifying we're in a PHP multiline char.
        See Also:
        Constant Field Values
      • cssPrevState

        private int cssPrevState
        The previous CSS-related state we were in before going into a CSS string, multi-line comment, etc.
      • completeCloseTags

        private static boolean completeCloseTags
        Whether closing markup tags are automatically completed for PHP.
      • phpInState

        private int phpInState
        The state PHP was started in (YYINITIAL, INTERNAL_IN_JS, etc.).
      • phpInLangIndex

        private int phpInLangIndex
        The language index we were in when PHP was started.
      • validJSString

        private boolean validJSString
        When in the JS_STRING state, whether the current string is valid.
      • LANG_INDEX_DEFAULT

        static final int LANG_INDEX_DEFAULT
        Language state set on HTML tokens. Must be 0.
        See Also:
        Constant Field Values
      • LANG_INDEX_JS

        static final int LANG_INDEX_JS
        Language state set on JavaScript tokens.
        See Also:
        Constant Field Values
      • LANG_INDEX_CSS

        static final int LANG_INDEX_CSS
        Language state set on CSS tokens.
        See Also:
        Constant Field Values
      • LANG_INDEX_PHP

        static final int LANG_INDEX_PHP
        Language state set on PHP.
        See Also:
        Constant Field Values
      • varDepths

        private java.util.Stack<java.lang.Boolean> varDepths
    • Constructor Detail

      • PHPTokenMaker

        public PHPTokenMaker()
        Constructor. This must be here because JFlex does not generate a no-parameter constructor.
      • PHPTokenMaker

        public PHPTokenMaker​(java.io.Reader in)
        Creates a new scanner. There is also a java.io.InputStream version of this constructor.
        Parameters:
        in - the java.io.Reader to read input from.
      • PHPTokenMaker

        public PHPTokenMaker​(java.io.InputStream in)
        Creates a new scanner. There is also a java.io.Reader version of this constructor.
        Parameters:
        in - the java.io.InputStream to read input from.
    • Method Detail

      • zzUnpackAction

        private static int[] zzUnpackAction()
      • zzUnpackAction

        private static int zzUnpackAction​(java.lang.String packed,
                                          int offset,
                                          int[] result)
      • zzUnpackRowMap

        private static int[] zzUnpackRowMap()
      • zzUnpackRowMap

        private static int zzUnpackRowMap​(java.lang.String packed,
                                          int offset,
                                          int[] result)
      • zzUnpackTrans

        private static int[] zzUnpackTrans()
      • zzUnpackTrans

        private static int zzUnpackTrans​(java.lang.String packed,
                                         int offset,
                                         int[] result)
      • zzUnpackAttribute

        private static int[] zzUnpackAttribute()
      • zzUnpackAttribute

        private static int zzUnpackAttribute​(java.lang.String packed,
                                             int offset,
                                             int[] result)
      • addEndToken

        private void addEndToken​(int tokenType)
        Adds the token specified to the current linked list of tokens as an "end token;" that is, at zzMarkedPos.
        Parameters:
        tokenType - The token's type.
      • addHyperlinkToken

        private void addHyperlinkToken​(int start,
                                       int end,
                                       int tokenType)
        Adds the token specified to the current linked list of tokens.
        Parameters:
        tokenType - The token's type.
        See Also:
        addToken(int, int, int)
      • addPhpEndToken

        private void addPhpEndToken​(int endTokenState)
        Adds an end token that encodes the information necessary to return to the pre-PHP state and language index.
        Parameters:
        endTokenState - The PHP-related end-token state.
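        One way an end token type could carry both a "where to return" state and a language index is to pack them into separate bit fields of a single negative int. The following is a sketch under assumed field widths; the layout, the MARKER_IN_PHP value, and all method names are hypothetical and are not taken from this class.

```java
/** Illustrative bit-packing of a return state and language index (hypothetical layout). */
final class EndTokenEncodingSketch {

    // Assumed layout of the magnitude: bits 0-7 = previous state,
    // bits 8-11 = language index, bits 12 and up = marker flag.
    static final int MARKER_IN_PHP = 1 << 12;

    /** Packs the fields and negates the result so it cannot collide with normal token types. */
    static int encode(int marker, int prevState, int langIndex) {
        if (prevState < 0 || prevState > 0xFF || langIndex < 0 || langIndex > 0xF) {
            throw new IllegalArgumentException("field out of range for this layout");
        }
        return -(marker | (langIndex << 8) | prevState);
    }

    static boolean hasMarker(int encoded, int marker) {
        return ((-encoded) & marker) != 0;
    }

    static int prevState(int encoded) {
        return (-encoded) & 0xFF;
    }

    static int langIndex(int encoded) {
        return ((-encoded) >> 8) & 0xF;
    }

    public static void main(String[] args) {
        int endType = encode(MARKER_IN_PHP, 17, 2);
        System.out.println(prevState(endType) + " " + langIndex(endType));
    }
}
```

        When the next line is scanned, the decoder side of such a scheme recovers the state and language index to resume in once the PHP block closes.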
      • addToken

        private void addToken​(int tokenType)
        Adds the token specified to the current linked list of tokens.
        Parameters:
        tokenType - The token's type.
      • addToken

        private void addToken​(int start,
                              int end,
                              int tokenType)
        Adds the token specified to the current linked list of tokens.
        Parameters:
        tokenType - The token's type.
      • addToken

        public void addToken​(char[] array,
                             int start,
                             int end,
                             int tokenType,
                             int startOffset)
        Adds the token specified to the current linked list of tokens.
        Specified by:
        addToken in interface TokenMaker
        Overrides:
        addToken in class TokenMakerBase
        Parameters:
        array - The character array.
        start - The starting offset in the array.
        end - The ending offset in the array.
        tokenType - The token's type.
        startOffset - The offset in the document at which this token occurs.
      • createOccurrenceMarker

        protected OccurrenceMarker createOccurrenceMarker()
        Returns the occurrence marker to use for this token maker. Subclasses can override to use different implementations.
        Overrides:
        createOccurrenceMarker in class TokenMakerBase
        Returns:
        The occurrence marker to use.
      • getCompleteCloseTags

        public boolean getCompleteCloseTags()
        Returns whether markup close tags should be completed. You might not want this to be the case, since some tags in standard HTML aren't usually closed.
        Specified by:
        getCompleteCloseTags in class AbstractMarkupTokenMaker
        Returns:
        Whether closing markup tags are completed.
        See Also:
        setCompleteCloseTags(boolean)
      • getCurlyBracesDenoteCodeBlocks

        public boolean getCurlyBracesDenoteCodeBlocks​(int languageIndex)
        Description copied from class: TokenMakerBase
        Returns whether this programming language uses curly braces ('{' and '}') to denote code blocks. The default implementation returns false; subclasses can override this method if necessary.
        Specified by:
        getCurlyBracesDenoteCodeBlocks in interface TokenMaker
        Overrides:
        getCurlyBracesDenoteCodeBlocks in class TokenMakerBase
        Parameters:
        languageIndex - The language index at the offset in question. Since some TokenMakers effectively have nested languages (such as JavaScript in HTML), this parameter tells the TokenMaker what sub-language to look at.
        Returns:
        Whether curly braces denote code blocks.
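        As a rough illustration of how the languageIndex parameter can drive the answer, the sketch below dispatches on a language index. Only LANG_INDEX_DEFAULT being 0 comes from this page; the other constant values are assumed, and the choice of which languages return true reflects the syntax of those languages rather than a confirmed description of this class's behavior.

```java
/** Illustrative dispatch on a language index (constant values other than 0 are assumed). */
final class LanguageIndexSketch {

    static final int LANG_INDEX_DEFAULT = 0; // documented above: must be 0
    static final int LANG_INDEX_JS = 1;      // assumed value
    static final int LANG_INDEX_CSS = 2;     // assumed value
    static final int LANG_INDEX_PHP = 3;     // assumed value

    /** Returns true for the sub-languages whose blocks are delimited by '{' and '}'. */
    static boolean curlyBracesDenoteCodeBlocks(int languageIndex) {
        switch (languageIndex) {
            case LANG_INDEX_JS:
            case LANG_INDEX_CSS:
            case LANG_INDEX_PHP:
                return true;
            default:
                return false; // plain markup has no brace-delimited blocks
        }
    }

    public static void main(String[] args) {
        System.out.println(curlyBracesDenoteCodeBlocks(LANG_INDEX_PHP));
    }
}
```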
      • getLineCommentStartAndEnd

        public java.lang.String[] getLineCommentStartAndEnd​(int languageIndex)
        Returns the text to place at the beginning and end of a line to "comment" it in this programming language.
        Specified by:
        getLineCommentStartAndEnd in interface TokenMaker
        Overrides:
        getLineCommentStartAndEnd in class AbstractMarkupTokenMaker
        Parameters:
        languageIndex - The language index at the offset in question. Since some TokenMakers effectively have nested languages (such as JavaScript in HTML), this parameter tells the TokenMaker what sub-language to look at.
        Returns:
        The start and end strings to add to a line to "comment" it out. A null value for either means there is no string to add for that part. A value of null for the array means this language does not support commenting/uncommenting lines.
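        The null rules in the return contract above are easy to get wrong in calling code. The following sketch shows how a caller might apply the returned array to a single line; commentLine is a hypothetical helper, not a method of this library.

```java
/** Illustrative consumer of a "start and end" comment array (hypothetical helper). */
final class LineCommentSketch {

    /**
     * Wraps a line with the given comment delimiters.
     * A null array means commenting is unsupported, so the line is returned unchanged;
     * a null element means there is nothing to add on that side.
     */
    static String commentLine(String line, String[] startAndEnd) {
        if (startAndEnd == null) {
            return line;
        }
        StringBuilder sb = new StringBuilder();
        if (startAndEnd.length > 0 && startAndEnd[0] != null) {
            sb.append(startAndEnd[0]);
        }
        sb.append(line);
        if (startAndEnd.length > 1 && startAndEnd[1] != null) {
            sb.append(startAndEnd[1]);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(commentLine("<p>text</p>", new String[] {"<!--", "-->"}));
        System.out.println(commentLine("$x = 1;", new String[] {"//", null}));
    }
}
```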
      • getMarkOccurrencesOfTokenType

        public boolean getMarkOccurrencesOfTokenType​(int type)
        Returns whether tokens of the specified type should have "mark occurrences" enabled for the current programming language. The default implementation returns true if type is TokenTypes.IDENTIFIER. Subclasses can override this method to support other token types, such as TokenTypes.VARIABLE.
        Specified by:
        getMarkOccurrencesOfTokenType in interface TokenMaker
        Overrides:
        getMarkOccurrencesOfTokenType in class TokenMakerBase
        Parameters:
        type - The token type.
        Returns:
        Whether tokens of this type should have "mark occurrences" enabled.
      • getTokenList

        public Token getTokenList​(javax.swing.text.Segment text,
                                  int initialTokenType,
                                  int startOffset)
        Returns the first token in the linked list of tokens generated from text. This method must be implemented by subclasses so they can correctly implement syntax highlighting.
        Parameters:
        text - The text from which to get tokens.
        initialTokenType - The token type we should start with.
        startOffset - The offset into the document at which text starts.
        Returns:
        The first Token in a linked list representing the syntax highlighted text.
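        The role of initialTokenType is easiest to see in a line-by-line driver: the state a line ends in is fed in as the starting state of the next line. The sketch below models only that hand-off, using a toy scanner that tracks an unclosed /* comment. The state constants and method names are hypothetical and far simpler than the internal token types listed under Field Detail.

```java
/** Illustrative line-by-line scanning with state carried between lines (hypothetical names). */
final class LineStateSketch {

    static final int STATE_NORMAL = 0;  // hypothetical: not inside any multi-line construct
    static final int STATE_IN_MLC = -1; // hypothetical "internal" type: unclosed /* comment

    /** Scans one line starting in initialState and returns the state the line ends in. */
    static int scanLine(char[] line, int initialState) {
        int state = initialState;
        int i = 0;
        while (i < line.length) {
            boolean hasNext = i + 1 < line.length;
            if (state == STATE_NORMAL && hasNext && line[i] == '/' && line[i + 1] == '*') {
                state = STATE_IN_MLC;
                i += 2;
            } else if (state == STATE_IN_MLC && hasNext && line[i] == '*' && line[i + 1] == '/') {
                state = STATE_NORMAL;
                i += 2;
            } else {
                i++;
            }
        }
        return state;
    }

    /** Scans each line in order, passing the previous line's end state as the initial state. */
    static int[] scanDocument(String[] lines) {
        int[] endStates = new int[lines.length];
        int state = STATE_NORMAL;
        for (int i = 0; i < lines.length; i++) {
            state = scanLine(lines[i].toCharArray(), state);
            endStates[i] = state;
        }
        return endStates;
    }

    public static void main(String[] args) {
        int[] ends = scanDocument(new String[] {"a /* start", "still comment", "end */ b"});
        System.out.println(java.util.Arrays.toString(ends));
    }
}
```

        Because each line depends only on its text and the previous line's end state, an editor can rescan a single edited line and stop as soon as its end state is unchanged.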
      • setCompleteCloseTags

        public static void setCompleteCloseTags​(boolean complete)
        Sets whether markup close tags should be completed. You might not want this to be the case, since some tags in standard HTML aren't usually closed.
        Parameters:
        complete - Whether closing markup tags are completed.
        See Also:
        getCompleteCloseTags()
      • yybegin

        protected void yybegin​(int state,
                               int languageIndex)
        Overridden to remember the language index we're leaving.
        Overrides:
        yybegin in class AbstractJFlexTokenMaker
        Parameters:
        state - The new JFlex state to enter.
        languageIndex - The new language index.
      • zzRefill

        private boolean zzRefill()
        Refills the input buffer.
        Returns:
        true if EOF was reached, otherwise false.
      • yyreset

        public final void yyreset​(java.io.Reader reader)
        Resets the scanner to read from a new input stream. Does not close the old reader. All internal variables are reset, and the old input stream cannot be reused (the internal buffer is discarded and lost). The lexical state is set to YYINITIAL.
        Parameters:
        reader - the new input stream
      • zzUnpackCMap

        private static char[] zzUnpackCMap​(java.lang.String packed)
        Unpacks the compressed character translation table.
        Parameters:
        packed - the packed character translation table
        Returns:
        the unpacked character translation table
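        For readers unfamiliar with packed lexer tables, the sketch below shows a run-length decoding of the kind commonly seen in JFlex-generated scanners, where the packed string is a sequence of (count, value) character pairs. Treat the exact format as an assumption; the class name, the size parameter, and the sample data are made up for illustration.

```java
/** Illustrative run-length unpacking of a character map (format and names are assumptions). */
final class CMapUnpackSketch {

    /**
     * Expands (count, value) pairs into a table of the given size.
     * Each count is expected to be at least 1, and the counts must sum to at most size.
     */
    static char[] unpack(String packed, int size) {
        char[] map = new char[size];
        int i = 0; // index into the packed string
        int j = 0; // index into the unpacked table
        while (i + 1 < packed.length()) {
            int count = packed.charAt(i++);
            char value = packed.charAt(i++);
            do {
                map[j++] = value;
            } while (--count > 0);
        }
        return map;
    }

    public static void main(String[] args) {
        // Three entries of class 0 followed by two entries of class 1.
        char[] map = unpack(new String(new char[] {3, 0, 2, 1}), 5);
        for (char c : map) {
            System.out.print((int) c + " ");
        }
        System.out.println();
    }
}
```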
      • yyclose

        public final void yyclose()
                           throws java.io.IOException
        Closes the input stream.
        Specified by:
        yyclose in class AbstractJFlexTokenMaker
        Throws:
        java.io.IOException - If an IO error occurs.
      • yystate

        public final int yystate()
        Returns the current lexical state.
      • yybegin

        public final void yybegin​(int newState)
        Enters a new lexical state.
        Specified by:
        yybegin in class AbstractJFlexTokenMaker
        Parameters:
        newState - the new lexical state
      • yytext

        public final java.lang.String yytext()
        Returns the text matched by the current regular expression.
        Specified by:
        yytext in class AbstractJFlexTokenMaker
      • yycharat

        public final char yycharat​(int pos)
        Returns the character at position pos from the matched text. It is equivalent to yytext().charAt(pos), but faster.
        Parameters:
        pos - the position of the character to fetch. A value from 0 to yylength()-1.
        Returns:
        the character at position pos
      • yylength

        public final int yylength()
        Returns the length of the matched text region.
      • zzScanError

        private void zzScanError​(int errorCode)
        Reports an error that occurred while scanning. In a well-formed scanner (no or only correct usage of yypushback(int) and a match-all fallback rule) this method will only be called with things that "Can't Possibly Happen". If this method is called, something is seriously wrong (e.g. a JFlex bug producing a faulty scanner etc.). Usual syntax/scanner level error handling should be done in error fallback rules.
        Parameters:
        errorCode - the code of the error message to display
      • yypushback

        public void yypushback​(int number)
        Pushes the specified number of characters back into the input stream. They will be read again by the next call of the scanning method.
        Parameters:
        number - the number of characters to be read again. This number must not be greater than yylength()!
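        The relationship between yytext(), yylength(), yycharat(int), and yypushback(int) can be sketched in terms of the zzBuffer, zzStartRead, and zzMarkedPos fields described under Field Detail. This is a minimal model of the usual JFlex arrangement, not the code of this class: setMatch is a hypothetical stand-in for the scanner recording a match, and the exception type is chosen for the sketch.

```java
/** Illustrative model of the matched-text window in a JFlex-style scanner. */
final class MatchWindowSketch {

    private final char[] zzBuffer; // text being scanned
    private int zzStartRead;       // start of the current match
    private int zzMarkedPos;       // one past the end of the current match

    MatchWindowSketch(char[] buffer) {
        this.zzBuffer = buffer;
    }

    /** Hypothetical helper: pretend the scanner just matched buffer[start, end). */
    void setMatch(int start, int end) {
        zzStartRead = start;
        zzMarkedPos = end;
    }

    int yylength() {
        return zzMarkedPos - zzStartRead;
    }

    char yycharat(int pos) {
        return zzBuffer[zzStartRead + pos]; // no String is created
    }

    String yytext() {
        return new String(zzBuffer, zzStartRead, yylength()); // allocates a copy
    }

    /** Shrinks the match from the right so the characters are scanned again. */
    void yypushback(int number) {
        if (number > yylength()) {
            throw new IllegalArgumentException("cannot push back more than yylength()");
        }
        zzMarkedPos -= number;
    }

    public static void main(String[] args) {
        MatchWindowSketch scanner = new MatchWindowSketch("echo(".toCharArray());
        scanner.setMatch(0, 5);
        scanner.yypushback(1); // leave "(" for the next match
        System.out.println(scanner.yytext());
    }
}
```

        In this model yypushback only moves the end marker, which is why the number pushed back can never exceed yylength().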
      • yylex

        public Token yylex()
                    throws java.io.IOException
        Resumes scanning until the next regular expression is matched, the end of input is encountered or an I/O-Error occurs.
        Returns:
        the next token
        Throws:
        java.io.IOException - if any I/O-Error occurs