Class CharacterReader

java.lang.Object
com.itextpdf.styledxmlparser.jsoup.parser.CharacterReader

public final class CharacterReader extends Object
CharacterReader consumes tokens off a string. Used internally by jsoup. API subject to change.
  • Field Details

    • EOF

      static final char EOF
    • maxStringCacheLen

      private static final int maxStringCacheLen
    • maxBufferLen

      static final int maxBufferLen
    • readAheadLimit

      static final int readAheadLimit
    • minReadAheadLen

      private static final int minReadAheadLen
    • charBuf

      private char[] charBuf
    • reader

      private Reader reader
    • bufLength

      private int bufLength
    • bufSplitPoint

      private int bufSplitPoint
    • bufPos

      private int bufPos
    • readerPos

      private int readerPos
    • bufMark

      private int bufMark
    • stringCacheSize

      private static final int stringCacheSize
    • stringCache

      private String[] stringCache
    • readFully

      private boolean readFully
  • Constructor Details

    • CharacterReader

      public CharacterReader(Reader input, int sz)
    • CharacterReader

      public CharacterReader(Reader input)
    • CharacterReader

      public CharacterReader(String input)
  • Method Details

    • close

      public void close()
    • bufferUp

      private void bufferUp()
    • pos

      public int pos()
      Gets the current cursor position in the content.
      Returns:
      current position
    • isEmpty

      public boolean isEmpty()
      Tests if all the content has been read.
      Returns:
      true if nothing is left to read.
    • isEmptyNoBufferUp

      private boolean isEmptyNoBufferUp()
    • current

      public char current()
      Get the char at the current position.
      Returns:
      the char at the current position
    • consume

      char consume()
    • unconsume

      void unconsume()
      Unconsumes one character (bufPos--). MUST only be called directly after a consume(), with no chance of an intervening bufferUp.
    • advance

      public void advance()
      Moves the current position by one.
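The public cursor methods (pos, isEmpty, current, advance) are typically combined into a simple scan loop. The sketch below shows that pattern on a hypothetical stand-in class, CursorSketch, backed by a plain String. It is not the library's implementation: the EOF sentinel value, the behaviour of current() at end of input, and the countChar helper are all assumptions made for illustration.

```java
// Hypothetical stand-in for illustration only; not the real CharacterReader.
final class CursorSketch {
    // Assumed sentinel value returned when there is nothing left to read.
    static final char EOF = (char) -1;

    private final String input;
    private int pos = 0;

    CursorSketch(String input) {
        this.input = input;
    }

    // Current cursor position in the content.
    int pos() {
        return pos;
    }

    // True once every character has been read.
    boolean isEmpty() {
        return pos >= input.length();
    }

    // Character at the cursor, or EOF when the input is exhausted (assumption).
    char current() {
        return isEmpty() ? EOF : input.charAt(pos);
    }

    // Moves the cursor forward by one character.
    void advance() {
        pos++;
    }

    // Illustrative helper: counts occurrences of target with a scan loop.
    static int countChar(String text, char target) {
        CursorSketch r = new CursorSketch(text);
        int n = 0;
        while (!r.isEmpty()) {
            if (r.current() == target) {
                n++;
            }
            r.advance();
        }
        return n;
    }

    public static void main(String[] args) {
        System.out.println(countChar("a,b,,c", ','));
    }
}
```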
    • mark

      void mark()
    • unmark

      void unmark()
    • rewindToMark

      void rewindToMark()
    • nextIndexOf

      int nextIndexOf(char c)
      Returns the number of characters between the current position and the next instance of the input char.
      Parameters:
      c - scan target
      Returns:
      offset between current position and next instance of target. -1 if not found.
    • nextIndexOf

      int nextIndexOf(CharSequence seq)
      Returns the number of characters between the current position and the next instance of the input sequence.
      Parameters:
      seq - scan target
      Returns:
      offset between current position and next instance of target. -1 if not found.
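Both nextIndexOf overloads return an offset relative to the current position, not an absolute index. A minimal sketch of that contract follows, using a hypothetical class NextIndexOfSketch that holds the whole input as a String; the real class works on an internal char buffer, so this only illustrates the documented return values.

```java
// Hypothetical stand-in for illustration only; not the real CharacterReader.
final class NextIndexOfSketch {
    private final String input;
    private final int pos; // cursor position the offsets are measured from

    NextIndexOfSketch(String input, int pos) {
        if (pos < 0 || pos > input.length()) {
            throw new IllegalArgumentException("pos out of range");
        }
        this.input = input;
        this.pos = pos;
    }

    // Offset from the cursor to the next occurrence of c, or -1 if not found.
    int nextIndexOf(char c) {
        for (int i = pos; i < input.length(); i++) {
            if (input.charAt(i) == c) {
                return i - pos;
            }
        }
        return -1;
    }

    // Offset from the cursor to the next occurrence of seq, or -1 if not found.
    int nextIndexOf(CharSequence seq) {
        int idx = input.indexOf(seq.toString(), pos);
        return idx == -1 ? -1 : idx - pos;
    }

    public static void main(String[] args) {
        NextIndexOfSketch r = new NextIndexOfSketch("<p>One</p>", 3);
        System.out.println(r.nextIndexOf('<'));
        System.out.println(r.nextIndexOf("</p>"));
    }
}
```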
    • consumeTo

      public String consumeTo(char c)
      Reads characters up to the specified char.
      Parameters:
      c - the delimiter
      Returns:
      the chars read
    • consumeTo

      String consumeTo(String seq)
    • consumeToAny

      public String consumeToAny(char... chars)
      Read characters until the first of any delimiters is found.
      Parameters:
      chars - delimiters to scan for
      Returns:
      characters read up to the matched delimiter.
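The consumeTo and consumeToAny methods read up to a delimiter and return what was read. The sketch below illustrates this on a hypothetical class, ConsumeToSketch. Two behaviours here are assumptions rather than statements from this page: the delimiter itself is left unconsumed at the cursor, and when no delimiter is found the rest of the input is returned.

```java
// Hypothetical stand-in for illustration only; not the real CharacterReader.
final class ConsumeToSketch {
    private final String input;
    private int pos = 0;

    ConsumeToSketch(String input) {
        this.input = input;
    }

    // Reads until the first of any delimiter; the delimiter stays unread (assumption).
    String consumeToAny(char... delims) {
        int start = pos;
        scan:
        while (pos < input.length()) {
            char c = input.charAt(pos);
            for (char d : delims) {
                if (c == d) {
                    break scan; // stop in front of the matched delimiter
                }
            }
            pos++;
        }
        return input.substring(start, pos);
    }

    // Single-delimiter variant, expressed through consumeToAny.
    String consumeTo(char c) {
        return consumeToAny(c);
    }

    // Skips one character, for example the delimiter that ended the last read.
    void advance() {
        if (pos < input.length()) {
            pos++;
        }
    }

    public static void main(String[] args) {
        ConsumeToSketch r = new ConsumeToSketch("key=value;rest");
        String key = r.consumeToAny('=', ';');
        r.advance(); // step over '='
        String value = r.consumeTo(';');
        System.out.println(key + " -> " + value);
    }
}
```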
    • consumeToAnySorted

      String consumeToAnySorted(char... chars)
    • consumeData

      String consumeData()
    • consumeAttributeQuoted

      String consumeAttributeQuoted(boolean single)
    • consumeRawData

      String consumeRawData()
    • consumeTagName

      String consumeTagName()
    • consumeToEnd

      String consumeToEnd()
    • consumeLetterSequence

      String consumeLetterSequence()
    • consumeLetterThenDigitSequence

      String consumeLetterThenDigitSequence()
    • consumeHexSequence

      String consumeHexSequence()
    • consumeDigitSequence

      String consumeDigitSequence()
    • matches

      boolean matches(char c)
    • matches

      boolean matches(String seq)
    • matchesIgnoreCase

      boolean matchesIgnoreCase(String seq)
    • matchesAny

      boolean matchesAny(char... seq)
    • matchesAnySorted

      boolean matchesAnySorted(char[] seq)
    • matchesLetter

      boolean matchesLetter()
    • matchesDigit

      boolean matchesDigit()
    • matchConsume

      boolean matchConsume(String seq)
    • matchConsumeIgnoreCase

      boolean matchConsumeIgnoreCase(String seq)
    • containsIgnoreCase

      boolean containsIgnoreCase(String seq)
    • toString

      public String toString()
      Overrides:
      toString in class Object
    • cacheString

      private static String cacheString(char[] charBuf, String[] stringCache, int start, int count)
      Caches short strings, using the flyweight pattern, to reduce GC load. The cache is kept only for the current document, to prevent leaks.

      The cache is simplistic: on a hash collision it just falls back to creating a new string, rather than using a full HashMap with an Entry list. That saves both creating objects as hash keys and walking the entry list, at the expense of some more duplicates.
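The caching scheme described above can be sketched as a fixed-size String array indexed by a hash of the character range. This is an illustrative approximation, not the library's code: the class name StringCacheSketch, the constants MAX_CACHED_LEN and CACHE_SIZE, the hash function, and the choice to overwrite the slot on a collision are all assumptions.

```java
// Hypothetical stand-in for illustration only; not the real CharacterReader.
final class StringCacheSketch {
    // Assumed limits; the real constant values are not shown on this page.
    static final int MAX_CACHED_LEN = 12;
    static final int CACHE_SIZE = 512; // must be a power of two for the mask below

    // Returns a shared String for buf[start, start + count) when one is cached.
    static String cacheString(char[] buf, String[] cache, int start, int count) {
        if (count > MAX_CACHED_LEN) {
            return new String(buf, start, count); // too long to be worth caching
        }
        if (count < 1) {
            return "";
        }

        // Hash the range directly, without creating a key object.
        int hash = 0;
        for (int i = 0; i < count; i++) {
            hash = 31 * hash + buf[start + i];
        }
        int index = hash & (cache.length - 1);

        String cached = cache[index];
        if (cached != null && rangeEquals(buf, start, count, cached)) {
            return cached; // hit: reuse the existing instance
        }
        // Miss or collision: create a new string and take over the slot (assumption).
        String fresh = new String(buf, start, count);
        cache[index] = fresh;
        return fresh;
    }

    // True if buf[start, start + count) holds exactly the characters of cached.
    static boolean rangeEquals(char[] buf, int start, int count, String cached) {
        if (count != cached.length()) {
            return false;
        }
        for (int i = 0; i < count; i++) {
            if (buf[start + i] != cached.charAt(i)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        char[] buf = "div div span".toCharArray();
        String[] cache = new String[CACHE_SIZE];
        String first = cacheString(buf, cache, 0, 3);
        String second = cacheString(buf, cache, 4, 3);
        System.out.println(first == second); // same instance when the cache hits
    }
}
```

Keeping the cache as a plain array means a lookup allocates nothing on a hit, which is the trade-off the description above makes against a full HashMap.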

    • rangeEquals

      static boolean rangeEquals(char[] charBuf, int start, int count, String cached)
      Check if the value of the provided range equals the string.
    • rangeEquals

      boolean rangeEquals(int start, int count, String cached)