Class PRTokeniser


  • public class PRTokeniser
    extends java.lang.Object
    • Field Detail

      • outBuf

        private final java.lang.StringBuilder outBuf
      • delims

        public static final boolean[] delims
      • stringValue

        protected java.lang.String stringValue
      • reference

        protected int reference
      • generation

        protected int generation
      • hexString

        protected boolean hexString
    • Constructor Detail

      • PRTokeniser

        public PRTokeniser​(RandomAccessFileOrArray file)
        Creates a PRTokeniser for the specified RandomAccessFileOrArray. The beginning of the file is read to determine the location of the header, and the data source is adjusted as necessary to account for any junk that occurs in the byte source before the header.
        Parameters:
        file - the source
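
        The header search described for this constructor can be pictured with a small standalone sketch. It is not the library's code: the class HeaderOffsetSketch and the method findHeaderOffset are hypothetical names, and the sketch assumes the header is identified by the "%PDF-" marker only.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; not part of PRTokeniser.
final class HeaderOffsetSketch {

    // Returns the index of the first "%PDF-" marker in data, or -1 if none is found.
    // Bytes before that index correspond to the "junk" the constructor accounts for.
    static int findHeaderOffset(byte[] data) {
        byte[] marker = "%PDF-".getBytes(StandardCharsets.US_ASCII);
        outer:
        for (int i = 0; i + marker.length <= data.length; i++) {
            for (int j = 0; j < marker.length; j++) {
                if (data[i + j] != marker[j]) {
                    continue outer; // mismatch, try the next start position
                }
            }
            return i;
        }
        return -1;
    }
}
```

        A caller could use such an offset to translate positions stored in the file into positions in the underlying byte source.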
    • Method Detail

      • seek

        public void seek​(long pos)
                  throws java.io.IOException
        Throws:
        java.io.IOException
      • getFilePointer

        public long getFilePointer()
                            throws java.io.IOException
        Throws:
        java.io.IOException
      • close

        public void close()
                   throws java.io.IOException
        Throws:
        java.io.IOException
      • length

        public long length()
                    throws java.io.IOException
        Throws:
        java.io.IOException
      • read

        public int read()
                 throws java.io.IOException
        Throws:
        java.io.IOException
      • readString

        public java.lang.String readString​(int size)
                                    throws java.io.IOException
        Throws:
        java.io.IOException
      • isWhitespace

        public static final boolean isWhitespace​(int ch)
        Checks whether a character is whitespace. Currently checks for the following character codes: 0, 9, 10, 12, 13, 32.
        The same as calling isWhitespace(ch, true).
        Parameters:
        ch - int
        Returns:
        boolean
        Since:
        5.5.1
      • isWhitespace

        public static final boolean isWhitespace​(int ch,
                                                 boolean isWhitespace)
        Checks whether a character is whitespace. Currently checks for the following character codes: 0, 9, 10, 12, 13, 32.
        Parameters:
        ch - int
        isWhitespace - boolean
        Returns:
        boolean
        Since:
        5.5.1
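
        The two overloads can be sketched as follows. This is a minimal standalone sketch, not the library's source: the class name WhitespaceSketch is hypothetical, and it assumes that the boolean argument decides whether character code 0 counts as whitespace, by analogy with the isNullWhitespace parameter of readLineSegment.

```java
// Hypothetical illustration only; not part of PRTokeniser.
final class WhitespaceSketch {

    // Assumption: the flag only affects character code 0 (NUL).
    static boolean isWhitespace(int ch, boolean isNullWhitespace) {
        return (isNullWhitespace && ch == 0)
                || ch == 9    // horizontal tab
                || ch == 10   // line feed
                || ch == 12   // form feed
                || ch == 13   // carriage return
                || ch == 32;  // space
    }

    // Single-argument form, documented as equivalent to passing true.
    static boolean isWhitespace(int ch) {
        return isWhitespace(ch, true);
    }
}
```

        Note that the listed values are character codes, so the digit character '0' (code 48) is not whitespace.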
      • isDelimiter

        public static final boolean isDelimiter​(int ch)
      • isDelimiterWhitespace

        public static final boolean isDelimiterWhitespace​(int ch)
      • getStringValue

        public java.lang.String getStringValue()
      • getReference

        public int getReference()
        Gets the current reference number. If parsing failed with a NumberFormatException, -1 is returned.
        Returns:
        a positive integer for a correct reference, or a negative value for an incorrect one.
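
        The -1 convention can be illustrated with a short standalone sketch. The class ReferenceParseSketch and the method parseReference are hypothetical names, and using Integer.parseInt here is an assumption about how such a value might be parsed, not a statement about the library's internals.

```java
// Hypothetical illustration only; not part of PRTokeniser.
final class ReferenceParseSketch {

    // Parses a reference number, mapping a NumberFormatException to -1
    // so that callers can test for a negative result instead of catching.
    static int parseReference(String text) {
        try {
            return Integer.parseInt(text);
        } catch (NumberFormatException e) {
            return -1;
        }
    }
}
```

        Code that calls getReference() would then typically treat any negative value as an invalid reference.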
      • getGeneration

        public int getGeneration()
      • backOnePosition

        public void backOnePosition​(int ch)
      • throwError

        public void throwError​(java.lang.String error)
                        throws java.io.IOException
        Throws:
        java.io.IOException
      • getHeaderOffset

        public int getHeaderOffset()
                            throws java.io.IOException
        Throws:
        java.io.IOException
      • checkPdfHeader

        public char checkPdfHeader()
                            throws java.io.IOException
        Throws:
        java.io.IOException
      • checkFdfHeader

        public void checkFdfHeader()
                            throws java.io.IOException
        Throws:
        java.io.IOException
      • getStartxref

        public long getStartxref()
                          throws java.io.IOException
        Throws:
        java.io.IOException
      • getHex

        public static int getHex​(int v)
      • nextValidToken

        public void nextValidToken()
                            throws java.io.IOException
        Throws:
        java.io.IOException
      • nextToken

        public boolean nextToken()
                          throws java.io.IOException
        Throws:
        java.io.IOException
      • longValue

        public long longValue()
      • intValue

        public int intValue()
      • readLineSegment

        public boolean readLineSegment​(byte[] input)
                                throws java.io.IOException
        Reads data into the provided byte[]. Checks for leading whitespace. See isWhitespace(int) or isWhitespace(int, boolean) for a list of whitespace characters.
        The same as calling readLineSegment(input, true).
        Parameters:
        input - byte[]
        Returns:
        boolean
        Throws:
        java.io.IOException
        Since:
        5.5.1
      • readLineSegment

        public boolean readLineSegment​(byte[] input,
                                       boolean isNullWhitespace)
                                throws java.io.IOException
        Reads data into the provided byte[]. Checks for leading whitespace. See isWhitespace(int) or isWhitespace(int, boolean) for a list of whitespace characters.
        Parameters:
        input - byte[]
        isNullWhitespace - boolean indicating whether character code 0 is treated as whitespace. If in doubt, use true or the overloaded method readLineSegment(input)
        Returns:
        boolean
        Throws:
        java.io.IOException
        Since:
        5.5.1
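
        The effect of isNullWhitespace on the leading-whitespace check can be shown in isolation. This is a standalone sketch under stated assumptions: the class LeadingWhitespaceSketch and the method firstNonWhitespace are hypothetical, and the sketch assumes leading whitespace is skipped. It does not reproduce the reading logic or the return value of readLineSegment.

```java
// Hypothetical illustration only; not part of PRTokeniser.
final class LeadingWhitespaceSketch {

    // Same character codes as listed for isWhitespace; the flag controls code 0.
    static boolean isWhitespace(int ch, boolean isNullWhitespace) {
        return (isNullWhitespace && ch == 0)
                || ch == 9 || ch == 10 || ch == 12 || ch == 13 || ch == 32;
    }

    // Returns the index of the first byte that is not whitespace,
    // or data.length if every byte is whitespace.
    static int firstNonWhitespace(byte[] data, boolean isNullWhitespace) {
        int i = 0;
        while (i < data.length && isWhitespace(data[i] & 0xff, isNullWhitespace)) {
            i++;
        }
        return i;
    }
}
```

        With input that starts with a 0 byte, passing true skips that byte together with any following whitespace, while passing false stops at index 0.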
      • checkObjectStart

        public static long[] checkObjectStart​(byte[] line)
      • isHexString

        public boolean isHexString()