Package com.itextpdf.io.source
Class PdfTokenizer
- java.lang.Object
-
- com.itextpdf.io.source.PdfTokenizer
-
- All Implemented Interfaces:
java.io.Closeable, java.lang.AutoCloseable
public class PdfTokenizer extends java.lang.Object implements java.io.Closeable
-
-
Nested Class Summary
Modifier and Type | Class | Description
static class
PdfTokenizer.TokenType
-
Field Summary
Modifier and Type | Field | Description
private boolean
closeStream
Streams are closed automatically.
static boolean[]
delims
static byte[]
F
static byte[]
False
private RandomAccessFileOrArray
file
protected int
generation
protected boolean
hexString
static byte[]
N
static byte[]
Null
static byte[]
Obj
protected ByteBuffer
outBuf
static byte[]
R
protected int
reference
static byte[]
Startxref
static byte[]
Stream
static byte[]
Trailer
static byte[]
True
protected PdfTokenizer.TokenType
type
static byte[]
Xref
-
Constructor Summary
Constructor | Description
PdfTokenizer(RandomAccessFileOrArray file)
Creates a PdfTokenizer for the specified RandomAccessFileOrArray.
-
Method Summary
Modifier and Type | Method | Description
void
backOnePosition(int ch)
void
checkFdfHeader()
static int[]
checkObjectStart(PdfTokenizer lineTokenizer)
Checks whether the line starts with an object declaration.
java.lang.String
checkPdfHeader()
static boolean
checkTrailer(ByteBuffer line)
Checks whether line equals 'trailer'.
void
close()
static byte[]
decodeStringContent(byte[] content, boolean hexWriting)
Resolves escape symbols or hexadecimal symbols.
protected static byte[]
decodeStringContent(byte[] content, int from, int to, boolean hexWriting)
Resolves escape symbols or hexadecimal symbols.
byte[]
getByteContent()
byte[]
getDecodedStringContent()
int
getGenNr()
int
getHeaderOffset()
int
getIntValue()
long
getLongValue()
long
getNextEof()
Gets the next %%EOF marker in the current PDF file.
int
getObjNr()
long
getPosition()
RandomAccessFileOrArray
getSafeFile()
long
getStartxref()
java.lang.String
getStringValue()
PdfTokenizer.TokenType
getTokenType()
boolean
isCloseStream()
protected static boolean
isDelimiter(int ch)
protected static boolean
isDelimiterWhitespace(int ch)
boolean
isHexString()
static boolean
isWhitespace(int ch)
Checks whether a character is whitespace. Currently the character codes 0, 9, 10, 12, 13 and 32 are treated as whitespace.
protected static boolean
isWhitespace(int ch, boolean isWhitespace)
Checks whether a character is whitespace.
long
length()
boolean
nextToken()
void
nextValidToken()
int
read()
void
readFully(byte[] bytes)
boolean
readLineSegment(ByteBuffer buffer)
Reads data into the provided ByteBuffer.
boolean
readLineSegment(ByteBuffer buffer, boolean isNullWhitespace)
Reads data into the provided ByteBuffer.
java.lang.String
readString(int size)
void
seek(long pos)
void
setCloseStream(boolean closeStream)
void
throwError(java.lang.String error, java.lang.Object... messageParams)
Helper method to handle content errors.
boolean
tokenValueEqualsTo(byte[] cmp)
-
-
-
Field Detail
-
delims
public static final boolean[] delims
-
Obj
public static final byte[] Obj
-
R
public static final byte[] R
-
Xref
public static final byte[] Xref
-
Startxref
public static final byte[] Startxref
-
Stream
public static final byte[] Stream
-
Trailer
public static final byte[] Trailer
-
N
public static final byte[] N
-
F
public static final byte[] F
-
Null
public static final byte[] Null
-
True
public static final byte[] True
-
False
public static final byte[] False
-
type
protected PdfTokenizer.TokenType type
-
reference
protected int reference
-
generation
protected int generation
-
hexString
protected boolean hexString
-
outBuf
protected ByteBuffer outBuf
-
file
private final RandomAccessFileOrArray file
-
closeStream
private boolean closeStream
Streams are closed automatically.
-
-
Constructor Detail
-
PdfTokenizer
public PdfTokenizer(RandomAccessFileOrArray file)
Creates a PdfTokenizer for the specified RandomAccessFileOrArray. The beginning of the file is read to determine the location of the header, and the data source is adjusted as necessary to account for any junk that occurs in the byte source before the header.
- Parameters:
file
- the source
-
-
Method Detail
-
seek
public void seek(long pos)
-
readFully
public void readFully(byte[] bytes) throws java.io.IOException
- Throws:
java.io.IOException
-
getPosition
public long getPosition()
-
close
public void close() throws java.io.IOException
- Specified by:
close
in interface java.lang.AutoCloseable
- Specified by:
close
in interface java.io.Closeable
- Throws:
java.io.IOException
-
length
public long length()
-
read
public int read() throws java.io.IOException
- Throws:
java.io.IOException
-
readString
public java.lang.String readString(int size) throws java.io.IOException
- Throws:
java.io.IOException
-
getTokenType
public PdfTokenizer.TokenType getTokenType()
-
getByteContent
public byte[] getByteContent()
-
getStringValue
public java.lang.String getStringValue()
-
getDecodedStringContent
public byte[] getDecodedStringContent()
-
tokenValueEqualsTo
public boolean tokenValueEqualsTo(byte[] cmp)
-
getObjNr
public int getObjNr()
-
getGenNr
public int getGenNr()
-
backOnePosition
public void backOnePosition(int ch)
-
getHeaderOffset
public int getHeaderOffset() throws java.io.IOException
- Throws:
java.io.IOException
-
checkPdfHeader
public java.lang.String checkPdfHeader() throws java.io.IOException
- Throws:
java.io.IOException
-
checkFdfHeader
public void checkFdfHeader() throws java.io.IOException
- Throws:
java.io.IOException
-
getStartxref
public long getStartxref() throws java.io.IOException
- Throws:
java.io.IOException
-
getNextEof
public long getNextEof() throws java.io.IOException
Gets the next %%EOF marker in the current PDF file.
- Returns:
- the position of the next %%EOF marker
- Throws:
java.io.IOException
- in case of input-output related exceptions during PDF document reading
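To illustrate what locating the next %%EOF marker involves, here is a minimal, self-contained sketch over an in-memory byte array. It is not iText's implementation: the class EofMarkerSketch, the method nextEof, the choice to return the marker's start offset, and the -1 result when no marker is found are all assumptions made for this example.

```java
import java.nio.charset.StandardCharsets;

/** Illustrative sketch only; names and return conventions are assumptions. */
final class EofMarkerSketch {
    private static final byte[] EOF_MARKER = "%%EOF".getBytes(StandardCharsets.US_ASCII);

    /** Returns the offset of the first "%%EOF" at or after 'from', or -1 if there is none. */
    static long nextEof(byte[] data, int from) {
        for (int i = Math.max(from, 0); i + EOF_MARKER.length <= data.length; i++) {
            int matched = 0;
            while (matched < EOF_MARKER.length && data[i + matched] == EOF_MARKER[matched]) {
                matched++;
            }
            if (matched == EOF_MARKER.length) {
                return i; // every marker byte matched at offset i
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        byte[] data = "%PDF-1.7\n%%EOF\n".getBytes(StandardCharsets.US_ASCII);
        System.out.println(nextEof(data, 0)); // prints 9
    }
}
```

A file with incremental updates contains several %%EOF markers, so calling such a search repeatedly with an advancing start offset visits each one in turn.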
-
nextValidToken
public void nextValidToken() throws java.io.IOException
- Throws:
java.io.IOException
-
nextToken
public boolean nextToken() throws java.io.IOException
- Throws:
java.io.IOException
-
getLongValue
public long getLongValue()
-
getIntValue
public int getIntValue()
-
isHexString
public boolean isHexString()
-
isCloseStream
public boolean isCloseStream()
-
setCloseStream
public void setCloseStream(boolean closeStream)
-
getSafeFile
public RandomAccessFileOrArray getSafeFile()
-
decodeStringContent
protected static byte[] decodeStringContent(byte[] content, int from, int to, boolean hexWriting)
Resolves escape symbols or hexadecimal symbols.
NOTE: According to the PDF Reference 1.7, section 3.2.3, a string value contains ASCII characters, so it can be converted directly to a byte array.
- Parameters:
content
- string bytes to be decoded
from
- given start index
to
- given end index
hexWriting
- true if the given string is hex-encoded, e.g. '<69546578…>'. False otherwise, e.g. '((iText( some version)…)'
- Returns:
- byte[] for decrypting or for creating a String.
-
decodeStringContent
public static byte[] decodeStringContent(byte[] content, boolean hexWriting)
Resolves escape symbols or hexadecimal symbols.
NOTE: According to the PDF Reference 1.7, section 3.2.3, a string value contains ASCII characters, so it can be converted directly to a byte array.
- Parameters:
content
- string bytes to be decoded
hexWriting
- true if the given string is hex-encoded, e.g. '<69546578…>'. False otherwise, e.g. '((iText( some version)…)'
- Returns:
- byte[] for decrypting or for creating a String.
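As a rough illustration of the hex-encoded case, the sketch below decodes a PDF hexadecimal string into bytes using only the JDK. It is not iText's code: the class HexStringSketch and the method decodeHex are hypothetical, and the sample input is made up for this example. It follows the PDF rules that whitespace inside a hex string is ignored and that an odd number of digits is completed with a trailing 0; silently skipping every other non-hex byte, including the '<' and '>' delimiters, is a simplification.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

/** Illustrative sketch only; not taken from the library. */
final class HexStringSketch {
    /** Decodes the hex digits found in 'content'; delimiters and whitespace are skipped. */
    static byte[] decodeHex(byte[] content) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int high = -1; // pending high nibble, or -1 when none is pending
        for (byte b : content) {
            int digit = Character.digit((char) (b & 0xFF), 16);
            if (digit < 0) {
                continue; // '<', '>', whitespace or any other non-hex byte
            }
            if (high < 0) {
                high = digit;
            } else {
                out.write((high << 4) | digit);
                high = -1;
            }
        }
        if (high >= 0) {
            out.write(high << 4); // odd digit count: the last digit is padded with 0
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] decoded = decodeHex("<69546578>".getBytes(StandardCharsets.US_ASCII));
        System.out.println(new String(decoded, StandardCharsets.US_ASCII)); // prints iTex
    }
}
```

Literal strings in parentheses need a different routine that resolves backslash escapes; that case is not covered here.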
-
isWhitespace
public static boolean isWhitespace(int ch)
Checks whether a character is whitespace. Currently the character codes 0, 9, 10, 12, 13 and 32 are treated as whitespace.
The same as calling isWhitespace(ch, true).
- Parameters:
ch
- the character code to check
- Returns:
- true if the character is whitespace, otherwise false
-
isWhitespace
protected static boolean isWhitespace(int ch, boolean isWhitespace)
Checks whether a character is whitespace. Currently the character codes 0, 9, 10, 12, 13 and 32 are treated as whitespace.
- Parameters:
ch
- the character code to check
isWhitespace
- boolean
- Returns:
- true if the character is whitespace, otherwise false
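The whitespace rule described above can be sketched in a few lines of plain Java. This is an illustrative re-implementation, not the library source. The class name WhitespaceSketch is hypothetical, and the meaning given to the boolean flag (whether code 0 counts as whitespace) is an assumption based on the isNullWhitespace parameter documented for readLineSegment.

```java
/** Illustrative sketch only; the flag semantics are an assumption. */
final class WhitespaceSketch {
    /**
     * Returns true for the documented codes: 0 (NUL), 9 (TAB), 10 (LF),
     * 12 (FF), 13 (CR) and 32 (SPACE). Code 0 only counts when the flag is set.
     */
    static boolean isWhitespace(int ch, boolean isNullWhitespace) {
        return (isNullWhitespace && ch == 0)
                || ch == 9 || ch == 10 || ch == 12 || ch == 13 || ch == 32;
    }

    /** One-argument form, equivalent to isWhitespace(ch, true). */
    static boolean isWhitespace(int ch) {
        return isWhitespace(ch, true);
    }

    public static void main(String[] args) {
        System.out.println(isWhitespace(' '));      // prints true
        System.out.println(isWhitespace(0, false)); // prints false
        System.out.println(isWhitespace('A'));      // prints false
    }
}
```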
-
isDelimiter
protected static boolean isDelimiter(int ch)
-
isDelimiterWhitespace
protected static boolean isDelimiterWhitespace(int ch)
-
throwError
public void throwError(java.lang.String error, java.lang.Object... messageParams)
Helper method to handle content errors. Adds the file position to the PdfRuntimeException.
- Parameters:
error
- message.
messageParams
- error params.
- Throws:
IOException
- wraps the error message into a PdfRuntimeException and adds the position in the file.
-
checkTrailer
public static boolean checkTrailer(ByteBuffer line)
Checks whether line equals 'trailer'.
- Parameters:
line
- the line to check
- Returns:
- true if line equals 'trailer', otherwise false
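A keyword check of this kind boils down to a byte-wise comparison. The sketch below uses a plain byte[] instead of the library's ByteBuffer, and the names TrailerCheckSketch and startsWithTrailer are hypothetical. It assumes that only the leading bytes are compared, so a line such as 'trailer <<' also matches; for strict equality, java.util.Arrays.equals on the two arrays would be the simpler choice.

```java
import java.nio.charset.StandardCharsets;

/** Illustrative sketch only; prefix matching is an assumption. */
final class TrailerCheckSketch {
    private static final byte[] TRAILER = "trailer".getBytes(StandardCharsets.US_ASCII);

    /** Returns true if the line begins with the bytes of the keyword "trailer". */
    static boolean startsWithTrailer(byte[] line) {
        if (line.length < TRAILER.length) {
            return false; // too short to contain the keyword
        }
        for (int i = 0; i < TRAILER.length; i++) {
            if (line[i] != TRAILER[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(startsWithTrailer("trailer".getBytes(StandardCharsets.US_ASCII))); // prints true
        System.out.println(startsWithTrailer("xref".getBytes(StandardCharsets.US_ASCII)));    // prints false
    }
}
```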
-
readLineSegment
public boolean readLineSegment(ByteBuffer buffer) throws java.io.IOException
Reads data into the provided ByteBuffer. Checks for leading whitespace. See isWhitespace(int) or isWhitespace(int, boolean) for a list of whitespace characters.
The same as calling readLineSegment(buffer, true).
- Parameters:
buffer
- a ByteBuffer to which the result of reading will be saved
- Returns:
- true if something was read or if the end of the input stream has not been reached
- Throws:
java.io.IOException
- in case of any reading error
-
readLineSegment
public boolean readLineSegment(ByteBuffer buffer, boolean isNullWhitespace) throws java.io.IOException
Reads data into the provided ByteBuffer. Checks for leading whitespace. See isWhitespace(int) or isWhitespace(int, boolean) for a list of whitespace characters.
- Parameters:
buffer
- a ByteBuffer to which the result of reading will be saved
isNullWhitespace
- boolean to indicate whether '0' is whitespace or not. If in doubt, use true or the overloaded method readLineSegment(buffer)
- Returns:
- true if something was read or if the end of the input stream has not been reached
- Throws:
java.io.IOException
- in case of any reading error
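The reading loop and its return value can be sketched with JDK streams. This is a simplified illustration, not the library's implementation: LineSegmentSketch, its methods and the use of PushbackInputStream and ByteArrayOutputStream in place of the tokenizer's own source and ByteBuffer are assumptions, leading whitespace is assumed to be skipped rather than merely checked, and buffer capacity limits are ignored.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.PushbackInputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; stream types and skipping behaviour are assumptions. */
final class LineSegmentSketch {
    static boolean isWhitespace(int ch, boolean isNullWhitespace) {
        return (isNullWhitespace && ch == 0)
                || ch == 9 || ch == 10 || ch == 12 || ch == 13 || ch == 32;
    }

    /** Reads one line into 'out'; returns false only at end of input with nothing read. */
    static boolean readLineSegment(PushbackInputStream in, ByteArrayOutputStream out,
                                   boolean isNullWhitespace) throws IOException {
        int c = in.read();
        while (isWhitespace(c, isNullWhitespace)) {
            c = in.read(); // skip leading whitespace, including blank lines
        }
        while (c != -1 && c != '\n' && c != '\r') {
            out.write(c);
            c = in.read();
        }
        if (c == '\r') {
            int next = in.read();
            if (next != '\n' && next != -1) {
                in.unread(next); // lone CR: keep this byte for the next line
            }
        }
        return !(c == -1 && out.size() == 0);
    }

    /** Convenience helper: splits 'data' into line segments. */
    static List<String> readAll(byte[] data) {
        List<String> lines = new ArrayList<>();
        try (PushbackInputStream in = new PushbackInputStream(new ByteArrayInputStream(data))) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            while (readLineSegment(in, out, true)) {
                lines.add(new String(out.toByteArray(), StandardCharsets.ISO_8859_1));
                out.reset();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    public static void main(String[] args) {
        byte[] data = "  trailer\r\n<< /Size 5 >>\n".getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(readAll(data)); // prints [trailer, << /Size 5 >>]
    }
}
```

The return expression mirrors the documented contract: the call reports false only when the end of the input was reached and nothing was read.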
-
checkObjectStart
public static int[] checkObjectStart(PdfTokenizer lineTokenizer)
Checks whether the line starts with an object declaration.
- Parameters:
lineTokenizer
- tokenizer built from a single line.
- Returns:
- object number and generation if the check is successful, otherwise null.
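For orientation, an object declaration in a PDF file has the form 'objectNumber generation obj', for example '12 0 obj'. The sketch below shows the same check on a plain String with a regular expression. It is not the library's approach, which works on a PdfTokenizer. The class ObjectStartSketch and its String parameter are hypothetical, and accepting any text after 'obj' is a simplification.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative sketch only; operates on a String instead of a tokenizer. */
final class ObjectStartSketch {
    // Optional leading whitespace, two unsigned integers, then the keyword "obj".
    private static final Pattern OBJECT_START =
            Pattern.compile("^\\s*(\\d+)\\s+(\\d+)\\s+obj");

    /** Returns {objectNumber, generation}, or null if the line is not an object declaration. */
    static int[] checkObjectStart(String line) {
        Matcher m = OBJECT_START.matcher(line);
        if (!m.find()) {
            return null;
        }
        try {
            return new int[] {Integer.parseInt(m.group(1)), Integer.parseInt(m.group(2))};
        } catch (NumberFormatException e) {
            return null; // digits present but too large for an int
        }
    }

    public static void main(String[] args) {
        int[] result = checkObjectStart("12 0 obj");
        System.out.println(result[0] + " " + result[1]); // prints 12 0
        System.out.println(checkObjectStart("trailer")); // prints null
    }
}
```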
-
-