Package com.sun.pdfview
Class PDFParser
java.lang.Object
com.sun.pdfview.BaseWatchable
com.sun.pdfview.PDFParser
PDFParser is the class that parses a PDF content stream and
produces PDFCmds for a PDFPage. You should never need to run it yourself:
a PDFPage creates it only if needed, and it may even run in
its own thread.
-
Nested Class Summary
Nested Classes (modifier and type, class, description)
(package private) class PDFParser.ParserState: A class to store state needed while rendering.
(package private) class PDFParser.Tok: a token from a PDF Stream.
Nested classes/interfaces inherited from class com.sun.pdfview.BaseWatchable
BaseWatchable.Gate
-
Field Summary
Fields (modifier and type, field, description)
private boolean catchexceptions
private int clip
private PDFPage cmds: the actual command, for use within a single iteration.
static final String DEBUG_DCTDECODE_DATA: emit a file of DCT stream data.
static int debuglevel
(package private) boolean errorwritten
private int loc
private WeakReference pageRef: a weak reference to the page we render into.
private Stack<PDFParser.ParserState> parserStates
private GeneralPath path
private boolean resend
private PDFParser.ParserState state
(package private) byte[] stream
private PDFParser.Tok tok
Fields inherited from interface com.sun.pdfview.Watchable
COMPLETED, ERROR, NEEDS_DATA, NOT_STARTED, PAUSED, RUNNING, STOPPED, UNKNOWN
-
Constructor Summary
Constructors -
Method Summary
Methods (modifier and type, method, description)
void cleanup(): Cleanup when iteration is done.
static void debug
private void doForm: Inject a stream of PDF commands onto the page.
private void doImage: Parse image data into a Java BufferedImage and add the image command to the page.
private PDFPaint doPattern(PatternSpace patternSpace): Set the values into a PatternSpace.
private void doShader: build a shader from a dictionary.
private void doXObject: Insert a PDF object into the command stream.
void dumpStreamToError()
static void emitDataFile(byte[] ary, String name): take a byte array and write a temporary file with its data.
static String escape
private PDFObject findResource(String name, String inDict): get a property from a named dictionary in the resources of this content stream.
private PDFFont getFontFrom(String fontref): get a PDFFont from the resources, given the resource name of the font.
int iterate(): parse the stream.
private PDFParser.Tok nextToken: get the next token.
private PDFColorSpace parseColorSpace(PDFObject csobj): generate a PDFColorSpace description based on a PDFObject.
private void parseInlineImage: Parse an inline image.
private Object parseObject: Parse the next object out of the PDF stream.
private Object[] popArray(): pop an array off the stack.
private float popFloat(): pop a single float value off the stack.
private float[] popFloat(int count): pop an array of float values off the stack.
private float[] popFloatArray: pop an array of float values off the stack.
private int popInt(): pop a single integer value off the stack.
private PDFObject popObject: pop a PDFObject off the stack.
private String popString: pop a String off the stack.
private void processBTCmd(): abstracted command processing for BT command.
private void processQCmd(): abstracted command processing for Q command.
private String readByteArray: read a byte array from the stream.
private String readName(): read a name (sequence of non-PDF-delimiting characters) from the stream.
private double readNum(): read a floating point number from the stream.
private String readString: read a String from the stream.
static void setDebugLevel(int level)
private void setGSState(String name): add graphics state commands contained within a dictionary.
void setup(): Called to prepare for some iterations.
private void throwback(): put the current token back so that it is returned again by nextToken().
Methods inherited from class com.sun.pdfview.BaseWatchable
execute, getStatus, go, go, go, go, isExecutable, isFinished, isSuppressSetErrorStackTrace, run, setError, setStatus, setSuppressSetErrorStackTrace, stop, waitForFinish
-
Field Details
-
DEBUG_DCTDECODE_DATA
emit a file of DCT stream data.
-
stack
-
parserStates
-
state
-
path
-
clip
private int clip
-
loc
private int loc
-
resend
private boolean resend
-
tok
-
catchexceptions
private boolean catchexceptions
-
pageRef
a weak reference to the page we render into. For the page to remain available, some other code must retain a strong reference to it.
-
cmds
the actual command, for use within a single iteration. Note that this must be released at the end of each iteration to ensure the page can be collected if not in use.
-
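The interplay between pageRef and cmds can be sketched as follows. This is an illustration under stated assumptions, not the PDFParser source: the class WeakPageSketch, the StringBuilder standing in for PDFPage, and the numeric status values are all hypothetical.

```java
import java.lang.ref.WeakReference;

/** Hypothetical sketch; a StringBuilder stands in for PDFPage. */
final class WeakPageSketch {
    // Assumed values for illustration; the real constants are defined in Watchable.
    static final int RUNNING = 1;
    static final int STOPPED = 2;

    private final WeakReference<StringBuilder> pageRef;

    WeakPageSketch(WeakReference<StringBuilder> pageRef) {
        this.pageRef = pageRef;
    }

    /** Appends one command if the page is still reachable. */
    int iterate(String command) {
        // Strong reference held only for the duration of this call.
        StringBuilder cmds = pageRef.get();
        if (cmds == null) {
            return STOPPED; // no other code kept the page alive
        }
        cmds.append(command);
        return RUNNING; // cmds goes out of scope here, releasing the strong reference
    }
}
```

Because the sketch holds the page strongly only inside iterate(), the page becomes collectable as soon as its owner drops it, which is the behavior the two field descriptions ask for.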
stream
byte[] stream
-
resources
-
debuglevel
public static int debuglevel
-
errorwritten
boolean errorwritten
-
-
Constructor Details
-
PDFParser
Don't call this constructor directly. Instead, use PDFFile.getPage(int pagenum) to get a PDFPage. There should never be any reason for a user to create, access, or hold on to a PDFParser.
-
-
Method Details
-
debug
-
escape
-
setDebugLevel
public static void setDebugLevel(int level)
-
throwback
private void throwback()
put the current token back so that it is returned again by nextToken().
-
nextToken
get the next token. TODO: this creates a new token each time. Is this strictly necessary?
-
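The throwback()/nextToken() pair describes one-token pushback. A minimal sketch of that idea, not the PDFParser implementation: the class PushbackTokens, the String token type, and the whitespace-only tokenization are assumptions, and the field names merely echo loc, tok and resend from the field list without claiming that PDFParser uses them this way.

```java
/** Hypothetical one-token pushback reader; not the PDFParser implementation. */
final class PushbackTokens {
    private final String[] words;   // whitespace-separated tokens (a simplification)
    private int loc = 0;            // index of the next unread token
    private String tok = null;      // most recently returned token
    private boolean resend = false; // true when tok must be returned again

    PushbackTokens(String text) {
        String trimmed = text.trim();
        this.words = trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
    }

    /** Returns the next token, or null at end of input. */
    String nextToken() {
        if (resend) {
            resend = false;
            return tok;
        }
        tok = (loc < words.length) ? words[loc++] : null;
        return tok;
    }

    /** Puts the current token back so that nextToken() returns it again. */
    void throwback() {
        resend = true;
    }
}
```

A single flag is enough here because only the current token can be put back; deeper lookahead would need a queue instead.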
readName
read a name (sequence of non-PDF-delimiting characters) from the stream.
-
readNum
private double readNum()
read a floating point number from the stream.
-
readString
read a String from the stream. Strings begin with a '(' character, which has already been read, and end with a balanced ')' character. A '\' character starts an escape sequence of up to three octal digits.
Parentheses inside a string must be balanced, so a string may enclose balanced parentheses.
- Returns:
- the string with escape sequences replaced with their values
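The rules above (balanced parentheses, octal escapes of up to three digits) can be sketched as a small reader. This is a hedged illustration rather than the PDFParser code: LiteralStringReader and its signature are hypothetical, and treating any non-octal escaped character as a literal is an assumption of the sketch.

```java
/** Hypothetical sketch of literal-string reading; not the PDFParser code. */
final class LiteralStringReader {
    /**
     * Reads a string from data beginning at start, which must point just
     * past the opening '('. Returns the decoded contents.
     */
    static String readString(byte[] data, int start) {
        StringBuilder sb = new StringBuilder();
        int depth = 0; // unescaped '(' characters still waiting for their ')'
        int i = start;
        while (i < data.length) {
            int c = data[i++] & 0xFF;
            if (c == ')') {
                if (depth == 0) {
                    return sb.toString(); // the balancing ')' ends the string
                }
                depth--;
                sb.append(')');
            } else if (c == '(') {
                depth++;
                sb.append('(');
            } else if (c == '\\' && i < data.length) {
                int value = 0;
                int digits = 0;
                // Up to three octal digits form one character value.
                while (digits < 3 && i < data.length && data[i] >= '0' && data[i] <= '7') {
                    value = value * 8 + (data[i++] - '0');
                    digits++;
                }
                if (digits > 0) {
                    sb.append((char) (value & 0xFF));
                } else {
                    // Assumption: any other escaped character is taken literally.
                    sb.append((char) (data[i++] & 0xFF));
                }
            } else {
                sb.append((char) c);
            }
        }
        throw new IllegalArgumentException("unterminated string");
    }
}
```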
-
readByteArray
read a byte array from the stream. Byte arrays begin with a '<' character, which has already been read, and end with a '>' character. Each byte in the array is made up of two hex characters, the first being the high-order four bits. We translate the byte arrays into char arrays by combining two bytes into a character, and then translate the character array into a string. [JK FIXME this is probably a really bad idea!]
- Returns:
- the byte array
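One plausible reading of this description is that each pair of hex characters becomes one char of the returned String. The sketch below follows that reading and is not the PDFParser code: HexStringReader and its signature are hypothetical, skipping non-hex characters is an assumption, and padding a trailing odd digit with 0 follows the hexadecimal string rule in the PDF specification.

```java
/** Hypothetical sketch of hex-string reading; not the PDFParser code. */
final class HexStringReader {
    /**
     * Reads hex digits from data beginning at start, which must point just
     * past the opening '<', and stops at the closing '>'.
     */
    static String readByteArray(byte[] data, int start) {
        StringBuilder sb = new StringBuilder();
        int high = -1; // pending high-order hex digit, or -1 if none
        for (int i = start; i < data.length; i++) {
            char c = (char) (data[i] & 0xFF);
            if (c == '>') {
                if (high >= 0) {
                    sb.append((char) (high << 4)); // odd digit count: low digit is 0
                }
                return sb.toString();
            }
            int digit = Character.digit(c, 16);
            if (digit < 0) {
                continue; // assumption: whitespace and other non-hex input is skipped
            }
            if (high < 0) {
                high = digit;
            } else {
                sb.append((char) ((high << 4) | digit));
                high = -1;
            }
        }
        throw new IllegalArgumentException("unterminated hex string");
    }
}
```

Storing arbitrary byte values in a String only round-trips safely if every consumer treats each char as one byte, which may be the concern behind the FIXME note.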
-
setup
public void setup()
Called to prepare for some iterations
- Overrides:
setup in class BaseWatchable
-
iterate
parse the stream. Commands are added to the PDFPage initialized in the constructor as they are encountered.
Page numbers in comments refer to the Adobe PDF specification.
Commands are listed in PDF spec 32000-1:2008 in Table A.1.
- Specified by:
iterate in class BaseWatchable
- Returns:
- Watchable.RUNNING when there are commands to be processed
- Watchable.COMPLETED when the page is done and all the commands have been processed
- Watchable.STOPPED if the page we are rendering into is no longer available
- Throws:
Exception
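The return contract suggests a driver that keeps calling iterate() while it reports RUNNING. A minimal sketch of that pattern, assuming one command is handled per call: IterateLoopSketch, its numeric status values, and the run() driver are hypothetical, and this page does not show how BaseWatchable actually drives the loop.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical sketch of an iterate-until-done loop; not the BaseWatchable code. */
final class IterateLoopSketch {
    // Assumed values for illustration; the real constants are defined in Watchable.
    static final int RUNNING = 1;
    static final int COMPLETED = 2;

    private final Deque<String> pending;                 // commands not yet processed
    private final List<String> page = new ArrayList<>(); // stand-in for the PDFPage

    IterateLoopSketch(List<String> commands) {
        this.pending = new ArrayDeque<>(commands);
    }

    /** Processes at most one command and reports whether more work may remain. */
    int iterate() {
        String cmd = pending.poll();
        if (cmd == null) {
            return COMPLETED; // all commands have been processed
        }
        page.add(cmd);
        return RUNNING;
    }

    /** Drives iterate() until it stops reporting RUNNING. */
    List<String> run() {
        int status;
        do {
            status = iterate();
        } while (status == RUNNING);
        return page;
    }
}
```

Doing a small unit of work per call is what lets a caller pause or stop the parse between commands.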
-
processQCmd
private void processQCmd()
abstracted command processing for Q command. Used directly and as part of processing of mushed QBT command.
-
processBTCmd
private void processBTCmd()
abstracted command processing for BT command. Used directly and as part of processing of mushed QBT command.
-
cleanup
public void cleanup()
Cleanup when iteration is done
- Overrides:
cleanup in class BaseWatchable
-
dumpStreamToError
public void dumpStreamToError()
-
dumpStream
-
emitDataFile
take a byte array and write a temporary file with its data. This is intended to capture data for analysis, like after decoders.
- Parameters:
ary -
name -
-
findResource
get a property from a named dictionary in the resources of this content stream.
- Parameters:
name - the name of the property in the dictionary
inDict - the name of the dictionary in the resources
- Returns:
- the value of the property in the dictionary
- Throws:
IOException
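The two-level lookup that findResource describes can be sketched with nested maps. This is an illustration only: ResourceLookupSketch, the put helper, plain Object values in place of PDFObject, and returning null for a missing dictionary are all assumptions.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of a resources lookup; plain Objects stand in for PDFObject. */
final class ResourceLookupSketch {
    // Dictionary name (for example "Font") mapped to that dictionary's entries.
    private final Map<String, Map<String, Object>> resources = new HashMap<>();

    /** Test helper: stores value under name in the dictionary called inDict. */
    void put(String inDict, String name, Object value) {
        resources.computeIfAbsent(inDict, k -> new HashMap<>()).put(name, value);
    }

    /** Returns the property called name from the dictionary called inDict, or null. */
    Object findResource(String name, String inDict) {
        Map<String, Object> dict = resources.get(inDict);
        return (dict == null) ? null : dict.get(name);
    }
}
```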
-
doXObject
Insert a PDF object into the command stream. The object must either be an Image or a Form, which is a set of PDF commands in a stream.
- Parameters:
obj - the object to insert, an Image or a Form.
- Throws:
IOException
-
doImage
Parse image data into a Java BufferedImage and add the image command to the page.
- Parameters:
obj - contains the image data, and a dictionary describing the width, height and color space of the image.
- Throws:
IOException
-
doForm
Inject a stream of PDF commands onto the page. Optimized to cache a parsed stream of commands, so that each Form object only needs to be parsed once.
- Parameters:
obj - a stream containing the PDF commands, a transformation matrix, bounding box, and resources.
- Throws:
IOException
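The parse-once behavior described for doForm is a form of memoization. A minimal sketch under stated assumptions: FormCacheSketch, the String key, and the whitespace split standing in for real parsing are hypothetical, and where PDFParser actually stores the cached commands is not stated on this page.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of parse-once caching for Form streams. */
final class FormCacheSketch {
    private final Map<String, List<String>> cache = new HashMap<>();
    int parseCount = 0; // how many times a stream was actually parsed

    /** Returns the parsed commands for a form, parsing only on first use. */
    List<String> commandsFor(String formName, String stream) {
        return cache.computeIfAbsent(formName, k -> parse(stream));
    }

    // Stand-in for real parsing: split the stream on whitespace.
    private List<String> parse(String stream) {
        parseCount++;
        return Arrays.asList(stream.trim().split("\\s+"));
    }
}
```

A Form that is drawn many times on a page, such as a repeated logo, then costs one parse plus cheap replays of the cached commands.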
-
doPattern
Set the values into a PatternSpace
- Throws:
IOException
-
parseObject
Parse the next object out of the PDF stream. This could be a Double, a String, a HashMap (dictionary), Object[] array, or a Tok containing a PDF command.
- Throws:
PDFParseException
-
parseInlineImage
Parse an inline image. An inline image starts with BI (already read), contains a dictionary until ID, and then image data until EI.
- Throws:
IOException
-
doShader
build a shader from a dictionary.
- Throws:
IOException
-
getFontFrom
get a PDFFont from the resources, given the resource name of the font.
- Parameters:
fontref - the resource key for the font
- Throws:
IOException
-
setGSState
add graphics state commands contained within a dictionary.
- Parameters:
name - the resource name of the graphics state dictionary
- Throws:
IOException
-
parseColorSpace
generate a PDFColorSpace description based on a PDFObject. The object could be a standard name, or the name of a resource in the ColorSpace dictionary, or a color space name with a defining dictionary or stream.
- Throws:
IOException
-
popFloat
pop a single float value off the stack.
- Returns:
- the float value of the top of the stack
- Throws:
PDFParseException - if the value on the top of the stack isn't a number
-
popFloat
pop an array of float values off the stack. This is equivalent to filling an array from end to front by popping values off the stack.
- Parameters:
count - the number of numbers to pop off the stack
- Returns:
- an array of length count
- Throws:
PDFParseException - if any of the values popped off the stack are not numbers.
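Filling the array from end to front means the operands come back in the order they were pushed. A minimal sketch of that step: PopFloatSketch and the Deque-based stack are hypothetical, and IllegalArgumentException stands in for PDFParseException.

```java
import java.util.Deque;

/** Hypothetical sketch of popping several operands; not the PDFParser code. */
final class PopFloatSketch {
    /** Pops count numbers and returns them in the order they were pushed. */
    static float[] popFloat(Deque<Object> stack, int count) {
        float[] ary = new float[count];
        // The top of the stack is the last operand, so fill from the end.
        for (int i = count - 1; i >= 0; i--) {
            Object top = stack.pop();
            if (!(top instanceof Number)) {
                // Stand-in for the PDFParseException described above.
                throw new IllegalArgumentException("Expected a number, got: " + top);
            }
            ary[i] = ((Number) top).floatValue();
        }
        return ary;
    }
}
```

For a content stream fragment such as "1 0 0 RG", the three numbers are pushed left to right, so popping three values this way yields them in reading order.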
-
popInt
pop a single integer value off the stack.
- Returns:
- the integer value of the top of the stack
- Throws:
PDFParseException - if the top of the stack isn't a number.
-
popFloatArray
pop an array of float values off the stack. This is equivalent to filling an array from end to front by popping values off the stack.
- Parameters:
count - the number of numbers to pop off the stack
- Returns:
- an array of length count
- Throws:
PDFParseException - if any of the values popped off the stack are not numbers.
-
popString
pop a String off the stack.
- Returns:
- the String from the top of the stack
- Throws:
PDFParseException - if the top of the stack is not a NAME or STR.
-
popObject
pop a PDFObject off the stack.
- Returns:
- the PDFObject from the top of the stack
- Throws:
PDFParseException - if the top of the stack does not contain a PDFObject.
-
popArray
pop an array off the stack
- Returns:
- the array of objects that is the top element of the stack
- Throws:
PDFParseException - if the top element of the stack does not contain an array.
-