Package com.itextpdf.text.pdf.mc
Class MCParser
- java.lang.Object
-
- com.itextpdf.text.pdf.mc.MCParser
-
public class MCParser extends Object
This class will parse page content streams and add Do operators in a marked-content sequence for every field that needs to be flattened.
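As a rough illustration of what "a Do operator in a marked-content sequence" looks like in a content stream, the sketch below builds such a fragment as plain text. It only shows standard PDF syntax (BDC ... EMC wrapping a Do). The tag Form, the MCID value and the XObject name Fm0 are hypothetical, and the bytes MCParser actually emits may include more, for example graphics state or positioning operators.

```java
/** Hypothetical helper, not part of iText: formats a marked-content sequence that paints a Form XObject. */
final class MarkedContentDoSketch {

    /**
     * Returns a content stream fragment of the form
     * "/Tag <</MCID n>> BDC", "/Name Do", "EMC", one operator per line.
     */
    static String markedContentDo(String tag, int mcid, String xobjectName) {
        return "/" + tag + " <</MCID " + mcid + ">> BDC\n"   // begin marked-content sequence with an MCID property
             + "/" + xobjectName + " Do\n"                   // paint the named XObject
             + "EMC\n";                                      // end marked-content sequence
    }

    public static void main(String[] args) {
        // All three arguments are made-up example values.
        System.out.print(markedContentDo("Form", 3, "Fm0"));
    }
}
```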
-
-
Nested Class Summary
static interface MCParser.PdfOperator
PDF Operator interface.
-
Field Summary
protected PdfArray annots
The annotations of the page that is being processed.
protected ByteArrayOutputStream baos
The contents of the new content stream of the page.
protected boolean btWrite
Did we postpone writing a BT operator?
static String DEFAULTOPERATOR
Constant used for the default operator.
protected boolean etExtra
Did we postpone writing a BT operator?
protected boolean inText
Are we inside a BT/ET sequence?
protected StructureItems items
The list with structure items.
protected static Logger LOGGER
The Logger instance
protected Map<String,MCParser.PdfOperator> operators
A map with all supported operators (PDF syntax).
protected PdfDictionary page
The page dictionary
protected PdfIndirectReference pageref
The reference to the page dictionary
protected static RandomAccessSourceFactory RASFACTORY
Factory that will help us build a RandomAccessSource.
protected PdfNumber structParents
The StructParents of the page that is being processed.
protected StringBuffer text
A buffer containing text state.
static PdfLiteral TSTAR
A new line operator
protected PdfDictionary xobjects
The XObject dictionary of the page that is being processed.
-
Constructor Summary
MCParser(StructureItems items)
Creates an MCParser object.
-
Method Summary
protected void checkBT()
Checks if a BT operator is waiting to be added.
protected void convertToXObject(StructureObject item)
Converts an annotation structure item to a Form XObject annotation.
protected void dealWithMcid(PdfNumber mcid)
When an MCID is encountered, the parser will check the list of structure items and turn an annotation into an XObject if necessary.
protected void dealWithXObj(PdfName xobj)
When an XObject with a StructParent is encountered, we want to remove it from the stack.
void parse(PdfDictionary page, PdfIndirectReference pageref)
Parses the content of a page, inserting the normal (/N) appearances (/AP) of annotations into the content stream as Form XObjects.
protected void populateOperators()
Populates the operators variable.
protected void println(PdfObject o)
Writes a PDF object to the OutputStream, followed by a newline character.
protected void printOperator(PdfLiteral operator, List<PdfObject> operands)
Adds an operator and its operands (if any) to baos.
protected void printsp(PdfObject o)
Writes a PDF object to the OutputStream, followed by a space character.
protected void printTextOperator(PdfLiteral operator, List<PdfObject> operands)
Adds an operator and its operands (if any) to baos, keeping track of the text state.
protected void processOperator(PdfLiteral operator, List<PdfObject> operands)
Processes an operator, for instance by writing the operator and its operands to baos.
protected void setInText(boolean inText)
Informs the parser that we're inside or outside a text object.
-
-
-
Field Detail
-
LOGGER
protected static final Logger LOGGER
The Logger instance
-
RASFACTORY
protected static final RandomAccessSourceFactory RASFACTORY
Factory that will help us build a RandomAccessSource.
-
DEFAULTOPERATOR
public static final String DEFAULTOPERATOR
Constant used for the default operator.
See Also:
Constant Field Values
-
TSTAR
public static final PdfLiteral TSTAR
A new line operator
-
operators
protected Map<String,MCParser.PdfOperator> operators
A map with all supported operators (PDF syntax).
-
items
protected StructureItems items
The list with structure items.
-
baos
protected ByteArrayOutputStream baos
The contents of the new content stream of the page.
-
page
protected PdfDictionary page
The page dictionary
-
pageref
protected PdfIndirectReference pageref
The reference to the page dictionary
-
annots
protected PdfArray annots
The annotations of the page that is being processed.
-
structParents
protected PdfNumber structParents
The StructParents of the page that is being processed.
-
xobjects
protected PdfDictionary xobjects
The XObject dictionary of the page that is being processed.
-
btWrite
protected boolean btWrite
Did we postpone writing a BT operator?
-
etExtra
protected boolean etExtra
Did we postpone writing a BT operator?
-
inText
protected boolean inText
Are we inside a BT/ET sequence?
-
text
protected StringBuffer text
A buffer containing text state.
-
-
Constructor Detail
-
MCParser
public MCParser(StructureItems items)
Creates an MCParser object.
Parameters:
items - a list of StructureItem objects
-
-
Method Detail
-
populateOperators
protected void populateOperators()
Populates the operators variable.
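This page does not list which operators are registered or what the MCParser.PdfOperator interface looks like. The sketch below only illustrates the general pattern that a Map keyed by operator name plus a DEFAULTOPERATOR constant suggests: look up a handler by name and fall back to a default handler when none is registered. The interface signature, the value of DEFAULT_KEY and the special handling of Do are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical dispatch table, not the iText implementation. */
final class OperatorDispatchSketch {

    /** Stand-in for MCParser.PdfOperator; the real signature is not shown in this page. */
    interface Operator {
        void process(String operator, List<String> operands, StringBuilder out);
    }

    /** Assumed key for the fallback handler; the real constant value is not listed here. */
    static final String DEFAULT_KEY = "DefaultOperator";

    /** XObject names seen by the Do handler, kept only to show operator-specific behaviour. */
    final List<String> seenXObjects = new ArrayList<>();

    private final Map<String, Operator> operators = new HashMap<>();

    OperatorDispatchSketch() {
        // Fallback: copy the operands and the operator unchanged.
        operators.put(DEFAULT_KEY, OperatorDispatchSketch::copy);
        // Operator-specific handler: record the XObject name, then copy.
        operators.put("Do", (op, operands, out) -> {
            seenXObjects.addAll(operands);
            copy(op, operands, out);
        });
    }

    /** Writes the operands separated by spaces, then the operator and a newline. */
    static void copy(String op, List<String> operands, StringBuilder out) {
        for (String operand : operands) {
            out.append(operand).append(' ');
        }
        out.append(op).append('\n');
    }

    /** Looks up the handler for op, falling back to the default handler. */
    void process(String op, List<String> operands, StringBuilder out) {
        Operator handler = operators.get(op);
        if (handler == null) {
            handler = operators.get(DEFAULT_KEY);
        }
        handler.process(op, operands, out);
    }
}
```

With this layout, supporting another operator means adding one more entry to the map, while every unregistered operator passes through unchanged.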
-
parse
public void parse(PdfDictionary page, PdfIndirectReference pageref) throws IOException, DocumentException
Parses the content of a page, inserting the normal (/N) appearances (/AP) of annotations into the content stream as Form XObjects.
Parameters:
page - a page dictionary
pageref - the reference to the page dictionary
Throws:
IOException
DocumentException
-
dealWithXObj
protected void dealWithXObj(PdfName xobj)
When an XObject with a StructParent is encountered, we want to remove it from the stack.
Parameters:
xobj - the name of an XObject
-
dealWithMcid
protected void dealWithMcid(PdfNumber mcid) throws IOException, DocumentException
When an MCID is encountered, the parser will check the list of structure items and turn an annotation into an XObject if necessary.
Parameters:
mcid - the MCID that was encountered in the content stream
Throws:
IOException
DocumentException
-
convertToXObject
protected void convertToXObject(StructureObject item) throws IOException, DocumentException
Converts an annotation structure item to a Form XObject annotation.
Parameters:
item - the structure item
Throws:
IOException
DocumentException
-
processOperator
protected void processOperator(PdfLiteral operator, List<PdfObject> operands) throws IOException, DocumentException
Processes an operator, for instance by writing the operator and its operands to baos.
Parameters:
operator - the operator
operands - the operator's operands
Throws:
IOException
DocumentException
-
printOperator
protected void printOperator(PdfLiteral operator, List<PdfObject> operands) throws IOException
Adds an operator and its operands (if any) to baos.
Parameters:
operator - the operator
operands - its operands
Throws:
IOException
-
printTextOperator
protected void printTextOperator(PdfLiteral operator, List<PdfObject> operands) throws IOException
Adds an operator and its operands (if any) to baos, keeping track of the text state.
Parameters:
operator - the operator
operands - its operands
Throws:
IOException
-
printsp
protected void printsp(PdfObject o) throws IOException
Writes a PDF object to the OutputStream, followed by a space character.
Parameters:
o - a PdfObject
Throws:
IOException
-
println
protected void println(PdfObject o) throws IOException
Writes a PDF object to the OutputStream, followed by a newline character.
Parameters:
o - a PdfObject
Throws:
IOException
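The descriptions of printOperator, printsp and println suggest a simple division of labour: operands are written followed by a space, and the operator closes the line. The sketch below mirrors that idea using only the JDK. It uses plain strings in place of PdfObject and PdfLiteral, avoids checked exceptions for brevity, and assumes that the operand list does not contain the operator itself; none of this is taken from the iText source.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.List;

/** Hypothetical writer, not the iText implementation; tokens are plain strings. */
final class OperatorWriterSketch {

    /** Plays the role of the baos field: collects the new content stream. */
    private final ByteArrayOutputStream baos = new ByteArrayOutputStream();

    /** Writes a token followed by a space character. */
    void printsp(String token) {
        write(token);
        baos.write(' ');
    }

    /** Writes a token followed by a newline character. */
    void println(String token) {
        write(token);
        baos.write('\n');
    }

    /** Writes all operands (if any), then the operator, on one line. */
    void printOperator(String operator, List<String> operands) {
        for (String operand : operands) {
            printsp(operand);
        }
        println(operator);
    }

    /** Returns what has been written so far as text. */
    String contents() {
        return new String(baos.toByteArray(), StandardCharsets.ISO_8859_1);
    }

    private void write(String token) {
        byte[] bytes = token.getBytes(StandardCharsets.ISO_8859_1);
        baos.write(bytes, 0, bytes.length);
    }
}
```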
-
checkBT
protected void checkBT() throws IOException
Checks if a BT operator is waiting to be added.
Throws:
IOException
-
setInText
protected void setInText(boolean inText)
Informs the parser that we're inside or outside a text object. Also sets a parameter indicating that BT needs to be written.
Parameters:
inText - true if we're inside.
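The interplay between setInText, checkBT and the btWrite flag is only hinted at in the descriptions above. One generic way to postpone a BT operator is sketched below: entering a text object merely records that BT is pending, and the operator is written just before the first text operator that needs it. This is an assumed, simplified pattern, not a reconstruction of MCParser's logic; in particular, the handling of ET and the field name btPending are hypothetical.

```java
/** Hypothetical lazy BT writer, not the iText implementation. */
final class DeferredBtSketch {

    private final StringBuilder out = new StringBuilder();
    private boolean inText;     // are we inside a BT/ET sequence?
    private boolean btPending;  // has BT been postponed and not yet written?

    /** Records that we entered or left a text object; BT itself is postponed. */
    void setInText(boolean inText) {
        if (inText) {
            btPending = true;
        } else {
            // Only close a text object whose BT was actually written.
            if (this.inText && !btPending) {
                out.append("ET\n");
            }
            btPending = false;
        }
        this.inText = inText;
    }

    /** Writes the postponed BT operator, if one is waiting. */
    void checkBT() {
        if (btPending) {
            out.append("BT\n");
            btPending = false;
        }
    }

    /** Writes one text operator line, emitting the postponed BT first if needed. */
    void writeTextOperator(String line) {
        checkBT();
        out.append(line).append('\n');
    }

    String contents() {
        return out.toString();
    }
}
```

In this sketch a text object that contains no operators produces no output at all, because neither BT nor ET is ever written for it.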
-
-