Package com.itextpdf.text.pdf.mc
Class MCParser
- java.lang.Object
  - com.itextpdf.text.pdf.mc.MCParser
-
public class MCParser extends java.lang.Object
This class will parse page content streams and add Do operators in a marked-content sequence for every field that needs to be flattened.
-
-
Nested Class Summary
Nested Classes
| Modifier and Type | Class | Description |
| --- | --- | --- |
| private static class | MCParser.BeginMarkedContentDictionaryOperator | Class that knows how to process marked content operators. |
| private static class | MCParser.BeginTextOperator | Class that knows how to process the BT operator. |
| private static class | MCParser.CopyContentOperator | Class that processes content by just printing the operator and its operands. |
| private static class | MCParser.DoOperator | Class that knows how to process Do operators. |
| private static class | MCParser.EndTextOperator | Class that knows how to process the ET operator. |
| static interface | MCParser.PdfOperator | PDF Operator interface. |
| private static class | MCParser.TextNewLineOperator | Class that knows how to process the text state operators that result in a newline. |
| private static class | MCParser.TextPositioningOperator | Class that knows how to process the text positioning operators. |
| private static class | MCParser.TextStateOperator | Class that knows how to process the text state operators. |
-
Field Summary
Fields
| Modifier and Type | Field | Description |
| --- | --- | --- |
| protected PdfArray | annots | The annotations of the page that is being processed. |
| protected java.io.ByteArrayOutputStream | baos | The contents of the new content stream of the page. |
| protected boolean | btWrite | Did we postpone writing a BT operator? |
| static java.lang.String | DEFAULTOPERATOR | Constant used for the default operator. |
| protected boolean | etExtra | Did we postpone writing a BT operator? |
| protected boolean | inText | Are we inside a BT/ET sequence? |
| protected StructureItems | items | The list with structure items. |
| protected static Logger | LOGGER | The Logger instance. |
| protected java.util.Map<java.lang.String,MCParser.PdfOperator> | operators | A map with all supported operators (PDF syntax). |
| protected PdfDictionary | page | The page dictionary. |
| protected PdfIndirectReference | pageref | The reference to the page dictionary. |
| protected static RandomAccessSourceFactory | RASFACTORY | Factory that will help us build a RandomAccessSource. |
| protected PdfNumber | structParents | The StructParents of the page that is being processed. |
| protected java.lang.StringBuffer | text | A buffer containing text state. |
| static PdfLiteral | TSTAR | A new line operator. |
| protected PdfDictionary | xobjects | The XObject dictionary of the page that is being processed. |
-
Constructor Summary
Constructors
| Constructor | Description |
| --- | --- |
| MCParser(StructureItems items) | Creates an MCParser object. |
-
Method Summary
All Methods (all are concrete instance methods)
| Modifier and Type | Method | Description |
| --- | --- | --- |
| protected void | checkBT() | Checks if a BT operator is waiting to be added. |
| protected void | convertToXObject(StructureObject item) | Converts an annotation structure item to a Form XObject annotation. |
| protected void | dealWithMcid(PdfNumber mcid) | When an MCID is encountered, the parser will check the list of structure items and turn an annotation into an XObject if necessary. |
| protected void | dealWithXObj(PdfName xobj) | When an XObject with a StructParent is encountered, we want to remove it from the stack. |
| void | parse(PdfDictionary page, PdfIndirectReference pageref) | Parses the content of a page, inserting the normal (/N) appearances (/AP) of annotations into the content stream as Form XObjects. |
| protected void | populateOperators() | Populates the operators variable. |
| protected void | println(PdfObject o) | Writes a PDF object to the OutputStream, followed by a newline character. |
| protected void | printOperator(PdfLiteral operator, java.util.List<PdfObject> operands) | Adds an operator and its operands (if any) to baos. |
| protected void | printsp(PdfObject o) | Writes a PDF object to the OutputStream, followed by a space character. |
| protected void | printTextOperator(PdfLiteral operator, java.util.List<PdfObject> operands) | Adds an operator and its operands (if any) to baos, keeping track of the text state. |
| protected void | processOperator(PdfLiteral operator, java.util.List<PdfObject> operands) | Processes an operator, for instance by writing the operator and its operands to baos. |
| protected void | setInText(boolean inText) | Informs the parser that we're inside or outside a text object. |
-
-
-
Field Detail
-
LOGGER
protected static final Logger LOGGER
The Logger instance
-
RASFACTORY
protected static final RandomAccessSourceFactory RASFACTORY
Factory that will help us build a RandomAccessSource.
-
DEFAULTOPERATOR
public static final java.lang.String DEFAULTOPERATOR
Constant used for the default operator.
- See Also:
  Constant Field Values
-
TSTAR
public static final PdfLiteral TSTAR
A new line operator
-
operators
protected java.util.Map<java.lang.String,MCParser.PdfOperator> operators
A map with all supported operators (PDF syntax).
-
items
protected StructureItems items
The list with structure items.
-
baos
protected java.io.ByteArrayOutputStream baos
The contents of the new content stream of the page.
-
page
protected PdfDictionary page
The page dictionary
-
pageref
protected PdfIndirectReference pageref
The reference to the page dictionary
-
annots
protected PdfArray annots
The annotations of the page that is being processed.
-
structParents
protected PdfNumber structParents
The StructParents of the page that is being processed.
-
xobjects
protected PdfDictionary xobjects
The XObject dictionary of the page that is being processed.
-
btWrite
protected boolean btWrite
Did we postpone writing a BT operator?
-
etExtra
protected boolean etExtra
Did we postpone writing a BT operator?
-
inText
protected boolean inText
Are we inside a BT/ET sequence?
-
text
protected java.lang.StringBuffer text
A buffer containing text state.
-
-
Constructor Detail
-
MCParser
public MCParser(StructureItems items)
Creates an MCParser object.
- Parameters:
  items - a list of StructureItem objects
-
-
Method Detail
-
populateOperators
protected void populateOperators()
Populates the operators variable.
-
parse
public void parse(PdfDictionary page, PdfIndirectReference pageref) throws java.io.IOException, DocumentException
Parses the content of a page, inserting the normal (/N) appearances (/AP) of annotations into the content stream as Form XObjects.
- Parameters:
  page - a page dictionary
  pageref - the reference to the page dictionary
- Throws:
  java.io.IOException
  DocumentException
-
dealWithXObj
protected void dealWithXObj(PdfName xobj)
When an XObject with a StructParent is encountered, we want to remove it from the stack.
- Parameters:
  xobj - the name of an XObject
-
dealWithMcid
protected void dealWithMcid(PdfNumber mcid) throws java.io.IOException, DocumentException
When an MCID is encountered, the parser will check the list of structure items and turn an annotation into an XObject if necessary.
- Parameters:
  mcid - the MCID that was encountered in the content stream
- Throws:
  java.io.IOException
  DocumentException
-
convertToXObject
protected void convertToXObject(StructureObject item) throws java.io.IOException, DocumentException
Converts an annotation structure item to a Form XObject annotation.
- Parameters:
  item - the structure item
- Throws:
  java.io.IOException
  DocumentException
-
processOperator
protected void processOperator(PdfLiteral operator, java.util.List<PdfObject> operands) throws java.io.IOException, DocumentException
Processes an operator, for instance by writing the operator and its operands to baos.
- Parameters:
  operator - the operator
  operands - the operator's operands
- Throws:
  java.io.IOException
  DocumentException
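The relationship between populateOperators, the operators map, and DEFAULTOPERATOR can be pictured with a small stand-alone sketch. It is not the iText implementation: the Handler interface, the "DefaultOperator" key, and the use of plain strings instead of PdfLiteral and PdfObject are all assumptions made for illustration. It only shows the general idea of looking up a handler by operator name and falling back to a default handler that copies content unchanged.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative sketch only; names and behavior are assumptions, not iText code. */
class OperatorDispatchSketch {

    /** Hypothetical stand-in for MCParser.PdfOperator. */
    interface Handler {
        void process(StringBuilder out, String operator, List<String> operands);
    }

    /** Assumed key for the fallback handler; the real constant's value is not shown on this page. */
    static final String DEFAULT_KEY = "DefaultOperator";

    private final Map<String, Handler> operators = new HashMap<>();
    private final StringBuilder out = new StringBuilder();

    OperatorDispatchSketch() {
        populateOperators();
    }

    /** Registers a fallback that copies content unchanged, plus one special case for Do. */
    private void populateOperators() {
        operators.put(DEFAULT_KEY, (o, op, operands) -> copy(o, op, operands));
        operators.put("Do", (o, op, operands) -> {
            // A real handler would inspect the XObject; here we only leave a comment line.
            o.append("% Do handler saw ").append(operands.get(0)).append('\n');
            copy(o, op, operands);
        });
    }

    /** Writes the operands separated by spaces, then the operator and a newline. */
    private static void copy(StringBuilder o, String op, List<String> operands) {
        for (String operand : operands) {
            o.append(operand).append(' ');
        }
        o.append(op).append('\n');
    }

    /** Looks up the handler by operator name, falling back to the default handler. */
    void processOperator(String operator, List<String> operands) {
        Handler handler = operators.get(operator);
        if (handler == null) {
            handler = operators.get(DEFAULT_KEY);
        }
        handler.process(out, operator, operands);
    }

    String result() {
        return out.toString();
    }

    public static void main(String[] args) {
        OperatorDispatchSketch sketch = new OperatorDispatchSketch();
        sketch.processOperator("cm", Arrays.asList("1", "0", "0", "1", "72", "720"));
        sketch.processOperator("Do", Arrays.asList("/Fm0"));
        System.out.print(sketch.result());
    }
}
```

In this sketch the cm operator has no dedicated handler, so it goes through the default handler and is copied as-is, while Do is routed to its own handler.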
-
printOperator
protected void printOperator(PdfLiteral operator, java.util.List<PdfObject> operands) throws java.io.IOException
Adds an operator and its operands (if any) to baos.
- Parameters:
  operator - the operator
  operands - its operands
- Throws:
  java.io.IOException
-
printTextOperator
protected void printTextOperator(PdfLiteral operator, java.util.List<PdfObject> operands) throws java.io.IOException
Adds an operator and its operands (if any) to baos, keeping track of the text state.
- Parameters:
  operator - the operator
  operands - its operands
- Throws:
  java.io.IOException
-
printsp
protected void printsp(PdfObject o) throws java.io.IOException
Writes a PDF object to the OutputStream, followed by a space character.
- Parameters:
  o - a PdfObject
- Throws:
  java.io.IOException
-
println
protected void println(PdfObject o) throws java.io.IOException
Writes a PDF object to the OutputStream, followed by a newline character.
- Parameters:
  o - a PdfObject
- Throws:
  java.io.IOException
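As a rough picture of how printsp, println, and printOperator could fit together, the following stand-alone sketch writes operands separated by spaces and ends each instruction with the operator and a newline. It is an assumption-laden illustration, not the library code: tokens are plain strings rather than PdfObject instances, the class name ContentWriterSketch is hypothetical, and the methods avoid checked exceptions for brevity even though the documented methods declare java.io.IOException.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Illustrative sketch only; plain strings stand in for PdfObject and PdfLiteral. */
class ContentWriterSketch {

    /** Plays the role of the baos field: the new content stream being built. */
    private final ByteArrayOutputStream baos = new ByteArrayOutputStream();

    /** Writes a token followed by a space character. */
    void printsp(String token) {
        byte[] bytes = token.getBytes(StandardCharsets.ISO_8859_1);
        baos.write(bytes, 0, bytes.length);
        baos.write(' ');
    }

    /** Writes a token followed by a newline character. */
    void println(String token) {
        byte[] bytes = token.getBytes(StandardCharsets.ISO_8859_1);
        baos.write(bytes, 0, bytes.length);
        baos.write('\n');
    }

    /** Writes every operand with printsp, then the operator with println. */
    void printOperator(String operator, List<String> operands) {
        for (String operand : operands) {
            printsp(operand);
        }
        println(operator);
    }

    String content() {
        return new String(baos.toByteArray(), StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        ContentWriterSketch writer = new ContentWriterSketch();
        writer.printOperator("Tf", Arrays.asList("/F1", "12"));
        writer.printOperator("ET", Collections.<String>emptyList());
        System.out.print(writer.content());
    }
}
```

An operator without operands, such as ET in the usage above, ends up on a line of its own.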
-
checkBT
protected void checkBT() throws java.io.IOException
Checks if a BT operator is waiting to be added.
- Throws:
  java.io.IOException
-
setInText
protected void setInText(boolean inText)
Informs the parser that we're inside or outside a text object. Also sets a parameter indicating that BT needs to be written.
- Parameters:
  inText - true if we're inside.
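The interplay of setInText, btWrite, and checkBT suggests a "postponed BT" pattern: entering a text object only records that BT is pending, and the operator is written once something inside the text object actually needs it. The sketch below shows one plausible way such a pattern can work. It is an assumption, not the library's logic: the class DeferredBtSketch, the textOperator and endText helpers, and the decision to drop empty BT/ET pairs are all hypothetical, and the etExtra flag is not modelled.

```java
/** Illustrative sketch only; one possible way to postpone writing BT. */
class DeferredBtSketch {

    private final StringBuilder out = new StringBuilder();
    private boolean inText;   // are we inside a BT/ET sequence?
    private boolean btWrite;  // is a BT operator still waiting to be written?

    /** Records whether we are inside a text object; entering marks BT as pending. */
    void setInText(boolean inText) {
        this.inText = inText;
        this.btWrite = inText;
    }

    /** Writes the postponed BT operator, if there is one. */
    void checkBT() {
        if (btWrite) {
            out.append("BT\n");
            btWrite = false;
        }
    }

    /** Hypothetical helper: writes one text instruction, emitting BT first if needed. */
    void textOperator(String instruction) {
        checkBT();
        out.append(instruction).append('\n');
    }

    /** Hypothetical helper: closes the text object; ET is written only if BT was. */
    void endText() {
        boolean btWasWritten = inText && !btWrite;
        setInText(false);
        if (btWasWritten) {
            out.append("ET\n");
        }
    }

    String content() {
        return out.toString();
    }

    public static void main(String[] args) {
        DeferredBtSketch sketch = new DeferredBtSketch();
        sketch.setInText(true);   // empty text object: nothing is written
        sketch.endText();
        sketch.setInText(true);   // text object with content
        sketch.textOperator("/F1 12 Tf");
        sketch.textOperator("(Hi) Tj");
        sketch.endText();
        System.out.print(sketch.content());
    }
}
```

Postponing BT this way keeps the writer free to emit other content, such as a Do operator, before a text object has really started, since Form XObjects are not painted from inside a BT/ET pair.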
-
-