Class MCParser

java.lang.Object
com.itextpdf.text.pdf.mc.MCParser

public class MCParser extends Object
This class will parse page content streams and add Do operators in a marked-content sequence for every field that needs to be flattened.
  • Field Details

    • LOGGER

      protected static final Logger LOGGER
      The Logger instance.
    • RASFACTORY

      protected static final RandomAccessSourceFactory RASFACTORY
      Factory that will help us build a RandomAccessSource.
    • DEFAULTOPERATOR

      public static final String DEFAULTOPERATOR
      Constant used for the default operator.
    • TSTAR

      public static final PdfLiteral TSTAR
      The new line operator (T*).
    • operators

      protected Map<String,MCParser.PdfOperator> operators
      A map with all supported operators (PDF syntax).
    • items

      protected StructureItems items
      The list with structure items.
    • baos

      protected ByteArrayOutputStream baos
      The contents of the new content stream of the page.
    • page

      protected PdfDictionary page
      The page dictionary.
    • pageref

      protected PdfIndirectReference pageref
      The reference to the page dictionary.
    • annots

      protected PdfArray annots
      The annotations of the page that is being processed.
    • structParents

      protected PdfNumber structParents
      The StructParents of the page that is being processed.
    • xobjects

      protected PdfDictionary xobjects
      The XObject dictionary of the page that is being processed.
    • btWrite

      protected boolean btWrite
      Did we postpone writing a BT operator?
    • etExtra

      protected boolean etExtra
      Did we add an extra ET operator?
    • inText

      protected boolean inText
      Are we inside a BT/ET sequence?
    • text

      protected StringBuffer text
      A buffer containing text state.
  • Constructor Details

    • MCParser

      public MCParser(StructureItems items)
      Creates an MCParser object.
      Parameters:
      items - a list of StructureItem objects
  • Method Details

    • populateOperators

      protected void populateOperators()
      Populates the operators variable.
    • parse

      public void parse(PdfDictionary page, PdfIndirectReference pageref) throws IOException, DocumentException
      Parses the content of a page, inserting the normal (/N) appearances (/AP) of annotations into the content stream as Form XObjects.
      Parameters:
      page - a page dictionary
      pageref - the reference to the page dictionary
      Throws:
      IOException
      DocumentException
    • dealWithXObj

      protected void dealWithXObj(PdfName xobj)
      When an XObject with a StructParent is encountered, we want to remove it from the stack.
      Parameters:
      xobj - the name of an XObject
    • dealWithMcid

      protected void dealWithMcid(PdfNumber mcid) throws IOException, DocumentException
      When an MCID is encountered, the parser will check the list of structure items and turn an annotation into an XObject if necessary.
      Parameters:
      mcid - the MCID that was encountered in the content stream
      Throws:
      IOException
      DocumentException
    • convertToXObject

      protected void convertToXObject(StructureObject item) throws IOException, DocumentException
      Converts an annotation structure item to a Form XObject annotation.
      Parameters:
      item - the structure item
      Throws:
      IOException
      DocumentException
    • processOperator

      protected void processOperator(PdfLiteral operator, List<PdfObject> operands) throws IOException, DocumentException
      Processes an operator, for instance by writing the operator and its operands to baos.
      Parameters:
      operator - the operator
      operands - the operator's operands
      Throws:
      IOException
      DocumentException
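
The relationship between the operators map, the default operator, and processOperator can be illustrated with a small stand-alone sketch. It does not use iText classes: the Handler interface, the DEFAULT_KEY value, and the use of plain strings for operators and operands are all assumptions made for illustration, and the real MCParser.PdfOperator implementations may behave differently.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class OperatorDispatchSketch {
    /** Hypothetical handler type; stands in for MCParser.PdfOperator. */
    interface Handler {
        void process(String operator, List<String> operands, StringBuilder out);
    }

    /** Hypothetical key for the fallback handler (an assumption, not taken from this page). */
    static final String DEFAULT_KEY = "DefaultOperator";

    /** Builds the operator table: one fallback entry plus one special case. */
    static Map<String, Handler> populate() {
        final Map<String, Handler> ops = new HashMap<>();
        // Fallback: copy the operands and the operator through unchanged.
        ops.put(DEFAULT_KEY, (op, operands, out) -> {
            for (String operand : operands) {
                out.append(operand).append(' ');
            }
            out.append(op).append('\n');
        });
        // Special case: note that Do was seen, then delegate to the fallback.
        ops.put("Do", (op, operands, out) -> {
            out.append("% intercepted Do for ").append(operands.get(0)).append('\n');
            ops.get(DEFAULT_KEY).process(op, operands, out);
        });
        return ops;
    }

    /** Looks up the handler for an operator, falling back to the default one. */
    static void process(Map<String, Handler> ops, String operator,
                        List<String> operands, StringBuilder out) {
        Handler handler = ops.get(operator);
        if (handler == null) {
            handler = ops.get(DEFAULT_KEY);
        }
        handler.process(operator, operands, out);
    }

    static String demo() {
        Map<String, Handler> ops = populate();
        StringBuilder out = new StringBuilder();
        process(ops, "Tf", Arrays.asList("/F1", "12"), out); // no entry: uses fallback
        process(ops, "Do", Arrays.asList("/Im0"), out);      // has its own entry
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(demo());
    }
}
```

Keeping a fallback entry in the same map means that operators without special handling are still copied to the output, so only the operators of interest need their own handler.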
    • printOperator

      protected void printOperator(PdfLiteral operator, List<PdfObject> operands) throws IOException
      Adds an operator and its operands (if any) to baos.
      Parameters:
      operator - the operator
      operands - its operands
      Throws:
      IOException
    • printTextOperator

      protected void printTextOperator(PdfLiteral operator, List<PdfObject> operands) throws IOException
      Adds an operator and its operands (if any) to baos, keeping track of the text state.
      Parameters:
      operator - the operator
      operands - its operands
      Throws:
      IOException
    • printsp

      protected void printsp(PdfObject o) throws IOException
      Writes a PDF object to the OutputStream, followed by a space character.
      Parameters:
      o - a PdfObject
      Throws:
      IOException
    • println

      protected void println(PdfObject o) throws IOException
      Writes a PDF object to the OutputStream, followed by a newline character.
      Parameters:
      o - a PdfObject
      Throws:
      IOException
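
As a rough illustration of how printsp and println differ, the following sketch writes tokens to a ByteArrayOutputStream followed by a space or a newline. It is a simplified stand-in that assumes plain strings instead of PdfObject instances and ISO-8859-1 as the byte encoding; the class name TokenWriterSketch is hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

class TokenWriterSketch {
    private final ByteArrayOutputStream baos = new ByteArrayOutputStream();

    /** Writes a token followed by a space, as is done for operands. */
    void printsp(String token) {
        byte[] bytes = token.getBytes(StandardCharsets.ISO_8859_1);
        baos.write(bytes, 0, bytes.length);
        baos.write(' ');
    }

    /** Writes a token followed by a newline, as is done for the closing operator. */
    void println(String token) {
        byte[] bytes = token.getBytes(StandardCharsets.ISO_8859_1);
        baos.write(bytes, 0, bytes.length);
        baos.write('\n');
    }

    String contents() {
        return new String(baos.toByteArray(), StandardCharsets.ISO_8859_1);
    }

    static String demo() {
        TokenWriterSketch writer = new TokenWriterSketch();
        writer.printsp("/F1");  // operand
        writer.printsp("12");   // operand
        writer.println("Tf");   // operator ends the line
        return writer.contents();
    }

    public static void main(String[] args) {
        System.out.print(demo());
    }
}
```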
    • checkBT

      protected void checkBT() throws IOException
      Checks if a BT operator is waiting to be added.
      Throws:
      IOException
    • setInText

      protected void setInText(boolean inText)
      Informs the parser that we're inside or outside a text object. Also sets a parameter indicating that BT needs to be written.
      Parameters:
      inText - true if we're inside.
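
The interplay of setInText, checkBT, and the btWrite flag suggests a deferred-write pattern: BT is only remembered when a text object starts and is written once something inside the text object is actually emitted. The sketch below shows one plausible reading of that idea under simplifying assumptions. It is not the library's implementation; the endText helper and the use of a StringBuilder instead of baos are hypothetical.

```java
class DeferredBtSketch {
    private final StringBuilder out = new StringBuilder();
    private boolean btWrite; // true while a BT is pending
    private boolean inText;  // true between BT and ET

    /** Records whether we are inside a text object; entering one postpones BT. */
    void setInText(boolean inText) {
        if (inText) {
            btWrite = true;
        }
        this.inText = inText;
    }

    /** Writes the pending BT, if there is one. */
    void checkBT() {
        if (btWrite) {
            out.append("BT\n");
            btWrite = false;
        }
    }

    /** Emits a text operator line, flushing the pending BT first. */
    void printTextOperator(String line) {
        checkBT();
        out.append(line).append('\n');
    }

    /** Hypothetical helper: closes the text object, skipping ET if BT was never written. */
    void endText() {
        if (!btWrite) {
            out.append("ET\n");
        }
        btWrite = false;
        setInText(false);
    }

    static String demo() {
        DeferredBtSketch sketch = new DeferredBtSketch();
        // An empty text object leaves no trace in the output.
        sketch.setInText(true);
        sketch.endText();
        // A text object with content gets its BT just before the first operator.
        sketch.setInText(true);
        sketch.printTextOperator("(Hello) Tj");
        sketch.endText();
        return sketch.out.toString();
    }

    public static void main(String[] args) {
        System.out.print(demo());
    }
}
```

Postponing BT in this way lets other content be placed before the text object opens, which matters when an XObject has to be inserted at a point where a text object would otherwise already be open.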