Class XMPMetaParser

java.lang.Object
com.itextpdf.kernel.xmp.impl.XMPMetaParser

public class XMPMetaParser extends Object
This class replaces ExpatAdapter.cpp: it performs the XML parsing and fixes the prefix. After parsing, several normalisations are applied to the XMP tree.
Since:
01.02.2006
  • Field Details

    • XMP_RDF

      private static final Object XMP_RDF
  • Constructor Details

    • XMPMetaParser

      private XMPMetaParser()
      Hidden constructor, initialises the SAX parser handler.
  • Method Details

    • parse

      public static XMPMeta parse(Object input, ParseOptions options) throws XMPException
      Parses the input source into an XMP metadata object, including de-aliasing and normalisation.
      Parameters:
      input - the input can be an InputStream, a String or a byte buffer containing the XMP packet.
      options - the parse options
      Returns:
      Returns the resulting XMP metadata object
      Throws:
      XMPException - Thrown if parsing or normalisation fails.
    • parseXml

      private static Document parseXml(Object input, ParseOptions options) throws XMPException
      Parses the raw XML metadata packet, taking the parsing options into account. Latin-1/ISO-8859-1 can be accepted when the input is a byte stream (some old toolkit versions produced such packets). The stream is then wrapped in another stream that converts Latin-1 to UTF-8.

      If control characters are to be fixed, a reader is used that replaces them with spaces (if the input is a byte stream, it has to be read as a character stream).

      Both options reduce the performance of the parser.

      Parameters:
      input - the input can be an InputStream, a String or a byte buffer containing the XMP packet.
      options - the parsing options
      Returns:
      Returns the parsed XML document.
      Throws:
      XMPException - Thrown if the parsing fails for any reason.
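The dispatch on the runtime type of input described above can be sketched as follows. This is a minimal illustration using only JDK classes, not the library's code: the class name InputDispatchSketch and the method toInputSource are hypothetical, a plain byte[] stands in for the library's byte buffer type, and IllegalArgumentException stands in for XMPException.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.StringReader;
import org.xml.sax.InputSource;

// Hypothetical sketch: maps the supported input kinds to a SAX InputSource.
final class InputDispatchSketch {

    /** Returns an InputSource for a stream, a byte array or a String; rejects anything else. */
    static InputSource toInputSource(Object input) {
        if (input == null) {
            throw new IllegalArgumentException("input must not be null");
        }
        if (input instanceof InputStream) {
            // Byte input: the XML parser detects the encoding itself.
            return new InputSource((InputStream) input);
        }
        if (input instanceof byte[]) {
            // Assumption: a byte[] stands in for the library's byte buffer.
            return new InputSource(new ByteArrayInputStream((byte[]) input));
        }
        if (input instanceof String) {
            // Character input: already decoded, so no encoding fix applies.
            return new InputSource(new StringReader((String) input));
        }
        throw new IllegalArgumentException(
                "Unsupported input type: " + input.getClass().getName());
    }
}
```

Keeping the type check in one place means the encoding and control-character fixes only have to be decided once per input kind.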
    • parseXmlFromInputStream

      private static Document parseXmlFromInputStream(InputStream stream, ParseOptions options) throws XMPException
      Parses XML from an InputStream, optionally fixing the encoding (Latin-1 to UTF-8) and illegal control characters.
      Parameters:
      stream - an InputStream
      options - the parsing options
      Returns:
      Returns an XML DOM-Document.
      Throws:
      XMPException - Thrown when the parsing fails.
    • parseXmlFromBytebuffer

      private static Document parseXmlFromBytebuffer(ByteBuffer buffer, ParseOptions options) throws XMPException
      Parses XML from a byte buffer, optionally fixing the encoding (Latin-1 to UTF-8) and illegal control characters.
      Parameters:
      buffer - a byte buffer containing the XMP packet
      options - the parsing options
      Returns:
      Returns an XML DOM-Document.
      Throws:
      XMPException - Thrown when the parsing fails.
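One way to picture the Latin-1 to UTF-8 fix is a strict UTF-8 decode with a Latin-1 fallback. The sketch below is an assumption about the general idea, not the library's converter, which may work differently (for example byte by byte). The names Latin1FallbackSketch, decodePacket and toUtf8 are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: treat the packet as UTF-8 when it is valid UTF-8,
// otherwise reinterpret every byte as ISO-8859-1 (Latin-1).
final class Latin1FallbackSketch {

    /** Decodes the packet bytes to text, falling back to Latin-1 on invalid UTF-8. */
    static String decodePacket(byte[] packet) {
        CharsetDecoder strictUtf8 = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            return strictUtf8.decode(ByteBuffer.wrap(packet)).toString();
        } catch (CharacterCodingException e) {
            // Latin-1 maps each byte to exactly one character, so this cannot fail.
            return new String(packet, StandardCharsets.ISO_8859_1);
        }
    }

    /** Re-encodes the packet as UTF-8 bytes so an XML parser can read it as UTF-8. */
    static byte[] toUtf8(byte[] packet) {
        return decodePacket(packet).getBytes(StandardCharsets.UTF_8);
    }
}
```

The whole-packet fallback is simple but coarse: a packet that mixes valid UTF-8 with stray Latin-1 bytes is decoded entirely as Latin-1.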
    • parseXmlFromString

      private static Document parseXmlFromString(String input, ParseOptions options) throws XMPException
      Parses XML from a String, optionally fixing illegal control characters.
      Parameters:
      input - a String containing the XMP packet
      options - the parsing options
      Returns:
      Returns an XML DOM-Document.
      Throws:
      XMPException - Thrown when the parsing fails.
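The control-character fix can be pictured as a filtering reader placed in front of the XML parser. The sketch below is a simplified assumption: it replaces every character below U+0020 except tab, line feed and carriage return with a space, and ignores numeric character references, which the library's own reader may also handle. The class name ControlCharSpacingReader and the helper fix are hypothetical.

```java
import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical sketch: a reader that turns illegal XML control characters into spaces.
final class ControlCharSpacingReader extends FilterReader {

    ControlCharSpacingReader(Reader in) {
        super(in);
    }

    /** True for characters below U+0020 that XML 1.0 does not allow. */
    private static boolean isIllegalControl(int c) {
        return c >= 0 && c < 0x20 && c != '\t' && c != '\n' && c != '\r';
    }

    @Override
    public int read() throws IOException {
        int c = super.read();
        return isIllegalControl(c) ? ' ' : c;
    }

    @Override
    public int read(char[] buf, int off, int len) throws IOException {
        int n = super.read(buf, off, len);
        // When n is -1 (end of stream) the loop body never runs.
        for (int i = off; i < off + n; i++) {
            if (isIllegalControl(buf[i])) {
                buf[i] = ' ';
            }
        }
        return n;
    }

    /** Convenience helper for trying the reader on a String. */
    static String fix(String text) {
        StringBuilder out = new StringBuilder();
        try (Reader reader = new ControlCharSpacingReader(new StringReader(text))) {
            int c;
            while ((c = reader.read()) != -1) {
                out.append((char) c);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }
}
```

Because the replacement is one character for one character, positions reported by the parser still line up with the original packet.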
    • parseInputSource

      private static Document parseInputSource(InputSource source) throws XMPException
      Runs the XML-Parser.
      Parameters:
      source - an InputSource
      Returns:
      Returns an XML DOM-Document.
      Throws:
      XMPException - Wraps parsing and I/O-exceptions into an XMPException.
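Running a DOM parser and wrapping its failures into a single exception type can be sketched with the JDK's own XML API. This is an illustration under assumptions, not the library's code: DomParseSketch and PacketParseException are hypothetical names, and the unchecked PacketParseException stands in for XMPException.

```java
import java.io.IOException;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical sketch: parse an InputSource into a DOM document and
// report every failure through one exception type.
final class DomParseSketch {

    /** Hypothetical unchecked stand-in for XMPException. */
    static final class PacketParseException extends RuntimeException {
        PacketParseException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    static Document parse(InputSource source) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // XMP relies on namespaces (x:, rdf:), so the parser must be namespace aware.
            factory.setNamespaceAware(true);
            // Assumption: secure processing is a sensible default for untrusted packets.
            factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            return builder.parse(source);
        } catch (SAXException e) {
            throw new PacketParseException("XML parsing failure", e);
        } catch (IOException e) {
            throw new PacketParseException("Error reading the XML packet", e);
        } catch (ParserConfigurationException e) {
            throw new PacketParseException("XML parser not correctly configured", e);
        }
    }
}
```

Callers then handle one exception type and can still inspect the original cause through getCause().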
    • findRootNode

      private static Object[] findRootNode(Node root, boolean xmpmetaRequired, Object[] result)
      Finds the XML node that is the root of the XMP data tree. Generally this will be an outer node, but it could be anywhere if a general XML document is parsed (e.g. SVG). The XML parser counted all rdf:RDF and pxmp:XMP_Packet nodes, and kept a pointer to the last one. If there is more than one possible root, use PickBestRoot to choose among them.

      If there is a root node, try to extract the version of the previous XMP toolkit.

      Pick the first x:xmpmeta among multiple root candidates. If there aren't any, pick the first bare rdf:RDF if that is allowed. The returned root is the rdf:RDF child if an x:xmpmeta element was chosen. The search is breadth first, so a higher-level candidate is chosen over a lower-level one that was textually earlier in the serialized XML.

      Parameters:
      root - the root of the XML document
      xmpmetaRequired - flag indicating whether the xmpmeta tag is still required; may initially be set to true if the parse option "REQUIRE_XMP_META" is set
      result - The result array that is filled during the recursive process.
      Returns:
      Returns an array that contains the result or null. The array contains:
      • [0] - the rdf:RDF-node
      • [1] - an object that is either XMP_RDF or XMP_PLAIN (the latter is deprecated)
      • [2] - the body text of the xpacket-instruction.
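The selection rule described above (first x:xmpmeta wins, otherwise the first bare rdf:RDF if allowed, searched breadth first) can be sketched on a plain DOM tree. This is one reading of the description, not the library's recursive implementation: the class RootSearchSketch and its methods are hypothetical, it returns only the rdf:RDF element rather than the three-element result array, and it does not collect the xpacket instruction.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Deque;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical sketch of a breadth-first search for the XMP root element.
final class RootSearchSketch {
    static final String NS_X = "adobe:ns:meta/";
    static final String NS_RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";

    private static boolean is(Node node, String ns, String local) {
        return node.getNodeType() == Node.ELEMENT_NODE
                && ns.equals(node.getNamespaceURI())
                && local.equals(node.getLocalName());
    }

    /** Returns the first matching element in breadth-first order, or null. */
    static Element firstBreadthFirst(Node start, String ns, String local) {
        Deque<Node> queue = new ArrayDeque<>();
        queue.add(start);
        while (!queue.isEmpty()) {
            Node node = queue.remove();
            if (is(node, ns, local)) {
                return (Element) node;
            }
            for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
                queue.add(c);
            }
        }
        return null;
    }

    /** Returns the rdf:RDF element to use as the XMP root, or null if none qualifies. */
    static Element findRdfRoot(Node root, boolean xmpmetaRequired) {
        Element wrapper = firstBreadthFirst(root, NS_X, "xmpmeta");
        if (wrapper != null) {
            // An x:xmpmeta was chosen: the root is its rdf:RDF child.
            for (Node c = wrapper.getFirstChild(); c != null; c = c.getNextSibling()) {
                if (is(c, NS_RDF, "RDF")) {
                    return (Element) c;
                }
            }
            return null;
        }
        // No x:xmpmeta anywhere: a bare rdf:RDF is accepted only if allowed.
        return xmpmetaRequired ? null : firstBreadthFirst(root, NS_RDF, "RDF");
    }

    /** Test helper: namespace-aware DOM parse of an XML string. */
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Not well-formed XML", e);
        }
    }
}
```

The queue is what makes a shallow candidate win over a deeper one that appears earlier in the text; a depth-first walk would pick the textually first match instead.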