Package com.itextpdf.xmp.impl
Class XMPMetaParser
- java.lang.Object
-
- com.itextpdf.xmp.impl.XMPMetaParser
-
public class XMPMetaParser extends java.lang.Object
This class replaces the ExpatAdapter.cpp and does the XML parsing and fixes the prefix. After the parsing, several normalisations are applied to the XMP tree.
- Since:
- 01.02.2006
-
-
Constructor Summary
Constructors
Modifier Constructor Description
private XMPMetaParser()
Hidden constructor, initialises the SAX parser handler.
-
Method Summary
Modifier and Type Method Description
private static javax.xml.parsers.DocumentBuilderFactory createDocumentBuilderFactory()
private static java.lang.Object[] findRootNode(org.w3c.dom.Node root, boolean xmpmetaRequired, java.lang.Object[] result)
Find the XML node that is the root of the XMP data tree.
static XMPMeta parse(java.lang.Object input, ParseOptions options)
Parses the input source into an XMP metadata object, including de-aliasing and normalisation.
private static org.w3c.dom.Document parseInputSource(org.xml.sax.InputSource source)
Runs the XML parser.
private static org.w3c.dom.Document parseXml(java.lang.Object input, ParseOptions options)
Parses the raw XML metadata packet considering the parsing options.
private static org.w3c.dom.Document parseXmlFromBytebuffer(ByteBuffer buffer, ParseOptions options)
Parses XML from a byte buffer, optionally fixing the encoding (Latin-1 to UTF-8) and illegal control characters.
private static org.w3c.dom.Document parseXmlFromInputStream(java.io.InputStream stream, ParseOptions options)
Parses XML from an InputStream, optionally fixing the encoding (Latin-1 to UTF-8) and illegal control characters.
private static org.w3c.dom.Document parseXmlFromString(java.lang.String input, ParseOptions options)
Parses XML from a String, optionally fixing illegal control characters.
-
-
-
Method Detail
-
parse
public static XMPMeta parse(java.lang.Object input, ParseOptions options) throws XMPException
Parses the input source into an XMP metadata object, including de-aliasing and normalisation.
- Parameters:
input
- the input can be an InputStream, a String or a byte buffer containing the XMP packet.
options
- the parse options
- Returns:
- Returns the resulting XMP metadata object
- Throws:
XMPException
- Thrown if parsing or normalisation fails.
-
parseXml
private static org.w3c.dom.Document parseXml(java.lang.Object input, ParseOptions options) throws XMPException
Parses the raw XML metadata packet considering the parsing options. Latin-1/ISO-8859-1 can be accepted when the input is a byte stream (some old toolkit versions produced such packets). The stream is then wrapped in another stream that converts Latin-1 to UTF-8.
If control characters are to be fixed, a reader is used that replaces them with spaces (if the input is a byte stream, it has to be read as a character stream).
Both options reduce the performance of the parser.
- Parameters:
input
- the input can be an InputStream, a String or a byte buffer containing the XMP packet.
options
- the parsing options
- Returns:
- Returns the parsed XML document.
- Throws:
XMPException
- Thrown if the parsing fails for any reason.
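The control-character fix described for parseXml can be illustrated with a small reader wrapper. This is a minimal sketch using only the JDK, not the library's own reader: the names `ControlCharFixSketch`, `SpaceFixingReader` and `fix` are hypothetical, and the sketch only handles raw characters below 0x20 (other than tab, line feed and carriage return), which XML 1.0 does not allow.

```java
import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

/** Hypothetical helper, not part of the library. */
final class ControlCharFixSketch {

    /** Reader wrapper that maps illegal control characters to spaces while reading. */
    static final class SpaceFixingReader extends FilterReader {
        SpaceFixingReader(Reader in) {
            super(in);
        }

        @Override
        public int read() throws IOException {
            int c = super.read();
            return (c >= 0 && isIllegalControl((char) c)) ? ' ' : c;
        }

        @Override
        public int read(char[] buf, int off, int len) throws IOException {
            int n = super.read(buf, off, len);
            // n is -1 at end of stream, in which case the loop body never runs.
            for (int i = off; i < off + n; i++) {
                if (isIllegalControl(buf[i])) {
                    buf[i] = ' ';
                }
            }
            return n;
        }
    }

    /** True for characters below 0x20 that XML 1.0 forbids (all except tab, LF, CR). */
    static boolean isIllegalControl(char c) {
        return c < 0x20 && c != '\t' && c != '\n' && c != '\r';
    }

    /** Convenience method: pushes a whole string through the fixing reader. */
    static String fix(String xml) {
        StringBuilder out = new StringBuilder(xml.length());
        try (Reader reader = new SpaceFixingReader(new StringReader(xml))) {
            char[] buf = new char[1024];
            int n;
            while ((n = reader.read(buf, 0, buf.length)) != -1) {
                out.append(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }
}
```

Because the replacement happens one character at a time inside the reader, the packet length and all character offsets stay unchanged, which keeps parser error positions meaningful.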
-
parseXmlFromInputStream
private static org.w3c.dom.Document parseXmlFromInputStream(java.io.InputStream stream, ParseOptions options) throws XMPException
Parses XML from an InputStream, optionally fixing the encoding (Latin-1 to UTF-8) and illegal control characters.
- Parameters:
stream
- an InputStream
options
- the parsing options
- Returns:
- Returns an XML DOM-Document.
- Throws:
XMPException
- Thrown when the parsing fails.
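One simple way to accept Latin-1 packets alongside UTF-8 ones is sketched below. It is a simplified illustration, not the library's converter: it decodes the whole packet strictly as UTF-8 and falls back to ISO-8859-1 only if that fails, whereas the library wraps the stream in a converting stream. The class and method names are hypothetical, and `ByteBuffer` here is `java.nio.ByteBuffer`, not the `ByteBuffer` type used in the signatures on this page.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper, not part of the library. */
final class Latin1FallbackSketch {

    /** Decodes a packet as UTF-8, or as Latin-1 if the bytes are not valid UTF-8. */
    static String decode(byte[] packet) {
        try {
            return StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)      // fail instead of substituting U+FFFD
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(packet))
                    .toString();
        } catch (CharacterCodingException e) {
            // Every byte sequence is valid ISO-8859-1, so this fallback cannot fail.
            return new String(packet, StandardCharsets.ISO_8859_1);
        }
    }
}
```

The trade-off of this whole-packet approach is that a packet mixing valid UTF-8 with stray Latin-1 bytes is decoded entirely as Latin-1.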
-
parseXmlFromBytebuffer
private static org.w3c.dom.Document parseXmlFromBytebuffer(ByteBuffer buffer, ParseOptions options) throws XMPException
Parses XML from a byte buffer, optionally fixing the encoding (Latin-1 to UTF-8) and illegal control characters.
- Parameters:
buffer
- a byte buffer containing the XMP packet
options
- the parsing options
- Returns:
- Returns an XML DOM-Document.
- Throws:
XMPException
- Thrown when the parsing fails.
-
parseXmlFromString
private static org.w3c.dom.Document parseXmlFromString(java.lang.String input, ParseOptions options) throws XMPException
Parses XML from a String, optionally fixing illegal control characters.
- Parameters:
input
- a String containing the XMP packet
options
- the parsing options
- Returns:
- Returns an XML DOM-Document.
- Throws:
XMPException
- Thrown when the parsing fails.
-
parseInputSource
private static org.w3c.dom.Document parseInputSource(org.xml.sax.InputSource source) throws XMPException
Runs the XML parser.
- Parameters:
source
- an InputSource
- Returns:
- Returns an XML DOM-Document.
- Throws:
XMPException
- Wraps parsing and I/O-exceptions into an XMPException.
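The pattern of running a DOM parser on an InputSource and wrapping its failures into a single exception type might look roughly like this. This is a sketch with the standard JAXP API only. `PacketParseException` is a hypothetical, unchecked stand-in for XMPException, and the factory settings are assumptions rather than the library's configuration.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

/** Hypothetical helper, not part of the library. */
final class ParseInputSourceSketch {

    /** Hypothetical stand-in for XMPException (unchecked to keep the sketch short). */
    static final class PacketParseException extends RuntimeException {
        PacketParseException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /** Parses the source into a DOM document, wrapping all parser and I/O failures. */
    static Document parse(InputSource source) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // assumption: XMP relies on namespaces
            DocumentBuilder builder = factory.newDocumentBuilder();
            return builder.parse(source);
        } catch (SAXException e) {
            throw new PacketParseException("XML parsing failure", e);
        } catch (ParserConfigurationException e) {
            throw new PacketParseException("XML parser not correctly configured", e);
        } catch (IOException e) {
            throw new PacketParseException("Error reading the XML input", e);
        }
    }

    /** Convenience overload for string input. */
    static Document parse(String xml) {
        return parse(new InputSource(new StringReader(xml)));
    }
}
```

Callers then only need to handle one exception type, and the original cause remains available through `getCause()`.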
-
findRootNode
private static java.lang.Object[] findRootNode(org.w3c.dom.Node root, boolean xmpmetaRequired, java.lang.Object[] result)
Find the XML node that is the root of the XMP data tree. Generally this will be an outer node, but it could be anywhere if a general XML document is parsed (e.g. SVG). The XML parser counted all rdf:RDF and pxmp:XMP_Packet nodes, and kept a pointer to the last one. If there is more than one possible root use PickBestRoot to choose among them.
If there is a root node, try to extract the version of the previous XMP toolkit.
Pick the first x:xmpmeta among multiple root candidates. If there aren't any, pick the first bare rdf:RDF if that is allowed. The returned root is the rdf:RDF child if an x:xmpmeta element was chosen. The search is breadth first, so a higher-level candidate is chosen over a lower-level one that was textually earlier in the serialized XML.
- Parameters:
root
- the root of the XML document
xmpmetaRequired
- flag indicating whether the xmpmeta tag is still required; might be set initially to true, if the parse option "REQUIRE_XMP_META" is set
result
- The result array that is filled during the recursive process.
- Returns:
- Returns an array that contains the result or null. The array contains:
- [0] - the rdf:RDF-node
- [1] - an object that is either XMP_RDF or XMP_PLAIN (the latter is deprecated)
- [2] - the body text of the xpacket-instruction.
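The selection rule described above (first x:xmpmeta in breadth-first order, otherwise the first bare rdf:RDF if allowed) can be sketched as follows. This is an illustrative reimplementation with the JDK DOM API, not the library's code: the class and method names are hypothetical, it returns only the rdf:RDF element rather than the three-element result array, and it assumes the document was parsed with namespace awareness enabled.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Deque;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

/** Hypothetical helper, not part of the library. */
final class RootFinderSketch {
    static final String NS_X = "adobe:ns:meta/";
    static final String NS_RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";

    /** Returns the rdf:RDF element to use as the XMP root, or null if none qualifies. */
    static Element findRdfRoot(Node root, boolean xmpmetaRequired) {
        Element firstBareRdf = null;
        Deque<Node> queue = new ArrayDeque<>();
        queue.add(root);
        while (!queue.isEmpty()) {
            Node node = queue.remove();
            if (isElement(node, NS_X, "xmpmeta")) {
                // First x:xmpmeta in breadth-first order wins; its rdf:RDF child is the root.
                // (If it has no rdf:RDF child, this sketch simply returns null.)
                return firstChildElement(node, NS_RDF, "RDF");
            }
            if (firstBareRdf == null && isElement(node, NS_RDF, "RDF")) {
                firstBareRdf = (Element) node; // remember as fallback candidate
            }
            for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
                queue.add(c);
            }
        }
        return xmpmetaRequired ? null : firstBareRdf;
    }

    private static Element firstChildElement(Node parent, String ns, String local) {
        for (Node c = parent.getFirstChild(); c != null; c = c.getNextSibling()) {
            if (isElement(c, ns, local)) {
                return (Element) c;
            }
        }
        return null;
    }

    private static boolean isElement(Node n, String ns, String local) {
        return n.getNodeType() == Node.ELEMENT_NODE
                && ns.equals(n.getNamespaceURI())
                && local.equals(n.getLocalName());
    }

    /** Namespace-aware parse helper for trying the sketch out. */
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("not well-formed XML", e);
        }
    }
}
```

Using a queue rather than recursion makes the breadth-first order explicit: an x:xmpmeta directly under the document element is visited before one nested deeper, even if the nested one appears earlier in the text.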
-
createDocumentBuilderFactory
private static javax.xml.parsers.DocumentBuilderFactory createDocumentBuilderFactory()
- Returns:
- Creates, configures and returns the document builder factory for the metadata parser.
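The page does not say which settings the factory receives. A plausible configuration for a metadata parser is sketched below. It is an assumption, not the library's documented behaviour: namespace awareness for the rdf: and x: prefixes, comments dropped from the DOM, and the standard JAXP secure-processing feature enabled where the parser supports it. The class name is hypothetical.

```java
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

/** Hypothetical helper, not part of the library. */
final class FactoryConfigSketch {

    /** Creates a DOM factory with settings that suit parsing metadata packets. */
    static DocumentBuilderFactory create() {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);    // element lookup by namespace URI needs this
        factory.setIgnoringComments(true);  // comments carry no metadata
        try {
            // Asks the parser to apply its built-in resource limits (e.g. entity expansion).
            factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        } catch (ParserConfigurationException e) {
            // The underlying parser does not support the feature; keep its defaults.
        }
        return factory;
    }
}
```

Keeping the configuration in one factory method means every parse path (stream, byte buffer, string) ends up with the same parser settings.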
-
-