Package org.eclipse.rdf4j.rio.rdfxml
Class RDFXMLParser
java.lang.Object
  org.eclipse.rdf4j.rio.helpers.AbstractRDFParser
    org.eclipse.rdf4j.rio.helpers.XMLReaderBasedParser
      org.eclipse.rdf4j.rio.rdfxml.RDFXMLParser
All Implemented Interfaces:
RDFParser, org.xml.sax.ErrorHandler
public class RDFXMLParser extends XMLReaderBasedParser implements org.xml.sax.ErrorHandler
A parser for XML-serialized RDF. This parser operates directly on the SAX events generated by a SAX-enabled XML parser. The XML parser should be compliant with SAX2. You should specify which SAX parser should be used by setting the org.xml.sax.driver property. This parser is not thread-safe, therefore its public methods are synchronized.
To parse a document using this parser:
- Create an instance of RDFXMLParser, optionally supplying it with your own ValueFactory.
- Set the RDFHandler.
- Optionally, set the ParseErrorListener and/or ParseLocationListener.
- Optionally, specify whether the parser should verify the data it parses and whether it should stop immediately when it finds an error in the data (both default to true).
- Call the parse method.
// Use the SAX2-compliant Xerces parser:
System.setProperty("org.xml.sax.driver", "org.apache.xerces.parsers.SAXParser");

RDFParser parser = new RDFXMLParser();
parser.setRDFHandler(myRDFHandler);
parser.setParseErrorListener(myParseErrorListener);
parser.setVerifyData(true);
parser.stopAtFirstError(false);

// Parse the data from inputStream, resolving any
// relative URIs against http://foo/bar:
parser.parse(inputStream, "http://foo/bar");
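To make "operates directly on the SAX events" more concrete, the following sketch uses only the JDK's SAX API (it does not use RDF4J at all) to list the start-element events that a namespace-aware SAX parser produces for a small RDF/XML document. The class name SaxEventsSketch and its helper method are hypothetical names introduced for illustration; RDFXMLParser receives events of this kind through its SAXFilter.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration class; not part of RDF4J.
final class SaxEventsSketch {
    static final String RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";

    /** Returns "{namespaceURI}localName" for every start-element event, in document order. */
    static List<String> startElementEvents(String xml) {
        List<String> events = new ArrayList<>();
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true); // needed to receive namespace URIs and local names
            factory.newSAXParser().parse(new InputSource(new StringReader(xml)), new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName, String qName, Attributes atts) {
                    events.add("{" + uri + "}" + localName);
                }
            });
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return events;
    }

    public static void main(String[] args) {
        String doc = "<rdf:RDF xmlns:rdf='" + RDF_NS + "' xmlns:ex='http://example.org/'>"
                + "<rdf:Description rdf:about='http://example.org/s'><ex:p>o</ex:p></rdf:Description>"
                + "</rdf:RDF>";
        startElementEvents(doc).forEach(System.out::println);
    }
}
```

An RDF/XML parser built on SAX interprets such events alternately as node elements (here rdf:Description) and property elements (here ex:p).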
Note that JAXP entity expansion limits may apply. Check the documentation on limits and using the jaxp.properties file if you get one of the following errors:
JAXP00010001: The parser has encountered more than "64000" entity expansions in this document
JAXP00010004: The accumulated size of entities is ... that exceeded the "50,000,000" limit
As a work-around, try passing
-Djdk.xml.totalEntitySizeLimit=0 -DentityExpansionLimit=0
to the JVM.
- See Also:
ValueFactory, RDFHandler, ParseErrorListener, ParseLocationListener
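The JAXP work-around described above can also be sketched programmatically. The example below sets the same two properties as the -D options via System.setProperty; it assumes that the JAXP implementation reads these system properties when the XML parser is created, so the call should happen before the first parser is instantiated. The class and method names are hypothetical. Note that a value of 0 removes the limit entirely, which also removes the protection against entity-expansion attacks, so this is only advisable for trusted input.

```java
// Hypothetical illustration class; not part of RDF4J.
final class JaxpLimitsSketch {

    /**
     * Sets the same properties as
     * -Djdk.xml.totalEntitySizeLimit=0 -DentityExpansionLimit=0.
     * Assumption: the properties are read when the XML parser is created,
     * so this must run before any parser is instantiated.
     */
    static void liftEntityLimits() {
        System.setProperty("jdk.xml.totalEntitySizeLimit", "0");
        System.setProperty("entityExpansionLimit", "0");
    }

    public static void main(String[] args) {
        liftEntityLimits();
        System.out.println(System.getProperty("jdk.xml.totalEntitySizeLimit"));
        System.out.println(System.getProperty("entityExpansionLimit"));
    }
}
```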
-
-
Nested Class Summary
Nested Classes
Modifier and Type | Class | Description
(package private) static class | RDFXMLParser.NodeElement |
(package private) static class | RDFXMLParser.PropertyElement |
-
Field Summary
Fields
Modifier and Type | Field | Description
private java.lang.String | documentURI | The base URI of the document.
private java.util.Stack<java.lang.Object> | elementStack | A stack of node- and property elements.
private SAXFilter | saxFilter | A filter filtering calls to SAX methods specifically for this parser.
private java.util.Set<IRI> | usedIDs | A set containing URIs that have been generated as a result of rdf:ID attributes.
private java.lang.String | xmlLang | The language of literal values as can be specified using xml:lang attributes.
-
Fields inherited from class org.eclipse.rdf4j.rio.helpers.AbstractRDFParser
rdfHandler, valueFactory
-
-
Constructor Summary
Constructors
Constructor | Description
RDFXMLParser() | Creates a new RDFXMLParser that will use a SimpleValueFactory to create RDF model objects.
RDFXMLParser(ValueFactory valueFactory) | Creates a new RDFXMLParser that will use the supplied ValueFactory to create RDF model objects.
-
Method Summary
Modifier and Type | Method | Description
private IRI | buildResourceFromLocalName(java.lang.String localName) | Builds a Resource from a non-qualified local name.
private IRI | buildURIFromID(java.lang.String id) | Builds a Resource from the value of an rdf:ID attribute.
private void | checkNodeEltName(java.lang.String namespaceURI, java.lang.String localName, java.lang.String qName) | Checks whether the node element name is from the RDF namespace and, if so, if it is allowed to be used in a node element.
private void | checkNoMoreAtts(Atts atts) | Checks whether 'atts' is empty.
private void | checkPropertyEltName(java.lang.String namespaceURI, java.lang.String localName, java.lang.String qName, RioSetting<java.lang.Boolean> setting) | Checks whether the property element name is from the RDF namespace and, if so, if it is allowed to be used in a property element.
private void | checkRDFAtts(Atts atts) | Checks whether 'atts' contains attributes from the RDF namespace that are not allowed as attributes.
protected Literal | createLiteral(java.lang.String label, java.lang.String lang, IRI datatype) | Creates a Literal object with the supplied parameters.
protected Resource | createNode(java.lang.String nodeID) |
(package private) void | emptyElement(java.lang.String namespaceURI, java.lang.String localName, java.lang.String qName, Atts atts) |
(package private) void | endDocument() |
(package private) void | endElement(java.lang.String namespaceURI, java.lang.String localName, java.lang.String qName) |
void | error(org.xml.sax.SAXParseException exception) | Implementation of SAX ErrorHandler.error
void | fatalError(org.xml.sax.SAXParseException exception) | Implementation of SAX ErrorHandler.fatalError
private Resource | getNodeResource(Atts atts) | Retrieves the resource of a node element (subject or object) using relevant attributes (rdf:ID, rdf:about and rdf:nodeID) from its attributes list.
boolean | getParseStandAloneDocuments() | Returns whether the parser is currently in a mode to parse stand-alone RDF documents.
private Resource | getPropertyResource(Atts atts) | Retrieves the object resource of a property element using relevant attributes (rdf:resource and rdf:nodeID) from its attributes list.
RDFFormat | getRDFFormat() | Gets the RDF format that this parser can parse.
javax.xml.transform.sax.SAXResult | getSAXResult(java.lang.String baseURI) |
java.util.Collection<RioSetting<?>> | getSupportedSettings() |
private void | handleReification(Value value) |
void | parse(java.io.InputStream in, java.lang.String baseURI) | Parses the data from the supplied InputStream, using the supplied baseURI to resolve any relative URI references.
void | parse(java.io.Reader reader, java.lang.String baseURI) | Parses the data from the supplied Reader, using the supplied baseURI to resolve any relative URI references.
private void | parse(org.xml.sax.InputSource inputSource) |
private java.lang.Object | peekStack(int distFromTop) |
private void | processNodeElt(java.lang.String namespaceURI, java.lang.String localName, java.lang.String qName, Atts atts, boolean isEmptyElt) |
private void | processPropertyElt(java.lang.String namespaceURI, java.lang.String localName, java.lang.String qName, Atts atts, boolean isEmptyElt) |
private void | processSubjectAtts(RDFXMLParser.NodeElement nodeElt, Atts atts) | Processes subject attributes.
private void | reifyStatement(Resource reifNode, Resource subj, IRI pred, Value obj) |
protected void | reportError(java.lang.Exception e, RioSetting<java.lang.Boolean> setting) | Overrides AbstractRDFParser.reportError(String, RioSetting), adding line- and column number information to the error.
protected void | reportError(java.lang.String msg, RioSetting<java.lang.Boolean> setting) | Overrides AbstractRDFParser.reportError(String, RioSetting), adding line- and column number information to the error.
protected void | reportFatalError(java.lang.Exception e) | Overrides AbstractRDFParser.reportFatalError(Exception), adding line- and column number information to the error.
protected void | reportFatalError(java.lang.String msg) | Overrides AbstractRDFParser.reportFatalError(String), adding line- and column number information to the error.
private void | reportStatement(Resource subject, IRI predicate, Value object) | Reports a statement to the configured RDFHandler.
protected void | reportWarning(java.lang.String msg) | Overrides AbstractRDFParser.reportWarning(String), adding line- and column number information to the warning.
protected void | setBaseURI(java.lang.String baseURI) | Parses the supplied URI-string and sets it as the base URI for resolving relative URIs.
protected void | setBaseURI(ParsedIRI baseURI) | Sets the base URI for resolving relative URIs.
void | setParseStandAloneDocuments(boolean standAloneDocs) | Sets the parser in a mode to parse stand-alone RDF documents.
(package private) void | setXMLLang(java.lang.String xmlLang) |
(package private) void | startDocument() |
(package private) void | startElement(java.lang.String namespaceURI, java.lang.String localName, java.lang.String qName, Atts atts) |
(package private) void | text(java.lang.String text) |
private boolean | topIsProperty() |
void | warning(org.xml.sax.SAXParseException exception) | Implementation of SAX ErrorHandler.warning
-
Methods inherited from class org.eclipse.rdf4j.rio.helpers.XMLReaderBasedParser
getCompulsoryXmlFeatureSettings, getCompulsoryXmlPropertySettings, getOptionalXmlFeatureSettings, getOptionalXmlPropertySettings, getXMLReader
-
Methods inherited from class org.eclipse.rdf4j.rio.helpers.AbstractRDFParser
clear, clearBNodeIDMap, createBNode, createBNode, createLiteral, createNode, createStatement, createStatement, createURI, getNamespace, getParseErrorListener, getParseLocationListener, getParserConfig, getRDFHandler, initializeNamespaceTableFromConfiguration, preserveBNodeIDs, reportError, reportError, reportError, reportFatalError, reportFatalError, reportFatalError, reportLocation, reportWarning, resolveURI, set, setNamespace, setParseErrorListener, setParseLocationListener, setParserConfig, setPreserveBNodeIDs, setRDFHandler, setValueFactory
-
-
-
-
Field Detail
-
saxFilter
private final SAXFilter saxFilter
A filter filtering calls to SAX methods specifically for this parser.
-
documentURI
private java.lang.String documentURI
The base URI of the document. This variable is set when parse(inputStream, baseURI) is called and will not be changed during parsing.
-
xmlLang
private java.lang.String xmlLang
The language of literal values as can be specified using xml:lang attributes. This variable is set/modified by the SAXFilter during parsing such that it always represents the language of the context in which elements are reported.
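The scoping rule behind this field (an element inherits the language of its nearest ancestor that declares xml:lang, and an empty xml:lang resets it) can be sketched with a stack. The code below is a JDK-only illustration with hypothetical names; it is not how SAXFilter is implemented, only a minimal model of the same idea.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration class; not part of RDF4J.
final class XmlLangSketch {

    /** Maps each element's local name to the language in scope for it ("" means none). */
    static Map<String, String> languagesInScope(String xml) {
        Map<String, String> result = new LinkedHashMap<>();
        Deque<String> langStack = new ArrayDeque<>();
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            factory.newSAXParser().parse(new InputSource(new StringReader(xml)), new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName, String qName, Attributes atts) {
                    String declared = atts.getValue(XMLConstants.XML_NS_URI, "lang");
                    String inherited = langStack.isEmpty() ? "" : langStack.peek();
                    String current = (declared != null) ? declared : inherited;
                    langStack.push(current);
                    result.put(localName, current);
                }

                @Override
                public void endElement(String uri, String localName, String qName) {
                    langStack.pop(); // leaving the element restores the parent's language
                }
            });
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(languagesInScope(
                "<a xml:lang='en'><b/><c xml:lang='nl'><d/></c><e xml:lang=''/></a>"));
    }
}
```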
-
elementStack
private final java.util.Stack<java.lang.Object> elementStack
A stack of node- and property elements.
-
usedIDs
private final java.util.Set<IRI> usedIDs
A set containing URIs that have been generated as a result of rdf:ID attributes. These URIs should be unique within a single document.
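As a rough sketch of why a set is a natural fit here: in RDF/XML, an rdf:ID value is resolved against the base URI as the fragment "#" + id, and a set can then detect a second use of the same resulting URI. The example below uses java.net.URI instead of RDF4J's IRI type, and the class and method names are hypothetical; it is an assumption about the general technique, not a copy of buildURIFromID.

```java
import java.net.URI;
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration class; not part of RDF4J.
final class RdfIdSketch {
    private final URI base;
    private final Set<URI> usedIDs = new HashSet<>();

    RdfIdSketch(String baseURI) {
        this.base = URI.create(baseURI);
    }

    /**
     * Resolves an rdf:ID value against the base URI.
     * Returns the resulting URI, or null if the same ID was already used in this document.
     */
    URI register(String id) {
        URI uri = base.resolve("#" + id);
        return usedIDs.add(uri) ? uri : null; // Set.add returns false for duplicates
    }

    public static void main(String[] args) {
        RdfIdSketch doc = new RdfIdSketch("http://foo/bar");
        System.out.println(doc.register("item1"));
        System.out.println(doc.register("item1")); // duplicate within the same document
    }
}
```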
-
-
Constructor Detail
-
RDFXMLParser
public RDFXMLParser()
Creates a new RDFXMLParser that will use a SimpleValueFactory to create RDF model objects.
-
RDFXMLParser
public RDFXMLParser(ValueFactory valueFactory)
Creates a new RDFXMLParser that will use the supplied ValueFactory to create RDF model objects.
- Parameters:
valueFactory - A ValueFactory.
-
-
Method Detail
-
getRDFFormat
public final RDFFormat getRDFFormat()
Description copied from interface: RDFParser
Gets the RDF format that this parser can parse.
- Specified by:
getRDFFormat in interface RDFParser
-
setParseStandAloneDocuments
public void setParseStandAloneDocuments(boolean standAloneDocs)
Sets the parser in a mode to parse stand-alone RDF documents. In stand-alone RDF documents, the enclosing rdf:RDF root element is optional if this root element contains just one element (e.g. rdf:Description).
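To make the two document shapes concrete, the sketch below checks whether the root element of a document is rdf:RDF or some other element such as rdf:Description. It uses only the JDK's DOM API and hypothetical names, and it says nothing about how RDFXMLParser itself makes this decision.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical illustration class; not part of RDF4J.
final class StandAloneSketch {
    static final String RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";

    /** Returns true if the document's root element is rdf:RDF. */
    static boolean hasRdfRoot(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Element root = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
            return RDF_NS.equals(root.getNamespaceURI()) && "RDF".equals(root.getLocalName());
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String wrapped = "<rdf:RDF xmlns:rdf='" + RDF_NS + "'>"
                + "<rdf:Description rdf:about='http://example.org/s'/></rdf:RDF>";
        // Same content with the optional rdf:RDF wrapper left out:
        String bare = "<rdf:Description xmlns:rdf='" + RDF_NS + "' rdf:about='http://example.org/s'/>";
        System.out.println(hasRdfRoot(wrapped));
        System.out.println(hasRdfRoot(bare));
    }
}
```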
-
getParseStandAloneDocuments
public boolean getParseStandAloneDocuments()
Returns whether the parser is currently in a mode to parse stand-alone RDF documents.
- See Also:
setParseStandAloneDocuments(boolean)
-
parse
public void parse(java.io.InputStream in, java.lang.String baseURI) throws java.io.IOException, RDFParseException, RDFHandlerException
Description copied from interface: RDFParser
Parses the data from the supplied InputStream, using the supplied baseURI to resolve any relative URI references.
- Specified by:
parse in interface RDFParser
- Parameters:
in - The InputStream from which to read the data.
baseURI - The URI associated with the data in the InputStream. May be null. Parsers for syntax formats that do not support relative URIs will ignore this argument.
Note that if the data contains an embedded base URI, that embedded base URI will overrule the value supplied here (see RFC 3986 section 5.1 for details).
- Throws:
java.io.IOException - If an I/O error occurred while data was read from the InputStream.
RDFParseException - If the parser has found an unrecoverable parse error.
RDFHandlerException - If the configured statement handler has encountered an unrecoverable error.
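The role of the baseURI argument, and the rule that an embedded base URI overrules it, can be sketched with java.net.URI. This is a simplified JDK-only model with hypothetical names: it assumes the embedded base (for example an xml:base attribute) is itself an absolute URI, and java.net.URI may differ from RFC 3986 in corner cases that do not arise in these simple inputs.

```java
import java.net.URI;

// Hypothetical illustration class; not part of RDF4J.
final class BaseUriSketch {

    /**
     * Resolves a URI reference found in the data.
     * An embedded base URI, if present, takes precedence over the supplied one.
     */
    static String resolve(String suppliedBase, String embeddedBase, String reference) {
        String effectiveBase = (embeddedBase != null) ? embeddedBase : suppliedBase;
        return URI.create(effectiveBase).resolve(reference).toString();
    }

    public static void main(String[] args) {
        // Only the supplied base is available:
        System.out.println(resolve("http://foo/bar", null, "baz"));
        // The document embeds its own base, which overrules the supplied one:
        System.out.println(resolve("http://foo/bar", "http://example.org/dir/", "baz"));
    }
}
```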
-
parse
public void parse(java.io.Reader reader, java.lang.String baseURI) throws java.io.IOException, RDFParseException, RDFHandlerException
Description copied from interface: RDFParser
Parses the data from the supplied Reader, using the supplied baseURI to resolve any relative URI references.
- Specified by:
parse in interface RDFParser
- Parameters:
reader - The Reader from which to read the data.
baseURI - The URI associated with the data in the Reader. May be null. Parsers for syntax formats that do not support relative URIs will ignore this argument.
Note that if the data contains an embedded base URI, that embedded base URI will overrule the value supplied here (see RFC 3986 section 5.1 for details).
- Throws:
java.io.IOException - If an I/O error occurred while data was read from the Reader.
RDFParseException - If the parser has found an unrecoverable parse error.
RDFHandlerException - If the configured statement handler has encountered an unrecoverable error.
-
parse
private void parse(org.xml.sax.InputSource inputSource) throws java.io.IOException, RDFParseException, RDFHandlerException
- Throws:
java.io.IOException
RDFParseException
RDFHandlerException
-
getSupportedSettings
public java.util.Collection<RioSetting<?>> getSupportedSettings()
- Specified by:
getSupportedSettings in interface RDFParser
- Overrides:
getSupportedSettings in class AbstractRDFParser
- Returns:
A collection of RioSettings that are supported by this RDFParser.
-
getSAXResult
public javax.xml.transform.sax.SAXResult getSAXResult(java.lang.String baseURI)
-
startDocument
void startDocument() throws RDFParseException, RDFHandlerException
- Throws:
RDFParseException
RDFHandlerException
-
endDocument
void endDocument() throws RDFParseException, RDFHandlerException
- Throws:
RDFParseException
RDFHandlerException
-
setBaseURI
protected void setBaseURI(ParsedIRI baseURI)
Description copied from class: AbstractRDFParser
Sets the base URI for resolving relative URIs.
- Overrides:
setBaseURI in class AbstractRDFParser
-
setBaseURI
protected void setBaseURI(java.lang.String baseURI)
Description copied from class: AbstractRDFParser
Parses the supplied URI-string and sets it as the base URI for resolving relative URIs.
- Overrides:
setBaseURI in class AbstractRDFParser
-
setXMLLang
void setXMLLang(java.lang.String xmlLang)
-
startElement
void startElement(java.lang.String namespaceURI, java.lang.String localName, java.lang.String qName, Atts atts) throws RDFParseException, RDFHandlerException
- Throws:
RDFParseException
RDFHandlerException
-
endElement
void endElement(java.lang.String namespaceURI, java.lang.String localName, java.lang.String qName) throws RDFParseException, RDFHandlerException
- Throws:
RDFParseException
RDFHandlerException
-
emptyElement
void emptyElement(java.lang.String namespaceURI, java.lang.String localName, java.lang.String qName, Atts atts) throws RDFParseException, RDFHandlerException
- Throws:
RDFParseException
RDFHandlerException
-
text
void text(java.lang.String text) throws RDFParseException, RDFHandlerException
- Throws:
RDFParseException
RDFHandlerException
-
processNodeElt
private void processNodeElt(java.lang.String namespaceURI, java.lang.String localName, java.lang.String qName, Atts atts, boolean isEmptyElt) throws RDFParseException, RDFHandlerException
- Throws:
RDFParseException
RDFHandlerException
-
getNodeResource
private Resource getNodeResource(Atts atts) throws RDFParseException
Retrieves the resource of a node element (subject or object) using relevant attributes (rdf:ID, rdf:about and rdf:nodeID) from its attributes list.
- Returns:
a resource or a bNode.
- Throws:
RDFParseException
-
processSubjectAtts
private void processSubjectAtts(RDFXMLParser.NodeElement nodeElt, Atts atts) throws RDFParseException, RDFHandlerException
Processes subject attributes.
- Throws:
RDFParseException
RDFHandlerException
-
processPropertyElt
private void processPropertyElt(java.lang.String namespaceURI, java.lang.String localName, java.lang.String qName, Atts atts, boolean isEmptyElt) throws RDFParseException, RDFHandlerException
- Throws:
RDFParseException
RDFHandlerException
-
getPropertyResource
private Resource getPropertyResource(Atts atts) throws RDFParseException
Retrieves the object resource of a property element using relevant attributes (rdf:resource and rdf:nodeID) from its attributes list.
- Returns:
a resource or a bNode.
- Throws:
RDFParseException
-
handleReification
private void handleReification(Value value) throws RDFParseException, RDFHandlerException
- Throws:
RDFParseException
RDFHandlerException
-
reifyStatement
private void reifyStatement(Resource reifNode, Resource subj, IRI pred, Value obj) throws RDFParseException, RDFHandlerException
- Throws:
RDFParseException
RDFHandlerException
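This page gives no description for reifyStatement. Under the standard RDF reification vocabulary, reifying a statement (subj, pred, obj) with a node reifNode corresponds to four statements about reifNode; the sketch below shows that quad under the assumption that the method follows the standard vocabulary. It represents triples as plain string arrays and uses hypothetical names rather than RDF4J's Resource, IRI and Value types.

```java
import java.util.List;

// Hypothetical illustration class; not part of RDF4J.
final class ReificationSketch {
    static final String RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";

    /** Returns the four {subject, predicate, object} triples of the standard reification quad. */
    static List<String[]> reify(String reifNode, String subj, String pred, String obj) {
        return List.of(
                new String[] { reifNode, RDF_NS + "type", RDF_NS + "Statement" },
                new String[] { reifNode, RDF_NS + "subject", subj },
                new String[] { reifNode, RDF_NS + "predicate", pred },
                new String[] { reifNode, RDF_NS + "object", obj });
    }

    public static void main(String[] args) {
        for (String[] t : reify("http://foo/bar#stmt1", "http://example.org/s",
                "http://example.org/p", "http://example.org/o")) {
            System.out.println(String.join(" ", t));
        }
    }
}
```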
-
buildResourceFromLocalName
private IRI buildResourceFromLocalName(java.lang.String localName) throws RDFParseException
Builds a Resource from a non-qualified local name.
- Throws:
RDFParseException
-
buildURIFromID
private IRI buildURIFromID(java.lang.String id) throws RDFParseException
Builds a Resource from the value of an rdf:ID attribute.
- Throws:
RDFParseException
-
createNode
protected Resource createNode(java.lang.String nodeID) throws RDFParseException
Description copied from class: AbstractRDFParser
- Overrides:
createNode in class AbstractRDFParser
- Parameters:
nodeID - node identifier
- Returns:
blank node or skolem IRI
- Throws:
RDFParseException
-
peekStack
private java.lang.Object peekStack(int distFromTop)
-
topIsProperty
private boolean topIsProperty()
-
checkNodeEltName
private void checkNodeEltName(java.lang.String namespaceURI, java.lang.String localName, java.lang.String qName) throws RDFParseException
Checks whether the node element name is from the RDF namespace and, if so, if it is allowed to be used in a node element. If the name is equal to one of the disallowed names (RDF, ID, about, parseType, resource, nodeID, datatype and li), an error is generated. If the name is not defined in the RDF namespace, but it claims that it is from this namespace, a warning is generated.
- Throws:
RDFParseException
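The three outcomes described for checkNodeEltName (error, warning, or nothing) can be sketched as a small classification function. The disallowed names are taken from the description above; the set of names treated as "defined in the RDF namespace" is an illustrative subset chosen for this example, not RDF4J's actual list, and all class and method names are hypothetical.

```java
import java.util.Set;

// Hypothetical illustration class; not part of RDF4J.
final class NodeNameCheckSketch {
    enum Result { OK, WARNING, ERROR }

    static final String RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";

    // Names that may not be used as node element names (from the description above).
    static final Set<String> DISALLOWED =
            Set.of("RDF", "ID", "about", "parseType", "resource", "nodeID", "datatype", "li");

    // Assumption: an illustrative subset of names that the RDF namespace defines.
    static final Set<String> KNOWN =
            Set.of("Description", "Bag", "Seq", "Alt", "Statement", "Property", "List", "type");

    static Result check(String namespaceURI, String localName) {
        if (!RDF_NS.equals(namespaceURI)) {
            return Result.OK; // names outside the RDF namespace are not checked here
        }
        if (DISALLOWED.contains(localName)) {
            return Result.ERROR;
        }
        if (KNOWN.contains(localName) || localName.matches("_[1-9][0-9]*")) {
            return Result.OK;
        }
        return Result.WARNING; // claims to be in the RDF namespace but is not a known name
    }

    public static void main(String[] args) {
        System.out.println(check(RDF_NS, "Description"));
        System.out.println(check(RDF_NS, "li"));
        System.out.println(check(RDF_NS, "foo"));
    }
}
```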
-
checkPropertyEltName
private void checkPropertyEltName(java.lang.String namespaceURI, java.lang.String localName, java.lang.String qName, RioSetting<java.lang.Boolean> setting) throws RDFParseException
Checks whether the property element name is from the RDF namespace and, if so, if it is allowed to be used in a property element. If the name is equal to one of the disallowed names (RDF, ID, about, parseType, resource and li), an error is generated. If the name is not defined in the RDF namespace, but it claims that it is from this namespace, a warning is generated.
- Parameters:
setting -
- Throws:
RDFParseException
-
checkRDFAtts
private void checkRDFAtts(Atts atts) throws RDFParseException
Checks whether 'atts' contains attributes from the RDF namespace that are not allowed as attributes. If such an attribute is found, an error is generated and the attribute is removed from 'atts'. If the attribute is not defined in the RDF namespace, but it claims that it is from this namespace, a warning is generated.
- Throws:
RDFParseException
-
checkNoMoreAtts
private void checkNoMoreAtts(Atts atts) throws RDFParseException
Checks whether 'atts' is empty. If this is not the case, a warning is generated for each attribute that is still present.
- Throws:
RDFParseException
-
reportStatement
private void reportStatement(Resource subject, IRI predicate, Value object) throws RDFParseException, RDFHandlerException
Reports a statement to the configured RDFHandler.
- Parameters:
subject - The statement's subject.
predicate - The statement's predicate.
object - The statement's object.
- Throws:
RDFHandlerException - If the configured RDFHandler throws an RDFHandlerException.
RDFParseException
-
createLiteral
protected Literal createLiteral(java.lang.String label, java.lang.String lang, IRI datatype) throws RDFParseException
Description copied from class: AbstractRDFParser
Creates a Literal object with the supplied parameters.
- Overrides:
createLiteral in class AbstractRDFParser
- Throws:
RDFParseException
-
reportWarning
protected void reportWarning(java.lang.String msg)
Overrides AbstractRDFParser.reportWarning(String), adding line- and column number information to the warning.
- Overrides:
reportWarning in class AbstractRDFParser
-
reportError
protected void reportError(java.lang.String msg, RioSetting<java.lang.Boolean> setting) throws RDFParseException
Overrides AbstractRDFParser.reportError(String, RioSetting), adding line- and column number information to the error.
- Overrides:
reportError in class AbstractRDFParser
- Parameters:
msg - The message to use for ParseErrorListener.error(String, long, long) and for RDFParseException(String, long, long).
setting - The boolean setting that will be checked to determine whether this issue needs to be reported at all. If this setting is true, then the error listener will receive the error, and if ParserConfig.isNonFatalError(RioSetting) returns false an exception will be thrown.
- Throws:
RDFParseException - If RioConfig.get(RioSetting) returns true, and ParserConfig.isNonFatalError(RioSetting) returns false for the given setting.
-
reportError
protected void reportError(java.lang.Exception e, RioSetting<java.lang.Boolean> setting) throws RDFParseException
Overrides AbstractRDFParser.reportError(String, RioSetting), adding line- and column number information to the error.
- Overrides:
reportError in class AbstractRDFParser
- Parameters:
e - The exception whose message will be used for ParseErrorListener.error(String, long, long) and for RDFParseException(String, long, long).
setting - The boolean setting that will be checked to determine whether this issue needs to be reported at all. If this setting is true, then the error listener will receive the error, and if ParserConfig.isNonFatalError(RioSetting) returns false an exception will be thrown.
- Throws:
RDFParseException - If RioConfig.get(RioSetting) returns true, and ParserConfig.isNonFatalError(RioSetting) returns false for the given setting.
-
reportFatalError
protected void reportFatalError(java.lang.String msg) throws RDFParseException
Overrides AbstractRDFParser.reportFatalError(String), adding line- and column number information to the error.
- Overrides:
reportFatalError in class AbstractRDFParser
- Throws:
RDFParseException
-
reportFatalError
protected void reportFatalError(java.lang.Exception e) throws RDFParseException
Overrides AbstractRDFParser.reportFatalError(Exception), adding line- and column number information to the error.
- Overrides:
reportFatalError in class AbstractRDFParser
- Throws:
RDFParseException
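The report methods above add line and column information to messages. In a SAX-based parser, that position information typically comes from an org.xml.sax.Locator supplied by the XML parser. The sketch below records the line number at which each element starts, using only the JDK and hypothetical names; how RDFXMLParser obtains its position internally is not documented on this page, so this is an illustration of the general SAX mechanism only.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.Locator;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration class; not part of RDF4J.
final class LocatorSketch {

    /** Maps each element name to the line number reported when its start tag is seen. */
    static Map<String, Integer> elementLines(String xml) {
        Map<String, Integer> lines = new LinkedHashMap<>();
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(
                    new InputSource(new StringReader(xml)), new DefaultHandler() {
                        private Locator locator;

                        @Override
                        public void setDocumentLocator(Locator locator) {
                            this.locator = locator; // supplied by the XML parser before parsing starts
                        }

                        @Override
                        public void startElement(String uri, String localName, String qName, Attributes atts) {
                            int line = (locator != null) ? locator.getLineNumber() : -1;
                            lines.put(qName, line);
                        }
                    });
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return lines;
    }

    public static void main(String[] args) {
        System.out.println(elementLines("<a>\n<b/>\n</a>"));
    }
}
```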
-
warning
public void warning(org.xml.sax.SAXParseException exception) throws org.xml.sax.SAXException
Implementation of SAX ErrorHandler.warning
- Specified by:
warning in interface org.xml.sax.ErrorHandler
- Throws:
org.xml.sax.SAXException
-
error
public void error(org.xml.sax.SAXParseException exception) throws org.xml.sax.SAXException
Implementation of SAX ErrorHandler.error
- Specified by:
error in interface org.xml.sax.ErrorHandler
- Throws:
org.xml.sax.SAXException
-
fatalError
public void fatalError(org.xml.sax.SAXParseException exception) throws org.xml.sax.SAXException
Implementation of SAX ErrorHandler.fatalError
- Specified by:
fatalError in interface org.xml.sax.ErrorHandler
- Throws:
org.xml.sax.SAXException
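For readers unfamiliar with the SAX ErrorHandler contract that warning, error and fatalError implement, the sketch below registers a handler on a JDK XMLReader and records which callback fires for a malformed document. The names are hypothetical and no RDF4J classes are involved; RDFXMLParser plays the role of the handler in its own parsing pipeline.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;

// Hypothetical illustration class; not part of RDF4J.
final class ErrorHandlerSketch {

    /** Parses the document and returns the names of the ErrorHandler callbacks that were invoked. */
    static List<String> collectProblems(String xml) {
        List<String> problems = new ArrayList<>();
        ErrorHandler handler = new ErrorHandler() {
            @Override
            public void warning(SAXParseException e) {
                problems.add("warning");
            }

            @Override
            public void error(SAXParseException e) {
                problems.add("error");
            }

            @Override
            public void fatalError(SAXParseException e) {
                problems.add("fatalError");
            }
        };
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setErrorHandler(handler);
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (SAXException e) {
            // The parser may abort after a fatal error; the handler has already recorded it.
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return problems;
    }

    public static void main(String[] args) {
        System.out.println(collectProblems("<a/>"));       // well-formed document
        System.out.println(collectProblems("<a><b></a>")); // mismatched tags
    }
}
```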
-
-