Package org.eclipse.rdf4j.rio.rdfxml
Class RDFXMLParser
- java.lang.Object
- org.eclipse.rdf4j.rio.helpers.AbstractRDFParser
- org.eclipse.rdf4j.rio.helpers.XMLReaderBasedParser
- org.eclipse.rdf4j.rio.rdfxml.RDFXMLParser
- All Implemented Interfaces:
RDFParser, org.xml.sax.ErrorHandler
public class RDFXMLParser extends XMLReaderBasedParser implements org.xml.sax.ErrorHandler
A parser for XML-serialized RDF. This parser operates directly on the SAX events generated by a SAX-enabled XML parser. The XML parser should be compliant with SAX2. You should specify which SAX parser to use by setting the org.xml.sax.driver property. This parser is not thread-safe, therefore its public methods are synchronized.
To parse a document using this parser:
- Create an instance of RDFXMLParser, optionally supplying it with your own ValueFactory.
- Set the RDFHandler.
- Optionally, set the ParseErrorListener and/or ParseLocationListener.
- Optionally, specify whether the parser should verify the data it parses and whether it should stop immediately when it finds an error in the data (both default to true).
- Call the parse method.
    // Use the SAX2-compliant Xerces parser:
    System.setProperty("org.xml.sax.driver", "org.apache.xerces.parsers.SAXParser");

    RDFParser parser = new RDFXMLParser();
    parser.setRDFHandler(myRDFHandler);
    parser.setParseErrorListener(myParseErrorListener);
    parser.setVerifyData(true);
    parser.stopAtFirstError(false);

    // Parse the data from inputStream, resolving any
    // relative URIs against http://foo/bar:
    parser.parse(inputStream, "http://foo/bar");

Note that JAXP entity expansion limits may apply. Check the documentation on limits and using the jaxp.properties file if you get one of the following errors:
    JAXP00010001: The parser has encountered more than "64000" entity expansions in this document
    JAXP00010004: The accumulated size of entities is ... that exceeded the "50,000,000" limit
As a work-around, try passing -Djdk.xml.totalEntitySizeLimit=0 -DentityExpansionLimit=0 to the JVM.
- See Also:
ValueFactory, RDFHandler, ParseErrorListener, ParseLocationListener
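The work-around above uses JVM system properties, so the same two values can in principle be set from code. The following is a minimal sketch, not part of RDF4J: the class and method names are hypothetical, and it assumes the properties are set before the XML parser is created. A value of 0 removes the corresponding limit.

```java
// Hypothetical helper (not part of RDF4J): programmatic equivalent of
// -Djdk.xml.totalEntitySizeLimit=0 -DentityExpansionLimit=0
final class EntityLimitWorkaround {

    private EntityLimitWorkaround() {
    }

    /**
     * Sets the two system properties named in the work-around to "0".
     * Assumption: this runs before the XML parser is instantiated.
     */
    static void relaxEntityLimits() {
        System.setProperty("jdk.xml.totalEntitySizeLimit", "0");
        System.setProperty("entityExpansionLimit", "0");
    }
}
```

Call EntityLimitWorkaround.relaxEntityLimits() once at start-up, before the first call to parse.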
-
-
Nested Class Summary
Nested Classes
Modifier and Type - Class
(package private) static class - RDFXMLParser.NodeElement
(package private) static class - RDFXMLParser.PropertyElement
-
Field Summary
Fields
Modifier and Type Field - Description
private java.lang.String documentURI - The base URI of the document.
private java.util.Stack<java.lang.Object> elementStack - A stack of node- and property elements.
private SAXFilter saxFilter - A filter filtering calls to SAX methods specifically for this parser.
private java.util.Set<IRI> usedIDs - A set containing URIs that have been generated as a result of rdf:ID attributes.
private java.lang.String xmlLang - The language of literal values as can be specified using xml:lang attributes.
Fields inherited from class org.eclipse.rdf4j.rio.helpers.AbstractRDFParser
rdfHandler, valueFactory
-
-
Constructor Summary
Constructors
Constructor - Description
RDFXMLParser() - Creates a new RDFXMLParser that will use a SimpleValueFactory to create RDF model objects.
RDFXMLParser(ValueFactory valueFactory) - Creates a new RDFXMLParser that will use the supplied ValueFactory to create RDF model objects.
-
Method Summary
Methods
Modifier and Type Method - Description
private IRI buildResourceFromLocalName(java.lang.String localName) - Builds a Resource from a non-qualified localname.
private IRI buildURIFromID(java.lang.String id) - Builds a Resource from the value of an rdf:ID attribute.
private void checkNodeEltName(java.lang.String namespaceURI, java.lang.String localName, java.lang.String qName) - Checks whether the node element name is from the RDF namespace and, if so, if it is allowed to be used in a node element.
private void checkNoMoreAtts(Atts atts) - Checks whether 'atts' is empty.
private void checkPropertyEltName(java.lang.String namespaceURI, java.lang.String localName, java.lang.String qName, RioSetting<java.lang.Boolean> setting) - Checks whether the property element name is from the RDF namespace and, if so, if it is allowed to be used in a property element.
private void checkRDFAtts(Atts atts) - Checks whether 'atts' contains attributes from the RDF namespace that are not allowed as attributes.
protected Literal createLiteral(java.lang.String label, java.lang.String lang, IRI datatype) - Creates a Literal object with the supplied parameters.
protected Resource createNode(java.lang.String nodeID)
(package private) void emptyElement(java.lang.String namespaceURI, java.lang.String localName, java.lang.String qName, Atts atts)
(package private) void endDocument()
(package private) void endElement(java.lang.String namespaceURI, java.lang.String localName, java.lang.String qName)
void error(org.xml.sax.SAXParseException exception) - Implementation of SAX ErrorHandler.error
void fatalError(org.xml.sax.SAXParseException exception) - Implementation of SAX ErrorHandler.fatalError
private Resource getNodeResource(Atts atts) - Retrieves the resource of a node element (subject or object) using relevant attributes (rdf:ID, rdf:about and rdf:nodeID) from its attributes list.
boolean getParseStandAloneDocuments() - Returns whether the parser is currently in a mode to parse stand-alone RDF documents.
private Resource getPropertyResource(Atts atts) - Retrieves the object resource of a property element using relevant attributes (rdf:resource and rdf:nodeID) from its attributes list.
RDFFormat getRDFFormat() - Gets the RDF format that this parser can parse.
javax.xml.transform.sax.SAXResult getSAXResult(java.lang.String baseURI)
java.util.Collection<RioSetting<?>> getSupportedSettings()
private void handleReification(Value value)
void parse(java.io.InputStream in, java.lang.String baseURI) - Parses the data from the supplied InputStream, using the supplied baseURI to resolve any relative URI references.
void parse(java.io.Reader reader, java.lang.String baseURI) - Parses the data from the supplied Reader, using the supplied baseURI to resolve any relative URI references.
private void parse(org.xml.sax.InputSource inputSource)
private java.lang.Object peekStack(int distFromTop)
private void processNodeElt(java.lang.String namespaceURI, java.lang.String localName, java.lang.String qName, Atts atts, boolean isEmptyElt)
private void processPropertyElt(java.lang.String namespaceURI, java.lang.String localName, java.lang.String qName, Atts atts, boolean isEmptyElt)
private void processSubjectAtts(RDFXMLParser.NodeElement nodeElt, Atts atts) - Processes subject attributes.
private void reifyStatement(Resource reifNode, Resource subj, IRI pred, Value obj)
protected void reportError(java.lang.Exception e, RioSetting<java.lang.Boolean> setting) - Overrides AbstractRDFParser.reportError(Exception, RioSetting), adding line- and column number information to the error.
protected void reportError(java.lang.String msg, RioSetting<java.lang.Boolean> setting) - Overrides AbstractRDFParser.reportError(String, RioSetting), adding line- and column number information to the error.
protected void reportFatalError(java.lang.Exception e) - Overrides AbstractRDFParser.reportFatalError(Exception), adding line- and column number information to the error.
protected void reportFatalError(java.lang.String msg) - Overrides AbstractRDFParser.reportFatalError(String), adding line- and column number information to the error.
private void reportStatement(Resource subject, IRI predicate, Value object) - Reports a statement to the configured RDFHandler.
protected void reportWarning(java.lang.String msg) - Overrides AbstractRDFParser.reportWarning(String), adding line- and column number information to the warning.
protected void setBaseURI(java.lang.String baseURI) - Parses the supplied URI-string and sets it as the base URI for resolving relative URIs.
protected void setBaseURI(ParsedIRI baseURI) - Sets the base URI for resolving relative URIs.
void setParseStandAloneDocuments(boolean standAloneDocs) - Sets the parser in a mode to parse stand-alone RDF documents.
(package private) void setXMLLang(java.lang.String xmlLang)
(package private) void startDocument()
(package private) void startElement(java.lang.String namespaceURI, java.lang.String localName, java.lang.String qName, Atts atts)
(package private) void text(java.lang.String text)
private boolean topIsProperty()
void warning(org.xml.sax.SAXParseException exception) - Implementation of SAX ErrorHandler.warning
Methods inherited from class org.eclipse.rdf4j.rio.helpers.XMLReaderBasedParser
getCompulsoryXmlFeatureSettings, getCompulsoryXmlPropertySettings, getOptionalXmlFeatureSettings, getOptionalXmlPropertySettings, getXMLReader
-
Methods inherited from class org.eclipse.rdf4j.rio.helpers.AbstractRDFParser
clear, clearBNodeIDMap, createBNode, createBNode, createLiteral, createNode, createStatement, createStatement, createURI, getNamespace, getParseErrorListener, getParseLocationListener, getParserConfig, getRDFHandler, initializeNamespaceTableFromConfiguration, preserveBNodeIDs, reportError, reportError, reportError, reportFatalError, reportFatalError, reportFatalError, reportLocation, reportWarning, resolveURI, set, setNamespace, setParseErrorListener, setParseLocationListener, setParserConfig, setPreserveBNodeIDs, setRDFHandler, setValueFactory
-
-
-
-
Field Detail
-
saxFilter
private final SAXFilter saxFilter
A filter filtering calls to SAX methods specifically for this parser.
-
documentURI
private java.lang.String documentURI
The base URI of the document. This variable is set when parse(inputStream, baseURI) is called and will not be changed during parsing.
-
xmlLang
private java.lang.String xmlLang
The language of literal values as can be specified using xml:lang attributes. This variable is set/modified by the SAXFilter during parsing such that it always represents the language of the context in which elements are reported.
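The scoping rule described for xmlLang (an element inherits the language of its enclosing element unless it declares its own xml:lang) can be illustrated with a plain SAX handler. This is a sketch under assumptions: XmlLangTracker and its members are hypothetical names, it uses only JDK classes, and it does not reproduce RDF4J's SAXFilter. The empty string stands for "no language".

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration of xml:lang scoping; not RDF4J's SAXFilter.
final class XmlLangTracker extends DefaultHandler {

    // Language in effect for each currently open element ("" = none).
    private final Deque<String> langStack = new ArrayDeque<>();

    // Element name -> language in effect when that element was opened.
    final Map<String, String> langByElement = new LinkedHashMap<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        String inherited = langStack.isEmpty() ? "" : langStack.peek();
        String declared = atts.getValue("xml:lang");
        String current = declared != null ? declared : inherited;
        langStack.push(current);
        langByElement.put(qName, current);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        // Leaving the element restores the language of the enclosing context.
        langStack.pop();
    }

    /** Parses the XML string and returns the language seen for each element. */
    static Map<String, String> track(String xml) {
        XmlLangTracker tracker = new XmlLangTracker();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), tracker);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return tracker.langByElement;
    }
}
```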
-
elementStack
private final java.util.Stack<java.lang.Object> elementStack
A stack of node- and property elements.
-
usedIDs
private final java.util.Set<IRI> usedIDs
A set containing URIs that have been generated as a result of rdf:ID attributes. These URIs should be unique within a single document.
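The uniqueness requirement can be sketched with a set that rejects repeated entries. In RDF/XML, an rdf:ID value is resolved as the fragment "#" + id against the base URI. The class below is a hypothetical illustration built on java.net.URI, not the parser's implementation; in particular, throwing IllegalArgumentException on a duplicate is an assumption made for the example, and the real parser reports such problems through its own error reporting methods.

```java
import java.net.URI;
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration of rdf:ID handling; not RDF4J code.
final class RdfIdRegistry {

    private final URI base;

    // URIs already generated from rdf:ID attributes in this document.
    private final Set<URI> usedIds = new HashSet<>();

    RdfIdRegistry(String baseUri) {
        this.base = URI.create(baseUri);
    }

    /**
     * Resolves "#" + id against the base URI and records the result.
     * Throws if the same URI was already generated (assumed behaviour).
     */
    URI buildUriFromId(String id) {
        URI uri = base.resolve("#" + id);
        if (!usedIds.add(uri)) {
            throw new IllegalArgumentException("ID '" + id + "' has already been defined");
        }
        return uri;
    }
}
```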
-
-
Constructor Detail
-
RDFXMLParser
public RDFXMLParser()
Creates a new RDFXMLParser that will use a SimpleValueFactory to create RDF model objects.
-
RDFXMLParser
public RDFXMLParser(ValueFactory valueFactory)
Creates a new RDFXMLParser that will use the supplied ValueFactory to create RDF model objects.
- Parameters:
valueFactory - A ValueFactory.
-
-
Method Detail
-
getRDFFormat
public final RDFFormat getRDFFormat()
Description copied from interface: RDFParser
Gets the RDF format that this parser can parse.
- Specified by:
getRDFFormat in interface RDFParser
-
setParseStandAloneDocuments
public void setParseStandAloneDocuments(boolean standAloneDocs)
Sets the parser in a mode to parse stand-alone RDF documents. In stand-alone RDF documents, the enclosing rdf:RDF root element is optional if this root element contains just one element (e.g. rdf:Description).
-
getParseStandAloneDocuments
public boolean getParseStandAloneDocuments()
Returns whether the parser is currently in a mode to parse stand-alone RDF documents.
- See Also:
setParseStandAloneDocuments(boolean)
-
parse
public void parse(java.io.InputStream in, java.lang.String baseURI) throws java.io.IOException, RDFParseException, RDFHandlerException
Description copied from interface: RDFParser
Parses the data from the supplied InputStream, using the supplied baseURI to resolve any relative URI references.
- Specified by:
parse in interface RDFParser
- Parameters:
in - The InputStream from which to read the data.
baseURI - The URI associated with the data in the InputStream. May be null. Parsers for syntax formats that do not support relative URIs will ignore this argument. Note that if the data contains an embedded base URI, that embedded base URI will overrule the value supplied here (see RFC 3986 section 5.1 for details).
- Throws:
java.io.IOException - If an I/O error occurred while data was read from the InputStream.
RDFParseException - If the parser has found an unrecoverable parse error.
RDFHandlerException - If the configured statement handler has encountered an unrecoverable error.
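The role of the baseURI argument can be sketched with java.net.URI. This is an illustration only: BaseUriSketch and its methods are hypothetical names, the parser itself works with ParsedIRI rather than java.net.URI, and the sketch assumes a non-null supplied base. It shows two things stated above: relative references are resolved against the base, and an embedded base URI takes precedence over the supplied one.

```java
import java.net.URI;

// Hypothetical illustration of base URI handling; not RDF4J code.
final class BaseUriSketch {

    private BaseUriSketch() {
    }

    /**
     * Returns the base to use for a document: the embedded base if present
     * (itself resolved against the supplied base), else the supplied base.
     */
    static URI effectiveBase(String suppliedBase, String embeddedBase) {
        URI supplied = URI.create(suppliedBase);
        return embeddedBase == null ? supplied : supplied.resolve(embeddedBase);
    }

    /** Resolves a (possibly relative) reference against the given base. */
    static String resolve(URI base, String reference) {
        return base.resolve(reference).toString();
    }
}
```

With the supplied base http://foo/bar and no embedded base, the reference "baz" resolves to http://foo/baz and "#me" resolves to http://foo/bar#me.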
-
parse
public void parse(java.io.Reader reader, java.lang.String baseURI) throws java.io.IOException, RDFParseException, RDFHandlerException
Description copied from interface: RDFParser
Parses the data from the supplied Reader, using the supplied baseURI to resolve any relative URI references.
- Specified by:
parse in interface RDFParser
- Parameters:
reader - The Reader from which to read the data.
baseURI - The URI associated with the data in the Reader. May be null. Parsers for syntax formats that do not support relative URIs will ignore this argument. Note that if the data contains an embedded base URI, that embedded base URI will overrule the value supplied here (see RFC 3986 section 5.1 for details).
- Throws:
java.io.IOException - If an I/O error occurred while data was read from the Reader.
RDFParseException - If the parser has found an unrecoverable parse error.
RDFHandlerException - If the configured statement handler has encountered an unrecoverable error.
-
parse
private void parse(org.xml.sax.InputSource inputSource) throws java.io.IOException, RDFParseException, RDFHandlerException
- Throws:
java.io.IOException
RDFParseException
RDFHandlerException
-
getSupportedSettings
public java.util.Collection<RioSetting<?>> getSupportedSettings()
- Specified by:
getSupportedSettings in interface RDFParser
- Overrides:
getSupportedSettings in class AbstractRDFParser
- Returns:
- A collection of RioSettings that are supported by this RDFParser.
-
getSAXResult
public javax.xml.transform.sax.SAXResult getSAXResult(java.lang.String baseURI)
-
startDocument
void startDocument() throws RDFParseException, RDFHandlerException
- Throws:
RDFParseException
RDFHandlerException
-
endDocument
void endDocument() throws RDFParseException, RDFHandlerException
- Throws:
RDFParseException
RDFHandlerException
-
setBaseURI
protected void setBaseURI(ParsedIRI baseURI)
Description copied from class: AbstractRDFParser
Sets the base URI for resolving relative URIs.
- Overrides:
setBaseURI in class AbstractRDFParser
-
setBaseURI
protected void setBaseURI(java.lang.String baseURI)
Description copied from class: AbstractRDFParser
Parses the supplied URI-string and sets it as the base URI for resolving relative URIs.
- Overrides:
setBaseURI in class AbstractRDFParser
-
setXMLLang
void setXMLLang(java.lang.String xmlLang)
-
startElement
void startElement(java.lang.String namespaceURI, java.lang.String localName, java.lang.String qName, Atts atts) throws RDFParseException, RDFHandlerException
- Throws:
RDFParseException
RDFHandlerException
-
endElement
void endElement(java.lang.String namespaceURI, java.lang.String localName, java.lang.String qName) throws RDFParseException, RDFHandlerException
- Throws:
RDFParseException
RDFHandlerException
-
emptyElement
void emptyElement(java.lang.String namespaceURI, java.lang.String localName, java.lang.String qName, Atts atts) throws RDFParseException, RDFHandlerException
- Throws:
RDFParseException
RDFHandlerException
-
text
void text(java.lang.String text) throws RDFParseException, RDFHandlerException
- Throws:
RDFParseException
RDFHandlerException
-
processNodeElt
private void processNodeElt(java.lang.String namespaceURI, java.lang.String localName, java.lang.String qName, Atts atts, boolean isEmptyElt) throws RDFParseException, RDFHandlerException
- Throws:
RDFParseException
RDFHandlerException
-
getNodeResource
private Resource getNodeResource(Atts atts) throws RDFParseException
Retrieves the resource of a node element (subject or object) using relevant attributes (rdf:ID, rdf:about and rdf:nodeID) from its attributes list.
- Returns:
- a resource or a bNode.
- Throws:
RDFParseException
-
processSubjectAtts
private void processSubjectAtts(RDFXMLParser.NodeElement nodeElt, Atts atts) throws RDFParseException, RDFHandlerException
Processes subject attributes.
- Throws:
RDFParseException
RDFHandlerException
-
processPropertyElt
private void processPropertyElt(java.lang.String namespaceURI, java.lang.String localName, java.lang.String qName, Atts atts, boolean isEmptyElt) throws RDFParseException, RDFHandlerException
- Throws:
RDFParseException
RDFHandlerException
-
getPropertyResource
private Resource getPropertyResource(Atts atts) throws RDFParseException
Retrieves the object resource of a property element using relevant attributes (rdf:resource and rdf:nodeID) from its attributes list.
- Returns:
- a resource or a bNode.
- Throws:
RDFParseException
-
handleReification
private void handleReification(Value value) throws RDFParseException, RDFHandlerException
- Throws:
RDFParseException
RDFHandlerException
-
reifyStatement
private void reifyStatement(Resource reifNode, Resource subj, IRI pred, Value obj) throws RDFParseException, RDFHandlerException
- Throws:
RDFParseException
RDFHandlerException
-
buildResourceFromLocalName
private IRI buildResourceFromLocalName(java.lang.String localName) throws RDFParseException
Builds a Resource from a non-qualified localname.
- Throws:
RDFParseException
-
buildURIFromID
private IRI buildURIFromID(java.lang.String id) throws RDFParseException
Builds a Resource from the value of an rdf:ID attribute.
- Throws:
RDFParseException
-
createNode
protected Resource createNode(java.lang.String nodeID) throws RDFParseException
Description copied from class: AbstractRDFParser
- Overrides:
createNode in class AbstractRDFParser
- Parameters:
nodeID - node identifier
- Returns:
- blank node or skolem IRI
- Throws:
RDFParseException
-
peekStack
private java.lang.Object peekStack(int distFromTop)
-
topIsProperty
private boolean topIsProperty()
-
checkNodeEltName
private void checkNodeEltName(java.lang.String namespaceURI, java.lang.String localName, java.lang.String qName) throws RDFParseException
Checks whether the node element name is from the RDF namespace and, if so, if it is allowed to be used in a node element. If the name is equal to one of the disallowed names (RDF, ID, about, parseType, resource, nodeID, datatype and li), an error is generated. If the name is not defined in the RDF namespace, but it claims that it is from this namespace, a warning is generated.
- Throws:
RDFParseException
-
checkPropertyEltName
private void checkPropertyEltName(java.lang.String namespaceURI, java.lang.String localName, java.lang.String qName, RioSetting<java.lang.Boolean> setting) throws RDFParseException
Checks whether the property element name is from the RDF namespace and, if so, if it is allowed to be used in a property element. If the name is equal to one of the disallowed names (RDF, ID, about, parseType, resource and li), an error is generated. If the name is not defined in the RDF namespace, but it claims that it is from this namespace, a warning is generated.
- Parameters:
setting -
- Throws:
RDFParseException
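The two name checks described above amount to a set lookup that only applies to names in the RDF namespace. The sketch below is a hypothetical illustration, not the parser's code: the class and method names are invented, the disallowed lists are copied from the two method descriptions, and the checks return a boolean where the real methods report an error or a warning.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration of the element name checks; not RDF4J code.
final class RdfNameChecks {

    static final String RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";

    // Disallowed lists as given in the checkNodeEltName and
    // checkPropertyEltName descriptions.
    private static final Set<String> DISALLOWED_NODE_NAMES = new HashSet<>(Arrays.asList(
            "RDF", "ID", "about", "parseType", "resource", "nodeID", "datatype", "li"));
    private static final Set<String> DISALLOWED_PROPERTY_NAMES = new HashSet<>(Arrays.asList(
            "RDF", "ID", "about", "parseType", "resource", "li"));

    private RdfNameChecks() {
    }

    /** True unless the name is an RDF-namespace name disallowed for node elements. */
    static boolean isAllowedNodeElementName(String namespaceURI, String localName) {
        return !RDF_NS.equals(namespaceURI) || !DISALLOWED_NODE_NAMES.contains(localName);
    }

    /** True unless the name is an RDF-namespace name disallowed for property elements. */
    static boolean isAllowedPropertyElementName(String namespaceURI, String localName) {
        return !RDF_NS.equals(namespaceURI) || !DISALLOWED_PROPERTY_NAMES.contains(localName);
    }
}
```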
-
checkRDFAtts
private void checkRDFAtts(Atts atts) throws RDFParseException
Checks whether 'atts' contains attributes from the RDF namespace that are not allowed as attributes. If such an attribute is found, an error is generated and the attribute is removed from 'atts'. If the attribute is not defined in the RDF namespace, but it claims that it is from this namespace, a warning is generated.
- Throws:
RDFParseException
-
checkNoMoreAtts
private void checkNoMoreAtts(Atts atts) throws RDFParseException
Checks whether 'atts' is empty. If this is not the case, a warning is generated for each attribute that is still present.
- Throws:
RDFParseException
-
reportStatement
private void reportStatement(Resource subject, IRI predicate, Value object) throws RDFParseException, RDFHandlerException
Reports a statement to the configured RDFHandler.
- Parameters:
subject - The statement's subject.
predicate - The statement's predicate.
object - The statement's object.
- Throws:
RDFHandlerException - If the configured RDFHandler throws an RDFHandlerException.
RDFParseException
-
createLiteral
protected Literal createLiteral(java.lang.String label, java.lang.String lang, IRI datatype) throws RDFParseException
Description copied from class: AbstractRDFParser
Creates a Literal object with the supplied parameters.
- Overrides:
createLiteral in class AbstractRDFParser
- Throws:
RDFParseException
-
reportWarning
protected void reportWarning(java.lang.String msg)
Overrides AbstractRDFParser.reportWarning(String), adding line- and column number information to the warning.
- Overrides:
reportWarning in class AbstractRDFParser
-
reportError
protected void reportError(java.lang.String msg, RioSetting<java.lang.Boolean> setting) throws RDFParseException
Overrides AbstractRDFParser.reportError(String, RioSetting), adding line- and column number information to the error.
- Overrides:
reportError in class AbstractRDFParser
- Parameters:
msg - The message to use for ParseErrorListener.error(String, long, long) and for RDFParseException(String, long, long).
setting - The boolean setting that determines whether this issue is reported at all. If this setting is true, the error listener will receive the error, and if ParserConfig.isNonFatalError(RioSetting) returns false an exception will be thrown.
- Throws:
RDFParseException - If RioConfig.get(RioSetting) returns true, and ParserConfig.isNonFatalError(RioSetting) returns false for the given setting.
-
reportError
protected void reportError(java.lang.Exception e, RioSetting<java.lang.Boolean> setting) throws RDFParseException
Overrides AbstractRDFParser.reportError(Exception, RioSetting), adding line- and column number information to the error.
- Overrides:
reportError in class AbstractRDFParser
- Parameters:
e - The exception whose message will be used for ParseErrorListener.error(String, long, long) and for RDFParseException(String, long, long).
setting - The boolean setting that determines whether this issue is reported at all. If this setting is true, the error listener will receive the error, and if ParserConfig.isNonFatalError(RioSetting) returns false an exception will be thrown.
- Throws:
RDFParseException - If RioConfig.get(RioSetting) returns true, and ParserConfig.isNonFatalError(RioSetting) returns false for the given setting.
-
reportFatalError
protected void reportFatalError(java.lang.String msg) throws RDFParseException
Overrides AbstractRDFParser.reportFatalError(String), adding line- and column number information to the error.
- Overrides:
reportFatalError in class AbstractRDFParser
- Throws:
RDFParseException
-
reportFatalError
protected void reportFatalError(java.lang.Exception e) throws RDFParseException
Overrides AbstractRDFParser.reportFatalError(Exception), adding line- and column number information to the error.
- Overrides:
reportFatalError in class AbstractRDFParser
- Throws:
RDFParseException
-
warning
public void warning(org.xml.sax.SAXParseException exception) throws org.xml.sax.SAXException
Implementation of SAX ErrorHandler.warning
- Specified by:
warning in interface org.xml.sax.ErrorHandler
- Throws:
org.xml.sax.SAXException
-
error
public void error(org.xml.sax.SAXParseException exception) throws org.xml.sax.SAXException
Implementation of SAX ErrorHandler.error
- Specified by:
error in interface org.xml.sax.ErrorHandler
- Throws:
org.xml.sax.SAXException
-
fatalError
public void fatalError(org.xml.sax.SAXParseException exception) throws org.xml.sax.SAXException
Implementation of SAX ErrorHandler.fatalError
- Specified by:
fatalError in interface org.xml.sax.ErrorHandler
- Throws:
org.xml.sax.SAXException
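The three ErrorHandler callbacks receive a SAXParseException that carries the line and column at which the XML parser detected the problem. The following sketch shows that mechanism with JDK classes only. CollectingErrorHandler is a hypothetical name, and the way it records messages is an assumption made for illustration; it does not show how RDFXMLParser forwards these events to its ParseErrorListener.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;

// Hypothetical illustration of a SAX ErrorHandler; not RDF4J code.
final class CollectingErrorHandler implements ErrorHandler {

    // One entry per reported problem: "level [line:column] message".
    final List<String> messages = new ArrayList<>();

    private void record(String level, SAXParseException e) {
        messages.add(level + " [" + e.getLineNumber() + ":" + e.getColumnNumber() + "] "
                + e.getMessage());
    }

    @Override
    public void warning(SAXParseException e) {
        record("warning", e);
    }

    @Override
    public void error(SAXParseException e) {
        record("error", e);
    }

    @Override
    public void fatalError(SAXParseException e) throws SAXException {
        record("fatal", e);
        throw e; // a fatal error ends the parse
    }

    /** Parses the XML string and returns the problems that were reported. */
    static List<String> parseAndCollect(String xml) {
        CollectingErrorHandler handler = new CollectingErrorHandler();
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setErrorHandler(handler);
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (SAXException e) {
            // Parse problems were already recorded by the handler.
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return handler.messages;
    }
}
```

Parsing a well-formed document such as "<a/>" yields an empty list, while a document with a mismatched end tag yields an entry that starts with "fatal".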
-
-