Interface RDFParser
All Known Implementing Classes:
AbstractRDFParser, JsonLdParser, RDF4JParser

public interface RDFParser

Parse an RDF source into a target (e.g. a Graph/Dataset).

Experimental
This interface (and its implementations) should be considered at risk; they might change or be removed in the next minor update of Commons RDF. It may move to the org.apache.commons.rdf.api package when it has stabilized.

Description
This interface follows the Builder pattern, allowing callers to set parser settings like contentType(RDFSyntax) and base(IRI). A caller MUST call one of the source methods (e.g. source(IRI), source(Path), source(InputStream)), and MUST call one of the target methods (e.g. target(Consumer), target(Dataset), target(Graph)) before calling parse() on the returned RDFParser; the methods can, however, be called in any order.

The call to parse() returns a Future, allowing asynchronous parse operations. Callers are advised to call Future.get() to ensure parsing completed successfully, or to catch exceptions thrown during parsing.

Setting a method that has already been set will override any existing value in the returned builder, regardless of the parameter type (e.g. source(IRI) will override a previous source(Path)). Settings can be unset by passing null; note that this may require casting, e.g. contentType( (RDFSyntax) null ) to undo a previous call to contentType(RDFSyntax).

It is undefined whether an RDFParser is mutable or thread-safe, so callers should always use the returned modified RDFParser from the builder methods. The builder may return itself after modification, or a cloned builder with the modified settings applied. Implementations are however encouraged to be immutable and thread-safe, and to document this. As an example starting point, see org.apache.commons.rdf.simple.AbstractRDFParser.

Example usage:
    Graph g1 = rDFTermFactory.createGraph();
    new ExampleRDFParserBuilder()
        .source(Paths.get("/tmp/graph.ttl"))
        .contentType(RDFSyntax.TURTLE)
        .target(g1)
        .parse()
        .get(30, TimeUnit.SECONDS);
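The advice to always use the returned builder can be illustrated with a small stand-alone sketch. MiniParserBuilder below is a hypothetical stand-in written only with the JDK; it is not a Commons RDF class, and it assumes the immutable style that implementations are encouraged (but not required) to follow.

```java
import java.util.Optional;

// Hypothetical stand-in for an immutable RDFParser-style builder; not part of Commons RDF.
final class MiniParserBuilder {
    private final String source;      // null means "not set"
    private final String contentType; // null means "not set"

    MiniParserBuilder() {
        this(null, null);
    }

    private MiniParserBuilder(String source, String contentType) {
        this.source = source;
        this.contentType = contentType;
    }

    // Each setter returns a modified copy; the receiver is left untouched.
    MiniParserBuilder source(String newSource) {
        return new MiniParserBuilder(newSource, contentType);
    }

    MiniParserBuilder contentType(String newContentType) {
        return new MiniParserBuilder(source, newContentType);
    }

    Optional<String> source() {
        return Optional.ofNullable(source);
    }

    Optional<String> contentType() {
        return Optional.ofNullable(contentType);
    }

    public static void main(String[] args) {
        MiniParserBuilder empty = new MiniParserBuilder();
        MiniParserBuilder configured = empty.source("/tmp/graph.ttl").contentType("text/turtle");

        // The original builder did not change, so ignoring the return value loses the setting.
        System.out.println(empty.source().isPresent());
        System.out.println(configured.contentType().get());

        // Passing null unsets a previously configured value.
        System.out.println(configured.contentType(null).contentType().isPresent());
    }
}
```

Because the stand-in has only one contentType setter, null needs no cast here; the cast mentioned above is only needed when overloads make a bare null ambiguous.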
Nested Class Summary
Nested Classes
Modifier and Type | Interface | Description
static interface | RDFParser.ParseResult | The result of parse() indicating parsing completed.
Method Summary
Modifier and Type | Method | Description
RDFParser | base(java.lang.String base) | Specify a base IRI to use for parsing any relative IRI references.
RDFParser | base(IRI base) | Specify a base IRI to use for parsing any relative IRI references.
RDFParser | contentType(java.lang.String contentType) | Specify the content type of the RDF syntax to parse.
RDFParser | contentType(RDFSyntax rdfSyntax) | Specify the content type of the RDF syntax to parse.
java.util.concurrent.Future<? extends RDFParser.ParseResult> | parse() | Parse the specified source.
RDFParser | rdfTermFactory(RDF rdfTermFactory) |
RDFParser | source(java.io.InputStream inputStream) | Specify a source InputStream to parse.
RDFParser | source(java.lang.String iri) | Specify an absolute source IRI to retrieve and parse.
RDFParser | source(java.nio.file.Path file) | Specify a source file Path to parse.
RDFParser | source(IRI iri) | Specify an absolute source IRI to retrieve and parse.
RDFParser | target(java.util.function.Consumer<Quad> consumer) | Specify a consumer for parsed quads.
default RDFParser | target(Dataset dataset) | Specify a Dataset to add parsed quads to.
default RDFParser | target(Graph graph) | Specify a Graph to add parsed triples to.
Method Detail
-
rdfTermFactory
RDFParser rdfTermFactory(RDF rdfTermFactory)
Specify which RDF to use for generating RDFTerms.

This option may be used together with target(Graph) to override the implementation's default factory and graph.

Warning: Using the same RDF for multiple parse() calls may accidentally merge BlankNodes having the same label, as the parser may use the RDF.createBlankNode(String) method from the parsed blank node labels.

Parameters:
rdfTermFactory - RDF to use for generating RDFTerms.
Returns:
An RDFParser that will use the specified rdfTermFactory
See Also:
target(Graph)
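The blank node warning above can be illustrated with a stand-alone sketch. SaltedBlankNodes is a hypothetical factory, not the Commons RDF implementation; it simply assumes that a factory maps a label to an identifier using a per-instance salt, which is one way createBlankNode(String) could behave.

```java
import java.nio.charset.StandardCharsets;
import java.util.UUID;

// Hypothetical factory that derives blank node identifiers from a per-instance salt plus the label.
// It mirrors the idea behind the warning; it is not the Commons RDF implementation.
final class SaltedBlankNodes {
    private final String salt = UUID.randomUUID().toString();

    // The same factory and the same label always yield the same identifier.
    String createBlankNode(String label) {
        byte[] seed = (salt + ":" + label).getBytes(StandardCharsets.UTF_8);
        return "_:" + UUID.nameUUIDFromBytes(seed);
    }

    public static void main(String[] args) {
        SaltedBlankNodes shared = new SaltedBlankNodes();

        // Two documents that both use the label "b1", parsed with one shared factory,
        // end up with the same blank node.
        System.out.println(shared.createBlankNode("b1").equals(shared.createBlankNode("b1")));

        // A fresh factory per parse keeps the two "b1" nodes apart.
        System.out.println(shared.createBlankNode("b1").equals(new SaltedBlankNodes().createBlankNode("b1")));
    }
}
```

Under this assumption, supplying a new factory for each parse() call is one way to avoid unintended merging.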
-
contentType
RDFParser contentType(RDFSyntax rdfSyntax) throws java.lang.IllegalArgumentException
Specify the content type of the RDF syntax to parse.

This option can be used to select the RDFSyntax of the source, overriding any Content-Type headers or equivalent.

The character set of the RDFSyntax is assumed to be StandardCharsets.UTF_8 unless overridden within the document (e.g. <?xml version="1.0" encoding="iso-8859-1"?> in RDFSyntax.RDFXML).

This method will override any contentType set with contentType(String).

Parameters:
rdfSyntax - An RDFSyntax to parse the source according to, e.g. RDFSyntax.TURTLE.
Returns:
An RDFParser that will use the specified content type.
Throws:
java.lang.IllegalArgumentException - If this RDFParser does not support the specified RDFSyntax.
See Also:
contentType(String)
-
contentType
RDFParser contentType(java.lang.String contentType) throws java.lang.IllegalArgumentException
Specify the content type of the RDF syntax to parse.

This option can be used to select the RDFSyntax of the source, overriding any Content-Type headers or equivalent.

The content type MAY include a charset parameter if the RDF media types permit it; the default charset is StandardCharsets.UTF_8 unless overridden within the document.

This method will override any contentType set with contentType(RDFSyntax).

Parameters:
contentType - A content-type string, e.g. application/ld+json or text/turtle;charset="UTF-8" as specified by RFC7231.
Returns:
An RDFParser that will use the specified content type.
Throws:
java.lang.IllegalArgumentException - If the contentType has an invalid syntax, or this RDFParser does not support the specified contentType.
See Also:
contentType(RDFSyntax)
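How a content-type string with an optional charset parameter might be split can be sketched with the JDK alone. ContentTypes is a hypothetical helper and a simplified reading of the media type grammar (it ignores quoted strings that contain semicolons); it is not how any particular RDFParser is known to do it.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Hypothetical helper; a simplified reading of "type/subtype;charset=..." strings.
final class ContentTypes {
    // Returns the media type without parameters, lower-cased.
    static String mediaType(String contentType) {
        return contentType.split(";", 2)[0].trim().toLowerCase(Locale.ROOT);
    }

    // Returns the charset parameter if present, otherwise UTF-8 as the documented default.
    static Charset charset(String contentType) {
        String[] parts = contentType.split(";");
        for (int i = 1; i < parts.length; i++) {
            String[] keyValue = parts[i].trim().split("=", 2);
            if (keyValue.length == 2 && keyValue[0].trim().equalsIgnoreCase("charset")) {
                String name = keyValue[1].trim().replace("\"", "");
                // Charset.forName throws subclasses of IllegalArgumentException for bad names.
                return Charset.forName(name);
            }
        }
        return StandardCharsets.UTF_8;
    }

    public static void main(String[] args) {
        System.out.println(mediaType("text/turtle;charset=\"UTF-8\""));
        System.out.println(charset("application/rdf+xml; charset=iso-8859-1"));
        System.out.println(charset("application/ld+json"));
    }
}
```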
-
target
default RDFParser target(Graph graph)
Specify a Graph to add parsed triples to.

If the source supports datasets (e.g. the contentType(RDFSyntax) set has RDFSyntax.supportsDataset() returning true), then only quads in the default graph will be added to the Graph as Triples.

It is undefined whether any triples are added to the specified Graph if parse() throws any exceptions. (However, implementations are free to prevent this using transaction mechanisms or similar.) If Future.get() does not indicate an exception, the parser implementation SHOULD have inserted all parsed triples into the specified graph.

Calling this method will override any earlier targets set with target(Graph), target(Consumer) or target(Dataset).

The default implementation of this method calls target(Consumer) with a Consumer that does Graph.add(Triple) with Quad.asTriple() if the quad is in the default graph.
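The default-graph filtering described for the default implementation can be sketched as follows. MiniQuad and the list-backed "graph" are hypothetical, simplified stand-ins for Quad and Graph so that the example runs with the JDK alone.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.function.Consumer;

// Hypothetical, simplified stand-ins for Quad and Graph; not the Commons RDF types.
final class DefaultGraphFilter {
    static final class MiniQuad {
        final Optional<String> graphName; // empty means the default graph
        final String triple;              // subject, predicate and object kept as one string for brevity

        MiniQuad(Optional<String> graphName, String triple) {
            this.graphName = graphName;
            this.triple = triple;
        }
    }

    // Wraps a "graph" (here just a list of triples) in a quad consumer that skips named graphs.
    static Consumer<MiniQuad> intoGraph(List<String> graph) {
        return quad -> {
            if (!quad.graphName.isPresent()) {
                graph.add(quad.triple);
            }
        };
    }

    public static void main(String[] args) {
        List<String> graph = new ArrayList<>();
        Consumer<MiniQuad> target = intoGraph(graph);

        target.accept(new MiniQuad(Optional.empty(), "<s> <p> <o>"));
        target.accept(new MiniQuad(Optional.of("<g>"), "<s> <p> <o2>"));

        // Only the quad from the default graph was added.
        System.out.println(graph);
    }
}
```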
-
target
default RDFParser target(Dataset dataset)
Specify a Dataset to add parsed quads to.

It is undefined whether any quads are added to the specified Dataset if parse() throws any exceptions. (However, implementations are free to prevent this using transaction mechanisms or similar.) On the other hand, if parse() does not indicate an exception, the implementation SHOULD have inserted all parsed quads into the specified dataset.

Calling this method will override any earlier targets set with target(Graph), target(Consumer) or target(Dataset).

The default implementation of this method calls target(Consumer) with a Consumer that does Dataset.add(Quad).
-
target
RDFParser target(java.util.function.Consumer<Quad> consumer)
Specify a consumer for parsed quads.

The quads will include triples in all named graphs of the parsed source, including any triples in the default graph. When parsing a source format that does not support datasets, all quads delivered to the consumer will be in the default graph (e.g. their Quad.getGraphName() will be Optional.empty()).

It is undefined whether any quads are consumed if parse() throws any exceptions. On the other hand, if parse() does not indicate an exception, the implementation SHOULD have produced all parsed quads to the specified consumer.

Calling this method will override any earlier targets set with target(Graph), target(Consumer) or target(Dataset).

The consumer is not assumed to be thread safe; only one Consumer.accept(Object) is delivered at a time for a given parse() call.

This method is typically called with a functional consumer, for example:

    List<Quad> quads = new ArrayList<>();
    parserBuilder.target(quads::add).parse();
-
base
RDFParser base(IRI base)
Specify a base IRI to use for parsing any relative IRI references.

Setting this option will override any protocol-specific base IRI (e.g. Content-Location header) or the source(IRI) IRI, but does not override any base IRIs set within the source document (e.g. @base in Turtle documents).

If the source is in a syntax that does not support relative IRI references (e.g. RDFSyntax.NTRIPLES), setting the base has no effect.

This method will override any base IRI set with base(String).

Parameters:
base - An absolute IRI to use as a base.
Returns:
An RDFParser that will use the specified base IRI.
See Also:
base(String)
-
base
RDFParser base(java.lang.String base) throws java.lang.IllegalArgumentException
Specify a base IRI to use for parsing any relative IRI references.

Setting this option will override any protocol-specific base IRI (e.g. Content-Location header) or the source(IRI) IRI, but does not override any base IRIs set within the source document (e.g. @base in Turtle documents).

If the source is in a syntax that does not support relative IRI references (e.g. RDFSyntax.NTRIPLES), setting the base has no effect.

This method will override any base IRI set with base(IRI).
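The effect of a base IRI on relative references can be sketched with java.net.URI. This is an approximation: java.net.URI handles URIs rather than full IRIs, and BaseResolution and the example.com addresses are made up for illustration.

```java
import java.net.URI;

// Illustrative helper; java.net.URI is used as an approximation of IRI resolution.
final class BaseResolution {
    // Resolves a reference against a base, as a parser would for a syntax with relative IRIs.
    static String resolve(String base, String reference) {
        return URI.create(base).resolve(reference).toString();
    }

    public static void main(String[] args) {
        String base = "http://example.com/data/graph.ttl";

        // A fragment-only reference keeps the base document.
        System.out.println(resolve(base, "#me"));

        // A relative path replaces the last path segment of the base.
        System.out.println(resolve(base, "other.ttl"));

        // An absolute reference is unaffected by the base.
        System.out.println(resolve(base, "http://example.org/abs"));
    }
}
```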
-
source
RDFParser source(java.io.InputStream inputStream)
Specify a source InputStream to parse.

The source set will not be read before the call to parse().

The InputStream will not be closed after parsing. The InputStream does not need to support InputStream.markSupported().

The parser might not consume the complete stream (e.g. an RDF/XML parser may not read beyond the closing tag of </rdf:Description>).

The contentType(RDFSyntax) or contentType(String) SHOULD be set before calling parse().

The character set is assumed to be StandardCharsets.UTF_8 unless the contentType(String) specifies otherwise or the document declares its own charset (e.g. RDF/XML with a <?xml encoding="iso-8859-1"> header).

The base(IRI) or base(String) MUST be set before calling parse(), unless the RDF syntax does not permit relative IRIs (e.g. RDFSyntax.NTRIPLES).

This method will override any source set with source(IRI), source(Path) or source(String).

Parameters:
inputStream - An InputStream to consume
Returns:
An RDFParser that will use the specified source.
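Since the stream is not closed by the parser, closing it is left to the caller. The sketch below shows one way to do that with try-with-resources; countBytes is a hypothetical stand-in for a parser that reads but does not close, and TrackingStream exists only to make the close visible.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

final class CallerOwnsStream {
    // Hypothetical stand-in for a parser: reads the stream but deliberately does not close it.
    static int countBytes(InputStream in) throws IOException {
        int count = 0;
        while (in.read() != -1) {
            count++;
        }
        return count;
    }

    // Stream that records whether close() was called.
    static final class TrackingStream extends ByteArrayInputStream {
        boolean closed = false;

        TrackingStream(byte[] data) {
            super(data);
        }

        @Override
        public void close() throws IOException {
            closed = true;
            super.close();
        }
    }

    // Returns { closed right after "parsing", closed after the try block }.
    static boolean[] demo() {
        TrackingStream in = new TrackingStream("<a> <b> <c> .".getBytes(StandardCharsets.UTF_8));
        boolean closedAfterParse;
        try (InputStream owned = in) {
            countBytes(owned);
            closedAfterParse = in.closed;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new boolean[] { closedAfterParse, in.closed };
    }

    public static void main(String[] args) {
        boolean[] result = demo();
        System.out.println("closed by the parser stand-in: " + result[0]);
        System.out.println("closed by the caller: " + result[1]);
    }
}
```

With an asynchronous parser, the caller would presumably need to wait for the returned Future to complete before closing the stream.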
-
source
RDFParser source(java.nio.file.Path file)
Specify a source file Path to parse.

The source set will not be read before the call to parse().

The contentType(RDFSyntax) or contentType(String) SHOULD be set before calling parse().

The character set is assumed to be StandardCharsets.UTF_8 unless the contentType(String) specifies otherwise or the document declares its own charset (e.g. RDF/XML with a <?xml encoding="iso-8859-1"> header).

The base(IRI) or base(String) MAY be set before calling parse(), otherwise Path.toUri() will be used as the base IRI.

This method will override any source set with source(IRI), source(InputStream) or source(String).

Parameters:
file - A Path for a file to parse
Returns:
An RDFParser that will use the specified source.
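What the Path.toUri() fallback means for relative references can be sketched briefly. PathBase is a hypothetical helper; the exact URI printed depends on the platform and file system, so no output is shown.

```java
import java.net.URI;
import java.nio.file.Path;
import java.nio.file.Paths;

final class PathBase {
    // The base that would apply when neither base(IRI) nor base(String) has been set.
    static URI defaultBase(Path file) {
        return file.toUri();
    }

    public static void main(String[] args) {
        URI base = defaultBase(Paths.get("/tmp/graph.ttl"));

        // A file: URI identifying the source file.
        System.out.println(base);

        // Relative references in the document would resolve next to the source file.
        System.out.println(base.resolve("other.ttl"));
    }
}
```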
-
source
RDFParser source(IRI iri)
Specify an absolute source IRI to retrieve and parse.

The source set will not be read before the call to parse().

If this builder does not support the given IRI protocol (e.g. urn:uuid:ce667463-c5ab-4c23-9b64-701d055c4890), this method should succeed, while parse() should throw an IOException.

The contentType(RDFSyntax) or contentType(String) MAY be set before calling parse(), in which case that type MAY be used for content negotiation (e.g. Accept header in HTTP), and SHOULD be used for selecting the RDFSyntax.

The character set is assumed to be StandardCharsets.UTF_8 unless the protocol's equivalent of Content-Type specifies otherwise or the document declares its own charset (e.g. RDF/XML with a <?xml encoding="iso-8859-1"> header).

The base(IRI) or base(String) MAY be set before calling parse(), otherwise the source IRI will be used as the base IRI.

This method will override any source set with source(Path), source(InputStream) or source(String).

Parameters:
iri - An IRI to retrieve and parse
Returns:
An RDFParser that will use the specified source.
-
source
RDFParser source(java.lang.String iri) throws java.lang.IllegalArgumentException
Specify an absolute source IRI to retrieve and parse.

The source set will not be read before the call to parse().

If this builder does not support the given IRI (e.g. urn:uuid:ce667463-c5ab-4c23-9b64-701d055c4890), this method should succeed, while parse() should throw an IOException.

The contentType(RDFSyntax) or contentType(String) MAY be set before calling parse(), in which case that type MAY be used for content negotiation (e.g. Accept header in HTTP), and SHOULD be used for selecting the RDFSyntax.

The character set is assumed to be StandardCharsets.UTF_8 unless the protocol's equivalent of Content-Type specifies otherwise or the document declares its own charset (e.g. RDF/XML with a <?xml encoding="iso-8859-1"> header).

The base(IRI) or base(String) MAY be set before calling parse(), otherwise the source IRI will be used as the base IRI.

This method will override any source set with source(Path), source(InputStream) or source(IRI).

Parameters:
iri - An IRI to retrieve and parse
Returns:
An RDFParser that will use the specified source.
Throws:
java.lang.IllegalArgumentException - If the iri is not a valid absolute IRI string
-
parse
java.util.concurrent.Future<? extends RDFParser.ParseResult> parse() throws java.io.IOException, java.lang.IllegalStateException
Parse the specified source.

A source method (e.g. source(InputStream), source(IRI), source(Path), source(String) or an equivalent subclass method) MUST have been called before calling this method, otherwise an IllegalStateException will be thrown.

A target method (e.g. target(Consumer), target(Dataset), target(Graph) or an equivalent subclass method) MUST have been called before calling parse(), otherwise an IllegalStateException will be thrown.

It is undefined whether this method is thread-safe; however, the RDFParser may be reused (e.g. setting a different source) as soon as the Future has been returned from this method.

The RDFParser SHOULD perform the parsing as an asynchronous operation, and return the Future as soon as preliminary checks (such as validity of the source(IRI) and contentType(RDFSyntax) settings) have finished. The future SHOULD NOT mark Future.isDone() before parsing is complete. A synchronous implementation MAY block on the parse() call and return a Future that is already Future.isDone().

The returned Future contains an RDFParser.ParseResult. Implementations may subclass this interface to provide any parser details, e.g. list of warnings. null is a possible return value if no details are available, but parsing succeeded.

If an exception occurs during parsing (e.g. IOException or org.apache.commons.rdf.simple.experimental.RDFParseException), it should be indicated as the Throwable.getCause() in the ExecutionException thrown on Future.get().

Returns:
A Future that completes with an RDFParser.ParseResult (possibly null) when the parsing has finished.
Throws:
java.io.IOException - If an error occurred while starting to read the source (e.g. file not found, unsupported IRI protocol). Note that IO errors during parsing would instead be the Throwable.getCause() of the ExecutionException thrown on Future.get().
java.lang.IllegalStateException - If the builder is in an invalid state, e.g. a source has not been set.
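How a caller might inspect the outcome of the returned Future can be sketched with the JDK alone. fakeParse is a hypothetical stand-in that models the synchronous case (a Future that is already done) and uses a String in place of RDFParser.ParseResult; only the Future handling is the point of the example.

```java
import java.io.IOException;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class ParseOutcome {
    // Hypothetical stand-in for parse(): returns an already completed Future,
    // either with a result or with an IOException recorded as the failure.
    static Future<String> fakeParse(boolean fail) {
        CompletableFuture<String> future = new CompletableFuture<>();
        if (fail) {
            future.completeExceptionally(new IOException("connection reset"));
        } else {
            future.complete("parsed");
        }
        return future;
    }

    // Waits for the result and reports failures via the cause of the ExecutionException.
    static String describe(Future<String> future) {
        try {
            return "ok: " + future.get(30, TimeUnit.SECONDS);
        } catch (ExecutionException e) {
            return "failed: " + e.getCause().getClass().getSimpleName();
        } catch (TimeoutException e) {
            future.cancel(true);
            return "timed out";
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "interrupted";
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(fakeParse(false))); // prints "ok: parsed"
        System.out.println(describe(fakeParse(true)));  // prints "failed: IOException"
    }
}
```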