Class HDTParser
- All Implemented Interfaces:
RDFParser
The draft specification is not entirely clear and is probably slightly out of date: the open-source reference implementation, HDT-It, appears to implement a slightly different version. This parser aims to be compatible with HDT-It 1.0.
The most important parts are the Dictionaries, which contain the actual values (the S, P and O parts of a triple), and the Triples, which contain the numeric references used to construct the triples.
Since objects in one triple are often subjects in another triple, these "shared" parts are stored in a shared Dictionary, which may significantly reduce the file size.
File structure:
+---------------------+
| Global              |
| Header              |
| Dictionary (Shared) |
| Dictionary (S)      |
| Dictionary (P)      |
| Dictionary (O)      |
| Triples             |
+---------------------+
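The split into a shared Dictionary and separate S, P and O Dictionaries can be illustrated with a small, self-contained sketch. It is not the RDF4J or HDT-It implementation: the class `SharedDictionarySketch` and its methods are hypothetical, terms are plain strings, and the numbering (1-based IDs, shared terms first, then the subject-only or object-only terms) is an assumption made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration class; not part of RDF4J or HDT-It.
class SharedDictionarySketch {
    final List<String> shared = new ArrayList<>();       // terms used as subject AND object
    final List<String> subjectsOnly = new ArrayList<>(); // terms used only as subject
    final List<String> predicates = new ArrayList<>();   // all predicate terms
    final List<String> objectsOnly = new ArrayList<>();  // terms used only as object

    // Each triple is a String[3] of {subject, predicate, object}.
    SharedDictionarySketch(List<String[]> triples) {
        Set<String> s = new TreeSet<>();
        Set<String> p = new TreeSet<>();
        Set<String> o = new TreeSet<>();
        for (String[] t : triples) {
            s.add(t[0]);
            p.add(t[1]);
            o.add(t[2]);
        }
        for (String term : s) {
            if (o.contains(term)) {
                shared.add(term);        // stored once, referenced from both positions
            } else {
                subjectsOnly.add(term);
            }
        }
        predicates.addAll(p);
        for (String term : o) {
            if (!s.contains(term)) {
                objectsOnly.add(term);
            }
        }
    }

    // Assumed numbering: 1-based, shared terms first.
    // Only valid for terms that occurred in the input triples.
    int subjectId(String term) {
        int i = shared.indexOf(term);
        return i >= 0 ? i + 1 : shared.size() + subjectsOnly.indexOf(term) + 1;
    }

    int predicateId(String term) {
        return predicates.indexOf(term) + 1;
    }

    int objectId(String term) {
        int i = shared.indexOf(term);
        return i >= 0 ? i + 1 : shared.size() + objectsOnly.indexOf(term) + 1;
    }

    // Encode one triple as three numeric references into the dictionaries.
    int[] encode(String[] t) {
        return new int[] { subjectId(t[0]), predicateId(t[1]), objectId(t[2]) };
    }
}
```

In this sketch a term such as `ex:bob` that appears as the object of one triple and the subject of another is stored only in `shared`, which is where the potential size reduction comes from.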
-
Field Summary
Fields inherited from class org.eclipse.rdf4j.rio.helpers.AbstractRDFParser
rdfHandler, valueFactory
-
Constructor Summary
Constructors
- HDTParser(): Creates a new HDTParser that will use a SimpleValueFactory to create RDF model objects.
- HDTParser(ValueFactory valueFactory): Creates a new HDTParser that will use the supplied ValueFactory to create RDF model objects.
-
Method Summary
Modifier and Type, Method, Description
- private Value createObject(byte[] b): Create object (typed) literal, IRI or blank node.
- private IRI createPredicate(byte[] b): Create predicate IRI.
- private Resource createSubject(byte[] b): Create subject IRI or blank node.
- RDFFormat getRDFFormat(): Gets the RDF format that this parser can parse.
- private byte[] getSO(int pos, int size, HDTDictionarySection shared, HDTDictionarySection other): Get part of triple from shared HDT Dictionary or (if not found) from specific HDT Dictionary.
- Collection<RioSetting<?>> getSupportedSettings()
- private boolean isBNodeID(byte[] b)
- void parse(InputStream in, String baseURI): Parses the data from the supplied InputStream, using the supplied baseURI to resolve any relative URI references.
- void parse(Reader reader, String baseURI): Not supported, since HDT is a binary format.
Methods inherited from class org.eclipse.rdf4j.rio.helpers.AbstractRDFParser
clear, clearBNodeIDMap, createBNode, createBNode, createLiteral, createLiteral, createNode, createNode, createStatement, createStatement, createURI, getNamespace, getParseErrorListener, getParseLocationListener, getParserConfig, getRDFHandler, initializeNamespaceTableFromConfiguration, preserveBNodeIDs, reportError, reportError, reportError, reportError, reportError, reportFatalError, reportFatalError, reportFatalError, reportFatalError, reportFatalError, reportLocation, reportWarning, reportWarning, resolveURI, set, setBaseURI, setBaseURI, setNamespace, setParseErrorListener, setParseLocationListener, setParserConfig, setPreserveBNodeIDs, setRDFHandler, setValueFactory
-
Constructor Details
-
HDTParser
public HDTParser()
Creates a new HDTParser that will use a SimpleValueFactory to create RDF model objects.
-
HDTParser
public HDTParser(ValueFactory valueFactory)
Creates a new HDTParser that will use the supplied ValueFactory to create RDF model objects.
- Parameters:
valueFactory - A ValueFactory.
-
-
Method Details
-
getRDFFormat
public RDFFormat getRDFFormat()
Description copied from interface: RDFParser
Gets the RDF format that this parser can parse.
-
getSupportedSettings
public Collection<RioSetting<?>> getSupportedSettings()
- Specified by:
getSupportedSettings in interface RDFParser
- Overrides:
getSupportedSettings in class AbstractRDFParser
- Returns:
A collection of RioSettings that are supported by this RDFParser.
-
parse
public void parse(InputStream in, String baseURI) throws IOException, RDFParseException, RDFHandlerException
Description copied from interface: RDFParser
Parses the data from the supplied InputStream, using the supplied baseURI to resolve any relative URI references.
- Parameters:
in - The InputStream from which to read the data.
baseURI - The URI associated with the data in the InputStream. May be null. Parsers for syntax formats that do not support relative URIs will ignore this argument. Note that if the data contains an embedded base URI, that embedded base URI will overrule the value supplied here (see RFC 3986 section 5.1 for details).
- Throws:
IOException - If an I/O error occurred while data was read from the InputStream.
RDFParseException - If the parser has found an unrecoverable parse error.
RDFHandlerException - If the configured statement handler has encountered an unrecoverable error.
-
parse
public void parse(Reader reader, String baseURI) throws IOException, RDFParseException, RDFHandlerException
Not supported, since HDT is a binary format.
- Parameters:
reader - The Reader from which to read the data.
baseURI - The URI associated with the data in the Reader. May be null. Parsers for syntax formats that do not support relative URIs will ignore this argument. Note that if the data contains an embedded base URI, that embedded base URI will overrule the value supplied here (see RFC 3986 section 5.1 for details).
- Throws:
IOException - If an I/O error occurred while data was read from the Reader.
RDFParseException - If the parser has found an unrecoverable parse error.
RDFHandlerException - If the configured statement handler has encountered an unrecoverable error.
-
getSO
private byte[] getSO(int pos, int size, HDTDictionarySection shared, HDTDictionarySection other) throws IOException
Get part of triple from shared HDT Dictionary or (if not found) from specific HDT Dictionary.
- Parameters:
pos - position
size - size of shared Dictionary
shared - shared Dictionary
other - specific Dictionary
- Returns:
subject or object
- Throws:
IOException
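The lookup described for getSO can be sketched as follows. This is a minimal illustration and not the actual method body: the `Section` interface is a hypothetical stand-in for HDTDictionarySection, positions are assumed to be 1-based, and it is assumed that positions up to `size` fall in the shared Dictionary while larger positions are offset into the specific Dictionary.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration class; not part of RDF4J.
class DictionaryLookupSketch {

    // Hypothetical stand-in for HDTDictionarySection: returns the entry at a 1-based position.
    interface Section {
        byte[] get(int pos);
    }

    // Build an in-memory section from a list of terms (for demonstration only).
    static Section of(String... terms) {
        return pos -> terms[pos - 1].getBytes(StandardCharsets.UTF_8);
    }

    // Assumed rule: positions 1..size are in the shared section,
    // anything above is looked up in the specific section after subtracting size.
    static byte[] getSO(int pos, int size, Section shared, Section other) {
        return (pos <= size) ? shared.get(pos) : other.get(pos - size);
    }
}
```

With a shared section holding one term and a subject section holding one term, position 1 would resolve to the shared term and position 2 to the first subject-only term.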
-
isBNodeID
private boolean isBNodeID(byte[] b) -
createSubject
private Resource createSubject(byte[] b)
Create subject IRI or blank node.
- Parameters:
b - byte buffer
- Returns:
IRI or blank node
-
createPredicate
private IRI createPredicate(byte[] b)
Create predicate IRI.
- Parameters:
b - byte buffer
- Returns:
IRI
-
createObject
private Value createObject(byte[] b)
Create object (typed) literal, IRI or blank node.
- Parameters:
b - byte buffer
- Returns:
literal, IRI or blank node
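How a byte buffer might be classified as a literal, IRI or blank node can be sketched as below. This is an assumption-laden illustration, not the parser's actual logic: it assumes dictionary entries are UTF-8 bytes in an N-Triples-like form, where literals start with a double quote, blank nodes start with `_:`, and a typed literal ends in `^^<datatype>`. The class `ObjectTermSketch`, the enum `Kind` and both methods are hypothetical names.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration class; not part of RDF4J.
class ObjectTermSketch {

    enum Kind { LITERAL, BNODE, IRI }

    // Assumed conventions: '"' introduces a literal, "_:" a blank node, anything else is an IRI.
    static Kind classify(byte[] b) {
        if (b.length > 0 && b[0] == '"') {
            return Kind.LITERAL;
        }
        if (b.length > 1 && b[0] == '_' && b[1] == ':') {
            return Kind.BNODE;
        }
        return Kind.IRI;
    }

    // Returns the datatype IRI of a typed literal such as "42"^^<http://...>,
    // or null when the buffer is not a literal or carries no datatype suffix.
    static String datatypeOf(byte[] b) {
        if (classify(b) != Kind.LITERAL) {
            return null;
        }
        String s = new String(b, StandardCharsets.UTF_8);
        int i = s.lastIndexOf("\"^^<");
        if (i < 0 || !s.endsWith(">")) {
            return null;
        }
        return s.substring(i + 4, s.length() - 1);
    }
}
```

A subject would only need the IRI and blank node branches, and a predicate only the IRI branch, which matches the return types listed for createSubject and createPredicate.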
-