Package org.eclipse.rdf4j.model.util
Class URIUtil
java.lang.Object
org.eclipse.rdf4j.model.util.URIUtil
Utility functions for working with URIs.
-
Field Summary
Fields (Modifier and Type, Field, Description):
- private static final char[] LOCAL_ESCAPED_CHARS
- mark: Punctuation mark characters, which are part of the set of unreserved chars and therefore allowed to occur in unescaped form.
- reserved: Reserved characters: their usage within the URI component is limited to their reserved purpose.
- private static final Pattern unicodeControlCharPattern: Regular expression pattern for matching Unicode control characters.
-
Constructor Summary
Constructors: URIUtil()
-
Method Summary
Methods (Modifier and Type, Method, Description):
- private static String escapeExcludedChars(String unescaped): Escapes any character that is not either reserved or in the legal range of unreserved characters, according to RFC 2396.
- static int getLocalNameIndex(String uri): Finds the index of the first local name character in a (non-relative) URI.
- static boolean isCorrectURISplit(String namespace, String localName): Checks whether the URI consisting of the specified namespace and local name has been split correctly according to the URI splitting rules specified in URI.
- private static boolean isNameChar(int codePoint): Check if the supplied code point represents a valid name character.
- private static boolean isNameStartChar(int codePoint): Check if the supplied code point represents a valid name start character.
- private static boolean isPERCENT
- private static boolean isPLX_START(String name)
- private static boolean isPN_CHARS(int codePoint): Check if the supplied code point represents a valid prefixed name character.
- private static boolean isPN_CHARS_BASE(int codePoint): Check if the supplied code point represents a valid prefixed name base character.
- private static boolean isPN_CHARS_U(int codePoint): Check if the supplied code point represents either a valid prefixed name base character or an underscore.
- private static boolean isPN_LOCAL_ESC(String name)
- private static boolean isUnreserved(char c): A character is unreserved according to RFC 2396 if it is either an alphanumeric char or a punctuation mark.
- static boolean isValidLocalName(String name): Checks whether the specified name is allowed as the local name part of an IRI according to the SPARQL 1.1/Turtle 1.1 spec.
- static boolean isValidURIReference(String uriRef): Verifies that the supplied string is a valid RDF (1.0) URI reference, as defined in section 6.4 of the RDF Concepts and Abstract Syntax specification (RDF 1.0 Recommendation of February 10, 2004).
-
Field Details
-
reserved
Reserved characters: their usage within the URI component is limited to their reserved purpose. If the data for a URI component would conflict with the reserved purpose, then the conflicting data must be escaped before forming the URI. See http://www.isi.edu/in-notes/rfc2396.txt, section 2.2.
-
mark
Punctuation mark characters, which are part of the set of unreserved chars and therefore allowed to occur in unescaped form. See http://www.isi.edu/in-notes/rfc2396.txt
-
unicodeControlCharPattern
Regular expression pattern for matching Unicode control characters.
-
LOCAL_ESCAPED_CHARS
private static final char[] LOCAL_ESCAPED_CHARS
-
-
Constructor Details
-
URIUtil
public URIUtil()
-
-
Method Details
-
getLocalNameIndex
Finds the index of the first local name character in a (non-relative) URI. This index is determined by the following steps:
- Find the first occurrence of the '#' character.
- If this fails, find the last occurrence of the '/' character.
- If this fails, find the last occurrence of the ':' character.
- Add 1 to the found index and return this value.
If none of these separator characters is found, the method throws an IllegalArgumentException.
Parameters:
uri - A URI string.
Returns:
The index of the first local name character in the URI string. Note that this index does not reference an actual character if the algorithm determines that there is no local name. In that case, the returned index is equal to the length of the URI string.
Throws:
IllegalArgumentException - If the supplied URI string doesn't contain any of the separator characters. Every legal (non-relative) URI contains at least one ':' character to separate the scheme from the rest of the URI.
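The lookup steps listed for getLocalNameIndex can be sketched in plain Java as follows. This is a minimal illustration written from the description above, not the RDF4J source; the class name LocalNameIndexSketch and the example URIs are hypothetical.

```java
// Sketch of the documented steps; NOT the actual RDF4J implementation.
final class LocalNameIndexSketch {

    // Returns the index of the first local name character of the given URI string.
    static int getLocalNameIndex(String uri) {
        // Step 1: first occurrence of '#'.
        int separatorIdx = uri.indexOf('#');
        // Step 2: otherwise, last occurrence of '/'.
        if (separatorIdx < 0) {
            separatorIdx = uri.lastIndexOf('/');
        }
        // Step 3: otherwise, last occurrence of ':'.
        if (separatorIdx < 0) {
            separatorIdx = uri.lastIndexOf(':');
        }
        // No separator at all: the string cannot be a legal (non-relative) URI.
        if (separatorIdx < 0) {
            throw new IllegalArgumentException("No separator character found in URI: " + uri);
        }
        // Step 4: the local name starts right after the separator.
        return separatorIdx + 1;
    }

    public static void main(String[] args) {
        String uri = "http://example.org/ns#name"; // hypothetical example URI
        int idx = getLocalNameIndex(uri);
        System.out.println("namespace  = " + uri.substring(0, idx));
        System.out.println("local name = " + uri.substring(idx));
    }
}
```

When the separator is the last character of the string (for example "http://example.org/"), the returned index equals the string length and the local name is empty, which matches the note in the Returns section.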
-
isCorrectURISplit
Checks whether the URI consisting of the specified namespace and local name has been split correctly according to the URI splitting rules specified in URI.
Parameters:
namespace - The URI's namespace, must not be null.
localName - The URI's local name, must not be null.
Returns:
true if the specified URI has been correctly split into a namespace and local name, false otherwise.
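One way to picture such a check is sketched below. It assumes that the splitting rules referred to here are the same separator rules documented for getLocalNameIndex (first '#', else last '/', else last ':'); that equivalence is an assumption of this sketch, and the class and method names are hypothetical rather than RDF4J's own.

```java
// Sketch only: assumes the split rules match the getLocalNameIndex steps.
final class UriSplitSketch {

    // Index of the first local name character: first '#', else last '/', else last ':'.
    static int localNameIndex(String uri) {
        int idx = uri.indexOf('#');
        if (idx < 0) {
            idx = uri.lastIndexOf('/');
        }
        if (idx < 0) {
            idx = uri.lastIndexOf(':');
        }
        if (idx < 0) {
            throw new IllegalArgumentException("No separator character found in URI: " + uri);
        }
        return idx + 1;
    }

    // A split is treated as correct when the namespace ends exactly where the local name begins.
    static boolean isCorrectSplit(String namespace, String localName) {
        String uri = namespace + localName;
        try {
            return localNameIndex(uri) == namespace.length();
        } catch (IllegalArgumentException e) {
            return false; // no separator at all, so no valid split exists
        }
    }

    public static void main(String[] args) {
        System.out.println(isCorrectSplit("http://example.org/ns#", "name"));
        System.out.println(isCorrectSplit("http://example.org/", "ns#name"));
    }
}
```

Under these assumptions the first call reports a correct split and the second does not, because the '#' separator falls inside the supposed local name.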
-
isValidURIReference
Verifies that the supplied string is a valid RDF (1.0) URI reference, as defined in section 6.4 of the RDF Concepts and Abstract Syntax specification (RDF 1.0 Recommendation of February 10, 2004).
An RDF URI reference is valid if it is a Unicode string that:
- does not contain any control characters (#x00-#x1F, #x7F-#x9F)
- and would produce a valid URI character sequence (per RFC 2396, section 2.1) representing an absolute URI with optional fragment identifier when subjected to the encoding described below.
The encoding consists of:
- encoding the Unicode string as UTF-8, giving a sequence of octet values.
- %-escaping octets that do not correspond to permitted US-ASCII characters.
Parameters:
uriRef - a string representing an RDF URI reference.
Returns:
true iff the supplied string is a syntactically valid RDF URI reference, false otherwise.
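The two conditions and the encoding steps can be illustrated with the standard library alone. The sketch below is an approximation under stated assumptions, not the RDF4J implementation: the set of "permitted" ASCII characters is taken to be alphanumerics plus the RFC 2396 mark and reserved characters, with '#' and '%' also left unescaped (an assumption, so that fragments and existing escapes survive), and java.net.URI is used as the RFC 2396 parser. The class and method names are hypothetical.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.nio.charset.StandardCharsets;

// Sketch only; the PERMITTED set below is an assumption of this example.
final class UriReferenceSketch {

    // RFC 2396 "mark" and "reserved" characters, plus '#' and '%' (assumed permitted here).
    private static final String PERMITTED = "-_.!~*'()" + ";/?:@&=+$," + "#%";

    // Condition 1: no control characters in #x00-#x1F or #x7F-#x9F.
    static boolean containsControlChar(String s) {
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c <= 0x1F || (c >= 0x7F && c <= 0x9F)) {
                return true;
            }
        }
        return false;
    }

    // Encoding: UTF-8 encode, then %-escape every octet that is not a permitted ASCII character.
    static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            int octet = b & 0xFF;
            boolean alphanumeric = (octet >= 'a' && octet <= 'z')
                    || (octet >= 'A' && octet <= 'Z')
                    || (octet >= '0' && octet <= '9');
            if (alphanumeric || PERMITTED.indexOf(octet) >= 0) {
                sb.append((char) octet);
            } else {
                sb.append(String.format("%%%02X", octet));
            }
        }
        return sb.toString();
    }

    // Condition 2: the escaped form must parse as an absolute URI.
    static boolean isValidUriReference(String uriRef) {
        if (containsControlChar(uriRef)) {
            return false;
        }
        try {
            return new URI(escape(uriRef)).isAbsolute();
        } catch (URISyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(escape("http://example.org/r\u00e9sum\u00e9#frag"));
        System.out.println(isValidUriReference("http://example.org/r\u00e9sum\u00e9#frag"));
        System.out.println(isValidUriReference("relative/path"));
    }
}
```

Checking for control characters before escaping matters: once escaped, a control character would become an ordinary %-sequence and the resulting URI would parse successfully.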
-
escapeExcludedChars
Escapes any character that is not either reserved or in the legal range of unreserved characters, according to RFC 2396.
Parameters:
unescaped - a (relative or absolute) URI reference.
Returns:
a (relative or absolute) URI reference with all characters that cannot appear as-is in a URI %-escaped.
-
isUnreserved
private static boolean isUnreserved(char c)
A character is unreserved according to RFC 2396 if it is either an alphanumeric char or a punctuation mark.
-
isValidLocalName
Checks whether the specified name is allowed as the local name part of an IRI according to the SPARQL 1.1/Turtle 1.1 spec.
Parameters:
name - the candidate local name
Returns:
true if the name is a valid local name, false otherwise.
-
isPN_CHARS_U
private static boolean isPN_CHARS_U(int codePoint)
Check if the supplied code point represents either a valid prefixed name base character or an underscore.
From Turtle Spec: http://www.w3.org/TR/turtle/#grammar-production-PN_CHARS_U
[164s] PN_CHARS_U ::= PN_CHARS_BASE | '_'
-
isPLX_START
-
isPERCENT
-
isPN_LOCAL_ESC
-
isPN_CHARS_BASE
private static boolean isPN_CHARS_BASE(int codePoint)
Check if the supplied code point represents a valid prefixed name base character.
From Turtle Spec: http://www.w3.org/TR/turtle/#grammar-production-PN_CHARS_BASE
[163s] PN_CHARS_BASE ::= [A-Z] | [a-z] | [#x00C0-#x00D6] | [#x00D8-#x00F6] | [#x00F8-#x02FF] | [#x0370-#x037D] | [#x037F-#x1FFF] | [#x200C-#x200D] | [#x2070-#x218F] | [#x2C00-#x2FEF] | [#x3001-#xD7FF] | [#xF900-#xFDCF] | [#xFDF0-#xFFFD] | [#x10000-#xEFFFF]
-
isNameStartChar
private static boolean isNameStartChar(int codePoint)
Check if the supplied code point represents a valid name start character.
Parameters:
codePoint - a Unicode code point.
Returns:
true if the supplied code point represents a valid name start char, false otherwise.
-
isNameChar
private static boolean isNameChar(int codePoint)
Check if the supplied code point represents a valid name character.
Parameters:
codePoint - a Unicode code point.
Returns:
true if the supplied code point represents a valid name char, false otherwise.
-
isPN_CHARS
private static boolean isPN_CHARS(int codePoint)
Check if the supplied code point represents a valid prefixed name character.
From Turtle Spec: http://www.w3.org/TR/turtle/#grammar-production-PN_CHARS
[166s] PN_CHARS ::= PN_CHARS_U | '-' | [0-9] | #x00B7 | [#x0300-#x036F] | [#x203F-#x2040]
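The three Turtle productions quoted in this class (PN_CHARS_BASE, PN_CHARS_U, PN_CHARS) build on each other, which the following sketch makes explicit. It transcribes the grammar ranges into a lookup table; the class and method names are hypothetical and the code is not taken from RDF4J.

```java
// Sketch of the quoted Turtle productions; names are illustrative only.
final class TurtleNameCharsSketch {

    // Inclusive code point ranges of production [163s] PN_CHARS_BASE.
    private static final int[][] PN_CHARS_BASE_RANGES = {
        {'A', 'Z'}, {'a', 'z'},
        {0x00C0, 0x00D6}, {0x00D8, 0x00F6}, {0x00F8, 0x02FF},
        {0x0370, 0x037D}, {0x037F, 0x1FFF}, {0x200C, 0x200D},
        {0x2070, 0x218F}, {0x2C00, 0x2FEF}, {0x3001, 0xD7FF},
        {0xF900, 0xFDCF}, {0xFDF0, 0xFFFD}, {0x10000, 0xEFFFF}
    };

    // [163s] PN_CHARS_BASE
    static boolean isPnCharsBase(int codePoint) {
        for (int[] range : PN_CHARS_BASE_RANGES) {
            if (codePoint >= range[0] && codePoint <= range[1]) {
                return true;
            }
        }
        return false;
    }

    // [164s] PN_CHARS_U ::= PN_CHARS_BASE | '_'
    static boolean isPnCharsU(int codePoint) {
        return isPnCharsBase(codePoint) || codePoint == '_';
    }

    // [166s] PN_CHARS ::= PN_CHARS_U | '-' | [0-9] | #x00B7 | [#x0300-#x036F] | [#x203F-#x2040]
    static boolean isPnChars(int codePoint) {
        return isPnCharsU(codePoint)
                || codePoint == '-'
                || (codePoint >= '0' && codePoint <= '9')
                || codePoint == 0x00B7
                || (codePoint >= 0x0300 && codePoint <= 0x036F)
                || (codePoint >= 0x203F && codePoint <= 0x2040);
    }

    public static void main(String[] args) {
        // Iterate by code point so that supplementary characters are handled as single units.
        "caf\u00e9_7-x".codePoints().forEach(cp ->
                System.out.println(new String(Character.toChars(cp)) + " PN_CHARS? " + isPnChars(cp)));
    }
}
```

The methods take an int code point rather than a char because the last PN_CHARS_BASE range (#x10000-#xEFFFF) lies outside the Basic Multilingual Plane and cannot be represented by a single Java char.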
-