Package org.eclipse.rdf4j.model.util
Class URIUtil
- java.lang.Object
-
- org.eclipse.rdf4j.model.util.URIUtil
-
public class URIUtil extends java.lang.Object
Utility functions for working with URIs.
-
-
Field Summary
private static char[] LOCAL_ESCAPED_CHARS
private static java.util.Set<java.lang.Character> mark
Punctuation mark characters, which are part of the set of unreserved chars and therefore allowed to occur in unescaped form.
private static java.util.Set<java.lang.Character> reserved
Reserved characters: their usage within the URI component is limited to their reserved purpose.
private static java.util.regex.Pattern unicodeControlCharPattern
Regular expression pattern for matching unicode control characters.
-
Constructor Summary
URIUtil()
-
Method Summary
private static java.lang.String escapeExcludedChars(java.lang.String unescaped)
Escapes any character that is not either reserved or in the legal range of unreserved characters, according to RFC 2396.
static int getLocalNameIndex(java.lang.String uri)
Finds the index of the first local name character in a (non-relative) URI.
static boolean isCorrectURISplit(java.lang.String namespace, java.lang.String localName)
Checks whether the URI consisting of the specified namespace and local name has been split correctly according to the URI splitting rules specified in URI.
private static boolean isNameChar(int codePoint)
Check if the supplied code point represents a valid name character.
private static boolean isNameStartChar(int codePoint)
Check if the supplied code point represents a valid name start character.
private static boolean isPERCENT(java.lang.String name)
private static boolean isPLX_START(java.lang.String name)
private static boolean isPN_CHARS(int codePoint)
Check if the supplied code point represents a valid prefixed name character.
private static boolean isPN_CHARS_BASE(int codePoint)
Check if the supplied code point represents a valid prefixed name base character.
private static boolean isPN_CHARS_U(int codePoint)
Check if the supplied code point represents either a valid prefixed name base character or an underscore.
private static boolean isPN_LOCAL_ESC(java.lang.String name)
private static boolean isUnreserved(char c)
A character is unreserved according to RFC 2396 if it is either an alphanumeric char or a punctuation mark.
static boolean isValidLocalName(java.lang.String name)
Checks whether the specified name is allowed as the local name part of an IRI according to the SPARQL 1.1/Turtle 1.1 spec.
static boolean isValidURIReference(java.lang.String uriRef)
Verifies that the supplied string is a valid RDF (1.0) URI reference, as defined in section 6.4 of the RDF Concepts and Abstract Syntax specification (RDF 1.0 Recommendation of February 10, 2004).
-
-
-
Field Detail
-
reserved
private static final java.util.Set<java.lang.Character> reserved
Reserved characters: their usage within the URI component is limited to their reserved purpose. If the data for a URI component would conflict with the reserved purpose, then the conflicting data must be escaped before forming the URI. http://www.isi.edu/in-notes/rfc2396.txt section 2.2.
-
mark
private static final java.util.Set<java.lang.Character> mark
Punctuation mark characters, which are part of the set of unreserved chars and therefore allowed to occur in unescaped form. See http://www.isi.edu/in-notes/rfc2396.txt
-
unicodeControlCharPattern
private static final java.util.regex.Pattern unicodeControlCharPattern
Regular expression pattern for matching unicode control characters.
-
LOCAL_ESCAPED_CHARS
private static final char[] LOCAL_ESCAPED_CHARS
-
-
Method Detail
-
getLocalNameIndex
public static int getLocalNameIndex(java.lang.String uri)
Finds the index of the first local name character in a (non-relative) URI. This index is determined by the following steps:
- Find the first occurrence of the '#' character,
- If this fails, find the last occurrence of the '/' character,
- If this fails, find the last occurrence of the ':' character.
- Add 1 to the found index and return this value.
If none of these separator characters is found, the method throws an IllegalArgumentException.
- Parameters:
uri
- A URI string.
- Returns:
- The index of the first local name character in the URI string. Note that this index does not reference an actual character if the algorithm determines that there is no local name. In that case, the returned index is equal to the length of the URI string.
- Throws:
java.lang.IllegalArgumentException
- If the supplied URI string doesn't contain any of the separator characters. Every legal (non-relative) URI contains at least one ':' character to separate the scheme from the rest of the URI.
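The steps above can be sketched in a few lines of plain Java. This is a minimal illustration written from the description only, not the library source; the class name LocalNameIndexSketch and the method name localNameIndex are hypothetical.

```java
final class LocalNameIndexSketch {

    // Hypothetical re-implementation of the documented steps (an assumption, not the rdf4j code).
    static int localNameIndex(String uri) {
        int separatorIdx = uri.indexOf('#');          // step 1: first '#'
        if (separatorIdx < 0) {
            separatorIdx = uri.lastIndexOf('/');      // step 2: last '/'
        }
        if (separatorIdx < 0) {
            separatorIdx = uri.lastIndexOf(':');      // step 3: last ':'
        }
        if (separatorIdx < 0) {
            throw new IllegalArgumentException("No separator character found in URI: " + uri);
        }
        return separatorIdx + 1;                      // step 4: first character after the separator
    }

    public static void main(String[] args) {
        String uri = "http://example.org/vocab#name";
        int idx = localNameIndex(uri);
        System.out.println("namespace:  " + uri.substring(0, idx));
        System.out.println("local name: " + uri.substring(idx));
    }
}
```

For a URI that ends in a separator, such as "http://example.org/", the returned index equals the string length, which matches the note in the Returns description about an empty local name.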
-
isCorrectURISplit
public static boolean isCorrectURISplit(java.lang.String namespace, java.lang.String localName)
Checks whether the URI consisting of the specified namespace and local name has been split correctly according to the URI splitting rules specified in URI.
- Parameters:
namespace
- The URI's namespace, must not be null.
localName
- The URI's local name, must not be null.
- Returns:
- true if the specified URI has been correctly split into a namespace and local name, false otherwise.
- See Also:
URI, getLocalNameIndex(String)
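One way to picture the check is to rejoin the two parts and test whether the split point matches the index that the documented getLocalNameIndex steps would yield. The sketch below does exactly that and is an assumption about the behaviour, not the library's implementation; UriSplitSketch, localNameIndex, and isSplitAtLocalNameIndex are hypothetical names.

```java
final class UriSplitSketch {

    // Same documented steps as getLocalNameIndex: first '#', else last '/', else last ':'.
    static int localNameIndex(String uri) {
        int idx = uri.indexOf('#');
        if (idx < 0) idx = uri.lastIndexOf('/');
        if (idx < 0) idx = uri.lastIndexOf(':');
        if (idx < 0) throw new IllegalArgumentException("No separator character found in URI: " + uri);
        return idx + 1;
    }

    // Assumed definition of a "correct split": the namespace ends exactly where the local name starts.
    static boolean isSplitAtLocalNameIndex(String namespace, String localName) {
        if (namespace.isEmpty()) {
            return false; // a non-relative URI always has a non-empty namespace part
        }
        try {
            return localNameIndex(namespace + localName) == namespace.length();
        } catch (IllegalArgumentException e) {
            return false; // no separator at all, so no valid split exists
        }
    }

    public static void main(String[] args) {
        System.out.println(isSplitAtLocalNameIndex("http://example.org/vocab#", "name"));
        System.out.println(isSplitAtLocalNameIndex("http://example.org/", "vocab#name"));
    }
}
```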
-
isValidURIReference
public static boolean isValidURIReference(java.lang.String uriRef)
Verifies that the supplied string is a valid RDF (1.0) URI reference, as defined in section 6.4 of the RDF Concepts and Abstract Syntax specification (RDF 1.0 Recommendation of February 10, 2004).
An RDF URI reference is valid if it is a Unicode string that:
- does not contain any control characters (#x00-#x1F, #x7F-#x9F)
- and would produce a valid URI character sequence (per RFC 2396, section 2.1) representing an absolute URI with optional fragment identifier when subjected to the encoding described below.
The encoding consists of:
- encoding the Unicode string as UTF-8, giving a sequence of octet values.
- %-escaping octets that do not correspond to permitted US-ASCII characters.
- Parameters:
uriRef
- a string representing an RDF URI reference.
- Returns:
- true iff the supplied string is a syntactically valid RDF URI reference, false otherwise.
- See Also:
- section 6.4 of the RDF Concepts and Abstract Syntax specification, RFC 3986, RFC 2396
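The two conditions and the encoding can be illustrated with the standard java.net.URI class, which parses according to RFC 2396. This is a rough sketch under stated assumptions, not the library's code: the names UriReferenceSketch, hasControlChar, escape, and looksLikeValidUriReference are hypothetical, and the choice to pass '#' and '%' through unescaped is an assumption made for this example.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.nio.charset.StandardCharsets;

final class UriReferenceSketch {

    // RFC 2396 reserved and mark characters, plus '#' and '%' (passed through here by assumption).
    private static final String PASS_THROUGH = ";/?:@&=+$,-_.!~*'()#%";

    // Condition 1: no control characters in #x00-#x1F or #x7F-#x9F.
    static boolean hasControlChar(String s) {
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c <= 0x1F || (c >= 0x7F && c <= 0x9F)) {
                return true;
            }
        }
        return false;
    }

    // Encoding: UTF-8 encode, then %-escape every octet that is not a permitted US-ASCII character.
    static String escape(String s) {
        StringBuilder out = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            int octet = b & 0xFF;
            boolean alphanumeric = (octet >= 'a' && octet <= 'z')
                    || (octet >= 'A' && octet <= 'Z')
                    || (octet >= '0' && octet <= '9');
            if (alphanumeric || (octet < 0x80 && PASS_THROUGH.indexOf(octet) >= 0)) {
                out.append((char) octet);
            } else {
                out.append(String.format("%%%02X", octet));
            }
        }
        return out.toString();
    }

    // Condition 2: the escaped form must parse as an absolute URI.
    static boolean looksLikeValidUriReference(String uriRef) {
        if (hasControlChar(uriRef)) {
            return false;
        }
        try {
            return new URI(escape(uriRef)).isAbsolute();
        } catch (URISyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(escape("http://example.org/caf\u00e9"));
        System.out.println(looksLikeValidUriReference("http://example.org/caf\u00e9"));
        System.out.println(looksLikeValidUriReference("relative/path"));
    }
}
```

Escaping before parsing is what lets a string with non-ASCII characters or spaces count as valid, while a relative reference still fails because the parsed URI has no scheme.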
-
escapeExcludedChars
private static java.lang.String escapeExcludedChars(java.lang.String unescaped)
Escapes any character that is not either reserved or in the legal range of unreserved characters, according to RFC 2396.
- Parameters:
unescaped
- a (relative or absolute) URI reference.
- Returns:
- a (relative or absolute) URI reference with all characters that cannot appear as-is in a URI %-escaped.
- See Also:
- RFC 2396
-
isUnreserved
private static boolean isUnreserved(char c)
A character is unreserved according to RFC 2396 if it is either an alphanumeric char or a punctuation mark.
-
isValidLocalName
public static boolean isValidLocalName(java.lang.String name)
Checks whether the specified name is allowed as the local name part of an IRI according to the SPARQL 1.1/Turtle 1.1 spec.
- Parameters:
name
- the candidate local name
- Returns:
- true if the name is a valid local name, false otherwise
-
isPN_CHARS_U
private static boolean isPN_CHARS_U(int codePoint)
Check if the supplied code point represents either a valid prefixed name base character or an underscore.
From Turtle Spec:
http://www.w3.org/TR/turtle/#grammar-production-PN_CHARS_U
[164s] PN_CHARS_U ::= PN_CHARS_BASE | '_'
-
isPLX_START
private static boolean isPLX_START(java.lang.String name)
-
isPERCENT
private static boolean isPERCENT(java.lang.String name)
-
isPN_LOCAL_ESC
private static boolean isPN_LOCAL_ESC(java.lang.String name)
-
isPN_CHARS_BASE
private static boolean isPN_CHARS_BASE(int codePoint)
Check if the supplied code point represents a valid prefixed name base character.
From Turtle Spec:
http://www.w3.org/TR/turtle/#grammar-production-PN_CHARS_BASE
[163s] PN_CHARS_BASE ::= [A-Z] | [a-z] | [#x00C0-#x00D6] | [#x00D8-#x00F6] | [#x00F8-#x02FF] | [#x0370-#x037D] | [#x037F-#x1FFF] | [#x200C-#x200D] | [#x2070-#x218F] | [#x2C00-#x2FEF] | [#x3001-#xD7FF] | [#xF900-#xFDCF] | [#xFDF0-#xFFFD] | [#x10000-#xEFFFF]
-
isNameStartChar
private static boolean isNameStartChar(int codePoint)
Check if the supplied code point represents a valid name start character.
- Parameters:
codePoint
- a Unicode code point.
- Returns:
- true if the supplied code point represents a valid name start char, false otherwise.
-
isNameChar
private static boolean isNameChar(int codePoint)
Check if the supplied code point represents a valid name character.
- Parameters:
codePoint
- a Unicode code point.
- Returns:
- true if the supplied code point represents a valid name char, false otherwise.
-
isPN_CHARS
private static boolean isPN_CHARS(int codePoint)
Check if the supplied code point represents a valid prefixed name character.
From Turtle Spec:
http://www.w3.org/TR/turtle/#grammar-production-PN_CHARS
[166s] PN_CHARS ::= PN_CHARS_U | '-' | [0-9] | #x00B7 | [#x0300-#x036F] | [#x203F-#x2040]
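The three Turtle productions quoted in this section (PN_CHARS_BASE, PN_CHARS_U, and PN_CHARS) translate almost directly into range checks on code points. The following is a sketch transcribed from those grammar rules, not the library source; the class TurtleNameCharsSketch and its method names are hypothetical.

```java
final class TurtleNameCharsSketch {

    // Inclusive range test on a Unicode code point.
    static boolean inRange(int cp, int lo, int hi) {
        return cp >= lo && cp <= hi;
    }

    // [163s] PN_CHARS_BASE
    static boolean isPnCharsBase(int cp) {
        return inRange(cp, 'A', 'Z') || inRange(cp, 'a', 'z')
                || inRange(cp, 0x00C0, 0x00D6) || inRange(cp, 0x00D8, 0x00F6)
                || inRange(cp, 0x00F8, 0x02FF) || inRange(cp, 0x0370, 0x037D)
                || inRange(cp, 0x037F, 0x1FFF) || inRange(cp, 0x200C, 0x200D)
                || inRange(cp, 0x2070, 0x218F) || inRange(cp, 0x2C00, 0x2FEF)
                || inRange(cp, 0x3001, 0xD7FF) || inRange(cp, 0xF900, 0xFDCF)
                || inRange(cp, 0xFDF0, 0xFFFD) || inRange(cp, 0x10000, 0xEFFFF);
    }

    // [164s] PN_CHARS_U ::= PN_CHARS_BASE | '_'
    static boolean isPnCharsU(int cp) {
        return isPnCharsBase(cp) || cp == '_';
    }

    // [166s] PN_CHARS ::= PN_CHARS_U | '-' | [0-9] | #x00B7 | [#x0300-#x036F] | [#x203F-#x2040]
    static boolean isPnChars(int cp) {
        return isPnCharsU(cp) || cp == '-' || inRange(cp, '0', '9') || cp == 0x00B7
                || inRange(cp, 0x0300, 0x036F) || inRange(cp, 0x203F, 0x2040);
    }

    public static void main(String[] args) {
        // Iterate by code point so that supplementary characters are tested as a single unit.
        "my-name_1".codePoints().forEach(cp ->
                System.out.println(new String(Character.toChars(cp)) + " -> " + isPnChars(cp)));
    }
}
```

The methods take an int code point rather than a char because the last PN_CHARS_BASE range, #x10000-#xEFFFF, lies outside the Basic Multilingual Plane and cannot be represented by a single Java char.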
-
-