Package net.sf.saxon.regex
Class JDK14RegexTranslator
- java.lang.Object
-
- net.sf.saxon.regex.JDK14RegexTranslator
-
public class JDK14RegexTranslator extends java.lang.Object
This class translates XML Schema regex syntax into JDK 1.4 regex syntax.
Author: James Clark. Modified by Michael Kay (a) to integrate the code into Saxon, and (b) to support XPath additions to the XML Schema regex syntax.
This version of the regular expression translator treats each half of a surrogate pair as a separate character, translating anything in an XPath regex that can match a non-BMP character into a Java regex that matches the two halves of a surrogate pair independently. This approach doesn't work under JDK 1.5, whose regex engine treats a surrogate pair as a single character. The same translator is currently used for Saxon on .NET 1.1.
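The difference between the two regex models can be seen without Saxon at all. The following is a minimal sketch using only java.lang and java.util.regex; the class name SurrogatePairDemo and its helper methods are hypothetical and are not part of this package. It shows that a non-BMP character occupies two UTF-16 code units, and that the regex engine in JDK 1.5 and later consumes both units as one character.

```java
import java.util.regex.Pattern;

public class SurrogatePairDemo {
    // Splits a code point into its UTF-16 code units (two chars for a non-BMP code point).
    static char[] utf16Units(int codePoint) {
        return Character.toChars(codePoint);
    }

    // True if the regex matches the whole input, using java.util.regex on the running JDK.
    static boolean matchesWhole(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        // U+10400 lies outside the BMP, so UTF-16 stores it as a surrogate pair.
        String s = new String(utf16Units(0x10400));
        System.out.println("UTF-16 code units: " + s.length());
        System.out.println("Unicode characters: " + s.codePointCount(0, s.length()));
        System.out.println("first unit is a high surrogate: " + Character.isHighSurrogate(s.charAt(0)));
        System.out.println("second unit is a low surrogate: " + Character.isLowSurrogate(s.charAt(1)));
        // On JDK 1.5 and later, "." consumes the whole pair as a single character,
        // so a pattern that expects two separate halves ("..") no longer matches.
        System.out.println("matches \".\": " + matchesWhole(".", s));
        System.out.println("matches \"..\": " + matchesWhole("..", s));
    }
}
```

A translator that emits one regex item per surrogate half relies on the older behaviour, where each half is matched as a separate character.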
-
-
Nested Class Summary
Nested Classes
Modifier and Type
Class
Description
(package private) static class
JDK14RegexTranslator.BackReference
(package private) static class
JDK14RegexTranslator.CharClass
(package private) static class
JDK14RegexTranslator.CharRange
(package private) static class
JDK14RegexTranslator.Complement
(package private) static class
JDK14RegexTranslator.Dot
(package private) static class
JDK14RegexTranslator.Empty
(package private) static class
JDK14RegexTranslator.Property
(package private) static class
JDK14RegexTranslator.Range
(package private) static class
JDK14RegexTranslator.SimpleCharClass
(package private) static class
JDK14RegexTranslator.SingleChar
(package private) static class
JDK14RegexTranslator.Subtraction
(package private) static class
JDK14RegexTranslator.Union
(package private) static class
JDK14RegexTranslator.WideSingleChar
-
Field Summary
Fields
Modifier and Type
Field
Description
(package private) static int
ALL
(package private) static java.lang.String
CATEGORY_NAMES
(package private) static int[][]
CATEGORY_RANGES
(package private) static java.lang.String
NMCHAR_CATEGORIES
(package private) static java.lang.String
NMCHAR_EXCLUDE_RANGES
(package private) static java.lang.String
NMCHAR_INCLUDES
(package private) static java.lang.String
NMSTRT_CATEGORIES
(package private) static java.lang.String
NMSTRT_EXCLUDE_RANGES
(package private) static java.lang.String
NMSTRT_INCLUDES
(package private) static int
NONE
(package private) static java.lang.String
NOT_ALLOWED_CLASS
(package private) static int
SOME
(package private) static java.lang.String
SURROGATES1_CLASS
(package private) static java.lang.String
SURROGATES2_CLASS
-
Constructor Summary
Constructors
Constructor
Description
JDK14RegexTranslator()
-
Method Summary
Modifier and Type
Method
Description
static void
main(java.lang.String[] args)
void
setIgnoreWhitespace(boolean ignore)
java.lang.String
translate(java.lang.CharSequence regExp, boolean xpath)
Translates a regular expression in the syntax of XML Schemas Part 2 into a regular expression in the syntax of java.util.regex.Pattern.
-
-
-
Field Detail
-
CATEGORY_NAMES
static final java.lang.String CATEGORY_NAMES
- See Also:
- Constant Field Values
-
CATEGORY_RANGES
static final int[][] CATEGORY_RANGES
-
NMSTRT_INCLUDES
static final java.lang.String NMSTRT_INCLUDES
- See Also:
- Constant Field Values
-
NMSTRT_EXCLUDE_RANGES
static final java.lang.String NMSTRT_EXCLUDE_RANGES
- See Also:
- Constant Field Values
-
NMSTRT_CATEGORIES
static final java.lang.String NMSTRT_CATEGORIES
- See Also:
- Constant Field Values
-
NMCHAR_INCLUDES
static final java.lang.String NMCHAR_INCLUDES
- See Also:
- Constant Field Values
-
NMCHAR_EXCLUDE_RANGES
static final java.lang.String NMCHAR_EXCLUDE_RANGES
- See Also:
- Constant Field Values
-
NMCHAR_CATEGORIES
static final java.lang.String NMCHAR_CATEGORIES
- See Also:
- Constant Field Values
-
NONE
static final int NONE
- See Also:
- Constant Field Values
-
SOME
static final int SOME
- See Also:
- Constant Field Values
-
ALL
static final int ALL
- See Also:
- Constant Field Values
-
SURROGATES1_CLASS
static final java.lang.String SURROGATES1_CLASS
- See Also:
- Constant Field Values
-
SURROGATES2_CLASS
static final java.lang.String SURROGATES2_CLASS
- See Also:
- Constant Field Values
-
NOT_ALLOWED_CLASS
static final java.lang.String NOT_ALLOWED_CLASS
- See Also:
- Constant Field Values
-
-
Method Detail
-
setIgnoreWhitespace
public void setIgnoreWhitespace(boolean ignore)
-
translate
public java.lang.String translate(java.lang.CharSequence regExp, boolean xpath) throws RegexSyntaxException
Translates a regular expression in the syntax of XML Schemas Part 2 into a regular expression in the syntax of java.util.regex.Pattern. The translation assumes that the string to be matched against the regex uses surrogate pairs correctly. If the string comes from XML content, a conforming XML parser will automatically check this; if the string comes from elsewhere, it may be necessary to check surrogate usage before matching.
- Parameters:
regExp - a String containing a regular expression in the syntax of XML Schemas Part 2
xpath - a boolean indicating whether the XPath 2.0 F+O extensions to the schema regex syntax are permitted
- Returns:
- a String containing a regular expression in the syntax of java.util.regex.Pattern
- Throws:
RegexSyntaxException - if regExp is not a regular expression in the syntax of XML Schemas Part 2, or XPath 2.0, as appropriate
- See Also:
Pattern, XML Schema Part 2
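For input that does not come from a conforming XML parser, the surrogate check mentioned above could look something like the following. This is only a sketch under assumptions: the class SurrogateCheck and the method hasWellFormedSurrogates are hypothetical names, not part of Saxon, and the code uses java.lang.Character methods that exist from JDK 1.5 onwards.

```java
public class SurrogateCheck {
    // Returns true if every high surrogate is immediately followed by a low surrogate
    // and no low surrogate appears on its own.
    static boolean hasWellFormedSurrogates(CharSequence s) {
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (Character.isHighSurrogate(c)) {
                // A high surrogate must have a low surrogate right after it.
                if (i + 1 >= s.length() || !Character.isLowSurrogate(s.charAt(i + 1))) {
                    return false;
                }
                i++; // skip the low surrogate that completes the pair
            } else if (Character.isLowSurrogate(c)) {
                // A low surrogate with no preceding high surrogate is unpaired.
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(hasWellFormedSurrogates("plain text"));
        System.out.println(hasWellFormedSurrogates("\uD801\uDC00"));
        System.out.println(hasWellFormedSurrogates("broken \uD801 pair"));
    }
}
```

A caller might run such a check on the input string first, and only then match it against a java.util.regex.Pattern compiled from the string returned by translate.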
-
main
public static void main(java.lang.String[] args) throws RegexSyntaxException
- Throws:
RegexSyntaxException
-
-