Class JDK14RegexTranslator

java.lang.Object
net.sf.saxon.regex.JDK14RegexTranslator

public class JDK14RegexTranslator extends Object
This class translates XML Schema regex syntax into JDK 1.4 regex syntax.

Author: James Clark. Modified by Michael Kay (a) to integrate the code into Saxon, and (b) to support XPath additions to the XML Schema regex syntax.

This version of the regular expression translator treats each half of a surrogate pair as a separate character, translating anything in an XPath regex that can match a non-BMP character into a Java regex that matches the two halves of a surrogate pair independently. This approach doesn't work under JDK 1.5, whose regex engine treats a surrogate pair as a single character.
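The following is a minimal sketch, not part of Saxon, of what "two halves of a surrogate pair" means in UTF-16. The class name `SurrogateHalves` and the method `utf16Units` are hypothetical names chosen for illustration. The sketch relies on `Character.toChars`, which was added in Java 5, so it shows the concept and is not code that would run on JDK 1.4 itself.

```java
public class SurrogateHalves {

    /**
     * Returns the UTF-16 code units of a Unicode code point as
     * space-separated four-digit hex values. A BMP character yields one
     * unit; a non-BMP character yields a high and a low surrogate.
     */
    public static String utf16Units(int codePoint) {
        StringBuilder sb = new StringBuilder();
        for (char unit : Character.toChars(codePoint)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%04X", (int) unit));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // U+0041 is in the BMP: a single code unit.
        System.out.println(utf16Units(0x41));      // prints "0041"
        // U+1D11E (musical symbol G clef) is outside the BMP: two code units.
        System.out.println(utf16Units(0x1D11E));   // prints "D834 DD1E"
    }
}
```

A regex engine that works on code units sees the second case as two characters, which is why this translator emits patterns that match each half separately.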

The same translator is currently used for Saxon on .NET 1.1.

  • Constructor Details

    • JDK14RegexTranslator

      public JDK14RegexTranslator()
  • Method Details

    • setIgnoreWhitespace

      public void setIgnoreWhitespace(boolean ignore)
    • translate

      public String translate(CharSequence regExp, boolean xpath) throws RegexSyntaxException
      Translates a regular expression in the syntax of XML Schemas Part 2 into a regular expression in the syntax of java.util.regex.Pattern. The translation assumes that the string to be matched against the regex uses surrogate pairs correctly. If the string comes from XML content, a conforming XML parser will automatically check this; if the string comes from elsewhere, it may be necessary to check surrogate usage before matching.
      Parameters:
      regExp - a String containing a regular expression in the syntax of XML Schemas Part 2
      xpath - a boolean indicating whether the XPath 2.0 F+O extensions to the schema regex syntax are permitted
      Returns:
      a String containing a regular expression in the syntax of java.util.regex.Pattern
      Throws:
      RegexSyntaxException - if regExp is not a regular expression in the syntax of XML Schemas Part 2, or XPath 2.0, as appropriate
    • main

      public static void main(String[] args) throws RegexSyntaxException
      Throws:
      RegexSyntaxException
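
The description of translate notes that input not coming from a conforming XML parser may need its surrogate usage checked before matching. One possible way to do that check is sketched below. The class `SurrogateCheck` and the method `hasWellFormedSurrogates` are hypothetical and are not part of Saxon; the sketch also assumes Java 5 or later for `Character.isHighSurrogate` and `Character.isLowSurrogate`.

```java
public class SurrogateCheck {

    /**
     * Returns true if every high surrogate in s is immediately followed by
     * a low surrogate, and no low surrogate appears on its own.
     */
    public static boolean hasWellFormedSurrogates(CharSequence s) {
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (Character.isHighSurrogate(c)) {
                // A high surrogate must be followed by a low surrogate.
                if (i + 1 >= s.length() || !Character.isLowSurrogate(s.charAt(i + 1))) {
                    return false;
                }
                i++; // skip the low half that was just validated
            } else if (Character.isLowSurrogate(c)) {
                // A low surrogate with no preceding high surrogate.
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String clef = new String(Character.toChars(0x1D11E));
        System.out.println(hasWellFormedSurrogates("abc" + clef));                  // prints "true"
        System.out.println(hasWellFormedSurrogates(String.valueOf((char) 0xD834))); // prints "false"
    }
}
```

A caller could run such a check on the string to be matched and reject or repair it before applying the pattern produced by translate.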