Package com.google.re2j
This package provides an implementation of regular expression matching based on Russ Cox's linear-time RE2 algorithm.
The API presented by com.google.re2j
mimics that of
java.util.regex.Matcher
and java.util.regex.Pattern
.
While not identical, they are similar enough that most users can
switch implementations simply by changing their import
s.
The syntax of the regular expressions accepted is the same general
syntax used by Perl, Python, and other languages. More precisely,
it is the syntax accepted by the C++ and Go implementations of RE2
described at https://github.com/google/re2/wiki/Syntax,
except for \C
(match any byte), which is not
supported because in this implementation, the matcher's input is
conceptually a stream of Unicode code points, not bytes.
The current API is rather small and intended for compatibility
with java.util.regex
, but the underlying implementation
supports some additional features, such as the ability to process
input character streams encoded as UTF-8 byte arrays. These may
be exposed in a future release if there is sufficient interest.
Example use:
import com.google.re2j.Matcher; import com.google.re2j.Pattern; Pattern p = Pattern.compile("b(an)*(.)"); Matcher m = p.matcher("by, band, banana"); assertTrue(m.lookingAt()); m.reset(); // "by, band, banana" assertTrue(m.find()); assertEquals("by", m.group(0)); // -- assertEquals(null, m.group(1)); // assertEquals("y", m.group(2)); // ^ assertTrue(m.find()); assertEquals("band", m.group(0)); // ---- assertEquals("an", m.group(1)); // ^^ assertEquals("d", m.group(2)); // ^ assertTrue(m.find()); assertEquals("banana", m.group(0)); // ------- assertEquals("an", m.group(1)); // ^^ assertEquals("a", m.group(2)); // ^ assertFalse(m.find());
-
Class Summary Class Description Matcher A stateful iterator that interprets a regexPattern
on a specific input.Pattern A compiled representation of an RE2 regular expression, mimicking thejava.util.regex.Pattern
API. -
Exception Summary Exception Description PatternSyntaxException An exception thrown by the parser if the pattern was invalid.