Class Parser<T>
- java.lang.Object
  - org.jparsec.Parser<T>
Direct Known Subclasses:
BestParser, DelimitedParser, EmptyListParser, NestableBlockCommentScanner, ReluctantBetweenParser, RepeatAtLeastParser, RepeatTimesParser, SkipAtLeastParser, SkipTimesParser
public abstract class Parser<T> extends java.lang.Object
Defines grammar and encapsulates parsing logic. A Parser takes as input a CharSequence source and parses it when the parse(CharSequence) method is called. A value of type T will be returned if parsing succeeds, or a ParserException is thrown to indicate parsing error. For example:
    Parser<String> scanner = Scanners.IDENTIFIER;
    assertEquals("foo", scanner.parse("foo"));
Parsers run either on character level to scan the source, or on token level to parse a list of Token objects returned from another parser. This other parser that returns the list of tokens for token level parsing is hooked up via the from(Parser, Parser) or from(Parser) method.
The following are important naming conventions used throughout the library:
- A character level parser object that recognizes a single lexical word is called a scanner.
- A scanner that translates the recognized lexical word into a token is called a tokenizer.
- A character level parser object that does lexical analysis and returns a list of Token is called a lexer.
- All index parameters are 0-based indexes in the original source.
To debug a parser, pass Parser.Mode.DEBUG mode to parse(CharSequence, Mode) and inspect the result in ParserException.getParseTree(). All labeled parsers will generate a node in the exception's parse tree, with matched indices in the source.
-
-
Nested Class Summary
static class Parser.Mode
    Defines the mode that a parser should be run in.
static class Parser.Reference<T>
    An atomic mutable reference to Parser used in recursive grammars.
private static class Parser.Rhs<T>
-
Constructor Summary
Parser()
-
Method Summary
(package private) abstract boolean apply(ParseContext ctxt)
private static <T> T applyInfixOperators(T initialValue, java.util.List<? extends java.util.function.Function<? super T,? extends T>> functions)
private static <T> T applyInfixrOperators(T first, java.util.List<Parser.Rhs<T>> rhss)
private static <T> T applyPostfixOperators(T a, java.lang.Iterable<? extends java.util.function.Function<? super T,? extends T>> ms)
private static <T> T applyPrefixOperators(java.util.List<? extends java.util.function.Function<? super T,? extends T>> ms, T a)
(package private) Parser<T> asDelimiter()
    As a delimiter, the parser's error is considered lenient and will only be reported if no other meaningful error is encountered.
Parser<java.util.Optional<T>> asOptional()
    p.asOptional() is equivalent to p? in EBNF.
Parser<java.util.List<T>> atLeast(int min)
Parser<T> atomic()
    A Parser that undoes any partial match if this fails.
Parser<T> between(Parser<?> before, Parser<?> after)
<R> Parser<R> cast()
Parser<java.util.List<T>> endBy(Parser<?> delim)
Parser<java.util.List<T>> endBy1(Parser<?> delim)
Parser<java.lang.Boolean> fails()
Parser<T> followedBy(Parser<?> parser)
Parser<T> from(Parser<?> tokenizer, Parser<java.lang.Void> delim)
    A Parser that takes as input the tokens returned by tokenizer delimited by delim, and runs this to parse the tokens.
Parser<T> from(Parser<? extends java.util.Collection<Token>> lexer)
(package private) T getReturn(ParseContext ctxt)
<R> Parser<R> ifelse(java.util.function.Function<? super T,? extends Parser<? extends R>> consequence, Parser<? extends R> alternative)
<R> Parser<R> ifelse(Parser<? extends R> consequence, Parser<? extends R> alternative)
Parser<T> infixl(Parser<? extends java.util.function.BiFunction<? super T,? super T,? extends T>> operator)
    A Parser for left-associative infix operator.
Parser<T> infixn(Parser<? extends java.util.function.BiFunction<? super T,? super T,? extends T>> op)
    A Parser that parses non-associative infix operator.
Parser<T> infixr(Parser<? extends java.util.function.BiFunction<? super T,? super T,? extends T>> op)
    A Parser for right-associative infix operator.
Parser<T> label(java.lang.String name)
Parser<java.util.List<Token>> lexer(Parser<?> delim)
    A Parser that greedily runs this repeatedly, and ignores the pattern recognized by delim before and after each occurrence.
Parser<java.util.List<T>> many()
    p.many() is equivalent to p* in EBNF.
Parser<java.util.List<T>> many1()
    p.many1() is equivalent to p+ in EBNF.
<R> Parser<R> map(java.util.function.Function<? super T,? extends R> map)
static <T> Parser.Reference<T> newReference()
    Creates a new instance of Parser.Reference.
<To> Parser<To> next(java.util.function.Function<? super T,? extends Parser<? extends To>> map)
    A Parser that executes this, maps the result using map to another Parser object to be executed as the next step.
<R> Parser<R> next(Parser<R> parser)
Parser<?> not()
    A Parser that fails if this succeeds.
Parser<?> not(java.lang.String unexpected)
    A Parser that fails if this succeeds.
Parser<T> notFollowedBy(Parser<?> parser)
Parser<T> optional()
    Deprecated. since 3.0.
Parser<T> optional(T defaultValue)
Parser<T> or(Parser<? extends T> alternative)
    p1.or(p2) is equivalent to p1 | p2 in EBNF.
Parser<T> otherwise(Parser<? extends T> fallback)
    a.otherwise(fallback) runs fallback when a matches zero input.
T parse(java.lang.CharSequence source)
    Parses source.
T parse(java.lang.CharSequence source, java.lang.String moduleName)
    Deprecated. Please use parse(CharSequence) instead.
T parse(java.lang.CharSequence source, Parser.Mode mode)
    Parses source under the given mode.
T parse(java.lang.Readable readable)
    Parses source read from readable.
T parse(java.lang.Readable readable, java.lang.String moduleName)
    Deprecated. Please use parse(Readable) instead.
ParseTree parseTree(java.lang.CharSequence source)
    Parses source and returns a ParseTree corresponding to the syntactical structure of the input.
Parser<T> peek()
    A Parser that runs this and undoes any input consumption if it succeeds.
Parser<T> postfix(Parser<? extends java.util.function.Function<? super T,? extends T>> op)
Parser<T> prefix(Parser<? extends java.util.function.Function<? super T,? extends T>> op)
(package private) static java.lang.StringBuilder read(java.lang.Readable from)
    Copies all content from from into the returned StringBuilder.
Parser<T> reluctantBetween(Parser<?> before, Parser<?> after)
    Deprecated. This method probably only works in the simplest cases.
<R> Parser<R> retn(R value)
Parser<java.util.List<T>> sepBy(Parser<?> delim)
Parser<java.util.List<T>> sepBy1(Parser<?> delim)
Parser<java.util.List<T>> sepEndBy(Parser<?> delim)
Parser<java.util.List<T>> sepEndBy1(Parser<?> delim)
Parser<java.lang.Void> skipAtLeast(int min)
Parser<java.lang.Void> skipMany()
    p.skipMany() is equivalent to p* in EBNF.
Parser<java.lang.Void> skipMany1()
    p.skipMany1() is equivalent to p+ in EBNF.
Parser<java.lang.Void> skipTimes(int n)
Parser<java.lang.Void> skipTimes(int min, int max)
    A Parser that runs this parser for at least min times and up to max times, with all the return values ignored.
Parser<java.lang.String> source()
    A Parser that returns the matched string in the original source.
Parser<java.lang.Boolean> succeeds()
Parser<java.util.List<T>> times(int n)
Parser<java.util.List<T>> times(int min, int max)
Parser<Token> token()
Parser<java.util.List<T>> until(Parser<?> parser)
    A Parser that matches this parser zero or many times until the given parser succeeds.
Parser<WithSource<T>> withSource()
    A Parser that returns both parsed object and matched string.
-
-
-
Method Detail
-
newReference
public static <T> Parser.Reference<T> newReference()
Creates a new instance of Parser.Reference. Used when your grammar is recursive (many grammars are).
-
retn
public final <R> Parser<R> retn(R value)
-
next
public final <To> Parser<To> next(java.util.function.Function<? super T,? extends Parser<? extends To>> map)
A Parser that executes this, maps the result using map to another Parser object to be executed as the next step.
-
until
public final Parser<java.util.List<T>> until(Parser<?> parser)
A Parser that matches this parser zero or many times until the given parser succeeds. The input that matches the given parser will not be consumed. The input that matches this parser will be collected in a list that will be returned by this function.
Since:
2.2
-
many
public final Parser<java.util.List<T>> many()
p.many() is equivalent to p* in EBNF. The return values are collected and returned in a List.
-
skipMany
public final Parser<java.lang.Void> skipMany()
p.skipMany() is equivalent to p* in EBNF. The return values are discarded.
-
many1
public final Parser<java.util.List<T>> many1()
p.many1() is equivalent to p+ in EBNF. The return values are collected and returned in a List.
-
skipMany1
public final Parser<java.lang.Void> skipMany1()
p.skipMany1() is equivalent to p+ in EBNF. The return values are discarded.
-
atLeast
public final Parser<java.util.List<T>> atLeast(int min)
A Parser that runs this parser greedily for at least min times. The return values are collected and returned in a List.
-
skipAtLeast
public final Parser<java.lang.Void> skipAtLeast(int min)
-
skipTimes
public final Parser<java.lang.Void> skipTimes(int n)
-
times
public final Parser<java.util.List<T>> times(int min, int max)
A Parser that runs this parser for at least min times and up to max times. The return values are collected and returned in a List.
-
skipTimes
public final Parser<java.lang.Void> skipTimes(int min, int max)
A Parser that runs this parser for at least min times and up to max times, with all the return values ignored.
-
or
public final Parser<T> or(Parser<? extends T> alternative)
p1.or(p2) is equivalent to p1 | p2 in EBNF.
Parameters:
alternative - the alternative parser to run if this fails.
-
otherwise
public final Parser<T> otherwise(Parser<? extends T> fallback)
a.otherwise(fallback) runs fallback when a matches zero input. This is different from a.or(alternative), where alternative is run whenever a fails to match.
One should usually use or(org.jparsec.Parser<? extends T>).
Parameters:
fallback - the parser to run if this matches no input.
Since:
3.1
-
optional
@Deprecated public final Parser<T> optional()
Deprecated. since 3.0. Use optional(null) or asOptional() instead.
p.optional() is equivalent to p? in EBNF. null is the result when this fails with no partial match.
-
asOptional
public final Parser<java.util.Optional<T>> asOptional()
p.asOptional() is equivalent to p? in EBNF. Optional.empty() is the result when this fails with no partial match. Note that Optional prohibits nulls, so make sure this does not result in null.
Since:
3.0
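The null caveat above follows from java.util.Optional itself. The sketch below shows only the JDK behavior: Optional.of rejects null, while Optional.ofNullable turns it into an empty Optional. How asOptional() wraps the parsed value internally is not stated in this page, so the class and method names here are hypothetical and nothing in the sketch calls jparsec.

```java
import java.util.Optional;

// Hypothetical helper class, not part of jparsec.
class OptionalNullSketch {
    // Returns true when Optional.of refuses a null value, which is the JDK contract.
    static boolean ofRejectsNull() {
        try {
            Optional.of((String) null);
            return false;
        } catch (NullPointerException expected) {
            return true;
        }
    }

    // Optional.ofNullable maps null to Optional.empty(), so a null result
    // becomes indistinguishable from "nothing was parsed".
    static Optional<String> wrapNullable(String value) {
        return Optional.ofNullable(value);
    }

    public static void main(String[] args) {
        System.out.println(ofRejectsNull());
        System.out.println(wrapNullable(null).isPresent());
        System.out.println(wrapNullable("foo").isPresent());
    }
}
```

Either way, a parser that can legitimately produce null is a poor fit for an Optional result.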
-
not
public final Parser<?> not()
A Parser that fails if this succeeds. Any input consumption is undone.
-
not
public final Parser<?> not(java.lang.String unexpected)
A Parser that fails if this succeeds. Any input consumption is undone.
Parameters:
unexpected - the name of what we don't expect.
-
peek
public final Parser<T> peek()
A Parser that runs this and undoes any input consumption if it succeeds.
-
atomic
public final Parser<T> atomic()
A Parser that undoes any partial match if this fails. In other words, the parser either fully matches, or matches none.
-
succeeds
public final Parser<java.lang.Boolean> succeeds()
-
fails
public final Parser<java.lang.Boolean> fails()
-
ifelse
public final <R> Parser<R> ifelse(Parser<? extends R> consequence, Parser<? extends R> alternative)
-
ifelse
public final <R> Parser<R> ifelse(java.util.function.Function<? super T,? extends Parser<? extends R>> consequence, Parser<? extends R> alternative)
-
cast
public final <R> Parser<R> cast()
Casts this to a Parser of type R. Use it only if you know the parser actually returns a value of type R.
-
between
public final Parser<T> between(Parser<?> before, Parser<?> after)
A Parser that runs this between before and after. The return value of this is preserved.
Equivalent to Parsers.between(Parser, Parser, Parser), which preserves the natural order of the parsers in the argument list, but is a bit more verbose.
-
reluctantBetween
@Deprecated public final Parser<T> reluctantBetween(Parser<?> before, Parser<?> after)
Deprecated. This method probably only works in the simplest cases. And it's a character-level parser only. Use it at your own risk. It may be deleted later when we find a better way.
A Parser that first runs before from the input start, then runs after from the input's end, and only then runs this on what's left from the input. In effect, this behaves reluctantly, giving after a chance to grab input that would have been consumed by this otherwise.
-
sepBy1
public final Parser<java.util.List<T>> sepBy1(Parser<?> delim)
A Parser that runs this 1 or more times separated by delim.
The return values are collected in a List.
-
sepBy
public final Parser<java.util.List<T>> sepBy(Parser<?> delim)
A Parser that runs this 0 or more times separated by delim.
The return values are collected in a List.
-
endBy
public final Parser<java.util.List<T>> endBy(Parser<?> delim)
A Parser that runs this for 0 or more times delimited and terminated by delim.
The return values are collected in a List.
-
endBy1
public final Parser<java.util.List<T>> endBy1(Parser<?> delim)
A Parser that runs this for 1 or more times delimited and terminated by delim.
The return values are collected in a List.
-
sepEndBy1
public final Parser<java.util.List<T>> sepEndBy1(Parser<?> delim)
A Parser that runs this for 1 or more times separated and optionally terminated by delim. For example: "foo;foo;foo" and "foo;foo;" both match foo.sepEndBy1(semicolon).
The return values are collected in a List.
-
sepEndBy
public final Parser<java.util.List<T>> sepEndBy(Parser<?> delim)
A Parser that runs this for 0 or more times separated and optionally terminated by delim. For example: "foo;foo;foo" and "foo;foo;" both match foo.sepEndBy(semicolon).
The return values are collected in a List.
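The difference between the sepBy, endBy and sepEndBy families is only in how the trailing delimiter is treated. As a rough analogy, the sketch below expresses the "1 or more" variants as java.util.regex patterns over the literal word foo and the delimiter ";". It is an illustration of the accepted shapes only: the class name and patterns are hypothetical, and no jparsec API is involved.

```java
import java.util.regex.Pattern;

// Hypothetical analogy: regular expressions standing in for the combinators.
class SeparatorShapesSketch {
    // sepBy1: items separated by the delimiter, no trailing delimiter.
    static final Pattern SEP_BY1 = Pattern.compile("foo(;foo)*");
    // endBy1: every item is followed by the delimiter.
    static final Pattern END_BY1 = Pattern.compile("(foo;)+");
    // sepEndBy1: separated, and the trailing delimiter is optional.
    static final Pattern SEP_END_BY1 = Pattern.compile("foo(;foo)*;?");

    // True when the whole input has the shape described by the pattern.
    static boolean accepts(Pattern shape, String input) {
        return shape.matcher(input).matches();
    }

    public static void main(String[] args) {
        for (String input : new String[] {"foo;foo;foo", "foo;foo;"}) {
            System.out.println(input
                + " sepBy1=" + accepts(SEP_BY1, input)
                + " endBy1=" + accepts(END_BY1, input)
                + " sepEndBy1=" + accepts(SEP_END_BY1, input));
        }
    }
}
```

Only the sepEndBy1 shape accepts both inputs, which matches the example given in the description above.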
-
prefix
public final Parser<T> prefix(Parser<? extends java.util.function.Function<? super T,? extends T>> op)
A Parser that runs op for 0 or more times greedily, then runs this. The Function objects returned from op are applied from right to left to the return value of p.
p.prefix(op) is equivalent to op* p in EBNF.
-
postfix
public final Parser<T> postfix(Parser<? extends java.util.function.Function<? super T,? extends T>> op)
A Parser that runs this and then runs op for 0 or more times greedily. The Function objects returned from op are applied from left to right to the return value of p.
This is the preferred API to avoid StackOverflowError in left-recursive parsers. For example, to parse array types in the form of "T[]" or "T[][]", the following left recursive grammar will fail:
    Terminals terms = Terminals.operators("[", "]");
    Parser.Reference<Type> ref = Parser.newReference();
    ref.set(Parsers.or(leafTypeParser,
        Parsers.sequence(ref.lazy(), terms.phrase("[", "]"), new Unary<Type>() {...})));
    return ref.get();
A version that avoids the left recursion uses postfix:
    Terminals terms = Terminals.operators("[", "]");
    return leafTypeParser.postfix(terms.phrase("[", "]").retn(new Unary<Type>() {...}));
Another example is the expr ? a : b ternary operator. It too is a left recursive grammar. And un-intuitively it can also be thought of as a postfix operator. Basically, we can parse "? a : b" as a whole into a unary operator that accepts the condition expression as input and outputs the full ternary expression:
    Parser<Expr> ternary(Parser<Expr> expr) {
      return expr.postfix(
          Parsers.sequence(
              terms.token("?"), expr, terms.token(":"), expr,
              // The two token results are unused; lambda parameter names must be distinct.
              (question, then, colon, orelse) -> cond -> new TernaryExpr(cond, then, orelse)));
    }
OperatorTable also handles left recursion transparently.
p.postfix(op) is equivalent to p op* in EBNF.
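The application order described for prefix (right to left) and postfix (left to right) means that in both cases the operator nearest the operand is applied first. The sketch below shows that ordering with plain java.util.function.Function values. It is not the library's implementation: the class, the helper names and the string-based "operators" are all hypothetical.

```java
import java.util.List;
import java.util.function.Function;

// Hypothetical illustration of application order, not jparsec code.
class UnaryOperatorOrderSketch {
    // Postfix operators are applied left to right: p op1 op2 -> op2(op1(p)).
    static <T> T applyPostfix(T operand, List<Function<T, T>> ops) {
        T result = operand;
        for (Function<T, T> op : ops) {
            result = op.apply(result);
        }
        return result;
    }

    // Prefix operators are applied right to left: op1 op2 p -> op1(op2(p)).
    static <T> T applyPrefix(List<Function<T, T>> ops, T operand) {
        T result = operand;
        for (int i = ops.size() - 1; i >= 0; i--) {
            result = ops.get(i).apply(result);
        }
        return result;
    }

    public static void main(String[] args) {
        Function<String, String> arrayOf = type -> type + "[]";
        Function<String, String> not = expr -> "not(" + expr + ")";
        Function<String, String> neg = expr -> "neg(" + expr + ")";
        // Source shape "int [] []": two postfix operators after the leaf type.
        System.out.println(applyPostfix("int", List.of(arrayOf, arrayOf)));
        // Source shape "! - x": two prefix operators before the operand.
        System.out.println(applyPrefix(List.of(not, neg), "x"));
    }
}
```

With these inputs the postfix call builds int[][] and the prefix call builds not(neg(x)), so the operator written closest to the operand ends up innermost.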
-
infixn
public final Parser<T> infixn(Parser<? extends java.util.function.BiFunction<? super T,? super T,? extends T>> op)
A Parser that parses non-associative infix operator. Runs this for the left operand, and then runs op and this for the operator and the right operand optionally. The BiFunction objects returned from op are applied to the return values of the two operands, if any.
p.infixn(op) is equivalent to p (op p)? in EBNF.
-
infixl
public final Parser<T> infixl(Parser<? extends java.util.function.BiFunction<? super T,? super T,? extends T>> operator)
A Parser for left-associative infix operator. Runs this for the left operand, and then runs operator and this for the operator and the right operand for 0 or more times greedily. The BiFunction objects returned from operator are applied from left to right to the return values of this, if any. For example: a + b + c + d is evaluated as (((a + b) + c) + d).
p.infixl(op) is equivalent to p (op p)* in EBNF.
-
infixr
public final Parser<T> infixr(Parser<? extends java.util.function.BiFunction<? super T,? super T,? extends T>> op)
A Parser for right-associative infix operator. Runs this for the left operand, and then runs op and this for the operator and the right operand for 0 or more times greedily. The BiFunction objects returned from op are applied from right to left to the return values of this, if any. For example: a + b + c + d is evaluated as a + (b + (c + d)).
p.infixr(op) is equivalent to p (op p)* in EBNF.
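infixl and infixr accept the same EBNF shape, p (op p)*, and differ only in how the operands are grouped. The sketch below makes the grouping visible with a non-commutative operator (subtraction) and two plain folds. It is a simplification under stated assumptions: a single BiFunction is reused at every position, whereas a real grammar may parse a different operator each time, and the class and method names are hypothetical rather than taken from jparsec.

```java
import java.util.List;
import java.util.function.BiFunction;

// Hypothetical illustration of associativity, not jparsec code.
class InfixFoldSketch {
    // Left-associative grouping: ((a op b) op c) op d.
    static <T> T foldLeft(List<T> operands, BiFunction<T, T, T> op) {
        T acc = operands.get(0);
        for (int i = 1; i < operands.size(); i++) {
            acc = op.apply(acc, operands.get(i));
        }
        return acc;
    }

    // Right-associative grouping: a op (b op (c op d)).
    static <T> T foldRight(List<T> operands, BiFunction<T, T, T> op) {
        T acc = operands.get(operands.size() - 1);
        for (int i = operands.size() - 2; i >= 0; i--) {
            acc = op.apply(operands.get(i), acc);
        }
        return acc;
    }

    public static void main(String[] args) {
        List<Integer> operands = List.of(10, 4, 3, 2);
        BiFunction<Integer, Integer, Integer> minus = (x, y) -> x - y;
        System.out.println(foldLeft(operands, minus));   // ((10 - 4) - 3) - 2
        System.out.println(foldRight(operands, minus));  // 10 - (4 - (3 - 2))
    }
}
```

The two folds give 1 and 7 for the same operands, which is why the choice between infixl and infixr matters for operators such as subtraction or exponentiation.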
-
token
public final Parser<Token> token()
A Parser that runs this and wraps the return value in a Token.
It is normally not necessary to call this method explicitly. lexer(Parser) and from(Parser, Parser) both do the conversion automatically.
-
source
public final Parser<java.lang.String> source()
A Parser that returns the matched string in the original source.
-
withSource
public final Parser<WithSource<T>> withSource()
A Parser that returns both parsed object and matched string.
-
from
public final Parser<T> from(Parser<? extends java.util.Collection<Token>> lexer)
A Parser that takes as input the Token collection returned by lexer, and runs this to parse the tokens. Most parsers should use the simpler from(Parser, Parser) instead.
this must be a token level parser.
-
from
public final Parser<T> from(Parser<?> tokenizer, Parser<java.lang.Void> delim)
A Parser that takes as input the tokens returned by tokenizer delimited by delim, and runs this to parse the tokens. A common misunderstanding is that tokenizer has to be a parser of Token. It doesn't need to be because Terminals already takes care of wrapping your logical token objects into physical Token with correct source location information tacked on for free. Your token object can literally be anything, as long as your token level parser can recognize it later.
The following example uses Terminals.tokenizer():
    Terminals terminals = ...;
    return parser.from(terminals.tokenizer(), Scanners.WHITESPACES.optional()).parse(str);
And tokens are optionally delimited by whitespaces.
Optionally, you can skip comments using an alternative scanner than WHITESPACES:
    Terminals terminals = ...;
    // delim must be a Parser<Void> to match from(Parser<?>, Parser<Void>).
    Parser<Void> delim = Parsers.or(
        Scanners.WHITESPACES,
        Scanners.JAVA_LINE_COMMENT,
        Scanners.JAVA_BLOCK_COMMENT).skipMany();
    return parser.from(terminals.tokenizer(), delim).parse(str);
In both examples, it's important to make sure the delimiter scanner can accept an empty string (either through optional() or skipMany()), unless adjacent operator characters shouldn't be parsed as separate operators, e.g. "((" as two left parenthesis operators.
this must be a token level parser.
-
lexer
public Parser<java.util.List<Token>> lexer(Parser<?> delim)
A Parser that greedily runs this repeatedly, and ignores the pattern recognized by delim before and after each occurrence. The result tokens are wrapped in Token and are collected and returned in a List.
It is normally not necessary to call this method explicitly. from(Parser, Parser) is more convenient for simple uses that just need to connect a token level parser with a lexer that produces the tokens. When more flexible control over the token list is needed, for example, to parse indentation sensitive language, a pre-processor of the token list may be needed.
this must be a tokenizer that returns a token value.
-
asDelimiter
final Parser<T> asDelimiter()
As a delimiter, the parser's error is considered lenient and will only be reported if no other meaningful error is encountered. The delimiter's logical step is also considered 0, which means it won't ever stop repetition combinators such as many().
-
parse
public final T parse(java.lang.CharSequence source)
Parses source.
-
parse
public final T parse(java.lang.Readable readable) throws java.io.IOException
Parses source read from readable.
Throws:
java.io.IOException
-
parse
public final T parse(java.lang.CharSequence source, Parser.Mode mode)
Parses source under the given mode. For example:
    try {
      parser.parse(text, Mode.DEBUG);
    } catch (ParserException e) {
      ParseTree parseTree = e.getParseTree();
      ...
    }
Since:
2.3
-
parseTree
public final ParseTree parseTree(java.lang.CharSequence source)
Parses source and returns a ParseTree corresponding to the syntactical structure of the input. Only labeled parser nodes are represented in the parse tree.
If parsing failed, ParserException.getParseTree() can be inspected for the parse tree at error location.
Since:
2.3
-
parse
@Deprecated public final T parse(java.lang.CharSequence source, java.lang.String moduleName)
Deprecated. Please use parse(CharSequence) instead.
Parses source.
Parameters:
source - the source string
moduleName - the name of the module, this name appears in error message
Returns:
the result
-
parse
@Deprecated public final T parse(java.lang.Readable readable, java.lang.String moduleName) throws java.io.IOException
Deprecated. Please use parse(Readable) instead.
Parses source read from readable.
Parameters:
readable - where the source is read from
moduleName - the name of the module, this name appears in error message
Returns:
the result
Throws:
java.io.IOException
-
apply
abstract boolean apply(ParseContext ctxt)
-
read
static java.lang.StringBuilder read(java.lang.Readable from) throws java.io.IOException
Copies all content from from into the returned StringBuilder.
Throws:
java.io.IOException
-
getReturn
final T getReturn(ParseContext ctxt)
-
applyPrefixOperators
private static <T> T applyPrefixOperators(java.util.List<? extends java.util.function.Function<? super T,? extends T>> ms, T a)
-
applyPostfixOperators
private static <T> T applyPostfixOperators(T a, java.lang.Iterable<? extends java.util.function.Function<? super T,? extends T>> ms)
-
applyInfixOperators
private static <T> T applyInfixOperators(T initialValue, java.util.List<? extends java.util.function.Function<? super T,? extends T>> functions)
-
applyInfixrOperators
private static <T> T applyInfixrOperators(T first, java.util.List<Parser.Rhs<T>> rhss)
-
-