- All Superinterfaces:
TokenHandler3<RuntimeException>
- All Known Subinterfaces:
TokenHandler
- All Known Implementing Classes:
CommentRemovalHandler
A TokenHandler3 that has no checked exceptions, backwards-compatible with TokenProducer 2.x.
Most token handlers will report problems through error handlers and produce no checked exceptions, in which case you should use this handler together with TokenProducer. In other use cases your handler may want to throw checked exceptions, and then you must use TokenProducer3 together with TokenHandler3 instead.
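The relationship between the two handler types can be pictured with a minimal sketch. The interfaces `Handler3Sketch` and `UncheckedHandlerSketch` below are hypothetical stand-ins reduced to a single callback, not the library's declarations. They only illustrate how binding the exception type parameter to RuntimeException lets implementations and callers drop `throws` clauses and try/catch blocks.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a handler that is generic on its checked exception.
interface Handler3Sketch<E extends Exception> {
    void word(int index, CharSequence word) throws E;
}

// Binding E to RuntimeException: implementations declare no checked exceptions.
interface UncheckedHandlerSketch extends Handler3Sketch<RuntimeException> {
    @Override
    void word(int index, CharSequence word);
}

// Example implementation that records each word with its index.
class WordCollectorSketch implements UncheckedHandlerSketch {
    final List<String> words = new ArrayList<>();

    @Override
    public void word(int index, CharSequence word) {
        words.add(index + ":" + word);
    }

    // The caller needs no try/catch because nothing checked can be thrown.
    static List<String> demo() {
        WordCollectorSketch handler = new WordCollectorSketch();
        handler.word(0, "hello");
        handler.word(6, "world");
        return handler.words;
    }
}
```

A handler that does need a checked exception would instead implement the generic stand-in directly, for example `Handler3Sketch<java.io.IOException>`, and its callers would then have to catch or declare that exception.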
-
Method Summary
| Modifier and Type | Method | Description |
| --- | --- | --- |
| void | character(int index, int codePoint) | Another character, including punctuation (excluding connector punctuation) and symbols (Sc, Sm and Sk Unicode categories), was found that is not one of the non-alphanumeric characters allowed in words. |
| void | commented | A commented string was found by the parser. |
| void | control(int index, int codePoint) | A control character codepoint was found. |
| void | endOfStream(int len) | The stream that was being parsed reached its end. |
| default void | endPunctuation(int index, int codePoint) | Called when end punctuation (Pe) codepoints are found (except characters handled by rightCurlyBracket(int), rightParenthesis(int) and rightSquareBracket(int)). |
| void | error(int index, byte errCode, CharSequence context) | An error was found while parsing. |
| void | escaped(int index, int codePoint) | A codepoint preceded by a backslash was found outside of quoted text. |
| void | leftCurlyBracket(int index) | Called when the { codepoint is found. |
| void | leftParenthesis(int index) | Called when the ( codepoint is found. |
| void | leftSquareBracket(int index) | Called when the [ codepoint is found. |
| void | quoted(int index, CharSequence quoted, int quote) | A quoted string was found by the parser. |
| void | quotedNewlineChar(int index, int codePoint) | An unescaped FF/LF/CR control was found while assembling a quoted string. |
| void | quotedWithControl(int index, CharSequence quoted, int quoteCp) | A quoted string was found by the parser, and contains control characters. |
| void | rightCurlyBracket(int index) | Called when the } codepoint is found. |
| void | rightParenthesis(int index) | Called when the ) codepoint is found. |
| void | rightSquareBracket(int index) | Called when the ] codepoint is found. |
| void | separator(int index, int codePoint) | A separator (Zs, Zl and Zp Unicode categories) was found. |
| default void | startPunctuation(int index, int codePoint) | Called when start punctuation (Ps) codepoints are found (except characters handled by leftCurlyBracket(int), leftParenthesis(int) and leftSquareBracket(int)). |
| void | tokenStart(TokenControl control) | At the beginning of parsing, this method is called, passing the TokenControl object that can be used to fine-control the parsing. |
| void | word(int index, CharSequence word) | A word was found by the parser (includes connector punctuation). |
-
Method Details
-
tokenStart
void tokenStart(TokenControl control)
At the beginning of parsing, this method is called, passing the TokenControl object that can be used to fine-control the parsing.
- Specified by:
tokenStart in interface TokenHandler3<RuntimeException>
- Parameters:
control - the TokenControl object in charge of parsing.
-
word
void word(int index, CharSequence word)
A word was found by the parser (includes connector punctuation).
- Specified by:
word in interface TokenHandler3<RuntimeException>
- Parameters:
index - the index at which the word was found.
word - the word.
-
separator
void separator(int index, int codePoint)
A separator (Zs, Zl and Zp Unicode categories) was found.
- Specified by:
separator in interface TokenHandler3<RuntimeException>
- Parameters:
index - the index at which the separator was found.
codePoint - the codepoint of the found separator.
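The three separator categories map onto constants of java.lang.Character. The following is a minimal sketch of such a category test. The class and method names are hypothetical, and whether the parser performs exactly this check is an assumption.

```java
// Hypothetical helper, not part of the library.
class SeparatorCategorySketch {

    // True if the codepoint belongs to the Zs, Zl or Zp general category.
    static boolean isSeparator(int codePoint) {
        int type = Character.getType(codePoint);
        return type == Character.SPACE_SEPARATOR        // Zs
                || type == Character.LINE_SEPARATOR     // Zl
                || type == Character.PARAGRAPH_SEPARATOR; // Zp
    }
}
```

Note that a tab (U+0009) is a control character (Cc), not a separator, so under this test it would not qualify.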
-
quoted
void quoted(int index, CharSequence quoted, int quote)
A quoted string was found by the parser.
- Specified by:
quoted in interface TokenHandler3<RuntimeException>
- Parameters:
index - the index at which the quoted string was found.
quoted - the quoted sequence of characters, without the quotes.
quote - the quote character.
-
quotedWithControl
void quotedWithControl(int index, CharSequence quoted, int quoteCp)
A quoted string was found by the parser, and contains control characters.
- Specified by:
quotedWithControl in interface TokenHandler3<RuntimeException>
- Parameters:
index - the index at which the quoted string was found.
quoted - the quoted sequence of characters, without the quotes.
quoteCp - the quote character codepoint.
-
quotedNewlineChar
void quotedNewlineChar(int index, int codePoint)
An unescaped FF/LF/CR control was found while assembling a quoted string.
- Specified by:
quotedNewlineChar in interface TokenHandler3<RuntimeException>
- Parameters:
index - the index at which the control was found.
codePoint - the FF/LF/CR codepoint.
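To make "unescaped FF/LF/CR" concrete, here is a minimal sketch that scans the content of a quoted string and collects the positions where such a control occurs. The helper is hypothetical and assumes that a backslash escapes exactly the one character that follows it; the parser's actual escaping rules may differ.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the library.
class QuotedNewlineSketch {

    // Returns the indexes of FF (U+000C), LF (U+000A) and CR (U+000D)
    // characters that are not preceded by an escaping backslash.
    static List<Integer> findUnescapedNewlines(CharSequence quoted) {
        List<Integer> found = new ArrayList<>();
        boolean escaped = false;
        for (int i = 0; i < quoted.length(); i++) {
            char c = quoted.charAt(i);
            if (escaped) {
                // This character is escaped: skip it, whatever it is.
                escaped = false;
            } else if (c == '\\') {
                escaped = true;
            } else if (c == '\f' || c == '\n' || c == '\r') {
                found.add(i);
            }
        }
        return found;
    }
}
```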
-
leftParenthesis
void leftParenthesis(int index)
Called when the ( codepoint is found.
- Specified by:
leftParenthesis in interface TokenHandler3<RuntimeException>
- Parameters:
index - the index at which the codepoint was found.
-
leftSquareBracket
void leftSquareBracket(int index)
Called when the [ codepoint is found.
- Specified by:
leftSquareBracket in interface TokenHandler3<RuntimeException>
- Parameters:
index - the index at which the codepoint was found.
-
leftCurlyBracket
void leftCurlyBracket(int index)
Called when the { codepoint is found.
- Specified by:
leftCurlyBracket in interface TokenHandler3<RuntimeException>
- Parameters:
index - the index at which the codepoint was found.
-
rightParenthesis
void rightParenthesis(int index)
Called when the ) codepoint is found.
- Specified by:
rightParenthesis in interface TokenHandler3<RuntimeException>
- Parameters:
index - the index at which the codepoint was found.
-
rightSquareBracket
void rightSquareBracket(int index)
Called when the ] codepoint is found.
- Specified by:
rightSquareBracket in interface TokenHandler3<RuntimeException>
- Parameters:
index - the index at which the codepoint was found.
-
rightCurlyBracket
void rightCurlyBracket(int index)
Called when the } codepoint is found.
- Specified by:
rightCurlyBracket in interface TokenHandler3<RuntimeException>
- Parameters:
index - the index at which the codepoint was found.
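One common use of the six bracket callbacks is tracking nesting. The following is a minimal sketch under stated assumptions: `BracketTrackerSketch` is a hypothetical class that only mirrors the callback names and does not implement the library interface, and its `scan` method is a stand-in for the parser that dispatches bracket characters to the callbacks.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical helper, not part of the library.
class BracketTrackerSketch {
    private final Deque<Character> open = new ArrayDeque<>();
    private int firstMismatch = -1;

    void leftParenthesis(int index) { open.push('('); }
    void leftSquareBracket(int index) { open.push('['); }
    void leftCurlyBracket(int index) { open.push('{'); }
    void rightParenthesis(int index) { close('(', index); }
    void rightSquareBracket(int index) { close('[', index); }
    void rightCurlyBracket(int index) { close('{', index); }

    // Pops the innermost open bracket and records the first index
    // at which the closing bracket does not match it.
    private void close(char expectedOpen, int index) {
        boolean matches = !open.isEmpty() && open.pop() == expectedOpen;
        if (!matches && firstMismatch < 0) {
            firstMismatch = index;
        }
    }

    boolean isBalanced() { return firstMismatch < 0 && open.isEmpty(); }

    int firstMismatchIndex() { return firstMismatch; }

    // Stand-in for the parser: dispatches each bracket to its callback.
    static BracketTrackerSketch scan(String text) {
        BracketTrackerSketch t = new BracketTrackerSketch();
        for (int i = 0; i < text.length(); i++) {
            switch (text.charAt(i)) {
                case '(': t.leftParenthesis(i); break;
                case '[': t.leftSquareBracket(i); break;
                case '{': t.leftCurlyBracket(i); break;
                case ')': t.rightParenthesis(i); break;
                case ']': t.rightSquareBracket(i); break;
                case '}': t.rightCurlyBracket(i); break;
                default: break;
            }
        }
        return t;
    }
}
```

A real handler would more likely report a mismatch through its error handling rather than store an index; the stored index just keeps the sketch easy to inspect.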
-
startPunctuation
default void startPunctuation(int index, int codePoint)
Called when start punctuation (Ps) codepoints are found (except characters handled by leftCurlyBracket(int), leftParenthesis(int) and leftSquareBracket(int)).
- Specified by:
startPunctuation in interface TokenHandler3<RuntimeException>
- Parameters:
index - the index at which the codepoint was found.
codePoint - the found codepoint.
-
endPunctuation
default void endPunctuation(int index, int codePoint)
Called when end punctuation (Pe) codepoints are found (except characters handled by rightCurlyBracket(int), rightParenthesis(int) and rightSquareBracket(int)).
- Specified by:
endPunctuation in interface TokenHandler3<RuntimeException>
- Parameters:
index - the index at which the codepoint was found.
codePoint - the found codepoint.
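The split between the dedicated bracket callbacks and the generic start/end punctuation callbacks can be sketched with java.lang.Character category constants. The helper below is hypothetical, and whether the parser classifies codepoints in exactly this way is an assumption.

```java
// Hypothetical helper, not part of the library.
class PunctuationCategorySketch {

    // Ps codepoints other than the three that have dedicated callbacks.
    static boolean isOtherStartPunctuation(int cp) {
        return Character.getType(cp) == Character.START_PUNCTUATION
                && cp != '(' && cp != '[' && cp != '{';
    }

    // Pe codepoints other than the three that have dedicated callbacks.
    static boolean isOtherEndPunctuation(int cp) {
        return Character.getType(cp) == Character.END_PUNCTUATION
                && cp != ')' && cp != ']' && cp != '}';
    }
}
```

For instance, the CJK corner brackets U+300C and U+300D fall in the Ps and Pe categories respectively, so under this test they would reach the generic callbacks, while ( and ) would not.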
-
character
void character(int index, int codePoint)
Another character, including punctuation (excluding connector punctuation) and symbols (Sc, Sm and Sk Unicode categories), was found that is not one of the non-alphanumeric characters allowed in words.
Symbols in the So category are considered part of words and won't be handled by this method.
- Specified by:
character in interface TokenHandler3<RuntimeException>
- Parameters:
index - the index at which the punctuation was found.
codePoint - the codepoint of the found punctuation.
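The symbol categories named above correspond to constants of java.lang.Character. The following minimal sketch covers only the symbol part of the description (it ignores punctuation and the characters allowed in words), and the class and method names are hypothetical.

```java
// Hypothetical helper, not part of the library.
class SymbolCategorySketch {

    // Sc, Sm and Sk: symbol categories described as handled by character().
    static boolean isReportedSymbol(int cp) {
        int type = Character.getType(cp);
        return type == Character.CURRENCY_SYMBOL     // Sc
                || type == Character.MATH_SYMBOL     // Sm
                || type == Character.MODIFIER_SYMBOL; // Sk
    }

    // So: described as being treated as part of words instead.
    static boolean isOtherSymbol(int cp) {
        return Character.getType(cp) == Character.OTHER_SYMBOL;
    }
}
```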
-
escaped
void escaped(int index, int codePoint)
A codepoint preceded by a backslash was found outside of quoted text.
- Specified by:
escaped in interface TokenHandler3<RuntimeException>
- Parameters:
index - the index at which the escaped codepoint was found.
codePoint - the escaped codepoint.
-
control
void control(int index, int codePoint)
A control character codepoint was found.
- Specified by:
control in interface TokenHandler3<RuntimeException>
- Parameters:
index - the index at which the control codepoint was found.
codePoint - the control codepoint.
-
commented
A commented string was found by the parser.
- Specified by:
commented in interface TokenHandler3<RuntimeException>
- Parameters:
index - the index at which the commented string was found.
commentType - the type of comment.
comment - the commented string.
-
endOfStream
void endOfStream(int len)
The stream that was being parsed reached its end.
- Specified by:
endOfStream in interface TokenHandler3<RuntimeException>
- Parameters:
len - the length of the processed stream.
-
error
void error(int index, byte errCode, CharSequence context)
An error was found while parsing.
Something was found that broke the assumptions made by the parser, like an escape character at the end of the stream or an unmatched quote.
- Specified by:
error in interface TokenHandler3<RuntimeException>
- Parameters:
index - the index at which the error was found.
errCode - the error code.
context - a context sequence. If a string was parsed, it will contain up to 16 characters before and after the error.
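To picture what a context of "up to 16 characters before and after the error" looks like, here is a minimal sketch that cuts such a window out of a string. The helper is hypothetical, and the exact boundaries the parser uses (for example, whether the character at the error index counts as "after") are an assumption.

```java
// Hypothetical helper, not part of the library.
class ErrorContextSketch {

    // Returns at most 16 characters before the index and 16 from the index on,
    // clamped to the bounds of the text.
    static CharSequence contextAround(CharSequence text, int index) {
        int at = Math.max(0, Math.min(index, text.length()));
        int start = Math.max(0, at - 16);
        int end = Math.min(text.length(), at + 16);
        return text.subSequence(start, end);
    }
}
```

Near the beginning or end of the input the window is simply shorter, which matches the "up to" wording in the parameter description.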
-