Interface TokenHandler3<E extends Exception>

All Known Subinterfaces:
TokenHandler, TokenHandler2
All Known Implementing Classes:
CommentRemovalHandler

public interface TokenHandler3<E extends Exception>
Implemented by listeners that handle the events generated by TokenProducer3.
  • Method Summary

    Modifier and Type
    Method
    Description
    void
    character(int index, int codePoint)
    Another character was found, such as punctuation (excluding connector punctuation) or a symbol (Unicode categories Sc, Sm and Sk), that is not one of the non-alphanumeric characters allowed in words.
    void
    commented(int index, int commentType, String comment)
    A commented string was found by the parser.
    void
    control(int index, int codePoint)
    A control character codepoint was found.
    void
    endOfStream(int len)
    The stream that was being parsed reached its end.
    default void
    endPunctuation(int index, int codePoint)
    Called when end punctuation (Pe) codepoints are found (except characters handled by rightCurlyBracket(int), rightParenthesis(int) and rightSquareBracket(int)).
    void
    error(int index, byte errCode, CharSequence context)
    An error was found while parsing.
    void
    escaped(int index, int codePoint)
    A codepoint preceded by a backslash was found outside of quoted text.
    void
    leftCurlyBracket(int index)
    Called when the { codepoint is found.
    void
    leftParenthesis(int index)
    Called when the ( codepoint is found.
    void
    leftSquareBracket(int index)
    Called when the [ codepoint is found.
    void
    quoted(int index, CharSequence quoted, int quote)
    A quoted string was found by the parser.
    void
    quotedNewlineChar(int index, int codePoint)
    An unescaped FF/LF/CR control was found while assembling a quoted string.
    void
    quotedWithControl(int index, CharSequence quoted, int quoteCp)
    A quoted string was found by the parser, and contains control characters.
    void
    rightCurlyBracket(int index)
    Called when the } codepoint is found.
    void
    rightParenthesis(int index)
    Called when the ) codepoint is found.
    void
    rightSquareBracket(int index)
    Called when the ] codepoint is found.
    void
    separator(int index, int codePoint)
    A separator (Unicode categories Zs, Zl and Zp) was found.
    default void
    startPunctuation(int index, int codePoint)
    Called when start punctuation (Ps) codepoints are found (except characters handled by leftCurlyBracket(int), leftParenthesis(int) and leftSquareBracket(int)).
    void
    tokenStart(TokenControl control)
    At the beginning of parsing, this method is called, passing the TokenControl object that can be used to fine-control the parsing.
    void
    word(int index, CharSequence word)
    A word was found by the parser (includes connector punctuation).
  • Method Details

    • tokenStart

      void tokenStart(TokenControl control)
      At the beginning of parsing, this method is called, passing the TokenControl object that can be used to fine-control the parsing.
      Parameters:
      control - the TokenControl object in charge of parsing.
      Throws:
      E - in case of an error when processing the tokens.
    • word

      void word(int index, CharSequence word) throws E
      A word was found by the parser (includes connector punctuation).
      Parameters:
      index - the index at which the word was found.
      word - the word.
      Throws:
      E - in case of an error when processing the tokens.
    • separator

      void separator(int index, int codePoint) throws E
      A separator (Unicode categories Zs, Zl and Zp) was found.
      Parameters:
      index - the index at which the separator was found.
      codePoint - the codepoint of the found separator.
      Throws:
      E - in case of an error when processing the tokens.
    • quoted

      void quoted(int index, CharSequence quoted, int quote) throws E
      A quoted string was found by the parser.
      Parameters:
      index - the index at which the quoted string was found.
      quoted - the quoted sequence of characters, without the quotes.
      quote - the quote character.
      Throws:
      E - in case of an error when processing the tokens.
    • quotedWithControl

      void quotedWithControl(int index, CharSequence quoted, int quoteCp) throws E
      A quoted string was found by the parser, and contains control characters.
      Parameters:
      index - the index at which the quoted string was found.
      quoted - the quoted sequence of characters, without the quotes.
      quoteCp - the quote character codepoint.
      Throws:
      E - in case of an error when processing the tokens.
    • quotedNewlineChar

      void quotedNewlineChar(int index, int codePoint) throws E
      An unescaped FF/LF/CR control was found while assembling a quoted string.
      Parameters:
      index - the index at which the control was found.
      codePoint - the FF/LF/CR codepoint.
      Throws:
      E - in case of an error when processing the tokens.
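
As a sketch of how the three quoted-string callbacks might be consumed, the handler below re-serializes each quoted string together with its quote character. `QuotedEvents` and `QuoteEcho` are hypothetical names: the interface copies only the three signatures shown above so that the example compiles without the library, and a real handler would implement TokenHandler3 instead. Replacing control characters with spaces and rejecting unescaped newlines are choices made for this example, not documented behavior.

```java
/** Hypothetical stand-in: only the quoted-string callbacks of TokenHandler3. */
interface QuotedEvents<E extends Exception> {
    void quoted(int index, CharSequence quoted, int quote) throws E;
    void quotedWithControl(int index, CharSequence quoted, int quoteCp) throws E;
    void quotedNewlineChar(int index, int codePoint) throws E;
}

/** Rebuilds quoted strings, quotes included, into a single buffer. */
class QuoteEcho implements QuotedEvents<IllegalStateException> {
    final StringBuilder out = new StringBuilder();

    @Override
    public void quoted(int index, CharSequence quoted, int quote) {
        // The quote arrives as a codepoint (int), hence appendCodePoint.
        out.appendCodePoint(quote).append(quoted).appendCodePoint(quote);
    }

    @Override
    public void quotedWithControl(int index, CharSequence quoted, int quoteCp) {
        // Assumption of this sketch: control characters become spaces.
        out.appendCodePoint(quoteCp);
        quoted.codePoints().forEach(cp ->
                out.appendCodePoint(Character.isISOControl(cp) ? ' ' : cp));
        out.appendCodePoint(quoteCp);
    }

    @Override
    public void quotedNewlineChar(int index, int codePoint) {
        // Assumption of this sketch: an unescaped FF/LF/CR is treated as fatal.
        throw new IllegalStateException(
                "Unescaped newline in quoted string at index " + index);
    }
}
```

Because E is bound to an unchecked exception here, callers of `QuoteEcho` do not need a try/catch block around each callback.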
    • leftParenthesis

      void leftParenthesis(int index) throws E
      Called when the ( codepoint is found.
      Parameters:
      index - the index at which the codepoint was found.
      Throws:
      E - in case of an error when processing the tokens.
    • leftSquareBracket

      void leftSquareBracket(int index) throws E
      Called when the [ codepoint is found.
      Parameters:
      index - the index at which the codepoint was found.
      Throws:
      E - in case of an error when processing the tokens.
    • leftCurlyBracket

      void leftCurlyBracket(int index) throws E
      Called when the { codepoint is found.
      Parameters:
      index - the index at which the codepoint was found.
      Throws:
      E - in case of an error when processing the tokens.
    • rightParenthesis

      void rightParenthesis(int index) throws E
      Called when the ) codepoint is found.
      Parameters:
      index - the index at which the codepoint was found.
      Throws:
      E - in case of an error when processing the tokens.
    • rightSquareBracket

      void rightSquareBracket(int index) throws E
      Called when the ] codepoint is found.
      Parameters:
      index - the index at which the codepoint was found.
      Throws:
      E - in case of an error when processing the tokens.
    • rightCurlyBracket

      void rightCurlyBracket(int index) throws E
      Called when the } codepoint is found.
      Parameters:
      index - the index at which the codepoint was found.
      Throws:
      E - in case of an error when processing the tokens.
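
The six bracket callbacks above lend themselves to a nesting check. The following is a minimal sketch, assuming a hypothetical stand-in interface `BracketEvents` that repeats only those six signatures, and a hypothetical checked exception `UnbalancedException` used as the E type argument. It is not part of the library.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical stand-in: only the six bracket callbacks of TokenHandler3. */
interface BracketEvents<E extends Exception> {
    void leftParenthesis(int index) throws E;
    void leftSquareBracket(int index) throws E;
    void leftCurlyBracket(int index) throws E;
    void rightParenthesis(int index) throws E;
    void rightSquareBracket(int index) throws E;
    void rightCurlyBracket(int index) throws E;
}

/** Hypothetical checked exception carrying the offending index. */
class UnbalancedException extends Exception {
    final int index;

    UnbalancedException(int index, String message) {
        super(message);
        this.index = index;
    }
}

/** Tracks open brackets on a stack and fails on the first mismatch. */
class BracketBalanceChecker implements BracketEvents<UnbalancedException> {
    private final Deque<Character> open = new ArrayDeque<>();

    @Override public void leftParenthesis(int index) { open.push('('); }
    @Override public void leftSquareBracket(int index) { open.push('['); }
    @Override public void leftCurlyBracket(int index) { open.push('{'); }

    @Override public void rightParenthesis(int index) throws UnbalancedException {
        close('(', index);
    }
    @Override public void rightSquareBracket(int index) throws UnbalancedException {
        close('[', index);
    }
    @Override public void rightCurlyBracket(int index) throws UnbalancedException {
        close('{', index);
    }

    /** Pops the innermost open bracket and verifies that it matches. */
    private void close(char expected, int index) throws UnbalancedException {
        Character top = open.poll(); // null when nothing is open
        if (top == null || top != expected) {
            throw new UnbalancedException(index,
                    "Unmatched closing bracket at index " + index);
        }
    }

    /** Number of brackets currently open. */
    int depth() {
        return open.size();
    }
}
```

Binding E to a checked exception forces whoever drives the handler to deal with the failure, which is the main reason the interface is generic in E.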
    • startPunctuation

      default void startPunctuation(int index, int codePoint) throws E
      Called when start punctuation (Ps) codepoints are found (except characters handled by leftCurlyBracket(int), leftParenthesis(int) and leftSquareBracket(int)).
      Parameters:
      index - the index at which the codepoint was found.
      codePoint - the found codepoint.
      Throws:
      E - in case of an error when processing the tokens.
    • endPunctuation

      default void endPunctuation(int index, int codePoint) throws E
      Called when end punctuation (Pe) codepoints are found (except characters handled by rightCurlyBracket(int), rightParenthesis(int) and rightSquareBracket(int)).
      Parameters:
      index - the index at which the codepoint was found.
      codePoint - the found codepoint.
      Throws:
      E - in case of an error when processing the tokens.
    • character

      void character(int index, int codePoint) throws E
      Another character was found, such as punctuation (excluding connector punctuation) or a symbol (Unicode categories Sc, Sm and Sk), that is not one of the non-alphanumeric characters allowed in words.

      Symbols in So category are considered part of words and won't be handled by this method.

      Parameters:
      index - the index at which the character was found.
      codePoint - the codepoint of the found character.
      Throws:
      E - in case of an error when processing the tokens.
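
To make the Unicode categories mentioned on this page concrete, the sketch below maps a codepoint to the name of the callback that this page documents for its general category, using `java.lang.Character.getType(int)`. `CallbackGuess` is a hypothetical helper and only an approximation of the documented rules: it does not model quoted strings, backslash escapes, comments, or the unlisted set of non-alphanumeric characters allowed in words, and it is not the parser's actual dispatch logic.

```java
/** Hypothetical helper: approximates which callback a lone codepoint maps to. */
final class CallbackGuess {
    private CallbackGuess() {
    }

    static String callbackFor(int cp) {
        // The six ASCII brackets have dedicated callbacks.
        switch (cp) {
            case '(': return "leftParenthesis";
            case ')': return "rightParenthesis";
            case '[': return "leftSquareBracket";
            case ']': return "rightSquareBracket";
            case '{': return "leftCurlyBracket";
            case '}': return "rightCurlyBracket";
            default: break;
        }
        switch (Character.getType(cp)) {
            case Character.START_PUNCTUATION:      // Ps
                return "startPunctuation";
            case Character.END_PUNCTUATION:        // Pe
                return "endPunctuation";
            case Character.SPACE_SEPARATOR:        // Zs
            case Character.LINE_SEPARATOR:         // Zl
            case Character.PARAGRAPH_SEPARATOR:    // Zp
                return "separator";
            case Character.CONTROL:                // Cc
                return "control";
            case Character.CONNECTOR_PUNCTUATION:  // Pc, part of words
            case Character.OTHER_SYMBOL:           // So, part of words
                return "word";
            case Character.CURRENCY_SYMBOL:        // Sc
            case Character.MATH_SYMBOL:            // Sm
            case Character.MODIFIER_SYMBOL:        // Sk
            case Character.DASH_PUNCTUATION:       // Pd (assumed: "other punctuation")
            case Character.OTHER_PUNCTUATION:      // Po (assumed, quotes not modeled)
                return "character";
            default:
                // Assumption: letters and digits belong to words; the rest is unknown.
                return Character.isLetterOrDigit(cp) ? "word" : "unspecified";
        }
    }
}
```

For instance, `callbackFor('$')` yields `"character"` because U+0024 is in category Sc, while `callbackFor(0x00A9)` yields `"word"` because the copyright sign is in category So.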
    • escaped

      void escaped(int index, int codePoint) throws E
      A codepoint preceded by a backslash was found outside of quoted text.
      Parameters:
      index - the index at which the escaped codepoint was found.
      codePoint - the escaped codepoint.
      Throws:
      E - in case of an error when processing the tokens.
    • control

      void control(int index, int codePoint) throws E
      A control character codepoint was found.
      Parameters:
      index - the index at which the control codepoint was found.
      codePoint - the control codepoint.
      Throws:
      E - in case of an error when processing the tokens.
    • commented

      void commented(int index, int commentType, String comment) throws E
      A commented string was found by the parser.
      Parameters:
      index - the index at which the commented string was found.
      commentType - the type of comment.
      comment - the commented string.
      Throws:
      E - in case of an error when processing the tokens.
    • endOfStream

      void endOfStream(int len) throws E
      The stream that was being parsed reached its end.
      Parameters:
      len - the length of the processed stream.
      Throws:
      E - in case of an error when processing the tokens.
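
A handler typically accumulates state across callbacks and finalizes it in endOfStream. The word counter below is a minimal sketch of that pattern. `WordEvents` is a hypothetical stand-in that repeats only the word and endOfStream signatures from this page, and the callbacks are expected to be driven by a producer that is not shown here.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical stand-in: only two of the TokenHandler3 callbacks. */
interface WordEvents<E extends Exception> {
    void word(int index, CharSequence word) throws E;
    void endOfStream(int len) throws E;
}

/** Counts word occurrences and records the stream length at the end. */
class WordCounter implements WordEvents<RuntimeException> {
    final Map<String, Integer> counts = new TreeMap<>();
    int streamLength = -1; // stays -1 until endOfStream is reported

    @Override
    public void word(int index, CharSequence word) {
        // Copy to a String: this sketch assumes the CharSequence may be
        // a buffer that the caller reuses after the callback returns.
        counts.merge(word.toString(), 1, Integer::sum);
    }

    @Override
    public void endOfStream(int len) {
        streamLength = len;
    }
}
```

Copying the CharSequence is a defensive choice. Whether the producer reuses its buffers is not stated on this page.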
    • error

      void error(int index, byte errCode, CharSequence context) throws E
      An error was found while parsing.

      Something was found that broke the assumptions made by the parser, like an escape character at the end of the stream or an unmatched quote.

      Parameters:
      index - the index at which the error was found.
      errCode - the error code.
      context - a context sequence. If a string was parsed, it will contain up to 16 characters before and after the error.
      Throws:
      E - if the error handler decides to throw an exception.
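
Since the error callback may either throw or return, two policies are possible. The sketch below contrasts them under stated assumptions: `ErrorEvents` is a hypothetical stand-in that repeats only the error signature, `TokenParseException` is a hypothetical checked exception, and the error code is passed through as a raw byte because its defined values are not listed on this page.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in: only the error callback of TokenHandler3. */
interface ErrorEvents<E extends Exception> {
    void error(int index, byte errCode, CharSequence context) throws E;
}

/** Hypothetical checked exception that keeps the index and the raw code. */
class TokenParseException extends Exception {
    final int index;
    final byte errCode;

    TokenParseException(int index, byte errCode, String message) {
        super(message);
        this.index = index;
        this.errCode = errCode;
    }
}

/** Strict policy: the first reported error is rethrown as a checked exception. */
class StrictErrors implements ErrorEvents<TokenParseException> {
    @Override
    public void error(int index, byte errCode, CharSequence context)
            throws TokenParseException {
        throw new TokenParseException(index, errCode,
                "Error " + errCode + " at index " + index + " near: " + context);
    }
}

/** Lenient policy: errors are recorded and the callback returns normally. */
class LenientErrors implements ErrorEvents<RuntimeException> {
    final List<String> messages = new ArrayList<>();

    @Override
    public void error(int index, byte errCode, CharSequence context) {
        messages.add("Error " + errCode + " at index " + index + " near: " + context);
    }
}
```

With the lenient variant E is bound to RuntimeException, so no callback declares a checked exception. The strict variant pushes the failure to the caller at compile time.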