Interface TokenHandler


public interface TokenHandler
To be implemented by listeners that handle the different events generated by the TokenProducer.
  • Method Summary

    Modifier and Type
    Method
    Description
    void
    character(int index, int codePoint)
    A character was found that is not one of the non-alphanumeric characters allowed in words, such as punctuation (excluding connector punctuation) or a symbol in the Sc, Sm or Sk Unicode categories.
    void
    closeGroup(int index, int codePoint)
    Called when one of these codepoints is found: ), ], }
    void
    commented(int index, int commentType, String comment)
    A commented string was found by the parser.
    void
    control(int index, int codePoint)
    A control character codepoint was found.
    void
    endOfStream(int len)
    The stream that was being parsed reached its end.
    void
    error(int index, byte errCode, CharSequence context)
    An error was found while parsing.
    void
    escaped(int index, int codePoint)
    A codepoint preceded by a backslash was found outside of quoted text.
    void
    openGroup(int index, int codePoint)
    Called when one of these codepoints is found: (, [, {
    void
    quoted(int index, CharSequence quoted, int quote)
    A quoted string was found by the parser.
    void
    quotedNewlineChar(int index, int codePoint)
    An unescaped FF/LF/CR control was found while assembling a quoted string.
    void
    quotedWithControl(int index, CharSequence quoted, int quoteCp)
    A quoted string was found by the parser, and contains control characters.
    void
    separator(int index, int codePoint)
    A separator (Zs, Zl and Zp Unicode categories) was found.
    void
    tokenControl(TokenControl control)
    Called at the beginning of parsing, passing the TokenControl object that can be used for fine-grained control of the parsing.
    void
    word(int index, CharSequence word)
    A word was found by the parser (includes connector punctuation).
  • Method Details

    • tokenControl

      void tokenControl(TokenControl control)
      Called at the beginning of parsing, passing the TokenControl object that can be used for fine-grained control of the parsing.
      Parameters:
      control - the TokenControl object in charge of parsing.
    • word

      void word(int index, CharSequence word)
      A word was found by the parser (includes connector punctuation).
      Parameters:
      index - the index at which the word was found.
      word - the word.
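
      As an illustration of the word callback, here is a minimal sketch of a listener that records each word with its index. The class name WordCollector and its accessor methods are hypothetical; to stay self-contained it does not implement TokenHandler, and the events in main are written by hand instead of coming from a TokenProducer.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// In real use this class would declare "implements TokenHandler" and supply
// the remaining callbacks; they are left out so the sketch compiles without
// the library on the classpath.
class WordCollector {
    private final List<String> words = new ArrayList<>();
    private final List<Integer> indexes = new ArrayList<>();
    private int streamLength = -1;

    // Same signature as the documented word callback.
    public void word(int index, CharSequence word) {
        // Copy the sequence defensively (assumption: it may not stay valid
        // after the callback returns).
        words.add(word.toString());
        indexes.add(index);
    }

    // Same signature as the documented endOfStream callback.
    public void endOfStream(int len) {
        streamLength = len;
    }

    public List<String> words() {
        return Collections.unmodifiableList(words);
    }

    public List<Integer> indexes() {
        return Collections.unmodifiableList(indexes);
    }

    public int streamLength() {
        return streamLength;
    }

    public static void main(String[] args) {
        WordCollector c = new WordCollector();
        // Hand-written events, as they might be reported for "foo bar_baz".
        // The underscore is connector punctuation, so it stays in the word.
        c.word(0, "foo");
        c.word(4, "bar_baz");
        c.endOfStream(11);
        System.out.println(c.words() + " at " + c.indexes());
    }
}
```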
    • separator

      void separator(int index, int codePoint)
      A separator (Zs, Zl and Zp Unicode categories) was found.
      Parameters:
      index - the index at which the separator was found.
      codePoint - the codepoint of the found separator.
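
      To see which codepoints fall in the Zs, Zl and Zp categories, the standard java.lang.Character API can be used as in the sketch below. The helper name isSeparator is hypothetical, and whether the parser performs the same check internally is an assumption.

```java
class SeparatorCategories {

    /** Returns true if the codepoint is in the Zs, Zl or Zp general category. */
    static boolean isSeparator(int codePoint) {
        switch (Character.getType(codePoint)) {
            case Character.SPACE_SEPARATOR:      // Zs
            case Character.LINE_SEPARATOR:       // Zl
            case Character.PARAGRAPH_SEPARATOR:  // Zp
                return true;
            default:
                return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isSeparator(' '));     // U+0020 SPACE is in Zs
        System.out.println(isSeparator(0x2028));  // U+2028 LINE SEPARATOR is in Zl
        System.out.println(isSeparator('\t'));    // U+0009 TAB is a control (Cc)
    }
}
```

      Note that TAB, LF and CR are control characters (Cc) rather than separators, so they would not match this check.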
    • quoted

      void quoted(int index, CharSequence quoted, int quote)
      A quoted string was found by the parser.
      Parameters:
      index - the index at which the quoted string was found.
      quoted - the quoted sequence of characters, without the quotes.
      quote - the quote character.
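
      Because the quoted argument excludes the quotes and the quote character arrives separately as a codepoint, a listener that wants the original delimiters back has to re-add them itself. A minimal sketch, with the hypothetical class name QuotedEcho, follows; it does not re-apply any escaping, since this page does not say how escapes inside quoted text are reported.

```java
// In real use this would be one callback of a TokenHandler implementation.
class QuotedEcho {
    private final StringBuilder out = new StringBuilder();

    // Same signature as the documented quoted callback.
    public void quoted(int index, CharSequence quoted, int quote) {
        // The quote is a codepoint, so appendCodePoint is used instead of a char cast.
        out.appendCodePoint(quote).append(quoted).appendCodePoint(quote);
    }

    public String result() {
        return out.toString();
    }

    public static void main(String[] args) {
        QuotedEcho echo = new QuotedEcho();
        // Hand-written event for a double-quoted string containing: a b
        echo.quoted(0, "a b", '"');
        System.out.println(echo.result());
    }
}
```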
    • quotedWithControl

      void quotedWithControl(int index, CharSequence quoted, int quoteCp)
      A quoted string was found by the parser, and contains control characters.
      Parameters:
      index - the index at which the quoted string was found.
      quoted - the quoted sequence of characters, without the quotes.
      quoteCp - the quote character codepoint.
    • quotedNewlineChar

      void quotedNewlineChar(int index, int codePoint)
      An unescaped FF/LF/CR control was found while assembling a quoted string.
      Parameters:
      index - the index at which the control was found.
      codePoint - the FF/LF/CR codepoint.
    • openGroup

      void openGroup(int index, int codePoint)
      Called when one of these codepoints is found: (, [, {
      Parameters:
      index - the index at which the codepoint was found.
      codePoint - the found codepoint.
    • closeGroup

      void closeGroup(int index, int codePoint)
      Called when one of these codepoints is found: ), ], }
      Parameters:
      index - the index at which the codepoint was found.
      codePoint - the found codepoint.
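
      The openGroup and closeGroup callbacks pair naturally with a stack. The sketch below checks that groups are balanced; the class name GroupBalanceTracker and its accessor methods are hypothetical, and it is a standalone class rather than a full TokenHandler implementation.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// In real use these two methods would be part of a TokenHandler implementation.
class GroupBalanceTracker {
    private final Deque<Integer> open = new ArrayDeque<>();
    private int firstMismatchIndex = -1;

    // Same signature as the documented openGroup callback.
    public void openGroup(int index, int codePoint) {
        open.push(codePoint);
    }

    // Same signature as the documented closeGroup callback.
    public void closeGroup(int index, int codePoint) {
        // -1 never equals a real codepoint, so a close without an open is a mismatch.
        int expected = open.isEmpty() ? -1 : matching(open.pop());
        if (expected != codePoint && firstMismatchIndex < 0) {
            firstMismatchIndex = index;
        }
    }

    private static int matching(int openCp) {
        switch (openCp) {
            case '(': return ')';
            case '[': return ']';
            case '{': return '}';
            default:  return -1;
        }
    }

    /** True when every opened group was closed by its matching codepoint. */
    public boolean isBalanced() {
        return open.isEmpty() && firstMismatchIndex < 0;
    }

    /** Number of groups currently open. */
    public int depth() {
        return open.size();
    }

    /** Index of the first mismatched closing codepoint, or -1 if none. */
    public int firstMismatchIndex() {
        return firstMismatchIndex;
    }

    public static void main(String[] args) {
        GroupBalanceTracker t = new GroupBalanceTracker();
        // Hand-written events, as they might be reported for "f(a[0])".
        t.openGroup(1, '(');
        t.openGroup(3, '[');
        t.closeGroup(5, ']');
        t.closeGroup(6, ')');
        System.out.println("balanced: " + t.isBalanced());
    }
}
```

      A listener could call isBalanced() from endOfStream to report unclosed groups once parsing finishes.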
    • character

      void character(int index, int codePoint)
      A character was found that is not one of the non-alphanumeric characters allowed in words, such as punctuation (excluding connector punctuation) or a symbol in the Sc, Sm or Sk Unicode categories.

      Symbols in the So category are considered part of words and are not handled by this method.

      Parameters:
      index - the index at which the character was found.
      codePoint - the codepoint of the found character.
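
      The category names used in this description map onto constants of java.lang.Character. The sketch below labels a codepoint with its category so the distinction between Sc, Sm, Sk (reported here) and So, Pc (part of words) can be checked for a given character. The helper name describe is hypothetical, and the sketch ignores any extra characters the parser may be configured to allow in words.

```java
class SymbolCategories {

    /** Returns the two-letter Unicode category for the symbol and connector classes. */
    static String describe(int codePoint) {
        switch (Character.getType(codePoint)) {
            case Character.CURRENCY_SYMBOL:       return "Sc";
            case Character.MATH_SYMBOL:           return "Sm";
            case Character.MODIFIER_SYMBOL:       return "Sk";
            case Character.OTHER_SYMBOL:          return "So";
            case Character.CONNECTOR_PUNCTUATION: return "Pc";
            default:                              return "other";
        }
    }

    public static void main(String[] args) {
        System.out.println(describe('$'));     // currency symbol
        System.out.println(describe('+'));     // math symbol
        System.out.println(describe('^'));     // modifier symbol
        System.out.println(describe(0x00A9));  // COPYRIGHT SIGN, an "other" symbol
        System.out.println(describe('_'));     // connector punctuation
    }
}
```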
    • escaped

      void escaped(int index, int codePoint)
      A codepoint preceded by a backslash was found outside of quoted text.
      Parameters:
      index - the index at which the escaped codepoint was found.
      codePoint - the escaped codepoint.
    • control

      void control(int index, int codePoint)
      A control character codepoint was found.
      Parameters:
      index - the index at which the control codepoint was found.
      codePoint - the control codepoint.
    • commented

      void commented(int index, int commentType, String comment)
      A commented string was found by the parser.
      Parameters:
      index - the index at which the commented string was found.
      commentType - the type of comment.
      comment - the commented string.
    • endOfStream

      void endOfStream(int len)
      The stream that was being parsed reached its end.
      Parameters:
      len - the length of the processed stream.
    • error

      void error(int index, byte errCode, CharSequence context)
      An error was found while parsing.

      Something was found that broke the assumptions made by the parser, like an escape character at the end of the stream or an unmatched quote.

      Parameters:
      index - the index at which the error was found.
      errCode - the error code.
      context - a context sequence. If a string was parsed, it will contain up to 16 characters before and after the error.
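
      To give a feel for what a context of up to 16 characters before and after the error looks like, here is a sketch that cuts such a window out of a string. It is not the parser's implementation: the class name ErrorContext, the helper around, and the choice to include the character at the error index are all assumptions, and indexes are treated as char offsets with no special handling of surrogate pairs.

```java
class ErrorContext {
    static final int WINDOW = 16;

    /**
     * Returns up to WINDOW chars before index, the char at index (if any),
     * and up to WINDOW chars after it. Expects 0 <= index <= text.length().
     */
    static CharSequence around(CharSequence text, int index) {
        int start = Math.max(0, index - WINDOW);
        int end = Math.min(text.length(), index + WINDOW + 1);
        return text.subSequence(start, end);
    }

    public static void main(String[] args) {
        String text = "0123456789".repeat(4); // 40 chars; requires Java 11+
        System.out.println(around(text, 20));
        System.out.println(around("abc", 1)); // short input: window is clipped
    }
}
```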