java.lang.Object
io.sf.carte.uparser.CommentRemovalHandler
- All Implemented Interfaces:
ContentHandler<RuntimeException>, ControlHandler<RuntimeException>, TokenErrorHandler<RuntimeException>, TokenHandler2, TokenHandler3<RuntimeException>
- Direct Known Subclasses:
MinificationHandler
A handler that removes comments.
Example:
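import java.io.IOException;
import java.io.StringReader;
import io.sf.carte.uparser.CommentRemovalHandler;
import io.sf.carte.uparser.TokenProducer;

String removeComments(String text) {
    // Paired delimiters: opening[i] is closed by closing[i].
    String[] opening = { "/*", "<!--" };
    String[] closing = { "*/", "-->" };
    // The text length is used as the initial buffer size.
    CommentRemovalHandler handler = new CommentRemovalHandler(text.length());
    TokenProducer tp = new TokenProducer(handler);
    try {
        tp.parseMultiComment(new StringReader(text), opening, closing);
    } catch (IOException e) {
        // Not expected: an open StringReader reads from memory.
    }
    return handler.getBuffer().toString();
}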
-
Constructor Summary
Constructors
CommentRemovalHandler(int bufSize)
    Construct the handler with the given initial buffer size.
CommentRemovalHandler(StringBuilder buffer)
    Construct the handler with the given buffer.
-
Method Summary
Modifier and Type, Method, Description
void character(int index, int codePoint)
    Another character was found, including punctuation (excluding connector punctuation) and symbols (Sc, Sm and Sk Unicode categories), that is not one of the non-alphanumeric characters allowed in words.
void commented
    A commented string was found by the parser.
void control(int index, int codePoint)
    A control character codepoint was found.
void endOfStream(int len)
    The stream that was being parsed reached its end.
void error(int index, byte errCode, CharSequence context)
    An error was found while parsing.
void escaped(int index, int codePoint)
    A codepoint preceded with a backslash was found outside of quoted text.
getBuffer()
    Get the buffer.
protected int getPreviousCodepoint()
    Get the codepoint that was last processed.
void leftCurlyBracket(int index)
    Called when the { codepoint is found.
void leftParenthesis(int index)
    Called when the ( codepoint is found.
void leftSquareBracket(int index)
    Called when the [ codepoint is found.
void quoted(int index, CharSequence quoted, int quoteCp)
    A quoted string was found by the parser.
void quotedNewlineChar(int index, int codePoint)
    An unescaped FF/LF/CR control was found while assembling a quoted string.
void quotedWithControl(int index, CharSequence quoted, int quoteCp)
    A quoted string was found by the parser, and contains control characters.
void rightCurlyBracket(int index)
    Called when the } codepoint is found.
void rightParenthesis(int index)
    Called when the ) codepoint is found.
void rightSquareBracket(int index)
    Called when the ] codepoint is found.
void separator(int index, int codePoint)
    A separator (Zs, Zl and Zp Unicode categories) was found.
protected void setPreviousCodepoint(int codePoint)
    Set the codepoint that was last processed.
void tokenStart(TokenControl control)
    At the beginning of parsing, this method is called, passing the TokenControl object that can be used to fine-control the parsing.
void word(int index, CharSequence word)
    A word was found by the parser (includes connector punctuation).
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface io.sf.carte.uparser.TokenHandler2
endPunctuation, startPunctuation
-
Constructor Details
-
CommentRemovalHandler
public CommentRemovalHandler(int bufSize)
Construct the handler with the given initial buffer size.
- Parameters:
bufSize - the initial buffer size.
-
CommentRemovalHandler
public CommentRemovalHandler(StringBuilder buffer)
Construct the handler with the given buffer.
- Parameters:
buffer - the buffer.
-
-
Method Details
-
getBuffer
Get the buffer.
- Returns:
the buffer.
-
getPreviousCodepoint
protected int getPreviousCodepoint()
Get the codepoint that was last processed.
If a character sequence was last processed, it returns 65; for a separator or a control character, it returns 32.
- Returns:
the codepoint.
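The convention described above (65 after a character sequence, 32 after a separator or control character) can be illustrated with a small standalone sketch. It does not use or reproduce the library's code: the class `PreviousCodepointSketch`, its method names, and its initial state are all hypothetical, chosen only to show how the two marker values could be tracked.

```java
// Hypothetical sketch, not the CommentRemovalHandler implementation.
final class PreviousCodepointSketch {
    static final int WORD_MARKER = 65;  // 'A': stands for "a character sequence"
    static final int SPACE_MARKER = 32; // ' ': stands for "separator or control"

    // Assumption: the initial value is arbitrary for this sketch.
    private int previous = SPACE_MARKER;

    // A whole character sequence was processed: record the word marker.
    void word(CharSequence word) {
        previous = WORD_MARKER;
    }

    // A separator or control character was processed: record the space marker.
    void separatorOrControl(int codePoint) {
        previous = SPACE_MARKER;
    }

    // Any other single codepoint is recorded as-is.
    void character(int codePoint) {
        previous = codePoint;
    }

    int getPreviousCodepoint() {
        return previous;
    }

    public static void main(String[] args) {
        PreviousCodepointSketch sketch = new PreviousCodepointSketch();
        sketch.word("margin");
        System.out.println(sketch.getPreviousCodepoint());
        sketch.separatorOrControl(0x2028);
        System.out.println(sketch.getPreviousCodepoint());
    }
}
```

A subclass of the real handler would presumably compare the value returned by `getPreviousCodepoint()` against 65 or 32 in a similar way, for example to decide whether whitespace is needed between two tokens.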
-
setPreviousCodepoint
protected void setPreviousCodepoint(int codePoint)
Set the codepoint that was last processed.
- Parameters:
codePoint - the codepoint.
-
tokenStart
public void tokenStart(TokenControl control)
Description copied from interface: TokenHandler2
At the beginning of parsing, this method is called, passing the TokenControl object that can be used to fine-control the parsing.
- Specified by:
tokenStart in interface ControlHandler<RuntimeException>
- Specified by:
tokenStart in interface TokenHandler2
- Parameters:
control - the TokenControl object in charge of parsing.
-
word
public void word(int index, CharSequence word)
Description copied from interface: TokenHandler2
A word was found by the parser (includes connector punctuation).
- Specified by:
word in interface ContentHandler<RuntimeException>
- Specified by:
word in interface TokenHandler2
- Parameters:
index - the index at which the word was found.
word - the word.
-
separator
public void separator(int index, int codePoint)
Description copied from interface: TokenHandler2
A separator (Zs, Zl and Zp Unicode categories) was found.
- Specified by:
separator in interface ContentHandler<RuntimeException>
- Specified by:
separator in interface TokenHandler2
- Parameters:
index - the index at which the separator was found.
codePoint - the codepoint of the found separator.
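To make the Zs, Zl and Zp categories concrete, the following sketch tests a codepoint against them using only `java.lang.Character`. It is an illustration of the categories, not the parser's own classification code; the class `SeparatorSketch` and the method `isSeparator` are hypothetical names.

```java
// Hypothetical sketch: checks the Unicode general category of a codepoint.
final class SeparatorSketch {

    /** Returns true if the codepoint belongs to the Zs, Zl or Zp category. */
    static boolean isSeparator(int codePoint) {
        switch (Character.getType(codePoint)) {
            case Character.SPACE_SEPARATOR:     // Zs, e.g. U+0020 SPACE
            case Character.LINE_SEPARATOR:      // Zl, U+2028
            case Character.PARAGRAPH_SEPARATOR: // Zp, U+2029
                return true;
            default:
                return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isSeparator(' '));    // space is in Zs
        System.out.println(isSeparator(0x2028)); // LINE SEPARATOR is in Zl
        System.out.println(isSeparator('\n'));   // LF is a control (Cc), not a separator
    }
}
```

Note that tab, LF, FF and CR are control characters (category Cc), so under this classification they would be reported through `control` rather than `separator`.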
-
quoted
public void quoted(int index, CharSequence quoted, int quoteCp)
Description copied from interface: TokenHandler2
A quoted string was found by the parser.
- Specified by:
quoted in interface ContentHandler<RuntimeException>
- Specified by:
quoted in interface TokenHandler2
- Parameters:
index - the index at which the quoted string was found.
quoted - the quoted sequence of characters, without the quotes.
quoteCp - the quote character codepoint.
-
quotedWithControl
public void quotedWithControl(int index, CharSequence quoted, int quoteCp)
Description copied from interface: TokenHandler2
A quoted string that contains control characters was found by the parser.
- Specified by:
quotedWithControl in interface ContentHandler<RuntimeException>
- Specified by:
quotedWithControl in interface TokenHandler2
- Parameters:
index - the index at which the quoted string was found.
quoted - the quoted sequence of characters, without the quotes.
quoteCp - the quote character codepoint.
-
quotedNewlineChar
public void quotedNewlineChar(int index, int codePoint)
Description copied from interface: TokenHandler2
An unescaped FF/LF/CR control was found while assembling a quoted string.
- Specified by:
quotedNewlineChar in interface ControlHandler<RuntimeException>
- Specified by:
quotedNewlineChar in interface TokenHandler2
- Parameters:
index - the index at which the control was found.
codePoint - the FF/LF/CR codepoint.
-
leftParenthesis
public void leftParenthesis(int index)
Description copied from interface: TokenHandler2
Called when the ( codepoint is found.
- Specified by:
leftParenthesis in interface ContentHandler<RuntimeException>
- Specified by:
leftParenthesis in interface TokenHandler2
- Parameters:
index - the index at which the codepoint was found.
-
leftSquareBracket
public void leftSquareBracket(int index)
Description copied from interface: TokenHandler2
Called when the [ codepoint is found.
- Specified by:
leftSquareBracket in interface ContentHandler<RuntimeException>
- Specified by:
leftSquareBracket in interface TokenHandler2
- Parameters:
index - the index at which the codepoint was found.
-
leftCurlyBracket
public void leftCurlyBracket(int index)
Description copied from interface: TokenHandler2
Called when the { codepoint is found.
- Specified by:
leftCurlyBracket in interface ContentHandler<RuntimeException>
- Specified by:
leftCurlyBracket in interface TokenHandler2
- Parameters:
index - the index at which the codepoint was found.
-
rightParenthesis
public void rightParenthesis(int index)
Description copied from interface: TokenHandler2
Called when the ) codepoint is found.
- Specified by:
rightParenthesis in interface ContentHandler<RuntimeException>
- Specified by:
rightParenthesis in interface TokenHandler2
- Parameters:
index - the index at which the codepoint was found.
-
rightSquareBracket
public void rightSquareBracket(int index)
Description copied from interface: TokenHandler2
Called when the ] codepoint is found.
- Specified by:
rightSquareBracket in interface ContentHandler<RuntimeException>
- Specified by:
rightSquareBracket in interface TokenHandler2
- Parameters:
index - the index at which the codepoint was found.
-
rightCurlyBracket
public void rightCurlyBracket(int index)
Description copied from interface: TokenHandler2
Called when the } codepoint is found.
- Specified by:
rightCurlyBracket in interface ContentHandler<RuntimeException>
- Specified by:
rightCurlyBracket in interface TokenHandler2
- Parameters:
index - the index at which the codepoint was found.
-
character
public void character(int index, int codePoint)
Description copied from interface: TokenHandler2
Another character was found, including punctuation (excluding connector punctuation) and symbols (Sc, Sm and Sk Unicode categories), that is not one of the non-alphanumeric characters allowed in words.
Symbols in the So category are considered part of words and are not handled by this method.
- Specified by:
character in interface ContentHandler<RuntimeException>
- Specified by:
character in interface TokenHandler2
- Parameters:
index - the index at which the punctuation was found.
codePoint - the codepoint of the found punctuation.
-
escaped
public void escaped(int index, int codePoint)
Description copied from interface: TokenHandler2
A codepoint preceded with a backslash was found outside of quoted text.
- Specified by:
escaped in interface ContentHandler<RuntimeException>
- Specified by:
escaped in interface TokenHandler2
- Parameters:
index - the index at which the escaped codepoint was found.
codePoint - the escaped codepoint.
-
control
public void control(int index, int codePoint)
Description copied from interface: TokenHandler2
A control character codepoint was found.
- Specified by:
control in interface ControlHandler<RuntimeException>
- Specified by:
control in interface TokenHandler2
- Parameters:
index - the index at which the control codepoint was found.
codePoint - the control codepoint.
-
commented
Description copied from interface: TokenHandler2
A commented string was found by the parser.
- Specified by:
commented in interface ContentHandler<RuntimeException>
- Specified by:
commented in interface TokenHandler2
- Parameters:
index - the index at which the commented string was found.
commentType - the type of comment.
comment - the commented string.
-
endOfStream
public void endOfStream(int len)
Description copied from interface: TokenHandler2
The stream that was being parsed reached its end.
- Specified by:
endOfStream in interface ContentHandler<RuntimeException>
- Specified by:
endOfStream in interface TokenHandler2
- Parameters:
len - the length of the processed stream.
-
error
public void error(int index, byte errCode, CharSequence context)
Description copied from interface: TokenHandler2
An error was found while parsing.
Something was found that broke the assumptions made by the parser, such as an escape character at the end of the stream or an unmatched quote.
- Specified by:
error in interface TokenErrorHandler<RuntimeException>
- Specified by:
error in interface TokenHandler2
- Parameters:
index - the index at which the error was found.
errCode - the error code.
context - a context sequence. If a string was parsed, it will contain up to 16 characters before and after the error.
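The idea of a context of "up to 16 characters before and after the error" can be sketched as a clamped window over the input. This is only an illustration under stated assumptions: `ErrorContextSketch` and `contextAround` are hypothetical names, the window is counted in `char` units, and including the character at the error index itself is an assumption rather than documented parser behavior.

```java
// Hypothetical sketch, not the parser's implementation.
final class ErrorContextSketch {
    static final int WINDOW = 16; // characters kept on each side of the error

    /**
     * Returns the text surrounding the given index: up to WINDOW chars before it,
     * the char at the index (if any), and up to WINDOW chars after it.
     */
    static CharSequence contextAround(CharSequence text, int index) {
        // Clamp the index so that out-of-range values cannot break subSequence.
        int i = Math.min(Math.max(index, 0), text.length());
        int start = Math.max(0, i - WINDOW);
        int end = Math.min(text.length(), i + WINDOW + 1);
        return text.subSequence(start, end);
    }

    public static void main(String[] args) {
        String css = "p { color: red; content: \"unterminated } div { margin: 0 }";
        // Assumption for the demo: the error is reported at the opening quote.
        int errorIndex = css.indexOf('"');
        System.out.println(contextAround(css, errorIndex));
    }
}
```

For short inputs the window simply shrinks to the available text, which matches the "up to" wording in the parameter description.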
-