Class AvoidEscapedUnicodeCharactersCheck
- All Implemented Interfaces:
Configurable, Contextualizable
Restricts the use of Unicode escapes (such as \u221e). Escapes can be allowed for non-printable, control characters. The check can also be configured to allow escapes when a trailing comment is present, or when the literal contains nothing but escapes.
-
Property
allowEscapesForControlCharacters
- Allow use of escapes for non-printable, control characters. Type is boolean. Default value is false.
-
Property
allowByTailComment
- Allow use of escapes if a trailing comment is present. Type is boolean. Default value is false.
-
Property
allowIfAllCharactersEscaped
- Allow if all characters in literal are escaped. Type is boolean. Default value is false.
-
Property
allowNonPrintableEscapes
- Allow use of escapes for non-printable, whitespace characters. Type is boolean. Default value is false.
To configure the check:
<module name="AvoidEscapedUnicodeCharacters"/>
Examples of using Unicode:
String unitAbbrev = "μs"; // OK, perfectly clear even without a comment.
String unitAbbrev = "\u03bcs"; // violation, the reader has no idea what this is.
return '\ufeff' + content; // OK, an example of non-printable,
                           // control characters (byte order mark).
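The escaped and unescaped spellings produce the same string at run time, because the compiler translates each Unicode escape into the character it names. The sketch below illustrates this; the class and method names are illustrative and are not part of Checkstyle.

```java
final class EscapeEquivalenceDemo {

    /** Returns the Unicode name of the first code point in the given text. */
    static String firstCodePointName(String text) {
        return Character.getName(text.codePointAt(0));
    }

    public static void main(String[] args) {
        // Written with an escape: the reader cannot tell which letter this is.
        String escaped = "\u03bcs";
        // The escape is resolved at compile time, so the string holds
        // exactly two characters: the Greek letter and 's'.
        System.out.println(escaped.length());
        System.out.println(firstCodePointName(escaped));
    }
}
```

Only the readability of the source differs, which is why the check asks for the literal character or an explanatory comment.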
An example of how to configure the check to allow using escapes for non-printable, control characters:
<module name="AvoidEscapedUnicodeCharacters">
  <property name="allowEscapesForControlCharacters" value="true"/>
</module>
Example of using escapes for non-printable, control characters:
String unitAbbrev = "μs"; // OK, a normal String
String unitAbbrev = "\u03bcs"; // violation, "\u03bcs" is a printable character.
return '\ufeff' + content; // OK, non-printable control character.
An example of how to configure the check to allow using escapes if a trailing comment is present:
<module name="AvoidEscapedUnicodeCharacters">
  <property name="allowByTailComment" value="true"/>
</module>
Example of using escapes if a trailing comment is present:
String unitAbbrev = "μs"; // OK, a normal String
String unitAbbrev = "\u03bcs"; // OK, Greek letter mu, "s"
return '\ufeff' + content;
// -----^--------------------- violation, comment is not used within same line.
An example of how to configure the check to allow escapes if all characters in the literal are escaped:
<module name="AvoidEscapedUnicodeCharacters">
  <property name="allowIfAllCharactersEscaped" value="true"/>
</module>
Example of using escapes if all characters in literal are escaped:
String unitAbbrev = "μs"; // OK, a normal String
String unitAbbrev = "\u03bcs"; // violation, not all characters are escaped ('s').
String unitAbbrev = "\u03bc\u03bc\u03bc"; // OK
String unitAbbrev = "\u03bc\u03bcs"; // violation, not all characters are escaped ('s').
return '\ufeff' + content; // OK, all control characters are escaped
An example of how to configure the check to allow using escapes for non-printable whitespace characters:
<module name="AvoidEscapedUnicodeCharacters">
  <property name="allowNonPrintableEscapes" value="true"/>
</module>
Example of using escapes for non-printable whitespace characters:
String unitAbbrev = "μs"; // OK, a normal String
String unitAbbrev1 = "\u03bcs"; // violation, printable escape character.
String unitAbbrev2 = "\u03bc\u03bc\u03bc"; // violation, printable escape character.
String unitAbbrev3 = "\u03bc\u03bcs"; // violation, printable escape character.
return '\ufeff' + content; // OK, non-printable escape character.
Parent is com.puppycrawl.tools.checkstyle.TreeWalker
Violation Message Keys:
- forbid.escaped.unicode.char
Since:
- 5.8
-
Nested Class Summary
Nested classes/interfaces inherited from class com.puppycrawl.tools.checkstyle.api.AutomaticBean
AutomaticBean.OutputStreamOptions
-
Field Summary
Fields (Modifier and Type, Field, Description)
private static final Pattern ALL_ESCAPED_CHARS
- Regular expression for all escaped chars.
private boolean allowByTailComment
- Allow use of escapes if a trailing comment is present.
private boolean allowEscapesForControlCharacters
- Allow use of escapes for non-printable, control characters.
private boolean allowIfAllCharactersEscaped
- Allow if all characters in literal are escaped.
private boolean allowNonPrintableEscapes
- Allow use of escapes for non-printable, whitespace characters.
blockComments
- C style comments.
private static final Pattern ESCAPED_BACKSLASH
- Regular expression for escaped backslash.
static final String MSG_KEY
- A key pointing to the warning message text in the "messages.properties" file.
private static final Pattern NON_PRINTABLE_CHARS
- Regular expression for non-printable Unicode chars.
singlelineComments
- Cpp style comments.
private static final Pattern UNICODE_CONTROL
- Regular expression for Unicode control characters.
private static final Pattern UNICODE_REGEXP
- Regular expression for Unicode chars.
-
Constructor Summary
Constructors
AvoidEscapedUnicodeCharactersCheck()
-
Method Summary
Modifier and Type, Method, Description
void beginTree(DetailAST rootAST)
- Called before starting to process a tree.
private static int countMatches(Pattern pattern, String target)
- Counts regexp matches in a String literal.
int[] getAcceptableTokens()
- The configurable token set.
int[] getDefaultTokens()
- Returns the default token a check is interested in.
int[] getRequiredTokens()
- The tokens that this check must be registered for.
private boolean hasTrailComment(DetailAST ast)
- Checks if a trailing comment is present after the ast token.
private static boolean hasUnicodeChar(String literal)
- Checks if literal has Unicode chars.
private boolean isAllCharactersEscaped(String literal)
- Checks if all characters in a String literal are escaped.
private static boolean isOnlyUnicodeValidChars(String literal, Pattern pattern)
- Checks if a String literal contains Unicode control chars.
private static boolean isTrailingBlockComment(TextBlock comment, int... codePoints)
- Whether the C style comment is trailing.
final void setAllowByTailComment(boolean allow)
- Setter to allow use of escapes if a trailing comment is present.
final void setAllowEscapesForControlCharacters(boolean allow)
- Setter to allow use of escapes for non-printable, control characters.
final void setAllowIfAllCharactersEscaped(boolean allow)
- Setter to allow if all characters in literal are escaped.
final void setAllowNonPrintableEscapes(boolean allow)
- Setter to allow use of escapes for non-printable, whitespace characters.
void visitToken(DetailAST ast)
- Called to process a token.
Methods inherited from class com.puppycrawl.tools.checkstyle.api.AbstractCheck
clearViolations, destroy, finishTree, getFileContents, getLine, getLineCodePoints, getLines, getTabWidth, getTokenNames, getViolations, init, isCommentNodesRequired, leaveToken, log, log, log, setFileContents, setTabWidth, setTokens
Methods inherited from class com.puppycrawl.tools.checkstyle.api.AbstractViolationReporter
finishLocalSetup, getCustomMessages, getId, getMessageBundle, getSeverity, getSeverityLevel, setId, setSeverity
Methods inherited from class com.puppycrawl.tools.checkstyle.api.AutomaticBean
configure, contextualize, getConfiguration, setupChild
-
Field Details
-
MSG_KEY
A key pointing to the warning message text in the "messages.properties" file.
-
UNICODE_REGEXP
Regular expression for Unicode chars.
-
UNICODE_CONTROL
Regular expression for Unicode control characters.
-
ALL_ESCAPED_CHARS
Regular expression for all escaped chars. See "EscapeSequence" at https://docs.oracle.com/javase/specs/jls/se15/html/jls-3.html#jls-3.10.7
-
ESCAPED_BACKSLASH
Regular expression for escaped backslash.
-
NON_PRINTABLE_CHARS
Regular expression for non-printable Unicode chars.
-
singlelineComments
Cpp style comments.
-
blockComments
C style comments.
-
allowEscapesForControlCharacters
private boolean allowEscapesForControlCharacters
Allow use of escapes for non-printable, control characters.
-
allowByTailComment
private boolean allowByTailComment
Allow use of escapes if a trailing comment is present.
-
allowIfAllCharactersEscaped
private boolean allowIfAllCharactersEscaped
Allow if all characters in literal are escaped.
-
allowNonPrintableEscapes
private boolean allowNonPrintableEscapes
Allow use of escapes for non-printable, whitespace characters.
-
-
Constructor Details
-
AvoidEscapedUnicodeCharactersCheck
public AvoidEscapedUnicodeCharactersCheck()
-
-
Method Details
-
setAllowEscapesForControlCharacters
public final void setAllowEscapesForControlCharacters(boolean allow)
Setter to allow use of escapes for non-printable, control characters.
- Parameters:
allow - user's value.
-
setAllowByTailComment
public final void setAllowByTailComment(boolean allow)
Setter to allow use of escapes if a trailing comment is present.
- Parameters:
allow - user's value.
-
setAllowIfAllCharactersEscaped
public final void setAllowIfAllCharactersEscaped(boolean allow)
Setter to allow if all characters in literal are escaped.
- Parameters:
allow - user's value.
-
setAllowNonPrintableEscapes
public final void setAllowNonPrintableEscapes(boolean allow)
Setter to allow use of escapes for non-printable, whitespace characters.
- Parameters:
allow - user's value.
-
getDefaultTokens
public int[] getDefaultTokens()
Description copied from class: AbstractCheck
Returns the default token a check is interested in. Only used if the configuration for a check does not define the tokens.
- Specified by:
getDefaultTokens in class AbstractCheck
- Returns:
the default tokens
-
getAcceptableTokens
public int[] getAcceptableTokens()
Description copied from class: AbstractCheck
The configurable token set. Used to protect Checks against malicious users who specify an unacceptable token set in the configuration file. The default implementation returns the check's default tokens.
- Specified by:
getAcceptableTokens in class AbstractCheck
- Returns:
the token set this check is designed for.
-
getRequiredTokens
public int[] getRequiredTokens()
Description copied from class: AbstractCheck
The tokens that this check must be registered for.
- Specified by:
getRequiredTokens in class AbstractCheck
- Returns:
the token set this must be registered for.
-
beginTree
public void beginTree(DetailAST rootAST)
Description copied from class: AbstractCheck
Called before starting to process a tree. Ideal place to initialize information that is to be collected whilst processing a tree.
- Overrides:
beginTree in class AbstractCheck
- Parameters:
rootAST - the root of the tree
-
visitToken
public void visitToken(DetailAST ast)
Description copied from class: AbstractCheck
Called to process a token.
- Overrides:
visitToken in class AbstractCheck
- Parameters:
ast - the token to process
-
hasUnicodeChar
private static boolean hasUnicodeChar(String literal)
Checks if literal has Unicode chars.
- Parameters:
literal - String literal.
- Returns:
true if literal has Unicode chars.
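One way such a test could work is sketched below. It is not the Checkstyle implementation: the two pattern values are assumptions modelled on the descriptions of UNICODE_REGEXP and ESCAPED_BACKSLASH, and the class and method names are hypothetical. The input is the literal as it is written in the source file, including its quotes.

```java
import java.util.regex.Pattern;

final class UnicodeEscapeDetector {

    // Assumed pattern: a backslash, one or more 'u', then four hex digits.
    private static final Pattern UNICODE_ESCAPE =
            Pattern.compile("\\\\u+[0-9a-fA-F]{4}");

    // Assumed pattern: two consecutive backslashes (an escaped backslash).
    private static final Pattern ESCAPED_BACKSLASH =
            Pattern.compile("\\\\\\\\");

    /** Returns true if the source text of a literal contains a Unicode escape. */
    static boolean hasUnicodeEscape(String literalText) {
        // Drop escaped backslashes first, so that a doubled backslash
        // followed by 'u' and hex digits is not mistaken for an escape.
        String withoutEscapedBackslashes =
                ESCAPED_BACKSLASH.matcher(literalText).replaceAll("");
        return UNICODE_ESCAPE.matcher(withoutEscapedBackslashes).find();
    }
}
```

Removing escaped backslashes before searching is the design choice that keeps a literal backslash followed by the letter u from being reported.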
-
isOnlyUnicodeValidChars
private static boolean isOnlyUnicodeValidChars(String literal, Pattern pattern)
Checks if a String literal contains Unicode control chars.
- Parameters:
literal - String literal.
pattern - RegExp for valid characters.
- Returns:
true if the String literal contains Unicode control chars.
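The method name and the "RegExp for valid characters" parameter suggest a comparison between all escapes in the literal and those matched by the supplied pattern. The sketch below works under that assumption and is not the Checkstyle implementation; both patterns, the class name, and the method name are hypothetical, and Matcher.results() requires Java 9 or later.

```java
import java.util.regex.Pattern;

final class OnlyAllowedEscapesSketch {

    // Assumed pattern for any Unicode escape in source text.
    private static final Pattern ANY_ESCAPE =
            Pattern.compile("\\\\u+[0-9a-fA-F]{4}");

    // Hypothetical "valid" pattern: escapes for U+0000 to U+001F and U+FEFF only.
    static final Pattern SAMPLE_ALLOWED =
            Pattern.compile("\\\\u+(?:00[01][0-9a-fA-F]|[fF][eE][fF][fF])");

    /** Returns true if every Unicode escape in the text also matches the allowed pattern. */
    static boolean containsOnlyAllowedEscapes(String literalText, Pattern allowed) {
        long allEscapes = ANY_ESCAPE.matcher(literalText).results().count();
        long allowedEscapes = allowed.matcher(literalText).results().count();
        // Equal counts mean no escape falls outside the allowed set.
        return allEscapes == allowedEscapes;
    }
}
```

Note that a literal without any escapes also yields true here, since both counts are zero.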
-
hasTrailComment
private boolean hasTrailComment(DetailAST ast)
Checks if a trailing comment is present after the ast token.
- Parameters:
ast - current token.
- Returns:
true if a trailing comment is present after the ast token.
-
isTrailingBlockComment
private static boolean isTrailingBlockComment(TextBlock comment, int... codePoints)
Whether the C style comment is trailing.
- Parameters:
comment - the comment to check.
codePoints - the first line of the comment, in Unicode code points
- Returns:
true if the comment is trailing.
-
countMatches
private static int countMatches(Pattern pattern, String target)
Counts regexp matches in a String literal.
- Parameters:
pattern - pattern.
target - String literal.
- Returns:
count of regexp matches.
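Counting matches with java.util.regex is usually done by calling Matcher.find() in a loop. The following is a minimal sketch of that idiom with the same parameter list; the class name is hypothetical and the body is not taken from Checkstyle.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class MatchCounter {

    /** Counts the non-overlapping matches of the pattern in the target. */
    static int countMatches(Pattern pattern, String target) {
        int count = 0;
        Matcher matcher = pattern.matcher(target);
        // Each successful find() resumes after the end of the previous match.
        while (matcher.find()) {
            count++;
        }
        return count;
    }
}
```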
-
isAllCharactersEscaped
private boolean isAllCharactersEscaped(String literal)
Checks if all characters in a String literal are escaped.
- Parameters:
literal - current literal.
- Returns:
true if all characters in the String literal are escaped.
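A whole-string regular expression is one way to express "every character is an escape". The sketch below assumes the literal is passed with its surrounding quotes and uses a simplified pattern covering Unicode escapes, the single-character escapes, and octal escapes. The pattern, class name, and method name are assumptions, not the value of ALL_ESCAPED_CHARS.

```java
import java.util.regex.Pattern;

final class AllEscapedSketch {

    // Assumed pattern: an opening quote, one or more escape sequences,
    // and a closing quote, with nothing else in between.
    private static final Pattern ALL_ESCAPED = Pattern.compile(
            "[\"'](?:\\\\u+[0-9a-fA-F]{4}|\\\\[btnfrs\"'\\\\]|\\\\[0-7]{1,3})+[\"']");

    /** Returns true if the literal's source text consists only of escape sequences. */
    static boolean isAllCharactersEscaped(String literalText) {
        // matches() requires the whole input to fit the pattern, so a single
        // unescaped character anywhere in the literal makes it fail.
        return ALL_ESCAPED.matcher(literalText).matches();
    }
}
```

With this sketch, the source text "\u03bc\u03bc\u03bc" is accepted while "\u03bcs" is rejected because of the trailing s, in line with the allowIfAllCharactersEscaped examples.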
-