Package tilda.utils
Class HTMLFilter
java.lang.Object
    tilda.utils.HTMLFilter
public class HTMLFilter extends java.lang.Object
-
-
Field Summary
Modifier and Type                          Field
protected static java.util.regex.Pattern   _BODYEND_PATTERN
protected static java.util.regex.Pattern   _BODYSTART_PATTERN
protected static java.util.regex.Pattern   _JS_PATTERN
protected static java.util.regex.Pattern   _ONXXX_PATTERN
protected static java.util.regex.Pattern   _TAGREMOVE_PATTERN
protected static java.lang.String          DQUOTED_STR
protected static java.lang.String          DQUOTED_STR_BASE
protected static java.lang.String          HTMLATTRIBUTE
protected static java.lang.String          NO_SPACE_STR
protected static java.lang.String          SQUOTED_STR
-
Constructor Summary
Constructor
HTMLFilter()
-
Method Summary
Modifier and Type                         Method and Description
static java.lang.String                   cleanAbsolute(java.lang.String Str)
                                              Blindly replaces all '<' and '>' in the passed-in Str with '&lt;' and '&gt;' respectively.
static java.lang.String                   cleanSmart(java.lang.String Name, java.lang.String Str)
                                              Detects and disables several potentially dangerous code snippets in HTML content.
protected static int                      detect(java.lang.String Name, java.util.regex.Pattern P, java.lang.String Str)
protected static int                      findFirst(java.lang.String Name, java.util.regex.Pattern P, java.lang.String Str)
                                              Returns the start index of the first match.
protected static int                      findLast(java.lang.String Name, java.util.regex.Pattern P, java.lang.String Str)
                                              Returns the end index of the last match.
protected static java.lang.String         formatReportOutput(java.lang.String Name, java.lang.String Value)
protected static java.lang.String         getEndTagRegex(java.lang.String Tag)
static java.util.List<java.lang.String>   getFilterReportForThread()
protected static java.lang.String         getStartTagRegex(java.lang.String Tag)
protected static java.lang.String         getTagBlockRegex(java.lang.String[] Tags)
protected static void                     replace(java.lang.String Name, java.util.regex.Pattern P, java.lang.StringBuffer Src, java.lang.StringBuffer Dest, java.lang.String ReplaceStr)
-
-
-
Field Detail
-
DQUOTED_STR_BASE
protected static final java.lang.String DQUOTED_STR_BASE
- See Also:
- Constant Field Values
-
DQUOTED_STR
protected static final java.lang.String DQUOTED_STR
- See Also:
- Constant Field Values
-
SQUOTED_STR
protected static final java.lang.String SQUOTED_STR
- See Also:
- Constant Field Values
-
NO_SPACE_STR
protected static final java.lang.String NO_SPACE_STR
- See Also:
- Constant Field Values
-
HTMLATTRIBUTE
protected static final java.lang.String HTMLATTRIBUTE
- See Also:
- Constant Field Values
-
_BODYSTART_PATTERN
protected static final java.util.regex.Pattern _BODYSTART_PATTERN
-
_BODYEND_PATTERN
protected static final java.util.regex.Pattern _BODYEND_PATTERN
-
_TAGREMOVE_PATTERN
protected static final java.util.regex.Pattern _TAGREMOVE_PATTERN
-
_JS_PATTERN
protected static final java.util.regex.Pattern _JS_PATTERN
-
_ONXXX_PATTERN
protected static final java.util.regex.Pattern _ONXXX_PATTERN
-
-
Method Detail
-
cleanAbsolute
public static java.lang.String cleanAbsolute(java.lang.String Str)
Blindly replaces all '<' and '>' in the passed-in Str with '&lt;' and '&gt;' respectively. Very fast, but it destroys any HTML markup in the content.
- Parameters:
- Str - The string to clean up
- Returns:
- the cleaned-up Str
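The replacement described above can be illustrated with a minimal sketch. It is not the Tilda source: the class name `AbsoluteCleanSketch`, the method name `escapeAngleBrackets`, and the null handling are assumptions made for illustration only.

```java
// A stand-in for the behavior documented for cleanAbsolute; not the library code.
final class AbsoluteCleanSketch {
    // Replaces every '<' with "&lt;" and every '>' with "&gt;".
    // Returning null for null input is an assumption, not documented behavior.
    static String escapeAngleBrackets(String str) {
        if (str == null) {
            return null;
        }
        return str.replace("<", "&lt;").replace(">", "&gt;");
    }
}
```

With this sketch, `escapeAngleBrackets("<b>hi</b>")` yields `&lt;b&gt;hi&lt;/b&gt;`: the tags survive only as visible text, which is why the approach is fast but loses the HTML structure.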
-
getStartTagRegex
protected static java.lang.String getStartTagRegex(java.lang.String Tag)
-
getEndTagRegex
protected static java.lang.String getEndTagRegex(java.lang.String Tag)
-
getTagBlockRegex
protected static java.lang.String getTagBlockRegex(java.lang.String[] Tags)
-
getFilterReportForThread
public static java.util.List<java.lang.String> getFilterReportForThread()
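The cleanSmart documentation states that detected attacks are logged in a ThreadLocal List<String> returned by this method. The sketch below shows how such a per-thread report could be kept. The class `ThreadReportSketch`, its `record` helper, and the "name: value" entry format are hypothetical and do not come from the Tilda source.

```java
import java.util.ArrayList;
import java.util.List;

// A per-thread report list, assuming one independent list per thread.
final class ThreadReportSketch {
    // Each thread lazily receives its own empty list on first access.
    private static final ThreadLocal<List<String>> REPORT =
        ThreadLocal.withInitial(ArrayList::new);

    // Hypothetical helper: appends one formatted entry to the current thread's report.
    static void record(String name, String value) {
        REPORT.get().add(name + ": " + value);
    }

    // Returns the list belonging to the calling thread only.
    static List<String> getReportForThread() {
        return REPORT.get();
    }

    public static void main(String[] args) throws InterruptedException {
        record("comment", "onXXX event handler");
        // A second thread sees its own, separate list.
        Thread other = new Thread(() ->
            System.out.println("other thread entries: " + getReportForThread().size()));
        other.start();
        other.join();
        System.out.println("main thread entries: " + getReportForThread().size());
    }
}
```

Because the list is thread-local, a caller would have to read the report on the same thread that performed the filtering.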
-
formatReportOutput
protected static java.lang.String formatReportOutput(java.lang.String Name, java.lang.String Value)
-
replace
protected static void replace(java.lang.String Name, java.util.regex.Pattern P, java.lang.StringBuffer Src, java.lang.StringBuffer Dest, java.lang.String ReplaceStr)
-
findFirst
protected static int findFirst(java.lang.String Name, java.util.regex.Pattern P, java.lang.String Str)
Returns the start index of the first match.
-
findLast
protected static int findLast(java.lang.String Name, java.util.regex.Pattern P, java.lang.String Str)
Returns the end index of the last match.
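The documented results of findFirst and findLast can be sketched with java.util.regex.Matcher. This is an illustration under assumptions: the class `MatchIndexSketch` is hypothetical, the Name parameter (presumably used for reporting) is omitted, and returning -1 when nothing matches is a guess, since the page does not document that case.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative helpers; not the Tilda implementation.
final class MatchIndexSketch {
    // Start index of the first match, or -1 (assumed) when there is none.
    static int findFirst(Pattern p, String str) {
        Matcher m = p.matcher(str);
        return m.find() ? m.start() : -1;
    }

    // End index (exclusive) of the last match, or -1 (assumed) when there is none.
    static int findLast(Pattern p, String str) {
        Matcher m = p.matcher(str);
        int end = -1;
        while (m.find()) {
            end = m.end();
        }
        return end;
    }
}
```

For the pattern `<body>` compiled with `Pattern.CASE_INSENSITIVE` and the input `x<BODY>y<body>z`, `findFirst` returns 1 and `findLast` returns 14.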
-
detect
protected static int detect(java.lang.String Name, java.util.regex.Pattern P, java.lang.String Str)
-
cleanSmart
public static java.lang.String cleanSmart(java.lang.String Name, java.lang.String Str)
Detects and disables several potentially dangerous code snippets in HTML content. It is much slower than cleanAbsolute(java.lang.String), but it preserves HTML contents.
The following patterns are tracked and addressed:
- <SCRIPT>...</SCRIPT>
- <FRAME>...</FRAME>
- <LINK>...</LINK>
- <STYLE>...</STYLE>
- ...<BODY> and </BODY>...
- onXXX event handlers in any HTML tags
- "javascript:" in src and href tag attributes
For example:
- <A href="javascript:somethingbad();"> --> <A BADhref="">
- <IMG onHover="somethingbad();"> --> <IMG BADonHover="">
- <SCRIPT>SomethingBad()</script> --> <BADscript/>
Each attack found is logged in the ThreadLocal List<String>, which you can get through getFilterReportForThread().
- Parameters:
- Name - The name associated with this string for the logging of offenses.
- Str - The string to clean up
- Returns:
- the cleaned-up string, which should be identical to the input if no offense has been found.
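Two of the transformations listed above, the onXXX handler and the "javascript:" attribute rewrites, can be approximated as follows. The regular expressions are simplified stand-ins written for this example; they are not the class's _ONXXX_PATTERN or _JS_PATTERN, and the class `SmartCleanSketch` is hypothetical. The sketch handles only quoted attribute values and may also match look-alike text outside tags.

```java
import java.util.regex.Pattern;

// Simplified illustration of two documented rewrites; not the Tilda implementation.
final class SmartCleanSketch {
    // Assumed pattern: an on* attribute with a double- or single-quoted value.
    private static final Pattern ON_HANDLER = Pattern.compile(
        "\\b(on[a-z]+)\\s*=\\s*(\"[^\"]*\"|'[^']*')", Pattern.CASE_INSENSITIVE);

    // Assumed pattern: href or src whose quoted value starts with "javascript:".
    private static final Pattern JS_URL = Pattern.compile(
        "\\b(href|src)\\s*=\\s*(\"\\s*javascript:[^\"]*\"|'\\s*javascript:[^']*')",
        Pattern.CASE_INSENSITIVE);

    // Prefixes the attribute name with BAD and empties its value.
    static String disable(String html) {
        String out = ON_HANDLER.matcher(html).replaceAll("BAD$1=\"\"");
        return JS_URL.matcher(out).replaceAll("BAD$1=\"\"");
    }
}
```

Applied to the documented examples, `disable("<IMG onHover=\"somethingbad();\">")` gives `<IMG BADonHover="">` and `disable("<A href=\"javascript:somethingbad();\">")` gives `<A BADhref="">`. Input with no offending attribute, such as `<p class="x">Hi</p>`, comes back unchanged.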
-
-