[Owasp-antisamy] Error parsing <style> contents containing CDATA

Gang Zheng gzheng at gmail.com
Mon Nov 24 16:10:50 EST 2008


Hi,

I tried the following input string with AntiSamy and encountered an exception:

Input String: <style type="text/css"><![CDATA[P {  margin-bottom:
0.08in; } ]]></style>

org.apache.batik.css.parser.ParseException: character
	at org.apache.batik.css.parser.Scanner.nextToken(Scanner.java:381)
	at org.apache.batik.css.parser.Scanner.next(Scanner.java:222)
	at org.apache.batik.css.parser.Parser.parseStyleSheet(Parser.java:185)
	at org.owasp.validator.css.CssScanner.scanStyleSheet(CssScanner.java:124)
	at org.owasp.validator.html.scan.AntiSamyDOMScanner.recursiveValidateTag(AntiSamyDOMScanner.java:318)
	at org.owasp.validator.html.scan.AntiSamyDOMScanner.scan(AntiSamyDOMScanner.java:135)
	at org.owasp.validator.html.AntiSamy.scan(AntiSamy.java:99)

I traced the code, and it seems that AntiSamy passes the text
"<![CDATA[P {  margin-bottom: 0.08in; } ]]>" to the CSS scanner
unchanged, and the CSS scanner fails on the <![CDATA[...]]> markers
that wrap the actual style sheet contents.

If I remove the CDATA from the input and change the style sheet
contents to "<style type="text/css">P {  margin-bottom: 0.08in;
}</style>", everything works fine.
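
Removing the CDATA wrapper by hand, as described above, could also be
done in code before the input reaches AntiSamy. The following is only a
rough sketch of that idea. StyleCdataStripper and stripStyleCdata are
hypothetical names, not part of AntiSamy, and the regular expression
assumes that each <style> element contains exactly one CDATA section
wrapping its entire contents. Other inputs are left unchanged.

```java
import java.util.regex.Pattern;

// Hypothetical pre-processing helper; not part of the AntiSamy API.
public class StyleCdataStripper {

    // Matches <style ...>, optional whitespace, one CDATA section,
    // optional whitespace, and the closing </style> tag.
    // Group 1: opening tag, group 2: CSS text, group 3: closing tag.
    private static final Pattern STYLE_CDATA = Pattern.compile(
            "(<style\\b[^>]*>)\\s*<!\\[CDATA\\[(.*?)\\]\\]>\\s*(</style>)",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Returns the input with the CDATA markers removed from any
    // <style> element whose whole body is a single CDATA section.
    public static String stripStyleCdata(String html) {
        return STYLE_CDATA.matcher(html).replaceAll("$1$2$3");
    }

    public static void main(String[] args) {
        String input = "<style type=\"text/css\"><![CDATA[P {  margin-bottom: "
                + "0.08in; } ]]></style>";
        // The result would then be passed to AntiSamy in place of the
        // original input string.
        System.out.println(stripStyleCdata(input));
    }
}
```

This only works around the problem on the input side. Regular
expressions are a fragile way to handle HTML, so it does not answer
whether the scanner itself can be made to accept CDATA.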

So my question is: how can I make the AntiSamy CSS scanner correctly
parse the CDATA contents? After all, the CDATA section in the original
input contains perfectly legal style sheet contents.

Thanks,

-Gang

