[Owasp-antisamy] HTML Sanitization (Tidy), Why?

Arshan Dabirsiaghi arshan.dabirsiaghi at aspectsecurity.com
Tue Jan 8 17:20:09 EST 2008

Unfortunately, there's no getting around this one. There are different classes of attacks, and a few of them (recursion, canonicalization, fragmenting)
depend on how the markup is formatted. So, by cleaning the user's HTML into a well-formed, normalized form before validating it, we can protect ourselves from those types of attacks.
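
To make two of those attack classes concrete, here is a minimal Python sketch. It is not AntiSamy's code and not how Tidy works; the function names and payloads are hypothetical and only illustrate why a filter that matches on raw text can be fooled by formatting.

```python
import html
import re

# Hypothetical naive filter: delete anything that looks like a script tag.
SCRIPT_TAG = re.compile(r"</?script[^>]*>", re.IGNORECASE)

def strip_once(markup: str) -> str:
    """Single pass over the raw text (the vulnerable approach)."""
    return SCRIPT_TAG.sub("", markup)

def strip_until_stable(markup: str) -> str:
    """Repeat the pass until the output stops changing."""
    previous = None
    while previous != markup:
        previous = markup
        markup = SCRIPT_TAG.sub("", markup)
    return markup

def looks_like_js_url(value: str) -> bool:
    """Hypothetical naive check on the raw attribute value."""
    return "javascript:" in value.lower()

# Recursion: removing the inner tag reassembles an outer one.
nested = "<scr<script>ipt>alert(1)</scr</script>ipt>"
print(strip_once(nested))          # the script tag is back
print(strip_until_stable(nested))  # nothing left to reassemble

# Canonicalization: the same URL scheme written with a character reference.
encoded = "java&#115;cript:alert(1)"
print(looks_like_js_url(encoded))                 # raw text slips past
print(looks_like_js_url(html.unescape(encoded)))  # caught once decoded
```

Even the looped version is still a text-level patch for a single pattern. Bringing the input into one well-formed, canonical form first means validation runs against what will actually be emitted, not against the attacker's formatting.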

I wish it weren't necessary, but HTML is a horrible mashup of data and code, and when we add forgiving browsers into the mix, it becomes quite ugly.
If you look at Samy's attacks you'll see what I'm talking about.

>>     Why does anti-samy first "clean up" any "broken" HTML before
>>     parsing/validating it? Is there a true technical need or does it
>>     simply make for easier parsing?
>>     With user generated content, I want the user to be able to see
>>     the same exact (often malformed) HTML they used in entry when
>>     they go back to edit.
>>     Can anti-samy or a similar implementation of anti-samy do that
>>     while still effectively protecting against XSS threat?
>>     Thanks a lot,
>>     --
>>     Sam Daoud