[owasp-antisamy] French character issue in Antisamy

Jason Li jason.li at owasp.org
Thu Aug 11 10:41:13 EDT 2011


I believe this encoding is being done by the NekoHTML parser - though
someone on the AntiSamy mailing list can correct me if I'm wrong. There may
be a way to override this behavior but off the top of my head I'm not sure.

AntiSamy is meant to be an HTML validation/sanitizing engine and &eacute; is
the properly encoded HTML version of that particular character. Changing
this encoding behavior can probably be done - but I believe there have been
known XSS attacks in the past that have depended on the fact that some
international letters are interpreted differently depending on locale and
region. As a result, I believe it's safer to rely on the HTML entity encoded
version if possible.

Obviously if you're not placing the data directly into an HTML context, that
conversion might have side effects...
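
If the cleaned output really is headed for a non-HTML context (a plain-text
email, a log line, a database column that is HTML-escaped again on display),
one possible workaround is to turn a small, known set of accented-letter
entities back into characters after AntiSamy has run. The sketch below is only
an illustration under that assumption: the class name, method name, and entity
whitelist are hypothetical and are not part of AntiSamy or NekoHTML. It uses
only the Java standard library (Java 9+) and deliberately leaves &lt;, &gt;,
&amp;, and every other unlisted entity untouched, so markup-significant
characters stay encoded.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of AntiSamy: restores a few accented letters
// from their named HTML entities for use in NON-HTML contexts only.
public class EntityDecoder {

    // Assumed whitelist of lowercase French accented letters (extend as needed).
    // Characters are written as Unicode escapes to avoid source-encoding issues.
    private static final Map<String, String> ACCENTED = Map.of(
        "eacute", "\u00e9", "egrave", "\u00e8", "ecirc", "\u00ea", "euml", "\u00eb",
        "agrave", "\u00e0", "acirc", "\u00e2", "ccedil", "\u00e7", "ocirc", "\u00f4",
        "ugrave", "\u00f9", "icirc", "\u00ee");

    // Matches named entities such as &eacute; (numeric entities are ignored).
    private static final Pattern ENTITY = Pattern.compile("&([A-Za-z]+);");

    public static String decodeAccentedEntities(String cleaned) {
        Matcher m = ENTITY.matcher(cleaned);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            String decoded = ACCENTED.get(m.group(1));
            // Entities outside the whitelist (&lt;, &amp;, ...) are copied as-is.
            String replacement = (decoded != null) ? decoded : m.group();
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // Input is the already-sanitized string returned by AntiSamy.
        System.out.println(
            decodeAccentedEntities("Pour acc&eacute;der au journal de test"));
    }
}
```

Anything decoded this way should not be written back into an HTML page
without being encoded again, for the reasons given above.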


On Thu, Aug 11, 2011 at 7:12 AM, Jobus <jobuss at gmail.com> wrote:

> Hi Jason,
> I am facing an issue related to AntiSamy. In my application, users can enter
> input containing French characters, but AntiSamy encodes them and does not
> return the input string unchanged.
> For example, my input string is
> Pour accéder au journal de test
> and the output from getCleanHTML is
> Pour acc&eacute;der au journal de test
> How can I solve this issue? I need to get back exactly the same input string
> I provided, since mine is a multilingual application.
> I would really appreciate it if you could help me with this issue.
> Thanks
> Jobu