[owasp-antisamy] Is it possible to use AntiSamy and keep code in pre/code tags intact?

Mohamad El-Husseini husseini.mel at gmail.com
Tue Sep 20 21:28:19 EDT 2011

Hi August,

Thank you for the response. Let me clarify.

My users will post markdown. I convert this markdown to HTML, then run it
through AntiSamy. I then save the result in the database.

This works well. However, I also need a raw version of the markdown. This
allows my users to see their original post exactly as it was if they decide
to edit a post. I want to keep them away from any HTML. When editing a post,
the textarea will show the raw markdown version.

This presents a huge problem. I can't save the raw version without
sanitizing it. But if I run AntiSamy on it, AntiSamy will remove everything
that the markdown converter would have converted to safe HTML, such as code
and pre tags. Using it would result in massive inconsistencies between the
HTML and raw versions in my database.

I can't run AntiSamy before converting to HTML because it will strip tags
that the markdown converter would have made safe. I hope this makes sense.

Please look at this illustration of what I mean: http://imgur.com/czN1t
Notice that the original text run through AntiSamy is completely stripped,
while the HTML run through AntiSamy is fine.
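
To make the two orderings concrete, here is a rough sketch in Java. Note
that convertMarkdown and sanitize are toy stand-ins I made up for
illustration. They are not the real markdown converter or AntiSamy's API,
and they only mimic the behaviour I am seeing, under the assumption that
the converter escapes the contents of code spans.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PipelineOrderSketch {

    // Hypothetical stand-in for a markdown converter: it only handles
    // `inline code`, escaping the contents and wrapping them in <code>.
    static String convertMarkdown(String markdown) {
        Matcher m = Pattern.compile("`([^`]*)`").matcher(markdown);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String escaped = m.group(1)
                    .replace("&", "&amp;")
                    .replace("<", "&lt;")
                    .replace(">", "&gt;");
            m.appendReplacement(out,
                    Matcher.quoteReplacement("<code>" + escaped + "</code>"));
        }
        m.appendTail(out);
        return out.toString();
    }

    // Hypothetical stand-in for a sanitizer (NOT AntiSamy's real API or
    // policy): drops <script> elements, then every tag except code/pre.
    static String sanitize(String html) {
        String noScripts =
                html.replaceAll("(?is)<script\\b.*?</script\\s*>", "");
        return noScripts.replaceAll("(?i)<(?!/?(?:code|pre)>)[^>]*>", "");
    }

    public static void main(String[] args) {
        String raw = "Try `<script>alert(1)</script>` in your page.";

        // Sanitize the raw text first: the code sample is lost.
        // prints: Try <code></code> in your page.
        System.out.println(convertMarkdown(sanitize(raw)));

        // Convert first, then sanitize: the escaped sample survives.
        // prints: Try <code>&lt;script&gt;alert(1)&lt;/script&gt;</code> in your page.
        System.out.println(sanitize(convertMarkdown(raw)));
    }
}
```

The first line of output is the "Original Text to AntiSamy" case and the
second is the "HTML to AntiSamy" case from the illustration.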

On Tue, Sep 20, 2011 at 6:12 PM, augustd <augustd at codemagi.com> wrote:

> You definitely do not want to allow someone to insert <script> tags! Even
> if they are inside of <pre> tags they can still be executed by browsers.
> If you want to allow people to post code samples on your site, what you
> really need is to output encode those script tags. This will change them
> from raw HTML tags into HTML entities that will display as code samples, but
> not execute.
> Take a look at the ESAPI project for this functionality. You want something
> like this:
> // Perform output encoding for the HTML context
> String safeOutput = ESAPI.encoder().encodeForHTML(input);
> Regards,
> August
> On Tue, Sep 20, 2011 at 2:18 PM, Mohamad El-Husseini <
> husseini.mel at gmail.com> wrote:
>> Hi everyone!
>> I want to use AntiSamy to allow users to post code snippets and other
>> things. Is it possible to customize AntiSamy to allow script tags that are
>> nested in code/pre tags?
>> I want to use it in a similar capacity to Stack Overflow: they allow most
>> basic HTML, including any tags found inside pre/code tags.
>> AntiSamy strips such tags regardless. Is AntiSamy the right tool for what
>> I'm trying to do? Any advice would be appreciated.
>> Thank you.
