[Owasp-antisamy] escaped tags goes thru without getting removed

Eric Kreiser ekreiser at mzinga.com
Mon Apr 13 13:28:54 EDT 2009

The other problem I have seen with AntiSamy is that if the value you 
send to AntiSamy is already escaped, but you use 
getCleanXMLDocumentFragment() to get your scrubbed value, it reverses 
all the escaping, leaving you with a value that would otherwise have 
violated the policy file.
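
For illustration, the decoding effect can be reproduced with the JDK's own XML parser, without AntiSamy at all. This is only a minimal sketch: the class name XmlDecodingDemo and the helper textOf are made up for this example, and it assumes that text read back out of the fragment AntiSamy returns behaves like text read out of any other DOM.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class, not part of AntiSamy.
final class XmlDecodingDemo {

    // Parses a small XML string and returns the text content of its root element.
    static String textOf(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            // The parser has already replaced &lt; and &gt; with < and > at this point.
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<description>&lt;script&gt;alert(1)&lt;/script&gt;</description>";
        System.out.println(textOf(xml)); // prints <script>alert(1)</script>
    }
}
```

If the scrubbed value is pulled out of a DOM like this, it would need to be encoded again before being written into an HTML page.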

Jason Li wrote:
> Girish,
> By default, script tags should be removed by AntiSamy.
> I think the problem may lie in your statement, "even if they are escaped."
> If you pass in:
> <script>alert('Channel Title Description Vulnerability - Type 2')</script>
> to AntiSamy, you should get nothing back.
> However, your statement leads me to believe that in fact you're passing in:
> &lt;script&gt;alert('Channel Title Description Vulnerability - Type
> 2')&lt;/script&gt;
> The above is "safe" from AntiSamy's perspective because it assumes
> that the content is directly rendered in an HTML interpreter.
> My guess, from the behavior you describe and the examples you give, is
> that you have encoded HTML embedded in XML - so something that looks
> like this (here the tainted input is contained in an XML element, item
> description, and is therefore encoded):
> <rss version="2.0">
>   <channel>
>     <title>Example</title>
>     <link>http://example.com</link>
>     <description>Example</description>
>     <item>
>       <title>Example</title>
>       <link>http://example.com</link>
>       <description>This is the text that you're trying to validate
> &lt;script&gt;alert('Channel Title Description Vulnerability - Type
> 2')&lt;/script&gt;</description>
>     </item>
>   </channel>
> </rss>
> AntiSamy can't know the context where your content is coming from -
> it's expecting HTML content that goes to an HTML interpreter. If the
> content you provide is encoded HTML that goes to an interpreter
> that decodes the HTML, AntiSamy won't be able to properly validate it.
> You'd have to provide an HTML-decoded version for AntiSamy to handle
> properly.
> Am I interpreting your use case correctly? And if so, does that
> explanation make sense?
> --
> -Jason Li-
> -jason.li at owasp.org-
> On Fri, Apr 10, 2009 at 6:52 PM, Girish <ivgirish at yahoo.com> wrote:
>> I am using version 1.3 and I have tried all 4 policy files. They all
>> give the same result.
>> For example, if my HTML is this (passing line by line to AntiSamy):
>>      <script>alert('Channel Title Description Vulnerability -
>> Type 2')</script>
>>      <script>alert('Channel Link Vulnerability - Type
>> 2')</script>
>>      javascript:alert('Channel Image URL Vulnerability - Type 1');
>> the output I am getting is:
>>      &lt;script&gt;alert('Channel Title Description
>> Vulnerability - Type 2')&lt;/script&gt;
>>      &lt;script&gt;alert('Channel Link Vulnerability - Type
>> 2')&lt;/script&gt;
>>      javascript:alert('Channel Image URL Vulnerability - Type 1');
>> Any idea how to remove tags like
>> script/javascript/embed/frame/etc. even if they are escaped?
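
Jason's suggestion (decode first, then validate) could be sketched roughly as below. All names here are hypothetical: decodeBasicEntities handles only the five predefined XML entities, and containsScriptTag is a crude stand-in for a scanner, not how AntiSamy works internally.

```java
import java.util.regex.Pattern;

// Hypothetical helper class, not part of AntiSamy.
final class DecodeBeforeScanDemo {

    // Decodes only &lt; &gt; &quot; &apos; &amp;. A real decoder covers many more entities.
    static String decodeBasicEntities(String s) {
        return s.replace("&lt;", "<")
                .replace("&gt;", ">")
                .replace("&quot;", "\"")
                .replace("&apos;", "'")
                .replace("&amp;", "&"); // last, so "&amp;lt;" becomes "&lt;" and not "<"
    }

    // Encodes text again so it can be placed inside XML or HTML text content.
    static String encodeForText(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    private static final Pattern SCRIPT_TAG =
            Pattern.compile("<\\s*/?\\s*script\\b", Pattern.CASE_INSENSITIVE);

    // Stand-in for "would a tag-based scanner notice a script tag here?"
    static boolean containsScriptTag(String s) {
        return SCRIPT_TAG.matcher(s).find();
    }

    public static void main(String[] args) {
        String escaped = "&lt;script&gt;alert(1)&lt;/script&gt;";

        // Scanning the escaped form finds nothing, because there is no literal '<'.
        System.out.println(containsScriptTag(escaped)); // prints false

        // Decoding first exposes the tag; this decoded string is what a scanner should see.
        String decoded = decodeBasicEntities(escaped);
        System.out.println(containsScriptTag(decoded)); // prints true

        // Encoding the decoded text again gives back the original escaped form.
        System.out.println(encodeForText(decoded).equals(escaped)); // prints true
    }
}
```

In real use the decoded string would be what gets handed to AntiSamy, and the cleaned result would be encoded again before going back into the XML.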
