[Owasp-antisamy] escaped tags goes thru without getting removed

Jason Li jason.li at owasp.org
Mon Apr 13 13:36:42 EDT 2009


That's definitely an issue if encoded HTML gets decoded by the DOM parser...

That's something we need to look into and fix.
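
For anyone following along, the decoding step can be reproduced with the
JDK's own DOM parser. This is a minimal sketch, not AntiSamy's code: the
class name EncodedHtmlDemo and the helper textOf are hypothetical, and it
only assumes that a standard XML parser resolves entity references when
it builds text nodes.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class; it does not use or depend on AntiSamy.
class EncodedHtmlDemo {

    // Parses an XML snippet and returns the text content of its root element.
    // The parser resolves entity references, so "&lt;" comes back as '<'.
    static String textOf(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Encoded markup, as it would appear inside an RSS <description>.
        String xml = "<description>&lt;script&gt;alert('x')&lt;/script&gt;</description>";
        // The decoded text now contains a literal <script> element.
        System.out.println(textOf(xml));
    }
}
```

The printed value is the raw <script>alert('x')</script> string, so any
validation has to run on (or after) this decoded form rather than on the
encoded one.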

Thanks for pointing that out, Eric!
--
-Jason Li-
-jason.li at owasp.org-



On Mon, Apr 13, 2009 at 1:28 PM, Eric Kreiser <ekreiser at mzinga.com> wrote:
> The other problem I have seen with AntiSamy is that if the value you
> send to AntiSamy is escaped, but you use
> getCleanXMLDocumentFragment() to get your scrubbed value, it reverses
> all the escaping, leaving you with a value that would otherwise have
> violated the policy file.
>
>
>
> Jason Li wrote:
>> Girish,
>>
>> By default, script tags should be removed by AntiSamy.
>>
>> I think the problem may lie in your statement, "even if they are escaped."
>>
>> If you pass in:
>> <script>alert('Channel Title Description Vulnerability - Type 2')</script>
>>
>> to AntiSamy, you should get nothing back.
>>
>> However, your statement leads me to believe that in fact you're passing in:
>> &lt;script&gt;alert('Channel Title Description Vulnerability - Type
>> 2')&lt;/script&gt;
>>
>> The above is "safe" from AntiSamy's perspective because it assumes
>> that the content is directly rendered in an HTML interpreter.
>>
>> My guess, from the behavior you describe and the examples you give,
>> is that you have encoded HTML embedded in XML - so something that looks
>> like this (here the tainted input is contained in an XML element, the
>> item description, and is therefore encoded):
>> <rss version="2.0">
>>   <channel>
>>     <title>Example</title>
>>     <link>http://example.com</link>
>>     <description>Example</description>
>>     <item>
>>       <title>Example</title>
>>       <link>http://example.com</link>
>>       <description>This is the text that you're trying to validate
>> &lt;script&gt;alert('Channel Title Description Vulnerability - Type
>> 2')&lt;/script&gt;</description>
>>     </item>
>>   </channel>
>> </rss>
>>
>> AntiSamy can't know the context your content is coming from -
>> it's expecting HTML content that goes to an HTML interpreter. If the
>> content you provide is encoded HTML that goes to an interpreter
>> that decodes the HTML, AntiSamy won't be able to validate it properly.
>> You'd have to provide an HTML-decoded version for AntiSamy to handle
>> it properly.
>>
>> Am I interpreting your use case correctly? And if so, does that
>> explanation make sense?
>> --
>> -Jason Li-
>> -jason.li at owasp.org-
>>
>>
>>
>> On Fri, Apr 10, 2009 at 6:52 PM, Girish <ivgirish at yahoo.com> wrote:
>>
>>> I am using version 1.3 and I have tried all 4 policy files. They all
>>> give the same result.
>>>
>>> For example, if my HTML is this (passing line by line to AntiSamy):
>>>
>>>      <script>alert('Channel Title Description Vulnerability -
>>> Type 2')</script>
>>>      <script>alert('Channel Link Vulnerability - Type
>>> 2')</script>
>>>      javascript:alert('Channel Image URL Vulnerability - Type 1');
>>>
>>> the output I am getting is:
>>>
>>>      &lt;script&gt;alert('Channel Title Description
>>> Vulnerability - Type 2')&lt;/script&gt;
>>>      &lt;script&gt;alert('Channel Link Vulnerability - Type
>>> 2')&lt;/script&gt;
>>>      javascript:alert('Channel Image URL Vulnerability - Type 1');
>>>
>>> Any idea how to remove tags like
>>> script/javascript/embed/frame/etc. even if they are escaped?