[Owasp-antisamy] policy file questions

Arshan Dabirsiaghi arshan.dabirsiaghi at aspectsecurity.com
Fri Mar 27 18:30:41 EDT 2009

First of all, I've moved your question to the AntiSamy mailing list, where everyone can see your message and possibly offer advice. In the future, this is where general questions should go.
Second, thanks for using AntiSamy! I'm psyched every time I see another download. =]
The best documentation I can recommend for understanding AntiSamy is the paper:
http://owaspantisamy.googlecode.com/files/Arshan%20Dabirsiaghi%20-%20Towards%20Malicious%20Code%20Detection%20and%20Removal.PDF
To answer your specific questions:
1. The "tags-to-encode" section of the policy file lists the tags that you'd rather encode than filter out. Some users asked for this feature so that people who type text like "hey joe how's your mom? <g>" wouldn't have the "grin" tag/emoticon thing filtered out.
So, the only entries I left for the default policy file are <g> and <grin> since apparently those are pretty common.
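For reference, that section looks roughly like the sketch below. It assumes the layout of the stock antisamy.xml policy file (one <tag> child per tag name), so check it against the policy file shipped with your version:

```xml
<!-- Tags listed here are HTML-encoded instead of being stripped, -->
<!-- so "<g>" reaches the page as the literal text "&lt;g&gt;". -->
<tags-to-encode>
    <tag>g</tag>
    <tag>grin</tag>
</tags-to-encode>
```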
2. Let me specify the difference here between action="remove", action="filter" (which is the default) and action="validate". Before I do that though, you should understand my terminology. Consider the following snippet:
        <script>alert(document.domain)</script>
In the DOM, that is two separate elements. The parent element is <script>, and the child element is the text node "alert(document.domain)".
With that in mind, let's talk about the actions. The "remove" action removes both DOM elements in question: the parent tag and its text children. The "filter" action removes the parent tag, but promotes the text content. If you set the action on "script" to "filter", the same text, after being run through the validator, would be this:
        alert(document.domain)
Since JavaScript text content won't make sense as text to be shown to users, we set the action for "script" to be "remove". However, the same isn't necessarily true for the "b" tag, for example. Imagine you didn't want bold tags for some reason. You probably still want to keep the text that the user put between the tags, so you set action="filter" and AntiSamy will rip out the tags themselves, but keep the inner text.
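To make that concrete, a <tag-rules> section using both actions might look like the sketch below. The "script" and "noscript" entries are the ones from the original question; the "b" entry is only an illustration of "filter" and is not necessarily what the default policy file does with bold tags.

```xml
<tag-rules>
    <!-- remove: drop the tag and everything inside it -->
    <tag name="script" action="remove"/>
    <tag name="noscript" action="remove"/>

    <!-- filter: drop the tag but keep the text between the tags (illustrative) -->
    <tag name="b" action="filter"/>
</tag-rules>
```

With rules like these, input such as "<b>hi</b><script>alert(1)</script>" should come out as just "hi".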
3. You sent your third question in a separate email, but I'm going to put it here in order to reduce traffic:
> Where can I find the necessary libs to build AntiSamy?  I'm not seeing classes for the org.w3c.css.sac
> package.

Java projects normally don't bundle their required libraries, so that you can get those libraries directly from the official distributor and avoid "jar hell."
However, if you're into that kind of thing, we have packaged the libraries together for development purposes in the past. You can download that zip here:


From: mailman-bounces at lists.owasp.org on behalf of Frank Pedroza
Sent: Fri 3/27/2009 4:46 PM
To: owasp-antisamy-owner at lists.owasp.org
Subject: policy file questions

I'm just getting started with AntiSamy and am very excited about it.  In trying to learn about it though, I'm finding the documentation a little lacking (which is to be expected).  My specific questions have to do with the policy files.

1) What is the purpose of the <tags-to-encode> tag?

2) I don't want to allow the <script> tag at all.  Do I need to include the following in the <tag-rules>?  I'd prefer that validation just fail if someone tries to use these tags.

        <tag name="script" action="remove"/> 
        <tag name="noscript" action="remove"/>
