For my project, I have modified the code to meet some of our requirements.  I'd be interested in contributing any generally useful changes back to the project, but I'm not sure about some of the changes I'd like to see, so I have some questions about the code base.

 *   What is the utility of having both a SAX and a DOM version of the scanner?  Right now, I'm basically calling into the DOM scanner directly, as I have my own DOM edits I need to do.
 *   Is there really that much demand for Java 1.4 compatibility?  I'd prefer a Java 5+ code base.  There are also some code cleanups that would be worthwhile (using spaces instead of hard tabs is one example).

Note that none of this affects the policy file or its processing, which I think is great as it is.  I changed the API (slightly) to make the Anti-Samy processing more extensible without touching the policy file processing.  Adding everyone's ad-hoc requirements as policy file options would quickly make Anti-Samy overcomplicated and difficult to use (a barrier to acceptance).

There are some other changes that would make it easier to embed:

 *   Following IoC (inversion of control) configuration design patterns
 *   Using DOM3 APIs instead of directly referencing Xerces classes.  (Not sure how to do HTML instead of XHTML output with that, though.)
 *   Allowing the caller to set the ResourceBundle.  It would also help if the i18n keys had a better namespace, like starting with "antisamy."  The ResourceBundle could use the provided properties file, or it could use something else (we have our own i18n system that we'd prefer).
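
To make the ResourceBundle point concrete, here is a rough sketch of the kind of thing I have in mind.  It is only an illustration: ScanMessages, InHouseBundle, and the key "error.tag.notfound" are hypothetical names, not existing Anti-Samy classes or keys.  The caller hands in whatever ResourceBundle it likes (constructor injection, in IoC terms), and every lookup is prefixed with "antisamy." so the keys cannot collide with the rest of an application's messages.

```java
import java.util.ListResourceBundle;
import java.util.MissingResourceException;
import java.util.ResourceBundle;

public class MessageSketch {

    /** Hypothetical message holder; NOT an existing Anti-Samy class. */
    static final class ScanMessages {
        // Namespace prepended to every key, as suggested above.
        private static final String PREFIX = "antisamy.";
        private final ResourceBundle bundle;

        // The caller supplies the bundle instead of the library loading
        // a fixed properties file itself.
        ScanMessages(ResourceBundle bundle) {
            if (bundle == null) {
                throw new IllegalArgumentException("bundle must not be null");
            }
            this.bundle = bundle;
        }

        // Looks up PREFIX + key; falls back to the full key if it is missing.
        String get(String key) {
            String fullKey = PREFIX + key;
            try {
                return bundle.getString(fullKey);
            } catch (MissingResourceException e) {
                return fullKey;
            }
        }
    }

    /** Stand-in for a caller's own i18n system (assumed for illustration). */
    static final class InHouseBundle extends ListResourceBundle {
        @Override
        protected Object[][] getContents() {
            return new Object[][] {
                { "antisamy.error.tag.notfound", "Tag removed by policy" }
            };
        }
    }

    public static void main(String[] args) {
        ScanMessages messages = new ScanMessages(new InHouseBundle());
        System.out.println(messages.get("error.tag.notfound"));
    }
}
```

A default could still load the provided properties file with ResourceBundle.getBundle(...) and pass it to the same constructor, so existing users would not need to change anything.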

Is anyone interested in these updates/structural changes?  If not, I can fork the code (I'd rather not, for obvious reasons).

