[Owasp-antisamy] dotNet - Some suggestions

Steve Sheldon steve at sodablue.org
Sun Mar 15 14:55:33 EDT 2009

I've been digging through the AntiSamy code, and have a couple of

1. Make Action on Tag and OnInvalid on Attribute into enums.  Rather then
doing string matches, I think it'd be better to have Enums... this would
make it more clear what are acceptable values.

2. I would break up recursiveValidateTag into sub methods. for filter,
validate and truncate actions.  Similarly validate should call sub methods
for attribute validation.  The method is just too long and this would
improve readibility.  Try to keep methods under 25 lines so they fit on the
screen.  In combination with the enum in #1, it'd mean the use of a switch()
instead of multiple if/then statements.

3. While I really like the use of Dependency Injection on the scanner with
the Policy object, I'd suggest taking this one more step and use interfaces:

  public interface IPolicy
>   {
>     string getDirective(string name);
>     IAttribute getGlobalAttributeByName(string name);
>     IProperty getPropertyByName(string propertyName);
>     string getRegularExpression(string name);
>     ITag getTagByName(string tagName);
>   }

  public interface IAttribute : ICloneable
>   {
>     IList AllowedValues { get; }
>     IList AllowedRegExp { get; }
>     string Description { get; set; }
>     string Name { get; set; }
>     string OnInvalid { get; set; }
>   }

  public interface IProperty
>   {
>     string Description { get; set; }
>     string Name { get; set; }
>     string OnInvalid { get; set; }
>   }

  public interface ITag
>   {
>     string Action { get; set; }
>     Attribute getAttributeByName(string name);
>     string Name { get; set; }
>   }

Using the interfaces would make it easier to extend the implementation of
Policy with a custom implementation.

Why? I'm just not a big fan of XML configuration.  So I am trying to
implement a Fluent interface that looks something like this when used:

    public static IPolicy GetFluentPolicy()
>     {
>         const string c_reghtmlTitle =
> @"^([a-zA-Z0-9\s-_',:\[\]!\./\\\(\)]*)$";
>         const string c_regonsiteURL = @"^^((\s)*((ht|f)tp(s?)://|mailto:
> )[A-Za-z0-9]+[~a-zA-Z0-9-_\[email protected]#$%&;:,\?=/\+!]*(\s)*)$";
>         const string c_regoffsiteURL = @"^((\s)*((ht|f)tp(s?)://|mailto:
> )[A-Za-z0-9]+[~a-zA-Z0-9-_\[email protected]#$%&;:,\?=/\+!]*(\s)*)$";
>         FluentPolicy policy = new FluentPolicy()
>           .AddDirective("maxInputSize", "500000")
>           .AddCommonAttributes(new FluentAttribute("lang")
>                                 .addAllowedRegExp(@"[a-zA-Z]{2,20}"))
>           .AddCommonAttributes(new FluentAttribute("title")
>                                 .addAllowedRegExp(c_reghtmlTitle))
>           .AddCommonAttributes(new FluentAttribute("href")
>                                 .addAllowedRegExp(c_regonsiteURL)
>                                 .addAllowedRegExp(c_regoffsiteURL))
>           .AddCommonAttributes(new FluentAttribute("align")
>                                 .addAllowedValue("center")
>                                 .addAllowedValue("left")
>                                 .addAllowedValue("right")
>                                 .addAllowedValue("justify")
>                                 .addAllowedValue("char"))
>           .AddGlobalAttribute("title")
>           .AddGlobalAttribute("lang");
>       return policy;
>     }

While I'd probably keep an instance of IPolicy in my inversion of control
container, I also suspect(but haven't verified) that instantiation of a
policy using this interface would be faster than from an xml configuration

4. Also, I'd build AntiSamy as a class assembly, and place the unit tests
and the console application logic in their own projects.  I just see no
reason for them to be in the core implementation.

I'll probably have some more thoughts as I dig in further.  It seems to me
some of the logic in terms of global and tag specific interfaces could be
better represented through the Policy class instead of having to be in the
domscanner, as this is kind of logic specific to the policy... the notion
that a tag attribute is implemented by tag specific attribute, or global
attributes if they exist.

Anyway, just some thoughts.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: https://lists.owasp.org/pipermail/owasp-antisamy/attachments/20090315/e7222a58/attachment.html 

More information about the Owasp-antisamy mailing list