[Esapi-user] [Esapi-dev] Canonicalization and ESAPI's Input Validation Layer

Jeff Williams jeff.williams at aspectsecurity.com
Sat Aug 30 22:30:48 UTC 2014


Hi Jerry,

Imagine a JSON value containing an escaped quote. Canonicalizing would unescape it and break the syntax. We don't have a JSON codec, so this is only a problem because JSON escaping is very similar to CSS escaping syntax.
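
A rough illustration of that failure, as a sketch only: the decoder below is a hypothetical stand-in for a backslash-style codec, not ESAPI's actual CSS codec, and the class and method names are made up for this example.

```java
final class JsonUnescapeDemo {
    /** Hypothetical backslash decoder: drops each backslash and keeps the character after it. */
    static String stripBackslashEscapes(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '\\' && i + 1 < s.length()) {
                out.append(s.charAt(++i)); // keep the escaped character, lose the backslash
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    /** Counts double quotes that are not backslash-escaped, i.e. the ones that delimit JSON strings. */
    static int countStringDelimiters(String json) {
        int count = 0;
        for (int i = 0; i < json.length(); i++) {
            char c = json.charAt(i);
            if (c == '\\') {
                i++; // skip the escaped character
            } else if (c == '"') {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String json = "{\"msg\":\"say \\\"hi\\\"\"}"; // a JSON object whose value contains escaped quotes
        String decoded = stripBackslashEscapes(json);
        System.out.println(json + " has " + countStringDelimiters(json) + " string delimiters");
        System.out.println(decoded + " has " + countStringDelimiters(decoded) + " string delimiters");
    }
}
```

After decoding, the quotes that used to be data have become string delimiters, so the value no longer parses as the same JSON document.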

That's why you shouldn't attempt to canonicalize or validate complex data without a specialized library like AntiSamy, which is wrapped by the getValidSafeHTML() method in ESAPI. We don't yet have a method to get valid safe JSON.

--Jeff


On Aug 30, 2014, at 12:57 PM, "Jerry Hoff" <jerryhoff at gmail.com> wrote:

Interesting discussion, gentlemen!

Jim - could you provide an example where canonicalization breaks HTML and JSON input? I want to make sure I am following correctly.

Thank you,
Jerry

On Aug 30, 2014, at 19:39, Jim Manico <jim.manico at owasp.org> wrote:

So what is the line drawn between an attack and acceptably encoded data in ESAPI?

Canonicalization also "breaks" some input types (HTML, JSON, etc.), so canonicalization of *all* input is equally idiotic (which is why all of ESAPI's input validation functions can disable canonicalization).

--
Jim Manico
@Manicode
(808) 652-3805

On Aug 30, 2014, at 5:07 AM, Jeff Williams <jeff.williams at aspectsecurity.com> wrote:

There are many situations in which apps get encoded data that does not represent an attack. However, any kind of multiply encoded data is clearly an attack. So the line we drew in ESAPI is, I think, the right one: we only throw an intrusion exception when there is a clear attack. Otherwise we clean up and try to make the data work.
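
To make that line concrete, here is a minimal sketch, assuming percent-encoding is the only scheme in play. It is not ESAPI's implementation: the class, the method names, and the use of IllegalArgumentException in place of an intrusion exception are all assumptions for illustration, and each decoded byte is treated as a single Latin-1 character for simplicity.

```java
final class MultiEncodingCheck {
    /** One pass of percent-decoding; malformed sequences such as "% c" are left untouched. */
    static String percentDecodeOnce(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '%' && i + 2 < s.length()) {
                int hi = Character.digit(s.charAt(i + 1), 16);
                int lo = Character.digit(s.charAt(i + 2), 16);
                if (hi >= 0 && lo >= 0) {
                    out.append((char) (hi * 16 + lo)); // simplification: one byte = one char
                    i += 2;
                    continue;
                }
            }
            out.append(c);
        }
        return out.toString();
    }

    /**
     * Decodes until the value stops changing. Zero or one effective pass is
     * "cleaned up"; two or more passes is treated as a clear attack.
     */
    static String canonicalize(String input) {
        String current = input;
        int passes = 0;
        while (true) {
            String decoded = percentDecodeOnce(current);
            if (decoded.equals(current)) {
                break;
            }
            current = decoded;
            passes++;
        }
        if (passes > 1) {
            // Stand-in for an intrusion exception (assumption, not ESAPI's class).
            throw new IllegalArgumentException("multiple encoding detected (" + passes + " passes)");
        }
        return current;
    }
}
```

Under these assumptions, "80% complete" and "80%25 complete" both come out as "80% complete", while a doubly encoded value such as "%253Cscript%253E" is rejected outright.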

Turning off canonicalization for the vast majority of inputs would be idiotic. You simply can't reliably validate encoded data. If you follow this plan, the only CVE you will need to worry about is "Canonicalize, Validate, Escape".

--Jeff


On Aug 30, 2014, at 3:22 AM, "Fabio Cerullo" <fcerullo at gmail.com> wrote:

Hi

What about someone who "needs" encoded data to be accepted by the webapp?

For example, a web form with a field that is used for completion percentage of something. So in that field you could input "80% complete".

That would be a legitimate case for canonicalising first and then validating, because that value needs to be stored as it is.

I believe the approach taken by ESAPI is: treat everything as potentially malicious, canonicalise just in case, and then validate all entries.

Otherwise, how would you differentiate between an attack and a valid entry in the example above?

Fabio

On Friday, August 29, 2014, Jim Manico <jim.manico at owasp.org> wrote:
Or even better: if encoded data is detected, I prefer to reject it right then and there, even if that means things like %80 get rejected. I'm pretty sure we canonicalize before we validate, though. What code are you looking at?

Aloha,
--
Jim Manico
@Manicode
(808) 652-3805

On Aug 29, 2014, at 6:26 AM, Matt Seil <mseil at acm.org> wrote:

In the context of your question, Jim, I think it makes more sense to canonicalize BEFORE you validate, not the other way around. I also noticed that all the "getValidInput" methods call Encoder.canonicalize() after validation, and I meant to ask why we would want that approach.
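
A small sketch of why the order matters. Everything here is hypothetical (the whitelist, the class, the method names); it uses the JDK's URLDecoder rather than ESAPI's Encoder and says nothing about what the ESAPI code actually does.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.regex.Pattern;

final class ValidationOrder {
    // Hypothetical whitelist for a "percent complete" style field:
    // letters, digits, spaces and '%' only.
    private static final Pattern SAFE = Pattern.compile("[A-Za-z0-9 %]*");

    static String decode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    static boolean isSafe(String s) {
        return SAFE.matcher(s).matches();
    }

    /** Validate first, decode afterwards: the check only ever sees the encoded form. */
    static String validateThenDecode(String input) {
        if (!isSafe(input)) {
            throw new IllegalArgumentException("rejected: " + input);
        }
        return decode(input);
    }

    /** Decode first, then validate the value that will actually be used. */
    static String decodeThenValidate(String input) {
        String canonical = decode(input);
        if (!isSafe(canonical)) {
            throw new IllegalArgumentException("rejected: " + canonical);
        }
        return canonical;
    }
}
```

With the input "%3Cscript%3E", validateThenDecode() passes the check and then hands back "<script>", while decodeThenValidate() rejects it.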





On Wed, Aug 27, 2014 at 6:51 PM, Jim Manico <jim.manico at owasp.org> wrote:
ESAPI community,

I am very concerned about the design of the ESAPI validation layer and how it handles canonicalization for APIs like:

isValidInput(java.lang.String context, java.lang.String input, java.lang.String type, int maxLength, boolean allowNull)
https://owasp-esapi-java.googlecode.com/svn/trunk_doc/latest/org/owasp/esapi/reference/DefaultValidator.html#isValidInput%28java.lang.String,%20java.lang.String,%20java.lang.String,%20int,%20boolean%29

Right now, before validation, ESAPI will try to decode user input to its normalized form and THEN it will try to validate the input: http://owasp-esapi-java.googlecode.com/svn/trunk/src/main/java/org/owasp/esapi/reference/validation/StringValidationRule.java

This seems rather dangerous in that encoded attacks will be "fixed" and will not raise an alert about potentially malicious input. If input is detected to be encoded, then I would suggest that we throw an IntrusionDetectionException and stop processing. Why do we "clean up" encoded data and then validate in these APIs?
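
As a sketch of the behavior suggested here (reject as soon as input looks encoded), assuming only percent-encoding and HTML entities need to be recognized. The pattern, class, and method names are hypothetical, and IllegalArgumentException stands in for whatever exception ESAPI would throw.

```java
import java.util.regex.Pattern;

final class EncodedInputGate {
    // Hypothetical detector: a percent-encoded byte, a numeric HTML entity,
    // or a named HTML entity anywhere in the input.
    private static final Pattern ENCODED =
            Pattern.compile("%[0-9A-Fa-f]{2}|&#x?[0-9A-Fa-f]+;|&[A-Za-z]+;");

    static boolean looksEncoded(String input) {
        return ENCODED.matcher(input).find();
    }

    /** Strict policy: no clean-up, just stop processing when encoding is detected. */
    static String rejectIfEncoded(String input) {
        if (looksEncoded(input)) {
            // Stand-in for an intrusion exception (assumption).
            throw new IllegalArgumentException("encoded input rejected");
        }
        return input;
    }
}
```

The trade-off of this policy is that legitimately encoded values, for example "Tom &amp; Jerry", are rejected along with the attacks.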

Aloha,
Jim

PS: You certainly can disable canonicalization as it stands today via:

  isValidInput(java.lang.String context, java.lang.String input, java.lang.String type, int maxLength, boolean allowNull, boolean canonicalize)
  https://owasp-esapi-java.googlecode.com/svn/trunk_doc/latest/org/owasp/esapi/reference/DefaultValidator.html#isValidInput%28java.lang.String,%20java.lang.String,%20java.lang.String,%20int,%20boolean,%20boolean%29


_______________________________________________
Esapi-dev mailing list
Esapi-dev at lists.owasp.org
https://lists.owasp.org/mailman/listinfo/esapi-dev




--
Matt Seil
Software Engineer
ACM/IEEE
_______________________________________________
Esapi-user mailing list
Esapi-user at lists.owasp.org
https://lists.owasp.org/mailman/listinfo/esapi-user