[OWASP-ESAPI] File Content Validation

Alex Smolen me at alexsmolen.com
Tue May 19 10:23:26 EDT 2009

I think this is why the ESAPI is an extensible API rather than a security library. The requirements for secure file contents are going to vary from organization to organization. Perhaps the reference implementation could implement the "strategy" pattern, where people can add their own file content checks (e.g. SafeFile.AddContentCheck(IContentChecker) or something similar) and we could have some default content checks (like SizeContentChecker, ImageContentChecker, etc). I don't think there is anything simple we can do to check for malicious file contents like viruses that's going to work cross-platform.
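
A rough sketch of what that strategy-style hook might look like in Java. None of these types exist in ESAPI: ContentChecker, SizeContentChecker and CheckedUpload are hypothetical names standing in for the IContentChecker / SafeFile.AddContentCheck idea above, and the byte[] signature is an assumption made to keep the example short.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical strategy interface; not part of the ESAPI API.
interface ContentChecker {
    /** Returns true if the content passes this particular check. */
    boolean isAcceptable(byte[] content);
}

// One possible default check: reject content above a configured size.
class SizeContentChecker implements ContentChecker {
    private final int maxBytes;

    SizeContentChecker(int maxBytes) {
        this.maxBytes = maxBytes;
    }

    @Override
    public boolean isAcceptable(byte[] content) {
        return content.length <= maxBytes;
    }
}

// Hypothetical stand-in for a SafeFile-like class that holds the checks.
class CheckedUpload {
    private final List<ContentChecker> checkers = new ArrayList<>();

    /** Registers an additional check and returns this object for chaining. */
    CheckedUpload addContentCheck(ContentChecker checker) {
        checkers.add(checker);
        return this;
    }

    /** Content is accepted only if every registered check accepts it. */
    boolean isAcceptable(byte[] content) {
        for (ContentChecker checker : checkers) {
            if (!checker.isAcceptable(content)) {
                return false;
            }
        }
        return true;
    }
}
```

Because the interface has a single method, an organization could also register its own check as a lambda, for example `upload.addContentCheck(c -> c.length > 0)`.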


From: "Arshan Dabirsiaghi" <arshan.dabirsiaghi at aspectsecurity.com>
Sent: Tuesday, May 19, 2009 6:36 AM
To: "Jim Manico" <jim.manico at owasp.org>
Subject: Re: [OWASP-ESAPI] File Content Validation 

Speaking generally, it is very much an unsolved issue. Ignoring totally semantic issues (i.e., what about a "goatse" picture?), even pictures can be syntactically valid while still posing a threat in one way or another (see GIFAR, polyglots, IE mime-sniffing). 
I have zero confidence that someone "validating" an MS Word file on the server (in Java or .NET) is actually protecting me. What kind of "validation" are they doing? Sum(macros) == 0? It's a ridiculously large attack surface. We would be wise like an owl not to throw our chips into this hand IMHO. 
If we want to validate pictures, fine. It should be noted then that we will also be inheriting the security of how the browser and various plugins handle them. 


From: Jim Manico [mailto:jim.manico at owasp.org]
Sent: Sun 5/17/2009 1:10 AM
To: Arshan Dabirsiaghi
Cc: Dave Wichers; owasp-esapi at lists.owasp.org
Subject: Re: [OWASP-ESAPI] File Content Validation

> File content validation is an intractable problem most of the time.  
> My vote is not to attempt to "solve" this problem.  
Well, it really depends on the file type. If you are talking about most image types, it's frankly a solved issue. Like Johnny is suggesting, the core of secure image upload is to load your image into an image library's in-memory image abstraction and have that library re-save it. When re-saving, most image libraries write only "pure" image data, which is generally safe. Other secure image upload needs include being able to terminate an upload stream the moment a file size limit is reached, saving the file into a protected portion of the OS until the file is deemed safe, AV checking, etc. A lot of very big and small players get at least image upload right. 
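
As an illustration of the load-and-re-save step only, here is a minimal sketch using the JDK's javax.imageio rather than any ESAPI class. ImageReencoder and reencodeAsPng are hypothetical names, and always writing PNG is an assumption. It does not cover size limits, quarantine storage or AV scanning, and it relies on the image decoder itself being safe on hostile input.

```java
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import javax.imageio.ImageIO;

// Hypothetical helper; not part of ESAPI.
class ImageReencoder {
    /** Decodes the upload and writes the pixels back out as a new PNG. */
    static byte[] reencodeAsPng(byte[] uploaded) throws IOException {
        BufferedImage decoded = ImageIO.read(new ByteArrayInputStream(uploaded));
        if (decoded == null) {
            // ImageIO.read returns null when no registered reader recognizes the data.
            throw new IOException("upload is not a decodable image");
        }
        // Draw onto a fresh image so that only raster data is carried over.
        BufferedImage clean = new BufferedImage(
                decoded.getWidth(), decoded.getHeight(), BufferedImage.TYPE_INT_ARGB);
        Graphics2D g = clean.createGraphics();
        try {
            g.drawImage(decoded, 0, 0, null);
        } finally {
            g.dispose();
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        if (!ImageIO.write(clean, "png", out)) {
            throw new IOException("no PNG writer available");
        }
        return out.toByteArray();
    }
}
```

The application would then store and serve the returned bytes, never the originally uploaded file.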
I suggest we split this API into image and non-image APIs. OWASP can offer a complete, production-quality and secure Java/PHP/.NET image file upload using some kind of standard back-end image library like ImageMagick.  
Supporting non-image files for secure file upload is a mixed bag. 
Doing content introspection on Microsoft document types like complex MS Word docs is brutal to impossible on the Java platform, but trivial on .NET. We were able to squeak by with ImageMagick for processing and verifying PDFs, but complex PDFs start to require Adobe solutions (ech). Processing Photoshop files was nearly impossible without also giving Adobe a lot of money - we forked off another effort that supported some native Java PSD processing for older versions of Photoshop. Processing text and other files is easy, unless you need to start processing multi-gig text files - then you might want to validate with a more stream-based approach, which gets even trickier.  
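
For the stream-based case, a minimal sketch of validating text chunk by chunk while cutting the upload off as soon as a size limit is passed. BoundedTextValidator is a hypothetical name, and the "printable ASCII plus tab and newlines" policy is purely an assumption for illustration; a real policy depends on the application and its character encoding.

```java
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper; not part of ESAPI.
class BoundedTextValidator {
    /**
     * Reads the stream in fixed-size chunks and returns the number of bytes read.
     * Throws as soon as the size limit is exceeded or a disallowed byte is seen,
     * so the whole upload is never held in memory.
     */
    static long validate(InputStream in, long maxBytes) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {
            total += n;
            if (total > maxBytes) {
                throw new IOException("upload exceeds " + maxBytes + " bytes");
            }
            for (int i = 0; i < n; i++) {
                byte b = buffer[i];
                // Assumed policy: tab, CR, LF and printable ASCII only.
                // Bytes >= 0x80 are negative in Java and fail the range test.
                boolean allowed = b == '\t' || b == '\r' || b == '\n'
                        || (b >= 0x20 && b < 0x7F);
                if (!allowed) {
                    throw new IOException("disallowed byte at offset " + (total - n + i));
                }
            }
        }
        return total;
    }
}
```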
Anyhow, in conclusion: we can solve secure image upload and provide an OWASP solution, IMO. Everything else depends on the need and software stack. 
- Jim 
----- Original Message ----- 

From: Arshan Dabirsiaghi 
To: Jim Manico 
Cc: Dave Wichers ; owasp-esapi at lists.owasp.org 
Sent: Saturday, May 16, 2009 2:56 PM
Subject: Re: [OWASP-ESAPI] File Content Validation

File content validation is an intractable problem most of the time. Virus signatures are easily bypassed and most file formats can have dangerous content by design. 

This is not to mention that every file format is different and, the majority of the time, malicious intentions can be expressed in many ways.

My vote is not to attempt to "solve" this problem. Businesses may be able to implement a version of this API that  validates batch records or something, but the spec should denote the limited scope.


On May 16, 2009, at 7:56 PM, "Jim Manico" <jim.manico at owasp.org> wrote:

My position for safe upload is that you go all the way and do it right or disable the feature. For ESAPI to give a partial solution in the ref impl is dangerous, in my highly opinionated opinion, because of how vulnerable it is. 

I'd like to see safe upload throw a RuntimeException that points to an OWASP doc explaining how to do this right - which is very complex.
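
Something along these lines, as a sketch only. UnimplementedSafeUpload is a hypothetical class, not ESAPI code, and the message deliberately names no URL since no specific document is identified here. UnsupportedOperationException is a RuntimeException, so callers are not forced to catch it.

```java
// Hypothetical placeholder; not part of ESAPI.
class UnimplementedSafeUpload {
    /** Always fails, pointing the caller at guidance instead of a partial solution. */
    static void upload(byte[] content) {
        throw new UnsupportedOperationException(
                "Safe upload is not provided by this implementation; "
                + "see the OWASP guidance on secure file upload before implementing it.");
    }
}
```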

Again, just my opinion. File upload is brutal - one of the difficult parts of web app sec.

PS: I'm on the beach writing from my iPhone. I didn't realize my last email was blasting the whole list, sorry. 

Jim Manico

On May 15, 2009, at 5:22 PM, "Dave Wichers" <dave.wichers at owasp.org> wrote:

It would absolutely be very interesting and a valuable contribution to ESAPI. We tried to make it clear in the ESAPI API documentation what these methods need to do to be a good implementation, and then in the javadoc for our reference implementation we explain what ours does, which isn't that much, and what YOUR IMPLEMENTATION still needs to do (including antivirus scanning).

So, if you wanted to implement some more powerful capabilities that we could hook into ESAPI, that would be a great contribution.

The same idea goes for SafeHTML. ESAPI could have built some primitive capabilities but lucky for us, AntiSamy already existed, so we simply adopted that as the ESAPI solution which provides far more capability than we would have implemented ourselves.


From: owasp-esapi-bounces at lists.owasp.org [mailto:owasp-esapi-bounces at lists.owasp.org] On Behalf Of Jeremy Long
Sent: Friday, May 15, 2009 8:10 PM
To: owasp-esapi at lists.owasp.org
Subject: [OWASP-ESAPI] File Content Validation

I noticed the org.owasp.esapi.SafeFile class within the ESAPI and I started considering a very difficult security problem - validation of the contents of standard file types. If you allow file uploads - forget about viruses - that can be done by hooking into a virus scanning API from one of the big companies. How do you know the content of the file is safe (think GIFAR)? I was told some of the social networks that allow image file upload (and I'm pretty sure blogspot.com does this) actually load the file into an Image object and save the image from the Image object (not the originally uploaded file). So, loading of images could be fairly easily implemented. But what about other common file types? PDF, XLS, etc.

Anyone have any ideas on validating the content of files?  Would some base file-content validators be an interesting addition to the ESAPI (with images being the easiest one)?

--Jeremy Long

OWASP-ESAPI mailing list
OWASP-ESAPI at lists.owasp.org
