[OWASP-ESAPI] File Content Validation

Arshan Dabirsiaghi arshan.dabirsiaghi at aspectsecurity.com
Tue May 19 09:36:04 EDT 2009

Speaking generally, it is very much an unsolved issue. Ignoring purely semantic issues (i.e., what about a "goatse" picture?), even pictures can be syntactically valid while still posing a threat in one way or another (see GIFAR, polyglots, IE MIME sniffing).
I have zero confidence that someone "validating" an MS Word file on the server (in Java or .NET) is actually protecting me. What kind of "validation" are they doing? Sum(macros) == 0? It's a ridiculously large attack surface. We would be wise like an owl not to throw our chips into this hand IMHO.
If we want to validate pictures, fine. It should be noted then that we will also be inheriting the security of how the browser and various plugins handle them.


From: Jim Manico [mailto:jim.manico at owasp.org]
Sent: Sun 5/17/2009 1:10 AM
To: Arshan Dabirsiaghi
Cc: Dave Wichers; owasp-esapi at lists.owasp.org
Subject: Re: [OWASP-ESAPI] File Content Validation

> File content validation is an intractable problem most of the time.
> My vote is not to attempt to "solve" this problem.
Well, it really depends on the file type. If you are talking about most image types, it's frankly a solved issue. Like Johnny is suggesting, the core of secure image upload is to load the image into an image library's in-memory image abstraction and have that library re-save it. Most image libraries write only "pure" image data when re-saving, and the result is generally safe. Other secure image upload needs include being able to terminate an upload stream the moment a file size limit is reached, saving the file into a protected portion of the OS until the file is deemed safe, AV checking, etc. A lot of very big and small players get at least image upload right.
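To make the load-and-re-save idea concrete, here is a minimal sketch in Java using the JDK's own javax.imageio rather than ImageMagick. The class and method names (ImageReencoder, readBounded, reencodeAsPng) are made up for illustration and are not part of ESAPI; the choice of PNG as the output format and of an in-memory buffer are also assumptions. It covers only two of the points above (cutting off the stream at a size limit, and re-encoding the decoded pixels); quarantine storage and AV checking are not shown.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import javax.imageio.ImageIO;

// Hypothetical helper, not an ESAPI class.
public class ImageReencoder {

    /** Reads the stream into memory, aborting as soon as more than maxBytes have arrived. */
    public static byte[] readBounded(InputStream in, int maxBytes) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        long total = 0;
        int n;
        while ((n = in.read(chunk)) != -1) {
            total += n;
            if (total > maxBytes) {
                // Stop reading immediately instead of consuming the rest of the upload.
                throw new IOException("Upload exceeds limit of " + maxBytes + " bytes");
            }
            buffer.write(chunk, 0, n);
        }
        return buffer.toByteArray();
    }

    /** Decodes the upload to pixels and writes those pixels back out as a fresh PNG. */
    public static byte[] reencodeAsPng(InputStream upload, int maxBytes) throws IOException {
        byte[] raw = readBounded(upload, maxBytes);

        // ImageIO.read returns null when no registered reader recognizes the data.
        BufferedImage decoded = ImageIO.read(new ByteArrayInputStream(raw));
        if (decoded == null) {
            throw new IOException("Upload is not a decodable image");
        }

        // Only the decoded pixel data is written; the original bytes are discarded.
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        if (!ImageIO.write(decoded, "png", out)) {
            throw new IOException("No PNG writer available");
        }
        return out.toByteArray();
    }
}
```

The caller would store the returned bytes and throw the original upload away. Note that this still depends on the image decoder itself being free of bugs, and it does nothing about what the image depicts.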
I suggest we split this API into image and non-image APIs. OWASP can offer a complete, production-quality, secure Java/PHP/.NET image file upload using some kind of standard back-end image library like ImageMagick.
Supporting non-image files for secure file upload is a mixed bag.
Doing content introspection on doc types from Microsoft, like complex MS Word docs, is brutal to impossible on the Java platform, but trivial on .NET. We were able to squeak by with ImageMagick for processing and verifying PDFs, but complex PDFs start to require Adobe solutions (ech). Processing Photoshop files was nearly impossible without also giving Adobe a lot of money - we forked off another effort that supported some native Java PSD processing for older versions of Photoshop. Processing text and other files is easy. Unless you need to start processing multi-gig text files - then you might want to validate with a more stream-based approach - which gets even trickier.
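As a rough illustration of the stream-based approach for very large text files, the sketch below validates a file chunk by chunk without holding it all in memory. What counts as "valid" here is purely an assumption for the example (well-formed UTF-8, no control characters other than tab, CR, and LF); a real application would substitute its own rules. The class and method names are hypothetical and not part of ESAPI.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not an ESAPI class.
public class StreamingTextValidator {

    /** Returns true if the stream is well-formed UTF-8 with no unexpected control characters. */
    public static boolean isCleanUtf8Text(InputStream in) throws IOException {
        // REPORT makes the decoder throw on bad bytes instead of silently replacing them.
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        Reader reader = new InputStreamReader(in, decoder);

        char[] buf = new char[8192]; // fixed-size window, independent of file size
        int n;
        try {
            while ((n = reader.read(buf)) != -1) {
                for (int i = 0; i < n; i++) {
                    char c = buf[i];
                    boolean allowedWhitespace = (c == '\n' || c == '\r' || c == '\t');
                    if (Character.isISOControl(c) && !allowedWhitespace) {
                        return false; // assumed rule: reject NUL, ESC, and similar
                    }
                }
            }
        } catch (CharacterCodingException e) {
            return false; // malformed or unmappable byte sequence
        }
        return true;
    }
}
```

Because each character is checked on its own, chunk boundaries do not matter here. Rules that span several characters (tokens, records, line formats) need state carried across chunks, which is where the streaming approach starts to get trickier.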
Anyhow, in conclusion: we can solve secure image upload and provide an OWASP solution, IMO. Everything else depends on the need and software stack.
- Jim
----- Original Message ----- 

	From: Arshan Dabirsiaghi <arshan.dabirsiaghi at aspectsecurity.com>
	To: Jim Manico <jim.manico at owasp.org>
	Cc: Dave Wichers <dave.wichers at owasp.org>; owasp-esapi at lists.owasp.org
	Sent: Saturday, May 16, 2009 2:56 PM
	Subject: Re: [OWASP-ESAPI] File Content Validation

	File content validation is an intractable problem most of the time. Virus signatures are easily bypassed and most file formats can have dangerous content by design.

	This is not to mention that every file format is different and, the majority of the time, malicious intentions can be expressed in many ways.
	My vote is not to attempt to "solve" this problem. Businesses may be able to implement a version of this API that validates batch records or something, but the spec should denote the limited scope.


	On May 16, 2009, at 7:56 PM, "Jim Manico" <jim.manico at owasp.org> wrote:

		My position for safe upload is that you go all the way and do it right or disable the feature. For ESAPI to give a partial solution in the ref impl is dangerous, in my highly opinionated opinion, because of how vulnerable it is.

		I'd like to see safe upload throw a RuntimeException that points to an OWASP doc explaining how to do this right - which is very complex.
		Again, just my opinion. File upload is brutal - one of the difficult parts of web app sec.

		PS: I'm on the beach writing from my iPhone. I didn't realize my last email was blasting the whole list, sorry. 

		Jim Manico

		On May 15, 2009, at 5:22 PM, "Dave Wichers" <dave.wichers at owasp.org> wrote:

			It would absolutely be very interesting and a valuable contribution to ESAPI. We tried to make it clear in the ESAPI API documentation what this method needs to do to be a good implementation, and then in the javadoc for our reference implementation we explain what ours does, which isn't that much, and what YOUR IMPLEMENTATION still needs to do (including antivirus scanning).


			So, if you wanted to implement some more powerful capabilities that we could hook into ESAPI, that would be a great contribution.


			The same idea goes for SafeHTML. ESAPI could have built some primitive capabilities but lucky for us, AntiSamy already existed, so we simply adopted that as the ESAPI solution which provides far more capability than we would have implemented ourselves.




			From: owasp-esapi-bounces at lists.owasp.org [mailto:owasp-esapi-bounces at lists.owasp.org] On Behalf Of Jeremy Long
			Sent: Friday, May 15, 2009 8:10 PM
			To: owasp-esapi at lists.owasp.org
			Subject: [OWASP-ESAPI] File Content Validation


			I noticed the org.owasp.esapi.SafeFile class within the ESAPI and I started considering a very difficult security problem - validation of the contents of standard file types. If you allow file uploads - forget about viruses - that can be done by hooking into a virus scanning API from one of the big companies. How do you know the content of the file is safe (think GIFAR)? I was told some of the social networks that allow image file upload (and I'm pretty sure blogspot.com does this) actually load the file into an Image object and save the image from the Image object (not the originally uploaded file). So, loading of images could be fairly easily implemented. But what about other common file types? PDF, XLS, etc.


			Anyone have any ideas on validating the content of files?  Would some base file-content validators be an interesting addition to the ESAPI (with images being the easiest one)?


			--Jeremy Long

			OWASP-ESAPI mailing list
			OWASP-ESAPI at lists.owasp.org
			https://lists.owasp.org/mailman/listinfo/owasp-esapi
