[OWASP-ESAPI] File Content Validation

Jim Manico jim.manico at owasp.org
Sun May 17 01:10:01 EDT 2009

> File content validation is an intractable problem most of the time.
> My vote is not to attempt to "solve" this problem.

Well, it really depends on the file type. If you are talking about most image types, it's frankly a solved issue. As Johnny suggests, the core of secure image upload is to load the image into an image library's image abstraction and have that library re-save it. When re-saving, most image libraries write only "pure" image data, which is generally safe. Other secure image upload needs include being able to terminate an upload stream the moment a file size limit is reached, saving the file into a protected portion of the OS until the file is deemed safe, AV checking, etc. A lot of players, big and small, get at least image upload right.
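
To make the re-save idea concrete, here is a rough Java sketch, not a reference implementation. It uses the JDK's own javax.imageio rather than ImageMagick, and the class name, method names, and the choice of PNG as the output format are all my own assumptions for illustration. It covers only two of the pieces above (cutting off an oversized upload and re-encoding); quarantined storage and AV checking are left out.

```java
import javax.imageio.ImageIO;
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper, not part of ESAPI.
final class ImageReencoder {

    /** Reads the stream, aborting as soon as more than maxBytes have arrived. */
    public static byte[] readBounded(InputStream in, int maxBytes) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int n;
        while ((n = in.read(chunk)) != -1) {
            if (buf.size() + n > maxBytes) {
                throw new IOException("upload exceeds " + maxBytes + " bytes");
            }
            buf.write(chunk, 0, n);
        }
        return buf.toByteArray();
    }

    /** Decodes the upload and writes the decoded pixels back out as a fresh PNG. */
    public static byte[] reencodeAsPng(byte[] upload) throws IOException {
        // ImageIO.read returns null when no registered reader can decode the data.
        BufferedImage img = ImageIO.read(new ByteArrayInputStream(upload));
        if (img == null) {
            throw new IOException("not a decodable image");
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // ImageIO.write returns false when no writer exists for the format.
        if (!ImageIO.write(img, "png", out)) {
            throw new IOException("no PNG writer available");
        }
        return out.toByteArray();
    }
}
```

The application would then store and serve only the bytes returned by reencodeAsPng, never the original upload, so anything tacked onto the file (the GIFAR trick) does not survive.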

I suggest we split this API into image and non-image APIs. OWASP can offer a complete, production-quality, secure Java/PHP/.NET image file upload using some kind of standard back-end image library like ImageMagick.

Supporting non-image files for secure file upload is a mixed bag.

Doing content introspection on Microsoft document types like complex MS Word docs is brutal to impossible on the Java platform, but trivial on .NET. We were able to squeak by with ImageMagick for processing and verifying PDFs, but complex PDFs start to require Adobe solutions (ech). Processing Photoshop files was nearly impossible without also giving Adobe a lot of money - we forked off another effort that supported some native Java PSD processing for files from older versions of Photoshop. Processing text and other files is easy, unless you need to process multi-gig text files - then you might want to validate with a more stream-based approach, which gets even trickier.
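
For the multi-gig text case, a stream-based check might look roughly like the Java sketch below. The policy it enforces (strict UTF-8, no control characters other than tab, CR and LF) is purely an assumption for illustration, as are the class and method names; a real validator would apply whatever rules the application actually needs. The point is only that the file is checked in fixed-size chunks and never held in memory whole.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of ESAPI.
final class StreamingTextValidator {

    /** Returns true if the stream is well-formed UTF-8 with no disallowed control characters. */
    public static boolean isPlainUtf8Text(InputStream in) throws IOException {
        // REPORT makes the decoder raise an error on bad bytes instead of
        // silently substituting a replacement character.
        CharsetDecoder strict = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        Reader reader = new InputStreamReader(in, strict);
        char[] chunk = new char[8192];
        int n;
        try {
            while ((n = reader.read(chunk)) != -1) {
                for (int i = 0; i < n; i++) {
                    char c = chunk[i];
                    // Assumed policy: only tab, LF and CR are allowed below 0x20.
                    if (c < 0x20 && c != '\t' && c != '\n' && c != '\r') {
                        return false;
                    }
                }
            }
        } catch (CharacterCodingException e) {
            return false; // malformed UTF-8 somewhere in the stream
        }
        return true;
    }
}
```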

Anyhow, in conclusion: we can solve secure image upload and provide an OWASP solution, IMO. Everything else depends on the need and software stack.

- Jim

----- Original Message ----- 
  From: Arshan Dabirsiaghi 
  To: Jim Manico 
  Cc: Dave Wichers ; owasp-esapi at lists.owasp.org 
  Sent: Saturday, May 16, 2009 2:56 PM
  Subject: Re: [OWASP-ESAPI] File Content Validation

  File content validation is an intractable problem most of the time. Virus signatures are easily bypassed and most file formats can carry dangerous content by design.

  This is not to mention that every file format is different and, the majority of the time, malicious intentions can be expressed in many ways.

  My vote is not to attempt to "solve" this problem. Businesses may be able to implement a version of this API that validates batch records or something, but the spec should denote the limited scope.


  On May 16, 2009, at 7:56 PM, "Jim Manico" <jim.manico at owasp.org> wrote:

    My position for safe upload is that you go all the way and do it right or disable the feature. For ESAPI to give a partial solution in the ref impl is dangerous, in my highly opinionated opinion, because of how vulnerable it is.

    I'd like to see safe upload throw a RuntimeException that points to an OWASP doc explaining how to do this right - which is very complex.

    Again, just my opinion. File upload is brutal - one of the difficult parts of web app sec.

    PS: I'm on the beach writing from my iPhone. I didn't realize my last email was blasting the whole list, sorry. 

    Jim Manico

    On May 15, 2009, at 5:22 PM, "Dave Wichers" <dave.wichers at owasp.org> wrote:

      It would absolutely be very interesting and a valuable contribution to ESAPI. We tried to make it clear in the ESAPI API documentation what these methods need to do to be a good implementation, and then in the javadoc for our reference implementation we explain what ours does, which isn't that much, and what YOUR IMPLEMENTATION still needs to do (including antivirus scanning).

      So, if you wanted to implement some more powerful capabilities that we could hook into ESAPI, that would be a great contribution.

      The same idea goes for SafeHTML. ESAPI could have built some primitive capabilities but lucky for us, AntiSamy already existed, so we simply adopted that as the ESAPI solution which provides far more capability than we would have implemented ourselves.


      From: owasp-esapi-bounces at lists.owasp.org [mailto:owasp-esapi-bounces at lists.owasp.org] On Behalf Of Jeremy Long
      Sent: Friday, May 15, 2009 8:10 PM
      To: owasp-esapi at lists.owasp.org
      Subject: [OWASP-ESAPI] File Content Validation

      I noticed the org.owasp.esapi.SafeFile class within the ESAPI and I started considering a very difficult security problem - validation of the contents of standard file types.  If you allow file uploads - forget about viruses - that can be done by hooking into a virus scanning API from one of the big companies. How do you know the content of the file is safe (think GIFAR)?  I was told some of the social networks that allow image file upload (and I'm pretty sure blogspot.com does this) actually load the file into an Image object and save the image from that object (not the originally uploaded file).  So, loading of images could be fairly easily implemented.  But what about other common file types?  PDF, XLS, etc.

      Anyone have any ideas on validating the content of files?  Would some base file-content validators be an interesting addition to the ESAPI (with images being the easiest one)?

      --Jeremy Long

      OWASP-ESAPI mailing list
      OWASP-ESAPI at lists.owasp.org
