[Esapi-dev] Question about ESAPI encoding
Rohit Sethi
rklists at gmail.com
Mon Sep 20 11:16:47 EDT 2010
I think it's an option on existing methods, and an optional attribute
for the tag libraries. 'DoubleEscapeProtection' or something similar
to the anti-xss library would work.
In order to preserve backwards compatibility I'd set the default for
this option to false, and advise users through the javadoc about the
benefits of turning it to true.
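
A minimal sketch of how such an option could behave, assuming a hand-rolled
HTML escaper rather than the real ESAPI codecs. The class name
DoubleEscapeSketch, the method signature, and the boolean parameter
doubleEscapeProtection are all hypothetical and are not part of ESAPI or
AntiXSS. The whitelist (a few named entities plus decimal numeric references
of up to seven digits) is likewise an assumption chosen for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical illustration only; not the ESAPI HTMLEntityCodec. */
final class DoubleEscapeSketch {

    // Assumed whitelist: &lt; &gt; &quot; &amp; and decimal references such as &#60;
    private static final Pattern ESCAPED_ENTITY =
            Pattern.compile("&(?:lt|gt|quot|amp|#[0-9]{1,7});");

    /**
     * Escapes HTML metacharacters. When doubleEscapeProtection is true,
     * a whitelisted entity already present in the input is copied through
     * unchanged instead of having its '&' escaped a second time.
     */
    static String encodeForHTML(String input, boolean doubleEscapeProtection) {
        StringBuilder out = new StringBuilder(input.length() + 16);
        Matcher entity = ESCAPED_ENTITY.matcher(input);
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (c == '&' && doubleEscapeProtection) {
                // Look ahead from the '&' for a whitelisted entity.
                entity.region(i, input.length());
                if (entity.lookingAt()) {
                    out.append(entity.group());
                    i = entity.end();
                    continue;
                }
            }
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&#39;");  break;
                default:   out.append(c);
            }
            i++;
        }
        return out.toString();
    }
}
```

With the flag set to false the method escapes every '&', which matches the
backwards-compatible default suggested above. With the flag set to true,
output from a tag library that has already escaped its data can be passed
through a second time without being mangled.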
On 9/20/10, Jeff Williams <jeff.williams at owasp.org> wrote:
> I totally agree that the taglib issue is a big deal.
>
> Your idea here is good. If we had a way to escape for a particular context
> without fear of double escaping, we could prevent most screwed up content
> and stop XSS.
>
> I was hoping to take this one step further, and really clean up the data
> before we send it to a browser. That's why I'm focused on canonicalize. I've
> seen too many cases where a downstream process decodes and "reactivates" an
> attack. But I'm willing to let this go in order to get the big benefit now.
>
> Let's get the API worked out! Do you think this is a new method, an option
> on the existing escaping methods, or something else?
>
> Thanks,
>
> --Jeff
>
>
> -----Original Message-----
> From: esapi-dev-bounces at lists.owasp.org
> [mailto:esapi-dev-bounces at lists.owasp.org] On Behalf Of Rohit Sethi
> Sent: Monday, September 20, 2010 1:33 AM
> To: Jeff Williams
> Cc: ESAPI-Developers
> Subject: Re: [Esapi-dev] Question about ESAPI encoding
>
> I would certainly like to leverage the existing codecs.
>
> My objection to the canonicalize method is exactly the problem you
> discussed earlier - suppose a developer wants to show HTML syntax and
> outputs &amp;lt; - the canonicalized version would incorrectly remove
> the double-encoding. The result would be < which is not what the
> developer intended. Correct me if I'm wrong here - I'm trying to
> understand this based on the current canonicalize method in the
> DefaultEncoder class.
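
The concern about canonicalizing before encoding can be illustrated with a
toy decoder. This is a stand-in written for this example, not the
DefaultEncoder implementation: the class CanonicalizeSketch and its methods
are hypothetical, and it handles only four named entities. It assumes that
canonicalization means decoding repeatedly until the value stops changing.

```java
/** Hypothetical illustration only; not ESAPI's DefaultEncoder. */
final class CanonicalizeSketch {

    // Decodes exactly one layer of the four entities handled here.
    // "&amp;" is replaced last so that "&amp;lt;" becomes "&lt;", not "<".
    static String decodeOnce(String s) {
        return s.replace("&lt;", "<")
                .replace("&gt;", ">")
                .replace("&quot;", "\"")
                .replace("&amp;", "&");
    }

    // Decodes repeatedly until the value no longer changes.
    static String canonicalize(String s) {
        String previous;
        do {
            previous = s;
            s = decodeOnce(s);
        } while (!s.equals(previous));
        return s;
    }

    // Plain single-pass HTML escaping; '&' is handled first.
    static String encodeForHTML(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }
}
```

A developer who deliberately writes &amp;lt; so that the browser displays the
literal text &lt; loses that intent here: canonicalize("&amp;lt;") collapses
to <, and encoding that value yields &lt;, which the browser renders as a
bare < character.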
>
> I think that canonicalizing all output prior to encoding will probably
> work correctly in most cases, but will definitely break in some cases
> (such as the example you articulated). The approach I'm suggesting
> will do exactly what the codec was intended to do (i.e. output
> HTML/HTML Attribute/JavaScript/CSS escaped data - without concern if
> that data has other, mixed encodings within it) albeit with the
> provision that it won't re-encode something that's already been
> escaped for that particular output type (e.g. HTML codec will not
> re-encode HTML entities).
>
> This would be a major boon because we could apply it seamlessly to most
> tag libraries that may output untrusted data without concern about
> double encoding and breaking the data. Without it, we have to expect
> developers to know the default encoding behavior of every tag library
> they use. Sure, some diligent developers will go out of their way to
> research this but I'm sure many others will just abandon trying to
> escape altogether because they run into pesky double encoding.
>
> On Sun, Sep 19, 2010 at 9:56 PM, Jeff Williams
> <jeff.williams at aspectsecurity.com> wrote:
>> You're right that it *shouldn't* be overly complex. It just is. There
>> are roughly 70 different ways to encode the < character using the escape
>> formats you mentioned.
>>
>> I suggest you consider using the existing ESAPI codecs to do what you're
>> suggesting. What is your objection to using the existing canonicalize
>> method - which handles this already?
>>
>> --Jeff
>>
>>
>> -----Original Message-----
>> From: esapi-dev-bounces at lists.owasp.org
>> [mailto:esapi-dev-bounces at lists.owasp.org] On Behalf Of Rohit Sethi
>> Sent: Monday, September 20, 2010 12:32 AM
>> To: Jeff Williams; ESAPI-Developers
>> Subject: Re: [Esapi-dev] Question about ESAPI encoding
>>
>> (Joined the esapi-dev list on a different email account)
>>
>> Jeff, I'm proposing that doing this for just the html, html attribute,
>> javascript, and css codecs *shouldn't* be overly complex. I'm also
>> confused on how this idea would break in multiple / nested encodings
>> any more than the current codecs already do.
>>
>> Maybe the best way to do this is to test it out .. I'd like to
>> volunteer for this but I can't realistically say when I'll get around
>> to it. If anyone else cares to do this in the interim I'd be much
>> obliged.
>>
>>
>>
>> On 9/19/10, Jeff Williams <jeff.williams at owasp.org> wrote:
>>> Hi Rohit, I think we're proposing essentially the same thing, except that
>>> your approach doesn't work for all the different escaping formats because it
>>> would be prohibitively complex to write a parser to handle them all. The
>>> ESAPI Codecs are intentionally very simple to hide some of this complexity.
>>> It also won't really do what you want if there are double or nested encoding
>>> scenarios. You'll end up with messed up HTML. My assertion is that 99% of
>>> the time, you want to canonicalize fully, then escape properly.
>>>
>>>
>>>
>>> --Jeff
>>>
>>>
>>>
>>> From: Sethi, Rohit [mailto:rohit at securitycompass.com]
>>> Sent: Sunday, September 19, 2010 3:15 PM
>>> To: 'jeff.williams at owasp.org'; 'jim at manico.net'
>>> Subject: Re: Question about ESAPI encoding
>>>
>>>
>>>
>>> Jeff, forgive my ignorance on the subject. I'm not sure I understand the
>>> need to canonicalize prior to encoding. If there was a simple option to not
>>> encode existing escaped sequences for the specific codec in question then I
>>> believe this - without canonicalization - will still achieve the goal. For
>>> example looking at &lt; the encoder would spot "&", avoid encoding
>>> it, and the characters rendered in the browser would be "<" which is the
>>> display html syntax you discussed below.
>>>
>>> I think I'm missing your point about nested encoding schemes. In what
>>> scenario would the above solution fail with nested encodings?
>>>
>>>
>>>
>>> _____
>>>
>>> From: Jeff Williams <jeff.williams at owasp.org>
>>> To: James Manico <jim at manico.net>
>>> Cc: Sethi, Rohit
>>> Sent: Sat Sep 18 21:37:24 2010
>>> Subject: Re: Question about ESAPI encoding
>>>
>>> The right way to do this is to canonicalize before encode. The approach
>>> suggested won't work if there are multiple or nested encoding schemes used
>>> in the data. ESAPI used to do exactly this. But there was some pushback
>>> because a site that wants to display html syntax (for example) won't work
>>> anymore as it requires double escaping. I like the idea of making an option
>>> for this at the risk of confusing some. Actually I like better the idea of
>>> making this the default and providing an option to allow multiple encoding
>>> of a single type.
>>>
>>> --Jeff
>>>
>>>
>>>
>>> Jeff Williams
>>>
>>> Aspect Security
>>>
>>> work: 410-707-1487
>>>
>>> main: 301-604-4882
>>>
>>>
>>>
>>>
>>>
>>>
>>> On Sep 18, 2010, at 7:46 PM, James Manico <jim at manico.net> wrote:
>>>
>>> Jeff,
>>>
>>>
>>>
>>> Any thoughts here sir? Good idea from Rohit....
>>>
>>>
>>>
>>> Rohit, can I send this to the ESAPI-dev list?
>>>
>>>
>>>
>>> - Jim
>>>
>>>
>>>
>>> From: Sethi, Rohit [mailto:rohit at securitycompass.com]
>>> Sent: Thursday, September 16, 2010 5:07 AM
>>> To: James Manico
>>> Subject: Question about ESAPI encoding
>>>
>>>
>>>
>>> Jim, when creating the ESAPI encoding has anyone thought about adding an
>>> optional configuration to avoid double encoding? I believe Microsoft's
>>> AntiXSS library has this concept.
>>>
>>>
>>> Basically, when encoding if I find the existence of a potential escape
>>> character (e.g. &) then I traverse the next few characters to see if it
>>> matches a white list: &lt; &gt; &quot; &amp;, etc. and &#<digit><digit>; -
>>> in those cases I avoid encoding those specific characters. This allows us to
>>> safely wrap any tag library that might output HTML without worrying about
>>> double encoding.
>>>
>>>
>>> You could provide equivalent functionality to other codecs such as XML,
>>> HTMLAttributes, JavaScript, etc.
>>>
>>>
>>>
>>> Rohit Sethi
>>>
>>> Director, Professional Services
>>>
>>> Security Compass
>>>
>>> http://www.securitycompass.com
>>>
>>> Direct : 888-777-2211 ext. 102
>>>
>>> Mobile: 732.546.4473
>>>
>>> Twitter: rksethi
>>
>> --
>> Sent from my mobile device
>>
>> Rohit Sethi
>> Security Compass
>> http://www.securitycompass.com
>> twitter: rksethi
>> _______________________________________________
>> Esapi-dev mailing list
>> Esapi-dev at lists.owasp.org
>> https://lists.owasp.org/mailman/listinfo/esapi-dev
>>
>
>
>
> --
> Rohit Sethi
> Security Compass
> http://www.securitycompass.com
> twitter: rksethi
>
>
--
Sent from my mobile device
Rohit Sethi
Security Compass
http://www.securitycompass.com
twitter: rksethi