[Esapi-dev] [Esapi-user] URL Validation and Encoding

Jim Manico jim.manico at owasp.org
Thu Sep 23 21:32:44 EDT 2010

Yup, I totally agree - we can use the URI class or ESAPI.encodeForURL()


I'm just looking for an encoding function in order to shove an untrusted
URL, URL fragment, or URL parameter into an href link context in a way that
stops XSS without breaking the URL. These are special cases since the data
is in an attribute context, <a href="DATA">click me</a>, but we do not want
to attribute encode here.


And I think there are 2 cases to consider:


1)      The URL root is hard coded and you only need to encode a GET parameter

a.       <a href="/site/user?id=UNTRUSTED-DATA">click me</a>

b.      In this case we just URL encode UNTRUSTED-DATA

2)      The untrusted data is a relative or absolute URL

a.       <a href="UNTRUSTED-DATA">click me</a>

b.      We cannot URL encode UNTRUSTED-DATA here or we will break the link

c.       We can surely use the URI class under the hood here
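
A minimal sketch of case 1, assuming the standard java.net.URLEncoder as a
stand-in for whatever ESAPI ends up delegating to. The class and method names
here (HrefParamExample, userLink) are made up for illustration and are not
part of ESAPI.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

final class HrefParamExample {
    // Case 1: the URL root is fixed by the developer; only the parameter
    // value is untrusted, so only that value is URL encoded.
    static String userLink(String untrustedId) {
        try {
            // URLEncoder produces application/x-www-form-urlencoded output:
            // a space becomes '+', and characters such as '"', '<', '>',
            // '&' and '=' become %XX escapes.
            return "/site/user?id=" + URLEncoder.encode(untrustedId, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // Every Java platform is required to support UTF-8.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // prints /site/user?id=%22%3E%3Cscript%3Ealert%281%29%3C%2Fscript%3E
        System.out.println(userLink("\"><script>alert(1)</script>"));
    }
}
```

The encoded value contains no quote, angle bracket, or ampersand, so it
cannot close the href attribute or start a new tag.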


So I'm thinking that we still need:




ESAPI.encoder().encodeURL(String url)

And perhaps

ESAPI.encoder().encodeURL(List<String> legalProtocols, String url)

And/or make legal protocols configurable


Also, a URL needs to be valid to be encoded for safe display if we use this
scheme. If a URL is invalid at encoding time, perhaps just return a "#" or a
blank string?
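
To make the proposal concrete, here is a rough sketch of what the
two-argument encodeURL might look like if it were built on java.net.URI. It
is only one possible reading of the idea: the class name UrlEncoderSketch,
the static method, and the choice of "#" over a blank string are assumptions,
not ESAPI code.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.List;
import java.util.Locale;

final class UrlEncoderSketch {
    // Returns a form of the URL that is safe to write out, or "#" when the
    // input is not a valid URI or uses a scheme that is not whitelisted.
    static String encodeURL(List<String> legalProtocols, String url) {
        if (url == null) {
            return "#";
        }
        final URI uri;
        try {
            // The single-argument constructor rejects illegal characters
            // such as '"', '<', '>' and whitespace instead of passing them on.
            uri = new URI(url);
        } catch (URISyntaxException e) {
            return "#";
        }
        String scheme = uri.getScheme();
        // A null scheme means a relative reference; this sketch allows it.
        if (scheme != null
                && !legalProtocols.contains(scheme.toLowerCase(Locale.ROOT))) {
            return "#";
        }
        // toASCIIString() percent-encodes any non-ASCII characters.
        return uri.toASCIIString();
    }
}
```

With legalProtocols set to http and https, "javascript:alert(1)" comes back
as "#", while "/site/user?id=1" and "http://example.com/a?b=c" come back
unchanged. Characters that are legal in a URI but meaningful in HTML, such as
'&' and the single quote, are still present in the result, so this step alone
does not settle the attribute-encoding question.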


Again, why this madness? I'm trying to get away from regular expression
based defense and instead take a page from compiler design thinking: (1)
first load the input in question (a URL) into an object abstraction that
formally models that input and then (2) "write" the data in a whitelist way,
only supporting features of that input that are safe.


I've seen proprietary versions of AntiSamy that do this, where instead of a
regular-expression based set of rules, the untrusted HTML is loaded into an
HTML abstraction set of classes, like Wicket. Then, the "clean" function
would just write out only the legal tags that are to be supported. This
kind of coding is (1) way faster (2) way more accurate (3) simpler code (4)
less chance of failure over time. I think.


- Jim









From: esapi-dev-bounces at lists.owasp.org
[mailto:esapi-dev-bounces at lists.owasp.org] On Behalf Of Chris Schmidt
Sent: Thursday, September 23, 2010 4:01 AM
To: esapi-dev at lists.owasp.org
Subject: Re: [Esapi-dev] [Esapi-user] URL Validation and Encoding


It seems like we may be redoing a lot of work here that is already done for
us - 

From the JavaDocs on java.net.URL:

Note, the URI class
<http://download.oracle.com/javase/6/docs/api/java/net/URI.html> does
perform escaping of its component fields in certain circumstances. The
recommended way to manage the encoding and decoding of URLs is to use URI,
and to convert between these two classes using toURI() and URI.toURL().

Unless I am missing something, why not just use the built-in API to perform
the encoding of the URL?
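
As a small illustration of the escaping the JavaDoc refers to, the
multi-argument java.net.URI constructors quote illegal characters in each
component. The wrapper class and method names below (UriEscapingExample,
build) are invented for this example.

```java
import java.net.URI;
import java.net.URISyntaxException;

final class UriEscapingExample {
    // Builds a URI from separate components. The five-argument constructor
    // (scheme, authority, path, query, fragment) percent-encodes characters
    // that are not legal in each component, such as spaces and quotes.
    static String build(String scheme, String authority, String path, String query) {
        try {
            return new URI(scheme, authority, path, query, null).toASCIIString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // prints http://www.example.com/pages/my%20file.html?q=a%20b
        System.out.println(
            build("http", "www.example.com", "/pages/my file.html", "q=a b"));
    }
}
```

Note that these constructors always quote '%' and leave '&' and '=' in the
query alone, so individual parameter values still need to be URL encoded
before they are assembled into the query string.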

Validation is another story altogether, but URL validation seems like a big
dark hole that could lead to some interesting assumptions and expectations -
I have written a couple of URL validators that even go so far as to do a DNS
lookup of the domain, submit a request to the url specified (and thought has
been given to scanning the response for *dangerous* content), verify the
response code is a 200 and only then would the URL be valid.

Point here being that while this sounds like something that may be somewhat
useful to a handful of people, and perhaps at least a basic "this is a
valid URL" functionality would be helpful, I think that there are bigger
fish to fry than re-inventing RFC2396 encoding for URLs. To the best of my
knowledge, the URI encoding is fully compliant. If we really want to add to
the encoding interface, perhaps just a delegation method to that is the
right way to go?


On 9/22/2010 11:58 PM, Jim Manico wrote: 

We can add a second encoder for relative URLs, but the programmer would
need to specify the domain, using one of the other URL constructors, like:
  new URL("http", "www.gamelan.com", "/pages/Gamelan.net.html");
And ESAPI would provide:
ESAPI.encoder().encodeCompleteURL(String URL);
ESAPI.encoder().encodeURLParameter(String data); //Javascript calls this a
ESAPI.encoder().encodeRelativeURL(String root, String relativeURL);
As well as
ESAPI.validator().assertValidCompleteURL(String url) throws
ESAPI.validator().assertValidRelativeURL(String root, String relativeURL)
throws ValidationException;
boolean ESAPI.validator().isValidCompleteURL(String url);
boolean ESAPI.validator().isValidRelativeURL(String root, String
- Jim
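
One way the root-plus-relative idea could be sketched with java.net.URI
rather than java.net.URL is shown below. The method name mirrors the
encodeRelativeURL proposal above, but the class name, the same-host rule, and
the use of IllegalArgumentException in place of ValidationException are all
assumptions made for the example.

```java
import java.net.URI;

final class RelativeUrlSketch {
    // Resolves relativeURL against root and rejects any result that ends up
    // on a different scheme, host, or port than the root.
    static String encodeRelativeURL(String root, String relativeURL) {
        // URI.create throws IllegalArgumentException on malformed input.
        URI base = URI.create(root);
        URI resolved = base.resolve(URI.create(relativeURL));

        boolean sameScheme = base.getScheme() != null
                && base.getScheme().equalsIgnoreCase(resolved.getScheme());
        boolean sameHost = base.getHost() != null
                && base.getHost().equalsIgnoreCase(resolved.getHost());
        boolean samePort = base.getPort() == resolved.getPort();

        if (!sameScheme || !sameHost || !samePort) {
            throw new IllegalArgumentException(
                "URL does not stay under the given root: " + relativeURL);
        }
        return resolved.toASCIIString();
    }
}
```

With a root of "http://www.example.com/pages/", the input "../a/b.html"
resolves to "http://www.example.com/a/b.html", while "//evil.example/x" and
"javascript:alert(1)" are rejected because the resolved scheme or host no
longer matches the root.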
-----Original Message-----
From: Ed Schaller [mailto:schallee at darkmist.net] 
Sent: Wednesday, September 22, 2010 4:44 PM
To: augustd
Cc: Jim Manico; ESAPI-Developers; esapi-user at lists.owasp.org
Subject: Re: [Esapi-user] [Esapi-dev] URL Validation and Encoding

This should be easy enough to do with built-in methods of java.net.URL:
getProtocol(), getHost(), getPath(), etc.

Just to be the devil's advocate here, what happens if the URL the
developer wants to support doesn't have a protocol handler? Is this
something we care about? If it is, java.net.URL won't work well, and
adding new protocol handlers has implications for ClassLoaders and Java
2 security.
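
The protocol handler concern can be seen with a short test. This is only a
sketch: the class and method names are invented, "foo" stands in for any
scheme with no registered handler, and recent JDKs flag the URL constructor
as deprecated.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;

final class UnknownSchemeExample {
    // java.net.URL needs a protocol handler for the scheme and throws
    // MalformedURLException when none is available.
    static boolean urlAccepts(String s) {
        try {
            new URL(s);
            return true;
        } catch (MalformedURLException e) {
            return false;
        }
    }

    // java.net.URI only checks syntax, so any well-formed scheme parses.
    static boolean uriAccepts(String s) {
        try {
            new URI(s);
            return true;
        } catch (URISyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String s = "foo://host/path";
        System.out.println("URL accepts: " + urlAccepts(s));
        System.out.println("URI accepts: " + uriAccepts(s));
    }
}
```

On a stock JVM with no handler registered for "foo", the URL constructor
fails while the URI parses, which is one argument for doing the parsing with
URI and keeping the list of legal schemes as plain strings.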


Esapi-dev mailing list
Esapi-dev at lists.owasp.org

