[Owasp Source Flaws Top 10] Education and Culture; Canonicalization
erlend at oftedal.no
Thu Dec 18 10:58:14 EST 2008
I was merely using SQL as an example.
To me the real problem is not separating good and bad input. That is what
input validation is trying to solve.
To me the real problem is the fact that we are mixing control with data.
This is true not only for SQL queries, but for XSS, LDAP injection, XPATH
When we are mixing control and data, data can become control. So we need
to make sure that data stays data. In SQL this means escaping the data for
SQL. For HTML this means HTML-encoding the data. For LDAP you need to
escape the LDAP metacharacters. For web services you need to XML-encode
If you've read all my posts in this mailing list, you will see that I'm
definitely not saying that input validation is unnecessary. I'm saying BOTH
input validation and output escaping should be used. There are domains
where it is easy to solve everything using input validation (example:
numeric), and there are domains where this is almost impossible (example:
blog comments in different languages). In most cases, a mix is probably
My point was only that input validation may not be enough. If you want to
allow people to post SQL statements, HTML-snippets, LDAP queries etc. in
the text of a page (example: a blog comment), this will be impossible to
solve using input validation, because syntactically and semantically the
comment is valid even though it contains characters that are dangerous for
a given subsystem. So we need to escape it for each subsystem we want
to exchange data with, be it a browser, SQL, Web Services or something
completely different, and that's not something we can do as a part of the
input validation. It needs to be performed when we are building the
queries and adding the user data.
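
As a minimal sketch of that idea (Python, using the standard sqlite3 and
html modules; the comments table and the function names store_comment and
render_comment are made up for illustration, not taken from any real
application), the same comment is handed to the database through a
parameterized query and is HTML-encoded only at the point where it is
written into a page:

```python
import html
import sqlite3


def store_comment(conn, author, body):
    # Parameterized query: the values travel separately from the SQL text,
    # so quotes and keywords in the data are never parsed as SQL.
    conn.execute(
        "INSERT INTO comments (author, body) VALUES (?, ?)",
        (author, body),
    )


def render_comment(author, body):
    # HTML-encode when building the markup, i.e. escape for the browser
    # at the moment the data is sent to the browser.
    return "<p><b>{}</b>: {}</p>".format(html.escape(author), html.escape(body))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE comments (author TEXT, body TEXT)")

# A comment that is valid text but dangerous for both SQL and HTML.
comment = "'; DROP TABLE comments; -- <script>alert(1)</script>"
store_comment(conn, "O'Brian", comment)

# The data comes back unchanged, including the quote in the name.
author, body = conn.execute("SELECT author, body FROM comments").fetchone()
page = render_comment(author, body)
```

The stored values are the original, unmodified strings; only the
representation sent to each subsystem differs. Input validation can still
be applied before store_comment is called.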
So no, IMHO I'm not mixing a problem with its solution. Input validation
and output escaping are both attempts to solve the problem described
above. Input validation can also help solve other problems like semantic
validation as you mentioned in your post, but it cannot necessarily solve
On Thu, 18 Dec 2008, Andrew Petukhov wrote:
> Well, let us start from the beginning.
> Let us define what we mean by "input validation". From my
> perspective, "input validation" is a process that aims at distinguishing
> bad input from good input for a given service.
> Along the way, two questions arise:
> 1. How to determine good input?
> 2. What to do with bad input?
> There are two major kinds of checks in order to determine good input:
> syntax checking and semantics checking.
> For instance, let us consider Square root service, which takes one input
> HTTP parameter. In order to validate input we should check:
> - its syntax (i.e. a number was supplied)
> - its semantics (i.e. a nonnegative number was supplied)
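
The two checks in the square root example could be sketched roughly as
follows (Python; the function names parse_sqrt_input and sqrt_service are
hypothetical, and rejecting bad input with an error is just one of the
possible reactions):

```python
import math


def parse_sqrt_input(raw):
    # Syntax check: the parameter must parse as a finite number.
    try:
        value = float(raw)
    except (TypeError, ValueError):
        raise ValueError("input is not a number") from None
    if not math.isfinite(value):
        raise ValueError("input is not a finite number")
    # Semantics check: the number must be nonnegative.
    if value < 0:
        raise ValueError("input is negative")
    return value


def sqrt_service(raw):
    # Stop processing and report an error on bad input; otherwise compute.
    return math.sqrt(parse_sqrt_input(raw))
```

Here sqrt_service("9") returns 3.0, while "-1" and "abc" are rejected with
a ValueError.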
> How to perform syntax checks is a question for further investigation
> (not a primary one): canonicalization, white listing vs black listing,
> using grammars, etc. Those are solutions to the problem of how to
> separate bad input from good input.
> The second question: what to do with bad input?
> And possible solutions are:
> - remove it (strip carriage returns by Trim function)
> - stop processing and return an error (as in the square root example)
> And the exact answers to these two questions (how to tell bad input from
> good, and what to do with bad input) must be worked out for each service
> used: SQL, generation of HTML (i.e. using the browser interpreter), shell
> interpreter invocation, mathematical operations and so on.
> And what you, Erlend, are proposing is to mix the problem (separating bad
> and good input) with a particular way to solve it (output escaping).
> Furthermore, your arguments are based only upon common SQL service.
> However there are plenty of other services.
> How would you escape output before a square root function?
> So, the fact is: we should not mix the problem and a solution to it in
> one entity.
> Erlend Oftedal wrote:
> Let us consider the following logic:
> 1. I cut off everything I deem malicious.
> 2. If I encounter malicious input, I present a user a custom error page
> Definitely, there is no vulnerability here. However, output escaping is
> How do you know what is malicious? If you are going to allow users from all
> over the world to input data, how do you know whether a name is valid or
> not? And as mentioned before, the name "O'Brian" contains a quote. How do
> you handle this? Do you cut it out? O'Brian is a valid name, and should be
> stored as a valid name including the quote. For all I know "Delete",
> "Select" or "Union" might be names in some country, which makes it hard
> to filter out SQL keywords.
> Building a proper whitelist for input validation can be hard. It's easy
> for things like numbers, but it gets harder the more complex the user
> input is. Comments are probably the worst type to build a proper white
> list for. You might even want to allow people to enter SQL statements or
> at least SQL keywords like "select" or "delete" because those words are
> used in natural language.
> Let's consider a different logic:
> 1. I escape user input whenever I send it to a subsystem, and I escape it
> for that subsystem (HTML-encoding for the browser, quote escaping etc. or
> parameterized queries for SQL database connections etc.)
> There are no injection vulnerabilities in this system, however input
> validation is missing.
> Erlend Oftedal