[Owasp-esapi-c++] ESAPI for C++ and compatibility with ESAPI for Java
Kevin W. Wall
kevin.w.wall at gmail.com
Thu Aug 4 05:51:51 EDT 2011
Gentlemen,
First note that if you Reply-All, you will get a bounce from the ESAPI C++
list unless you have previously subscribed to it. So that's why I've addressed
several of you separately. (And in case you're wondering why I'm still up...
insomnia.)
Anyway, one issue that has already come up in writing code for the Codec
class: in Codec, in the ESAPI for Java code, 'char' is used in a lot of
methods. As most of you are aware, 'char' in Java is 16 bits and is used to
represent a single Unicode character. In C/C++, 'char' is 8 bits. Furthermore,
in Java, 'char' is unsigned and has a range of '\u0000' to '\uffff' [65,535]
(see
http://download.oracle.com/javase/tutorial/java/nutsandbolts/datatypes.html),
whereas on most architectures in C/C++, 'char' is an 8-bit signed quantity.
So, in C/C++, wchar_t is probably closer to Java's 'char' than 'char' is,
although its width is platform-dependent (16 bits on Windows, typically 32
bits on Linux).
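To make the size and signedness comparison concrete, here is a minimal
sketch that reports both properties for a character type on whatever
platform it is compiled on. The helper name describe_char_type is
hypothetical and is not part of ESAPI; it only uses the standard
<climits> and <limits> facilities.

```cpp
#include <climits>
#include <limits>
#include <sstream>
#include <string>

// Hypothetical helper (not part of ESAPI): describes the width and
// signedness of a character type on the current platform.
template <typename CharT>
std::string describe_char_type(const char* name) {
    std::ostringstream out;
    out << name << ": " << sizeof(CharT) * CHAR_BIT << " bits, "
        << (std::numeric_limits<CharT>::is_signed ? "signed" : "unsigned");
    return out.str();
}
```

Calling describe_char_type<char>("char") and
describe_char_type<wchar_t>("wchar_t") on each target platform would show
whether the assumptions above hold there. The results are expected to
differ between compilers, which is itself relevant to the interface
decision.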
A similar issue arises with Java's String vs. C++'s std::string. Java's
String is Unicode, but std::string basically mimics C's null-terminated
char[] strings. (There's probably even an appropriate conversion operator
to/from char[] to std::string, but I haven't checked.) So perhaps we
should be using strings made of wchar_t rather than 'char'? (IIRC,
I think that would be std::wstring.)
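On the conversion question: std::string does have a constructor taking a
const char* and a c_str() member for going back, and std::wstring is the
standard typedef for a string of wchar_t. The sketch below shows one way
to widen a narrow string, under the loud assumption that the input is
pure 7-bit ASCII. The function name widen_ascii is hypothetical, and a
real Codec would need proper handling for non-ASCII encodings.

```cpp
#include <string>

// Hypothetical helper (not part of ESAPI): widens a narrow string to a
// std::wstring. ASSUMPTION: the input contains only 7-bit ASCII, so each
// byte maps directly to the same code point.
std::wstring widen_ascii(const std::string& narrow) {
    std::wstring wide;
    wide.reserve(narrow.size());
    for (std::string::size_type i = 0; i < narrow.size(); ++i) {
        // Cast through unsigned char so a signed 'char' is never
        // sign-extended into a large wchar_t value.
        wide.push_back(static_cast<wchar_t>(
            static_cast<unsigned char>(narrow[i])));
    }
    return wide;
}
```

For example, given const char raw[] = "ESAPI", std::string s(raw) builds
the narrow string, s.c_str() recovers a const char*, and widen_ascii(s)
yields a wide string equal to L"ESAPI".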
So... the question becomes: do we want to stick with 8-bit char and
std::string, or use wchar_t (16 or 32 bits, depending on platform) and
std::wstring?
And I guess that partly comes down to how all of you envisioned
Codec being used. Specifically, in ESAPI 2.0 for Java, do you
envision the Codec methods that take char and char[] as actually
being limited to ASCII (most of our regexes that do checking seem
to assume this, for the validators at least), or do you expect
general Unicode in this context (including possibly '\u0000')?
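If the answer turns out to be "ASCII only", the 8-bit option could enforce
that explicitly. The following is only a sketch of such a check; is_ascii
is a hypothetical name, not an existing ESAPI function. It also
illustrates that std::string can carry an embedded '\0' when constructed
with an explicit length, whereas construction from a bare C string stops
at the first null.

```cpp
#include <string>

// Hypothetical helper (not part of ESAPI): returns true if every byte in
// the input is 7-bit ASCII (0x00 through 0x7F).
bool is_ascii(const std::string& input) {
    for (std::string::size_type i = 0; i < input.size(); ++i) {
        // Compare as unsigned char so bytes >= 0x80 are not seen as
        // negative values on platforms where 'char' is signed.
        if (static_cast<unsigned char>(input[i]) > 0x7F) {
            return false;
        }
    }
    return true;
}
```

For instance, std::string("a\0b", 3) has size 3 and passes is_ascii, while
std::string("a\0b") has size 1 because the const char* constructor treats
the first null as the terminator.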
I think it's going to matter a lot in how we define the interfaces so we
need an answer to this pretty quickly. It's hard to read the intent of
how Jeff and Jim and others originally envisioned Codec and Validators
to be used. Like I said, most (all?) of the Validator regular expressions
only seem to whitelist ASCII.
Anyway, would appreciate your timely thoughts on this before we
get too far down the road and go off into the weeds.
And Dave Wichers: You might want to get some input from the client(s)
that are interested in using the ESAPI for C++ API.
Thanks all,
-kevin
--
Blog: http://off-the-wall-security.blogspot.com/
"The most likely way for the world to be destroyed, most experts agree,
is by accident. That's where we come in; we're computer professionals.
We *cause* accidents." -- Nathaniel Borenstein