[Esapi-user] Why ESAPI crypto uses a custom serialization scheme
Kevin W. Wall
kevin.w.wall at gmail.com
Fri Apr 30 17:03:20 EDT 2010
As promised... (Or should that be, "as threatened?" ;-)
In another thread (Call for review of crypto code), Mike Boberski
asked why I chose a custom serialization scheme rather than something
like CMS or PKCS#7.
Great question. (WARNING: If you fall asleep easily at boring technical
details, you may want to grab a cup of coffee. OTOH, if you are
suffering from insomnia, read on. :)
First, let me say, I never really seriously considered using PKCS#7.
CMS (Cryptographic Message Syntax, RFC 5652) is derived from the
latest version of PKCS#7 (v1.5), and since that time there have been
3 or 4 revisions of CMS. So for the most part PKCS#7 has been superseded
by CMS.
Secondly, let me state my reason and the design goals for some
serialization scheme, whether that be CMS or JSON or XML Encrypt
or some other custom serialization scheme.
We needed a _portable_ way to transport ciphertext over an
insecure communications channel that was independent of OS
or hardware architecture but that would not only allow the
recipient to decrypt it, but also allow the recipient to
detect whether or not an adversary had tampered with the
data stream. (Recall the assumption was that this might
not be transported over a secure channel.)
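To make the tamper-detection requirement concrete, here is a minimal Java
sketch. It is only an illustration under assumptions, not the ESAPI code:
the class and method names (TamperCheck, tag, verify), the choice of
HmacSHA256, and the idea of MACing the IV followed by the ciphertext are
all mine for the sake of the example.

```java
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;
import java.security.MessageDigest;

// Hypothetical helper for illustration; not part of ESAPI.
public class TamperCheck {
    // Compute an HMAC-SHA256 tag over the IV followed by the ciphertext.
    public static byte[] tag(byte[] macKey, byte[] iv, byte[] ciphertext) throws Exception {
        Mac mac = Mac.getInstance("HmacSHA256");
        mac.init(new SecretKeySpec(macKey, "HmacSHA256"));
        mac.update(iv);
        mac.update(ciphertext);
        return mac.doFinal();
    }

    // Recompute the tag on the receiving side and compare it to the one received.
    public static boolean verify(byte[] macKey, byte[] iv, byte[] ciphertext,
                                 byte[] receivedTag) throws Exception {
        return MessageDigest.isEqual(tag(macKey, iv, ciphertext), receivedTag);
    }

    public static void main(String[] args) throws Exception {
        byte[] macKey = new byte[32];            // placeholder key (all zeros), demo only
        byte[] iv = new byte[16];                // placeholder IV, demo only
        byte[] ciphertext = {1, 2, 3, 4, 5, 6, 7, 8};

        byte[] t = tag(macKey, iv, ciphertext);
        System.out.println(verify(macKey, iv, ciphertext, t)); // true

        ciphertext[0] ^= 0x01;                   // adversary flips a single bit
        System.out.println(verify(macKey, iv, ciphertext, t)); // false
    }
}
```

The recipient should run the check before attempting to decrypt, and the
MAC key would normally be distinct from the encryption key.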
1) Portable across different hardware architectures (e.g.,
big-endian vs. little-endian issues)
2) Portable across programming languages. Should be independent
of size of 'int' types, whether integers are signed or unsigned,
3) Should have libraries available (preferably native to the
programming language) or should be easy to build in all that
is required to support it for all programming languages that
ESAPI supports. Where such libraries already exist, they should
be fairly easy to use.
4) Should be able to be "self-contained" in that it should store
everything that is needed to decrypt it except for the encryption
key itself. This would include the cipher algorithm, the cipher
mode, the padding scheme, the IV, and a MAC to ensure authenticity.
5) Representation of the encrypted serialized data should be as compact
as possible. (At Qwest, there was a *lot* of pushback from
development teams when they discovered that SSN or CC#s took
more space to store when encrypted than as plaintext. Most of this
is because of the padding scheme and IV, but that didn't matter.)
6) There should be minimal additional processing overhead in
interacting with this encrypted serialized data. Keep in mind there
are applications where they may encrypt / decrypt several million
data items in a tight loop during some batch processing so
7) Should be extensible and the extensibility should be able to support
backward compatibility with earlier versions.
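As a rough illustration of goals 1, 2, 4, and 7, the Java sketch below
writes a version number, the cipher transformation string, the IV, the
ciphertext, and the MAC as length-prefixed fields in big-endian order.
This is NOT the actual ESAPI wire format. The class name
(PortableEnvelope), the field order, the 4-byte length prefixes, and the
version value are all assumptions made up for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical layout for illustration; NOT the actual ESAPI wire format.
public class PortableEnvelope {
    static final int VERSION = 1; // assumed format version number

    public static final class Parsed {
        public final int version;
        public final String cipherXform;
        public final byte[] iv, ciphertext, mac;
        Parsed(int version, String cipherXform, byte[] iv, byte[] ciphertext, byte[] mac) {
            this.version = version; this.cipherXform = cipherXform;
            this.iv = iv; this.ciphertext = ciphertext; this.mac = mac;
        }
    }

    // DataOutputStream always writes ints big-endian, whatever the host CPU is.
    public static byte[] write(String cipherXform, byte[] iv, byte[] ciphertext,
                               byte[] mac) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buf);
        out.writeInt(VERSION);
        writeField(out, cipherXform.getBytes(StandardCharsets.UTF_8));
        writeField(out, iv);
        writeField(out, ciphertext);
        writeField(out, mac);
        out.flush();
        return buf.toByteArray();
    }

    public static Parsed read(byte[] data) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
        int version = in.readInt();
        // A newer reader could branch on older versions here to stay backward compatible.
        if (version < 1 || version > VERSION) {
            throw new IOException("unsupported version " + version);
        }
        String xform = new String(readField(in), StandardCharsets.UTF_8);
        byte[] iv = readField(in);
        byte[] ciphertext = readField(in);
        byte[] mac = readField(in);
        return new Parsed(version, xform, iv, ciphertext, mac);
    }

    // Each field is a 4-byte big-endian length followed by the raw bytes.
    private static void writeField(DataOutputStream out, byte[] b) throws IOException {
        out.writeInt(b.length);
        out.write(b);
    }

    private static byte[] readField(DataInputStream in) throws IOException {
        int len = in.readInt();
        if (len < 0 || len > in.available()) {
            throw new IOException("bad field length " + len);
        }
        byte[] b = new byte[len];
        in.readFully(b);
        return b;
    }

    public static void main(String[] args) throws IOException {
        byte[] iv = new byte[16];          // placeholder values, demo only
        byte[] ciphertext = {1, 2, 3};
        byte[] mac = new byte[32];
        byte[] wire = write("AES/CBC/PKCS5Padding", iv, ciphertext, mac);
        Parsed p = read(wire);
        System.out.println(p.cipherXform.equals("AES/CBC/PKCS5Padding")
                && Arrays.equals(p.ciphertext, ciphertext)); // true
    }
}
```

Note the tension with goal 5: 4-byte length prefixes are simple but cost
a few bytes per field, so a real format might well choose shorter ones.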
As I started looking into seeing if I could use CMS for this I realized
a couple of things.
a) CMS / PKCS#7 are much more complicated than what we needed. They do
much more than encrypt arbitrary message content. They can also
be used to support digital signatures, digests, authentication, etc.
There have been 3 or 4 revisions of the RFC for this, depending on how
you count. That complexity makes it hard to implement correctly and
it would likely result in incompatibility across different versions
b) I did not find it widely implemented. In Java, the SunJCE has support
for portions of CMS, but there are no explicit java.* or
javax.* CMS-related classes. (There may be some com.sun.* classes
to support it, but using those directly is probably best avoided
anyway.) I believe that Bouncy Castle has more generic support for
CMS, but we felt that we did not want to rely on any particular JCE
provider as most folks will just wish to stick with SunJCE.
c) After thinking about all the programming languages that ESAPI is
in the works to support (Java, .NET, PHP, classic ASP, ColdFusion/CFML,
Python, Haskell, and whatever language is used on SalesForce.com;
to encrypt on the client side in almost all cases), I realized that
something as complex as CMS would never be implemented in all these
languages. Because CMS is so complex (compared to the custom
serialization scheme that I chose), it would be much harder to
implement ESAPI encryption in a manner that is interoperable
across all the ESAPI versions. In fact, implementing
CMS in a language where you do not have an ASN.1 parser available
would make it very difficult to implement straight from the RFC.
BER and DER encoding is used throughout CMS. There quite likely is
some FOSS version of CMS for Java and I think that .NET has at least
partial support for CMS, but implementing this for PHP, Python,
Haskell, etc. would take quite a major effort.
d) In addition to CMS, I also (very briefly) considered XML Encrypt for
serialization. It is a much simpler standard and likely has broader
implementation support than does CMS, but its big problem is that
its resulting size is huge by comparison to other possible
So instead, last October, I sought out the advice of cryptographers on the
cryptography mailing list at metzdowd.com. One cryptographer,
Ian Grigg, responded by pointing me to some similar work that he had done.
That became the inspiration for the current custom serialization scheme
we are using today. While that serialization scheme is not yet implemented
for any other programming language that ESAPI supports, I do believe that it
is simple enough to implement practically anywhere.
Finally, the use of this current custom serialization scheme does not
preclude the use of CMS or XML Encrypt or anything else.
If anyone has any other specific questions about this, I'll try to
answer them as best I can.
OK, time to wake up now!!!
Kevin W. Wall
"The most likely way for the world to be destroyed, most experts agree,
is by accident. That's where we come in; we're computer professionals.
We cause accidents." -- Nathaniel Borenstein, co-creator of MIME