[Esapi-user] Feedback wanted -- Proposal for addressing recommendations for ESAPI's KDF (Google Issue #198)

Kevin W. Wall kevin.w.wall at gmail.com
Fri Jan 14 01:21:40 EST 2011

[Here's the discussion that I mentioned (threatened?) that I would
post a few days ago. Like most of my ramblings, it is rather lengthy
so it is best read with a double shot of espresso followed by a
Red Bull chaser.]

Both the NSA and Jeff Walton reviewed ESAPI 2.0's Key Derivation
Function (KDF), CryptoHelper.computeDerivedKey().

When it came to the ESAPI KDF, the NSA and Jeff Walton both had
different recommendations, although both made reference to the
same document (NIST SP 800-108).

The NSA recommended that we use a "FIPS-approved HMAC", which currently,
would mean one of the SHA-2 algorithms (HmacSHA256, HmacSHA384,
or HmacSHA512).  On the other hand, Jeff Walton's recommendation
focused on _how_ the HMAC should be used in NIST KDFs. Specifically,
Jeff recommended that at a minimum, that the iteration count of the
do-while loop be included as specified by NIST SP 800-108.

Implementing Jeff's recommendations without the optional NIST 'Context'
if very straight-forward. However, implementing it with 'Context'
would require changes to the way encryption / decryption is presently
done, and likely that would include the Encryptor encrypt() / decrypt()
interfaces. These changes are much more complex and have the negative trade-off
of greatly increasing the size of the serialized ciphertext, which may be
completely unacceptable to many. Therefore, I believe it prudent to not make
this change to include 'Context' in ESAPI's KDF at least until the ESAPI 2.1
release. (Furthermore, as Jeff points out in his analysis, ISO/IEC's KDF1 and
KDF2 do not use a context, so seemingly, it is not *that* essential as far as
security goes. Consequently, further examination may lead us to believe that
we do not need the Context at all in ESAPI's KDF.)

Thus we have two basic choices regarding ESAPI's KDF, computeDerivedKey() ...
    1) Change computeDerivedKey() as per NSA's and/or Jeff Walton's
    2) Leave computeDerivedKey() unchanged.

The argument in favor of changing computeDerivedKey() as per recommendations
as it is likely to be more resistant to future attacks as HmacSHA1 becomes
further vulnerable to forced collisions.

The argument in favor of leaving computeDerivedKey() unchanged is that if we
change it, these changes would impact anyone who had been using earlier
ESAPI 2.0_rc# versions to *persist encrypted data* that was encrypted via
ESAPI's JavaEncryptor class.  Any such changes would cause attempts using the
soon-to-be-released GA version of ESAPI 2.0 to fail attempts to decrypt such
encrypted data from earlier ESAPI 2.0 release candidates.

While this backward compatibility _could_ be addressed, I believe it is
judicious to ignore the backward compatibility *at this time* because:
    1) There has been discussions of this at various times on the ESAPI-Users
       list and developers have been warned that such changes might be in order
       depending on the feedback from the NSA.
    2) Supporting this backward compatibility will add sufficient complexity
       such that, at a minimum, a code review would be in order, and it is quite
       possible that this could delay ESAPI 2.0 GA release quite a bit. This
       would especially be the case if we only feel comfortable with the NSA
       reviewing the changed code.

Therefore, I do *not* intend to try to maintain backward compatibility with
previously encrypted persisted data *at this point in time*. However, once
ESAPI 2.0 is made GA, I promise to strive to maintain backward compatibility
whenever the ESAPI developers and users believe it is wise to do so. When
such backward compatibility is not possible, then I will try to provide some
sort of conversion routine to assist in the re-encryption of the previously
encrypted data using the newer version of ESAPI.  Most of the time, supporting
such backward compatibility should be possible as the serialized ciphertext
contains a 'version' stamp (currently as a 'long', but may change; see below)
as the first field of the serialized ciphertext. [** Note: This backward
compatibility issue is explained in a bit more detail in the footnote at the
end of this email.]

Currently this 'version' field is represented as a big-endian 'long' in the
CipherTextSerializer class.  We represent the version by the integer
representation of of the version date of CipherTextSerializer
(*not of the _ESAPI_ release date*!) using the form YYYYMMDD. A matching
internal version date is stored within ESAPI's CipherText class. As an
example, to represent a version (date) of Jan 9, 2011, we would set the
version to 20110109. Note that using this approach, we can go all the
way up to 12/31/9999 using only 27 bits of information because the integer
99991231 is the bit string 101111101011011111010111111 (27 bits). (If we wished
to use 28 bits, we could go way beyond Dec 31, 9999, but unless some of you
intend to become cryogenically frozen, the Y10K problem is not likely something
you will have to worry about in your lifetime and it's always good practice
to reserve a bit for future use. :)

So if we change computeDerivedKey to accommodate the recommended changes--
as I believe we should--then I propose that we should also change version from
a 'long' (8 bytes) to an 'int' (4 bytes).  (Note: I was originally trying to
make 'version' big enough to support the the maximal java.util.Date in Java,
but in retrospect, I don't even think that Cobol will still be around by
then, let alone Java. ;-)

If we are satisfied with 12/31/9999 as the maximum version date, then an
'int' is clearly large enough accommodate this and would shorten all
serialized ciphertexts by 4 bytes (assuming the same 160-bit HmacSHA1 is
still used for the MAC calculation). This can be significant when encrypting
short strings and if one adopts even the shortest (future) SHA-3 Hmac candidate
(224 bits), that alone would add another 8 bytes to the MAC (vs the 160-bit
HmacSHA1 we are presently using) so shorting the version field by 4 bytes
would somewhat compensate for that increase.

The downside of this, is of course, that this specific change is not backward
compatible with previous 2.0 release candidates either. However, I do not
see that as a major obstacle since other proposed changes to computeDerivedKey
are going to make it incompatible with previous versions anyhow. And because
I didn't intend supporting backward compatibility until 2.0 is GA, making this
change *now* seems to makes perfect sense. (Making it later would cause another
incompatibility after GA that *would* have to be addressed and backward

Now, in some sense, if we make this change to the 'version' field, we can
have our cake and eat it too (if we implement it before 2.0 GA).  Since a Java
'int' is 32 bits and only 27 bits are needed for the version date, up through
9999/12/31, what do we do with the remaining 5 bits? I was planning on doing
something like this. (Apologies for the poor ASCII 'art'; also included as an
attachment in case MUAs and MTAs wrap it as it's going through; best viewed
in wide window of ~100+ characters.) This diagram assumes "network byte
ordering" (i.e., big-endian); note that Intel CPUs are little-endian, so
this bit / byte ordering may look a little strange to you.

Here's my proposal for how to use the first 4-byte 'int' (see attachment if
this diagram below looks munged; shown in network byte order [aka, big-endian]):

================================== Big-Endian Bit Ordering
|<------ Byte[0] ------>|<------ Byte[1] ------>|<------ Byte[2] ------>|<------
Byte[3] ------>|
 |Un|    MAC    |
|<----------------- Version Date Indicator in form of YYYYMMDD
----------------->|us| Algorithm |
 |ed| Indicator |

So, basically, the Version info, represented as the integer YYYYMMDD, will
be bits 1 through 27.  Bit 28 will be unused. (It will be set to
zero, but is not used. Consider it RESERVED.) The bits 29 through 32
will represent the chosen MAC algorithm used in ESAPI's KDF, computeDerivedKey,
according to the following table.  A nibble (4 bits) allows us to support 16
different MAC algorithms. I propose using those four bits to represent the up
to 16 different MAC algorithms. This allows us to support something other than
HmacSHA1 and thus address the NSA concerns; in fact it goes beyond the
immediate NSA concerns of recommending a SHA-2 HMAC and allows us to use a
SHA-3 HMAC once NIST choses the winning SHA-3 candidate and a JDK or JCE
provider supports it.

//		Proposal for bits 29-32
// Allowed MAC algorithms and there respective key sizes.
// Value    MAC Alg name        hash size (bits)        Notes
// ==========================================================================
    00      HmacSHA1                    120             Default?
    01      HmacSHA256                  256
    02      HmacSHA384                  384
    03      HmacSHA512                  512
    04      Reserved for SHA-3 winner   224             SHA-3 must provide
    05      Reserved for SHA-3 winner   256             msg digests of 224,
    06      Reserved for SHA-3 winner   384             256, 384, & 512 bits.
    07      Reserved for SHA-3 winner   512             Names for SHA-3 TBD.
    08      OtherReservedFuture01       ???             Thus leaving us room
    09      OtherReservedFuture02       ???             for 8 other MACs in
    ...     ...                         ...             the future.
    15      OtherReservedFuture08       8192		Uncle Albert's MACjik

Also, a bit of good news. I checked JDK 1.5, and discovered that all
four of HmacSHA1, HmacSHA256, HMAC384, and HmacSHA512 are available in
JDK 1.5 in SunJCE. (So is HmacMD5, but this is not considered secure for a KDF
so I have intentionally omitted it from the table above.) I am not so sure
about what MAC algorithms that are available in other ESAPI implementation
programming languages so I think the safe bet is to leave HmacSHA1 the default
(at least for now) and add a property (proposing, Encryptor.KDF.PRF=HMacSHA1)
to ESAPI.properties to set it so it can be easily overridden. Doing so should
address the specific concerns that the NSA had regarding computeDerivedKey.

Addressing Jeff Walton's recommendations (excluding the apparently
optional NIST 'Context') is rather simple, but important. As Jeff
noted in his analysis (see attachment in Google issue #198), for the
calculated MAC, everything other than the iteration count stays the same
each time through the do-while loop, so I feel that it's at least important
that we get that change into the 2.0 GA release. And since we have to change
it, we might as well add everything except for the apparently optional Context.
(In fact, I probably will add the optional Context argument, but simply have
JavaEncryptor set it to null or an empty string so that computeDerivedKey
will not have to be changed at a later time.)

So that's my plans on how I would like to address the findings of the
analysis of ESAPI's KDF, computeDerivedKey(), done by the NSA reviewer
and by Jeff Walton.

*Any objections?* If so, speak now or forever hold your peas and your
other favorite vegetables.

* A More Detailed Explanation of the Backward Compatibility Issues

[Most of you can skip this section.]

Unlike what Jeff Walton expressed in a private email, my primary
concern with changing CryptoHelper.computeDerivedKey() is not
changing it's interface from the perspective of some developer
somewhere using it directly. (It is a public method in a public
class.)  If I thought someone were using computeDerivedKey()
directly, I certainly could add another one and maintain the old
interface at little complexity.

But that is *not* what I am concerned with. computeDerivedKey() was
only made 'public' because it was in a different pkg (org.owasp.esapi.crypto)
than where JavaEncryptor lives (org.owasp.esapi.reference.crypto). Originally
I had it in JavaEncryptor itself, but thought it might be useful in the future
to others building their own implementations of the Encryptor interface (which
I expect to be relatively rare BTW).

The bigger issue in my mind is this. You have a key for encryption
that was created using (say) ESAPI 2.0_rc10 and you have used it
to encrypt *and persist* some serialized ciphertext data (i.e., see
CipherTextSerializer for details). You later switch to the GA version
of ESAPI 2.0 where the implementation of computeDerivedKey() method
as it is used used by JavaEncryptor has been been changed (so it would
not use any possible deprecated computeDerivedKey but rather the newer

Now as soon as client code attempts to decrypt that previously encrypted
serialized ciphertext data, the MAC will fail (because it now uses a different
derived key than that it was originally computed with) and thus the decryption
operation itself will fail (unless I explicitly chose make everything backward
compatible NOW).

Why will it fail? Simple...the altered computeDerivedKey will compute
*different* derived keys that than previous RC versions of ESAPI 2.0
did. Even if the MAC was not present in the persisted serialized ciphertext
(say it was stripped off and not stored), the derived key used for
encryption / decryption would be different.

That, in essence, is the problem. Now when I attempt the decryption in
JavaEncryptor, I could pull off the first Version info (currently the first 8
bytes [a long]) and if it is <= 20100122 (indicating an older version), then
decrypt using derived keys from the deprecated computeDerivedKey.

So it's not that I couldn't do that, it's just that I'd *prefer* not to.
I am a great believer in keeping things as simple as possible. As I'm
sure that you have heard my many rants about complexity being the enemy of
security.  Also, such a major change to JavaEncryptor possibly would negate the
NSA review results and none of us wish to wait another 6 months or so
while they take a 2nd look at it.

So that's what the backward compatibility issue is all about and why I am
not going to address it until AFTER ESAPI 2.0 becomes GA.
Kevin W. Wall
"The most likely way for the world to be destroyed, most experts agree,
is by accident. That's where we come in; we're computer professionals.
We cause accidents."        -- Nathaniel Borenstein, co-creator of MIME
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: Version-as-int.txt
Url: https://lists.owasp.org/pipermail/esapi-user/attachments/20110114/89e248f0/attachment.txt 

More information about the Esapi-user mailing list