[Esapi-php] [PATCH] ESAPI4JAVA's Issue #45
jah
jah at jahboite.co.uk
Fri Feb 5 10:08:51 EST 2010
On 05/02/2010 04:09, Linden Darling wrote:
> Those ord() calls are intended to handle single bytes whose encoding fails to get detected appropriately by mb_detect_encoding(). Should cater for the issue you've identified (i.e. the nature of ord) by adding a test at start of each IF statement in detectEncoding() to determine that the string is in fact only a single byte in length - shame on me for not doing so in the first place! That'd leave the logic mostly as-is and mb_detect_encoding() handle the bulk of tasks within detectEncoding(), as it should. Sounds OK?
>
I think the difficulty in determining whether a string is only a single
byte long (or in truncating a string to a single-byte character) is that,
without knowing the string's character encoding, it doesn't seem possible
to determine how many bytes each character occupies. ord() itself seems
(from looking at its source [1]) to look only at the first byte, regardless
of how many there are, so if we can avoid using ord(), so much the better.
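To illustrate the point, here is a minimal sketch (not code from ESAPI; the
variable names are made up for the example). It encodes the same character,
"é", in two ways and shows that strlen() counts bytes while ord() reports
only the first byte. The mb_strlen() lines assume the mbstring extension is
loaded, so they are guarded.

```php
<?php
// The same character "é" in two encodings. The bytes are written out
// explicitly so the result does not depend on this file's own encoding.
$latin1 = "\xE9";      // ISO-8859-1: one byte
$utf8   = "\xC3\xA9";  // UTF-8: two bytes

// strlen() counts bytes, not characters.
$lenLatin1 = strlen($latin1); // 1
$lenUtf8   = strlen($utf8);   // 2

// ord() inspects only the first byte of its argument.
$ordLatin1 = ord($latin1);    // 233 (0xE9)
$ordUtf8   = ord($utf8);      // 195 (0xC3), the UTF-8 lead byte, not "é"

// The trailing byte is ignored entirely.
$firstByteOnly = (ord($utf8) === ord("\xC3")); // true

// The character count of the same bytes depends on the encoding we assume.
if (function_exists('mb_strlen')) {
    $asUtf8   = mb_strlen($utf8, 'UTF-8');      // one character
    $asLatin1 = mb_strlen($utf8, 'ISO-8859-1'); // two characters
}
```

So a "single byte" check with strlen() only tells us about bytes, and the
value ord() returns means something different depending on which encoding
the bytes really are in.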
> Java strings are Unicode by nature, hence ESAPI4JAVA doesn't face this decode issue. Tending towards UTF-8 using your proposed method seems like a good way to roll.
>
I'll certainly try it out and see what happens...
Best,
jah
[1] http://svn.php.net/viewvc/php/php-src/branches/PHP_5_2/ext/standard/string.c?view=markup#l2588