[Esapi-user] Localization and InputValidation

Jim Manico jim.manico at owasp.org
Wed Jan 27 01:16:00 EST 2010

I cannot answer this easily. Does anyone else on the dev team have 
experience with i18n and regexes inside of ESAPI?

- Jim

> Hi guys, a question has arisen regarding input validation.
> I should preface this by stating we are on 1.4, not 2.0.
> Let's say I want to pass "グ" in my input.  For those of you who can't 
> read that, it's a Japanese katakana character, Unicode code point U+30B0:
> http://www.fileformat.info/info/unicode/char/30b0/index.htm
> I want to allow this in my input, so I need to create a regex that 
> will permit it.  What I'm not sure about is:
> 1) what canonicalize is going to do to that string, and
> 2) if there's a locale-aware way of identifying characters in a regex.
> I can see this potentially showing up as
> \u30b0, where I would need to permit \ characters,
> \u30b0, where the slash is encoded, though I doubt this.
> The latter can lead to two possibilities:
> 1) my regex would need to allow a range of Unicode values
> 2) a character class (\p{Alpha} and such) would seamlessly match 
> 'letters' of any language.
> The confusion on my end is due to lack of knowledge on characters 
> outside the typical US character set.  Can anyone shed some light on 
> this issue, as to the expected canonicalization and recommended 
> whitelist regex?
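
On point 2, a minimal sketch of how the character-class options behave in
plain java.util.regex, independent of ESAPI. The class name, field names,
and the 50-character length cap are hypothetical, chosen only for
illustration. As far as I know, \p{Alpha} in Java is US-ASCII only unless
Unicode character classes are explicitly enabled, while \p{L} (any Unicode
letter) and \p{InKatakana} (the Katakana block) are not tied to a locale.
This says nothing about what canonicalize in ESAPI 1.4 does to the input
before the regex sees it; that would need to be checked separately.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of ESAPI.
final class UnicodeWhitelistSketch {

    // Any Unicode letter, regardless of script.
    static final Pattern ANY_LETTER = Pattern.compile("\\p{L}{1,50}");

    // Any code point in the Katakana block (U+30A0 through U+30FF).
    static final Pattern KATAKANA_BLOCK = Pattern.compile("\\p{InKatakana}{1,50}");

    // The same block written as an explicit range of Unicode values.
    static final Pattern KATAKANA_RANGE = Pattern.compile("[\\u30A0-\\u30FF]{1,50}");

    // POSIX-style class: US-ASCII letters only by default.
    static final Pattern POSIX_ALPHA = Pattern.compile("\\p{Alpha}{1,50}");

    // True only if the whole input matches the whitelist pattern.
    static boolean matches(Pattern whitelist, String input) {
        return whitelist.matcher(input).matches();
    }

    public static void main(String[] args) {
        String gu = "\u30b0"; // the katakana character from the question
        System.out.println("ANY_LETTER:     " + matches(ANY_LETTER, gu));
        System.out.println("KATAKANA_BLOCK: " + matches(KATAKANA_BLOCK, gu));
        System.out.println("KATAKANA_RANGE: " + matches(KATAKANA_RANGE, gu));
        System.out.println("POSIX_ALPHA:    " + matches(POSIX_ALPHA, gu));
    }
}
```

If that holds, the choice is between a broad whitelist (\p{L}, which also
admits letters from every other script) and a narrow one (a named block or
an explicit range), rather than anything locale-specific.
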

Jim Manico
OWASP Podcast Host/Producer
OWASP ESAPI Project Manager
