[OWASP-ESAPI] BaseEncoder.whitelist(String, char[])

Neil Matatall nmatatal at uci.edu
Mon Aug 3 15:07:31 EDT 2009

I am planning on committing it to the quality branch.  If we go with all 
of my suggested changes (specifically adding the new whitelist 
signature), it would actually break existing Validators that don't 
extend the BaseValidator class so I didn't want to put it in the main 
branch just yet.


Jim Manico wrote:
> This is right on the money. Did you make this chance in the "quality" 
> branch?
> - Jim
> ----- Original Message ----- From: "Neil Matatall" <nmatatal at uci.edu>
> To: <owasp-esapi at lists.owasp.org>
> Sent: Tuesday, July 28, 2009 4:52 PM
> Subject: [OWASP-ESAPI] BaseEncoder.whitelist(String, char[])
>> Hello All,
>> I was going through the validation package and came accross the
>> whitelist method
>>     /* (non-Javadoc)
>>      * @see i
>> org.owasp.esapi.reference.validation.IValidationRule#whitelist(java.lang.String, 
>> char[])
>>      */
>>     public String whitelist( String input, char[] list ) {
>>         StringBuilder stripped = new StringBuilder();
>>         char c;
>>         for (int i = 0; i < input.length(); i++) {
>>             c = input.charAt(i);
>>             if (Character.isDigit(c)) {
>>                 stripped.append(c);
>>             }
>>         }
>>         return stripped.toString();
>>     }
>> The 'list' param is ignored and only digits are returned so this method
>> needs to be fixed one way or another.  Because arrays are used, using
>> Arrays.binarySearch would cause O(input.length * log list.length).  The
>> other option is to convert it to a Set and use contains for constant
>> access which now makes whitelist perform at O(input.length).  Obviously,
>> for short Strings (such as credit card numbers) this isn't a big issue.
>> These Sets should be declared as constants, which I can't do in the
>> Encoder

>> (static initializers are not allowed in Interfaces).
>> If the char[]s were Character[]s we could do:
>> public static final HashSet<Character> DIGITS = new 
>> HashSet<Character> {{
>>     addAll(Arrays.asList(CHAR_PASSWORD_DIGITS));
>> }};
>> Changing the char[]s to Character[]s breaks a lot of the reference code
>> as well as VBScriptCodec and User classes.
>> Here is my proposal let me know what you guys think:
>> * Move the char[] constants to an external class (Following Josh Bloch's
>> EP #19, and in the spirit of isolating the API) and build Set<Character>
>> constants.
>> * Leave delegate constants in Encoder to the new class constants.
>> * Create a new whitelist method that takes a Set<Character>, original
>> method will convert any char[]s to Sets and pass the set to the new 
>> method.
>> This approach should not break any existing code in the ESAPI project or
>> any external reference implementations, the method will work
>> appropriately, and should have great performance (although we're paying
>> twice for the char[]s).  The drawback of this is that to use the new
>> whitelist(String, Set) method, you have to cast to a BaseValidationRule
>> unless we want to add the new signature to the ValidationRule
>> interface.  Doing such would break external implementations.
>> Gee, it is quite a drastic change that will yield little benefit but I
>> do think it's the right way to do things.  Ideally, if everyone agrees,
>> I think we should add the whitelist(String, Set) to the interface and
>> deprecate the use of the array based method.  We could use Collection if
>> it is preferred flexibility, but a Set seems more appropriate.
>> I've already coded up all the proposed changes, if this is a go I will
>> commit it.
>> Sorry for the essay :-/
>> Neil
>> _______________________________________________
>> OWASP-ESAPI mailing list
>> OWASP-ESAPI at lists.owasp.org
>> https://lists.owasp.org/mailman/listinfo/owasp-esapi

More information about the OWASP-ESAPI mailing list