[Owasp-antisamy] antisamy problems with &nbsp;

Arshan Dabirsiaghi arshan.dabirsiaghi at aspectsecurity.com
Wed Feb 27 09:52:13 EST 2008


Thanks for the kind words. AntiSamy _currently_ replaces all HTML named entities with their ASCII version. That's not Neko doing that. I will be removing this process in 1.1, whose release date keeps moving further away as I integrate more fixes.
 
Also, Klemen Savs has emailed me his patch directly to address some encoding issues:
 
AntiSamyDOMScanner.java:109-113 is currently:
 
   ByteArrayInputStream bais = new ByteArrayInputStream(html.getBytes());
 
   DOMFragmentParser parser = new DOMFragmentParser();
   parser.setProperty("http://cyberneko.org/html/properties/names/elems", "lower");
   parser.parse(new InputSource(bais),dom);
 
 
That now becomes:
 
   DOMFragmentParser parser = new DOMFragmentParser();
   parser.setProperty("http://cyberneko.org/html/properties/names/elems", "lower");
   parser.parse(new InputSource(new StringReader(html)),dom);
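
To illustrate why dropping the byte round trip helps, here is a minimal, standalone sketch. It is not AntiSamy or NekoHTML code: the class name EncodingRoundTrip and its helper methods are hypothetical, and the UTF-8/ISO-8859-1 pair is only an assumed example of a mismatch between the charset used to encode and the one used to decode. html.getBytes() with no argument uses the platform default charset, so the result depends on where the code runs, while a StringReader hands the parser characters directly.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.StringReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of AntiSamy.
class EncodingRoundTrip {

    // Drain a Reader into a String.
    static String readAll(Reader reader) throws IOException {
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[1024];
        int n;
        while ((n = reader.read(buf)) != -1) {
            sb.append(buf, 0, n);
        }
        return sb.toString();
    }

    // Old approach: String -> bytes -> characters again.
    // If the two charsets differ, non-ASCII characters are corrupted.
    static String viaBytes(String html, Charset encodeWith, Charset decodeWith)
            throws IOException {
        ByteArrayInputStream bais = new ByteArrayInputStream(html.getBytes(encodeWith));
        return readAll(new InputStreamReader(bais, decodeWith));
    }

    // New approach: characters are read directly, no charset involved.
    static String viaReader(String html) throws IOException {
        return readAll(new StringReader(html));
    }

    public static void main(String[] args) throws IOException {
        String html = "a\u00A0b"; // U+00A0 is the non-breaking space

        // Assumed mismatch: encoded as UTF-8, decoded as ISO-8859-1.
        String garbled = viaBytes(html, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1);
        String intact = viaReader(html);

        System.out.println(garbled.equals(html)); // false
        System.out.println(intact.equals(html));  // true
    }
}
```

In this sketch the non-breaking space is encoded as the two bytes C2 A0 and then decoded as two separate characters, which is the kind of stray-character symptom described in this thread.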
 
Cheers,
Arshan

________________________________

From: owasp-antisamy-bounces at lists.owasp.org on behalf of David Whitlock
Sent: Fri 2/22/2008 2:15 PM
To: owasp-antisamy at lists.owasp.org
Subject: Re: [Owasp-antisamy] antisamy problems with &nbsp;



I ran across this with &nbsp; entities, too.  NekoHTML replaces the &nbsp; entity with the non-breaking space character (ASCII 160).  It is possible that your character encoding is converting an ASCII 160 into another character (latin-1 vs. UTF-8 encoding issues, perhaps?).

 

For me, an ASCII 160 character was not acceptable in my HTML because some browsers render ASCII 160 differently than &nbsp;.  To work around this, I simply replaced ASCII 160 with the string "&nbsp;" in the validated output.
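
A minimal sketch of that workaround, assuming the validated output is already available as a Java String. The class and method names here are hypothetical and are not part of the AntiSamy API:

```java
// Hypothetical helper, not part of AntiSamy.
class NbspWorkaround {

    // Replace every non-breaking space character (code point 160, U+00A0)
    // with the named entity so the markup no longer depends on the
    // output encoding for that character.
    static String restoreNbspEntities(String cleanHtml) {
        return cleanHtml.replace("\u00A0", "&nbsp;");
    }

    public static void main(String[] args) {
        String validated = "Total:\u00A0<b>42</b>";
        System.out.println(restoreNbspEntities(validated)); // Total:&nbsp;<b>42</b>
    }
}
```

This only handles the one entity; other characters outside ASCII would still depend on the output encoding being declared correctly.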

 

Hope this helps,

 

Dave

 

P.S.  Arshan, thank you very much for antisamy.  It was exactly the tool we needed and it has worked very well for us.

 

________________________________

From: owasp-antisamy-bounces at lists.owasp.org [mailto:owasp-antisamy-bounces at lists.owasp.org] On Behalf Of Arshan Dabirsiaghi
Sent: Friday, February 22, 2008 10:42 AM
To: support at gameonleagues.com; owasp-antisamy at lists.owasp.org
Subject: Re: [Owasp-antisamy] antisamy problems with &nbsp;

 

Can you provide some sample input and output? Also, what environment are you testing this in? What character encoding are you using to output?

 

Cheers,

Arshan

 

________________________________

From: owasp-antisamy-bounces at lists.owasp.org on behalf of support at gameonleagues.com
Sent: Fri 2/22/2008 4:47 AM
To: owasp-antisamy at lists.owasp.org
Subject: [Owasp-antisamy] antisamy problems with &nbsp;

Hoping to use Antisamy for our business except we keep coming across some strange encoding problems.  

It seems that all "&nbsp;" in the incoming HTML are converted into unfamiliar characters like "á".  I tried using a patch I found on the mailing list and changing the encoding, but that has not solved the problem.

Anybody have any suggestions on what I may be doing wrong?

Thanks,

Tim
