[Owasp-antisamy] Why does &nbsp; tags get escaped?

Mike Christensen imaudi at comcast.net
Mon Feb 2 05:42:13 EST 2009

Hi guys - there appears to be a bug in AntiSamy (actually it might be 
more accurate to say there's a bug in the HtmlAgilityPack) that's kinda 
driving me nuts.  It appears if you enter the HTML:

Hello&nbsp;There

It gets converted to:

Hello&amp;nbsp;There

Which is obviously not what I want.  This is happening in 
AntiSamyDOMScanner.cs in the scan function on this line:

string finalCleanHTML = doc.DocumentNode.InnerHtml;

It appears the InnerHtml property actually escapes markup within the 
document.  Are people aware of this issue and is there any documented 
work-around or planned fix?  I think it's perfectly valid for HTML to 
safely contain these entities and I don't want markup to be escaped and 
displayed back to my users.  For now, I've worked around this with:

res = res.Replace("&amp;nbsp;", "&nbsp;");

But that's a bit lame <g>
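
A slightly more general form of that workaround, as a minimal sketch only: it restores a short whitelist of named entities whose ampersand was escaped a second time (for example &amp;nbsp; back to &nbsp;). The class name EntityRepair, the method RestoreEntities, and the entity whitelist are hypothetical names made up for illustration; none of them are part of AntiSamy or the HtmlAgilityPack. It assumes the only damage is the extra &amp; in front of an otherwise valid entity name.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of AntiSamy or the HtmlAgilityPack.
public static class EntityRepair
{
    // Assumed whitelist of entity names to restore; extend as needed.
    // Entities such as &lt; and &gt; are deliberately left out so that
    // text which was escaped on purpose stays escaped.
    private static readonly Regex DoubleEscaped =
        new Regex(@"&amp;(nbsp|copy|reg|trade|mdash|ndash);", RegexOptions.Compiled);

    // Turns "&amp;nbsp;" back into "&nbsp;" for whitelisted names only.
    public static string RestoreEntities(string html)
    {
        if (string.IsNullOrEmpty(html))
        {
            return html;
        }
        // "$1" is the captured entity name from the whitelist group.
        return DoubleEscaped.Replace(html, "&$1;");
    }
}
```

Usage would be something like res = EntityRepair.RestoreEntities(res); right after the scan. Like the one-line Replace, this cannot tell an entity escaped by the bug from a literal "&amp;nbsp;" that the user typed on purpose, so it is still a workaround rather than a fix.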


