[Owasp-java-html-sanitizer] Style/font transformation

Dan Rabe dan.rabe at oracle.com
Tue Jan 29 23:24:41 UTC 2013


I'm curious about the design decision to transform certain "style" 
attribute information to "font" tags. In my opinion, that transformation 
is not ideal for a couple of reasons:

(1) The font tag is deprecated in HTML 4 and obsolete in HTML5.

(2) If there are multiple font tags involved, the rendering is changed. 
For example, consider this HTML snippet:

<table style="color: rgb(0, 0, 0); font-family: Arial, Geneva, sans-serif;">
<tbody>
<tr>
<th>Column One</th><th>Column Two</th>
</tr>
<tr>
<td align="center" style="background-color: rgb(255, 255, 254);"><font 
size="2">Size 2</font></td>
<td align="center" style="background-color: rgb(255, 255, 254);"><font 
size="7">Size 7</font></td>
</tr>
</tbody>
</table>

If you display this in a browser, all the text inside the table renders 
in a sans-serif font. After transforming it with the HTMLSanitizer, it 
looks more like this:

<table>
<font face="Arial, Geneva, sans-serif" style="color:#000">
<tbody>
<tr>
<th>Column One</th>
<th>Column Two</th>
</tr>
<tr>
<td align="center"><font style="background-color:#fffffe"><font 
size="2">Size 2</font></font></td>
<td align="center"><font style="background-color:#fffffe"><font 
size="7">Size 7</font></font></td>
</tr>
</tbody>
</font>
</table>

And the table text is rendered in serif instead of sans-serif. I suspect 
that nesting font tags inside the table isn't really a great idea.
Since the code is able to emit a sanitized style attribute for the font 
tag, why not just emit sanitized/whitelisted data into the original 
style attribute?
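
To illustrate what I mean by that, here is a minimal sketch of filtering a 
style attribute in place against a property whitelist. It is not the 
sanitizer's actual implementation: the class name StyleWhitelistSketch, the 
methods filterStyle and isSafeValue, and the set of allowed properties are 
all hypothetical and chosen only for illustration. The value check is 
deliberately conservative and would reject many legitimate CSS values.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class StyleWhitelistSketch {
    // Hypothetical whitelist of CSS properties; a real policy would be configurable.
    private static final Set<String> ALLOWED = new HashSet<>(
            Arrays.asList("color", "background-color", "font-family"));

    // Returns a style string containing only whitelisted declarations
    // whose values pass the conservative check below.
    public static String filterStyle(String style) {
        List<String> kept = new ArrayList<>();
        for (String decl : style.split(";")) {
            int colon = decl.indexOf(':');
            if (colon < 0) {
                continue; // not a "property: value" pair
            }
            String prop = decl.substring(0, colon).trim().toLowerCase(Locale.ROOT);
            String value = decl.substring(colon + 1).trim();
            if (ALLOWED.contains(prop) && isSafeValue(value)) {
                kept.add(prop + ": " + value);
            }
        }
        return String.join("; ", kept);
    }

    // Accepts plain keywords, numbers, hex colors, comma-separated names,
    // and rgb(r, g, b). Anything else (url(...), expression(...), quotes)
    // is rejected.
    public static boolean isSafeValue(String value) {
        return value.matches("[A-Za-z0-9 ,#%.\\-]+")
                || value.matches("rgb\\(\\s*\\d{1,3}\\s*,\\s*\\d{1,3}\\s*,\\s*\\d{1,3}\\s*\\)");
    }

    public static void main(String[] args) {
        // The table's style from the example above, plus one disallowed property.
        System.out.println(filterStyle(
                "color: rgb(0, 0, 0); font-family: Arial, Geneva, sans-serif; position: fixed"));
    }
}
```

With something along these lines, the table element would keep its color and 
font-family declarations on its own style attribute, and no font wrapper 
would be needed.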

Thanks,
--Dan


