2587Re: Suggestion for whitespace exception when building an HTML string
- Aug 25, 2011Actually what you are describing is not minification. It is obfuscation, which is different. YUI can do both.
In markup languages, particularly HTML/XML, each and every white space character always has value. This is evident when you use the white-space css property with the "pre" value. Most of the time, however, the value of the white space is trivial as it is typically tokenized or saturated into the structure prior to being painted to the screen. In these cases white space can be minified to a single character without substantive harm, although the value of the document is irrevocably different. When white space is entirely removed not only is the value of the document different, but so is the structure of the document. This is evident in reflection to the removal of text nodes from the DOM. Changing the structure from removing text nodes that only contain white space is typically ok if there is understanding on which text nodes are removed and the impact to the tokenized result of the parsed content, however this is certainly less safe than merely tokenizing white space without removing any DOM nodes. This is less safe in that automated agents can no longer walk the DOM with predefined guidance and produce identical results. This warning is otherwise trivial as content oriented automated agents, such as meta-data bots, walk the DOM from the content out.
If you are not ready to consider these aspects of parsing markup then you should not be concerning yourself with dynamically altering the white space of markup code. You may likely be harming and altering the data value of markup documents without knowledge to the effect.
Austin Cheney, CISSP
- << Previous post in topic