Not all code-heavy sites suffer from code bloat. A very simple example: you have to take into account things like the amount of code that goes into making a link. If you have a link to, say, the NY Times (http://www.nytimes.com/2002/11/07/technology/circuits/07stat.html) and the text for the link is "Tablet PC," then you're going to have a 6:1 ratio of "non-content" to content according to the tool. However, the link is legitimate code. It's what's necessary to put forward the content of the site.
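To put a rough number on the link example, here is a minimal sketch that counts markup characters against text characters for that one anchor tag. It assumes the tool simply compares raw characters, which may not be how it actually counts, so the ratio it prints need not match the figure quoted above. The names TextCollector and markup_and_text_chars are made up for illustration.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Gathers the text nodes of an HTML fragment (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        # Called for every run of text between tags.
        self.chunks.append(data)


def markup_and_text_chars(html):
    """Return (markup_chars, text_chars), assuming a raw character count."""
    collector = TextCollector()
    collector.feed(html)
    collector.close()
    text_chars = len("".join(collector.chunks))
    return len(html) - text_chars, text_chars


# The link from the example above, written as a bare anchor tag.
link = ('<a href="http://www.nytimes.com/2002/11/07/technology/'
        'circuits/07stat.html">Tablet PC</a>')

markup, text = markup_and_text_chars(link)
print(f"{markup} markup chars for {text} text chars ({markup / text:.1f}:1)")
```

Whatever the exact ratio, every one of those markup characters is needed to make the link work, which is the point: a high ratio by itself says nothing about bloat.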
Code bloat should instead refer to unnecessary code: graphics where text would do, or applets which serve purely "decorative" purposes (remember blinking links?). This would allow us to get at "objectionable" bloat, i.e., things which do not move the content forward.
In addition, a page can be Bloated With Content if, for example, it's like Yahoo or the other portals, which seem to have to put links to all of their content on the front page. The problem here isn't excess code by any means. It is, pure and simple, a thicket of content so snarled that you can't make your way through it to find what you're really looking for. I was speaking to a web designer the other day who had created a three-column front page (you've seen it before: a large center with two smaller sidebars, left and right). I asked him why he was crowding his designs. He said, "This seems to be the trend... Look at... x.portal...." Oh well. I'm also kind of angry with the new news.google site which, out of keeping with their totally uncluttered single-column betas, went to a dual-column layout on the final build. I'd rather scroll than be distracted or have to squint.
Oh well, each to his own opinion. It seems to me these are all things which Steve Outing has covered in one way or another over the last couple of weeks, but now this "tool" to determine bloat levels seems to make a mush of it.
Bob
Daily Rotation
http://www.dailyrotation.com
Aside from the "text content," the remainder of the page might be made up of pictures, graphics, or interactive applications such as search, interactive charts, and navigation, or of graphic designs which serve to highlight and identify choices on the page. All of this would be "content," but none of it would be represented by the "text" percentage that the tool identifies.
And I'm sure that the advertisers who are struggling with the idea of interactive advertising anyway will be interested in the notion that ads aren't content, either.
A page that was 90% text content would look like the white pages of the telephone book.
So saying that 8% of CNN.com is text content *doesn't* mean that the 92% which isn't text is someone's bloated code.
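As a rough illustration, here is a sketch of how a "text percentage" might be computed for a tiny made-up page fragment containing a chart image and a search form. Both the counting method (visible text characters divided by total characters) and the sample markup are assumptions for the sake of the example, not the tool's actual method or any real site's code.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collects text nodes, skipping script and style bodies."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip = 0  # depth inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.parts.append(data)


def text_percentage(html):
    """Share of the page's characters that are visible text (assumed metric)."""
    if not html:
        return 0.0
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    # Collapse runs of whitespace so indentation doesn't count as text.
    text = " ".join("".join(parser.parts).split())
    return 100.0 * len(text) / len(html)


# Hypothetical fragment: a headline, a chart, a search box, one sentence.
page = (
    '<h1>Markets</h1>'
    '<img src="chart.png" alt="Index chart">'
    '<form action="/search"><input name="q">'
    '<input type="submit" value="Search"></form>'
    '<p>Stocks closed higher.</p>'
)

pct = text_percentage(page)
print(f"text share: {pct:.0f}%")
```

The chart and the search form contribute nothing to the text share, yet both are content a reader would use, so a low percentage here says nothing about wasted code.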