We've looked at several aspects of the weblog Boing Boing over the last little while, including posts over time by author, day-of-the-week analysis, images per post by author, outbound links, and acronym use. This post continues the analysis by examining the contents of the actual posts in more detail. What are they writing about?
The Radial Treemap shown below illustrates which topics from my simple topic hierarchy get the most emphasis. Each topic is scaled by the number of words written about it. Posts that didn't match any of the topics very well were grouped under None.
Here are the first three high-level topics shown by themselves so that more detail is visible.
These diagrams do seem to give reasonable weight to the topics that Boing Boing emphasizes, although before I did the measurement I expected Technology to be larger than the Arts and Society topics.
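To make the "scaled by the number of words" idea concrete, here is a minimal sketch of how word counts might be rolled up through a topic hierarchy before drawing a treemap. It is not the code behind the diagrams above. The sample posts, the word counts, the middle-level topic names, and the function names are all made up for illustration. Only the names Technology, Arts, Society, Photography, Military, Aerospace Engineering, and None come from this post, and where they sit in the tree here is a guess.

```python
from collections import defaultdict

# Hypothetical sample data: (topic path, word count) for each categorized post.
# The middle-level names and all numbers are invented for this sketch.
# Posts that match no topic get the single-element path ("None",).
posts = [
    (("Technology", "Engineering", "Aerospace Engineering"), 120),
    (("Arts", "Visual Arts", "Photography"), 300),
    (("Society", "Government", "Military"), 200),
    (("Arts", "Visual Arts", "Photography"), 80),
    (("None",), 50),
]


def rollup_word_counts(posts):
    """Sum word counts at every level of the hierarchy."""
    totals = defaultdict(int)
    for path, words in posts:
        # Credit the post's words to each prefix of its topic path,
        # so a parent topic's total is the sum of its children.
        for depth in range(1, len(path) + 1):
            totals[path[:depth]] += words
    return dict(totals)


totals = rollup_word_counts(posts)
grand_total = sum(words for _, words in posts)


def share(path):
    """Fraction of all words that fall under the given topic path."""
    return totals[path] / grand_total


# Print the top-level topics, largest first.
top_level = sorted(
    (path for path in totals if len(path) == 1),
    key=lambda path: totals[path],
    reverse=True,
)
for path in top_level:
    print(f"{path[0]}: {totals[path]} words ({share(path):.0%})")
```

In a radial treemap, each topic's share would then set the angle of its wedge, with child topics subdividing the parent's wedge in the same way.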
How well is the categorizer working? Let's look at the posts that most closely match some of the third-level topics.
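For readers wondering what "most closely match" could mean in practice, here is a rough sketch of one very simple scoring scheme: the fraction of a post's words that appear in a keyword list for the topic. This is an assumption for illustration only, not a description of the categorizer used for this analysis. The keyword lists, the `min_score` threshold, the sample posts, and all function names are hypothetical.

```python
import re

# Hypothetical keyword lists for three of the third-level topics.
TOPIC_KEYWORDS = {
    "Photography": {"photo", "camera", "lens", "exposure"},
    "Military": {"army", "soldier", "navy", "weapon"},
    "Aerospace Engineering": {"rocket", "spacecraft", "orbit", "launch"},
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def score(text, keywords):
    """Fraction of the words in the text that are topic keywords."""
    words = tokenize(text)
    if not words:
        return 0.0
    hits = sum(1 for word in words if word in keywords)
    return hits / len(words)


def categorize(text, min_score=0.05):
    """Return (best topic, score), or ("None", score) below the threshold."""
    best_topic, best_score = max(
        ((topic, score(text, kw)) for topic, kw in TOPIC_KEYWORDS.items()),
        key=lambda pair: pair[1],
    )
    if best_score < min_score:
        return "None", best_score
    return best_topic, best_score


def top_posts(posts, topic, n=3):
    """Rank posts by score for one topic, best match first."""
    scored = [(score(post, TOPIC_KEYWORDS[topic]), post) for post in posts]
    scored = [pair for pair in scored if pair[0] > 0]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:n]


# Made-up sample posts, not taken from Boing Boing.
sample_posts = [
    "The rocket reached orbit after launch",
    "A rocket is a big thing that some people like",
    "Lunch was good",
]
ranked = top_posts(sample_posts, "Aerospace Engineering")
for post_score, post in ranked:
    print(f"{post_score:.2f}  {post}")
```

A scheme like this makes it easy to see why the top of each list tends to look right while the bottom does not: a long post that mentions a single keyword in passing still gets a small nonzero score for that topic.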
For Photography:
For Military:
For Aerospace Engineering:
These examples seem to match well, but I know this is a pretty simplistic categorizer. I expect the labels for posts farther down the lists to be more questionable. The post labeled as Aerospace Engineering with the lowest score is this:
Related Links:
Boing Boing Analysis - Part 1 (Posts Over Time by Author)