WeFollow has quickly become one of the primary directories of Twitter users. The site lets people assign up to 3 tags to their own account in order to describe their interests. People visiting WeFollow can then see for each tag the list of matching accounts sorted by number of followers.
When you categorize yourself on WeFollow, it sends out a tweet to all your followers having the form: 'Just added myself to the http://wefollow.com twitter directory under: #tag1, #tag2, #tag3'. This automatic viral message has helped WeFollow spread across the twittersphere. Some people have complained that they see too many of these and call them spam. Personally, I find it interesting to see how the people I'm following classify themselves.
These automatic registration messages can be tracked using Twitter Search, and they reveal lots of information about WeFollow that isn't publicly available on their own site. I have analyzed the set of WeFollow registration tweets for the two-month period Mar 28 - May 28, 2009. There were 144,506 tweets matching my search pattern in this time frame, or roughly 2400 new people added to the directory per day. Here is the graph over time:
The peak during this time frame occurred at the end of March, at about 6000 registrations per day. The analysis period falls shortly after the WeFollow launch, which likely accounts for the rough, gradual decline shown. It would be nice to see the data for the launch date itself, but limitations in Twitter Search prevent me from accessing it. There appears to be a new peak forming at the end of May, and there are two obvious troughs around April 10th and 22nd. I've checked other data streams I'm monitoring and they don't show troughs or 'holes' on these two dates, so it looks pretty likely that there was a problem with WeFollow's infrastructure during those periods rather than a data collection problem.
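For anyone who wants to reproduce this kind of count, here is a minimal sketch in Python of how registration tweets could be recognized and tallied per day. The regular expression is my assumption based on the message format quoted earlier, and the function names (parse_registration, registrations_per_day) and the (date, text) input shape are hypothetical. It does not cover fetching the tweets from Twitter Search.

```python
import re
from collections import Counter
from datetime import date

# Assumed pattern, modeled on the quoted message:
# 'Just added myself to the http://wefollow.com twitter directory under: #tag1, #tag2, #tag3'
REGISTRATION = re.compile(
    r"added myself to the http://wefollow\.com twitter directory under:\s*(.*)",
    re.IGNORECASE,
)
HASHTAG = re.compile(r"#(\w+)")


def parse_registration(text):
    """Return the lowercase tags of a registration tweet, or None if it is not one."""
    match = REGISTRATION.search(text)
    if match is None:
        return None
    return [tag.lower() for tag in HASHTAG.findall(match.group(1))]


def registrations_per_day(tweets):
    """Count registration tweets per calendar day.

    `tweets` is an iterable of (date, text) pairs (a hypothetical input shape).
    """
    counts = Counter()
    for day, text in tweets:
        if parse_registration(text) is not None:
            counts[day] += 1
    return counts


# Made-up sample data, for illustration only.
sample = [
    (date(2009, 3, 28), "Just added myself to the http://wefollow.com twitter directory under: #tech, #Blogger, #music"),
    (date(2009, 3, 28), "Just added myself to the http://wefollow.com twitter directory under: #books"),
    (date(2009, 3, 29), "Having coffee and reading the news"),
]
daily = registrations_per_day(sample)
```

Plotting the resulting counts by date would give a time series like the one discussed above. Unusually low days in it could then be compared against other data streams to separate collection gaps from problems on the WeFollow side.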
The main page of WeFollow shows the 'top tags' but bases this on the number of followers of the people using those tags rather than on the tag count itself. Which tags are actually used most often? An analysis of our sample gives this graph:
The top three tags by follower count on the WeFollow site are Celebrity, TV, and Entrepreneur. When the tags are ranked instead by the number of people who actually self-assign them, Celebrity falls to 12th and TV to 44th, while Entrepreneur stays at 3rd. This shows quite clearly that the average account tagged Celebrity or TV has more followers than, say, one tagged Blogger.
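The difference between the two rankings can be illustrated with a small Python sketch. It assumes each account is available as a (follower_count, tags) pair and that the follower-based ranking sums followers per tag. Both are my assumptions, since the registration tweets themselves carry no follower counts and WeFollow's exact formula isn't something I know. The function name rank_tags and the data are made up.

```python
from collections import Counter


def rank_tags(accounts):
    """Rank tags two ways: by number of accounts using them, and by summed followers.

    `accounts` is an iterable of (follower_count, tags) pairs (hypothetical shape).
    Ties are broken alphabetically so the output is deterministic.
    """
    usage = Counter()
    followers = Counter()
    for follower_count, tags in accounts:
        for tag in set(tags):  # count each tag once per account
            usage[tag] += 1
            followers[tag] += follower_count

    def ordered(counter):
        return [tag for tag, _ in sorted(counter.items(), key=lambda kv: (-kv[1], kv[0]))]

    return ordered(usage), ordered(followers)


# Made-up accounts: one very popular account and three small ones.
accounts = [
    (1_000_000, ["celebrity"]),
    (500, ["blogger", "tech"]),
    (300, ["blogger"]),
    (200, ["blogger", "music"]),
]
by_usage, by_followers = rank_tags(accounts)
```

With this toy data, blogger ranks first by usage while celebrity ranks first by followers, which is the same kind of gap seen between the tag counts in the sample and the 'top tags' on the WeFollow main page.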
The WeFollow registration tweets also show which tags are used together. I've constructed a couple of different types of graphics to illustrate the tag similarity relationships. This first one is a Clustered Word Cloud, which shows colored groups of tags that are frequently used together. The big blue group in the middle seems to contain many of the most frequently used tags and doesn't appear particularly cohesive. Many of the others do, at least subjectively, seem to make sense. Here are a couple of example clusters from the image: (church, conservative, christian, pastor, tcot), (publishing, poetry, books, writing, poet).
This last image was created using the same layout technique as my recent Twitter Account Graphs. Basically, the tag nodes are positioned near others that they are 'similar to' in the sense that they are often used together.
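One simple way to put a number on 'often used together' is to count how many registrations contain both tags and normalize by how many contain either. The Python sketch below uses the Jaccard index for that. This is a stand-in I've chosen for illustration, not necessarily the similarity measure behind the images above, and the function names and sample tag lists are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(tag_lists):
    """Count single tags and unordered tag pairs across registrations.

    `tag_lists` is an iterable of tag lists, one list per registration tweet.
    """
    tag_counts = Counter()
    pair_counts = Counter()
    for tags in tag_lists:
        unique = sorted(set(tags))  # sorted so each pair has one canonical order
        tag_counts.update(unique)
        pair_counts.update(combinations(unique, 2))
    return tag_counts, pair_counts


def jaccard(a, b, tag_counts, pair_counts):
    """Jaccard similarity: registrations with both tags / registrations with either."""
    both = pair_counts[tuple(sorted((a, b)))]
    either = tag_counts[a] + tag_counts[b] - both
    return both / either if either else 0.0


# Made-up registrations, for illustration only.
tag_lists = [
    ["christian", "pastor", "church"],
    ["christian", "pastor"],
    ["books", "writing", "poetry"],
    ["books", "writing"],
    ["christian", "books"],
]
tag_counts, pair_counts = cooccurrence(tag_lists)
similarity = jaccard("christian", "pastor", tag_counts, pair_counts)
```

Pairwise scores like these can feed either a clustering step (for grouping tags by color) or a graph layout that pulls similar tags closer together.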