Some of you may have noticed that the featured users list just changed. A few people are angry or confused about why they were removed, and others have been asking about the algorithm, so I figured I'd write a quick post about it.
Featured users was an idea I came up with one night in response to ROY4L's thoughts on what motivates users to produce quality content, as well as my near-religious following of Clay Shirky's "A Group Is Its Own Worst Enemy" essay.
"Featured users" was supposed to provide two main benefits. The first was to find users who consistently make good content that we could highlight on the front page, in an effort to make the front page less overwhelming and less of a "treasure hunt". The second, and more often overlooked, was to create an incentive for the "cream of the crop" users to create new content and interact with the site more often.
The goal was (and still is) ambitious: figure out mathematically who the best content producers are. I was surprised that a simple algorithm could produce fairly good results. The last featured users list was roughly 98% generated by the algorithm, with the last 2% being me adding or removing users manually.
So here, for the first time, is a rundown of the incredibly simple featured users algorithm (originally called 'user score'):
(average_site_score * 1.2)
(average_number_of_votes * 0.23)
x (number_of_favorites * 0.43)
= Your dumb score.
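For anyone who wants to play with the numbers, here is a rough Python sketch of the formula above. It is not the site's actual code. The big assumption is that the "x" applies to the whole column, so all three weighted terms are multiplied together; if the first two terms were meant to be combined some other way, the function body changes accordingly. The function name `user_score` is made up for illustration.

```python
def user_score(average_site_score, average_number_of_votes, number_of_favorites):
    """Raw (un-normalized) score for one user.

    ASSUMPTION: the three weighted terms are multiplied together.
    The original formula only shows an "x" before the last term,
    so this is one reading of it, not a confirmed implementation.
    """
    return (
        (average_site_score * 1.2)
        * (average_number_of_votes * 0.23)
        * (number_of_favorites * 0.43)
    )


if __name__ == "__main__":
    # Hypothetical inputs, purely to show the call.
    print(user_score(4.0, 100, 50))
```

One side effect of multiplying: a user with zero favorites gets a raw score of zero no matter how good the other two numbers are.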
We calculate a score for every user, then scale the scores so that the top score becomes 10000 and every other score becomes its proportional share of that. In practice the results are very thin towards the top and crowded towards the bottom. Here are some sample results from an old run:
#1 nutnics 10000
#2 ROY4L 9947.16
#3 phaseblue 8328.45
#4 max 6374.33
#5 astuteNacute 5957.35
#6 krebstar 4685.71
#7 syncan 4451.45
#8 kingstefan 3757.21
#9 ALMusic 3620.85
#10 PCF 3549.24
From there, the scores went down drastically, because users near the top skew the results for everyone else. The requirement for getting on the "list" was a score over 200, which only about 300 people achieve.
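The scaling and cutoff described above can be sketched roughly like this. It assumes raw scores are held in a plain dict keyed by username; the function names, the rounding to two decimals, and the sample data are all hypothetical, and only the 10000 ceiling and the "over 200" cutoff come from the post.

```python
def normalize_scores(raw_scores, top=10000.0):
    """Scale scores so the highest raw score maps to `top`."""
    best = max(raw_scores.values(), default=0)
    if best <= 0:
        # Nothing to scale against; everyone gets zero.
        return {user: 0.0 for user in raw_scores}
    return {
        user: round(score / best * top, 2)
        for user, score in raw_scores.items()
    }


def featured(normalized, cutoff=200):
    """Usernames whose normalized score is over the cutoff, best first."""
    ranked = sorted(normalized.items(), key=lambda item: item[1], reverse=True)
    return [user for user, score in ranked if score > cutoff]


if __name__ == "__main__":
    # Made-up raw scores, not real site data.
    raw = {"alice": 50.0, "bob": 25.0, "carol": 0.5}
    scaled = normalize_scores(raw)
    print(scaled)
    print(featured(scaled))
```

Because everything is scaled against the single best user, one runaway score at the top pushes everyone else's number down, which is the skew mentioned above.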
At first, I based the scores on more data: number of comments (this is why whetstone made the list), number of views, etc. I then realized sites like "Blue Ball Machine" skewed the averages for everyone, so I tried recalculating with the top 5% of each user's sites excluded (to discard anomalies), but the results were still really off.
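Excluding the top 5% of a user's sites before averaging is a one-sided trimmed mean. A minimal sketch, assuming each user's sites are reduced to a list of numbers (views, say), could look like the following. The rounding rule (always round the number of dropped sites up) and the fallback for users with very few sites are my assumptions, not something the original run is known to have done.

```python
import math


def trimmed_average(values, top_fraction=0.05):
    """Average of `values` after dropping the highest `top_fraction` of them."""
    if not values:
        return 0.0
    ordered = sorted(values)
    # ASSUMPTION: round up, so at least one outlier is dropped.
    drop = math.ceil(len(ordered) * top_fraction)
    # If trimming would remove everything (e.g. a single site), keep it all.
    kept = ordered[: len(ordered) - drop] or ordered
    return sum(kept) / len(kept)


if __name__ == "__main__":
    # Hypothetical view counts: 19 ordinary sites and one runaway hit.
    views = [100] * 19 + [1_000_000]
    print(sum(views) / len(views))   # plain average, dominated by the hit
    print(trimmed_average(views))    # average with the hit discarded
```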
The numbers don't lie. This algorithm is working on a large enough set of data that a few up-voting alts won't make a difference. More people are viewing, favoriting and voting on the featured users than on those who aren't featured (even if you use a time period from before featured users existed).
Now the only thing that you can really muck with here is the weight on each piece of the algorithm. Normally, you can look at the results and change the algorithm to remove results you don't like or get results you do like, but a huge part of this is opinion based. Multiple people want DarthWang to be featured, but I think the problem is that you can't please everyone with a single list.
This time around, after I was persuaded to let it happen, BTape and Teknorat took the generated list and then added and removed people as they saw fit, which is what caused much of the recent change. So focus your rage towards them for the next couple weeks.
I also made a quick change to the featured users content box so that it filters out duplicate users. At any one point in time you won't see more than one site by each user, which I think will deal with a lot of the spam issues.
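The duplicate filter amounts to "keep the first site you see for each user". A small sketch of that idea, assuming the box is fed an already-ordered list of site records with a `user` field (the record shape and names here are hypothetical, not the site's real data model):

```python
def one_site_per_user(sites):
    """Keep only the first site for each user, preserving the input order."""
    seen = set()
    result = []
    for site in sites:
        user = site["user"]
        if user in seen:
            continue  # this user already has a site in the box
        seen.add(user)
        result.append(site)
    return result


if __name__ == "__main__":
    # Made-up records for illustration only.
    box = [
        {"user": "alice", "site": "first"},
        {"user": "alice", "site": "second"},
        {"user": "bob", "site": "third"},
    ]
    print(one_site_per_user(box))
```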
anyway, back to work, dongs.