Why Reddit is broken, and how to fix it

by Ben Orenstein

Inflammatory titles aside, I think reddit is a fantastic site. I read it constantly.

The interface is beautifully clean, just simple text arranged neatly on a white-linen background. On top of that, the site was founded by hackers, and funded by another hacker (and my personal hero). These factors alone give it a lot of potential.

But lately, something important has changed, and reddit is adapting slowly. Not only that, some of its adaptations are quite literally broken. As the reddit audience has grown, the type of content submitted to the site has changed. It turns out that the “wisdom of the crowd” idea on which social bookmarking sites are predicated can get tricky as your crowd gets more diverse.

Early on, reddit articles were exactly my style: the cream of the crop, content-wise, and usually technically oriented. The titles summarized the content well, and the signal-to-noise ratio was impressive. I suspect these strengths were related to reddit’s status as a well-trafficked site written in Lisp, something that catches the attention of hacker types. My kind of people.

Reddit’s success has brought with it a more mainstream audience. With that audience, the focus of the site has changed. These days it’s hard to tell what that focus is. Maybe it’s politics: at the time of writing, eleven of the twenty-five front page links are politically oriented. Maybe it’s funny pictures (six links). Maybe it’s a fansite for a certain unpronounceable webcomic.

Anyhow, it doesn’t particularly matter what reddit’s focus is. What matters is that it’s changing. Depending on who you are and what you like, this may be a good thing. You might love the new reddit. But plenty of people do not. This is a very common phenomenon:

  1. A site is created.
  2. First generation users fall in love.
  3. Word spreads, and the site is rapidly overwhelmed with new users.
  4. Early users grow to hate the new users who have no idea how things are done around here and now everything is RUINED!

It’s like a stampede of younger siblings.

So now we’ve got this huge swarm of users, and everyone wants things their way. Oh well, some people are going to lose interest and leave, but that’s inevitable because you can’t please everyone. The links are chosen by the majority and if you’re not the majority anymore, tough.

It doesn’t have to be like this. There is, in fact, a way to please everyone: show every user only the links he wants to see.

Now, this is a fairly complex problem, and reddit’s first solution is neat, plausible, and wrong: subreddits. Subreddits are wrong because people have diverse interests. The programming subreddit has some gems on unit testing and Ruby, but I’m not particularly interested in “Sexprs in Leopard,” which sounds like it should be on the nsfw subreddit anyway. Also, there are a few political topics I’d like to see, and maybe even some stuff from entertainment or sports.

Now stand back, because I’m going to use an analogy here: links on reddit are like different kinds of food. Everyone has different preferences when it comes to food, and what’s delicious to me might trigger your gag reflex. How do we show people only food they like? Reddit’s current approach is like dividing options into three boxes: breakfast, lunch, and dinner. Clearly this doesn’t work. There’s not enough granularity: just because I don’t like cereal doesn’t mean I don’t like omelettes. Okay, let’s split breakfast into two more boxes: cold and hot. Still not good enough. Omelettes are tasty, but a fondness for grits is indefensible. The problem here is that classifying at this level will always be too general to give good recommendations. We’re approaching this problem with a top-down mindset, and we’re not getting good results. So what should we do instead? How about bottom-up?

What if we analyzed ingredients instead? I don’t like this dip, or this chicken dish. Cold appetizer, hot entree: nothing in common as far as the top-down approach can tell. But if we check the ingredients, they both have cilantro. Hmmm. Might be good to recommend fewer dishes with cilantro, and let’s keep watching for more dishes I dislike. If further dishes have that ingredient, it’s awfully likely I don’t care for that particular herb. However, someone else might love cilantro, so let’s watch to see if it’s listed in lots of the dishes they do like, and be more likely to recommend pro-cilantro recipes for them.

This approach is likely looking very familiar to many of you. It’s the basis of something called Bayesian filtering. Bayesian filtering is the closest we’ve come to solving the problem of identifying spam. It works by learning, from the messages you personally flag, which words make something likely to be spam for you. “Viagra” might be an immediate red flag for you, but for me it’s a necess–never mind.
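For the curious, here is a minimal sketch of that idea in Python. It is not how any real spam filter (or reddit) is implemented; the class name WordFilter, the "like"/"dislike" labels, and the add-one smoothing are all my own assumptions, chosen to keep the example short. It counts the words in things a single user liked and disliked, then uses Bayes' rule (with the usual naive assumption that words are independent) to estimate how likely a new title is to be liked.

```python
import math
from collections import Counter


class WordFilter:
    """Tiny per-user naive Bayes filter over words (illustrative only)."""

    def __init__(self):
        # Word counts and item counts for each label (hypothetical labels).
        self.words = {"like": Counter(), "dislike": Counter()}
        self.docs = {"like": 0, "dislike": 0}

    def train(self, text, label):
        """Record one liked or disliked item for this user."""
        self.words[label].update(text.lower().split())
        self.docs[label] += 1

    def _log_score(self, tokens, label):
        counts = self.words[label]
        total = sum(counts.values())
        vocab = len(set(self.words["like"]) | set(self.words["dislike"]))
        # Prior: how often this user likes/dislikes things, smoothed.
        score = math.log((self.docs[label] + 1) / (sum(self.docs.values()) + 2))
        for token in tokens:
            # Add-one smoothing so unseen words never give log(0).
            score += math.log((counts[token] + 1) / (total + vocab + 1))
        return score

    def p_like(self, text):
        """Estimated probability (0..1) that the user likes this text."""
        tokens = text.lower().split()
        like = self._log_score(tokens, "like")
        dislike = self._log_score(tokens, "dislike")
        # Turn the two log scores back into a normalized probability.
        return 1 / (1 + math.exp(dislike - like))


me = WordFilter()
me.train("cilantro lime dip", "dislike")
me.train("cilantro chicken", "dislike")
me.train("cheese omelette", "like")
me.train("bacon omelette", "like")
print(me.p_like("cilantro salsa"))     # below 0.5: cilantro counts against it
print(me.p_like("mushroom omelette"))  # above 0.5: omelettes count in its favor
```

Each user gets their own WordFilter, which is the whole point: the same word can push the score up for one person and down for another. Splitting on whitespace is a deliberate simplification that suits short titles; a real system would need better tokenizing and far more training data.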

So back to reddit. Let’s ditch the subreddits. Instead, why not implement Bayesian filtering to recommend links to the users? Build the word lists from the content of the sites linked to, and maybe the submitted titles too. When I click a link, add its keywords to the list of things I like. When I downvote, file them right there under cilantro. And why stop there? Why not also notice when things I like are similar to things another user likes? Then recommend stuff HE likes to me! It’s like wisdom of the crowd TIMES Bayesian filtering. We can call it Web 3.0!
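The "recommend stuff HE likes" half can be sketched in a few lines too. This is only an illustration under my own assumptions: votes are stored as a plain dictionary of +1/-1 values per user, similarity is the cosine of two users' vote vectors, and the function names, user names, and link names are all made up.

```python
import math


def similarity(a, b):
    """Cosine similarity between two users' votes ({link: +1 or -1})."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[link] * b[link] for link in shared)
    # Every vote is +1 or -1, so each vector's length is sqrt(number of votes).
    return dot / (math.sqrt(len(a)) * math.sqrt(len(b)))


def recommend(user, votes, top_n=3):
    """Suggest unseen links, weighted by how similar each voter is to `user`."""
    mine = votes[user]
    scores = {}
    for other, theirs in votes.items():
        if other == user:
            continue
        sim = similarity(mine, theirs)
        if sim <= 0:
            continue  # ignore users whose taste is unrelated or opposite
        for link, vote in theirs.items():
            if link not in mine:
                scores[link] = scores.get(link, 0.0) + sim * vote
    ranked = sorted(scores, key=lambda link: scores[link], reverse=True)
    return [link for link in ranked if scores[link] > 0][:top_n]


# Hypothetical voting history.
votes = {
    "ben": {"unit-testing": 1, "ruby-tips": 1, "ron-paul": -1},
    "alice": {"unit-testing": 1, "ruby-tips": 1, "tdd-in-practice": 1, "ron-paul-rally": -1},
    "carol": {"ron-paul": 1, "unit-testing": -1, "election-news": 1},
}
print(recommend("ben", votes))
```

Here alice votes like ben on the links they share, so her upvotes count toward his recommendations and her downvotes count against them, while carol, who disagrees with him on everything they have both seen, is ignored. Combining this score with a per-user word filter is left as the fun part.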

Okay, this idea is not so new, and not so clever. In fact, I’m surprised this hasn’t already been implemented. After all, reddit-funder Paul Graham was the guy who popularized using Bayesian filtering for spam. Reddit even already has a “Recommended” link which claims it learns based on your voting, but this is a horrible, horrible lie. I have voted down dozens of Ron Paul articles (not from dislike, he just doesn’t particularly interest me) and the recommendations page I just refreshed has six articles with “ron paul” in their titles.

So I appeal to you, reddit-guys: let’s flip the whole approach and come at this bottom-up. Don’t segregate your audience into huge buckets; create a customized reddit for each and every one of us. Bring those of us with common interests together. Show us things we didn’t even know we liked yet! While you do that, I’m gonna go shut my office door and check out “sexprs in leopard.”