Strongly Connected Components

Notes from the folks at Pomelo.

Small-town charm

Although websites with user-generated content are all the rage these days, it often feels like their communities are too big. I face the same problems over and over when picking through content (YouTube, Amazon, Reddit, etc.):

  • Why do I see content I don’t care about?
  • Why do I have to weed through comments or reviews from people who don’t think like me?
  • What use is a four-star rating over 25,000 reviews if most of the reviewers are morons?

The cause of these problems is understandable; websites have built up all this content, and they want to show it off. A site that boasts a large number of users or reviewers is more likely to continue growing, as people consider it a reliable source for content.

But how can we manage large communities (and related content) such that they still maintain small-town charm? Specifically, how can we match users and content? I’ll describe one such solution: increasing the usefulness of content-heavy websites by maintaining clusters of similar people.

These thoughts were inspired by this article. It’s long and great; read it. To summarize, the article proposes some mechanisms for maintaining community within social bookmarking sites. Many of the proposed ideas can be generalized to all web communities.

The Internet has supported community for a long time. Here’s a general timeline of Internet communities I’ve been involved in over the years (although mostly as a lurker): VAX Notes > Usenet > various Listservs > Slashdot > Kuro5hin > Reddit > Hacker News. I generally stick around until the community grows too large, then move on to a new site.

There’s plenty of writing about Internet communities: what makes them thrive or fail, and how they relate to real-world anthropology. More than these aspects, I’m interested in how we can provide a good experience for users, which in turn can promote the formation of community.

To illustrate the issue, let’s look at recipe websites, since I can’t seem to find one I like (although Cookthink is pretty nice).

Consider somebody who stumbles on a recipe website and wants to find something for dinner. They type in (or click on) ‘pasta’, and see a slew of results. Since there are many styles of cooking, it’s quite likely that many of the results are useless to that person (too simple, too complicated, too unhealthy, etc.). Although we don’t know anything about the user yet, we want these results to become more useful in the future.

Most commonly, the recipes are ordered by popularity or date, and the visitor can sort or filter the results by criteria such as cooking time or ingredients. In other cases, users can join pre-established communities (e.g. subreddits), and content is dictated by these group memberships. Sometimes, related content can be displayed (like Amazon’s Customers with Similar Searches Purchased…).

But how can we display appropriate content without the user doing anything? Perhaps on their first search, we can’t. But as soon as they start clicking, we’re learning about the user. We’re learning which recipes they like, and we’re learning how the user relates to all of those recipes’ attributes. We’re learning about the recipes and attributes they don’t care about. More importantly (in the interest of promoting community) we’re discovering how this user relates to other users.
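As a rough illustration of "learning from clicks", here is a minimal Python sketch. Everything in it is an assumption made up for the example: the recipe names, the attribute tags, and the helper names `attribute_profile` and `rank_results`. It simply counts the attributes of recipes a visitor has clicked and uses those counts to reorder the next set of results.

```python
from collections import Counter

def attribute_profile(clicked, attributes):
    """Count how often each attribute appears among the clicked recipes."""
    profile = Counter()
    for recipe in clicked:
        profile.update(attributes.get(recipe, ()))
    return profile

def rank_results(results, profile, attributes):
    """Order results by how well their attributes match the profile."""
    def score(recipe):
        # Attributes the user has never clicked on count as zero.
        return sum(profile[a] for a in attributes.get(recipe, ()))
    return sorted(results, key=score, reverse=True)

# Hypothetical data, invented for this sketch.
attributes = {
    "carbonara": {"quick", "rich"},
    "pesto": {"quick", "vegetarian"},
    "primavera": {"quick", "vegetarian"},
    "ragu": {"slow", "rich"},
}
profile = attribute_profile(["carbonara", "pesto"], attributes)
ranked = rank_results(["ragu", "primavera"], profile, attributes)
```

A real site would probably also want to weigh what the user skipped, not just what they clicked, but the counting idea is the same.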

Suppose, after a few clicks, we assign the person to some subgroup of similar users (we maintain a user graph with edge weights that represent community proximity). It’s transparent to the user; they have no idea this happens. From now on, we display content, comments and reviews only from closely related people, perhaps with some more distant content to fill in gaps.
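To make the user graph a bit more concrete, here is one possible sketch in Python. It assumes that "community proximity" is measured by the overlap in clicked recipes (Jaccard similarity), which is only one of many options, and the names `build_user_graph`, `close_users` and `visible_reviews`, the `min_weight` threshold and the sample data are all invented for illustration.

```python
from itertools import combinations

def jaccard(a, b):
    """Overlap between two sets of clicked recipes, from 0.0 to 1.0."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def build_user_graph(clicks):
    """Map each pair of users to an edge weight (their click similarity)."""
    graph = {}
    for u, v in combinations(sorted(clicks), 2):
        weight = jaccard(clicks[u], clicks[v])
        if weight > 0:
            graph[(u, v)] = weight
    return graph

def close_users(graph, user, min_weight=0.5):
    """Users whose edge to `user` weighs at least `min_weight`."""
    near = set()
    for (u, v), weight in graph.items():
        if weight >= min_weight:
            if u == user:
                near.add(v)
            elif v == user:
                near.add(u)
    return near

def visible_reviews(reviews, graph, user, min_weight=0.5):
    """Keep only reviews written by closely related users."""
    near = close_users(graph, user, min_weight)
    return [r for r in reviews if r["author"] in near]

# Hypothetical users and reviews, invented for this sketch.
clicks = {
    "ann": {"carbonara", "cacio e pepe", "pesto"},
    "bob": {"carbonara", "pesto", "lasagna"},
    "cat": {"mac and cheese", "instant ramen"},
}
reviews = [
    {"author": "bob", "text": "Toast the pine nuts first."},
    {"author": "cat", "text": "Too much work."},
]
graph = build_user_graph(clicks)
shown_to_ann = visible_reviews(reviews, graph, "ann")
```

Here ann and bob share two of four distinct recipes, so their edge clears the threshold and ann sees bob's review but not cat's. Filling in gaps with more distant content would amount to lowering `min_weight` when too few reviews survive.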

All that other stored content? Perhaps it’s accessible through deeper searches, or perhaps it’s not accessible at all. People don’t like sifting through an immense number of results; they just want the good ones. It seems to work for In-N-Out.

The key to this scheme is some metric to evaluate the happiness or satisfaction of users. Sometimes, site success can be measured by the length and quality of discussions (some users may prefer one-word conversations while others prefer longer, more involved responses). Perhaps the happiness of users of a cooking site can be measured by a constant or increasing frequency of use. The graph (and its related subgroups) can be shuffled or destroyed based on the perceived happiness of each user. Perhaps the site itself (the attributes used to generate the user graph, or the metrics used for happiness) can evolve to produce higher quality groups.
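The "constant or increasing frequency of use" idea could be sketched like this in Python. The 28-day window, the function names `visits_in_window` and `seems_happy`, and the sample dates are all assumptions chosen for the example, not a tested metric.

```python
from datetime import date, timedelta

def visits_in_window(visits, end, days):
    """Count visits in the `days`-long window ending on `end` (inclusive)."""
    start = end - timedelta(days=days)
    return sum(1 for d in visits if start < d <= end)

def seems_happy(visits, today, window_days=28):
    """True if the user visited at least as often recently as in the window before."""
    recent = visits_in_window(visits, today, window_days)
    earlier_end = today - timedelta(days=window_days)
    previous = visits_in_window(visits, earlier_end, window_days)
    return recent > 0 and recent >= previous

# Hypothetical visit histories, invented for this sketch.
today = date(2009, 3, 31)
steady = [date(2009, 2, 10), date(2009, 2, 24),
          date(2009, 3, 10), date(2009, 3, 24)]
fading = [date(2009, 2, 5), date(2009, 2, 12),
          date(2009, 2, 19), date(2009, 3, 20)]
steady_happy = seems_happy(steady, today)
fading_happy = seems_happy(fading, today)
```

Users who come out "unhappy" by a measure like this would be the candidates for reshuffling into a different subgroup.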

This idea isn’t about tacking social networking aspects on to existing sites. It’s about transparent mechanisms with which sites can better present content and promote community. Though I’m not a social scientist, I’ll bet that when users consistently see the same names on comments or reviews, they’ll be more likely to forge relationships. And isn’t that what we’re all about?

Written by Jay Boice

March 31, 2009 at 3:27 pm

Posted in web
