Media 2.0 continued – relevance & automation

Continuing the theme from my previous post, being able to parse and digest the current wave and coming tsunami of microchunked content, I would like to put a potential service offering in this area up for debate. Since the blogosphere is full of potential customers for this type of service – and I’m sure many of you are still choking on all of the content being sent or pulled your way – I’m thinking of this as a sort of focus group.

Brad Feld posted on this topic after seeing hundreds of redundant postings that largely said the same thing about the “Yahoo /” deal. For me, this may be the straw that breaks the camel’s back. I’ve been seeing this problem among the feeds that I read for some time now, but this deal was over the top in terms of redundancy.

One of Brad’s readers questioned whether there are built-in reasons for this in the current blog-search market. That is, bloggers are hooked on Technorati / Google for relevance, and they get it by posting about and linking to the hot stories, regardless of how many times a story has already been told. By analogy, if I subscribe to every small-town newspaper across the country, I’m not going to read the World/National section in every one. I’ll read the WSJ or NY Times or equivalent for that, and read only the local news in any particular paper.

There is a larger issue here – is finding the best blogs or best individual posts a search-driven process? When I think of the Web, I think of it as a vast body of information and services that are available to me. Search is an appropriate metaphor, because I generally know what I want and I just need to parse the available options for what it is that I’m looking for. The search method works for this … and using links as a proxy for relevance seems to be a good algorithm.

Now, when I think of blog reading and subscribing to content, I have a different mindset. I do not know what the news is on any given day, nor do I know what original analyses, insights, topics, etc. people are going to want to publish on any given day. Here I look for trusted sources and referrals. I then subscribe to the sources, parse the content, and recommend / discuss with others. And the cycle starts again. In the old world there were not many choices – WSJ, NY Times, Forbes, Meet the Press, etc.

With the world of content syndication – blogs, podcasts, and video – the choices are too numerous to be able to find all of the trusted sources that would be relevant to me. Plus, from those trusted sources, not all of the postings are going to be relevant, so I’m more interested in the microchunks of content – specific posts or segments of podcasts from any given source. For example, from which of my numerous sources do I want to read about the Yahoo/ deal? Who do I trust the most on that topic?

So, if this is the case and the search metaphor does not scale for microchunk subscription content, how does one scale a trusted source & referral model? I do think that it starts with word of mouth and some searching, which is how you build your initial list of trusted blogs and other subscriptions. But that initial list will not be enough, nor will it be dynamic.

From there, I think there needs to be a service offering in the market that can (1) find relevant new trusted sources for me regularly and dynamically, (2) parse those sources for relevant postings / microchunks and deliver them to me, and (3) allow an automated filter for redundant posts from across sources.
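To make item (3) a bit more concrete, here is a minimal sketch of a redundancy filter. Everything in it is an assumption on my part for illustration: the `drop_redundant` function, the per-source `trust` scores, the simple word-overlap measure, and the 0.5 threshold are all hypothetical, and a real service would need something smarter than comparing word sets.

```python
import re

def tokens(text):
    """Lowercased set of words in a post (a deliberately crude fingerprint)."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def overlap(a, b):
    """Jaccard overlap between the word sets of two posts, from 0.0 to 1.0."""
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)

def drop_redundant(posts, trust, threshold=0.5):
    """Keep one post per story, preferring the source I trust most.

    posts: list of {"source": ..., "text": ...} dicts (hypothetical shape)
    trust: my personal trust score per source (hypothetical, 0.0 to 1.0)
    """
    kept = []
    # Visit posts from most to least trusted source, so the first copy
    # of a story that survives is the one from the preferred source.
    for post in sorted(posts, key=lambda p: -trust.get(p["source"], 0.0)):
        if all(overlap(post["text"], k["text"]) < threshold for k in kept):
            kept.append(post)
    return kept

# Made-up data: two posts retelling the same deal, one original post.
posts = [
    {"source": "blog_a", "text": "Big portal acquires small bookmarking startup today"},
    {"source": "blog_b", "text": "Big portal acquires small bookmarking startup"},
    {"source": "blog_c", "text": "Notes on scaling a seed stage sales team"},
]
trust = {"blog_a": 0.4, "blog_b": 0.9, "blog_c": 0.7}

unique_posts = drop_redundant(posts, trust)
print([p["source"] for p in unique_posts])
```

The design choice here is that redundancy is resolved in favor of trust rather than recency or link count, which matches the question of who I trust the most on a given topic.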

The service I’m imagining would require that users upload all of their existing subscriptions in a categorization or list structure from their existing feed reader (with rankings if the feed reader allows for export of rankings). The service would then apply an intelligent filtering algorithm (probably based on a collaborative filter) to recommend new content sources based on what others in the network are reading and finding valuable. The user can synchronize with the service to get these recommended new feeds.
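As a rough illustration of what a collaborative filter over subscription lists might look like, here is a minimal sketch. The names (`recommend_feeds`, the `subscriptions` mapping, the placeholder feed names) are hypothetical, and using Jaccard similarity between subscription sets is just one simple choice among many, not a claim about how such a service would have to work.

```python
from collections import defaultdict

def jaccard(a, b):
    """Share of feeds two readers have in common, from 0.0 to 1.0."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def recommend_feeds(user, subscriptions, top_n=3):
    """Suggest feeds that readers with similar subscription lists follow.

    subscriptions: reader name -> set of feed names (hypothetical shape,
    standing in for the uploaded feed reader export).
    """
    mine = subscriptions[user]
    scores = defaultdict(float)
    for other, feeds in subscriptions.items():
        if other == user:
            continue
        similarity = jaccard(mine, feeds)
        if similarity == 0:
            continue  # no shared feeds, so no basis for a referral
        # Each feed I do not read yet gets credit weighted by how
        # similar its reader is to me.
        for feed in feeds - mine:
            scores[feed] += similarity
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [feed for feed, _ in ranked[:top_n]]

# Made-up network of four readers.
subscriptions = {
    "me": {"blog_a", "blog_b", "blog_c"},
    "alice": {"blog_a", "blog_b", "blog_d"},
    "bob": {"blog_a", "blog_c", "blog_d", "blog_e"},
    "carol": {"blog_f"},
}

print(recommend_feeds("me", subscriptions))
```

In this toy network, `blog_d` ranks first because both readers who overlap with me follow it, while `blog_f` is never suggested because its only reader shares nothing with me. Exported rankings, where available, could be folded in as weights on each subscription.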

As a second order of value, the service would allow users to receive only those postings or microchunks of content that other, similar subscribers in the network found to be valuable. This filter would start from the trusted sources and again use some form of collaborative filtering to determine which individual postings and microchunks are relevant. In order to score the quality of a posting, some form of metadata about each post would likely need to be uploaded from the user to the service (e.g. tags, a rating, or whether the post was read).
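A minimal sketch of how that post-level scoring could work, under some loud assumptions: each reader's uploaded metadata has already been reduced to a single rating between 0.0 and 1.0 per post, and a `similarity` weight between me and each other reader is already available (for example, from overlap in subscriptions). The function names and the 0.6 threshold are hypothetical.

```python
def score_post(ratings, similarity):
    """Similarity-weighted average rating of one post.

    ratings: reader -> rating in [0.0, 1.0] (hypothetical signal derived
             from tags, explicit ratings, or read/unread status)
    similarity: reader -> how similar that reader is to me
    Returns None when no similar reader has rated the post.
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for reader, rating in ratings.items():
        weight = similarity.get(reader, 0.0)
        weighted_sum += weight * rating
        total_weight += weight
    return weighted_sum / total_weight if total_weight else None

def filter_posts(post_ratings, similarity, threshold=0.6):
    """Return (post, score) pairs worth delivering, best first."""
    kept = []
    for post, ratings in post_ratings.items():
        score = score_post(ratings, similarity)
        if score is not None and score >= threshold:
            kept.append((post, round(score, 2)))
    return sorted(kept, key=lambda item: (-item[1], item[0]))

# Made-up data: two readers similar to me, one stranger.
similarity = {"alice": 0.5, "bob": 0.4}
post_ratings = {
    "post_1": {"alice": 1.0, "bob": 0.5},
    "post_2": {"alice": 0.0, "bob": 1.0},
    "post_3": {"carol": 1.0},
}

delivered = filter_posts(post_ratings, similarity)
print([post for post, _ in delivered])
```

Note that `post_3` is held back even though it was rated highly, because the only reader who rated it is not someone the filter considers similar to me. Whether that is the right behavior for brand-new sources is exactly the kind of question I would want debated.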

A well-functioning service like the one described above would benefit everyone, in that (1) readers get better content from across more trusted sources while having to read fewer postings and (2) the quality bar for content will go up dramatically, since only the most relevant content will reach readers (and the post-spamming of hot stories to get listed on search will decrease). Well, I guess the least relevant bloggers would not benefit much.

The business model for all of this, assuming the algorithms work and it’s easy enough to use, could be a combination of service subscriptions and advertising. The advertising revenue could be shared with the bloggers. Subscribers would likely be able to use the feed reader or microchunk reader (for when we start digesting podcast chunks, etc.) of their choice, maybe with a plug-in that delivers the appropriate metadata for the algorithms to do their work.

I would love comments …..

— bkm