Cloud Street

Wednesday, February 07, 2007

Great big bodies

I think the thing that really irritates me about the Long Tail is just how basic the statistical techniques underlying it are. If you've got all that data, why on earth wouldn't you do something more interesting and more informative with it? It's really not hard. (In fact it's so easy that I can't help feeling the Long Tail image must have some other appeal - but more on that later.)

As you may have noticed, this weblog hasn't been updated for a while. In fact, when I compared it with the rest of my RSS feed I found it was a bit of an outlier:

blogs2

The Y axis is 'number of blogs' and the X axis is 'days since the last update': two blogs were updated today (zero days ago), 11 in the previous 10 days, one in the 10-day period before that, and so on until you get to the 71-80 column. Note that each column is a range of values, and that the columns are touching; technically this is a histogram rather than a bar chart.
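For anyone who wants to try the binning described above, here is a minimal Python sketch. It assumes you already have one 'days since last update' figure per blog; the function names and the sample values are made up for illustration and are not the data behind the chart.

```python
from collections import Counter


def column_for(days_ago):
    """Return the histogram column for a 'days since last update' value.

    Column "0" holds blogs updated today; after that the columns are
    10-day ranges: "1-10", "11-20", and so on.
    """
    if days_ago < 0:
        raise ValueError("days_ago must be zero or positive")
    if days_ago == 0:
        return "0"
    low = ((days_ago - 1) // 10) * 10 + 1
    return f"{low}-{low + 9}"


def histogram(days_since_update):
    """Count how many blogs fall into each column."""
    return Counter(column_for(d) for d in days_since_update)


# Hypothetical values, one per blog (NOT the real feed data).
sample_days = [0, 0, 1, 3, 3, 5, 8, 12, 75]
counts = histogram(sample_days)

# Print the columns in ascending order, with a crude text bar for each.
for label in sorted(counts, key=lambda s: int(s.split("-")[0])):
    print(f"{label:>6} {'#' * counts[label]}")
```

The point of the sketch is that the counting is the whole job: once each blog has been assigned to a column, the chart is just the column totals.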

You can do something similar with 'posts in last 100 days':

blogs1

This shows that the really heavy posters are in the minority in this sample; twelve out of the eighteen have 30 or fewer posts in the last 100 days.

So it looks as if I'm reading a lot of reasonably regular but fairly light bloggers, and a few frequent fliers. If you put the two series together you can see the two groups reflected in the way the sample smears out along the X and Y axes without much in the middle:

blogs3

My question is this. If you can produce readable and informative charts like this quickly and easily (and I assure you that you can - we're talking an hour from start to finish, and most of that went on counting the posts), what on earth would make you prefer this:

blogs5

or this:

blogs4

I can only think of two reasons. One is that it looks kind of like a power law distribution, and that's a cool idea. Except that it isn't a power law distribution, or any kind of distribution - it's a list ranked in descending order, and, er, that's it. The same criticism applies, obviously, to the classic 'power law' graphic ranking weblogs in descending order of inbound links.

DIGRESSION
You can compute a distribution of inbound links across weblogs using very much the techniques I've used here - so many weblogs with one link, so many with two and so forth. Oddly enough, what you end up with then is a curve which falls sharply then tapers off - there are far fewer weblogs with two links than with only one, but not so much of a difference between the '20 links' and '21 links' categories. However, even that isn't a power law distribution, for reasons explained here and here (reasons which, for the non-mathematician, can be summed up as 'a power law distribution means something specific, and this isn't it').
END DIGRESSION
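The difference between a ranked list and a distribution can be shown in a few lines of Python. This is only a sketch: the weblog names and link counts are invented for illustration, and the two helper functions are my own naming rather than anything standard.

```python
from collections import Counter


def ranked_list(links_per_blog):
    """The 'Long Tail' view: blogs sorted by inbound links, biggest first."""
    return sorted(links_per_blog.items(), key=lambda item: item[1], reverse=True)


def link_distribution(links_per_blog):
    """The distribution view: how many blogs have exactly k inbound links."""
    blogs_per_count = Counter(links_per_blog.values())
    return dict(sorted(blogs_per_count.items()))


# Hypothetical inbound-link counts (invented, not measured).
links_per_blog = {
    "blog_a": 21,
    "blog_b": 1,
    "blog_c": 1,
    "blog_d": 2,
    "blog_e": 1,
    "blog_f": 20,
}

print(ranked_list(links_per_blog))        # answers "who's first?"
print(link_distribution(links_per_blog))  # answers "how many have k links?"
```

The ranked list keeps the names and throws away the shape of the data; the distribution does the opposite, which is why only the second can sensibly be compared with a power law or any other distribution.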

The other reason - and, I suspect, the main reason - is that the Long Tail privileges ranking: the question it suggests isn't 'how many of which are doing what?' but 'who's first?'. A histogram might give more information, but it wouldn't tell me who's up there in the big head, or how far down the tail I am.

People want to be on top; failing that, they want to fantasise about being on top and identify with whoever's up there now. Not everyone, but a lot of people. The popularity of the Long Tail image has a lot in common with the popularity of celebrity gossip magazines.


Wednesday, November 22, 2006

They don't know about us

Some dystopian thoughts on data harvesting, usage tracking, recommendation engines and consumer self-expression. First, here's Tom, then me:
"This is going to be one of the great benefits of ambient/pervasive computing or everyware - not the tracking of objects but the tracking and collating of you yourself through objects."

This sentence works just as well with the word 'benefits' replaced by 'threats'. It all depends who gets to do the tracking and collating, I suppose.

Now here's Max Levchin, formerly of Paypal, and his new toy Slide (via Thomas):
If Slide is at all familiar, it's as a knockoff of Flickr, the photo-sharing site. Users upload photos, which are displayed on a running ticker or Slide Show, and subscribe to one another's feeds. But photos are just a way to get Slide users communicating, establishing relationships, Levchin explains.

The site is beginning to introduce new content into Slide Shows. It culls news feeds from around the Web and gathers real-time information from, say, eBay auctions or Match.com profiles. It drops all of this information onto user desktops and then watches to see how they react.

Suppose, for example, there's a user named YankeeDave who sees a Treo 750 scroll by in his Slide Show. He gives it a thumbs-up and forwards it to his buddy - we'll call him Smooth-P. Slide learns from this that both YankeeDave and Smooth-P have an interest in a smartphone and begins delivering competing prices. If YankeeDave buys the item, Slide displays headlines on Treo tips or photos of a leather case. If Smooth-P gives a thumbs-down, Slide gains another valuable piece of data. (Maybe Smooth-P is a BlackBerry guy.) Slide has also established a relationship between YankeeDave and Smooth-P and can begin comparing their ratings, traffic patterns, clicks and networks.

Based on all that information, Slide gains an understanding of people who share a taste for Treos, TAG Heuer watches and BMWs. Next, those users might see a Dyson vacuum, a pair of Forzieri wingtips or a single woman with a six-figure income living within a ten-mile radius. In fact, that's where Levchin thinks the first real opportunity lies - hooking up users with like-minded people. "I started out with this idea of finding shoes for my girlfriend and hotties on HotOrNot for me," Levchin says with a wry smile. "It's easy to shift from recommending shoes to humans."

If this all sounds vaguely creepy, Levchin is careful to say he's rolling out features slowly and will only go as far as his users will allow. But he sees what many others claim to see: Most consumers seem perfectly willing to trade preference data for insight. "What's fueling this is the desire for self-expression," he says.

Nick:
I'm not sure that I see, in today's self-portraits on MySpace or YouTube or Flickr, or in the fetishistic collecting of virtual tokens of attention, the desire to mark one's place in a professional or social stratum. What they seem to express, more than anything, is a desire to turn oneself into a product, a commodity to be consumed. And since, as I wrote earlier, "self-commoditization is in the end indistinguishable from self-consumption," the new portraiture seems at its core narcissistic. The portraits are advertisements for a commoditized self.

Granny Weatherwax:
"And sin, young man, is when you treat people as things. Including yourself. That's what sin is. ... People as things, that's where it starts."

More precisely, that's where some extraordinarily unequal and dishonest social relationships can start.


Monday, November 13, 2006

Got a web between his toes

Now that Nick has read the last rites for Web 2.0, perhaps it's safe to return to a question that's never quite been resolved.

To wit: what is Web 2.0? (We've established that it's not a snail.) Over at What I wrote, I've just put up a March 2003 article called "In Godzilla's footprint". In it, I asked similar questions about e-business, taking issue with the standard rhetoric of 'efficiency' and 'empowerment'. I suggested that e-business wasn't - or rather isn't - a phenomenon in its own right, but the product of three much larger trends: standardisation, automation and externalisation of costs. (Read the whole thing.)

Assuming for the moment that I called this one correctly - and I find my arguments pretty persuasive - what of Web 2.0? More of the same, only featuring the automation of income generation (AdSense) and the externalisation of payroll costs ('citizen journalism')? Or is there more going on - and if so, what?

Update 16/11

It would be remiss of me not to give any pointers to my own thinking on Web 2.0. So I'm republishing another column at What I wrote, this time from February of this year. Most of you will probably have seen it the first time round, when it appeared in iSeries NEWS UK, but I think it's worth giving it another airing. Have a gander.
