Monthly Archives: March 2009

Will recommendation systems always suck?

OK, there’s nothing like a leading question to get things going in the morning.

I have made several forays into the world of recommendation systems and machine learning both as a user and as a developer.  As a user I have been utterly underwhelmed, every time.

There are several possible outcomes for a recommendation system:

  1. Recommends something you already have (obviously this is a degenerate case and easily filtered)
  2. Recommends something you’ve previously tried and rejected
  3. Recommends something you’ve never tried but don’t like
  4. Recommends something you’ve never tried and end up liking
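As a rough sketch, the four outcomes above can be written as a small classifier. The names `Outcome`, `classify` and `prefilter` are made up for illustration, and the sets of owned, rejected and liked items are assumed inputs. Note that only buckets 1 and 2 can be filtered before the user tries anything; buckets 3 and 4 are only known after the fact.

```python
from enum import Enum


class Outcome(Enum):
    ALREADY_OWNED = 1
    PREVIOUSLY_REJECTED = 2
    NEW_BUT_DISLIKED = 3
    NEW_AND_LIKED = 4


def classify(item, owned, rejected, liked_after_trying):
    """Place one recommended item into one of the four buckets."""
    if item in owned:
        return Outcome.ALREADY_OWNED
    if item in rejected:
        return Outcome.PREVIOUSLY_REJECTED
    if item in liked_after_trying:
        return Outcome.NEW_AND_LIKED
    return Outcome.NEW_BUT_DISLIKED


def prefilter(recommendations, owned, rejected):
    """Drop buckets 1 and 2, which are knowable before the user tries anything."""
    return [r for r in recommendations if r not in owned and r not in rejected]
```

A system that skips the `prefilter` step is the one that keeps landing in buckets 1 and 2.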

Here’s an example.  I like a band called Great Big Sea.  The mere act of including this band in my declared content almost automatically generates a pile of recommendations that fall firmly into buckets 2 and 3.  This is probably because there are a lot of people so desperate for a Great Big Sea sound that they’ll wander into a world of Irish folk music that has about as much in common with Great Big Sea’s essence as the band Phish has with marine lifeforms.

Intuitively I believe that recommendation systems will converge towards clusters.  Even if you occupy more than one cluster you’re going to get the middle of the road for that cluster.  Some people like that cluster.  They may also like the manufactured pop band of the moment.  While I don’t begrudge them that choice, they hardly need a recommendation system for that.

Today I read an article that suggests that recommendation systems can increase diversity for individuals but that they do so at the cost of overall diversity.  It suggests that diversity comes from increasing the friction of recommendations, not decreasing it.

I once built a playlist and music sharing system based on the idea that you start out knowing what song you want to hear and then enjoy the serendipitous discovery of the songs that come next on someone else’s playlist.  It’s like channel surfing on the radio.  You stop when you recognize a song that you like, not one you don’t recognize.  Then you stay on that station until they play too many ads or play a song you really don’t like.  If you’re lucky you’ll hear a song that you didn’t know but really like.  Unfortunately our system never saw the light of day but at least I found a few new songs.
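The core of that idea fits in a few lines. This is a minimal sketch and not the system we built: it assumes playlists are plain lists of song names, and `what_comes_next` is a hypothetical helper that counts which songs directly follow the one you asked for on other people's playlists.

```python
from collections import Counter


def what_comes_next(seed_song, playlists, already_known=frozenset()):
    """Return songs that directly follow seed_song, most frequent first."""
    followers = Counter()
    for playlist in playlists:
        # Walk each playlist as (current, following) pairs.
        for current, following in zip(playlist, playlist[1:]):
            if current == seed_song and following not in already_known:
                followers[following] += 1
    return [song for song, _ in followers.most_common()]
```

You start from a song you recognize and the discovery comes from whatever other people chose to play next, which is the radio-surfing behaviour described above.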

Applications masquerading as social networks

I finally took a cursory look at FriendFeed today.

In the old days we used to say that every application expands until it reads mail.  These days every new application starts out as a social networking platform with a relatively minor application on top.  It then becomes a game of trendiness and marketing to reach critical mass.  The world is sufficiently fickle and the applications sufficiently diverse that this can happen over and over again.

Without having used it much I can’t really say if FriendFeed truly solves interesting problems.  I suspect it does not.  It looks to me like a fairly obvious RSS aggregator combined with a social network.  OK, kudos to them for having thought of it but brickbats for the chutzpah to try to make it yet another place.  FriendFeed isn’t a place, it’s a minor application feature.

I don’t like or trust Facebook but on one level I’ll agree with what they are trying to do.  It’s absolutely pointless for me to create, import and maintain separate social networks for each application or trend that comes along.

Should FriendFeed have built a Facebook application?  Almost certainly.  Are we somehow happier, safer or more productive because they chose to roll their own network?  Hardly.  Throw in some address book scraping features and if anything they make me less secure.  Anyone who types their Facebook or Gmail or any other password into some other service is being very optimistic.  Application developers who encourage users to do so are being unprofessional.

Is the solution to move everything to Facebook?  Of course not.  Maybe I was wrong, maybe we really do need the OpenSocial app platform, if only to prevent silly things like FriendFeed.  I still don’t like applications.  I prefer protocols like RSS.  It’s very hard for me to see FriendFeed as more than a trivial aggregator of RSS feeds.  I see no benefit in having that run on a central service that will sooner or later be forced to monetize it.
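To show what I mean by "trivial", here is a rough sketch of an RSS aggregator using only the Python standard library. It says nothing about how FriendFeed actually works. It assumes well-formed RSS 2.0 documents already fetched as strings, and that every item carries a `pubDate` with an explicit time zone; the function name `aggregate` is my own.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime


def aggregate(feed_documents):
    """Merge the items of several RSS 2.0 documents, newest first."""
    entries = []
    for xml_text in feed_documents:
        root = ET.fromstring(xml_text)  # root is the <rss> element
        for item in root.iterfind("channel/item"):
            entries.append({
                "title": item.findtext("title", default=""),
                "link": item.findtext("link", default=""),
                # Assumption: pubDate is present and in RFC 822 format.
                "published": parsedate_to_datetime(item.findtext("pubDate")),
            })
    return sorted(entries, key=lambda e: e["published"], reverse=True)
```

Everything here can run on my own machine against feeds I choose, with no central service in the middle.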

I want a diverse culture of applications and implementations around some focused and well supported protocols.  A key enabling technology for any social networking is of course a list of contacts, in other words, a glorified address book.  Imagine that along with your email account your ISP also provided an integrated account that enabled social networking simply by providing a way to model relationships with your contacts and to manage access to those relationships (both for users and applications).

Ingredients:

  • Identity
  • Access control
  • Contact List
  • Relationship model
  • Applications
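As a thought experiment, the ingredients above might fit together like this. It is only a sketch under my own assumptions: `Identity`, `Account` and every method name are hypothetical, relationships are plain text labels, and access control is reduced to "which labels may this application see".

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Identity:
    handle: str  # e.g. an email-style address issued by the provider


@dataclass
class Account:
    owner: Identity
    # Contact list plus relationship model: contact -> set of labels
    relationships: dict = field(default_factory=dict)
    # Access control: application name -> labels it is allowed to read
    app_grants: dict = field(default_factory=dict)

    def relate(self, contact, label):
        """Record that a contact stands in the given relationship to the owner."""
        self.relationships.setdefault(contact, set()).add(label)

    def grant(self, app, label):
        """Allow an application to see contacts carrying this label."""
        self.app_grants.setdefault(app, set()).add(label)

    def contacts_visible_to(self, app):
        """Return handles of contacts the application has been granted."""
        allowed = self.app_grants.get(app, set())
        return sorted(
            contact.handle
            for contact, labels in self.relationships.items()
            if labels & allowed
        )
```

An application that has been granted nothing sees nothing, and it never has to ask for my password to scrape an address book.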

Am I slowly re-inventing OpenSocial?  It seems likely but I know I’m starting from a different set of goals.  Either way the journey is valuable and interesting.  Thanks for riding along with me this far.

Of bees, monoculture, research and taxes

The search for the cause of honey bee deaths was probably one of the most important investigations in science this past year.  It’s great that they appear to have understood the problem but just as great that the research was done in an open way.  It doesn’t appear that anyone is claiming any patents just yet.

I doubt we’ll be so lucky with the “cure.”  It seems likely that pure scientific collaboration will lose out to business which would then impose a juicy tax on the use of GM bees or whatever.  Of course it’s not called a tax, that’s only for “bad” stuff like government taxes that pay for research “best left” to private industry.  Sheesh.

Ruby irritants

The Windows port of Ruby is a second class citizen and for no good reason.  If you opt for the one-click installer then it really is fine.  The “official” binary release is missing a very specific set of external DLLs.  How hard would it be to have an official Windows ruby page that points to the missing binaries?  Fortunately there is a page, even if it has no instructions.  FYI, you just unpack most of the zips into the ruby base dir.  The only exception is iconv, where you should unpack iconv.dll and iconv.exe into ruby\bin.

Rubyforge is truly a bizarre bazaar.  I envy the relative clarity of CPAN.  To be fair this sometimes works out because new approaches can bubble up at any time.  For the most part however, it’s just a pain.  You have to spend time evaluating the health of a particular library/gem on rubyforge before actually using it.  Even within the standard libraries there’s a fair amount of overlap and strange historical artifacts that you have to just deal with.

The ruby distribution has an insane number of files.  Eleven thousand files!  This is a minor deal once it’s installed but it’s a pain when it comes to repeated deployment, and my practice of storing my entire dev env in version control always hurts when applied to Ruby.

Ruby debugging is stuck in the 80s.  This is a fairly well solved problem in any modern development environment.  Why do we put up with this?

I won’t even touch on the performance issue.  For the most part it hasn’t been a problem for me.

In spite of all that, Ruby remains my favorite language.  I have tried and failed to like Python.  I like C# and although it is less verbose (not to say redundant) than Java, it still feels clunky compared to Ruby.

Why I never accept any Facebook applications

Facebook’s guiding principles and developer terms of use for applications include prohibitions that read like a checklist of ways that 3rd party applications can and will accumulate data about their users.

Facebook seems to realize they need to increase trust but their Application Verification Program is laughable because it can only verify the behaviour at the time of the test.  It’s pathetically easy to change the behaviour of an application after testing and even easier to be gathering personal tracking data about every user optimistic enough to run some application.

I use Facebook sparingly as a glorified address book.  I never use any 3rd party applications.

Social networking is not about gadgets and apps

I’ve been slowly catching up with OpenSocial and occasionally stumbling over examples of it in the wild.

I really think they missed the point here.

OpenSocial is one of those classic “standards” efforts by a bunch of big but not relevant players to catch up in a space where they have been completely blindsided.  I’m talking, of course, about Facebook.  When OpenSocial was announced, the hottest activity on Facebook was all the new applications on the relatively new application platform Facebook had provided.  OpenSocial was an obvious attempt to be relevant and doomed, if not to failure, then to a security and privacy nightmare until the end of time.

Trying to catch up to Facebook by implementing a competing app model is like trying to pass Michael Schumacher on the track by wearing the same T-shirt he had on when you last saw him.

Already Facebook is moving past 3rd party applications.  They always were just a way to generate “social activity” and to embed Facebook into users’ lives.  Just as Microsoft and Google did before it, Facebook will swallow up any new social application that it considers to be core to its business model and leave the rest on the fringe to prove how open its platform is.  Just as with Microsoft and Google, nobody is likely to beat them by taking them head on.

Reach for the stars kids!

A lot has already been written about the 4 Spanish students who launched a balloon to the edges of the atmosphere and took some amazing pictures.  But I can’t resist.

I want to try something like this with my kids, even if we start on a much smaller scale.  When I read about this I think less about the pictures or even the accomplishment and more about how it will inspire them and others to reach beyond their immediate grasp.

One thing I find interesting is that the Telegraph article is horribly misleading and vague.  The article implies that these guys just tied a camera to a balloon, let it go up, waited for it to come down and then downloaded the pics to a computer.  Even without reading their site it’s obvious that this was a real engineering project and not something anyone could attempt without serious preparation.  The article is also painfully stingy with links and real attribution.  In short, the Telegraph article attempts to be entertainment but the blog posting I found attempts to be information.  This is why newspapers as we know them, online or otherwise, are going to die.

Dave Winer has said this many times.  Don’t be jealous of your readers, provide them with links away from your site and they’ll come back over and over again.

Do I have to start reading everyone’s blogs again?

One of the reasons I stopped blogging was because I also stopped reading.

I stopped reading because I got busy, very busy.  Working as a consultant and tracking my time by the hour meant that there was very little time left for serendipitous reading and research.  After a full day of work and an evening full of children, I rarely felt like diving into the blogosphere.

I still miss OnFolio.  It was almost the perfect fit for the way I liked to read blogs and I could blast through a ton of them in no time.  Google Reader is close but isn’t quite right.  OnFolio had two big things going for it.  By running as a browser extension it was able to integrate itself into the normal flow of pages and links in a way that just worked and worked fast.  By storing all my feed related information locally I was comfortably private.

I suppose I should try Sage again.  Do I have to?

New page: Links of note

I’ve added a new page to the site that is, for all intents and purposes, a linkblog.  It contains links I recently found noteworthy but didn’t have the time or inclination to write about.  The page is actually a rendering of links from a Delicious account I set up specifically for this purpose.  It’s convenient enough for me because I’ve installed a Delicious plugin that hijacks the “Add to bookmarks/favorites” feature of my browser.  I’m not worried about privacy because these are links I intended to publish anyway.

How long before I start rendering a twitter feed as well?  Don’t hold your breath.

Recommended reading: Untangled – Roy T. Fielding’s blog

http://roy.gbiv.com/untangled/

I just read a few posts here and learned something from each one.  More importantly, even if I don’t choose to follow everything he says, I now do so deliberately with a better understanding of the cost.

I really like how Roy Fielding thinks.  He’s sweating details around API design that most of us haven’t had the time to even notice.  I intuitively get that REST APIs must be hypertext driven.  Now I want to understand exactly why.  How do I sell this to my engineering team?
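Here is my current, possibly incomplete, reading of "hypertext driven" as a toy sketch. It is not taken from Roy's posts. The `RESOURCES` table stands in for a server, `fetch` stands in for an HTTP GET, and the link relation names (`orders`, `first`) are invented. The client knows only the entry URL and the relation names, never the other URLs.

```python
# A toy in-memory "API": every representation carries its own links.
RESOURCES = {
    "/": {"links": {"orders": "/o"}},
    "/o": {"items": ["order-1"], "links": {"first": "/o/1"}},
    "/o/1": {"id": "order-1", "links": {}},
}


def fetch(url):
    """Stand-in for an HTTP GET; returns the representation at url."""
    return RESOURCES[url]


def follow(entry_url, *relations):
    """Start at the entry point and follow named link relations in order."""
    doc = fetch(entry_url)
    for rel in relations:
        # The next URL always comes from the previous response.
        doc = fetch(doc["links"][rel])
    return doc
```

Calling `follow("/", "orders", "first")` reaches the order without the client ever hard-coding `/o/1`, so the server could rename its paths and this client would keep working as long as the relation names stay put.  That might be the start of a pitch to the engineering team.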

I suspect that where I will disagree with Roy is in areas I call pragmatism and he would call trouble waiting to happen.  For any given project only time will tell.

Lots of reading to catch up on.