URL shortening services are all the rage. This is no doubt driven by the increased use of Twitter and the need to deal with its 140-character limit.
This has of course generated a gold rush of investment in URL shortening services. It’s hard to see how these investments can really pay off.
As people use these services more and more, the online community is waking up to their problems:
- Single point of failure
- Perfect way to mask SPAM
- Delegation of trust
- Delegation of control
The potential for abuse and mischief and outright malign intent is spectacular. How long before someone implements an URL shortening service that delivers a worm or SPAM Trojan?
The response has been predictably lame. Now there are un-shortening libraries and even web services. What a classic case of the cure being worse than the disease. How do users deploy such solutions without falling afoul of exactly the same risks?
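For what it’s worth, un-shortening is usually nothing more than following the redirect chain, which is exactly why it inherits the shortener’s risks. A rough sketch using nothing but Python’s standard library (the tinyurl address below is just an example):

```python
import urllib.request

def unshorten(url: str) -> str:
    """Follow the redirect chain and return the final URL."""
    # A HEAD request avoids downloading the target page, but the lookup
    # still talks to the very shortener it is meant to protect you from.
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=5) as response:
        return response.geturl()

# unshorten("http://tinyurl.com/dkkweg") returns the long URL if the
# shortener is honest -- and whatever it pleases if it is not.
```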
Jeff Atwood thinks Google should become the benevolent provider. Hah! The absolute last thing anyone should want is to put more trust and dependency in Google. There’s no small irony in the fact that it is Google’s predilection for URLs that match the page title that has contributed to this problem.
The solution isn’t in the browser either. Greasemonkey scripts that try to resolve shortened URLs are exposed to the same mischief as the user and present yet another attack surface for a malign shortener.
The new hotness is the rev=”canonical” idea. The concept is simple enough: when you discover an URL on a page, the page self-describes a shortened version of it, deemed canonical. It sounds great but it isn’t. Imagine if I shortened all the URLs on this page and specified URLs on my site (or even a shortener I chose to use). I’d effectively be claiming to represent the canonical URL for someone else’s content. This goes from delegation of control to inversion of control. How does the recipient know that the server declaring the relationship is authoritative? If the original URL is relative then it’s a safer bet, but still not guaranteed. Anything else and we’ve simply moved the problem around. Perhaps I’m reacting to the word canonical. I think it would have made more sense to simply say rel=”shorter” and not pretend that the declaration is authoritative. Maybe I’m missing something here; feel free to correct me.
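To make the objection concrete: about the only cheap sanity check a consumer can perform on a rev=”canonical” claim is whether the advertised short URL lives on the same host as the page making it. A rough sketch (the function name and URLs are mine, purely illustrative):

```python
from urllib.parse import urljoin, urlparse

def plausibly_authoritative(page_url: str, claimed_short: str) -> bool:
    """Crude check on a rev="canonical" claim: same host, or relative."""
    # urljoin resolves a relative short URL against the page itself,
    # which is the safer-bet case described above.
    short = urljoin(page_url, claimed_short)
    return urlparse(short).netloc == urlparse(page_url).netloc

# plausibly_authoritative("http://example.com/long-title", "/s/x7")
#   -> True: relative, so at worst a site mislabels its own content
# plausibly_authoritative("http://example.com/long-title", "http://stup.id/ahk1n")
#   -> False: the claim just moves the trust problem to stup.id
```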
Applications whose implementation details impose a limit on URLs should have an internal URL shortening service. If Twitter wants to optimize the size of their packets then they can and should do so internally. They don’t even need to buy an expensive five-letter domain. Instead of http://stup.id/ahk1n they can store !ahk1n. The user edits URLs normally, the editor converts them when posting, and the view converts them back. Twitter users can continue to trust Twitter as much (or as little) as they already do, and the solution doesn’t increase any of the risks listed above. There’s reason to believe that flickr is working along these lines.
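A rough sketch of the idea, with a made-up token format and a dict standing in for what would really be a database table:

```python
import re

url_store: dict[str, str] = {}  # token -> full URL; a database in practice
counter = 0

def shorten_for_storage(text: str) -> str:
    """Replace each URL in a post with an internal !token before storing."""
    def replace(match: re.Match) -> str:
        global counter
        counter += 1
        token = "!%05x" % counter          # e.g. !00001 -- made-up format
        url_store[token] = match.group(0)
        return token
    return re.sub(r"https?://\S+", replace, text)

def expand_for_display(text: str) -> str:
    """Swap the !tokens back to full URLs when rendering the post."""
    return re.sub(r"!\w+", lambda m: url_store.get(m.group(0), m.group(0)), text)

stored = shorten_for_storage("reading http://example.com/a-very-long-post-title")
# stored == "reading !00001" -- the link costs six characters, no stup.id needed
print(expand_for_display(stored))  # the reader sees the real URL again
```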
Instead of having rev=”canonical” on each link, why not have a shortening service on each domain (or even each URL space)? The service would convert URLs to and from long and short form in a way that truly is canonical. The service could be declared once on the page. A page that links to other domains could, at its discretion, discover the shortened URLs and use those. Users would trust them no more and no less than the full URL, but considerably more than http://tinyurl.com/dkkweg.
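There’s no spec to point at, so treat the following as a sketch of what discovery might look like under an invented convention: the page declares its service once with something like a link element whose rel is ”shortener”, and a linking page asks the target domain’s own service for the short form. Every name here is illustrative:

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

class ShortenerLinkFinder(HTMLParser):
    """Look for a hypothetical <link rel="shortener" href="..."> element."""
    def __init__(self):
        super().__init__()
        self.href = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "link" and attributes.get("rel") == "shortener":
            self.href = attributes.get("href")

def discover_service(page_url: str, page_html: str):
    """Return the domain's own declared shortening service, if any."""
    finder = ShortenerLinkFinder()
    finder.feed(page_html)
    return urljoin(page_url, finder.href) if finder.href else None

html = '<html><head><link rel="shortener" href="/shorten"></head></html>'
print(discover_service("http://example.com/a-long-title", html))
# -> http://example.com/shorten : ask *that* for the short form, and trust
# it exactly as much as you trust example.com itself.
```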