Tue, 04 Sep 07

Permalinks for del.icio.us bookmarks (posts)

28-Apr-2009. This is now a firefox extension hosted at addons.mozilla.org.

I’ve wondered in the past about sending trackbacks (or pingbacks) to bookmarked resources on del.icio.us. As with most things, you can either wait for someone to do it for you, or you can do it yourself… I started to investigate last Thursday afternoon and soon got sucked into the much more interesting issue of the missing permalinks for bookmarks. You see, every time you bookmark something on del.icio.us, you are creating a new resource. Unfortunately, it’s a very sad :-( resource. Sad because it doesn’t have a URL and so can’t be linked to. An anonymous resource floating around the world wide web looking for love. Boo hoo. Anyway, enough of that. I figured that if we could assign a unique tag to each of these resources then they would, as a side effect, get a permalink (you can already get your bookmarks containing a given tag by using http://del.icio.us/USERNAME/MY_TAG). I experimented with using the url as a tag. Although there seem to be some limits on what you can use, I found that if you strip the protocol and trailing slash from a url you are generally OK. So, if we bookmark BBC News and tag it with news.bbc.co.uk then we can access that individual bookmark at http://del.icio.us/chrisjroos/news.bbc.co.uk.

As you can only bookmark the same URL once, it should guarantee that the url-tag is unique. I can, however, see a problem with the trailing slash. Although it’s possible to bookmark URLs that differ only in the trailing slash (http://example.com/article1 and http://example.com/article1/ for example), it doesn’t seem possible to create a tag that contains that same trailing slash. OK, so we could always ensure that we remove the trailing slashes from the URLs we bookmark but that doesn’t seem like a very robust solution.

While experimenting, I found that I could use forward slashes within my tags, www.foo.com/bar/baz for example. So, I started prepending url/ to my url-tag so that I could easily recognise posts that already had permalinks. It was at this time that I noticed the similarity between the del.icio.us url history page del.icio.us/url/MD5_OF_URL (e.g. BBC News) and my permalinked bookmark, del.icio.us/chrisjroos/url/URL. If I was to hash the bookmarked url in the same way as del.icio.us then I would end up with a url like del.icio.us/chrisjroos/url/MD5_OF_URL. You can see the similarity in the URLs below.

It looks as though it’s part of del.icio.us itself doesn’t it? Cool huh. It’d be really cool to extend the del.icio.us firefox extension to automatically add this tag when bookmarking a site.

I’ve created a crappy script that will add this url/MD5_OF_URL tag to each of your posts. It’s hosted on google code and pasted below. Err, use at your own risk by the way.

USERNAME = 'YOUR_USERNAME'
PASSWORD = 'YOUR_PASSWORD'
POSTS_CACHE = 'FILE_TO_STORE_YOUR_EXISTING_DELICIOUS_POSTS_IN'

# GET ALL POSTS
unless File.exists?(POSTS_CACHE)
  puts "Downloading all posts..."
  curl_cmd = <<-EndCurl
curl "https://api.del.icio.us/v1/posts/all" \
-u"#{USERNAME}:#{PASSWORD}" \
-s
> #{POSTS_CACHE}
  EndCurl
  `#{curl_cmd}`
end

require 'hpricot'
require 'md5'
require 'cgi'

def add_url_hash_to_post(post)
  url = CGI.unescapeHTML(post['href'])
  url_hash = MD5.md5(url)
  url = CGI.escape(url)

  tags = post['tag'].split(' ').collect { |tag| CGI.unescapeHTML(tag) }
  tags << "url/#{url_hash}" unless tags.include?("url/#{url_hash}")
  tags = tags.collect { |tag| CGI.escape(tag) }
  tags = tags.join(' ')

  shared = post['shared']
  description = CGI.escape(CGI.unescapeHTML(post['description']))
  extended = CGI.escape(CGI.unescapeHTML(post['extended']))

  curl_cmd = <<-EndCurl
curl "https://api.del.icio.us/v1/posts/add" \
-u"#{USERNAME}:#{PASSWORD}" \
-d"url=#{url}" \
-d"description=#{description}" \
-d"extended=#{extended}" \
-d"tags=#{tags}" \
-d"shared=#{shared}" \
-s
  EndCurl
  `#{curl_cmd}`
end

# ADD THE URL/<md5_hash> TAG TO EACH POST
posts_xml = File.open(POSTS_CACHE) { |f| f.read }
posts_doc = Hpricot(posts_xml)
count = 1
(posts_doc/'posts'/'post').each do |post_xml|
  puts "Bookmark: #{count}" if (count % 5) == 0
  add_url_hash_to_post(post_xml.attributes)
  count += 1
end