Wed, 03 Oct 07

A microformat for my contact details (instant messaging, twitter et al)

I haven’t done any research on this so it may have already been proposed over at the microformats site.

The problem

Sometimes I want to communicate with the author of a webpage. I don’t want to have to hunt around to find contact details and I probably don’t want to fill in a form.

A proposal to fix part of the problem

If the author’s contact details were semantically marked up, either on the page I was looking at or linked to elsewhere on the web, then I could use a simple parser to extract those details and select the best communication tool to use manually.

Maybe some code can help explain what I mean.

A semantically marked up contact page.

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN">

<html>

  <head>
    <title>Contact Details</title>
  </head>

  <body>
    <dl class="contactDetails">
      <dt>Skype</dt>
      <dd class="skype identifier">skype-username</dd>
      <dt>MSN</dt>
      <dd class="msn identifier">msn-username</dd>
    </dl>
  </body>

</html>

A parser in ruby

require 'rubygems'
require 'hpricot'

class ContactParser
  def self.from_html(html)
    doc = Hpricot(html)
    new(doc)
  end
  def initialize(hpricot_doc)
    @hpricot_doc = hpricot_doc
  end
  def service_identifiers
    (@hpricot_doc/'.contactDetails .identifier').inject({}) do |hash, e|
       service = (e.classes - ['identifier']).first
       identifier = e.inner_text
       hash[service] = identifier
       hash
    end
  end
  def identifier(service)
    (@hpricot_doc/".contactDetails .#{service}.identifier").inner_text
  end
end

contact_details_file = File.dirname(__FILE__) + '/contact-details.html'
html = File.open(contact_details_file) { |f| f.read }
parser = ContactParser.from_html(html)

p parser.service_identifiers
# => {"skype"=>"skype-username", "msn"=>"msn-username"}
p parser.identifier('skype')
# => 'skype-username'
p parser.identifier('blurgh')
# => ''

Further considerations

Anyone have any thoughts?

Code is all over on google code.

1 I’m thinking of multi protocol tools, like adium, in particular.