Sun, 10 Dec 06

Crappy ruby script to download photos from a flickr photoset

So, it seems far more difficult than it should be to retrieve a bunch of my photos from flickr. Having unsuccessfully experimented with some of the flickr libs on rubyforge, I hacked this together.

UPDATE: Something caused curl to get ‘stuck’ downloading a particular picture so I amended the script to ensure it ignores already downloaded files.

UPDATE 2: Surround output filename with quotes.
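The quotes matter because the filename is built from the photo title, and the whole curl command goes through the shell. As a rough sketch of an alternative, assuming you would rather escape than quote, Ruby's standard Shellwords module can do the escaping. The helper name `curl_download_command` and the example URL and path are made up for illustration; they are not part of the script.

```ruby
require 'shellwords'

# Hypothetical helper (not in the original script): builds the curl command
# with both the URL and the output path escaped for the shell.
def curl_download_command(url, filepath)
  "curl #{Shellwords.escape(url)} > #{Shellwords.escape(filepath)}"
end

# Made-up URL and path, purely for illustration.
puts curl_download_command('http://example.com/a.jpg', '/tmp/my "holiday" photo.jpg')
```

Unlike wrapping the path in double quotes, this also copes with titles that themselves contain a double quote.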

require 'cgi'

LOCAL_PHOTO_DIR = '/users/chrisroos/desktop/photos/'

curl_cmd = %[curl "#{API_KEY}&photoset_id=#{PHOTOSET_ID}"]
photoset_photos = `#{curl_cmd}`

number_of_photos = photoset_photos.scan(/<photo .*>/).size
current_photo = 0

photoset_photos.scan(/<photo .*>/) do |line|
  current_photo += 1

  photo_id = line[/id="(\d+)"/, 1]
  photo_secret = line[/secret="(\w+)"/, 1]
  photo_server = line[/server="(\d+)"/, 1]
  photo_title = line[/title="(.+?)"/, 1]

  url = "#{photo_server}/#{photo_id}_#{photo_secret}_o_d.jpg"
  filename = CGI.unescapeHTML(photo_title).gsub(/ |&|,|-/, '_').gsub(/'/, '').downcase.squeeze('_') + '_' + photo_id + '.jpg' # Sanitize title to use as filename
  filepath = LOCAL_PHOTO_DIR + filename

  if File.exist?(filepath)
    puts "Skipping photo (already downloaded) #{current_photo} of #{number_of_photos}"
  else
    puts "Retrieving photo #{current_photo} of #{number_of_photos}"
    curl_cmd = %[curl "#{url}" > "#{filepath}"]
    `#{curl_cmd}` # Run the download
  end
end
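
To see what the attribute regexes and the title sanitising actually produce, here is a small standalone sketch that runs them against a single line. The sample `<photo>` line is made up to match the shape the regexes expect (it is not real API output), and `photo_filename` is a hypothetical helper that just wraps the one-liner from the script.

```ruby
require 'cgi'

# Made-up sample line in the shape the script's regexes expect; not real API output.
SAMPLE_LINE = '<photo id="123" secret="abc123" server="45" title="Chris &amp; Hannah\'s Wedding - Day 1" />'

# Hypothetical helper wrapping the sanitising one-liner from the script:
# unescape HTML entities, turn spaces/&/,/- into underscores, drop apostrophes,
# lowercase, collapse runs of underscores, then append the photo id.
def photo_filename(title, photo_id)
  CGI.unescapeHTML(title).gsub(/ |&|,|-/, '_').gsub(/'/, '').downcase.squeeze('_') + '_' + photo_id + '.jpg'
end

# Same attribute extraction as the script uses.
photo_id    = SAMPLE_LINE[/id="(\d+)"/, 1]
photo_title = SAMPLE_LINE[/title="(.+?)"/, 1]

puts photo_filename(photo_title, photo_id)
# prints "chris_hannahs_wedding_day_1_123.jpg"
```

Appending the photo id keeps filenames unique even when two photos share a title.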