Integrating with Dropbox API for fun and profit

04 Sep 2012 – Warsaw

TL;DR

Integrating with the Dropbox API and using one of its relatively new features resulted in both improved user experience and reduced operating costs for my music store. Includes technical details and code.

A Shameless Plug

If you follow me on Twitter (or any other social service), you probably know that in May I launched my first startup, MusicRage.org, which is, basically, about selling packages of independent music with a strict time limit (a package is available to buy for two weeks), priced on a pay-what-you-want model. Yes, it is heavily inspired by HumbleBundle, but for music.

You can see for yourself: at the time of publishing this blog post we’re running the fourth round of packages, with four different genres (rock, electronic, indie/alternative and comedy).

The problem with buying music on the internet…

… is not only the actual buying, but also downloading and managing the bought files. One usually needs to either download and decompress an archive file (usually a zip) containing the whole album or, in the worst-case scenario, download every. Single. File. Separately. This is one of the things that Apple did properly with iTunes: combining buying, managing and listening to music in one convenient application. But this is also something that a store nowhere close to iTunes’ market share is able to pull off.

When MusicRage launched, we offered both “classical” ways of downloading bought files: either single files or a zip archive with the whole album in the desired format (be it WAV or MP3). But for a whole package of four albums, that still means downloading (click, select destination, ‘save’) and unzipping (click, select destination, ‘extract’) four zip files. I’m not even touching the idea of downloading all files separately.

There should certainly be a more convenient way of getting bought music, maybe even more convenient than what every BitTorrent client does (download a whole folder of ready-to-play files). I agree with Gabe Newell that customers want to buy content and support creators, it’s just that the service usually sucks.

Enter Dropbox

I love Dropbox. They have solved the problem of backing up, synchronizing and collaborating on files in a way that’s painless for the user and doesn’t require technical knowledge.

What’s even better, less than a year ago Dropbox launched an API that allows third-party applications to interact with a user’s Dropbox in a way that’s both secure (by default, applications can operate only within a specified directory in the user’s Dropbox account) and convenient (applications are authorized via OAuth).

Dropbox integration could do for MusicRage what iTunes does: have all the music files of a given album downloaded, backed up and ready to play, with the whole operation stripped down to a single click of “add this album to my Dropbox”. Use a cool API that could also reduce our operating costs (see below)? There was simply no reason for me to refuse such an idea.

How could Dropbox reduce our operating costs? We host our downloadable files on S3, which has many upsides (bandwidth, scalability, security) and one downside (it costs money). If we could find a way to put the files on Dropbox once and then have them copied to other users’ accounts within Dropbox infrastructure, that would remove the need to upload each file for every single user. Which saves time and bandwidth for everyone.

A short story of copy_ref

There was once a Dropbox feature that prevented the same file from being uploaded multiple times by different users: when the application detected that a file of this size and hash was already on Dropbox, it would just copy it internally (probably by making a hardlink; I have no idea about Dropbox’ internals) and therefore finish the “upload” of even the largest file in a split second.
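To make the idea concrete, here is a toy in-memory sketch of deduplicating uploads by size and hash. It is not how Dropbox works internally (that is unknown here); `DedupStore` and its methods are made-up names, and SHA-256 is an arbitrary choice of hash for the illustration.

```ruby
require 'digest'

# Toy model of upload deduplication. NOT Dropbox's real implementation;
# every name in this class is hypothetical.
class DedupStore
  attr_reader :bytes_transferred

  def initialize
    @blobs = {} # [size, hash] => content, stored once
    @files = {} # "user/path"  => [size, hash]
    @bytes_transferred = 0
  end

  # Content is only "transferred" when no blob with the same
  # size and hash is stored yet; otherwise the file just points at it.
  def upload(user, path, content)
    key = [content.bytesize, Digest::SHA256.hexdigest(content)]
    unless @blobs.key?(key)
      @blobs[key] = content
      @bytes_transferred += content.bytesize
    end
    @files["#{user}/#{path}"] = key
  end

  def read(user, path)
    @blobs.fetch(@files.fetch("#{user}/#{path}"))
  end
end

store = DedupStore.new
track = 'x' * 1_000
store.upload('alice', 'album/01.mp3', track)
store.upload('bob', 'music/01.mp3', track) # same content, nothing transferred
puts store.bytes_transferred # prints 1000
```

The second upload costs no bandwidth, which is exactly what made the original feature so attractive (and, since any client could claim to "have" a file it only knew the hash of, so easy to abuse).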

This feature was disabled after concerns were raised and some applications exploiting it appeared. Other parts of it (uploading only changed parts, similar to how rsync works) are still there, but if we were to upload MusicRage files to users’ Dropboxes, we’d still have to do at least one full upload of every file for every single user. Not the efficiency I’d been hoping for.

I don’t want to initiate a Dropbox upload for every single file added-to-dropbox by every single MusicRage customer. It takes a noticeable amount of time, so it would require background job workers and, since the files are uploaded by our servers, puts us at risk of hitting our upload bandwidth limit (which is exactly why I decided to host downloadables on S3 in the first place). It might make the dropboxy experience actually worse than manual file download. No way.

But then, a few months ago, copy_ref was introduced, which is basically the same functionality, but implemented in a secure and non-exploitable way. It’s available only via the API, and the application needs to know the exact reference hash of a file already stored on Dropbox to make a quick copy. Which is exactly what I was looking for!
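The difference from hash-based deduplication can be sketched with another toy model: a copy is only possible for someone who holds an opaque reference minted for a file that already exists. This is an illustration of the concept only; `RefStore` and all of its methods are hypothetical and say nothing about Dropbox's actual implementation.

```ruby
require 'securerandom'

# Toy model of copy references. Hypothetical names throughout,
# not the Dropbox API and not its internals.
class RefStore
  def initialize
    @files = {} # [user, path]  => content
    @refs  = {} # opaque token  => [user, path]
  end

  def put(user, path, content)
    @files[[user, path]] = content
  end

  # A reference can only be minted for a file that is really stored.
  def copy_ref(user, path)
    raise KeyError, 'no such file' unless @files.key?([user, path])
    token = SecureRandom.hex(16)
    @refs[token] = [user, path]
    token
  end

  # Whoever holds the token gets a server-side copy, with no upload involved.
  def copy_from_ref(token, user, destination)
    source = @refs.fetch(token) { raise ArgumentError, 'unknown copy_ref' }
    @files[[user, destination]] = @files.fetch(source)
  end

  def read(user, path)
    @files.fetch([user, path])
  end
end

store = RefStore.new
store.put('musicrage', 'albums/01.mp3', 'audio bytes')
ref = store.copy_ref('musicrage', 'albums/01.mp3')
store.copy_from_ref(ref, 'customer', 'MusicRage/01.mp3')
```

Unlike a content hash, the token cannot be computed from the file itself, so knowing what a file looks like is not enough to obtain a copy of it.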

Choose your poison gem

MusicRage is a Ruby on Rails application, so I began looking for a decent gem that’d wrap the Dropbox API in some nice Ruby objects.

copy_ref was introduced 6 months ago (as of writing this blog post, obviously), but the official Dropbox gem still doesn’t support it. Of course, I found that out only after some experiments, registering MusicRage as a Dropbox application and basically producing some working code.

There’s another Ruby gem for interacting with the Dropbox API, but I closed my browser tab after reading about authorizing by (de)serializing the OAuth session object. That’s not how you do OAuth and I’m not going that way.

Finally I settled on the dropbox-api gem, which not only supports the copy_ref part of the Dropbox API, but also has a more Ruby-esque library design than the official one. I was positively surprised to find out that it’s actually written and maintained by Marcin Bunsch, a guy I know (and admire).

Getting (finally) dirty

Enough fluff. I like sharing code and ideas, and I’m all out of ideas. Let’s get down to the actual Ruby code that lets a webapp like MusicRage use Dropbox API and its copy_ref part for great justice!

First, a standard setup of application key-secret pair in an initializer:

# config/initializers/dropbox.rb
# you might want to put the key and secret in a configuration file
# that's not versioned with application code
Dropbox::API::Config.app_key    = 'xxxyyyzzz'
Dropbox::API::Config.app_secret = 'aaabbbccc'
# 'sandbox' mode because designated app-exclusive directory is fine for us
Dropbox::API::Config.mode       = "sandbox"
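
One way to keep the key and secret out of versioned code, as the comment above suggests, is to read them from environment variables. A minimal sketch; the helper name and the variable names `DROPBOX_APP_KEY` and `DROPBOX_APP_SECRET` are assumptions made for this example, not something the gem prescribes.

```ruby
# Hypothetical helper: reads the Dropbox key-secret pair from the environment.
# DROPBOX_APP_KEY and DROPBOX_APP_SECRET are assumed variable names.
def dropbox_credentials(env = ENV)
  key    = env['DROPBOX_APP_KEY'].to_s
  secret = env['DROPBOX_APP_SECRET'].to_s
  if key.empty? || secret.empty?
    # Fail at boot rather than on the first API call
    raise ArgumentError, 'set DROPBOX_APP_KEY and DROPBOX_APP_SECRET'
  end
  { :app_key => key, :app_secret => secret }
end

# In the initializer this could then be used as:
#   creds = dropbox_credentials
#   Dropbox::API::Config.app_key    = creds[:app_key]
#   Dropbox::API::Config.app_secret = creds[:app_secret]
```

Passing the environment in as an argument keeps the helper easy to exercise with a plain hash.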

Before a user can ask the application to add files to his Dropbox, he needs to authorize the application with Dropbox first:

# app/controllers/dropbox_controller.rb
def authorize
  consumer = Dropbox::API::OAuth.consumer(:authorize)
  request_token = consumer.get_request_token
  session[:request_token] = request_token.token
  session[:request_token_secret] = request_token.secret
  redirect_to request_token.authorize_url(:oauth_callback => order_authorized_callback_dropbox_url(@order.access_token))
end

def authorized_callback
  consumer = Dropbox::API::OAuth.consumer(:authorize)
  request_token = OAuth::RequestToken.new(consumer, session[:request_token], session[:request_token_secret])
  access_token = request_token.get_access_token(:oauth_verifier => params[:oauth_token])
  # MusicRage doesn't have 'Users', just Orders accessed with secure URLs
  # that's why we save key-secret pair on Order, not User basis
  @order.dropbox_access_key = access_token.token
  @order.dropbox_access_secret = access_token.secret
  @order.save
  redirect_to order_download_path(@order.access_token)
end

Of course, due to copy_ref constraints, the copy_ref of every single file has to be stored after the file is uploaded to Dropbox. Here’s a part of the “Dropbox seed” script that goes through newly-added albums, asks Dropbox for a given file and saves its copy_ref in the database:

# script/dropbox_seed.rb
# album_asset is an instance of ActiveRecord model
album.tracks.each do |album_asset|
  original_filename = find_original_file_name(album_asset.asset)
  df = $client.find(relative_file_path(album, original_filename))
  cr = df.copy_ref.copy_ref
  album_asset.update_attribute(:dropbox_copy_ref, cr)
end

Now that we have copy_refs and a user willing to get these files into his Dropbox, let’s try the simplest, most straightforward way to use them (synchronously, and therefore with the optimistic assumption, for now, that all those copy_ref API calls will finish before the request times out):

# app/controllers/dropbox_controller.rb
# that's some ugly code, a proof-of-concept
def add_album
  # Check if user has no dropbox session, redirect to authorize if so
  return redirect_to(:action => 'authorize') unless(@order.dropbox_access_key.present? && @order.dropbox_access_secret.present?)

  client = Dropbox::API::Client.new(:token => @order.dropbox_access_key, :secret => @order.dropbox_access_secret)

  album = @order.pack.albums.find(params[:album_id])
  count = 0
  album.tracks(params[:format]).each do |aa|
    if(aa.dropbox_copy_ref.present?)
      client.copy_from_copy_ref aa.dropbox_copy_ref, "#{album.dropbox_directory_name}/#{aa.asset_spaced_filename}"
      count += 1
    end
  end
  flash[:notice] = "Added #{count} files to your Dropbox"
  redirect_to order_download_path(@order.access_token)
end

And… here it is. The code presented here is ugly and dates from the feature-spike stage, but it’s proven to work exactly as expected. It’s the bare minimum that works, so do yourself a favor and refactor it!
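
As one possible direction for that refactoring, the copying loop could move out of the controller into a plain Ruby object that takes the client as a dependency. This is only a sketch: `DropboxAlbumCopier`, `RecordingClient` and the `Track` struct are hypothetical names introduced here; the only call taken from the code above is `copy_from_copy_ref`.

```ruby
# Hypothetical refactoring of the loop from add_album into a plain object.
class DropboxAlbumCopier
  def initialize(client)
    @client = client
  end

  # tracks must respond to dropbox_copy_ref and asset_spaced_filename.
  # Returns the number of files copied.
  def copy(tracks, directory)
    copyable = tracks.reject { |track| track.dropbox_copy_ref.to_s.empty? }
    copyable.each do |track|
      @client.copy_from_copy_ref(track.dropbox_copy_ref,
                                 "#{directory}/#{track.asset_spaced_filename}")
    end
    copyable.size
  end
end

# Stand-ins so the sketch runs without Rails or a Dropbox account
Track = Struct.new(:dropbox_copy_ref, :asset_spaced_filename)

class RecordingClient
  attr_reader :calls

  def initialize
    @calls = []
  end

  def copy_from_copy_ref(copy_ref, destination)
    @calls << [copy_ref, destination]
  end
end

client = RecordingClient.new
tracks = [Track.new('ref-1', '01 Intro.mp3'), Track.new(nil, '02 Missing.mp3')]
copied = DropboxAlbumCopier.new(client).copy(tracks, 'Some Album')
puts copied # prints 1, the track without a copy_ref is skipped
```

With something like this in place, the controller action shrinks to building the client, calling `copy` and setting the flash message, and the same object could later be called from a background job if the synchronous version turns out to be too slow.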

Conclusion

Adding this single feature results in everyone winning:

  1. Users: improved user experience (getting an album with a single click)
  2. MusicRage: decreased operating costs
  3. Dropbox: users engaged to use Dropbox more (and maybe upgrade to a paid account)

This might sound mighty fluffy, but I’m all in for situations where some work (in this case, a few lines of code) makes everyone involved win.

If you ever build an application that provides users with large files or a multitude of files, think about how many clicks it takes to use them. Maybe there’s a service your application could leverage to improve the experience.

Oh, and if you haven’t yet, check out MusicRage.org; there’s some really awesome music there.