Writing my Python Marvel API Wrapper Part 3: Refactor

This is part three of my ongoing adventure in writing a Python interface to the Marvel API. If you’ve not read Part 1 about structure and Part 2 about adding requests then you may want to do that first.

So we left off with a functioning interface for the Marvel API, but ended up with the realisation that things could be so much simpler if we just refactor all the things.

About refactoring

By refactoring code, you’re essentially restructuring without changing the function. The main purpose is to improve maintainability of your code by, for example, splitting up long processes into smaller subprocesses or working to remove duplication.

In my case, we’ll be getting rid of some of the duplication that is in the code. As we’ve seen, a lot of my methods were building the exact same request URL and so that’s where we’ll start.

createRequestURL()

We can reduce a significant amount of duplicate code by handling the request URL in its own little method function, like getHash().

Step 1: Create a basic request URL

First, we’ll move the code to create the base request URL over to the new function.

def getRequestURL(self):
  request_url = self.API_URL + '?apikey=%s' % self.API_KEY
  return request_url

Step 2: Move the timestamp and hash generation

Next, we’ll have the function handle the timestamp and hash generation so that we don’t have to do so in each function separately.

def getRequestURL(self):
  request_url = self.API_URL + '?apikey=%s' % self.API_KEY

  ts = str(time.time())
  reqHash = self.getHash(ts, self.PRIV_KEY, self.API_KEY)
  request_url += '&ts=%s&hash=%s' % (ts, reqHash)

  return request_url

Step 3: Move additional parameters

Finally, we need to move the additional parameters like id, args, and subtypes. This can get a little tricky, because we don’t always need to use them. Therefor, we need to see if they’re being passed to the function.

So for id this would be something like:

def getRequestURL(self, id=None):
  request_url = self.API_URL

  if id:
    request_url += '/%s' % id

  ts = str(time.time())
  reqHash = self.getHash(ts, self.PRIV_KEY, self.API_KEY)

  request_url += '?apikey=%s&ts=%s&hash=%s' % (self.API_KEY, ts, reqHash)

  return request_url

The subType parameter can be added to the function in pretty much the same way and the arguments can be handled in the exact same way as they are now. The end result of this refactor is the following function:

def getRequestURL(self, id=None, subType=None, args={}):
  request_url = self.API_URL

  if id:
    request_url += '/%s' % id

  if subType:
    request_url += '/%s' % subType

  ts = str(time.time())
  reqHash = self.getHash(ts, self.PRIV_KEY, self.API_KEY)

  request_url += '?apikey=%s&ts=%s&hash=%s' % (self.API_KEY, ts, reqHash)

  for arg in args:
    request_url += "&%s=%s" % (arg, args[arg])		

  return request_url

Having moved all this code to a single function, our methods are now much, much cleaner. Check it out:

def getList(self, args):
  request_url = self.getRequestURL(args=args)

  return requests.get(request_url)

def getOne(self, id):
  request_url = self.getRequestURL(id)

  return requests.get(request_url)

def getCharacters(self, id, args):
  request_url = self.getRequestURL(id=id, subType='characters', args=args)

  return requests.get(request_url)

In the next part

We have a very limited amount of requests to this API. Which, I believe, is around a 1000 when you start out. My account is currently at 3000 requests per day. Generally speaking though, even if you don't have a request limit with an API it's a good idea to cache your requests. You really shouldn't need to make repeated requests to the API for the same data. I'm considering implementing this into a Django app so that we can make use of its Model system to store data we request for a month or more before requesting it again.

I may also consider reorganising the code so that you could potentially do something like:

stan_lee = marvel.creators.getOne(id=30)
stan_lee.getComics()

The options for improvement are almost endless at this point.

As always, you can find this stuff on Github. If you want to let me know how much this stuff sucks, I'm on Twitter of fill out this form.

Writing my Python Marvel API Wrapper Part 2: Requests

With my structure largely in place I was now ready to add functionality.

To make things a little easier on me, I’ve decided to use the wonderful Requests Python library. It makes it significantly easier to make requests and handle responses, because it does all the heavy lifting for you.

Discoveries

I quickly discovered there’s more to making these requests than that. Because I’m technically building an app, I’m required to also send along a few other things:

  • A unique identifier, like a timestamp
  • An MD5 hash of the API key, private key, and timestamp

Luckily, Python has built in date and encoding libraries so tacking on these extra requirements is pretty straight forward:

import hashlib
import time

It also means we’ll need to adjust our main object so that it can accept the private key that is provided by Marvel.

class MarvelAPIObject(object):
  BASE_URL = "http://gateway.marvel.com/v1/public"
	
  def __init__(self, apikey=None, privkey=None):
    self.API_KEY = apikey
    self.PRIV_KEY = privkey

I’ve placed my hash generator in a separate method, to make it easy to re-use. Which looks something like this:

def getHash(self, ts, priv_key, pub_key):
  return hashlib.md5(ts+priv_key+pub_key).hexdigest()

The main reason I’m not generating the timestamp in this function is because the timestamp also needs to be sent along with the API request. As such, it needs to be the exact same and it’s easier to deal with by simply generating it where I also create the URL.

Building the request URL

def getList(self, args):
  request_url = self.API_URL + '?apikey=%s' % self.API_KEY
  ts = str(time.time())

    request_url += '&ts=%s' % ts
    request_url += '&hash=%s' % self.getHash(ts, self.PRIV_KEY, self.API_KEY)

    for arg in args:
      request_url += "&%s=%s" % (arg, args[arg])

    return requests.get(request_url)

There are a few things to note here. As mentioned before, I’m generating the timestamp in this method so that I can easily append it and use it in the hash generator.

I’m also going through the arguments and simply appending the key:value pairs to the URL. As a future improvement I’d probably need to make sure that they are valid to avoid exploits of some sort.

And finally, it’s simply returning whatever the response is of the request. This means that you get a request object that will have a status code and response text for you to handle. If everything went well, your response text will be JSON and the status code will be 200.

Next: Refactor, refactor, refactor

At this point you may have already noticed that we’ve got all these methods that do the exact same thing and only differ slightly in their endpoints.

While the application works in this state, it’s probably a good idea to start refactoring the code and rip out all the repetition to make this thing easier to maintain in the future.

This will be an undertaking that I’ll describe in a future post. Until then, feel free to fork my stuff on Github or let me know what you think so far either by filling out this form or passing me a message on Twitter.

Writing my Python Marvel API Wrapper Part 1: Structure

A little while ago, Marvel released an API that allows you to get information on just about anything related to Marvel comic books. Be it a first appearance of a character, what series a specific writer was responsible for, or which characters were part of specific events in the Marvel universe, you can most likely get it from their API.

I felt this API would be the perfect opportunity to hone my skills, learn, and have a lot of fun doing it.

In this multi-part series I’ll attempt to document my road to a completed API wrapper in Python and give you insight in the how and why. You can follow my code’s progress on Github. You may also want to keep the API documentation handy.

So let’s start with the first part, setting up the project and structuring the classes.

Setting up

Starting out, I knew I had to create at least one model that would deal with the connection to the API, and then create models for each type(Characters, Comics, Creators, Events, Series, Stories).

At this point, my folder structure looks something like this:

|-- marvelapi
|   |-- api
|   |   |-- __init__.py
|   |   |-- characters.py
|   |   |-- comics.py
|   |   |-- creators.py
|   |   |-- events.py
|   |   |-- series.py
|   |   |-- stories.py
|   |-- marvel_api.py

My main marvel_api class looked like this:

class MarvelAPIObject(object):
  BASE_URL = "http://gateway.marvel.com:80/v1/public"

  characters = MarvelCharacters(BASE_URL)
  comics = MarvelComics(BASE_URL)
  series = MarvelSeries(BASE_URL)
  events = MarvelEvents(BASE_URL)
  creators = MarvelCreators(BASE_URL)
  stories = MarvelStories(BASE_URL)

There’s a few things going on here:

First, it’s setting a BASE_URL variable as part of the main object you’d be initiating. Second, it creates a number of properties with which you can request the API with. The broad syntax becomes as simple as something like marvelobject.characters.method or marvelobject.events.method after you initialise the object once.

The API key

In order to use the API, you need to sign up for an account and obtain an API key. This API key is then used in your request to the API. For example: http://gateway.marvel.com:80/v1/public/characters?name=Spiderman&limit=50&apikey=whatever-your-api-key-is.

This posed a slight problem. How was I going to pass my API key around the different classes I had created? The solution, as it turns out, is simple. Store it in my MarvelAPIObject and pass it over to the initiation of each type.

class MarvelAPIObject(object):
    BASE_URL = "http://gateway.marvel.com:80/v1/public"

  def __init__(self, apikey=None):
    self.API_KEY = apikey

    self.characters = MarvelCharacters(self.API_KEY, self.BASE_URL)
    self.comics = MarvelComics(self.API_KEY, self.BASE_URL)
    self.series = MarvelSeries(self.API_KEY, self.BASE_URL)
    self.events = MarvelEvents(self.API_KEY, self.BASE_URL)
    self.creators = MarvelCreators(self.API_KEY, self.BASE_URL)
    self.stories = MarvelStories(self.API_KEY, self.BASE_URL)

This way, you’re able to initiate the API wrapper in the following manner and continue using the API without too much issue as described before:

marvel = MarvelAPIObject(API_KEY)
character = marvel.characters.getOne(60)

So what’s going on in this class specifically now? The __init__ method is executed the moment you initiate the MarvelAPIObject. By defining this method function you can also define what values it can take and also a default for these values. In this case, it takes the default self variable(which refers to the initiated object and allows you to access the local object scope) and an apikey which is set to “None” by default.

Fleshing out

With the base structure worked out, we’re ready to flesh out the methods for the different classes. When you look at the API documentation, it’s fairly easy to see that the methods for each type are very similar. It’s this realisation that led to creating a parent class that all other classes would inherit from.

Since all I’m going to be doing is making requests and returning the response, it’s in the best interest to keep the code as simple as possible and not repeat yourself. For example, you can get the events for comics and creators. The main difference is the fact that you’d request one by /comics/{id}/events and the other by /creators/{id}/events.

So instead of creating methods for each request you can do by type, the parent class contains all the methods and the individual classes now only return a message when what you’re trying isn’t possible.

As an example:

# from api/parent.py
class MarvelParent(object):
  def getList(self):
      pass        

  def getOne(self, id):
    pass

  def getCharacters(self, id):
    pass

  def getComics(self, id):
    pass

  def getCreators(self, id):
    pass

  def getEvents(self, id):
    pass

  def getSeries(self, id):
    pass

  def getStories(self, id):
    pass

# from characters.py
from parent import MarvelParent

class MarvelCharacters(MarvelParent):

  def __init__(self, apikey, base_url):
    self.API_URL = base_url + '/characters'
    self.API_KEY = apikey

  def getCharacters(self, id):
    return "Not a valid method for Characters, use getOne or getList instead"

  def getCreators(self, id):
    return "Not a valid method for Characters"

This snippet shows my parent class and all its methods to request single items and lists of items. For the time being, each of these methods is just defined and doesn’t actually do anything(pass).

The MarvelCharacters class inherits all of these methods, whether they’re actually available in the API or not. To get around this, I’m re-defining the methods that aren’t available and simply returning a basic message with a suggestion for how to use it instead. This way, as mentioned before, I don’t have a getOne method for each type that, really, does the same thing and is coded exactly the same.

Another thing to note about this class is that it, too, has an __init__ method defined. In here, the api key is processed and the type specific API_URL is created.

Finishing up

With my classes worked out and methods in place, it was time to build the actual request URLs for each of these methods and allow for variable arguments to be passed.

You may have already noticed that we’ve defined a BASE_URL in the main MarvelAPIObject class. You may even have already noticed that we’re using this base URL to define an API url for the type. This will be the starting point for our API request.

I modified my methods to accept an arguments dict. Simply because while the request is similar for each type, the arguments may not be. Doing it this way allows for the greatest amount of flexibility in use and it means I won’t have to modify my code should any of my wrapper code should any of the argument’s names change.

Building the URL now becomes very simple:

def getList(self, args):
  request_url = self.API_URL + '?apikey=%s' % self.API_KEY

  for arg in args:
    request_url += "&%s=%s" % (arg, args[arg])

  return request_url

Basically, we’re using the API_URL(Which is the BASE_URL + /type) as the base. We then append the API key as the first argument. And finally, we then loop through each argument that is passed and append it to the URL. For testing purposes, I’m returning the created URL so I can simply print the result and check if it’s correct.

What’s Next?

So now we have the basics set up, we’ll need to actually turn those URLs we’ve made into requests and we’ll need to account for errors. This will be the subject of the next part.

Writing this post, I’ve realised I could probably make some improvements. For example, I could include the api key in the API_URL at the initalisation of the object.

Video APIs and mobile devices

When developing websites it’s inevitable that you’ll come across the need to implement videos of some sort, and to save your own server’s bandwidth you’re using a video provider like YouTube or Vimeo.

Now, it’s easy to just drop in the embed code where it’s needed. And there’s nothing wrong with that. However, embed a large amount of videos and you’re likely going to see slow downs in your page loads.

The simplest way to deal with this is to leverage the power of their respective APIs to load videos on demand. And when doing so, it may be useful to have these videos play automatically as soon as they’re loaded too.

While this works fine on desktop computers, I’ve recently found out that this won’t work very well on mobile devices. Trying to dynamically load a video and then using JavaScript to automatically start playing it will result in the video box remaining black and not responding to anything. Removing the auto play part of the script allows the video to load and will allow the user to hit play on it manually.

So why does this happen? Well, as it turns out this is somewhat intentional. In my quest to fix this issue when it occurred during development, I stumbled across this Safari HTML5 Audio and Video Guide. This article blatantly gives the following, and very valid, reason:

Warning: To prevent unsolicited downloads over cellular networks at the user’s expense, embedded media cannot be played automatically in Safari on iOS—the user always initiates playback. A controller is automatically supplied on iPhone or iPod touch once playback in initiated, but for iPad you must either set the controls attribute or provide a controller using JavaScript.

In short: to prevent things from hogging costly data, embedded media won’t play until the user tells it to. While I can’t find any specific documentation for Android, one can only assume that it’s treated the same way for the same reasons.

A good compromise is to verify whether you’re on one of these devices, before calling your play() function on your video.

Disclaimer: I don’t usually condone the use of browser/device sniffing. It’s a bad habit and there’s really no way to guarantee it’s future-proof. However in this case, I’m not sure if there’s an alternative, because checking the window’s width alone is just as wibbly-wobbly.

To do so, all you really need to do is something similar to this(Sorry people on 800×600 screens, if you’re still there!):

if ( !navigator.userAgent.match(/(iPhone|iPod|iPad|Android|BlackBerry)/) || 
     jQuery(window).width() > 1024 ) {
        // Whatever play function you have
}

And that’s it. It won’t be running any code if the device’s reported user agent contains iPhone, iPod, iPad, Android, or BlackBerry, or if the width of the screen is smaller than 1024px(Decided on this value because of tablets in landscape mode, as I said, width alone is just as wibbly-wobbly and unreliable).