Backups are as easy as 1-2-3

I like Leo Laporte’s idea of a 1-2-3 backup solution.

  1. Local Copy – I keep anything local that I need to access quickly. Every time I format the hard drive or buy a new computer, I only copy across the bare essentials. Everything else, I grab from the external drive when needed.
  2. Cloud Copy – I use Backblaze for unlimited backup including external drives!
  3. Different Hard Drive – I use a simple rsync script, triggered nightly by a cron job, to back up to my external hard drive (which I plug in every night). The rsync script goes something like this:
#!/bin/sh
# -a (archive mode) already implies recursive copying
rsync --progress -a --exclude="*.Trash*" --exclude="Downloads/*" /Users/matt/ "/Volumes/laptop-2tb/matt-2013/"

Pretty simple, hey?

Using Google Maps API

I’ve just started playing with the Google Maps JavaScript API for a web version of Hpflsk. So far, so good, but there are a few little quirks that left me a bit baffled. I’ve also thrown in some examples that fit our use case.

Include the Libraries You Need

Firstly, if you jump directly into the API Reference: wherever you see the word library after a heading in the table of contents, e.g. Places Library, you need to make sure you load that library when you load the API.

https://maps.googleapis.com/maps/api/js?key=XXXXXXX&sensor=false&libraries=places

I spent way too long in the JavaScript console typing google.maps.pl… and hitting the tab key, unable to work out why places was missing!

Geocoding versus Places Search

Geocoding is the process of turning an address – e.g. 123 Fake Street – into a geographic point (latitude/longitude). If you want to search for a location by name – e.g. Lucky Donut Shop – similar to what you can do in the search box on Google Maps, you need to use TextSearchRequest in the Places Library. Here’s how I implemented a text search for places near the user’s current location:

// HTML
...
<div id="map-modal-inner"></div>
...
<script type="text/javascript" src="https://maps.googleapis.com/maps/api/js?key=XXXXXXX&sensor=false&libraries=places"></script>

// JS
// Find the venue location by searching on the venue name
// Get client's location using HTML5 navigator.geolocation
var venueLocation;
navigator.geolocation.getCurrentPosition(function(position) {
  var latitude = position.coords.latitude;
  var longitude = position.coords.longitude;
  var myLocation = new google.maps.LatLng(latitude, longitude);
  // Search for places nearby and take the first result
  // First make a temporary map object
  var mapOptions = {
    center: myLocation,
    zoom: 16,
    mapTypeId: google.maps.MapTypeId.ROADMAP
  };
  var map = new google.maps.Map(document.getElementById("map-modal-inner"), mapOptions);
 
  // Then search for places matching the venue name
  var places = new google.maps.places.PlacesService(map);
  var placesRequest = {
    query: "Lucky Donut Shop",
    location: myLocation,
    // favour results within 50km of current location
    radius: '50000'
  };
  
  places.textSearch(placesRequest, function(results, status) {
    if (status == google.maps.places.PlacesServiceStatus.OK) {
      // use the lat()/lng() accessors rather than the minified internal
      // property names (jb/kb), which change between API releases
      var loc = results[0].geometry.location;
      venueLocation = new google.maps.LatLng(loc.lat(), loc.lng());
    }
  });
});
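
For completeness, plain geocoding doesn’t need the JavaScript API at all – the Geocoding web service returns JSON over HTTP. Here’s a rough server-side sketch in Python (the helper names are mine, and it assumes you have a valid API key):

```python
import json
import urllib.parse
import urllib.request

GEOCODE_URL = "https://maps.googleapis.com/maps/api/geocode/json"

def parse_geocode_response(payload):
    # The Geocoding API returns {"results": [...]}; take the first hit
    results = payload.get("results", [])
    if not results:
        return None
    loc = results[0]["geometry"]["location"]
    return (loc["lat"], loc["lng"])

def geocode(address, api_key):
    # Forward geocoding: "123 Fake Street" -> (latitude, longitude)
    query = urllib.parse.urlencode({"address": address, "key": api_key})
    with urllib.request.urlopen("%s?%s" % (GEOCODE_URL, query)) as resp:
        return parse_geocode_response(json.load(resp))
```

If the address can’t be resolved, the sketch returns None rather than raising, which suits a “best effort” venue lookup.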

Django + Elastic Beanstalk

Hpflsk has been playing around with Django and AWS Elastic Beanstalk. We’ve hit a few roadblocks along the way, but so far we’re up and running and everything seems to be fine.

Just to summarise our current situation:

  • We have 1 Android and 3 iOS developers working hard on version 2 of the app
  • These developers and I (occasionally) are the only people hitting the new API
  • The new version of the app will be hitting the new AWS Elastic Beanstalk setup

Here are a few roadblocks and successes we hit along the way:

  • Set up using this excellent guide from Grigory Kruglov, but in each of the .config files, change commands: to container_commands:. This cost me a few frustrating and headbangingly awful days
  • The official docs get you started with a simple example, but in my opinion are really lacking
  • The whole config system is whack. The options are spread across two separate files, and one of them seems to update from AWS at random times. Uhh, it sucks and leads to so many headaches.
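
For reference, the container_commands change amounts to something like this in each .ebextensions config file (a rough sketch – the file name and the command are placeholders for whatever your app actually needs):

```yaml
# .ebextensions/01-django.config (hypothetical file name)
# commands: run before the app bundle is extracted;
# container_commands: run after, so manage.py is actually available
container_commands:
  01_syncdb:
    command: "python manage.py syncdb --noinput"
    leader_only: true
```

leader_only makes the command run on only one instance in the environment, which is what you want for things like database migrations.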

One thing I wanted to talk about is cost. There’s a guide on the Amazon site that estimates costs at something ridiculous – in the thousands of dollars – way outside the range of a lean startup. You definitely want to monitor your expenses using a CloudWatch alarm.
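
Setting one up can be scripted too. A sketch with boto3 (the function name is mine; billing metrics only exist in us-east-1, and the threshold and SNS topic are assumptions):

```python
def billing_alarm_params(threshold_usd, topic_arn):
    # CloudWatch billing alarms watch the AWS/Billing EstimatedCharges
    # metric, which Amazon publishes in USD in the us-east-1 region
    return {
        "AlarmName": "billing-over-%d-usd" % threshold_usd,
        "Namespace": "AWS/Billing",
        "MetricName": "EstimatedCharges",
        "Dimensions": [{"Name": "Currency", "Value": "USD"}],
        "Statistic": "Maximum",
        "Period": 21600,  # 6 hours; billing data updates slowly
        "EvaluationPeriods": 1,
        "Threshold": float(threshold_usd),
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [topic_arn],
    }

# Usage (requires boto3 and an SNS topic to notify):
# import boto3
# cw = boto3.client("cloudwatch", region_name="us-east-1")
# cw.put_metric_alarm(**billing_alarm_params(50, "arn:aws:sns:us-east-1:123456789012:billing"))
```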

So far, Hpflsk has been charged around $50 AUD a month with a grand total of only 5 people hitting the API regularly – and that includes the free tier. This may not seem like much, but it definitely discourages developers from creating a bunch of apps and seeing what sticks – which seems to be the lean startup model. So if you want to keep everything super lean and mean, you’re better off with a VPS (~$100 AUD per year). It’s not that much of a pain in the ass to set up – took me around a day – and can be used to host multiple projects.

Instagram user IDs

I’ve been playing with the Instagram API a little bit today and hit a brick wall trying to pull user data.

So here’s the deal: username and user ID are not the same in Instagram-land. For example, in API speak the user ID 3 is allocated to kevin (founder of Instagram). But in Instagram profile land, the username 3 belongs to some completely different dude.

There’s no easy way to translate between these two identifiers, though a couple of API calls may help. Also, some dudes on Stack Overflow have created some nice tools to do this.
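
For what it’s worth, here’s a sketch of the username-to-ID direction using the v1 users search endpoint (the helper names are mine, it assumes you have a valid access token, and the v1 API has since been retired):

```python
import json
import urllib.parse
import urllib.request

def match_user_id(payload, username):
    # Search responses look like {"data": [{"id": "3", "username": "kevin", ...}]};
    # search matches loosely, so filter for the exact username
    for user in payload.get("data", []):
        if user.get("username") == username:
            return user["id"]
    return None

def lookup_user_id(username, access_token):
    query = urllib.parse.urlencode({"q": username, "access_token": access_token})
    url = "https://api.instagram.com/v1/users/search?" + query
    with urllib.request.urlopen(url) as resp:
        return match_user_id(json.load(resp), username)
```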

Confused? Yeah, I sure was.

Logging and Application Performance Management Options

Disclaimer: I’m a developer fumbling around in the world of operations. In short, I’m trying to become only as much of a devops guy as necessary.

I have deployed the hpflsk backend, a Python django app, to an Amazon Elastic Beanstalk (horrible name, pretty good product) instance, which has a generous free tier. Elastic Beanstalk instances have the option to roll, zip and send log files to S3. This is all well and good, but what do we do with all those log files? Download, unzip and trawl through them by hand? No way.

So I wanted to find an easy way to index, analyse and store log files which makes it easy for us to find errors, monitor performance and analyse application usage. Here are the current contenders:

  • Sentry: Cloud-based or self-hosted open source exception tracking. Integrates with django and web frontends, amongst other things. Free trial 100 events/day, 7 day history
  • Loggly: Cloud-based log collection and analysis solution. Free tier 200MB/day of log data and 1 week of log retention. Integrates easily with django
  • Papertrail: Another cloud-based log collection and analysis solution. Free tier 100MB/day data, 1 week retention, 48 hours of search. Can integrate with django
  • Logstash: Self-hosted open source log collection and analysis. Awesome logo! Can centralize log collection and visualize using the built-in web UI or send to Graphite and view through its web interface
  • Graphite with Statsd: Self-hosted open source statistics collection and scalable realtime graphing system. Statsd integrates with django and can send events to Graphite
  • New Relic: Cloud-based user (web/app frontend), application and server monitoring. Basic plan is free. Integrates with django
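
As a taste of how lightweight the Statsd route is: its wire format is just plain text over UDP, so a minimal counter client (a sketch, assuming a statsd daemon on the default port 8125) is only a few lines:

```python
import socket

def counter_packet(metric, value=1):
    # statsd counters are plain text: "<metric name>:<value>|c"
    return ("%s:%d|c" % (metric, value)).encode("ascii")

def incr(metric, host="127.0.0.1", port=8125):
    # Fire-and-forget UDP; if no daemon is listening the packet is simply dropped
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.sendto(counter_packet(metric), (host, port))
```

That fire-and-forget property is the appeal: instrumenting the app can’t slow it down or take it out.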

Off we go. May the best man(agement software) win!

Setting the timezone for Facebook test users

Another (see below) problem I’ve had with Facebook test users is setting the timezone. The user’s timezone is returned by a query to the Facebook Graph API: https://graph.facebook.com/me

Here’s an example result:

{
  "id": "xxxxxxxxxx", 
  "name": "Eric Emo", 
  "first_name": "Eric", 
  "last_name": "Emo", 
  "link": "xxxxxxxxxx", 
  "location": {
    "id": "108252552529332", 
    "name": "Perth, Western Australia"
  }, 
  "gender": "female", 
  "timezone": 0, 
  "locale": "en_US", 
  "updated_time": "2012-11-20T02:52:31+0000"
}

We want the timezone to be set to Perth (WST), which is +0800, or 8 hours ahead of UTC. According to Facebook help, the timezone is set when the user logs in. So how do we set it for our test users? Simple.

  • Ensure the timezone of your computer is set to your desired timezone
  • Log out of your developer Facebook account
  • Log in using the email provided on the App Roles page – something like eric_xxxxxx_emo@tfbnw.net – and the password provided when you created your test user (or you can reset the password to something simple on the Roles page)
  • You may need to do this a couple of times to get it to “stick” (don’t ask me why!)

Make another call to /me on the Graph API Explorer (make sure you use your test user’s access token) and you should see the updated timezone field.
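
One gotcha: the timezone field is a raw offset in hours from UTC (including half-hour zones like India’s). A small helper (hypothetical, just for sanity-checking) to render it in the familiar ±HHMM form:

```python
def graph_timezone_to_offset(hours):
    # Facebook's "timezone" field is hours from UTC, e.g. 8 for Perth (WST)
    sign = "+" if hours >= 0 else "-"
    whole = int(abs(hours))
    minutes = int(round((abs(hours) - whole) * 60))
    return "%s%02d%02d" % (sign, whole, minutes)

# graph_timezone_to_offset(8) -> "+0800" (Perth)
```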

Playing with Facebook test users

I’ve been working on the second iteration of the hpflsk backend and have decided to follow best practice and create test Facebook users to play around with (before, we were using our personal accounts as test accounts and causing all sorts of mayhem). The backend is, again, implemented in django – the web framework for hipsters with deadlines.

My original plan was to create a new batch of test users for every test (in the setUp process) and delete them again after each test (during tearDown). However, I found that one of the limitations of the Facebook API is that not all user fields are updatable – notably location and timezone. To get around this, I decided to create 10 static test users and manually update their details by logging in (through the Roles section of the app development page). Hopefully I only need to do this once. Here’s the code to create the test users:

from django.conf import settings
import requests
import json

def create_test_user(name, locale='en_US', permissions=[]):
    r = requests.get("https://graph.facebook.com/%s/accounts/test-users"
            % settings.FACEBOOK_APP_ID,
            params={
                'installed': 'true',
                'name': name,
                'locale': locale,
                'permissions': ','.join(permissions),
                'method': 'post',
                'access_token': settings.FACEBOOK_APP_TOKEN,
                })
    return json.loads(r.text)

def delete_test_user(id, access_token):
    # note: no trailing "?" on the URL; requests appends the params itself
    r = requests.get("https://graph.facebook.com/%s" % id,
            params={
                'method': 'delete',
                'access_token': access_token,
                })
    return r.text


Another limitation I found with Facebook test users is that you can’t extend the life of their access tokens using the method described here. By default, test users are issued with a short-term (1 hour) access token. But I want these users to be accessible by my test suite for all eternity! Here’s the trick: every time you fetch the list of your test users, their access tokens are refreshed for another hour. So all you need to do is follow the process here (see the Accessing section) and you’ll get a list of all your test users’ IDs and updated access tokens. You can then update your test users with the latest tokens like so (I use the brilliant django-social-auth to handle all my social logins):

from django.conf import settings
from social_auth.models import UserSocialAuth
import json
import requests

def refresh_test_users():
    r = requests.get("https://graph.facebook.com/%s/accounts/test-users" % settings.FACEBOOK_APP_ID,
            params={
                'access_token': settings.FACEBOOK_APP_TOKEN
            })
    j = json.loads(r.text)
    for u in UserSocialAuth.objects.all():
        # match each stored social auth record to its freshly-issued token
        matches = [i for i in j['data'] if i['id'] == u.uid]
        if matches:
            u.extra_data['access_token'] = matches[0]['access_token']
            u.save()


I run this refresh script every time my test suite starts so that I have a fresh set of access tokens before I start the integration tests with Facebook.

A Snake and a Plane

Google Appengine + Django


Getting started on a new project can be tough, especially when you have little to no idea about 80% of the technologies involved. As I enjoy making life difficult for myself, I’ve decided to tackle a new web venture from the top down – in other words, tackling some of the big software questions before writing a single line of code – and I think I’m almost there. Hopefully this can act as a template for future web projects.

The Big Decisions

So here’s a brief rundown of the buzz words I was looking for in the platform:

  • Cheap! (not really a buzz word) – I want this site to be pretty much free to get up and going
  • Scalable – The platform should scale gracefully (both in cost and capacity) with the amount of traffic
  • Portable – The ability to pack up our toys and go elsewhere
  • Open – In a number of ways: To be able to clearly communicate about the technologies involved without giving away any trade secrets, to leverage as much open source as possible to begin with and hopefully be able to give back to the open source community as much as possible
  • Simple! – Stick to standard technologies so that getting up and running with development is (should be) a breeze

At the end of the day I was won over by Appengine (on pricing, scalability and portability), Django (on openness) and Python (on simplicity). To complement this trio of nerd hipsterness I’ve gone with a GitHub private repo (some of which will become public) for development, Virtualenv (to keep Python dependencies in check) and, most importantly, Django-nonrel and the related apps over at allbuttonspressed (to give a portable wrapper over Google Appengine’s database backend).

For the front end, I’m thinking jQuery and CoffeeScript (a nicer and hopefully more familiar syntax for writing JavaScript).

If it all goes pear-shaped, at least this post has been deliciously trendy.