
Wednesday, August 31, 2011

Setting up Postgres on Ubuntu 10.04

We've just migrated our Django build server to a new box, and I'm currently installing all the necessary dependencies. Even though I've done this many times, I still keep forgetting the steps to do it properly, so this blog post will document the steps for future reference.

PostgreSQL

Installing postgres is a major pain, because there are so many annoying small things to do to get it working properly.

First, before you do anything you want to ensure that you set up your locale to UTF-8.
locale-gen en_US.UTF-8

update-locale LANG=en_US.UTF-8
If you forget to do this and it's not set up by default, postgres will create its cluster with the SQL_ASCII encoding, which is probably not what you want. If you still forget (yes, I forgot) and you haven't put in any data yet, you can drop and recreate your postgres cluster. Dropping the cluster deletes every database in it, so only do this on a fresh install.
pg_dropcluster --stop 8.4 main

pg_createcluster --start -e UTF-8 8.4 main
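If you want to double-check which locale is actually in effect before installing, a few lines of Python will do. This is only a rough sketch: the function names are made up for this post, and it assumes the usual POSIX precedence (LC_ALL, then LC_CTYPE, then LANG) for the character encoding.

```python
import os


def effective_ctype_locale(env):
    """Return the locale name that governs character encoding.

    Uses POSIX precedence: LC_ALL overrides LC_CTYPE, which overrides LANG.
    """
    for name in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(name)
        if value:
            return value
    return "C"  # POSIX default when nothing is set


def is_utf8_locale(locale_name):
    """True if the codeset part of a name like 'en_US.UTF-8' is UTF-8."""
    if "." not in locale_name:
        return False
    # Strip an optional modifier such as '@euro' after the codeset
    codeset = locale_name.split(".", 1)[1].split("@", 1)[0]
    return codeset.replace("-", "").lower() == "utf8"


if __name__ == "__main__":
    current = effective_ctype_locale(os.environ)
    print(current, "OK" if is_utf8_locale(current) else "not UTF-8")
```

If this reports anything other than a UTF-8 locale, run the locale commands above before installing postgres.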
Okay, with that out of the way..

Installing Postgres

The latest version of Postgres on Ubuntu 10.04 is Postgres 8.4
sudo apt-get install postgresql-8.4
You'll also need the dev package to compile psycopg2 later
sudo apt-get install libpq-dev
And don't forget the python dev packages to compile psycopg2
sudo apt-get install python-dev
Setting up Postgres

By default postgres is configured to use your system users for authentication. If you want to use a specific user/password combination, you'll need to change this.

Open up /etc/postgresql/8.4/main/pg_hba.conf and change
local all all ident
to
local all all md5
Then restart postgres.
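If you set up boxes often, the same edit can be scripted. Here is a minimal sketch that only transforms the text of the file; the function name is made up, and reading and writing the real file (as root) and restarting postgres are left to you. It deliberately leaves commented-out lines and the separate "local all postgres ident" line alone.

```python
import re

# Matches an uncommented line of the form: local  all  all  ident [options]
LOCAL_IDENT = re.compile(
    r"^([ \t]*local[ \t]+all[ \t]+all[ \t]+)ident\b.*$", re.MULTILINE
)


def use_md5_for_local(hba_text):
    """Return pg_hba.conf text with 'local all all ident' switched to md5.

    Anything after 'ident' on that line (ident-specific options) is dropped,
    since those options do not apply to md5.
    """
    return LOCAL_IDENT.sub(r"\1md5", hba_text)
```

Keep a backup of the original file before overwriting it, since a broken pg_hba.conf can lock you out.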

Create a postgres user and database

These are the two commands for creating a user and database. If you intend to run the Django unit tests, then don't forget to give the CREATEDB permission to the user.
CREATE USER username WITH PASSWORD 'password' CREATEDB;

CREATE DATABASE db_name OWNER username ENCODING 'UTF-8';
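On the Django side, the user and database created above go into the DATABASES setting. A minimal sketch for a Django 1.3 settings.py, where every value is a placeholder matching the SQL above:

```python
# settings.py fragment (Django 1.3 style); all values are placeholders
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql_psycopg2",
        "NAME": "db_name",       # the database created above
        "USER": "username",      # the user created above
        "PASSWORD": "password",
        "HOST": "",  # empty string: connect over the local Unix socket
        "PORT": "",  # empty string: use the default port
    }
}
```

With HOST left empty the connection goes over the local socket, which is the connection type covered by the "local all all md5" line in pg_hba.conf.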
Compiling psycopg2

You should be able to pip install psycopg2. Note that it needs to be compiled, so you should have the build-essential package installed beforehand.
sudo apt-get install build-essential

pip install psycopg2
There is a gotcha here: psycopg2 version 2.4.2 is NOT compatible with Django 1.3. Either use the trunk version of Django or use psycopg2 version 2.4.1.
pip install psycopg2==2.4.1
Troubleshooting
  1. make or gcc is not found: install the build-essential package
  2. postgres gives the error "ERROR: new encoding (UTF8) is incompatible with the encoding of the template database (SQL_ASCII)": You haven't set the locale properly before installing postgres. See the top of the post for setting the locale
  3. Django gives the error "Got an error creating the test database: permission denied to create database": If you want to run unit tests, the user must have the CREATEDB permission
  4. postgres gives the error FATAL: Ident authentication failed for user "username": You need to edit the pg_hba.conf file to turn off ident authentication and set it to md5 instead
  5. psycopg2 gives the error 'PyType_GenericAlloc' undeclared (first use in this function): Install the python-dev package
  6. psycopg2 gives the error pg_config: command not found: Install the libpq-dev package

Monday, August 01, 2011

My Django Dash 2011 experience

About Django Dash

Django Dash is an online, weekend Django hackathon. You get 48 hours to develop a Django app of some sort, from scratch.

Apart from being a lot of fun, it's a great way to squeeze in a bunch of learning over a short period of time. Kausik and I set out to build a badge generation application.

(View the site here - Make My Badge)

The idea

A lot of people probably know about the badge generation python script. This is a script that takes a list of people and a badge template, and generates badges for all the attendees. The badges can be printed out and laminated beforehand for attendees to pick up at the event registration.

The badge design itself is based on a simple principle (surprisingly violated by a huge number of events!) - keep the name large and easy to read from across the hall. We wanted to avoid badges where the name of the event is prominent but the name of the person is tiny. I'm also not a fan of handwritten badges. Not only do they look ugly, but the scribbling is rarely readable.

We also wanted to auto-scale the font size based on the length of the name, and to split long names into two lines so that we could use larger font sizes. These badges have been used in a number of events in Chennai, and have always been popular, with some people even preserving them as souvenirs. (See the badge in use)
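To give an idea of the name layout logic, here is a rough sketch in plain Python. It is not the code from the actual script: the function names and all the numbers (12 characters before splitting, a 72 point base size, and so on) are made up for illustration, and it measures names in characters rather than rendered pixel width.

```python
def split_name(name, max_chars=12):
    """Split a long name into two lines at the space that best balances them."""
    words = name.split()
    normalized = " ".join(words)
    if len(normalized) <= max_chars or len(words) < 2:
        return [normalized]
    best = None
    for i in range(1, len(words)):
        first, second = " ".join(words[:i]), " ".join(words[i:])
        longest = max(len(first), len(second))
        # Keep the split whose longer line is shortest
        if best is None or longest < best[0]:
            best = (longest, [first, second])
    return best[1]


def font_size(lines, base_size=72, fit_chars=8, min_size=24):
    """Scale the font down in proportion to the longest line."""
    longest = max(len(line) for line in lines)
    if longest <= fit_chars:
        return base_size
    return max(min_size, base_size * fit_chars // longest)
```

Splitting first and sizing second is what lets a long name keep a large font: the size depends only on the longer of the two lines, not on the full name.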

One of the problems with the script is that you need some technical knowledge to generate the badges. This made it complicated for people to generate badges for an event. It also made it difficult to generate and print badges on the fly at the registration desk for on the spot registrants.

We decided to build a web app in 48 hours to do everything that the badge generator did, but as a Django site.

Something that we wanted to do (apart from developing the site of course) was to get up to date on the current cutting edge technology in the Python/Django space. Sure, we use a lot of cutting edge tools for ToolsForAgile.com, but that application is now almost 3 years old. Although we do keep updating it, it's a big application and updating platforms and infrastructure can take time. Plus we wanted to do a quick spike of some of the technologies so that we could take the learning back to our product. Django Dash was the perfect opportunity for that.

Here are some of the things we learned

Django deployment scenarios

When we first developed our product, the only option for python and django deployments was to use Apache with mod_python. Then the WSGI standard was finalised and mod_wsgi came along. mod_python is now officially end-of-life with no more support available for it.

Today one popular deployment configuration is to use an nginx frontend server to serve static files, with a reverse proxy to an Apache + mod_wsgi configuration to serve dynamic content.

Apart from Apache, a lot of other backend servers are becoming popular, most notably gunicorn.
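What mod_wsgi and gunicorn have in common is that they both load a WSGI callable and pass requests to it. For reference, here is about the smallest possible one. It is a generic illustration, not part of our app; in a Django 1.3 project the callable is django.core.handlers.wsgi.WSGIHandler() instead.

```python
def application(environ, start_response):
    """Minimal WSGI callable.

    The server passes in the request environment and a callback that is
    used to send the status line and response headers.
    """
    body = b"Hello from a WSGI app\n"
    start_response("200 OK", [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ])
    # The return value is an iterable of byte strings
    return [body]
```

Saved as hello.py (a made-up filename), this can be served with gunicorn hello:application, with nginx proxying to it in front.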

Django hosting

There are now a gazillion hosting providers for Django. When we started out with ToolsForAgile.com the only option was to roll your own setup on a VPS or dedicated server. You still want to do that for complex deployment setups, but there are now any number of django specific hosts for the common scenarios - ep.io, gondor.io, Stable, Dotcloud, AppHosted, DjangoZoom. Ken Cochrane has an excellent series of blog posts with a comparison of django hosting providers.

All these services allow you to simply deploy, while they handle setting up the deployment stack, load balancing and auto scaling for you.

And of course, for the simpler use cases, there is always good old Webfaction. I've hosted various sites with them for four years, with not a single problem so far.

ep.io

ep.io and gondor.io were sponsoring Django Dash, so apart from setting up a custom stack on a Linode VPS, participants also had the choice of deploying on one of these services. We decided to try out ep.io for our submission.

Overall, I must say that I really liked ep.io. It was a bit complicated to get my head around at first. It took a while to get file uploads, static files and celery set up. That was more to do with the fact that we were using ep.io for the first time. Once set up, deployment was a breeze. All we had to do was to upload the project using their command line client and ep.io would automatically provision the app with all the required services. They also have an interface where you can push your project directly via git or mercurial.

I should note here that the ep.io client only works on Linux as it requires ssh to execute remote commands. Since we were on Windows, it took us a fair while to hack up the client and hook it up to do its work through PuTTY instead.

pip

This is the first time we exclusively used pip along with a requirements.txt file to sync up the python dependencies of all the local dev environments as well as the remote server setup. It worked really well. pip rocks!

Celery

Celery is a distributed task queue. It's amazing how popular it has become in such a short span of time. For our submission, we decided to try out celery to generate the badge images asynchronously through the task queue.

Celery can work on top of a number of transports. Popular configurations are to use it over an AMQP broker like RabbitMQ or over Redis. We used it over RabbitMQ (which in turn requires Erlang), whereas ep.io supports celery over Redis. The nice thing is that we just had to tell ep.io that we wanted to use celery and it would set everything up, including launching the celery daemon, configuring it to use redis and what not.

On our side, we used django-celery, a Django integration of celery. With djcelery you can create a tasks.py file in your app with a bunch of task definitions which you can call asynchronously through a view.

After some initial setup hiccups, it worked like a charm, allowing us to push the badge image generation out to the workers. When badge generation was done, it would trigger another task to zip up all the images and allow the user to download a single zip with all the badges, with everything happening through Celery. It was pretty exciting to see the whole flow in action.
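The zip step is simple enough to sketch with just the standard library. This is an illustration rather than the code from our submission: the function name and the dict-of-bytes input are assumptions, and the celery wiring is left out so that the example runs without a broker. In a celery setup, a function like this would be registered as a task and triggered once the image generation finishes.

```python
import io
import zipfile


def zip_badges(images):
    """Pack {filename: image_bytes} into one in-memory zip and return its bytes."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        # Sorted so the archive lists badges in a predictable order
        for filename in sorted(images):
            archive.writestr(filename, images[filename])
    return buffer.getvalue()
```

The returned bytes can be saved to the uploads folder or sent back directly as the download response.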

Django 1.3

We used Django 1.3 for this submission. Our product is built on a version of django that is somewhere between 1.1 and 1.2.

We got a chance to play with some of the new stuff in 1.3: class based generic views, the new logging setup, TemplateResponse, some improvements to the admin, and a bunch of other stuff.

The admin interface itself is a huge section to explore. We briefly thought about doing the whole site through a custom admin interface, but we were running out of time so we shelved the idea for the time being.

Django 1.3 also has a completely new way of dealing with static files. The old django-staticfiles app has been added as a contrib package and is now the official way to handle static files. The old way had a single static directory with all the static files in it. You now have to put your static files either under an app folder, or in a global static dir. You then call the collectstatic management command to pull in all the static files into a directory that the frontend server will serve.

Similarly, there seems to have been a change in the way uploaded files are handled. Previously they used to go under the static folder. They now get a folder of their own.
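In settings.py terms, the static and uploaded file setup looks roughly like this. The setting names are the standard Django 1.3 ones, but the project path and folder names are hypothetical placeholders, so adjust them to your own layout.

```python
import os

PROJECT_ROOT = "/srv/myproject"  # hypothetical project location

INSTALLED_APPS = (
    # ... your other apps go here ...
    "django.contrib.staticfiles",
)

# Static files: the URL they are served under, and the directory that
# collectstatic copies everything into for the frontend server
STATIC_URL = "/static/"
STATIC_ROOT = os.path.join(PROJECT_ROOT, "collected_static")

# Project-wide static directories, searched in addition to each app's
# own static/ folder. Must not include STATIC_ROOT itself.
STATICFILES_DIRS = (
    os.path.join(PROJECT_ROOT, "static"),
)

# Uploaded files get their own URL and directory
MEDIA_URL = "/media/"
MEDIA_ROOT = os.path.join(PROJECT_ROOT, "media")
```

Running python manage.py collectstatic then gathers the files into STATIC_ROOT, which is the directory you point nginx at.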

What got done in 48 hours

We eventually ran out of time on all the features that we wanted. That's not surprising, given the amount of new stuff we were using for the first time. It took almost one and a half days to set up and get comfortable with ep.io, celery, django-celery and all the changes in Django 1.3, and to get them working properly on the local dev environment and online on ep.io.

However, since we built the whole app incrementally, we managed to get the important features in. If you go to Make My Badge you'll be able to log in, generate and download a sample set of badges. You can create events, though you need to go to the admin interface to add people to the event.

What is left to do?

We ran out of time before we could integrate django-registration for new users to register. We've created a sample user through the admin interface for now. We also wanted to be able to upload a list of event attendees through a CSV upload form, but couldn't put that in. As a workaround, you need to add people through the admin interface.

Apart from those two main features, there was a lot of UI design work that we wanted to do which we couldn't finish.

Where are we going from here?

Once the event judging is over, we have a bunch of enhancements to add. The CSV upload feature, for example, got done about 2 minutes past the deadline. We want to put that in. Integration with django-registration is also ready on the development environment.

We want to eventually add integration with a payment system. Charge Rs.1 per badge generated or something like that.

Overall Impressions

Django Dash was awesome fun. Although we couldn't complete everything, we sure did learn a lot of new things that we can take back into other projects. And now that we know the tech, I'm pretty certain we could complete a similar app in 48 hours.

I was just going through some of the teams, and here are some submissions that I found. Remember, all these were done from scratch in 48 hours! - Drawn by, Family Feed, ProposalMatic, FutureFI, Goal Rally, Django Lint, ConsoliTweet, Courtside, Linky, Git Awesome, Stardust, Set With Me, Libman, Show Offfr, SmartLinky, Staste, Django Docs, Code War, Grepo, Codr Space, My Img, Gearoscope