Amazon Web Services offers us all cheap, ready access to serious cloud computing infrastructure. So how do we run Python on it?
Setting up Python on Amazon EC2
EC2 is Amazon’s Elastic Compute Cloud. It’s the service used to create and operate virtual machines on AWS. You can interact with these machines using SSH, but it’s much nicer to use the IPython HTML Notebook set up as a web app.
You can set up an IPython Notebook server manually, but there are a couple of easier options.
- NotebookCloud is a simple web app that enables you to create IPython Notebook servers from your browser. It’s really easy to use, and free.
- StarCluster, from MIT, is a much more powerful library for working with Amazon, which uses profiles to simplify creating, copying and sharing cloud configurations. It supports IPython out of the box and there are extra profiles available online.
Both options are open source.
You do not need to use IPython or Amazon EC2 to use AWS, but there are a lot of advantages in doing so.
Whether you are running your machine on EC2 or just want to use some of the services from a regular machine, you’ll often want a way for your programs to speak to the AWS servers.
Python Boto Library
AWS has an extensive API, giving you programmatic access to each of the services. There are a number of libraries for using this API, and for Python, we have boto.
Boto provides a Python interface to nearly all of the Amazon Web Services, as well as some other services, such as Google Storage. Boto is mature, well documented and easy to use.
To use Boto, you’ll need to provide your AWS credentials, specifically your Access Key and Secret Key. These can be provided manually each time you make a connection, but it’s easier to add them to a boto configuration file, which enables boto to provide the keys automatically.
If you wish to use a config for your boto setup, you need to create a file at ~/.boto. If you want to make this config system-wide, you should instead create the file at /etc/boto.cfg. The file uses the .ini format and should at least contain a Credentials section, which will look something like the following:
[Credentials]
aws_access_key_id = <your access key>
aws_secret_access_key = <your secret key>
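If you would rather generate that file from Python than type it by hand, something like the following sketch should do. The helper names write_boto_config and read_credentials are made up for this example, and the demonstration writes to a throwaway temporary file rather than the real ~/.boto.

```python
import configparser
import os
import tempfile


def write_boto_config(path, access_key, secret_key):
    """Write a minimal boto-style config file with a Credentials section."""
    # Interpolation is switched off so a '%' in a key is stored literally.
    config = configparser.ConfigParser(interpolation=None)
    config['Credentials'] = {
        'aws_access_key_id': access_key,
        'aws_secret_access_key': secret_key,
    }
    with open(path, 'w') as handle:
        config.write(handle)
    # The file holds a secret, so restrict it to the current user.
    os.chmod(path, 0o600)


def read_credentials(path):
    """Return (access_key, secret_key) from a boto-style config file."""
    config = configparser.ConfigParser(interpolation=None)
    config.read(path)
    section = config['Credentials']
    return section['aws_access_key_id'], section['aws_secret_access_key']


# Demonstration against a temporary file, with placeholder keys.
demo_path = os.path.join(tempfile.mkdtemp(), 'boto.cfg')
write_boto_config(demo_path, 'EXAMPLE_ACCESS_KEY', 'EXAMPLE_SECRET_KEY')
print(read_credentials(demo_path))
```

To write the real file, you would pass os.path.expanduser('~/.boto') as the path along with your own keys.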
You use boto by creating connection objects, which represent a connection to a service, then interfacing with those connection objects.
from boto.ec2 import EC2Connection

conn = EC2Connection()
Note that you’ll need to pass your AWS keys to any connection constructor if you don’t have them set up in a config file.
conn = EC2Connection(access_key, secret_key)
Creating Your First Amazon EC2 Machine
Now that you have a connection, you can use it to create a new machine. You’ll first need to create a security group that allows you to access any machine you create in that group.
group_name = 'python_central'
description = 'Python Central: Test Security Group.'

group = conn.create_security_group(group_name, description)
# Allow TCP traffic on port 8888 from any IP address
group.authorize('tcp', 8888, 8888, '0.0.0.0/0')
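One thing to watch for: AWS rejects a second security group with the same name, so running the snippet above twice will fail. A minimal sketch of a guard, assuming conn is a boto EC2Connection. The function name ensure_security_group is invented for this example, and it relies only on boto's get_all_security_groups, create_security_group and authorize calls.

```python
def ensure_security_group(conn, name, description, port=8888,
                          cidr='0.0.0.0/0'):
    """Return the security group called `name`, creating it if needed.

    `conn` is assumed to be a boto EC2Connection, or any object offering
    the same get_all_security_groups / create_security_group methods.
    """
    # Reuse an existing group rather than trying to create a duplicate.
    for group in conn.get_all_security_groups():
        if group.name == name:
            return group

    # Otherwise create it and open the chosen TCP port.
    group = conn.create_security_group(name, description)
    group.authorize('tcp', port, port, cidr)
    return group
```

You would call it as ensure_security_group(conn, 'python_central', 'Python Central: Test Security Group.') in place of the create and authorize calls.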
Now that you have a group, you can create a virtual machine using it. For this, you’ll need an AMI, an Amazon Machine Image, which is a cloud-based software distribution that your machine will use for an operating system and stack. We’ll use NotebookCloud’s AMI as it’s available and set up with some Python goodies already.
We’ll need some random data to create a self-signed certificate for this server, so we can use HTTPS to access it.
import random
from string import ascii_lowercase as letters

# Create the random data in the right format
data = random.choice(('UK', 'US'))
for a in range(4):
    data += '|'
    for b in range(8):
        data += random.choice(letters)
We also need to create a hashed password to log into the server with.
import hashlib

# Your chosen password goes here
password = 'password'

h = hashlib.new('sha1')
# A random salt of 12 hexadecimal digits
salt = '%012x' % random.getrandbits(48)
# Encode to bytes so that this also works on Python 3
h.update((password + salt).encode('utf-8'))
password = ':'.join(('sha1', salt, h.hexdigest()))
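To see what that hashed string buys you, here is a small sketch of how a password could be checked against it. This only illustrates the algorithm:salt:digest layout built above. The function names are made up, and it is not necessarily how the server on the AMI performs its check.

```python
import hashlib
import random


def hash_password(password, salt=None):
    """Build an 'algorithm:salt:digest' string in the layout used above."""
    if salt is None:
        # 48 random bits rendered as 12 hexadecimal digits
        salt = '%012x' % random.getrandbits(48)
    h = hashlib.new('sha1')
    h.update((password + salt).encode('utf-8'))
    return ':'.join(('sha1', salt, h.hexdigest()))


def check_password(hashed, candidate):
    """Return True if `candidate` reproduces the stored digest."""
    algorithm, salt, digest = hashed.split(':')
    h = hashlib.new(algorithm)
    h.update((candidate + salt).encode('utf-8'))
    return h.hexdigest() == digest


hashed = hash_password('password')
print(check_password(hashed, 'password'))
print(check_password(hashed, 'not the password'))
```

Because the salt is stored alongside the digest, the check can recompute the hash without ever needing the original password to be stored.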
Now, we’ll add the hashed password to the end of the data string. We’ll pass this data string to AWS when we create a new virtual machine. The machine will use the data in the string to create a self-signed certificate and a configuration file containing your hashed password.
data += '|' + password
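It can help to see the whole string in one place. The sketch below rebuilds it in a single function, so you can inspect the six fields separated by '|'. The name build_user_data and the placeholder hash are invented for this example.

```python
import random
from string import ascii_lowercase as letters


def build_user_data(hashed_password, fields=4, field_length=8):
    """Return 'COUNTRY|rand|rand|rand|rand|hashed_password'."""
    parts = [random.choice(('UK', 'US'))]
    for _ in range(fields):
        # Each random field is a run of lowercase letters
        parts.append(''.join(random.choice(letters)
                             for _ in range(field_length)))
    parts.append(hashed_password)
    return '|'.join(parts)


# A placeholder stands in for a real 'sha1:salt:digest' string here.
example = build_user_data('sha1:0123456789ab:placeholder-digest')
print(example.split('|'))
```

The hashed password uses ':' as its own separator, so splitting on '|' keeps it in one piece.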
Now, you can create the server.
# NotebookCloud AMI
AMI = 'ami-affe51c6'

conn.run_instances(
    AMI,
    instance_type='t1.micro',
    security_groups=['python_central'],
    user_data=data,
    max_count=1
)
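The call above throws away its return value. In boto, run_instances returns a reservation whose instances list holds the machines that were started, so you can keep hold of the new instance id straight away. A minimal sketch, assuming conn is a boto EC2Connection. The wrapper name launch_instance is made up for this example.

```python
def launch_instance(conn, ami, user_data, group_name,
                    instance_type='t1.micro'):
    """Start one instance and return its id.

    `conn` is assumed to be a boto EC2Connection.
    """
    reservation = conn.run_instances(
        ami,
        instance_type=instance_type,
        security_groups=[group_name],
        user_data=user_data,
        max_count=1,
    )
    # One instance was requested, so the reservation holds exactly one.
    return reservation.instances[0].id
```

With the id in hand, you no longer need to assume that the new machine is the only one on your account.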
To find the server online, you’ll need a URL. It takes a minute or two for your server to fire up, so this is a good point to take a break and put the kettle on.
To get the URL, we’ll just poll AWS to see if the server has a public DNS name assigned yet. The following code assumes the instance you just created is the only one on your AWS account.
import time

while True:
    # Flatten the reservations into instances and take the first one
    inst = [
        i for r in conn.get_all_instances()
        for i in r.instances
    ][0]

    dns = inst.public_dns_name

    if dns:
        # We want this instance id for later
        instance_id = inst.id
        break

    time.sleep(5)
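The loop above never gives up, and it only works if you have a single instance. If you already know the instance id, a slightly more defensive version might look like the sketch below. The function name wait_for_public_dns and its timeout values are assumptions for this example; it relies on the instance_ids argument of boto's get_all_instances.

```python
import time


def wait_for_public_dns(conn, instance_id, timeout=300, interval=5,
                        sleep=time.sleep):
    """Poll until the instance has a public DNS name, then return it.

    `conn` is assumed to be a boto EC2Connection. Raises RuntimeError if
    no name has been assigned after roughly `timeout` seconds.
    """
    waited = 0
    while waited <= timeout:
        # Ask only about the instance we care about
        reservations = conn.get_all_instances(instance_ids=[instance_id])
        instances = [i for r in reservations for i in r.instances]
        if instances and instances[0].public_dns_name:
            return instances[0].public_dns_name
        sleep(interval)
        waited += interval
    raise RuntimeError('No public DNS name for %s after %s seconds'
                       % (instance_id, timeout))
```

The sleep argument is there only so the function can be exercised without real delays; in normal use you would leave it alone.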
Now turn the dns name into a proper URL and point your browser at it.
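As a rough sketch, building the URL is a one-liner. The port here is an assumption: 8888 is the port we opened in the security group, and the scheme is HTTPS because of the self-signed certificate. The function name notebook_url and the host name are placeholders.

```python
def notebook_url(dns, port=8888):
    """Build an HTTPS URL from a public DNS name and a port."""
    return 'https://{}:{}'.format(dns, port)


# A placeholder host name stands in for the real public DNS name.
print(notebook_url('ec2-203-0-113-10.compute-1.amazonaws.com'))
# prints https://ec2-203-0-113-10.compute-1.amazonaws.com:8888
```

Since the certificate is self-signed, expect your browser to show a warning the first time you visit.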
You should now be the proud owner of a brand new IPython HTML Notebook Server. Remember, you’ll need the password you provided above to log in to it.
If you want to terminate the instance, you can do that easily with the following line. Note that you need the instance id to do this. If you get stuck, you can always visit the AWS Console and control all your instances there.
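The terminating line itself is not shown here, so treat the following as a minimal sketch rather than the original code. It assumes conn is a boto EC2Connection and uses boto's terminate_instances call, which takes a list of instance ids and returns the instances it terminated. The wrapper name terminate_instance is made up for this example.

```python
def terminate_instance(conn, instance_id):
    """Terminate one instance and return the ids AWS reports back.

    `conn` is assumed to be a boto EC2Connection.
    """
    terminated = conn.terminate_instances(instance_ids=[instance_id])
    return [instance.id for instance in terminated]
```

With the instance_id variable saved during polling, the call would be terminate_instance(conn, instance_id).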