At work (an instant messaging company), we recently moved data centre and performed a brief exercise to evaluate whether we should move to EC2 or take out another data centre contract. We settled on a hybrid approach, running our core service in a "normal" data centre and using EC2 for services that would need to scale rapidly. This provides us with the following advantages over a complete EC2-based solution:
- Lower cost. The total cost of running our core service is about 20% of the approximate cost of running it on EC2. We did, however, purchase the hardware ourselves (with some negotiation on price) and take out a long-term contract with the data centre (again with cost negotiations).
- Predictable costs. During our costing exercise we found it difficult to predict what the actual costs of running an EC2 solution would be, particularly when it came to predicting how many I/O requests to EBS volumes we would have. Additionally there is no way to "cap" your monthly EC2 costs that I am aware of, so if something unexpected occurs you could be left with a very large bill.
- More control: We purchased the hardware and therefore have complete control over our servers, switches and storage solutions.
- More personal service: We can actually go and meet the people at the data centre to discuss our support requirements. They work closely with us to resolve issues such as a recent DDoS attack.
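To illustrate why the usage-based part of the bill was hard to pin down, here is a rough sketch of the kind of estimate we attempted. Every rate and request count in it is a hypothetical placeholder rather than a real EC2 or EBS price; the point is only that a wide guess for I/O requests produces a wide range for the bill.

```python
# All prices and volumes below are hypothetical placeholders,
# not real EC2 or EBS rates.
HOURS_PER_MONTH = 730


def monthly_cost(instances, hourly_rate, io_requests, price_per_million_io):
    """Estimate one month's bill: fixed instance hours plus usage-based I/O."""
    instance_cost = instances * hourly_rate * HOURS_PER_MONTH
    io_cost = io_requests / 1e6 * price_per_million_io
    return instance_cost + io_cost


def cost_range(instances, hourly_rate, io_low, io_high, price_per_million_io):
    """Return (best case, worst case) for a guessed range of I/O requests."""
    low = monthly_cost(instances, hourly_rate, io_low, price_per_million_io)
    high = monthly_cost(instances, hourly_rate, io_high, price_per_million_io)
    return low, high


if __name__ == "__main__":
    # Assumed: 10 instances, and somewhere between 500 million and
    # 20 billion I/O requests per month.
    low, high = cost_range(10, 0.50, 500e6, 20000e6, 0.10)
    print("Estimated monthly bill: %.2f to %.2f" % (low, high))
```

The instance part of the estimate is fixed once you know how many machines you need; the spread comes entirely from the I/O guess.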
For our highly scalable services we use EC2 for the following reasons:
- On-demand instances. Of course the first reason is that we can just add instances in minutes. Although the cost of EC2 is higher than a traditional data centre, it is vastly quicker to scale up and down by adding and removing machines. No calls to the data centre or possible cost negotiations required. I should probably add that our data centre does not currently offer a "cloud computing" solution.
- Cost grows with usage. As we deploy only new features on our EC2 instances, the cost grows gradually as the service grows. So it's easier to estimate future costs based on current usage and cost figures. Once the service grows to a certain size we can then consider purchasing hardware and offloading the service to our data centre to reduce costs.
- No maintenance required. The big problem with purchasing hardware is that we have to maintain it! Everything from installation to hardware failures requires a trip to the data centre which means lost programming-time. This is more of a "purchase vs lease" point I suppose.
- Just do it their way. It can be seen as a good point or bad point about services such as EC2 that they usually offer one way to do something, such as HTTP load-balancing for example. If you want to do HTTP load balancing, use their load balancer - it's simple and easy to configure. If we want to do it on our own hardware then we have to do it ourselves by evaluating different software/hardware options and configuring them - more lost programming time. Again the total cost is lower for our (software) solution, but this does not take into account our time spent configuring and maintaining it. I've listed it as an advantage because you can of course implement your own software solutions on EC2 if required (just not hardware).
On reflection we should have included our time spent purchasing, configuring and maintaining hardware and talking to the data centre in our cost estimates as we would not incur many of these costs with EC2.
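A minimal sketch of what including staff time might look like is below. The function names and all of the figures are invented for illustration and are not our real numbers; the idea is simply that hours spent on hardware count as a monthly cost alongside the amortised purchase price and the hosting fee.

```python
# Every figure used with these functions is a made-up placeholder.

def owned_monthly_cost(hardware_price, lifetime_months, hosting_fee,
                       staff_hours, hourly_staff_cost):
    """Monthly cost of owned hardware, including the staff time it consumes."""
    amortised_hardware = hardware_price / float(lifetime_months)
    staff_cost = staff_hours * hourly_staff_cost
    return amortised_hardware + hosting_fee + staff_cost


def cheaper_option(owned_cost, cloud_cost):
    """Name the cheaper option for one month."""
    if owned_cost < cloud_cost:
        return "own hardware"
    return "cloud"


if __name__ == "__main__":
    cloud = 2000.0  # hypothetical monthly cloud bill for the same service
    without_time = owned_monthly_cost(36000, 36, 500, 0, 50)
    with_time = owned_monthly_cost(36000, 36, 500, 20, 50)
    print("Ignoring staff time: %.2f (%s)"
          % (without_time, cheaper_option(without_time, cloud)))
    print("Including staff time: %.2f (%s)"
          % (with_time, cheaper_option(with_time, cloud)))
```

With these invented numbers the owned hardware looks cheaper until 20 hours a month of staff time are counted, at which point the comparison flips. Whether that happens in practice depends entirely on the real figures.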
I have also recently been experimenting with Google App Engine (GAE). I like that it is very different to EC2 in that it is a cloud platform rather than cloud infrastructure. My observations in general are:
- More "Cloudy": In my opinion GAE is more in keeping with the "cloud-services" idea. You just deploy your app and it automatically scales from zero requests upwards. EC2 is a middle ground between handling all the data centre stuff and a full cloud solution. You still have to monitor your app to see if it needs more resources, you still have to provision more if required and you still have to install the O/S and software requirements. I should point out that I am aware of the Elastic Beanstalk solutions from Amazon but haven't yet looked at them in much detail.
- Limited functionality: The above advantage (as always) comes at a price. Understandably there are a lot of things that you cannot do (yet) as every feature must fit into a scalable architecture. GAE is basically limited to handling HTTP requests within a limited time or more persistent backend tasks. You can't handle or create raw TCP connections and the only persistent connection you can create from a client is using the Channel API, which limits you to JavaScript clients. So if your application cannot fit into the GAE limitations then you just have to go somewhere else.
- Cost Caps! You can cap your monthly bill; if you reach the cap your app will just stop accepting requests (I think). This is very useful for a cloud-based solution because you don't want a dodgy loop in your software or a malicious attack on your system resulting in a huge monthly bill because of massive I/O / storage / app-instance costs.
- Free to start with. If your app is not being used then you're not being charged. With EC2, even if you're not doing anything you still require a whole instance sitting there. Also when you scale-up on EC2 it is by adding whole machines or moving up to the next size of instance. GAE scales (in terms of cost) more linearly. It is worth mentioning that Amazon have recently started a "free micro-instance for a year" for new accounts.
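The difference between scaling by whole machines and scaling with usage can be sketched as two simple cost functions. This is a deliberately simplified model with invented prices and capacities, not the actual pricing of either service.

```python
import math


def instance_based_cost(requests_per_sec, capacity_per_instance,
                        price_per_instance):
    """Cost when capacity is bought in whole instances.

    At least one instance is always running, even with no traffic.
    """
    needed = int(math.ceil(requests_per_sec / float(capacity_per_instance)))
    return max(needed, 1) * price_per_instance


def usage_based_cost(requests_per_sec, price_per_request_per_sec):
    """Cost that scales in proportion to usage and is zero when idle."""
    return requests_per_sec * price_per_request_per_sec


if __name__ == "__main__":
    # Assumed: one instance handles 100 req/s and costs 60.0 per month;
    # the usage-based platform charges 0.6 per req/s per month.
    for load in (0, 50, 100, 101, 250):
        print("%4d req/s: instances %7.2f, usage-based %7.2f" % (
            load,
            instance_based_cost(load, 100, 60.0),
            usage_based_cost(load, 0.6)))
```

In this model the instance-based cost jumps by a whole machine as soon as load passes 100 req/s, while the usage-based cost rises smoothly from zero.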
Running GAE on Ubuntu
GAE requires Python 2.5 at the moment, and Ubuntu 10.10 comes with Python 2.6. Support for 2.7 is on the roadmap, but until then, here's how I got it running on 10.10 (in a terminal):
- sudo apt-add-repository ppa:fkrull/deadsnakes
- sudo apt-get update
- sudo apt-get install python2.5 python2.5-dev libjpeg62 libjpeg62-dev build-essential gcc libssl-dev libbluetooth-dev sqlite3 libsqlite3-dev
- wget http://effbot.org/media/downloads/Imaging-1.1.6.tar.gz
- tar xzf Imaging-1.1.6.tar.gz
- cd Imaging-1.1.6
- edit setup.py line 38: JPEG_ROOT = libinclude("/usr/lib")
- sudo python2.5 setup.py install
- wget http://pypi.python.org/packages/source/s/ssl/ssl-1.15.tar.gz
- tar xzf ssl-1.15.tar.gz
- cd ssl-1.15/
- sudo python2.5 setup.py install
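Once the steps above have finished, a quick way to confirm that the new interpreter can see everything is to try importing the modules. The sketch below is one way to do that; the module names it checks (ssl, sqlite3 and PIL) are my assumption about what the packages above provide, so adjust the list if an import name differs on your system.

```python
def check_modules(names):
    """Try to import each module and report whether it succeeded."""
    results = {}
    for name in names:
        try:
            __import__(name)
            results[name] = True
        except ImportError:
            results[name] = False
    return results


if __name__ == "__main__":
    # Module names are assumptions about what the installed packages expose.
    for name, ok in sorted(check_modules(["ssl", "sqlite3", "PIL"]).items()):
        print("%-8s %s" % (name, "ok" if ok else "MISSING"))
```

Save it under any name you like and run it with python2.5 rather than the default python, so that it checks the interpreter GAE will use.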