Project 4
CS290F Fall 2006 - UCSB Computer Science - Thorsten von Eicken
- Handed out Monday November 15th, 2006
- Part A due November 27th, Part B due December 4th
- In-class presentations Thursday December 14th, 2006
- Final site and write-up due 8am December 14th, 2006
Important
- Projects 1 thru 4 are to be done in groups of 2 (two). Each group must hand in a single solution with both group member names on it. If your partner is dropping out of the course, email the instructor as soon as possible.
- A lot of resources are available on-line and in the library. It is great to use these, but please always cite them in your projects.
Objective
Project 4 consists of scaling the web site you have built up and proving that you have removed all the bottlenecks. This is an iterative process: test, measure, fix problems, repeat. At the end you should be able to add lots of machines to grow without bounds. (More realistically, the truly required database accesses will become the bottleneck that you can't scale easily given the EC2 infrastructure.)
Test set-up
You will test your site using httperf, which is a powerful web site benchmarking tool. Familiarize yourself with the tool and figure out how to simulate many users that use your site according to the critical paths you have identified in project 3.
Launch a load-generator instance on EC2 and use httperf on it to load up your web site. Plot response time (sec/req) and throughput (req/sec) as you increase the load. Set up measurements and analyze log files to determine where the bottleneck is.
For this testing, make sure you are using Apache 2.2 as a reverse proxy in front of mongrel_cluster running a number of Rails processes. You are welcome to experiment with other set-ups and earn "extra points"!
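To turn a series of httperf runs into points for the response-time and throughput plots, a small parser along the following lines may help. This is only a sketch: it assumes the run summary contains "Request rate", "Reply rate" and "Reply time" lines laid out as in the sample below. The numbers in the sample are made up for illustration, and the function name parse_httperf is hypothetical.

```python
import re

# Sample summary lines in the layout assumed for httperf's end-of-run report.
# The values are invented for illustration, not measured.
SAMPLE_OUTPUT = """\
Request rate: 50.0 req/s (20.0 ms/req)
Reply rate [replies/s]: min 48.2 avg 49.6 max 50.4 stddev 0.9 (3 samples)
Reply time [ms]: response 35.4 transfer 0.2
"""


def parse_httperf(text):
    """Extract offered load, throughput, and response time from one run."""
    offered = re.search(r"Request rate: ([\d.]+) req/s", text)
    replies = re.search(r"Reply rate \[replies/s\]: .*?avg ([\d.]+)", text)
    response = re.search(r"Reply time \[ms\]: response ([\d.]+)", text)
    if not (offered and replies and response):
        raise ValueError("unrecognized httperf output")
    return {
        "offered_req_per_sec": float(offered.group(1)),
        "throughput_req_per_sec": float(replies.group(1)),
        # httperf reports milliseconds; the plots ask for seconds per request.
        "response_sec_per_req": float(response.group(1)) / 1000.0,
    }


if __name__ == "__main__":
    print(parse_httperf(SAMPLE_OUTPUT))
```

Running one such parse per load level gives one (offered load, throughput, response time) point per run, which is the data the two plots need.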
Part A: Scale a single server - due Monday Nov 27th
Push the envelope using a single server without compromising scalability. The first step is to make sure your database is working as expected. Are there any indexes you need to create? Can you optimize the SQL statements that take longest or are executed the most frequently?
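The effect of a missing index can be checked by asking the database for its query plan before and after creating the index. The sketch below uses Python's built-in sqlite3 module purely as a stand-in so that it runs anywhere; on the MySQL server used for the project the analogous step is to prefix the slow statement with EXPLAIN. The table, column and index names are hypothetical.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the plan descriptions SQLite reports for a query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[3] for row in rows]  # the fourth column is the description


# In-memory database with a hypothetical posts table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE posts (id INTEGER PRIMARY KEY, user_id INTEGER, title TEXT)"
)
conn.executemany(
    "INSERT INTO posts (user_id, title) VALUES (?, ?)",
    [(i % 100, f"post {i}") for i in range(1000)],
)

sql = "SELECT id, title FROM posts WHERE user_id = ?"

# Without an index the lookup has to scan the whole table.
plan_before = query_plan(conn, sql, (42,))

# With an index on the filtered column the plan switches to an index search.
conn.execute("CREATE INDEX idx_posts_user_id ON posts (user_id)")
plan_after = query_plan(conn, sql, (42,))

print("before:", plan_before)
print("after: ", plan_after)
```

The same before/after comparison, applied to the statements that are slowest or run most often, is a reasonable way to decide which indexes are worth adding.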
As a second step you might launch an instance of memcached and cache the page fragments that are most expensive to generate. Keep it simple and motivated by measurements! (If you are running short on time, postpone the use of memcached.)
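The idea behind fragment caching is "look in the cache first, render only on a miss". The sketch below illustrates that pattern in Python with a dictionary standing in for memcached. The FragmentCache class, the key name and the render function are all hypothetical and are not part of any memcached client or of Rails.

```python
import time


class FragmentCache:
    """Dict-backed stand-in for a memcached client (hypothetical helper)."""

    def __init__(self):
        self._store = {}

    def fetch(self, key, ttl_seconds, render):
        """Return the cached fragment, rendering and storing it on a miss."""
        entry = self._store.get(key)
        now = time.monotonic()
        if entry is not None and entry[0] > now:
            return entry[1]  # hit: skip the expensive render
        fragment = render()
        self._store[key] = (now + ttl_seconds, fragment)
        return fragment


render_calls = 0


def render_sidebar():
    """Pretend this runs several slow queries and builds HTML."""
    global render_calls
    render_calls += 1
    return "<ul><li>popular item</li></ul>"


cache = FragmentCache()
first = cache.fetch("sidebar:popular", 60, render_sidebar)
second = cache.fetch("sidebar:popular", 60, render_sidebar)
print("renders performed:", render_calls)
```

Counting how often the expensive render actually runs, as this sketch does, is one simple way to confirm that a cache is earning its keep.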
For part A, update your wiki project page to showcase your results. If you have not broken your page up into multiple pages, do so now. This may be a good time to split things up so you have an overview page for your project off which you hang the information from projects 2 & 3, and onto which you can now add the pieces for project 4 such that in the end you have everything ready for the final site.
- You need to explain the workload you are applying. Presumably you will simulate N concurrent users walking through critical path #1, M through path #2, etc. Justify the ratio of users.
- Explain the parameters you are using for httperf.
- Graph the results. If you perform optimizations, graph the before and after (it is fine to graph only the state before all optimizations and the state after all optimizations; there is no need for a before/after pair for every optimization).
- If you see major bottlenecks that you have not resolved, explain what you tried and what you still plan to do. Also see Stefan to discuss.
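Once a ratio of users per critical path has been chosen, the per-path user counts for a given total follow by simple proportion. The sketch below shows one way to do that arithmetic so that the counts always add up to the total. The path names and weights are invented for illustration, and split_users is a hypothetical helper.

```python
def split_users(total_users, weights):
    """Divide total_users among paths in proportion to weights.

    Uses largest-remainder rounding so the counts sum to total_users.
    """
    weight_sum = sum(weights.values())
    exact = {path: total_users * w / weight_sum for path, w in weights.items()}
    counts = {path: int(share) for path, share in exact.items()}
    leftover = total_users - sum(counts.values())
    # Hand the remaining users to the paths with the largest fractional part.
    by_remainder = sorted(exact, key=lambda p: exact[p] - counts[p], reverse=True)
    for path in by_remainder[:leftover]:
        counts[path] += 1
    return counts


# Hypothetical mix: 70% browsing, 20% searching, 10% checking out.
mix = {"browse": 7, "search": 2, "checkout": 1}
print(split_users(100, mix))
```

The weights themselves still need to be justified in the write-up; the code only keeps the split consistent as the total load is varied.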
Part B: Scale to multiple servers - due Monday Dec 4th
Now is the time to really harness the power of EC2: run multiple instances and test the scalability of your system. You should start with 3 instances: one for the database, one for your app, and one for memcached (if you have made use of it). Compare the performance with one instance. Explain what you observe.
Finally, run your application on multiple EC2 instances. You may want to run the Apache reverse proxy on one of the machines; the additional load probably doesn't matter too much. If you start using "lots" of machines, you may want to move the proxy to a separate box (maybe it can sit on the memcached machine just fine?). As you add instances, measure and explain what you see.
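One way to summarize what happens as instances are added is to compute speedup and scaling efficiency relative to the smallest configuration measured. The sketch below does this for a table of peak throughput per instance count. The throughput numbers are invented for illustration, and scaling_report is a hypothetical helper.

```python
def scaling_report(throughput_by_instances):
    """Compute speedup and efficiency relative to the smallest configuration.

    Returns a list of (instances, speedup, efficiency) tuples, where
    efficiency is 1.0 for perfectly linear scaling.
    """
    base_n = min(throughput_by_instances)
    base_tput = throughput_by_instances[base_n]
    report = []
    for n in sorted(throughput_by_instances):
        speedup = throughput_by_instances[n] / base_tput
        efficiency = speedup / (n / base_n)
        report.append((n, speedup, efficiency))
    return report


# Hypothetical peak throughput (req/sec) for 1, 2 and 4 app instances.
measured = {1: 100.0, 2: 190.0, 4: 320.0}
for n, speedup, efficiency in scaling_report(measured):
    print(f"{n} instance(s): speedup {speedup:.2f}, efficiency {efficiency:.0%}")
```

Efficiency that falls off as instances are added usually points at a shared resource, such as the database or the proxy, which is where the explanation should then focus.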
Note that in order to connect to your database from other instances you need to remove "skip-networking" from /etc/my.cnf on the database machine. You further need to open up the EC2 firewall to allow the instances to communicate with one another (we will make a different security group available for this purpose). At that point it becomes important for you to set passwords on the database!
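A quick check that the option is really gone (and not merely present in another section of the file) can be scripted. The sketch below scans the text of an option file for an uncommented skip-networking line. It assumes the usual option-file conventions, with "#" and ";" starting comments, and the function name is hypothetical.

```python
def skip_networking_enabled(config_text):
    """Return True if an uncommented skip-networking line is present."""
    for raw in config_text.splitlines():
        # Drop trailing "#" comments, then surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if not line or line.startswith(";"):
            continue
        # Accept both "skip-networking" and "skip_networking", with or
        # without an "= value" part.
        name = line.split("=", 1)[0].strip().replace("_", "-")
        if name == "skip-networking":
            return True
    return False


# Hypothetical file contents before and after the edit.
before = "[mysqld]\nskip-networking\nport = 3306\n"
after = "[mysqld]\n# skip-networking\nport = 3306\n"
print(skip_networking_enabled(before), skip_networking_enabled(after))
```

On the database machine the same function could be applied to the contents of /etc/my.cnf before restarting the server.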
For part B:
- explain your set-up
- graph initial performance observed
- explain bottlenecks and how you plan to fix them
In-class presentation - Thursday Dec 14th
At the end of the quarter you will present your project in-class. You will have 10 minutes, including questions (that's short!). You may use slides or your wiki pages. Your presentation should include the following:
- Explanation of the site's purpose.
- Brief demo of the site, focusing on the critical paths that you tested (rehearse to keep it short).
- Presentation of results obtained.
- Explanation of major hurdles you overcame or special tricks you employed.
Final write-up - due Thursday Dec 14th 9am
The final write-up will be on the web site, i.e., you will augment the wiki page(s) you already have. This is your final documentation of your project and should include:
- Information about your project that you already wrote throughout the quarter.
- Description of the test set-ups, including critical paths, httperf workload files, and all the settings used with httperf.
- Description of the results, including graphs, of each stage of scaling and optimization. Make sure your graphs are correctly labeled!
- Explanation of results observed.
- Discussion of the limits of scalability encountered and suggestions on how to overcome them.
- Discussion of special techniques you employed.
- Contact information (the site will stay up past the end of the quarter, so you probably want to obfuscate your email addresses slightly to thwart spam, for example just username@cs instead of username@cs.ucsb.edu).
- Screenshots: add a few screenshots to your write-up so people coming to the site later can get an idea of what it looked like.
- Use the turnin program to turn the source code of your app in (source code only!).
- Run your service on a single EC2 server for us to play with. Give the instance a nickname composed of your project name and '-final'.
