Project4B

CS290F Fall 2006 - UCSB Computer Science - Thorsten von Eicken

Jump to: navigation, search

Contents

Scaling Hot Spots On mutiple EC2 Servers

Overview

After improving the performance of our application on a single server, we wanted to see how it scales to multiple servers. The benchmark used to collect the data was httperf, and two different setups were used. In the first setup, we moved the database on its own EC2 instance, and had Apache and Mongrel running on the other instance. We increased the number of application servers running on a single instance from 1 to 4. For the second setup, we used a seperate instance for the database, the web server and each app server. This allowed us to achieve around 5x improvement through each of our critical paths.


Setup Scenario 1

Node 1 : Apache + Rails

Node 2 : Database


The first configuration we used was running Apache and Mongrel on the same instance and the MySQL database on a seperate EC2 instance. The motivation for this was to get maximum performance out of the database by running it on a seperate instance. For this setup, we collected data for Path 1 of our critical path. We expect simmilar results for our other two paths.



Path 1

Image:mutipleapp.jpg


Analysis

For the first set of runs, we increased the number of our Application servers from 1-4 on a single instance running Apache 2.2 and Mognrel. The maximum performance is 57 Replies/sec and 353 milliseconds Response time. We did not get any drastic improvement when going from 2 to 3 Application servers. By increasing the number of Application severs to 4, we were able to achieve 2X performance. This is due to the fact that the App servers were all running on a single instance. The CPU utilization was peaked at 45% during maxumum load. This is the reason we did not get 4x performance by running four App servers on a single instance. This is still a good result, since we used two times the resources (two EC2 instances) and got double the performance.The data for running each App server on a seperate EC2 instance shows that without the CPU bottleneck, we can scale to four Application servers.


Setup Scenario 2

Node 1 : Apache

Node 2 : Database

Node 3-n: Mongrel app servers

For the second set of runs, each Application server is running on a seperate EC2 instance. We collected data for each of the critical paths through our site. The httperf --rate parameter was varied from 1 to 29 by 2 for this setup, to show the full behavior of our application for this set up.



Path 1

Image:mutipleapp2.jpg


Image:mutipleapp3.jpg


Path 2

Image:mutipleapp4.jpg

Image:mutipleapp5.jpg


Path 3

Image:mutipleapp6.jpg

Image:mutipleapp7.jpg



Analysis

The graphs above show that by running the multi-tier architecture setup on EC2 brings tremendous performance improvement to our application. With 5 app servers running on independent machines, we observed a maximum of about 180 replies/sec, which is more than 5 times higher than the basic setup from part A, where all 3 components, apache, single app, and db, reside in one box.

In the first scenario, as we added more app servers to the box, we could witness a gradual increase in performance. However, usually after passing 6 sessions/sec rate, the replies/sec slope tends to flatten. We suspect that this is due to the fact that these app servers share the same resources - CPU, disk and memory. Therefore, they begin to fail to operate in parallel fashion once the resource gets over-utilized; one overloaded app server will affect all the other processes in the same box, thus degrading the performance of the other app servers as well as the apache servers. The bottleneck of the first scenario is the CPU of the machine that runs multiple app servers.

In the second setup, a benefit of assigning an individual machine to an apache server and each app server is much more apparent and significant. We had to expand the session rate up to 30 to discover a flat line in the reply rate for some cases; even with 2 app servers, the reply rate consistently increases until it reaches 15 sessions per sec, which is already twice more than any cases in the scenario 1 -- with 5 app servers, it is uncertain whether it ever reached its peak. Such an astonishing improvement is possible since each server fully utilizes the resource of a a machine for itself, allowing each service to reach the state of a maximum utilization of the available resource. Thus, the bottleneck here is also the resource of a machine, which can be observed in the graph above -- with 2 apps and 3 apps cases, every app server reaches its maximum resource utilization point. However, as more app servers get added, it is very likely that the database server will get overloaded with requests faster than an app server reaches its resource saturation peak. Thus, as some point, we suspect that the bottleneck will swing from an app server's resource to the database's resource due to the single point data access schema of our multi-tier architecture design. However, with the httperf benchmark simulation, where a static sequence of requests is fed into the app servers repeatedly, it is difficult to observe the database bottleneck due to the explicit RoR cache implementation. It is far more efficient than the database query cache mechanism since an app server does not even need to contact the database server to process a request; it checks the request URI and returns the cached page, which is stored locally, if available, therefore, not a single line in the method gets executed. Due to this desirable caching feature, we haven't yet been able to detect the database bottleneck in our simulation.

Challenges

Gotchas

1. When moving each application server to its own instance for setup2, the session must also be replicated across each instance. In our implementation, this is done by storing the session in the database.

2. When running multiple application servers on distributed systems for setup2, RoR cache implementation stores cached pages locally -- they are not shared among the servers. However, when one application server writes or updates a review, every application machine must expire its locally cached pages related to that review. This operation is performed by a bash script, rather than relying on RoR.

3. Collecting CPU utilization and disk utilization for the second setup. Since httperf is running on its own instance, it only collects the disk/CPU utilization on that node. One way to solve this is to run some CPU/disk/network monitoring utlity on each node, and have a master script that fires up the utilities. Although this was beyond the scope of the project, it was a useful insight.

Personal tools