Project 4: Website Scaling
CS290F Fall 2006 - UCSB Computer Science - Thorsten von Eicken
Up to BaDTunes project overview.
Contents
|
Scale a single server
Description of the Test Set-ups
To create the load on the servers, we used a program called httperf which was called in the following way:
httperf --hog --server servername --wsesslog=$X,0,crit_path.txt --session-cookie --rate $X --port 3000
where 1<=$X<=30
We used the range of: 1<=$X<=30 after trying a lot of different ranges (up to several hundreds) but we found that after already a few sessions per second, the req/s and s/req doesnt improve anymore.
Therefore, we executed this command 30 times for each critical path and for each version of our site (before optimization, after optimization). What this does it is that it starts $X sessions per seconds. Where each sessions has Y number of requests where Y is the length of the critical path. Critical paths have been identified in Project 3 and are the typical paths through our site that a user would go through. They model typical (expected) user behavior. Eventhough, the sessions are started sequentially (within a second), we, therefore, model concurrent users accessing our page since it takes a while to get through the user path. Note: We removed logout from our critical path since we were using the same username for all session and, thereby, a logout call would have logged out all current sessions.
httperf Workloads
NOTE: In our httperf workload, we analyzed the Mongrel and Apache logs to match the behavior of a real web browser (Firefox). Therefore, images are only retrieved once during the session because the web browser caches images and fetches the images only periodically (or as needed).
Critical Path 1
#1. Homepage
/store
/stylesheets/music.css?1163624739
/images/banner.jpg
/images/banner_filler.jpg
/images/star_full.gif
/images/star_empty.gif
/images/star_half.gif
/images/sidebar.jpg
#2. Search For Bob Marley by Artist Name
/store?query=Bob+Marley&search=artist&commit=Search
#3. View Results
#4. Listen to Song
#5. Login
/store/create_rating?search=artist&query=Bob+Marley&music_id=3583&song=The+Axe+M
an&artist=Bob+Marley
/account/login method=POST contents='login=brett&password=brett&commit=Log+in'
#6. Rates Song
/store/create_rating method=POST contents='rating%5Bmusic_id%5D=3583&rating%5Bre
view%5D=super+sstuff&rating%5Bscore%5D=5&commit=Save+Rating'
#7. Add to Playlist (Cart)
/store/add_to_cart/3583
#8. Checkout
/store/checkout
#9. Place order
/store/save_order method=POST contents='order%5Bname_cd%5D=Hello+Brett+is+using+
AutoBench&order%5Bname%5D=Brett&order%5Bstreet_address%5D=Perf&order%5Bcity%5D=P
erfLand&order%5Bstate%5D=PE&order%5Bzipcode%5D=91111&order%5Bemail%5D=perf%40htt
p.com&order%5Bpay_type%5D=cc&order%5Bcc_num%5D=349839&order%5Bcc_exp%5D=348&orde
r%5Bcc_cvv%5D=348&commit=Place+Order
#10. Log out
#/account/logout
Critical Path 2
#1. Homepage
/store
/stylesheets/music.css?1163624739
/images/banner.jpg
/images/banner_filler.jpg
/images/star_full.gif
/images/star_empty.gif
/images/star_half.gif
/images/sidebar.jpg
#2. Browse
/store
/store?page=2
/store?page=1
#3. Listen to top rated songs
#4. Add top 3 songs to playlist (Cart)
/store/add_to_cart/642
/store/add_to_cart/2505
/store/add_to_cart/8079
#5, 6. Login & Checkout
/store/checkout
/account/login method=POST contents='login=brett&password=brett&commit=Log+in'
#7. Place order
/store/save_order method=POST
contents='order%5Bname_cd%5D=Brett+is+using+AutoBench+Top+Songs&order%5Bname%5D=Brett&order%5Bstreet_address%5D=
Perf&order%5Bcity%5D=PerfLand&order%5Bstate%5D=PE&order%5Bzipcode%5D=91111&order%5Bemail%5D=perf%40http.com&order%5B
pay_type%5D=cc&order%5
Bcc_num%5D=349839&order%5Bcc_exp%5D=348&order%5Bcc_cvv%5D=348&commit=Place+Order
#8. Log out
#/account/logout
Critical Path 3
#1. Homepage
/store
/stylesheets/music.css?1163624739
/images/banner.jpg
/images/banner_filler.jpg
/images/star_full.gif
/images/star_empty.gif
/images/star_half.gif
/images/sidebar.jpg
#2. Views User CDs
/store/index_cds
/store/index_cds?page=2
/store/index_cds?page=3
#3. Adds the 3rd User CD to playlist (Cart)
/store/add_all_to_cart/127%2C124%2C78%2C
#4, 5. Login & Checkout
/store/checkout
/account/login method=POST contents='login=brett&password=brett&commit=Log+in'
#6. Place order
/store/save_order method=POST contents='order%5Bname_cd%5D=Brett+is+using+AutoBench+UserCD&order%5Bname%5D
=Brett&order%5Bstreet_address%5D=Perf&order%5Bcity%5D=PerfLand&order%5Bstate%5D=PE&order%5Bzipcode%5D=91111
&order%5Bemail%5D=perf%40http.com&order%5Bpay_type%5D=cc&order%5Bcc_num%5D=349839&order%5Bcc_exp%5D=348&order%5B
cc_cvv%5D=348&commit=Place+Order
#7. Log out
#/account/logout
Description of the Results
Optimizations:
Optimization 1 - Artwork Proxy
When a list of songs were loaded in our store, we used yahooapis.com web service to acquire an image of the artist. Since this was done for each song, this caused serious delays / slow downs. Note that this optimization was done before the testing was done. So, this optimization is not reflected in the graphs bellow.
Optimization 2 - Ratings
Originally, we calculated the ratings for a song by averaging all the rating entries of this song in the ratings table. Instead of doing this, we are now keeping subtotals of the ratings for each song in the musics table. Every time a rating is added or modified we change these fields accordingly.
Optimization 3 - Adding Indexes
Added indexes:
- Table: musics
- index for artist
- index for song
- index for album
- Table: users
- index for login
Graphs
| Critical Path 1 | Critical Path 2 | Critical Path 3 | |
|---|---|---|---|
| Request Rate | | | |
| Response Time | | | |
Explanation of Results Observed
The graphs above display the results of performing optimizations on adding indexes to our tables and improving our SQL queries for our rating system. The artwork code was commented out since results could take over a minute per page. Although not specifically evident in the preceding results, we determined that the primary bottleneck of our system is due to the database queries. These operations are essentially the only ones (other than loading images) that require disk access which is a well known bottleneck. Our primary improvements came by reducing the number of SQL queries that we were performing when browsing/searching, computing the highest rated songs, and creating more efficient database tables via indexes and improving the overall design of our system. The indexes improved our overall performance when searching for a particular artist/album/song by about 5%. We also removed the inefficient artwork that was being proxied from Yahoo's music service. Our first implementation of retrieving the album artwork was extremely slow as a result of opening an HTTP connection, using REST to query for the artwork, parsing the results to retrieve the URL, and then displaying the URL for up to 10 images per page.
On the response time graphs, we can see the improvements achieved. When the number of sessions started per second is low, the optimized and pre-optimized stores are both able to keep up with the number of requests. Once the rate increases, the pre-optimization badTunes store starts performing worse, i.e. it takes it longer to fulfill the requests and, therefore, the response time is higher than for the optimized badtunes store.
Discussion of the Limits of Scalability Encountered
The main limits of scalability appear to be with how well we can improve the efficiency of our SQL queries and reduce the number of SQL queries that we are performing.
Scale to multiple servers
Description of the Test Set-ups
EC2 Instances used:
- one load generator
- one load balancer (running on one of the BadTunes store instances)
- one database instance
- one memcached instance
- BaDTunes store instances (where i = 1, 3, 5, 10 )
Load Balancer
In order to achieve load balancing, we are using the load balancer provided by Apache 2.2 (more specifically, its proxy module, mod_proxy_balancer) which redirects the incoming HTTP requests according to a load balancer scheduler to our instance running BadTunes Store.
Load Generator
To create the load on the servers, we used a program called httperf which was called in the following way:
httperf --hog --server servername --wsesslog=$X,0,crit_path.txt --session-cookie --rate $Y --port 3000
where 0.5<=$X<=10
and $X/$Y = 30
We used the range of: 0.5<=$X<=10 since the tests in part A of project 4 showed that a single instace isnt able to handle much more than about 5 sessions per second. We chose step size of 0.5 so that we get more detailed information on the performance within this range than the step size of 1 used in part A.
The time period during which all sessions are started lasts 30 seconds. This is done by adjusting $X according to the changes in the current rate, $Y. In other words: $X/$Y=30 seconds. $Y defines how many sessions per second will be started. Each session has Z number of requests where Z is the length of the critical path. Critical paths have been identified in Project 3 and are the typical paths through our site that a user would go through. They model typical (expected) user behavior. $Y sessions are started sequentially within a second, we, therefore, model concurrent users accessing our page since it takes a while to get through the critical paths.
Note: We removed logout from our critical path since we were using the same username for all sessions and, thereby, a logout call would have logged out all current sessions.
Database Instance
This instance running on EC2 is running the database which provides the database for all the BaDTunes instances.
- ~100K songs
- 30K+ orders
Total Database size: 30MB
Since our total database size is even with about 100K songs only 30MB, MySQL can perform optimizations by loading our entire database into memory.
Memcached Instance
We are using memcached (http://www.danga.com/memcached), a distributed memory object caching system, and the memcache-client (http://rubyforge.org/projects/rctools/) to reduce the load on the database. We are storing all the session data in the memcached instance instead of in the database.
BaDTunes Store Instances
These instances are called by the load balancer and are running the BaDTunes store. In order to support multiple store instances, all the session data is stored to disk (/ memcached) so that all the store instances have access to and the session data can be carried over when a user interacts with multiple store instances within the same session.
Description/Explanation of the Results
Initial Tests
After running the first tests with multiple instances of our BaDTunes store, we discovered that the database quickly became a bottleneck (see left graph below).
| MySQL getting "Railed" | MySQL on Rails! |
|---|---|
The image above clearly demonstrates the fundamental limit of our system. Here MySQL quickly maxes out the available CPU at roughly 46.5% of the CPU. This odd limit was reaffirmed by writing a C program with an infinite loop that again maxed out the CPU around the same CPU limit. We hypothesize that this is due to Amazon restricting the maximum CPU usage per virtual machine. Therefore, they can take advantage of a dual core configuration and run 2-4 virtual machines on the same physical box. After analyzing our results, there was minimal improvements scaling to multiple servers since the database was being overloaded even by a single server (see figures below). Our system hence was unable to benefit from multiple application servers since they were stuck waiting for the database to return queries. Thus we began to scrutinize our SQL logs since something was VERY wrong.
| Replies/sec | Response Time |
|---|---|
| | |
MySQL Slow Query Log
# Time: 061203 3:57:58 # User@Host: badtunes[badtunes] @ domU-12-31-34-00-02-D0.usma2.compute.amazonaws.com [216.182.238.28] # Query_time: 2 Lock_time: 0 Rows_sent: 1 Rows_examined: 92992 SELECT count(*) AS count_all FROM musics; # Time: 061203 3:58:04 # User@Host: badtunes[badtunes] @ domU-12-31-34-00-02-D0.usma2.compute.amazonaws.com [216.182.238.28] # Query_time: 2 Lock_time: 0 Rows_sent: 10 Rows_examined: 93002 SELECT * FROM musics ORDER BY average_rating desc LIMIT 0, 10;
These two slow SQL queries showed up repeatedly during our tests since our main page executes these queries. The solution to improving these queries was not immediately apparent. We ended up finding the solutions to both slow SQL queries on the website: http://www.mysqlperformanceblog.com
- The select count(*) slow query was due to using the Innodb table type which is extremely inefficient in counting the rows in a table. (Analyzing the development logs showed that the Rails paginate function performs this count SQL query). Here is a quote from a comment posted by a user on the website:
COUNT(*) for Innodb Tables
I guess note number one about MyISAM to Innodb migration is warning what Innodb is very slow in COUNT(*) queries. The part which I often however see omitted is fact it only applies to COUNT(*) queries without WHERE clause.
So if you have query like SELECT COUNT(*) FROM USER It will be much faster for MyISAM (MEMORY and some others) tables because they would simply read number of rows in the table from stored value. Innodb will however need to perform full table scan or full index scan because it does not have such counter, it also can’t be solved by simple singe counter for Innodb tables as different transactions may see different number of rows in the table.
If you have query like SELECT COUNT(*) FROM IMAGE WHERE USER_ID=5 this query will be executed same way both for MyISAM and Innodb tables by performing index rage scan. This can be faster or slower both for MyISAM and Innodb depending on various conditions. In real applications there are much more queries of second type rather than first type so it is typically not as bad problem as it may look. Most typically count of rows is needed by admin tools which may show it in table statistics, it may also be used in application stats to show something like “We have 123.345 users which have uploaded 1.344.656 images” but these are normally easy to remove.
So remember Innodb is not slow for ALL COUNT(*) queries but only for very specific case of COUNT(*) query without WHERE clause. It does not mean I would not like to see it fixed though, it is pretty annoying.
-- http://www.mysqlperformanceblog.com/2006/12/01/count-for-innodb-tables/
ORDER BY with LIMIT queries
- The second query in which we ordered our results by the average song rating was alleviated by adding an index to the average_rating field which had been recommended here: http://www.mysqlperformanceblog.com/2006/09/01/order-by-limit-performance-optimization/
Here is a quote from the MySQL performance blog on this issue:
Suboptimal ORDER BY implementation, especially together with LIMIT is often the cause of MySQL Performance problems.
Here is what you need to know about ORDER BY ... LIMIT optimization to avoid these problems
ORDER BY with LIMIT is most common use of ORDER BY in interactive applications with large data sets being sorted. On many web sites you will fine top tags, recently registered users etc - which would often require ORDER BY with LIMIT in the back end. In general this type of ORDER BY looks like: SELECT ..... WHERE [conditions] ORDER BY [sort] LIMIT N,M
Make sure it uses index It is very important to have ORDER BY with LIMIT executed without scanning and sorting full result set, so it is important
for it to use index - in this case index range scan will be started and query execution stopped as soon as soon as required amount of rows
generated.
These 2 improvements alone increased the performance of our system by more than 400%!!!
| Reply Rate | Response Time |
|---|---|
| | |
| | |
| | |
After converting our tables to MyISAM and adding the average review index the database was no longer the bottleneck. Instead the application servers were the slowest link in our system. Thus we were able to use Apache's load balancer and add multiple application servers to partially parallelize the dynamic content generation. The more application servers that we utilized the slower and more overloaded the database became, yet it scaled relatively well even with 10 application servers with the reply rate doubling by a factor of approximately 2.5 from a single application server.
Discussion of the Limits of Scalability Encountered
In the process of our testing we encountered two primary bottlenecks in our system. As discussed previously our bottleneck was due to the MySQL database in part 1. This was consequently due to our MySQL server utilizing the Innodb table format which has notoriously poor counting performance since it does not store the number of rows in a variable thus executing a full row count upon every COUNT query. In order to mitigate this problem we switched the MySQL table format from Innodb to MyISAM which optimizes COUNT SQL queries. In addition, performing an ORDER BY with a LIMIT on the average rating field in our table was slow due to the process of sorting 100K songs by their average rating. This was easy to solve by placing a sorted index on this field. After both of these optimizations were performed our application servers then became the bottleneck, hence adding more servers increasing the performance linearly as our graphs indicate.
Possible Improvements
- Memcache more SQL queries therefore reducing the load on the database
- Replicate and/or partition our MySQL database: increase performance of read operations
- Add load balancing for our MySQL database
