Reviewlicious
CS290F Fall 2006 - UCSB Computer Science - Thorsten von Eicken
Authors
- Yuemin Yu (yyu [at] cs) (yueminyu [at] gmail [dot] com)
- Bita Mazloom (betamaz [at] cs) (betamaz [at] gmail [dot] com)
Description / Functionalities
Users seeking an online service often face the question: "Is this a bogus review? Based on it, do I want to pay for the service?" Our goal is to design a new product review system that is better than those commonly available, and in this way strengthen a user's confidence in the service they choose. We've created a restaurant search tool that serves as a standard application for developing our web service and testing our ranking system. We incorporate current tools that give users an easy mechanism to share and contribute their views. Users can leave feedback on restaurants and rate other users' reviews. By adding tags, users of the site experience their local offerings from a different angle.
Server
Reviewlicious on Amazon EC2 user/pass: yyu/1234
Reviewlicious Load Balancer Manager
Stats
As of 2006/12/13 8:51PM
- Number of Restaurants: 100154
- Number of Restaurant Reviews: 1001
- Number of Restaurant Tags: 2706
- Number of Unique Tags: 45
Screenshots
Goals
Features Proposed
Expected
- Review restaurants
- Tag restaurants
- See other people's reviews
- Approve/disapprove others' reviews
- Using reviewer's reputation to weight reviews
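One way the proposed reputation weighting could work is a weighted mean, sketched below. This is only an illustration: the `Review` class, the `reputation` field, and the formula are hypothetical and not taken from the project, which does not list this feature as achieved.

```python
from dataclasses import dataclass


@dataclass
class Review:
    rating: float       # overall score given by the reviewer
    reputation: float   # hypothetical reviewer reputation, assumed >= 0


def weighted_rating(reviews):
    """Return the reputation-weighted mean rating, or None if there is no weight."""
    total_weight = sum(r.reputation for r in reviews)
    if total_weight == 0:
        return None
    return sum(r.rating * r.reputation for r in reviews) / total_weight


# A low-reputation 5 counts for less than a high-reputation 2.
reviews = [Review(rating=5, reputation=1.0), Review(rating=2, reputation=3.0)]
score = weighted_rating(reviews)
```

A reviewer's reputation could, for example, be derived from how often other users approve or disapprove their reviews, but that mapping is left open here.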
Optional
- Locate restaurants by zip code
- Map restaurant coordinates on Google/Yahoo/MS live maps
- Map by zipcode & tag
- MS Live "birds eye" using coordinates
Features Achieved
- Review restaurants
- Search restaurants
- Tag restaurants
- Search by tag
- See other people's reviews
- Approve/disapprove others' reviews
- Map restaurant coordinates on Google maps
Web Service Setup
Amazon Elastic Computing Cloud Setup
Basic Setup
- 1 Apache Reverse Proxy Load-balancing Server + Memcached Server
- 1 MySQL Database Server
- 1 to 15 Application Servers, each running a cluster of 4 Mongrel servers
Load Balancing
- We use lbmethod=bytraffic instead of the default lbmethod=byrequests.
- Balancing by bytes of throughput seems a better way to spread load between the servers, since the number of bytes served most likely indicates heavier load than the number of requests does.
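The difference between the two policies can be sketched with a toy simulation. This is a simplified model, not Apache's actual balancing algorithm: the backend names, the response sizes, and the "pick the least-loaded backend" rule are all assumptions made for illustration.

```python
def pick_byrequests(stats):
    # Choose the backend that has handled the fewest requests so far.
    return min(stats, key=lambda name: stats[name]["requests"])


def pick_bytraffic(stats):
    # Choose the backend that has transferred the fewest bytes so far.
    return min(stats, key=lambda name: stats[name]["bytes"])


def simulate(picker, response_sizes, backends=("app1", "app2")):
    """Assign each response to a backend and return per-backend totals."""
    stats = {name: {"requests": 0, "bytes": 0} for name in backends}
    for size in response_sizes:
        chosen = picker(stats)
        stats[chosen]["requests"] += 1
        stats[chosen]["bytes"] += size
    return stats


# Hypothetical response sizes in bytes: large pages alternate with small ones.
sizes = [9000, 1000] * 4

by_requests = simulate(pick_byrequests, sizes)
by_traffic = simulate(pick_bytraffic, sizes)
```

With this input, counting requests sends every large page to the same backend, while counting bytes evens out the traffic each backend serves.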
Database
Tables
Testing Setup
Testing Path
- We are using one path this time, to simulate more realistic usage across the entire site rather than just a set of possible paths. The path consists of several critical paths, each repeated according to its popularity. For example, a user is likely to search many more times than contribute reviews.
- User logs in (we assume user 'yyu' has previously signed up and enters a valid username and password combination).
- User looks up the first restaurant in the listings.
- User adds a review and is taken to the listings page.
- User moves on to page 2 of the listings.
- User tags restaurant 6 with 'bitas_tag'.
- User searches for 'chinese'.
- User navigates to a few pages (2, 3 and 4).
- User tags restaurant 19 with 'bitas_tag'.
- User searches for 'mexican', 'goleta', 'italian'.
- User tags restaurant 7570 with 'bitas_tag'.
- User searches for 'burger'.
- User looks up a restaurant from search results.
- User logs out.
/
/stylesheets/screenstyle.css
/restaurant/index
/account/login
/account/login method=POST contents='login=yyu&password=1234'
/restaurant
/restaurant/list
/restaurant_review/new/1
/restaurant_review/create method=POST content='restaurant_review[user_id]=1&restaurant_review[overall]=2&restaurant_review[food]=2&restaurant_review[service]=2 \
&restaurant_review[ambiance]=2&restaurant_review[value]=2&restaurant_review[pro]=blahblah&restaurant_review[con]=moreblah \
&restaurant_review[comment]=nada'
/restaurant/list
/restaurant/list?page=2
/restaurant_tag/new/6
/restaurant_tag/create method=POST contents='restaurant_tag[restaurant_id]=bitas_tag'
/restaurant/list
/restaurant/list?page=2
/restaurant/show/6
/restaurant/search2 method=POST contents='restaurant[search]=chinese'
/restaurant/index
/restaurant/list?page=2
/restaurant/list?page=3
/restaurant/list?page=4
/restaurant_tag/new/19
/restaurant_tag/create method=POST contents='restaurant_tag[restaurant_id]=bitas_tag'
/restaurant/list
/restaurant/search2 method=POST contents='restaurant[search]=mexican'
/restaurant/show/53199
/restaurant/search2 method=POST contents='restaurant[search]=goleta'
/restaurant/show/50
/restaurant/search2 method=POST contents='restaurant[search]=italian'
/restaurant/show/5
/restaurant_tag/new/7570
/restaurant_tag/create method=POST contents='restaurant_tag[restaurant_id]=bitas_tag'
/restaurant/search2 method=POST contents='restaurant[search]=burger'
/restaurant/show/50
/restaurant/show/43416
/account/logout
/restaurant
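To check how often each action appears in a workload like the one above, the request lines can be tallied with a short script. This is a minimal sketch: the `parse_line` and `count_requests` helpers are hypothetical, only a handful of the workload lines are included, and the parser assumes one request per line with no spaces inside the POST contents.

```python
from collections import Counter

# A few request lines copied from the workload above.
WORKLOAD = """\
/account/login method=POST contents='login=yyu&password=1234'
/restaurant/list?page=2
/restaurant_tag/create method=POST contents='restaurant_tag[restaurant_id]=bitas_tag'
/restaurant/search2 method=POST contents='restaurant[search]=chinese'
/restaurant/search2 method=POST contents='restaurant[search]=mexican'
/restaurant/show/53199
"""


def parse_line(line):
    """Split one workload line into (path without query string, HTTP method)."""
    uri, _, rest = line.strip().partition(" ")
    method = "GET"  # lines without an explicit method are treated as GET
    for token in rest.split():
        if token.startswith("method="):
            method = token[len("method="):]
    path = uri.split("?", 1)[0]
    return path, method


def count_requests(workload):
    """Count how many times each (path, method) pair occurs."""
    counts = Counter()
    for line in workload.splitlines():
        if line.strip():
            counts[parse_line(line)] += 1
    return counts


counts = count_requests(WORKLOAD)
```

Run over the full workload, a tally like this makes it easy to confirm that searches outnumber review submissions, as the testing path intends.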
Experiments
Overview of Performance Results
- The first set of two graphs, labeled 'Overview of Performance Results', shows an overview of our results across all tests. The entire experiment consists of testing 1..15 servers with caching enabled and 1 server with no cache. As shown, the difference between one server with and without the optimized memcached is significant: a server without caching averages about 5.75 replies/sec, while one server with caching averages 8.50 replies/sec. We see the same pattern in response time (ms) and session initiation rate, i.e. caching is a must-have optimization. The overview also shows how dramatic a difference an additional server with memcached makes to the number of concurrent requests our site can handle. With two servers, we come close to doubling the number of sessions that are handled, achieving an average of 10 replies/sec. Also, with two servers the response time more than halves, going down to an average of 2.45 seconds (2450 ms).
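The relative gains implied by the replies/sec averages quoted above can be worked out directly. The sketch below uses only those three figures; the `relative_gain` helper is a hypothetical name, and the result covers replies/sec only, not sessions handled or response time.

```python
# Average replies/sec quoted in the overview above.
no_cache_1_server = 5.75
cache_1_server = 8.50
cache_2_servers = 10.0


def relative_gain(before, after):
    """Fractional improvement of `after` over `before`."""
    return (after - before) / before


caching_gain = relative_gain(no_cache_1_server, cache_1_server)
second_server_gain = relative_gain(cache_1_server, cache_2_servers)

print(f"caching: +{caching_gain:.1%}")              # caching: +47.8%
print(f"second server: +{second_server_gain:.1%}")  # second server: +17.6%
```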
Close Up of Performance Results
- The second set of graphs, labeled 'Close Up of Performance Results', shows a close-up view of our results for the tests that involve more than one server with caching enabled. For this set, the graph of response time is much easier to read than the graph of replies per second. Starting with the response time graph, we see that increasing the number of servers from 1..4 quickly ramps up the response time from approx 1.85 seconds (1850 ms) to 2.35 seconds (2350 ms). At first this may seem counterintuitive, until we look at two key properties of the performance tests: the web service bottleneck and, to a lesser degree, the impact of our experiment setup.
- As shown in the 'Testing Setup' subsection 'Testing Path', our httperf workload file consists mainly of database-intensive actions, such as searching. There is only one MySQL server available, so as the number of application servers increases, the number of concurrent sessions that can be handled (and therefore created) increases as well. The result is that more servers are waiting in queue to have their requests served by the database.
- Each httperf test creates a total of only 30 sessions. To keep the tests consistent, and not wanting to spend our whole lives babysitting the numerous running servers, we traded test length for the greater number of servers we scaled to. While 30 total sessions seemed enough at first (and, looking closely, we can see on average a minor increase in replies per second across 2..15 servers), it is difficult to distinguish the effect that adding more than two servers has on replies/sec and response time. To see a more dramatic difference between the replies/sec results of 2..15 servers, we would need to significantly increase the total number of sessions each httperf test creates.
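The database bottleneck described above can be illustrated with a toy queueing model. It is a sketch under strong assumptions, not a model of the actual tests: every application server is assumed to send one query at the same instant to a single first-in-first-out database, and the 0.1 s service time is a made-up number.

```python
def mean_db_wait(concurrent_requests, db_service_time):
    """Mean completion time when requests arrive together at one FIFO server."""
    # The request in queue position k (starting at 0) finishes after k + 1 service times.
    finish_times = [
        (position + 1) * db_service_time
        for position in range(concurrent_requests)
    ]
    return sum(finish_times) / concurrent_requests


# Hypothetical numbers: one in-flight query per app server, 0.1 s per query.
for app_servers in (1, 2, 4, 15):
    print(app_servers, round(mean_db_wait(app_servers, 0.1), 3))
```

In this model the mean wait grows linearly with the number of servers sharing the database, which matches the direction of the trend in the response time graph, though not necessarily its magnitude.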
Sources
Data
- Yahoo Local V3 Search on the term "restaurants" in various areas
- Google mapping API
Photos
- Photo Credit: mbauhs, Watermellon Face
- Photo Credit: tomaste, Apple
- Photo Credit: de jack Mamsall, Help!
Others
- The star rating system is based on code provided by Rogie King's CSS Star Rating Part Deux tutorial
- CSS Template: Open Source Web Design
Project Iterations
- Project 1 - Start this wiki, and fetch items from Yahoo Local
- Project 2 - Start site writing. Design database. Add Authentication
- Project 3 - Determine critical path in site. Optimize DB/code
- Project 4A - Deploy on one Amazon Elastic Computing Cloud server. Test performance. Optimize.
- Project 4B - Deploy on multiple Amazon Elastic Computing Cloud servers. Test performance. Optimize.







