Todd Hoff's picture

Welcome to High Scalability

We started High Scalability to help you build successful scalable websites. This site tries to bring together all the lore, art, science, practice, and experience of building scalable websites into one place so you can learn how to build your system with confidence. Hopefully this site will move you further and faster along the learning curve of success. Please Start Here.

Todd Hoff's picture

New Facebook Chat Feature Scales to 70 Million Users Using Erlang

I've done some XMPP development so when I read Facebook was making a Jabber chat client I was really curious how they would make it work. While core XMPP is straightforward, a number of protocol extensions like discovery, forms, chat states, pubsub, multi user chat, and privacy lists really up the implementation complexity. Some real engineering challenges were involved to make this puppy scale and perform. It's not clear what extensions they've implemented, but a blog entry by Facebook's Eugene Letuchy hits some of the architectural challenges they faced and how they overcame them.

Todd Hoff's picture

Digg Architecture

Update: Digg now receives 230 million plus page views per month and 26 million unique visitors - traffic that necessitated major internal upgrades.

Traffic generated by Digg's over 1.2 million famously info-hungry users can crash an unsuspecting website head-on into its CPU, memory, and bandwidth limits. How does Digg handle all this load?

Todd Hoff's picture

Hitting 300 SimbleDB Requests Per Second on a Small EC2 Instance

High Performance Multithreaded Access to Amazon SimpleDB is a great follow up to the idea in How SimpleDB Differs from a RDBMS that more programming is the price paid for performance in SimpleDB. It shows how much work and infrastructure is required to batter better performance out of SimpleDB.

Remember, in SimpleDB you get keys to records from queries so if you want to get all the fields for records you need to make separate requests. Since SimpleDB isn't exactly a speed daemon the obvious strategy is to parallelize. Even if a job takes a 100 msecs you can get a lot done in a little time if you can execute enough jobs in parallel.

Parallelization is the approach taken by Haakon@AWS in his Java code example of how to get the most out of SimpleDB. You can find the code at Indexing and Querying Amazon S3 Metadata with Amazon SimpleDB. We'll also consider how a back-end service architecture built on Erlang may be a better fit with cloud computing.

Todd Hoff's picture

HSCALE - Handling 200 Million Transactions Per Month Using Transparent Partitioning With MySQL Proxy

Update 2: A HSCALE benchmark finds HSCALE "adds a maximum overhead of about 0.24 ms per query (against a partitioned table)." Future releases promise much improved results.
Update: A new presentation at An Introduction to HSCALE.

After writing Skype Plans for PostgreSQL to Scale to 1 Billion Users, which shows how Skype smartly uses a proxy architecture for scaling, I'm now seeing MySQL Proxy articles all over the place. It's like those "get rich quick" books that say all you have to do is visualize a giraffe with a big yellow dot superimposed over it and by sympathetic magic giraffes will suddenly stampede into your life. Without realizing it I must have visualized transparent proxies smothered in yellow dots.

One of the brightest images is a wonderful series of articles by Peter Romianowski describing the evolution of their proxy architecture. Their application is an OLTP system executing 200 million transaction per month, tables with more than 1.5 billion rows, and a 600 GB total database size. They ran into a wall buying bigger boxes and wanted to move to a sharded architecture. The question for them was: how do you implement sharding?

Todd Hoff's picture

Strategy: Break Up the Memcache Dog Pile

Caching is like aspirin for headaches. Head hurts: pop a 'sprin. Slow site: add caching. Facebook must have a lot of headaches because they popped 805 memcached servers between 10,000 web servers and 1,800 MySQL servers and they reportedly have a 99% cache hit rate. But what's the best way for you to cache for your application? It's a remarkably complex and rich topic. Alexey Kovyrin talks about one common caching problem called the Dog Pile Effect in Dog-pile Effect and How to Avoid it with Ruby on Rails. Glenn Franxman also has a Django solution in MintCache.

Data is usually cached because it's too expensive to calculate for every hit. Maybe it's a gnarly SQL query you want to avoid and a little stale data is OK. Or maybe the amount of data you have is simply larger than physical memory on any one machine. Or maybe you have the temerity to write to your database and cause its cache to flush so database caching isn't sufficient at a certain level of scale.

Typical examples are for caching article vote counts, comment threads, and event streams. One familiar example that bit me hard is displaying the the top N blog articles. Do you want to scan through your entire access log table for every page display? Absolutely not. Especially when the nightly backups are going on and the network is very slow. Not good :-) Yet you still want to update the results every X minutes so the stats stay fresh.

Data freshness requires a refrigeration truck or an expiry time on your cache entry that causes stats to be periodically recalculated. Now, what happens when your cached data expires and a 1000 requests simultaneously try to recalculate the expensive to calculate data? Database load spikes and the world nearly ends. And since memcached operations are not atomic it's possible stale data could be cached and you'll serve stale data. Which kind of defeats of the purpose of taking load off the data while providing accurate data. So, how do you unpile the dogs?

Put the web server on a diet and increase scalability

Misusing HTTP sessions is probably the number one obstacle to building scalable web sites today. Here are some tips how to consume HTTP sessions responsibly.

Todd Hoff's picture

Product: nginx

Update 5: In Load Balancer Update Barry describes how WordPress.com moved from Pound to Nginx and are now "regularly serving about 8-9k requests/second and about 1.2Gbit/sec through a few Nginx instances and have plenty of room to grow!".
Update 4: Nginx better than Pound for load balancing. Pound spikes at 80% CPU, Nginx uses 3% and is easier to understand and better documented.
Update 3: igvita.com combines two cool tools together for better performance in Nginx and Memcached, a 400% boost!.
Update 2: Software Project on Installing Nginx Web Server w/ PHP and SSL. Breaking away from mother Apache can be a scary proposition and this kind of getting started article really helps easy the separation.
Update: Slicehost has some nice tutorials on setting up Nginx.

From their website:
Nginx ("engine x") is a high-performance HTTP server and reverse proxy, as well as an IMAP/POP3/SMTP proxy server. Nginx was written by Igor Sysoev for Rambler.ru, Russia's second-most visited website, where it has been running in production for over two and a half years. Igor has released the source code under a BSD-like license. Although still in beta, Nginx is known for its stability, rich feature set, simple configuration, and low resource consumption.

Todd Hoff's picture

Friends for Sale Architecture - A 300 Million Page View/Month Facebook RoR App

Update: Jake in Does Django really scale better than Rails? thinks apps like FFS shouldn't need so much hardware to scale.

In a short three months Friends for Sale (think Hot-or-Not with a market economy) grew to become a top 10 Facebook application handling 200 gorgeous requests per second and a stunning 300 million page views a month. They did all this using Ruby on Rails, two part time developers, a cluster of a dozen machines, and a fairly standard architecture. How did Friends for Sale scale to sell all those beautiful people? And how much do you think your friends are worth on the open market?

Todd Hoff's picture

10 More Rules for Even Faster Websites

80-90% of the end-user response time is spent on the frontend, so it makes sense to concentrate efforts there before heroically rewriting the backend. Take a shower before buying a Porsche, if you know what I mean. Steve Souders, author of High Performance Websites and Yslow, has ten more best practices to speed up your website:

  • Split the initial payload
  • Load scripts without blocking
  • Don’t scatter scripts
  • Split dominant content domains
  • Make static content cookie-free
  • Reduce cookie weight
  • Minify CSS
  • Optimize images
  • Use iframes sparingly
  • To www or not to www

    Sadly, according to String Theory, there are only 26.7 rules left, so get them while they're still in our dimension.

    Here are slides on the first few rules. Love the speeding dog slide. That's exactly what my dog looks like traveling down the road, head hanging out the window, joyfully battling the wind.

    Also see 20 New Rules for Faster Web Pages.

  • Marcelb's picture

    Rather small site arhitecture.

    Website stats:

    Webserver: Apache 2.2
    Database: MySQL 5.0
    APC cache for php
    CMS: Drupal 6.2 (bleeding-edge version)*
    *Aggressive caching is ON, Page Compression ON, Block Cache ON (can't use CCS),Optimize CSS/JS ON.
    2 Servers: Apache/Mysql (low-tech servers - Celeron processors, 512 MB RAM, 7200 RPM HDD)
    Bandwidth 10 Mb/s

    The benchmark:

    Used ab : ab -n 1000 -c 20 howwhatwho.com

    Server Software: Apache/2.2.3
    Server Hostname: howwhatwho.com
    Server Port: 80
    Document Path: /
    Document Length: 41639 bytes
    Concurrency Level: 20
    Time taken for tests: 13.556796 seconds
    Complete requests: 1000
    Failed requests: 0
    Write errors: 0
    Total transferred: 42118000 bytes
    HTML transferred: 41639000 bytes
    Requests per second: 73.76 [#/sec] (mean)
    Time per request: 271.136 [ms] (mean)
    Time per request: 13.557 [ms] (mean, across all concurrent requests)
    Transfer rate: 3033.90 [Kbytes/sec] received

    The Apache server is also running the postifx and bind although they aren't resource intensive applications.
    The Cron job for drupal runs every 50 minutes, and the agreggator module is enabled and fetches more than 30 rss feeds every time.

    The site used to be hosted on a single Celeron machine but on peak times the CPU went up to 80 %.

    Question : Does anybody know a website hosted on an IBM Mainframe? :) Todd?

    Todd Hoff's picture

    Strategy: Sample to Reduce Data Set

    Update: Arjen links to video Supporting Scalable Online Statistical Processing which shows
    "rather than doing complete aggregates, use statistical sampling to provide a reasonable estimate (unbiased guess) of the result."

    When you have a lot of data, sampling allows you to draw conclusions from a much smaller amount of data. That's why sampling is a scalability solution. If you don't have to process all your data to get the information you need then you've made the problem smaller and you'll need fewer resources and you'll get more timely results.

    Todd Hoff's picture

    Product: Hbase

    Update: InfoQ interview: HBase Leads Discuss Hadoop, BigTable and Distributed Databases. "MapReduce (both Google's and Hadoop's) is ideal for processing huge amounts of data with sizes that would not fit in a traditional database. Neither is appropriate for transaction/single request processing."

    Hbase is the open source answer to BigTable, Google's highly scalable distributed database. It is built on top of Hadoop (product), which implements functionality similar to Google's GFS and Map/Reduce systems.

    Todd Hoff's picture

    Paper: MapReduce: Simplified Data Processing on Large Clusters

    With Google entering the cloud space with Google AppEngine and a maturing Hadoop product, the MapReduce scaling approach might finally become a standard programmer practice. This is the best paper on the subject and is an excellent primer on a content-addressable memory future.

    Some interesting stats from the paper: Google executes 100k MapReduce jobs each day; more than 20 petabytes of data are processed per day; more than 10k MapReduce programs have been implemented; machines are dual processor with gigabit ethernet and 4-8 GB of memory.

    One common criticism ex-Googlers have is that it takes months to get up and be productive in the Google environment. Hopefully a way will be found to lower the learning curve and make programmers more productive faster.

    From the abstract:

    Todd Hoff's picture

    How do you explain cloud computing to your grandma?

    Once upon a time I worked at an Asynchronous Transfer Mode (ATM) switch startup. Over a delicious Christmas punch my grandma asked me what I did for a living that I could afford such extravagantly inexpensive gifts. Always so subtle. I explained I worked on an ATM switch. Mistake. She sniffed, said that's nice, and asked me why the Automated Teller Machine ate her bank card that morning. No matter how hard I tried I couldn't convince her I didn't work on bank ATMs. To all future job interrogations I waxed off, protesting I do boring software stuff that nobody cares about.

    Not put off in the least, grandma asked me last night to explain this cloud computing thing she keeps hearing about at her church club. Afraid of being another victim of the distortion field surrounding cloud computing, I instead referred her to Kent Langley's excellent overview of the subject in Cloud Computing: Get Your Head in the Clouds. It does a good job demystifying the very confusing concept of cloud computing. It has nice diagrams, definitions, examples and is a great place to start.

    She agreed that she had learned a lot, but one thing still troubled her: what's the difference between cloud computing and utility computing? They seem to be the same to her. Always so perceptive. She felt sure if she could drive this point home she would score big points with her church group. Oh the pressure.

    Behind The Scenes of Google Scalability

    The recent Data-Intensive Computing Symposium brought together experts in system design, programming, parallel algorithms, data management, scientific applications, and information-based applications to better understand existing capabilities in the development and application of large-scale computing systems, and to explore future opportunities.

    Google Fellow Jeff Dean had a very interesting presentation on Handling Large Datasets at Google: Current Systems and Future Directions. He discussed:

    • Hardware infrastructure
    • Distributed systems infrastructure:
    –Scheduling system
    –GFS
    –BigTable
    –MapReduce
    • Challenges and Future Directions
    –Infrastructure that spans all datacenters
    –More automation

    It is really like a "How does Google work" presentation in ~60 slides?

    Check out the slides and the video!

    Todd Hoff's picture

    Using Google AppEngine for a Little Micro-Scalability

    Over the years I've accumulated quite a rag tag collection of personal systems scattered wide across a galaxy of different servers. For the past month I've been on a quest to rationalize this conglomeration by moving everything to a managed service of one kind or another. The goal: lift a load of worry from my mind. I like to do my own stuff my self so I learn something and have control. Control always comes with headaches and it was time for a little aspirin. As part of the process GAE came in handy as a host for a few Twitter related scripts I couldn't manage to run anywhere else. I recoded my simple little scripts into Python/GAE and learned a lot in the process.

    Todd Hoff's picture

    The Search for the Source of Data - How SimpleDB Differs from a RDBMS

    Update 2: Yurii responds with the Top 10 Reasons to Avoid Document Databases FUD.
    Update: Top 10 Reasons to Avoid the SimpleDB Hype by Ryan Park provides a well written counter take. Am I really that fawning? If so, doesn't that make me a dear?

    All your life you've used a relational database. At the tender age of five you banged out your first SQL query to track your allowance. Your RDBMS allegiance was just assumed, like your politics or religion would have been assumed 100 years ago. They now say--you know them--that relations won't scale and we have to do things differently. New databases like SimpleDB and BigTable are what's different. As a long time RDBMS user what can you expect of SimpleDB? That's what Alex Tolley of MyMeemz.com set out to discover. Like many brave explorers before him, Alex gave a report of his adventures to the Royal Society of the AWS Meetup. Alex told a wild almost unbelievable tale of cultures and practices so different from our own you almost could not believe him. But Alex brought back proof.

    Using a relational database is a no-brainer when you have a big organization behind you. Someone else worries about the scaling, the indexing, backups, and so on. When you are out on your own there's no one to hear you scream when your site goes down. In these circumstances you just want a database that works and that you never have to worry about again. That's what attracted Alex to SimpleDB. It's trivial to setup and use, no schema required, insert data on the fly with no upfront preparation, and it will scale with no work on your part. You become free from DIAS (Database Induced Anxiety Syndrome). You don't have to think about or babysit your database anymore. It will just work. And from a business perspective your database becomes a variable cost rather than a high fixed cost, which is excellent for the angel food funding. Those are very nice features in a database. But for those with a relational database background there are some major differences that take getting used to.

    Google App Engine - what about existing applications?

    Recently, Google announced Google App Engine, another announcement in the rapidly growing world of cloud computing. This brings up some very serious questions:

    1. If we want to take advantage of one of the clouds, are we doomed to be locked-in for life?
    2. Must we re-write our existing applications to use the cloud?
    3. Do we need to learn a brand new technology or language for the cloud?

    This post presents a pattern that will enable us to abstract our application code from the underlying cloud provider infrastructure. This will enable us to easily migrate our EXISTING applications to cloud based environment thus avoiding the need for a complete re-write.

    Todd Hoff's picture

    What CDN would you recommend?

    Update 3: Mr. Rayburn takes A Detailed Look At Akamai's Application Delivery Product . They create a "bi-nodal overlay network" where users and servers are always within 5 to 10 milliseconds of each other. Your data center hosted app can't compete. The problem is that people (that is, me) can understand the data center model. I don't yet understand how applications as a CDN will work.
    Update 2: Dan Rayburn starts an interesting series of articles on Highlights Of My Day In Cambridge With Akamai. Akamai is moving strong into the application distribution business. That would make an interesting cloud alternative..
    Update: Streamingmedia links to new CDN DF Splash that specializes in instant-on TV-quality video streaming.

    A question was raised on the forum asking for a CDN recommendation. As usual there are no definitive answers, but here are three useful articles that may help your deliberations.

  • First, Tony Chang shows how to drive down response times using edge acceleration strategies.
  • Then Pingdom gives a nice overview and introduction to CDNs.
  • And last but not least, Dan Rayburn from StreamingMedia.com gives a master class in how much you should pay for your CDN, what you should be getting for your money, and how to find the right provider for your needs.

    Lots and lots of good stuff to learn, even if you didn't roll out of bed this morning pondering the deeper mysteries of content delivery networks and the Canadian dollar.

  • Todd Hoff's picture

    Product: Hadoop

    Update 2: Hadoop Summit and Data-Intensive Computing Symposium Videos and Slides. Topics include: Pig, JAQL, Hbase, Hive, Data-Intensive Scalable Computing, Clouds and ManyCore: The Revolution, Simplicity and Complexity in Data Systems at Scale, Handling Large Datasets at Google: Current Systems and Future Directions, Mining the Web Graph. and Sherpa: Hosted Data Serving.
    Update: Kevin Burton points out Hadoop now has a blog and an introductory video staring Beyonce. Well, the Beyonce part isn't quite true.

    Hadoop is a framework for running applications on large clusters of commodity hardware using a computational paradigm named map/reduce, where the application is divided into many small fragments of work, each of which may be executed on any node in the cluster. It replicates much of Google's stack, but it's for the rest of us. Jeremy Zawodny has a wonderful overview of why Hadoop is important for large website builders:

    Todd Hoff's picture

    Scaling Mania at MySQL Conference 2008

    The 2008 MySQL Conference & Exp