[SXSW notes] Beyond LAMP: Scaling Websites Past MySQL

https://my.sxsw.com/events/event/386
Serkan Piantino - Facebook, newsfeed
Alan Schaaf - imgur.com
Kevin Weil - Twitter analytics
Christopher Slowe - reddit
Jason Kincaid (moderator)

Architecture
* Imgur
** CDN
** MySQL, memcache, HAProxy
** Uses mod_rewrite to break down the hash in image URLs
** Nginx
* Reddit
** Python
** 97% open source (the other 3% is anti-cheating code)
** Hosted on EC2
** 20 app servers running 20 processes of reddit
** Big speed boost from going single-threaded
** HAProxy
** Postgres (mostly used as a key/value store), 4 masters
** Memcache
** If you need more DB servers than app servers, you’re doing something wrong
** Using memcachedb a lot as a replacement for Postgres, moving to Cassandra
** RabbitMQ
* Twitter
** Began making it async — queueing
** Stripped out parts of Rails and Active Record
** Rewrote daemons in Scala (runs on the JVM)
** Moved towards more service-oriented architecture — more modular
** Moving to Cassandra
** Rely heavily on memcache (see the caching sketch after this list)
* Facebook
** Using MySQL as a key/value store
** 30-40 TB of memcache
** Compiled PHP into C++
** Modular systems.
** Thrift
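
Every site on the panel leans on memcache in front of its primary store. As a concrete reference point for the caching sketch mentioned above, here is a minimal read-through cache in Python; it assumes a local memcached and the pymemcache client, and the key scheme and the user-loading helper are illustrative, not any panelist's actual code.

```python
# Read-through cache sketch: check memcache first, fall back to the database
# on a miss, then write the result back so the next request is a cache hit.
# Assumes a local memcached; load_user_from_db stands in for the real MySQL query.
import json

from pymemcache.client.base import Client

CACHE_TTL = 300  # seconds; tune per data type
mc = Client(("localhost", 11211))

def load_user_from_db(user_id):
    # Stand-in for something like:
    #   cur.execute("SELECT id, name FROM users WHERE id = %s", (user_id,))
    return {"id": user_id, "name": "example"}

def get_user(user_id):
    key = f"user:{user_id}"
    cached = mc.get(key)
    if cached is not None:
        return json.loads(cached)                    # cache hit: no DB work
    user = load_user_from_db(user_id)                # miss: hit the database
    mc.set(key, json.dumps(user), expire=CACHE_TTL)  # repopulate for next time
    return user
```
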
Why Cassandra over MongoDB? (@twitter)
* Twitter is very write-heavy
* When you tweet, that message is placed in the inbox of every follower
* No disk seeks when you write in Cassandra
* No master in Cassandra — you can write to all machines
* Can use commodity machines (an advantage over MySQL)
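
The inbox model above is what makes the workload write-heavy: a single tweet turns into one write per follower. A toy fan-out-on-write sketch, with plain dicts standing in for the real store (Cassandra in Twitter's case), just to show the write amplification:

```python
# Fan-out-on-write sketch: one tweet becomes a write to every follower's
# inbox, which is what makes the workload write-heavy. Plain dicts stand in
# for the real store; names are illustrative.
from collections import defaultdict

followers = defaultdict(set)   # user id -> set of follower ids
inboxes = defaultdict(list)    # user id -> list of tweet ids (their timeline)

def follow(follower_id, followee_id):
    followers[followee_id].add(follower_id)

def post_tweet(author_id, tweet_id):
    # One logical write by the author turns into len(followers) physical writes.
    for follower_id in followers[author_id]:
        inboxes[follower_id].append(tweet_id)

follow("alice", "bob")
follow("carol", "bob")
post_tweet("bob", "tweet-1")
print(inboxes["alice"], inboxes["carol"])   # ['tweet-1'] ['tweet-1']
```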

How do you know when these tools begin to apply?
* FB: Ganglia (monitoring) - https://ganglia.sourceforge.net/
* You will inevitably be bitten in the ass by something you’re not monitoring

Replace/enhance relational databases
* memcached
* memcachedb / BerkeleyDB
* Cassandra
* Hadoop/Hive/HBase (used by Facebook)
* Tokyo Cabinet

How do you scale search?
* Slowe: using Solr across 3 machines, doing 2 queries/sec.
* Piantino: Lucene.
* “search is hard”
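
Neither panelist went into query mechanics, but the basic shape of a Solr lookup is just an HTTP request against the select handler. A minimal sketch; the host, core name, and query field are assumptions, not reddit's actual setup:

```python
# Minimal Solr query sketch: hit the /select handler over HTTP and parse the
# JSON response. Host, core name ("links"), and query field are assumptions.
import json
import urllib.parse
import urllib.request

SOLR_URL = "http://localhost:8983/solr/links/select"

def search(query, rows=25):
    params = urllib.parse.urlencode({"q": query, "wt": "json", "rows": rows})
    with urllib.request.urlopen(f"{SOLR_URL}?{params}") as resp:
        body = json.load(resp)
    return body["response"]["docs"]   # matching documents

# e.g. search("title:scaling") -> [{"id": ..., "title": ...}, ...]
```
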
What was the first thing that blew up?
* Schaaf: MySQL, then Apache; switched to Nginx.
* Weil: the social graph store. A (user id, follower id) row per edge gets very bad at billions of rows, and MySQL handles it poorly. Built de-normalized follower lists (sketch below), which bought them 6 months while they built their own social graph store in Scala.
* Piantino: they were saturating the rack switch; built rack-awareness into the app so processing stays within the rack.
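
The de-normalized-list fix mentioned above amounts to storing each user's full follower-id list as a single value keyed by user id, so a read is one fetch instead of a scan over per-edge rows. A rough sketch, with a dict standing in for the key/value store (not Twitter's actual code):

```python
# De-normalized follower lists: one key per user, the whole follower-id list
# as the value, so reads are a single fetch instead of a scan over billions
# of (user_id, follower_id) rows. A dict stands in for the key/value store.
import json

kv_store = {}   # stand-in for memcache / a key-value store

def set_followers(user_id, follower_ids):
    kv_store[f"followers:{user_id}"] = json.dumps(sorted(follower_ids))

def get_followers(user_id):
    raw = kv_store.get(f"followers:{user_id}")
    return json.loads(raw) if raw else []

def add_follower(user_id, follower_id):
    ids = set(get_followers(user_id))
    ids.add(follower_id)
    set_followers(user_id, ids)

add_follower(42, 7)
add_follower(42, 9)
print(get_followers(42))   # [7, 9]
```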

Questions:
* FB: HipHop (PHP compiled to C++) - a performance increase of 40-50%
* Twitter: very little Capistrano; deploys with Murder (BitTorrent-based). FB uses a BitTorrent-based deployment system as well.
* Hardware for the DB? FB: has played with Fusion-io cards.
* Cassandra (eventually consistent) — how do you deal with data latency? It’s a trade off. If you can’t handle EC, you can’t run something like Cassandra.
* Slowe: two-tiered caching based on the type of data. The hard part is figuring out which data changes invalidate which cache keys, i.e. knowing what needs to be refreshed.
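
Slowe didn't spell out the tiers, but a common shape is a small in-process cache in front of shared memcache, with the TTL picked per data type. A sketch; the type names, TTLs, and pymemcache client are assumptions, not reddit's actual scheme:

```python
# Two-tier cache sketch: a small in-process dict in front of shared memcache,
# with a TTL chosen per data type. TTLs, type names, and key scheme are
# illustrative, not reddit's actual configuration.
import json
import time

from pymemcache.client.base import Client

TTL_BY_TYPE = {"link": 60, "user": 300, "subreddit": 3600}  # seconds (assumed)

local = {}                         # tier 1: per-process, gone on restart
mc = Client(("localhost", 11211))  # tier 2: shared memcache

def cache_get(kind, key, loader):
    full_key = f"{kind}:{key}"
    ttl = TTL_BY_TYPE[kind]

    entry = local.get(full_key)                 # tier 1: local memory
    if entry and entry[0] > time.time():
        return entry[1]

    cached = mc.get(full_key)                   # tier 2: memcache
    if cached is not None:
        value = json.loads(cached)
    else:
        value = loader(key)                     # missed both tiers: hit the DB
        mc.set(full_key, json.dumps(value), expire=ttl)

    local[full_key] = (time.time() + ttl, value)
    return value

# e.g. cache_get("user", 42, lambda k: {"id": k, "name": "example"})
```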
