From time to time, I'll switch from talking business to digging into tech details, because doing more with fewer servers or other resources saves you money, and that's good for business, especially when bootstrapping.
For this tutorial, you'll need to be running your apps in PHP, be able to configure it, and be able to install software on your server. If you're still reading, I'll assume you're comfortable with all of that. Let's go!
Important update: see bottom of post
The Usual: Memcache
You may be familiar with memcache. Memcache is a simple, fast, in-memory caching system used for data that is requested often. By using memcache, you avoid repeatedly pulling information from disk or from a database query. This speeds up serving requests and reduces disk accesses. The cache can easily be spread across servers, so you are not bound to the memory limits of a single server. Your memcache storage can be as large as the combined memory of many servers.
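To make the "avoid repeated queries" idea concrete, here is a minimal sketch of the usual check-the-cache-first pattern. The names fetchWithCache() and ArrayCache are made up for illustration, and ArrayCache is only an in-memory stand-in so the example runs without a memcache server. In a real app you would pass in a Memcached object instead.

```php
<?php
// Sketch only: fetchWithCache() and ArrayCache are hypothetical names.
// $cache is any object with get($key) and set($key, $value, $ttl),
// which matches the shape of PHP's Memcached client.
function fetchWithCache($cache, string $key, int $ttl, callable $loader)
{
    $value = $cache->get($key);
    if ($value !== false) {
        return $value;            // cache hit: skip the slow lookup
    }
    $value = $loader();           // cache miss: do the expensive work once
    $cache->set($key, $value, $ttl);
    return $value;
}

// Tiny in-memory stand-in so the sketch runs without a memcache server.
// Like Memcached::get(), it returns false on a miss. It ignores $ttl.
final class ArrayCache
{
    private $data = [];

    public function get($key)
    {
        return array_key_exists($key, $this->data) ? $this->data[$key] : false;
    }

    public function set($key, $value, $ttl = 0)
    {
        $this->data[$key] = $value;
        return true;
    }
}

$cache = new ArrayCache();
$queries = 0;
$loadUser = function () use (&$queries) {
    $queries++;                   // pretend this is a database query
    return ['id' => 42, 'name' => 'Alice'];
};

fetchWithCache($cache, 'user:42', 300, $loadUser);
fetchWithCache($cache, 'user:42', 300, $loadUser);
echo $queries, "\n"; // 1: the second call was served from the cache
```

One caveat with this simple version: a stored value of false looks the same as a miss, so it would be reloaded every time.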
Memcache and Sessions
In PHP, user sessions are stored on disk by default. This is fine for keeping track of sessions for small traffic and a small number of users, capable of being handled by a single server. But once your site is getting heavy traffic, the repeated disk accesses for session files will start to slow down requests, as well as exhausting the filesystem's capability of handling many files per directory.
Should you decide to increase the number of servers to handle the traffic (using a load balancer) you'll run into the next challenge: users who hit one server on one request may not end up on the same server on the next request. So the session that had their logged-in state may be on server A, but on next request they get routed to server B which doesn't have a session for them, or may have a logged-out session for them. So the user ends up being logged in or logged out, and it's at the whim of the load balancer. Some load balancers are smart in that they will keep a user "stuck" to one server for session's sake.
The next logical step is to put sessions in memcache or in the database, so that no matter which server the user ends up on, their session will still be available for the request. Putting sessions in the database is better than files, but still involves a database request for every page request. The better option is to put sessions in memcache. Since all servers share the same memcache storage pool, each has access to all the sessions, and the session lookups are super fast.
The Problem with Memcache
Memcache is great, but once you start running low on memory (as you cache more info) lesser-used items in the cache will be destroyed to free up more space for new items. This can result in users getting logged out. Also, if one of the servers in the pool fails or gets rebooted, all the data it was holding is lost, and then the cache must get "warmed up" again.
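Here is a toy model of that eviction behavior, just to show why a quiet user's session can vanish. TinyLruCache is a made-up class that counts items rather than bytes. Real memcached manages memory in a more involved way, so treat this as an illustration of the idea only.

```php
<?php
// Toy least-recently-used cache. Hypothetical class, not memcached's real
// implementation: it limits the number of items instead of memory used.
final class TinyLruCache
{
    private $capacity;
    private $items = []; // PHP arrays keep insertion order: oldest first

    public function __construct(int $capacity)
    {
        $this->capacity = $capacity;
    }

    public function get(string $key)
    {
        if (!array_key_exists($key, $this->items)) {
            return null;
        }
        $value = $this->items[$key];
        unset($this->items[$key]);
        $this->items[$key] = $value; // re-insert so it counts as recently used
        return $value;
    }

    public function set(string $key, $value): void
    {
        unset($this->items[$key]);
        $this->items[$key] = $value;
        if (count($this->items) > $this->capacity) {
            reset($this->items);
            unset($this->items[key($this->items)]); // drop the oldest item
        }
    }
}

$cache = new TinyLruCache(2);
$cache->set('session:alice', 'logged-in');
$cache->set('session:bob', 'logged-in');
$cache->get('session:alice');          // alice is now the most recently used
$cache->set('page:home', '<html>...'); // cache is full, so bob gets evicted
var_dump($cache->get('session:bob'));  // NULL: bob is effectively logged out
```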
What's a developer to do?
Membase, the Next Big Thing
Membase is memcache with data persistence. And it doesn't use something like memcache, it is memcache. So if you have code that already is using memcache, you can have it use membase right away, usually with no change to your code.
The improvement of having data persistence is that if you need to bring down a server, you don't have to worry about all that dainty, floaty data in memory that is gonna get burned. Since membase has replication and persistence built-in, you can feel free to restart a troublesome server without fear of your database getting pounded as the caches need to refill, or that a set of unlucky users will get logged out. I'll let you read about all the [many other advantages of membase here]. It's much more than I've mentioned here.
Membase is a NoSQL solution, so if you've been looking to try one out but haven't had data problems big enough, this is your opportunity to jump in.
Membase server comes in two flavors: community and enterprise. The community version is open source, and doesn't have some of the high-end features of the enterprise version. But for our project here and many of your projects, the community version will be more than enough. Enterprise is not necessarily for the bootstrapped. You get more features and support, but you'll pay thousands for it.
Who's using it? Between Membase and its cousin, Couchbase (based on CouchDB) there are a number of high performance and high-traffic sites using them.
Let's start in on the real meat and potatoes here.
I'm going to focus on a Debian/Ubuntu-based package install, since I'm fond of it and stopped liking installing from source years ago. Those on other Linux platforms should be comfortable here, however. No worries, folks.
Cleaning up old Junk
Since Membase provides memcache within its install, we don't want any previously installed memcache stuff hanging around and messing up our mojo.
These are the Ubuntu packages you'll want to get rid of:
- memcached
- memcache (may or may not exist as a package for your Linux flavor)
That's right, get rid of them. Just apt-get remove 'em.
In addition, make sure you remove any memcache.ini or memcached.ini that may be sitting in /etc/php5/conf.d. Then run
service apache2 restart
to restart apache.
Getting on the Right Memcache
You'll notice I listed memcache and memcached. We're going to stick to memcached for the remainder of this tutorial. Yes, it matters.
Getting the code
Since we're staying away from the Enterprise version here, let's go straight to the download page.
As you can see, there are binaries available for a number of platforms, including 64-bit versions.
After downloading, let's run (on Ubuntu 64-bit):
dpkg -i membase-server-community_x86_64_220.127.116.11.deb
Like I said, once it's done with a very quick install, you'll have memcached installed and running. You can run this to see the services it runs, including its bundled memcached:
ps -aef | grep mem
Now point your browser to http://yourserverip:8091
If the browser just spins or gives you an error, you'll need to edit your firewall to allow TCP access to port 8091 from outside.
When the membase admin on your server shows up, you'll see a nice welcome screen.
On the next screen you'll enter the path to where you want data to be stored (or leave it as default), and memory size option for setting up a new cluster. We're going to do a cluster of one for this tutorial.
Next, you'll set how much memory you want for your first bucket. Oh yeah, I didn't discuss buckets. You can save things to separate buckets. Investigate.
I also set it to have 1 backup copy via replication.
Next to last detail: enable software notifications.
The very last step is setting a username and password (not shown here).
Once you've set your password, you're in! Soon you will see these graphs full of very hot lines.
The next step is to install php5-memcached:
apt-get install php5-memcached
This will also install memcached, as you can see from the output:
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following extra packages will be installed:
  libevent-1.4-2 libmemcached2 memcached
Suggested packages:
  libcache-memcached-perl libmemcached
The following NEW packages will be installed:
  libevent-1.4-2 libmemcached2 memcached php5-memcached
0 upgraded, 4 newly installed, 0 to remove and 39 not upgraded.
This additional memcached will conflict in a subtle way with the one membase installed. Not to worry. Just get rid of it:
apt-get remove memcached
You should have a new file that php5-memcached made for you: /etc/php5/conf.d/memcached.ini
with this as the content:
extension = memcached.so
Changing PHP to Handle Membase Sessions
Since membase is sitting right behind memcache, you'll still use memcache session handling. Edit your php.ini file (or add a custom ini file to conf.d directory) with this:
session.save_handler = memcached
session.save_path = "localhost:11211"
Now just restart apache and you're done.
If you fire up a phpinfo(), take a look at the session section.
You should see "memcached" as one of your registered save handlers, as well as your session.save_handler. You should see the save path as above also.
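If you'd rather not eyeball phpinfo(), a small script can check the same things. This is only a sketch: sessionConfigProblems() is a made-up helper name, and the expected values are simply the ones from the ini lines in this tutorial.

```php
<?php
// Hypothetical helper: returns a list of problems, or an empty array if the
// session settings match what this tutorial configured.
function sessionConfigProblems(bool $extensionLoaded, string $handler, string $savePath): array
{
    $problems = [];
    if (!$extensionLoaded) {
        $problems[] = 'memcached extension is not loaded';
    }
    if ($handler !== 'memcached') {
        $problems[] = "session.save_handler is '$handler', expected 'memcached'";
    }
    if ($savePath !== 'localhost:11211') {
        $problems[] = "session.save_path is '$savePath', expected 'localhost:11211'";
    }
    return $problems;
}

// Check the PHP that is actually running this script.
$problems = sessionConfigProblems(
    extension_loaded('memcached'),
    (string) ini_get('session.save_handler'),
    (string) ini_get('session.save_path')
);

echo $problems
    ? implode("\n", $problems) . "\n"
    : "Sessions are set up for membase.\n";
```

Load it through apache rather than the command line, since the two can read different ini files.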
Updating Your Existing Memcache Code
If you are already using memcache as a caching mechanism for other things, you'll need to make sure you're using the Memcached library. It's not greatly different from the Memcache library, so converting your code to use it should be trivial.
You'll need to change lines like this:
$memcache = new Memcache;
$memcache->connect('memcache_host', 11211);
to this:
$memcache = new Memcached;
$memcache->addServer('memcache_host', 11211);
In general, I don't like to deal with the Memcached library directly. I wrap it in a utility class so that if I need to add more membase/memcache servers in the future or change the way I create hash keys, I only have to do it in one place.
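A wrapper along those lines might look roughly like this. It is a sketch, not my production class: AppCache, its method names, the 'app:' prefix, and the md5 key scheme are all invented for illustration. The only real Memcached calls it relies on are addServers(), get(), and set().

```php
<?php
// Hypothetical wrapper class. The server list and the key-hashing scheme
// live here, so changing either one means editing a single file.
final class AppCache
{
    private $client;
    private $prefix;

    // $client is anything with get()/set() shaped like Memcached,
    // which also makes the wrapper easy to test with a fake.
    public function __construct($client, string $prefix = 'app:')
    {
        $this->client = $client;
        $this->prefix = $prefix;
    }

    // $servers is a list of [host, port] pairs, e.g. [['localhost', 11211]].
    public static function connect(array $servers, string $prefix = 'app:'): self
    {
        $client = new Memcached();
        $client->addServers($servers);
        return new self($client, $prefix);
    }

    // The one place that decides how cache keys are built (assumed scheme).
    public function key(string $name): string
    {
        return $this->prefix . md5($name);
    }

    public function get(string $name)
    {
        return $this->client->get($this->key($name));
    }

    public function set(string $name, $value, int $ttl = 0): bool
    {
        return $this->client->set($this->key($name), $value, $ttl);
    }
}

// Usage, assuming membase's memcached is listening on localhost:11211:
// $cache = AppCache::connect([['localhost', 11211]]);
// $cache->set('user:42', ['name' => 'Alice'], 300);
// $user = $cache->get('user:42');
```

Adding a second server later is then a one-line change to the array passed to connect().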
If needed, at any time you may restart membase by issuing:
service membase-server restart
Hope this tutorial helped you out.
Update (Nov. 4th, 2011): After further checking and integration, some frameworks require php5-memcache. Installing both php5-memcache and php5-memcached causes no weird side effects. They are just separate libraries for talking to memcached.
Update (Nov. 5th, 2011): You'll notice over time that the items in cache are increasing and rarely decreasing, and it's nothing to be afraid of. Explained here in a post by Mikkel Ovesen.
Update (Jan. 6th, 2011): Some colleagues have run into trouble with this approach, as PHP's memcached does not store session expiration correctly. That means that when your membase runs out of memory, it stops working instead of freeing up old sessions. I'm researching and will update when I find a solution. Please comment below if you can help.
Call to Action
Did you discover anything new or helpful you'd like to share about this tutorial? Let everyone know in a comment below!