nginx: reverse proxy panacea
A few weeks ago in Reverse Proxy Roundup I was evaluating reverse HTTP proxy solutions. At the time I had settled on Pound, but frankly it wasn't good enough for the load I expected it to handle. Additionally, spawning so many threads used far more RAM and CPU than it should've.
Not long after I wrote that post, a friend of mine pointed me at nginx. nginx is an extremely high performance HTTP server that offers reverse proxying, FastCGI, SSL, and even IMAP/POP3 proxying.
I have had precisely zero problems with nginx so far after about 5 weeks and hundreds of millions of proxied requests. Forget about lighttpd: nginx uses less memory, doesn't leak, and has higher performance (in my FreeBSD configuration, anyway).
The reason I didn't blog about it sooner is that the project is Russian and so was a large majority of its documentation at the time. With the 0.4 release, there's a much more complete draft of the English documentation available. Check it out; your high traffic servers will thank you for it.
I’ve never heard of nginx, but it certainly looks interesting.
Comment by Joseph Scott — 2006-09-13 @ 1:57 pm
Bob,
You mentioned Nginx on the TurboGears list a few weeks ago. Based on your recommendation, I converted all our servers from Pound + Lighty to Nginx and I’m really happy I did. I didn’t really have an issue with Pound (I’m not subjecting it to the high loads you are), but the improvement over Lighty is phenomenal. By that I mean the time I no longer waste fighting bugs and misfeatures.
For anyone who’s concerned, the lack of English documentation wasn’t even an issue for the most part. The configuration is so straightforward all you need is the examples from the Russian documentation. Contrast this with Lighty where volumes of English documentation still left me baffled (although I suspect the fact that the software often doesn’t do what the documentation claims it should is the culprit here).
Anyway, thanks again for the pointer.
Comment by Cliff Wells — 2006-09-13 @ 3:26 pm
It maybe worth note in fairness that Pound well known to handle loads far excess of those you throw.
Also, if fail tuned your system for multi-threaded app, problem is with you, not app.
Issue with thread utilization and memory allocation, I would not blame Pound all alone. I would look first at FreeBSD kernel, which is… well, just big pile of dung, with smelling old MACH code still used.
Pound service very large high traffic site, Flickr, for your informations.
Lighty suck but still better than elephantine apache :)
Comment by borat — 2006-11-17 @ 9:12 am
No. Pound sucks. It uses a thread per connection, which means lots of RAM and unnecessary context switches. It doesn’t matter if the OS is tuned to handle it or not, it’s still a pretty shitty design choice.
If Flickr uses it, then they’re wasting money. I’ve also yet to see any evidence that would lead me to believe that they handle more load than I do per server. They need an assload of servers just for storage anyway, they’ve probably got plenty of RAM and CPU to waste on Pound.
Comment by bob — 2006-11-17 @ 9:37 am
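For readers who have not written this kind of server, the alternative to a thread per connection is a single thread that multiplexes many sockets through an event loop, which is broadly the model nginx follows. Below is a minimal Python sketch of the idea, not nginx's actual implementation. The `make_listener` and `serve` names, the echo behaviour, and the 4096-byte read size are all assumptions made for illustration.

```python
import selectors
import socket


def make_listener(host="127.0.0.1", port=0):
    """Create a non-blocking listening socket (port 0 lets the OS pick one)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, port))
    sock.listen()
    sock.setblocking(False)
    return sock


def serve(listener, should_stop, poll_interval=0.05):
    """Echo bytes back to every client from ONE thread.

    No thread or stack is allocated per connection; each client costs
    one registered socket. `should_stop` is a zero-argument callable
    that is polled between rounds of events.
    """
    sel = selectors.DefaultSelector()
    sel.register(listener, selectors.EVENT_READ)
    try:
        while not should_stop():
            for key, _events in sel.select(timeout=poll_interval):
                if key.fileobj is listener:
                    # A new client is waiting: accept it and start watching it.
                    conn, _addr = listener.accept()
                    conn.setblocking(False)
                    sel.register(conn, selectors.EVENT_READ)
                    continue
                conn = key.fileobj
                try:
                    data = conn.recv(4096)
                except ConnectionError:
                    data = b""
                if data:
                    # Sketch only: a real server would buffer whatever
                    # send() could not write and retry on EVENT_WRITE.
                    conn.send(data)
                else:
                    # Empty read means the client closed the connection.
                    sel.unregister(conn)
                    conn.close()
    finally:
        # Close any client sockets still open; the caller owns the listener.
        for key in list(sel.get_map().values()):
            sel.unregister(key.fileobj)
            if key.fileobj is not listener:
                key.fileobj.close()
        sel.close()
```

A slow client here only keeps one idle socket registered with the selector, whereas in a thread-per-connection design it would pin a whole thread and its stack for as long as it stays connected.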
Quantify “lots”, please. Right now you’re just out here making ridiculous, unfounded assertions about a piece of software with a solid reputation. Back it up.
And just how much load, precisely, are you dealing with — throughput in/out, concurrent connects, etc?
Comment by borat — 2006-11-20 @ 9:12 am
It is solid, and I even used it for a while once Apache started to fall over. However, Pound falls over not too far after mod_proxy does. Thread per connection is simply the second worst model for writing a server (next to process per connection). This should be obvious to anyone who writes networking code; I don’t need numbers to prove it.
Even if nginx’s performance was equivalent to Pound, Pound still uses 20-30x more RAM. Now why the hell would you want to do that?
Comment by bob — 2006-11-20 @ 9:50 am
I’ve asked some simple questions, to help understand the context in which you are asserting that Pound is limited in its ability to scale. You have so far provided no answers.
It might be enlightening if you actually tested your assertions using benchmarks and posted the results.
Generalizations like “such and such is the worst way to do so-and-so” are, in my experience, often false and lead to bad things.
As for “I don’t need to prove it with numbers” — well yeah, actually you do, if you expect anyone to accept what you say as fact. It’s called computer _science_ and you’re making these assertions in public.
Comment by borat — 2006-11-22 @ 11:37 am
This isn’t a thesis paper, it’s a blog. It’s a record of my experiences and thoughts. I have tried Pound in a production scenario, and it eventually failed. It used hundreds of megs of RAM and a shitload of threads, and then it started falling over. I switched to nginx, and it uses less than ten megs of RAM, less CPU, and has been able to handle all of the real traffic I’ve thrown at it.
I have no vested interest one way or the other in nginx or Pound. I don’t care what you want to believe, I’m just trying to help people out by telling them what works well. I can’t cover exact numbers for business reasons and I have no reason whatsoever to do benchmarks outside of that.
And actually no. I don’t need to prove with numbers that a heavyweight thread or process per connection is a bad model for this kind of application. This is a well-known fact that has been proved many times over. Use the google.
Comment by bob — 2006-11-22 @ 12:02 pm
Bob posted a few numbers a while back (not a benchmark, but from the same server, assuming similar loads). I’ll quote him here since I think it gives at least a reasonable picture:
"""
I currently have Nginx doing reverse proxy of tens of millions of HTTP requests per day (that's a few hundred per second) on a single server. At peak load it uses about 15MB RAM and 10% CPU on my particular configuration (FreeBSD 6).
Under the same kind of load, Apache falls over (after using 1000 or so processes and god knows how much RAM), Pound falls over (too many threads, and using 400MB+ of RAM for all the thread stacks), and Lighty leaks more than 20MB per hour (and uses more CPU, but not significantly more).
"""
Clearly you don’t need a benchmark at this point. The difference is so staggering that all a benchmark would demonstrate is whether Nginx is 10x better or 20x better.
Also, the thread per connection model is certainly bad, especially for slow connections or large downloads (which keep a thread hanging around for a long time).
Comment by Cliff Wells — 2006-11-29 @ 8:53 am
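As a rough sanity check on the figures quoted above, the arithmetic can be sketched in a few lines of Python. The daily totals of 20 and 30 million are assumed stand-ins for "tens of millions", and the 1 MB per-thread stack is purely an assumption for illustration; the real stack size depends on the OS and on how Pound was built.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def requests_per_second(requests_per_day):
    """Average request rate implied by a daily total."""
    return requests_per_day / SECONDS_PER_DAY


def thread_stack_memory_mb(threads, stack_mb):
    """Memory reserved for stacks when every connection gets its own thread."""
    return threads * stack_mb


# Assumed daily totals standing in for "tens of millions".
for daily in (20_000_000, 30_000_000):
    rate = requests_per_second(daily)
    print(f"{daily:,} requests/day is about {rate:.0f} requests/s on average")

# Assumed 1 MB stack per thread: 400 concurrent connections would then
# account for the 400 MB mentioned in the quote.
print(thread_stack_memory_mb(threads=400, stack_mb=1), "MB of thread stacks")
```

Under those assumptions the averages come out at roughly 230 to 350 requests per second, which is consistent with "a few hundred per second", and peak rates would be higher than the daily average.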
Bob,
I’m interested in your setup, specifically your architecture downstream of nginx. Coming from the Java world, I have load balanced multiple Tomcat servers with Apache and mod_proxy on the front-end. Is this the same kind of deal but with Pylons instances? How do you handle session replication across the servers, and how about caching across servers? I have several production sites I would like to switch from struts/jsp to django/python, but these kinds of infrastructure issues have me scared.
Thanks,
Rick Lawson
Comment by Rick Lawson — 2007-01-05 @ 6:13 am
Yes, you can do the same thing with Pylons, Django, or anything else. You can get distributed sessions by sticking them in the DB, or you could set a cookie or something for machine affinity. Caching is a non-issue for me because the only content that can be cached are static files and those are coming from nginx, but I suppose you could use memcached or simply let each server have its own isolated cache.
Comment by bob — 2007-01-06 @ 11:56 am
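One way to picture the machine-affinity option is to derive the backend from a stable hash of the session cookie, so the same visitor keeps landing on the same application server. This is a minimal Python sketch under assumptions, not a description of the setup discussed above: the `pick_backend` helper and the backend addresses are hypothetical.

```python
import hashlib

# Hypothetical application servers sitting behind the proxy.
BACKENDS = ["app1.internal:8080", "app2.internal:8080", "app3.internal:8080"]


def pick_backend(session_id, backends):
    """Map a session id to one backend, the same one on every request."""
    # Python's built-in hash() is salted per process, so use a real digest
    # to get the same answer across restarts and across machines.
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]


# The session id would normally come from a cookie on the request.
print(pick_backend("session-1234", BACKENDS))
```

The trade-off with this simple modulo scheme is that adding or removing a backend remaps most sessions, which is one reason keeping sessions in the database can be the less surprising choice.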
Nginx looks interesting. Maybe a few too many features for my taste.
I usually use http://haproxy.1wt.eu/ for all my http proxying/loadbalancing needs. Fast, stable, _excellent_ logging.
Dirk
Comment by Dirk Vleugels — 2007-03-15 @ 3:13 am