Scaling Apache httpd as a ReverseProxy

We recently had the need to make sure our front end apache httpd reverse proxy and ssl termination server could handle the larger number of websocket connections we are going to use with it. Given websockets are longer lived connections, this is a different use of apache httpd and we want to get it right. The proxied service is capable of handling tens of thousands of concurrent connections, if not hundreds of thousands or more.

First, our testing tool is custom made, it makes all the websocket connections first and then proceeds to ping. This is important as it exercises the concurrent connections capabilities of httpd. When using it, the client system needs the ability to create enough sockets. The first limit I encountered was with my test client system. The shell environment defaults to 1024 open files limited. It is a soft limit, so use ulimit -S to adjust the limit. Even ab will show an error of “socket: Too many open files (24)” if you use -n 1050 and -c 1050 options.

$ ulimit -n
$ ulimit -Hn
$ ulimit -Sn 65536
$ ulimit -n

Now, your testing tool can create more than 1024 connections. The next limit I ran into was that of connections on the httpd server. Even mpm_event uses thread per request (do not let the event name fool you). The default ubuntu apache2 mpm_event configuration allows for 150 concurrent connections:

 StartServers 2
 MinSpareThreads 25
 MaxSpareThreads 75
 ThreadLimit 64
 ThreadsPerChild 25
 MaxRequestWorkers 150
 MaxConnectionsPerChild 0

A tool like ab won’t halt at 150. A tool named slowhttptest is in xenial/universe. Run apt install slowhttptest to install it. It is a flexible tool and has a great man page and -h help output.

$ slowhttptest -c 1000 -H -g -o my_header_stats -i 10 -r 200 -t GET -u -x 24 -p 3

slowhttptest version 1.6
– –
test type: SLOW HEADERS
number of connections: 1000
verb: GET
Content-Length header value: 4096
follow up data max size: 52
interval between follow up data: 10 seconds
connections per seconds: 200
probe connection timeout: 3 seconds
test duration: 240 seconds
using proxy: no proxy

Tue Sep 27 14:33:03 2016:
slow HTTP test status on 5th second:

initializing: 0
pending: 284
connected: 667
error: 0
closed: 0
service available: YES

This screen will update as connections are created until service available changes from YES to NO.

In my tests it closed: value was exactly 150. I can view the my_header_stats.csv file to see when max was reached.

Next, lets adjust Apache httpd to allow for more concurrent connections. My target is 15,000 connections, so I’ll increase numbers linearly 2 processes (StartServers) with 75 threads each (ThreadsPerChild) gave 150 connections. 20 processes with 750 threads each should give 15,000 connections.

Edit mpm_event.conf: ($ sudo vi /etc/apache2/mods-enabled/mpm_event.conf)

<IfModule mpm_event_module>
 StartServers 10
 MinSpareThreads 25
 MaxSpareThreads 750
 ThreadLimit 1000
 ThreadsPerChild 750
# MaxRequestWorkers aka MaxClients => ServerLimit *ThreadsPerChild
 MaxRequestWorkers 15000
 MaxConnectionsPerChild 0
 ServerLimit 20
 ThreadStackSize 524288

Restart (full restart, not graceful – ThreadsPerChild change requires this) apache2 httpd and retry the slowhttptest. Notice service available is always YES.

Now turn up the slowhttptest numbers. Change the -c parameter to 15000 and the -r to 1500. It should take 10sec to ramp up the connections. In my use case I could not create that many connections so quickly. slowhttptest was maxing out a CPU core.

All of the above apache httpd config was done using the mpm_event processing module. The next issue I ran into was a case of mpm_worker not behaving as I expected. I have a doubly proxied system, because this is super real world where we route http things all over the place, sometimes in ways we shouldn’t but because we are lazy, or it is easier or… anyway…

In ubuntu/trusty with apache httpd 2.4.7 mpm_worker has a limit of 64 ThreadsPerChild even if you configure it with a larger number. There is no warning. You’d never know unless you take a look at the number of processes in a worker: $ ps -uwww-data -opid,ppid,nlwp  The fix is to switch from mpm_worker to mpm_event.

$ sudo a2dismod mpm_worker
$ sudo a2enmod mpm_event
$ sudo service apache2 restart

I thought that I’d need to do more, but this got me to where I needed to be.