We recently needed to make sure our front-end Apache httpd server, which acts as reverse proxy and SSL terminator, can handle the large number of WebSocket connections we plan to put through it. WebSockets are long-lived connections, so this is a different use of Apache httpd and we want to get it right. The proxied service can handle tens of thousands of concurrent connections, if not hundreds of thousands or more.
First, our testing tool is custom made: it opens all the WebSocket connections first and only then starts pinging. This matters because it exercises httpd's capacity for concurrent connections. The client system needs to be able to create enough sockets, and the first limit I hit was on my test client. The shell environment defaults to a limit of 1024 open files. It is a soft limit, so use ulimit -Sn to raise it, up to the hard limit. Even ab will fail with “socket: Too many open files (24)” if you use the -n 1050 and -c 1050 options.
$ ulimit -n
1024
$ ulimit -Hn
65536
$ ulimit -Sn 65536
$ ulimit -n
65536
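If the test is scripted, the same check can be done before the tool starts. This is only a sketch: ensure_nofile is a hypothetical helper name, and it assumes a POSIX-style shell whose ulimit builtin accepts -Sn and -Hn, as in the transcript above.

```shell
# ensure_nofile NEEDED
# Hypothetical helper: make sure the soft open-file limit is at least NEEDED.
# Returns non-zero if the hard limit is too low for the soft limit to be raised.
ensure_nofile() {
    need=$1
    hard=$(ulimit -Hn)
    soft=$(ulimit -Sn)
    if [ "$hard" != "unlimited" ] && [ "$hard" -lt "$need" ]; then
        echo "hard limit $hard is below $need" >&2
        return 1
    fi
    if [ "$soft" != "unlimited" ] && [ "$soft" -lt "$need" ]; then
        # Only the soft limit changes; the hard limit is left alone.
        ulimit -Sn "$need"
    fi
}

# Example: call this in the same shell that launches the test tool,
# leaving some headroom above the planned connection count.
# ensure_nofile 16000 || exit 1
```

The limit applies to the current shell and its children, so the call has to happen in the shell that starts the test client.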
Now your testing tool can create more than 1024 connections. The next limit I ran into was the number of connections on the httpd server. Even mpm_event uses a thread per request (do not let the event name fool you). The default Ubuntu apache2 mpm_event configuration allows for 150 concurrent connections:
StartServers 2
MinSpareThreads 25
MaxSpareThreads 75
ThreadLimit 64
ThreadsPerChild 25
MaxRequestWorkers 150
MaxConnectionsPerChild 0
A tool like ab won't stall at 150, because its requests complete quickly and free their threads. To hold connections open, use slowhttptest, which is in xenial/universe. Run apt install slowhttptest to install it. It is a flexible tool with a great man page and -h help output.
$ slowhttptest -c 1000 -H -g -o my_header_stats -i 10 -r 200 -t GET -u http://system.under.test.example.com/ -x 24 -p 3
slowhttptest version 1.6
– https://code.google.com/p/slowhttptest/ –
test type: SLOW HEADERS
number of connections: 1000
URL: http://system.under.test.example.com/
verb: GET
Content-Length header value: 4096
follow up data max size: 52
interval between follow up data: 10 seconds
connections per seconds: 200
probe connection timeout: 3 seconds
test duration: 240 seconds
using proxy: no proxy
Tue Sep 27 14:33:03 2016:
slow HTTP test status on 5th second:
initializing: 0
pending: 284
connected: 667
error: 0
closed: 0
service available: YES
This screen will update as connections are created until service available changes from YES to NO.
In my tests the closed: value was exactly 150. The my_header_stats.csv file shows when the maximum was reached.
Next, let's adjust Apache httpd to allow more concurrent connections. My target is 15,000 connections, so I'll scale the numbers up. In the default configuration, MaxRequestWorkers 150 with ThreadsPerChild 25 means at most 6 processes of 25 threads each, which gives 150 connections. 20 processes (ServerLimit) with 750 threads each (ThreadsPerChild) should give 15,000 connections.
Edit mpm_event.conf: ($ sudo vi /etc/apache2/mods-enabled/mpm_event.conf)
<IfModule mpm_event_module>
    StartServers 10
    MinSpareThreads 25
    MaxSpareThreads 750
    ThreadLimit 1000
    ThreadsPerChild 750
    # MaxRequestWorkers aka MaxClients => ServerLimit * ThreadsPerChild
    MaxRequestWorkers 15000
    MaxConnectionsPerChild 0
    ServerLimit 20
    ThreadStackSize 524288
</IfModule>
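The arithmetic behind those values is easy to get wrong when scaling, so here is a small sanity check. It is a sketch only: check_mpm is a hypothetical helper, not an Apache tool, and it encodes just two rules, that ThreadsPerChild must not exceed ThreadLimit and that MaxRequestWorkers must not exceed ServerLimit * ThreadsPerChild.

```shell
# check_mpm SERVER_LIMIT THREAD_LIMIT THREADS_PER_CHILD MAX_REQUEST_WORKERS
# Hypothetical helper: prints the connection ceiling and returns non-zero
# if the numbers contradict each other.
check_mpm() {
    server_limit=$1
    thread_limit=$2
    threads_per_child=$3
    max_workers=$4
    ceiling=$(( server_limit * threads_per_child ))
    echo "ceiling=$ceiling"
    if [ "$threads_per_child" -gt "$thread_limit" ]; then
        echo "ThreadsPerChild exceeds ThreadLimit" >&2
        return 1
    fi
    if [ "$max_workers" -gt "$ceiling" ]; then
        echo "MaxRequestWorkers exceeds ServerLimit * ThreadsPerChild" >&2
        return 1
    fi
}

# Values from the mpm_event.conf above.
check_mpm 20 1000 750 15000   # prints ceiling=15000
```

Running it with the default ThreadLimit of 64 in place of 1000 fails the first check, which is why the configuration raises ThreadLimit alongside ThreadsPerChild.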
Restart apache2 httpd (a full restart, not graceful; changes to ThreadLimit and ServerLimit are ignored on a graceful restart) and retry the slowhttptest. Notice that service available now stays YES.
Now turn up the slowhttptest numbers. Change the -c parameter to 15000 and -r to 1500; the connections should then take 10 seconds to ramp up. In my case I could not create that many connections so quickly: slowhttptest was maxing out a CPU core.
All of the Apache httpd configuration above used the mpm_event processing module. The next issue I ran into was mpm_worker not behaving as I expected. I have a doubly proxied system, because this is the real world, where we route HTTP all over the place, sometimes in ways we shouldn't, because we are lazy, or it is easier, or… anyway…
In ubuntu/trusty with Apache httpd 2.4.7, mpm_worker is limited to 64 ThreadsPerChild even if you configure a larger number. There is no warning. You'd never know unless you look at the number of threads (nlwp) in each worker process:
$ ps -uwww-data -opid,ppid,nlwp
The fix is to switch from mpm_worker to mpm_event.
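The expected ramp-up time is just the connection count divided by the connection rate. A sketch of that calculation, with ramp_seconds as a hypothetical helper name; it rounds up so a partial last second still counts.

```shell
# ramp_seconds CONNECTIONS RATE_PER_SECOND
# Hypothetical helper: ideal time for slowhttptest to open all connections,
# assuming the client can actually sustain the requested rate.
ramp_seconds() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

ramp_seconds 15000 1500   # -c 15000 -r 1500, prints 10
ramp_seconds 1000 200     # the earlier run, -c 1000 -r 200, prints 5

# The scaled-up run is the earlier command with only -c and -r changed:
# slowhttptest -c 15000 -H -g -o my_header_stats -i 10 -r 1500 -t GET -u http://system.under.test.example.com/ -x 24 -p 3
```

If the connected: count is still well below the target after that many seconds, the client is the bottleneck, not httpd.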
$ sudo a2dismod mpm_worker
$ sudo a2enmod mpm_event
$ sudo service apache2 restart
I thought that I’d need to do more, but this got me to where I needed to be.
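To confirm the switch took effect, the ps check from earlier can be summarized instead of read by eye. This is a sketch: summarize_threads is a hypothetical helper, and it assumes the httpd processes run as www-data, as in the ps command above.

```shell
# summarize_threads
# Hypothetical helper: reads "PID PPID NLWP" rows (no header) on stdin and
# prints the process count, total threads, and largest per-process count.
summarize_threads() {
    awk '{ n++; total += $3; if ($3 > max) max = $3 }
         END { printf "processes=%d threads=%d max_per_process=%d\n", n, total, max }'
}

# On the server; the empty "=" headers suppress the ps header line:
# ps -u www-data -o pid= -o ppid= -o nlwp= | summarize_threads
```

The per-process figure includes a few threads beyond the configured workers, so look for a value in the neighborhood of ThreadsPerChild, not an exact match. A maximum stuck in the 60s while ThreadsPerChild is set much higher is the symptom described above.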