Optimizing uwsgi for Many Many Threads and Processes

tl;dr : Consider optimizing uwsgi by setting `threads-stacksize = 64` or some small value in your uwsgi config. Python apps which do not use many C modules do not use the C stack very much. A smaller stack size mean threads use less memory and you can safely have more of them servicing requests.

Long story:

Years ago I was deploying a new flask web service using uwsgi. I needed it to scale to thousands of connections. I read a blog post (I searched and cannot find it now) which suggested 10 processes with 10 threads each to be able to serve 100 concurrent connections. After testing and tuning this particular app, we settled on 10 processes and 100 threads per process. It ran well.

Recently, a production app, which I helped deploy, fell on its face. It was performing very poorly, seemingly out of nowhere. This app was originally deployed with the same 10 processes, 100 threads per process configuration which I had used so successfully in the past. The ops team had already reduced the process count to 4 due to excessive memory use of the application. This means the application was only able to service 400 concurrent connections.

I still cannot entirely explain why the app ran for many months and then suddenly had problems. I’m guessing it is because of recent announcements driving more traffic to the site. The 400 threads were actually being used instead of sitting idle waiting for connections.

In the process of trying to restore service, our ops team wisely used a tool which I was not likely to have used (huge thanks to them). The tool is pmap and it shows mapped memory for a given process. I noticed something interesting in the output of pmap:

00007fcf75061000   8192K rw---   [ anon ]
00007fcf75861000      4K -----   [ anon ]
00007fcf75862000   8192K rw---   [ anon ]
00007fcf76062000      4K -----   [ anon ]

This was repeated with the same memory increment 50 times for a total of 100. It occurred to me that the default stack size of a thread in Linux is 8MB and that these memory maps were the stack of each thread. I was able to confirm this suspicion by running the app myself and adjusting the size by configuring uwsgi with –threads-stacksize.

I started by moving to 1MB which I know is the default Windows thread stack size, guessing it would still be plenty. Then I started to play limbo and see how low can I go. I started to get pretty happy when I broke the 256KB mark and our app was still functioning. Our app has the luxury of not having any deep calls. I might have been able to go lower, but once I got to 64KB, I didn’t see my point. Every order of magnitude decrease was smaller and smaller an improvement.

Moving from 8MB to 1MB took memory usage from 3.2GB to 400MB. Every halving of stack size halved overall memory usage of the thread stacks by this app. First 512KB/thread for 200MB, then 256KB/thread for 100MB, then 128KB/thread for 50MB, then 64KB/thread for 25MB. At this point, everything about the app was running exactly the same, the only difference being that I wasn’t wasting 3.2GB of memory in unused thread stacks.


Book Review: BeagleBone Robotics Projects

I was asked by Packt Publishing if I would read and review this book. I’ve owned a BeagleBoneBlack for a little while now. My use case was not robotics. This book might shed some new light on my old Black, so I agreed to review it.

The book starts off very accessible. Chapter 1 covers just about everything I did with my BBB when I first received it, hooking it up like a PC, replacing the default distro, making sure I could SSH to it were all in there. The author, Richard Grimmett, goes a step further and installs XFCE gui and vncserver and walks through connecting from a Windows PC using vncclient. All in all, chapter 1 is a great super basic tour.

Chapter 2 dives into programming on the thing and introduces Python. It does it in a really weird (to me) way. It has the reader running emacs in a putty window remote connected to the device. This must just feel weird to me because I do a lot of remote programming and its never with emacs (I’m a vim guy) and its rarely remote. For a new user, it seems to me like it would have been simpler and more friendly to say “use an editor of your choice” and “here is notepad2 or sublime” along with “here is how you copy files to and from the device.” I think this is mostly my background causing me to see things differently. The emacs in putty walk-through is very adequate.

Its not a programming book, so this is really a nit pick, but technically some of the descriptions of python aren’t really true. For example, if __name__==”__main__”: does not “tell the program to begin its execution at this point.” Again I’m nit picking, but I do feel like a different phrase that isn’t so very false to someone who knows python could have been found. Still, its not a programming book. The beginning of the chapter does list many resources for learning python.

Ugh, and then the book moves on to C++ and has quotes like this, “C++ is the original language of Linux” I’ve used Linux for almost as long as I’ve programmed C, and I am very (perhaps overly?) sensitive to the difference between C and C++.

OMG what do you mean Speech Input and Output? Really?  Chapter 3 tackles it. Really. For real. Speech Input and Output on that tiny little board. I can make my own Siri! This is a really cool topic; espeek is something I’ve only played with a little bit prior to reading this. It looks fun.

Speech recognition is done with software I’ve never used before called PocketSphinx. It isn’t packaged and so one has to compile it. Pretty sweet BBB being able to compile stuff like that. (I’m thinking of iOS and Android where I’ve not seen a compiler run on device.) The demo walks through limiting the grammar of speech input so that you don’t have to train the recognizer.

I’m a programmer, so I’m going to nitpick programmer things. I really wish authors wouldn’t do this, “I like to make a copy of the current file into continuous.c.old, so I can always get back to the starting program if it is required.” I really do wish authors would just say “go read about version control systems.”

Whew, speech is fun. Next step is video. Hook up a webcam and let’s do some image recognition. The book walks through OpenCV and it is as this point that we are forced to do a bunch of Linux sysadmin stuff to make our SD have enough free space to have a dev environment. This really could have gone anywhere in the book. I kind of like that it put it off until it was necessary.

The python image tracking example using OpenCV looks pretty cool. It is a complete example without going too deep or going off in the weeds.

Making the Unit Mobile introduced me to mobile platforms. The Magician Chassis that the book shows first, I found online for under $20! I knew that this stuff was accessible, but this is downright cheap. I feel almost guilty NOT getting one and trying it out.

The motor controller tutorial looks very straightforward. I already have ideas for code changes. Immediately after the simple time based tutorial it goes into speech controlled movement, which is pretty sweet.

After the wheeled robot tutorial is a walking robot example. The author makes a compelling argument for this type of robot, and the Pulse Width Modulation servo motors are cool, but I have to admit, this type of robot just doesn’t excite me. The book also punts on the PWM, using a controller which interprets serial USB commands into the PWM for the servos. For beginners, this is certainly the right choice.

Incidentally, the –help output from UscCmd includes Version, Culture, PublicKeyToken values like a Mono program might. I wonder if it is written in C# and running via Mono. I’m going to assume it is. That is pretty sweet. Indeed the linked download page mentions C#. http://www.pololu.com/docs/0J40/3.b

The sonar sensors section is a straightforward and great introduction to the use of them. I never knew how those things worked or what kind of value they returned. Now I do. Mounting the sensor to a survo makes for a nice subsystem on the bot.

Next, a fully remote control system is built. I don’t know if I like the choice of using an LCD monitor. It seems like overkill, but depending on the particular robotic application it would be a good choice. For the applications I have in mind, I think I’ll skip it. A wireless usb keyboard and mouse makes for an obvious choice. At this point, I just keep thinking about bluetooth and using an extra Wiimote, mostly because I think it would be a more fun control.

Oh, a GPS receiver! This could be necessary for when I lose my robot in a parking lot or the woods. As with the LCD Monitor and KB chapter, I kind of feel like I know how to do this since I’ve looked into it before. It is great coverage and good intro to the topic.

Much of my day job is what would traditionally be called Systems Programming so Chapter 10 is kind of a duh to me. I’d have started there, but that is just how I think about coding these days. Its great to have this in a chapter to tie some things together. In other words, read this chapter!

Using the BBB in sea, air and submarine applications is an interesting idea. I don’t think it is for me yet, but the book gives introduction to some ideas on the topic. The introduction to feedback control is very welcome.

Overall this is a great book. It really gave me a lot of ideas. It also showed me how easy it is to get started, something which I’d been a little hesitant to do. I’m actually a little excited to dive in now. I’ll be doing a bunch of this stuff with my 6yo over the next few years.