Tuning NGINX for Better Performance

Written by: Ben Cane

September 20, 2016
While NGINX is much younger than other web servers, it has quickly become a popular choice. Part of its success comes from being the server of choice for those looking for a lightweight, performant web server.

In today’s article, we’ll be taking an out-of-the-box instance of NGINX and tuning it to get more out of an already high-performance web server. While not a complete tuning guide, this article should provide readers with a solid understanding of tuning fundamentals and a few common NGINX tuning parameters.

Before we get into tuning, however, let’s first install NGINX.

Installing NGINX

For this article, we will be running NGINX on an Ubuntu Linux-based server, so we can perform the installation with the apt-get command.

root@nginx-test:~# apt-get install nginx

This step performs a generic installation of NGINX, which already has some tuning parameters set out of the box. The default installation of NGINX, however, doesn’t offer much in the way of content to serve. In order to give ourselves a realistic web application to tune, let’s go ahead and deploy a sample site from GitHub.

root@nginx-test:~# git clone https://github.com/BlackrockDigital/startbootstrap-clean-blog.git /var/www/html
Cloning into '/var/www/html'...
remote: Counting objects: 308, done.
remote: Total 308 (delta 0), reused 0 (delta 0), pack-reused 308
Receiving objects: 100% (308/308), 1.98 MiB | 0 bytes/s, done.
Resolving deltas: 100% (119/119), done.
Checking connectivity... done.

When performance tuning, it’s important to understand the type of application that’s being tuned. In the case of NGINX, it’s important to know if you’re tuning for static content or dynamic content served by an upstream application. The difference between these two types of content can alter which tuning parameters to change, as well as the values for those parameters.

In this article, we’ll be tuning NGINX to serve static HTML content. While most of the parameters will apply to NGINX in general, not all of them will. It’s best to use this article as a guide for your own tuning and testing.

Now that our basic instance is installed and a sample site deployed, let’s see how well an out-of-the-box installation of NGINX performs.

Establishing a Baseline

One of the first steps in performance tuning anything is to establish a unit of measurement. For this article, we will be using the HTTP load testing tool ApacheBench, otherwise known as ab, to generate test traffic to our NGINX system.

This load-testing tool is very simple and useful for web applications. ApacheBench provides quite a few options for different types of load-testing scenarios; however, for this article, we’ll keep our testing pretty simple.

We will be executing the ab command with the -c (concurrency level) and -n (number of requests) parameters set.

$ ab -c 40 -n 50000 http://159.203.93.149/

When we execute ab, we’ll be setting the concurrency level (-c) to 40, meaning ab will keep up to 40 requests in flight at a time against our target NGINX instance. We will also be setting a limit on the number of requests to make with the -n parameter. Together, these two options cause ab to open 40 concurrent HTTP sessions and send requests as fast as it can until it reaches 50000 requests.

Let’s go ahead and execute a test run to establish a baseline and identify which metric we will use for our testing today.

# ab -c 40 -n 50000 http://159.203.93.149/
This is ApacheBench, Version 2.3 <$Revision: 1528965 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/
Benchmarking 159.203.93.149 (be patient)
Completed 5000 requests
Completed 10000 requests
Completed 15000 requests
Completed 20000 requests
Completed 25000 requests
Completed 30000 requests
Completed 35000 requests
Completed 40000 requests
Completed 45000 requests
Completed 50000 requests
Finished 50000 requests
Server Software:        nginx/1.10.0
Server Hostname:        159.203.93.149
Server Port:            80
Document Path:          /
Document Length:        8089 bytes
Concurrency Level:      40
Time taken for tests:   16.904 seconds
Complete requests:      50000
Failed requests:        0
Total transferred:      420250000 bytes
HTML transferred:       404450000 bytes
Requests per second:    2957.93 [#/sec] (mean)
Time per request:       13.523 [ms] (mean)
Time per request:       0.338 [ms] (mean, across all concurrent requests)
Transfer rate:          24278.70 [Kbytes/sec] received

In the above output, there are several interesting metrics. Today we will be focusing on the Requests per second metric. This metric shows the average number of requests our NGINX instance can serve in a second. As we adjust parameters, we should see this metric go up or down.

Requests per second:    2957.93 [#/sec] (mean)

From the above, we can see that the mean requests per second was 2957.93. This might seem like a lot, but we will increase this number by quite a bit as we continue.
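When the same test is repeated many times, it can help to pull just the number out of the ab report. The following is a minimal sketch that assumes the report format shown above; rps_from_ab is a hypothetical helper name, not something ab provides.

```shell
# Hypothetical helper (not part of ab): read an ab report on stdin and
# print only the mean requests-per-second figure.
rps_from_ab() {
    awk '/^Requests per second:/ { print $4 }'
}

# Example using the line from our baseline run:
printf 'Requests per second:    2957.93 [#/sec] (mean)\n' | rps_from_ab
```

Against a live server, the same helper could sit at the end of the pipeline: ab -c 40 -n 50000 http://159.203.93.149/ | rps_from_ab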

When performance tuning, it’s important to remember to make small incremental changes and compare the results with the baseline. For this article, 2957.93 requests per second is our baseline measurement. For a parameter change to count as successful, it must result in an increase in requests per second.
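Comparing each run with the baseline is simple arithmetic, and a small helper keeps it consistent from test to test. This is only a sketch: compare_to_baseline is a hypothetical name, and the 3000.00 figure in the example is an illustrative value, not a measurement from this article.

```shell
# Baseline from the first ab run in this article.
BASELINE=2957.93

# Hypothetical helper: print the change in requests per second relative
# to the baseline, as an absolute figure and as a percentage.
compare_to_baseline() {
    awk -v base="$BASELINE" -v new="$1" 'BEGIN {
        printf "%.2f req/s (%.1f%%)\n", new - base, (new - base) / base * 100
    }'
}

# Illustrative value only; substitute the figure from your own run.
compare_to_baseline 3000.00
```

A negative result means the change made throughput worse and should be backed out.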

With our baseline metrics set, let’s go ahead and start tuning NGINX.

Worker Processes

One of the most basic tuning parameters in NGINX is the number of worker processes available. In Ubuntu’s packaged configuration, this parameter is set to auto, which tells NGINX to create one worker process for each CPU available to the system.

For most systems, one worker process per CPU is an even balance of performance and reduced overhead. In this article, however, we are trying to get the most out of NGINX serving static content, which should carry fairly low CPU overhead. Let’s go ahead and see how many requests per second we can get by increasing this value.

For our first test, let’s go ahead and start two worker processes for each CPU on the system.

In order to figure out how many worker processes we need, we first need to know how many CPUs are available to this system. While there are many ways to do this, in this example we will use the lshw command to show hardware information.

root@nginx-test:~# lshw -short -class cpu
H/W path      Device  Class      Description
============================================
/0/401                processor  Intel(R) Xeon(R) CPU E5-2650L v3 @ 1.80GHz
/0/402                processor  Intel(R) Xeon(R) CPU E5-2650L v3 @ 1.80GH

From the output above, it appears our system is a 2 CPU system. This means for our first test, we will need to set NGINX to start a total of 4 worker processes.
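The same calculation can be scripted rather than read off the hardware listing. The sketch below assumes nproc (GNU coreutils) or getconf is available; cpu_count and workers_for are hypothetical helper names.

```shell
# Count the CPUs available to this process; nproc ships with GNU
# coreutils, and getconf serves as a fallback where nproc is missing.
cpu_count() {
    nproc 2>/dev/null || getconf _NPROCESSORS_ONLN
}

# Hypothetical helper: worker count for a given workers-per-CPU multiplier.
workers_for() {
    echo $(( $(cpu_count) * $1 ))
}

# Print the directive for two workers per CPU.
echo "worker_processes $(workers_for 2);"
```

On the 2 CPU system used here, this would print the value of 4 that we are about to set by hand.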

We can do this by editing the worker_processes parameter within the /etc/nginx/nginx.conf file. This is the default NGINX configuration file and the location for all of the parameters we will be adjusting today.

worker_processes auto;

The above shows that this parameter is currently set to auto. Let’s go ahead and change this to a value of 4.

worker_processes 4;

After setting the new value and saving the /etc/nginx/nginx.conf file, we will need to restart NGINX in order for the configuration change to take effect.

root@nginx-test:~# service nginx restart
root@nginx-test:~# ps -elf | grep nginx
1 S root     23465     1  0  80   0 - 31264 sigsus 20:16 ?        00:00:00 nginx: master process /usr/sbin/nginx -g daemon on; master_process on;
5 S www-data 23466 23465  0  80   0 - 31354 ep_pol 20:16 ?        00:00:00 nginx: worker process
5 S www-data 23467 23465  0  80   0 - 31354 ep_pol 20:16 ?        00:00:00 nginx: worker process
5 S www-data 23468 23465  0  80   0 - 31354 ep_pol 20:16 ?        00:00:00 nginx: worker process
5 S www-data 23469 23465  0  80   0 - 31354 ep_pol 20:16 ?        00:00:00 nginx: worker process
0 S root     23471 23289  0  80   0 -  3628 pipe_w 20:16 pts/0    00:00:00 grep --color=auto nginx
root@nginx-test:~#

We can see from the above that there are now 4 running processes with the name of nginx: worker process. This indicates that our change was successful.
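Counting the workers by eye gets tedious as the number grows. As a rough sketch, the count can be taken from a process listing instead; count_workers is a hypothetical helper name.

```shell
# Hypothetical helper: count NGINX worker processes in a process listing
# read from stdin. The [n] keeps the pattern from matching the grep
# command itself when the listing comes from a live ps call.
count_workers() {
    grep -c '[n]ginx: worker process'
}

# Example with a listing shaped like the ps output above:
printf '%s\n' \
    'nginx: master process /usr/sbin/nginx -g daemon on; master_process on;' \
    'nginx: worker process' \
    'nginx: worker process' \
    'nginx: worker process' \
    'nginx: worker process' | count_workers
```

On the server itself, the listing would come from ps, for example: ps -eo args | count_workers. Running nginx -t before a restart is also worthwhile, since it checks the configuration file for syntax errors without touching the running service.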

Checking the effect

With our additional workers started, let’s run ab again to see if there has been any change in throughput.

# ab -c 40 -n 50000 http://159.203.93.149/ | grep "per second"
Requests per second:    3051.40 [#/sec] (mean)

It seems that our change has had very little effect: our original Requests per second was 2957.93, and our new value is 3051.40. The difference here is roughly 100 more requests per second. While this is an improvement, this is not the level of improvement we were looking for.

Let’s go ahead and change the worker_processes value to 8, four times the number of CPUs available.

worker_processes 8;

In order for this change to take effect, we will once again need to restart the NGINX service.

root@nginx-test:~# service nginx restart

With the service restarted, we can go ahead and rerun our ab test.

# ab -c 40 -n 50000 http://159.203.93.149/ | grep "per second"
Requests per second:    5204.32 [#/sec] (mean)

It seems that 8 worker processes have a much more significant effect than 4. Compared to our baseline metrics, we can see that with 8 worker processes we are able to process roughly 2250 more requests per second.

Overall this seems like a significant improvement from our baseline. The question is how much more improvement would we see if we increased the number of worker processes further?

Remember, it’s best to make small incremental changes and measure performance increases each step of the way. For this parameter, I would simply increase its value in multiples of two and rerun a test each time. I would repeat this process until the requests per second value no longer increases. For this article however, we will go ahead and move on to the next parameter, leaving the worker_processes value set to 8.
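The stopping rule described above can be written down as a small filter. This is a sketch under stated assumptions: stop_point is a hypothetical helper, its input is one "value requests-per-second" pair per line in the order tested, the first sample line assumes auto resolved to 2 workers on our 2 CPU system, and the 16-worker figure is invented purely to show the rule firing.

```shell
# Hypothetical helper: read "<worker_processes> <requests_per_second>"
# pairs on stdin, in the order they were tested, and print the last
# setting that still improved on the one before it.
stop_point() {
    awk '
        NR > 1 && $2 + 0 <= prev + 0 { exit }   # throughput stopped rising
        { best = $1; prev = $2 }
        END { print best }
    '
}

# The first three figures come from this article; the 16-worker figure
# is made up for illustration.
printf '%s\n' '2 2957.93' '4 3051.40' '8 5204.32' '16 5100.00' | stop_point
```

With this sample input the filter settles on 8, because the invented 16-worker run is slower than the 8-worker run.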

Worker Connections

The next parameter we are going to tune is the worker_connections configuration within NGINX. This value defines the maximum number of TCP sessions per worker. By increasing this value, the hope is that we can increase the capacity of each worker process.

The worker_connections setting can be found within the events block in the /etc/nginx/nginx.conf configuration file.

events {
        worker_connections 768;
        # multi_accept on;
}

The default setting for Ubuntu’s installation of NGINX is 768. For this first test, we will try to change this setting to 1024 and measure the impact of that change.

events {
        worker_connections 1024;
        # multi_accept on;
}

Like the previous configuration change, in order for this adjustment to take effect we must restart the NGINX service.

root@nginx-test:~# service nginx restart

With NGINX restarted, we can run another test with the ab command.

# ab -c 40 -n 50000 http://159.203.93.149/ | grep "per second"
Requests per second:    6068.41 [#/sec] (mean)

Once again, our parameter change has resulted in a significant increase in performance. With just a small change in worker_connections, we were able to increase our throughput by roughly 860 requests per second.
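Because worker_connections is a per-worker limit, the ceiling for the whole server is that limit multiplied by the number of worker processes. The arithmetic can be sketched as follows; max_connections is a hypothetical helper name.

```shell
# Hypothetical helper: upper bound on simultaneous connections for a
# given number of worker processes and a per-worker connection limit.
max_connections() {
    echo $(( $1 * $2 ))
}

max_connections 8 768    # 8 workers at the Ubuntu packaged setting
max_connections 8 1024   # 8 workers at the value tested above
```

These calls print 6144 and 8192 respectively.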

Increasing worker connections further

If a small change in worker_connections can add more than 800 requests per second, what effect would a much larger change have? The only way to find this out is to make the parameter change and test again.

Let’s go ahead and change the worker_connections value to 4096.

worker_rlimit_nofile 4096;
events {
        worker_connections 4096;
        # multi_accept on;
}

We can see the worker_connections value is 4096, but there is also another parameter whose value is 4096. The worker_rlimit_nofile parameter is used to define the maximum number of open files per worker process. This parameter is now specified because, when adjusting the number of connections per worker, you must also adjust the open file limitations.

With NGINX, every open connection equates to at least one and sometimes two open files. By setting the maximum number of connections to 4096, we are essentially defining that every worker can open up to 4096 files. Without setting worker_rlimit_nofile to at least the same value as worker_connections, we may actually decrease performance, because each worker will try to open new files and would be rejected by the open file limitation of 1024.
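Since the two values need to move together, a quick consistency check can catch a mismatch before a restart. The following is a rough sketch, not a real configuration parser: check_nofile is a hypothetical helper that assumes each directive and its value sit on one line, and it does not follow include files.

```shell
# Hypothetical helper: read an NGINX configuration on stdin and compare
# worker_rlimit_nofile with worker_connections. Assumes each directive
# and its value appear on a single line.
check_nofile() {
    awk '
        $1 == "worker_connections"   { sub(/;/, "", $2); conns = $2 }
        $1 == "worker_rlimit_nofile" { sub(/;/, "", $2); nofile = $2 }
        END {
            if (nofile == "") {
                print "worker_rlimit_nofile not set; the OS limit applies"
            } else if (nofile + 0 < conns + 0) {
                print "worker_rlimit_nofile (" nofile ") is below worker_connections (" conns ")"
                exit 1
            } else {
                print "ok"
            }
        }
    '
}

# Example with the settings shown above:
check_nofile <<'EOF'
worker_rlimit_nofile 4096;
events {
        worker_connections 4096;
        # multi_accept on;
}
EOF
```

Against the real file, the call would be check_nofile < /etc/nginx/nginx.conf. The helper returns a non-zero status when the file limit is lower than the connection limit, so it can gate a restart in a script.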

With these settings applied, let’s go ahead and rerun our test to see how our changes affect NGINX.

# ab -c 40 -n 50000 http://159.203.93.149/ | grep "per second"
Requests per second:    6350.27 [#/sec] (mean)

From the results of the ab test run, it seems we were able to add about 300 requests per second. While this may not be as significant a change as our earlier gain of more than 800 requests per second, it is still an improvement in throughput. As such, we will leave this parameter as is and move on to our next item.

Tuning for Our Workload

When tuning NGINX or anything else for that matter, it’s important to keep in mind the workload of the service being tuned. In our case, NGINX is simply serving static HTML pages. There is a set of tuning parameters that are very useful when serving static HTML.

http {
        open_file_cache max=1024 inactive=10s;
        open_file_cache_valid 120s;

The open_file_cache parameters within the /etc/nginx/nginx.conf file define how many files NGINX can keep open and cached in memory, and for how long.

Essentially these parameters allow NGINX to open our HTML files during the first HTTP request and keep those files open and cached in memory. As subsequent HTTP requests are made, NGINX can use this cache rather than reopening our source files.

In the above, we are defining the open_file_cache parameter so that NGINX can cache a maximum of 1024 open files. Any cached entry that is not accessed within 10 seconds (inactive=10s) is removed from the cache. The open_file_cache_valid parameter defines how often NGINX checks that cached entries are still valid; in this case, every 120 seconds.

These parameters should significantly reduce the number of times that NGINX must open and close our static HTML files. This means less overall work per request, which should mean a higher throughput. Let’s test our theory with another run of the ab command.

# ab -c 40 -n 50000 http://159.203.93.149/ | grep "per second"
Requests per second:    6949.42 [#/sec] (mean)

With an increase of nearly 600 requests per second, the open_file_cache parameters have quite an effect. While these parameters might seem very useful, it is important to remember that they work in our example because we are simply serving static HTML. If we were testing an application that served dynamic content on every request, these parameters could result in rendering errors for end users.

Summary

At this point, we have taken an out-of-the-box NGINX instance, measured a baseline metric of 2957.93 requests per second, and tuned this instance to 6949.42 requests per second. As a result, we’ve gotten an increase of roughly 4000 requests per second. We did this by not only changing a few key parameters, but also experimenting with those parameters.

While this article only touched on a few key NGINX parameters, the methods used in this article to change and measure impact can be used with other common NGINX tuning parameters, such as enabling content caching and gzip compression. For more tuning parameters, check out the NGINX Admin Guide which has quite a bit of information about managing NGINX and configuring it for various workloads.
