ALGORITHMS FOR SELECTING SERVER AND LOCATION BLOCKS IN NGINX

Lecture



Nginx is one of the most popular web servers in the world. It can handle high loads and a large number of simultaneous connections. Nginx can also be used as a load balancer, mail proxy, or reverse proxy.

This manual will tell you how Nginx handles client requests. Understanding this mechanism will help you optimize the processing of requests.

Nginx block configuration

Nginx logically divides the configuration for serving different content into blocks arranged in a hierarchy. For each client request, Nginx begins by determining which configuration blocks should handle it. This decision-making process is the focus of this manual.

The basic blocks that we discuss are called server and location.

The server block is a subset of the Nginx configuration that defines the virtual server used to process requests of a particular type. Administrators often set up multiple server blocks, where each block handles connections based on the requested domain, port, and IP address.

The location block lives inside a server block and defines how Nginx handles requests for different resources and URIs of the parent server. With location blocks, the administrator can subdivide the URI space as required, which makes this an extremely flexible model.
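A minimal sketch of how the two block types nest. The domain and the filesystem paths are placeholders chosen for illustration, and the fragment is assumed to sit inside the http context of the main configuration file:

```nginx
# One virtual server; Nginx picks it by address, port, and Host header.
server {
    listen 80;
    server_name example.com;        # placeholder domain

    # Location blocks divide this server's URI space.
    location / {
        root /var/www/example;      # hypothetical document root
    }

    location /images/ {
        root /var/www/media;        # hypothetical; serves /var/www/media/images/...
    }
}
```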

1: Server block lookup

Nginx allows you to define several server blocks that function as separate instances of virtual web servers. Therefore, Nginx needs a procedure for determining which of these blocks will be used to process the request.

For this, Nginx applies a specific system of checks that are used to find the best match. The main server block directives that help Nginx determine the required block are listen and server_name.

Listen directive

Nginx first looks at the IP address and port of the request. It compares these values with the listen directive of each server block and builds a list of blocks that could serve the request.

The listen directive usually specifies the IP address and port that the server block responds to. By default, any server block without a listen directive is given the parameters 0.0.0.0:80 (or 0.0.0.0:8000 if Nginx is started by a regular non-root user). This lets such blocks respond to requests on any interface on port 80. However, this default value carries little weight in the block selection process.

The listen directive can specify:

  • An IP address and port.
  • An IP address only (the default port 80 is then used).
  • A port only (Nginx then listens on every interface on that port).
  • The path to a Unix socket.

The last option generally matters only when passing requests between different servers.
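The four forms listed above might look like this in practice. This is only a sketch: the address reuses the example IP from this manual, and the port numbers and socket path are hypothetical.

```nginx
# Address and port given explicitly.
server {
    listen 192.168.1.10:8080;
}

# Address only: port 80 is implied.
server {
    listen 192.168.1.10;
}

# Port only: every IPv4 interface is implied.
server {
    listen 8888;
}

# Unix socket; the path is hypothetical.
server {
    listen unix:/var/run/nginx.sock;
}
```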

First, Nginx will try to select a block based on the listen directive using the following rules:

  • Nginx translates all "incomplete" listen directives by substituting default values for the missing parts, so that each block can be evaluated by its IP address and port. For example:
    • If a block has no listen directive, it is assigned the value 0.0.0.0:80.
    • If a block specifies only the IP address 111.111.111.111, it is assigned the default port: 111.111.111.111:80.
    • If a block specifies only port 8888, it is assigned the default IP address: 0.0.0.0:8888.
  • Nginx then builds a list of the server blocks that match the request most specifically, based on the IP address and port. This means that a block using the IP address 0.0.0.0 (which matches any interface) will not be selected if other blocks list the specific IP address of the request. In every case, the port must match exactly.
  • If only one block is the most specific match, Nginx uses it to serve the request. If several blocks match with the same level of specificity, Nginx chooses among them based on the server_name directive.

It is important to understand that Nginx evaluates the server_name directive only when it needs to choose among blocks that match the listen directive with equal specificity. For example, if example.com is hosted on port 80 of 192.168.1.10, a request for example.com will always be served by the first block in the example below, despite the server_name directive in the second block, because the first block's listen directive names the specific IP address.

server {
    listen 192.168.1.10;
    . . .
}
server {
    listen 80;
    server_name example.com;
    . . .
}

Only if several blocks match with the same level of specificity does Nginx go on to check the server_name directive.
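For contrast with the example above, here is a sketch of a case where the listen stage cannot decide. The second domain, example.org, is a placeholder added for illustration:

```nginx
# Both blocks name the same address and port, so they are equally
# specific and server_name breaks the tie.
server {
    listen 192.168.1.10:80;
    server_name example.org;    # placeholder domain
}

server {
    listen 192.168.1.10:80;
    server_name example.com;    # chosen when the Host header is example.com
}
```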

Server_name directive

To choose among blocks whose listen directives are equally specific, Nginx checks the Host header of the request. This value holds the domain or IP address that the client is trying to reach.

Nginx looks for the best match for this value among the server_name directives of the blocks that passed the previous selection stage. It evaluates them in the following order:

  • First, Nginx tries to find a server block whose server_name value exactly matches the value of the Host header. If such a block is found, it is used to serve the request. If several exact matches exist, the first one found is used.
  • If there is no exact match, Nginx looks for a block whose server_name begins with the wildcard character *. If one is found, it is used to serve the request. If several such blocks match, the longest (most specific) match is used.
  • If no leading-wildcard match is found, Nginx looks for a block whose server_name ends with the wildcard character *. If one is found, it is used to serve the request. If several such blocks match, the longest (most specific) match is used.
  • If no trailing-wildcard match is found, Nginx evaluates the blocks whose server_name is a regular expression (indicated by a ~ before the name). The first block whose regular expression matches the Host header is used to serve the request.
  • If no regular expression matches, Nginx selects the default server block for this IP address and port.

Each combination of IP address and port has a default server block, which is used when no other block can be selected. This is either the first block in the configuration for that combination or the block whose listen directive carries the default_server parameter (which overrides the first-found rule). Each combination of IP address and port can have only one default_server declaration.
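A short sketch of the default_server parameter. The domains are placeholders, and the commented outcome assumes a request whose Host header matches neither name:

```nginx
# By position alone, this block would be the default for port 80...
server {
    listen 80;
    server_name example.com;
}

# ...but default_server makes this block the fallback instead.
# A request with an unmatched Host header (say, unknown.test) lands here.
server {
    listen 80 default_server;
    server_name example.org;
}
```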

Examples

If there is a block in the configuration with the server_name directive, the value of which completely coincides with the Host request header, the request is passed to this block for processing.

For example, if the Host request header is host1.example.com, the web server will select the second server block to service it:

server {
    listen 80;
    server_name *.example.com;
    . . .
}
server {
    listen 80;
    server_name host1.example.com;
    . . .
}

If Nginx does not find an exact match, it looks for a block whose server_name starts with a wildcard. If several such blocks match, the longest match is used to serve the request. For example, if the request has the Host header www.example.org, Nginx selects the second block:

server {
    listen 80;
    server_name www.example.*;
    . . .
}
server {
    listen 80;
    server_name *.example.org;
    . . .
}
server {
    listen 80;
    server_name *.org;
    . . .
}

If no block matches by a leading wildcard, Nginx looks for a block whose server_name ends with a wildcard. If several such blocks match, it uses the longest match. For example, to process a request whose Host header is www.example.com, the web server uses the third server block:

server {
    listen 80;
    server_name host1.example.com;
    . . .
}
server {
    listen 80;
    server_name example.com;
    . . .
}
server {
    listen 80;
    server_name www.example.*;
    . . .
}

If no wildcard match is found, Nginx looks at the server_name directives that contain regular expressions. The first block whose regular expression matches the Host header is used to process the request.

For example, to serve a request whose Host header is www.example.com, the web server selects the second server block:

server {
    listen 80;
    server_name example.com;
    . . .
}
server {
    listen 80;
    server_name ~^(www|host1).*\.example\.com$;
    . . .
}
server {
    listen 80;
    server_name ~^(subdomain|set|www|host1).*\.example\.com$;
    . . .
}

If none of these methods yields a match, the web server uses the default server block for that IP address and port.

2: Location block lookup

Nginx uses a similar algorithm to find the location block.

Location block syntax

First, you should familiarize yourself with the syntax of the location block. Location blocks live inside server blocks (or other location blocks) and determine how to handle the request URI (the part of the request that follows the domain name or IP address and port).

As a rule, the location block has the following form:

location optional_modifier location_match {
    . . .
}

The location_match in the example above defines what Nginx checks the request URI against. The presence or absence of the modifier affects how Nginx attempts to match the location block.

The following location block modifiers are available:

  • (none): if no modifier is present, the location is interpreted as a prefix match. This means that the given location is compared with the beginning of the request URI.
  • =: the block is selected if the request URI exactly matches the given location.
  • ~: the location is interpreted as a case-sensitive regular expression.
  • ~*: the location is interpreted as a case-insensitive regular expression.
  • ^~: if this block is selected as the best non-regular-expression match, the web server does not go on to check regular expressions.

Location block syntax examples

As an example of prefix matching, the following location block can respond to request URIs such as /site, /site/page1/index.html, or /site/index.html:

location /site {
    . . .
}

Below is an example of exact URI matching. This block will always be used to serve the request URI /page1. It will not respond to the request URI /page1/index.html. Keep in mind that if this block is selected and the request is fulfilled using an index page, an internal redirect takes place to another location block, which becomes the actual handler of the request:

location = /page1 {
    . . .
}

In the following example, the location is interpreted as a case-sensitive regular expression. This block will be used to process requests for /tortoise.jpg, but not for /FLOWER.PNG:

location ~ \.(jpe?g|png|gif|ico)$ {
    . . .
}

In the following example, the location block is interpreted as a regular expression and is not case sensitive. Such a block will be able to process requests for /tortoise.jpg and /FLOWER.PNG.

location ~* \.(jpe?g|png|gif|ico)$ {
    . . .
}

The next block will disable regular expression search if it is selected as the best match without regular expressions. It can handle requests for /costumes/ninja.html:

location ^~ /costumes {
    . . .
}

As you can see, modifiers indicate how the location block should be interpreted. However, this does not define the algorithm that Nginx uses to decide which location block to send the request to.

Selecting a location block

Nginx selects the location block in the same way as it selects the server block. It starts a process that determines the best location block for a particular request. Understanding this process is essential for reliable and accurate Nginx configuration.

With the types of location declarations described above in mind, Nginx evaluates the possible location contexts by comparing the request URI with each location. It does this using the following algorithm:

  • Nginx begins by checking all prefix-based locations (every location type that does not involve a regular expression), comparing each one with the complete request URI.
  • First, Nginx looks for an exact match. If a location with the = modifier matches the request URI exactly, Nginx stops searching immediately and uses that block.
  • If no exact match is found, the web server moves on to non-exact prefixes. It finds the longest matching prefix location for the given URI, which is then evaluated as follows:
    • If the longest matching prefix location has the ^~ modifier, Nginx immediately stops searching and selects this location block to serve the request.
    • If the longest matching prefix location does not have the ^~ modifier, Nginx remembers it and continues the search.
  • Once the longest matching prefix location has been found and remembered, Nginx moves on to the regular expression locations (both case-sensitive and case-insensitive). If any regular expression locations are nested inside the longest matching prefix location, Nginx moves them to the top of its list of regular expression locations to check. Nginx then tries the regular expression locations in order. The first one that matches the request URI is selected to process the request.
  • If no regular expression location matches, Nginx uses the previously remembered prefix location.

It is important to understand that, by default, Nginx serves regular expression matches in preference to prefix matches. However, it evaluates prefix locations first, which lets the administrator override this tendency with the = and ^~ modifiers.

It is also important to note that, while prefix locations are generally selected by the longest (most specific) match, regular expression evaluation stops at the first matching location. This means that the order of regular expression locations within the configuration matters a great deal.
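The selection rules above can be traced on a small configuration. This is a sketch only: the paths and response texts are invented for illustration, and return is used simply so that each block reveals itself in the response body.

```nginx
server {
    listen 80;
    server_name example.com;    # placeholder domain

    location = /status          { return 200 "exact"; }
    location /                  { return 200 "root prefix"; }
    location /images/           { return 200 "images prefix"; }
    location ^~ /static/        { return 200 "static prefix"; }
    location ~* \.(png|jpe?g)$  { return 200 "image regex"; }

    # Request URI          Selected block and reason
    # /status              "exact": the = match ends the search at once
    # /status/             "root prefix": no exact match, no regex match
    # /images/notes.txt    "images prefix": longest prefix, no regex match
    # /images/logo.png     "image regex": regex beats the remembered prefix
    # /static/logo.png     "static prefix": ^~ skips the regex stage
}
```

Moving the regular expression location above or below the prefix locations would not change these outcomes, since only the order among regular expression locations matters.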

Evaluation of location blocks

Now we need to look at the cases in which a request moves from the selected location block to another location.

In general, once a location block has been selected to serve a request, the request is handled entirely within that context. Only the selected location block and the directives it inherits determine how the request is processed, and sibling location blocks cannot interfere.

This is a general rule that allows you to design location blocks in a predictable way. But it is important to understand that there are cases when certain directives start a new search for a location inside the selected location block. Exceptions to the rule can lead to unpredictable results.

Here are some of the directives that can cause this behavior:

  • index
  • try_files
  • rewrite
  • error_page

The index directive always leads to an internal redirect if it is used to handle the request. Exact location matches are often used to speed up selection, because an exact match ends the algorithm immediately. However, if an exact location match is a directory, there is a good chance that the request will be redirected to a different location for actual processing.

In this example, the first location matches the request URI /exact, but the index directive inherited by the block triggers an internal redirect to the second block, which then handles the request:

index index.html;
location = /exact {
    . . .
}
location / {
    . . .
}

If you need the request in the case above to be handled by the first block, you will have to find another way of satisfying the request to the directory. For example, you could set an invalid index for that block and turn on autoindex:

location = /exact {
    index nothing_will_match;
    autoindex on;
}
location / {
    . . .
}

This is one way to keep the request from leaving the first context, but it is probably not useful for most configurations. An exact match on a directory is mostly useful for operations such as rewriting the request (which also results in a new location search).

Another case in which a new location search can begin is the try_files directive. This directive tells Nginx to check for the existence of a named set of files or directories. The last parameter can be a URI to which Nginx makes an internal redirect.

Consider this configuration:

root /var/www/main;
location / {
    try_files $uri $uri.html $uri/ /fallback/index.html;
}
location /fallback {
    root /var/www/another;
}

If a request is made for /blahblah in the example above, the first location initially receives it. Nginx tries to find a file named blahblah in the /var/www/main directory. If it cannot find one, it looks for a file named blahblah.html. It then checks whether a directory named blahblah/ exists within /var/www/main. If all of these attempts fail, the request is internally redirected to /fallback/index.html. This triggers a new location search, and the request goes to the second block, which serves the file /var/www/another/fallback/index.html.

Another directive that can hand a request to a different location block is rewrite. When rewrite is used with the last flag, or with no flag at all, Nginx searches for a new matching location based on the result of the rewrite.

For example, if you change the last example and add a rewrite to it, you will see that the request is sometimes sent directly to the second location block, without relying on the try_files directive:

root /var/www/main;
location / {
    rewrite ^/rewriteme/(.*)$ /$1 last;
    try_files $uri $uri.html $uri/ /fallback/index.html;
}
location /fallback {
    root /var/www/another;
}

In the example above, a request for /rewriteme/hello is handled initially by the first location block. It is rewritten to /hello, and a new location search begins. In this case, the request matches the first location again and is processed by the try_files directive as usual (possibly ending with an internal redirect to /fallback/index.html if nothing is found).

However, if a request is made for /rewriteme/fallback/hello, the first block again matches first. The rewrite is applied again, this time producing /fallback/hello. The request is then served by the second location block.

A related situation arises with the return directive when it sends the 301 or 302 status codes. The difference is that the result is an entirely new request, in the form of an externally visible redirect. The same happens with the rewrite directive when the redirect or permanent flags are used.
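As a brief illustration of such external redirects, consider the sketch below. All of the paths are hypothetical, and the fragment is assumed to sit inside a server block:

```nginx
# Both locations answer with an external redirect. The client then sends
# a fresh request, which goes through block selection from the start.
location = /old-page {
    return 301 /new-page;
}

location /legacy/ {
    # The permanent flag makes rewrite respond with a 301 redirect.
    rewrite ^/legacy/(.*)$ /current/$1 permanent;
}
```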

The error_page directive can lead to an internal redirect much like try_files does. It defines what should happen when certain status codes are encountered. These actions will likely never run if the try_files directive is set, since that directive handles the entire life cycle of a request.

Consider this example:

root /var/www/main;
location / {
    error_page 404 /another/whoops.html;
}
location /another {
    root /var/www;
}

Every request (except those that begin with /another) is handled by the first block, which serves files from the /var/www/main directory. However, if a file is not found (status 404), an internal redirect to /another/whoops.html takes place, which triggers a new location search that ends at the second block. This block serves the file /var/www/another/whoops.html.

As you can see, understanding the conditions under which Nginx starts a new location search helps you predict the web server's behavior when it handles requests.

