Five application scenarios for understanding Nginx thoroughly

Devops
by Admin/ on 03 Jan 2022


This article walks through five application scenarios for Nginx: HTTP server, static server, reverse proxy server, load balancer, and separation of static and dynamic content.

I. HTTP server

Nginx can act as a static resource server on its own. If a website consists only of static pages, it can be deployed with Nginx alone.

1. In the document root directory (/usr/local/var/www), create an html directory and put a test.html file in it.


2. Configure the server in nginx.conf:

user mengday staff;

# nginx requires an events block, even if it is empty
events {}

http {
    server {
        listen 80;
        server_name localhost;
        client_max_body_size 1024M;

        # default location
        location / {
            root /usr/local/var/www/html;
            index index.html index.htm;
        }
    }
}

3. Access test

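With this configuration, http://localhost/ resolves to /usr/local/var/www/html/index.html (root plus the index file), and http://localhost/test.html resolves to /usr/local/var/www/html/test.html. The index.html that comes with the Nginx installation sits in the document root itself (/usr/local/var/www), so the home page loads only if an index.html also exists under html.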

Note: If you get 403 Forbidden errors when accessing images, the user directive on the first line of nginx.conf may be wrong. By default it is commented out (#user nobody;). On Linux, change it to user root; and on macOS, change it to user username group; (for example, user mengday staff;). Then reload the configuration file or restart Nginx and try again. The user name can be checked with the whoami command.

4. Directive overview

  • server : defines a virtual server; an http block can contain multiple server blocks.
  • listen : specifies the IP address and port on which the server listens for requests. If the address is omitted, the server listens on all addresses; if the port is omitted, the standard port is used.
  • server_name : the name of the server, used to configure the domain name.
  • location : the configuration for a mapped URI path. A server can have multiple locations. location is followed by a URI, which can be a regular expression; / matches any path. When the path a client requests matches this URI, the directives inside the location block are applied.
  • root : the root path. When you visit http://localhost/test.html, "/test.html" matches the "/" URI and root is /usr/local/var/www/html, so the physical path of the requested resource = root + uri = /usr/local/var/www/html + /test.html = /usr/local/var/www/html/test.html.
  • index : sets the home page. When the request does not name a specific file (for example, only the server_name is requested, or the path ends with /), Nginx returns the file named by index from the matching directory. For example, visiting http://localhost/html/ returns index.html from that directory by default.
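
The root + uri rule described above can be sketched in a few lines of Python. This only illustrates the path arithmetic and is not how Nginx is implemented; the function name resolve_path and its simplified handling of directory requests are assumptions made for the example.

```python
def resolve_path(root: str, uri: str, index: str = "index.html") -> str:
    """Map a request URI to a file path the way root + index do (simplified)."""
    path = root.rstrip("/") + uri      # physical path = root + uri
    if uri.endswith("/"):              # directory request: fall back to the index file
        path += index
    return path


print(resolve_path("/usr/local/var/www/html", "/test.html"))
print(resolve_path("/usr/local/var/www/html", "/"))
```

The first call shows the test.html case from the root bullet; the second shows how a request without a file name picks up the index file.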

5. Regular expressions in location URIs

  • . : matches any character other than a newline
  • ? : repeats 0 or 1 times
  • + : repeats 1 or more times
  • * : repeats 0 or more times
  • \d : matches a digit
  • ^ : matches the beginning of the string
  • $ : matches the end of the string
  • {n} : repeats exactly n times
  • {n,} : repeats n or more times
  • [c] : matches the single character c
  • [a-z] : matches any one of the lowercase letters a-z
  • (a|b|c) : the vertical bar means alternation; the alternatives are separated by vertical bars and usually enclosed in parentheses, so this matches the character a, b, or c
  • \ : backslash, used to escape special characters

Content matched inside parentheses () can be referenced later as $1, and $2 refers to the content matched by the second (). The backslash \ used to escape special characters inside a regular expression is an easy source of confusion.
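
The tokens above can be tried out with Python's re module, whose syntax for these basic constructs is close to the PCRE syntax Nginx uses. This is a minimal sketch: the /users/... URI is a hypothetical example, and Python writes captures as group(1) and group(2) where Nginx writes $1 and $2.

```python
import re

# Same pattern style as a `location ~* \.(gif|jpg|jpeg)$` block;
# re.IGNORECASE plays the role of the ~* modifier.
image = re.compile(r"\.(gif|jpg|jpeg)$", re.IGNORECASE)
is_image = bool(image.search("/img/Photo.JPG"))
is_not_image = bool(image.search("/img/photo.jpg.txt"))
print(is_image, is_not_image)

# Two capture groups: group(1) corresponds to $1, group(2) to $2.
match = re.search(r"^/users/(\d+)/(\w+)$", "/users/42/profile")
print(match.group(1), match.group(2))
```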


II. Static Server

Companies often run a dedicated static server. It usually provides an upload function, and other applications fetch the static resources they need from it.

  1. Create images and img directories under /usr/local/var/www, put a test.jpg in each directory, and configure nginx.conf as follows:
http {
    server {
        listen 80;
        server_name localhost;

        set $doc_root /usr/local/var/www;

        # default location
        location / {
            root /usr/local/var/www/html;
            index index.html index.htm;
        }

        # path-based mapping: /images/... is served from $doc_root/images/...
        location ^~ /images/ {
            root $doc_root;
        }

        # suffix-based mapping: static file types are served from $doc_root/img
        location ~* \.(gif|jpg|jpeg|png|bmp|ico|swf|css|js)$ {
            root $doc_root/img;
        }
    }
}

Custom variables are defined with the set directive, using the syntax set $variable_name value; and are referenced as $variable_name. Here we define a custom $doc_root variable.

There are two common ways to map locations on a static server:

  • By path, such as /images/. Images are generally kept in a dedicated image directory.
  • By suffix, such as .jpg or .png, matched with a pattern.

Visiting http://localhost/test.jpg maps to $doc_root/img/test.jpg.

Visiting http://localhost/images/test.jpg matches both locations. When a path matches more than one location, the location with the higher priority is used. Because ^~ has a higher priority than ~ and ~*, the request goes to the /images/ location.

The common location matching forms are:

  • = : exact match of the literal string.
  • ^~ : prefix match. If the match succeeds, regular expression locations are not checked.
  • ~ : regular expression match, case-sensitive.
  • ~* : regular expression match, case-insensitive.
  • /xxx/ : plain string prefix match.
  • / : generic match; any request matches it.

Location priority

When a path matches multiple locations, a priority order decides which one is used. The order depends mainly on the type of the location expression rather than on its position in the configuration file; the exception is regular expression locations, which are tried in the order they appear. Among prefix locations, the longest matching string wins.

The types are described below in order of priority.

  • The equal sign type (=) has the highest priority. Once it matches, the search stops and no other location is checked.
  • The ^~ type is a prefix string, not a regular expression. If it is the longest matching prefix, the search stops and regular expressions are not checked.
  • The regular expression type (~ ~*) has the next highest priority. If more than one regular expression matches, the first one in the configuration file is used.
  • Plain string type. Matched by prefix; the longest matching prefix is used when no regular expression matches.
  • / generic match: used when nothing more specific matches.

Whether the search continues after a match depends on the location type:

  • = type and ^~ type: once matched, the search stops and no other location is checked.
  • Plain string type (/xxx/): the longest matching prefix is remembered and the search continues with the regular expressions; the prefix is used only if none of them matches.
  • Regular expression type (~ ~*): checked in configuration order, and the search stops at the first one that matches.

Location priority from highest to lowest:

(location =) > (location ^~ path) > (location ~,~* in configuration order) > (location longest plain prefix) > (/)

location = / {
    # Exact match for / only; nothing may follow the host name except /
    [ configuration A ]
}
location / {
    # Matches all requests, since every request URI starts with /.
    # A longer matching prefix location is preferred over this one,
    # and a matching regular expression location takes precedence.
    [ configuration B ]
}
location /documents/ {
    # Matches all requests that start with /documents/; regex locations are still checked afterwards.
    # A longer matching prefix location is preferred over this one,
    # and a matching regular expression location takes precedence.
    [ configuration C ]
}
location ^~ /images/ {
    # Matches all requests starting with /images/ and stops the search on success,
    # so a matching regular expression location is not used.
    [ configuration D ]
}

location ~* \.(gif|jpg|jpeg)$ {
    # Matches all requests ending with gif, jpg, or jpeg.
    # Requests starting with /images/ use configuration D instead, which has a higher priority.
    [ configuration E ]
}

location /images/ {
    # Without ^~, a request matching /images/ would go on to check the regex locations.
    # Shown only as an alternative to D: nginx rejects the same prefix declared
    # both with and without ^~ as a duplicate location.
    [ configuration F ]
}

location = /test.htm {
    root /usr/local/var/www/htm;
    index index.htm;
}

Note: Apart from regular expression locations, which are tried in the order they appear, the priority of a location does not depend on its position in the configuration file.
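
The selection procedure can be modelled in Python to make the order of the steps concrete. This is a simplified sketch and not Nginx's actual implementation: the tuple representation of a location and the function name select_location are assumptions for illustration, and configuration F is left out because it duplicates the prefix of D.

```python
import re
from typing import List, Optional, Tuple

# (modifier, pattern, name); modifier is one of "=", "^~", "~", "~*", or "" (plain prefix).
Location = Tuple[str, str, str]


def select_location(locations: List[Location], uri: str) -> Optional[str]:
    # 1. An exact (=) match wins immediately.
    for mod, pattern, name in locations:
        if mod == "=" and uri == pattern:
            return name

    # 2. Remember the longest matching prefix among ^~ and plain locations.
    best = None
    for mod, pattern, name in locations:
        if mod in ("^~", "") and uri.startswith(pattern):
            if best is None or len(pattern) > len(best[1]):
                best = (mod, pattern, name)

    # 3. If that longest prefix is marked ^~, regular expressions are skipped.
    if best is not None and best[0] == "^~":
        return best[2]

    # 4. Otherwise regex locations are tried in configuration order; the first match wins.
    for mod, pattern, name in locations:
        if mod in ("~", "~*"):
            flags = re.IGNORECASE if mod == "~*" else 0
            if re.search(pattern, uri, flags):
                return name

    # 5. No regex matched: fall back to the remembered prefix, if any.
    return best[2] if best is not None else None


locations = [
    ("=", "/", "A"),
    ("", "/", "B"),
    ("", "/documents/", "C"),
    ("^~", "/images/", "D"),
    ("~*", r"\.(gif|jpg|jpeg)$", "E"),
]
for uri in ["/", "/index.html", "/documents/a.html", "/images/1.gif", "/documents/1.jpg"]:
    print(uri, "->", select_location(locations, uri))
```

Note how /documents/1.jpg ends up in E even though C is a longer prefix than /: a plain prefix only wins when no regular expression matches.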


III. Reverse Proxy Server

Reverse proxying is probably the most used feature of Nginx. A reverse proxy accepts connection requests from the Internet, forwards them to a server on the internal network, and returns the server's response to the client that made the request. To the outside world, the proxy server appears to be the server itself.

Simply put, the real server cannot be reached directly from the external network, so a proxy server is needed. The proxy server is reachable from the external network and sits in the same network environment as the real server; the two may even be the same machine listening on different ports.

Reverse proxying is implemented with the proxy_pass directive.

Start a Java web project on port 8081:

server {
    listen 80;
    server_name localhost;

    location / {
        proxy_pass http://localhost:8081;
        proxy_set_header Host $host:$server_port;
        # Pass the client IP address to the back end
        proxy_set_header X-Forwarded-For $remote_addr;
        # Try the next server when the request fails with one of these conditions
        proxy_next_upstream error timeout invalid_header http_500 http_502 http_503;
    }
}

When we access localhost, it’s the same as accessing localhost:8081
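
To make the effect of the two proxy_set_header lines concrete, here is a small Python sketch of the headers the back end would see. It is an illustration only: the function name upstream_headers and the sample client address 203.0.113.7 are assumptions, and other headers that Nginx adjusts by default are ignored.

```python
def upstream_headers(client_headers: dict, host: str, server_port: int, remote_addr: str) -> dict:
    """Headers sent to the back end after the two proxy_set_header lines (simplified)."""
    headers = dict(client_headers)               # start from the client's headers
    headers["Host"] = f"{host}:{server_port}"    # proxy_set_header Host $host:$server_port;
    headers["X-Forwarded-For"] = remote_addr     # proxy_set_header X-Forwarded-For $remote_addr;
    return headers


sent = upstream_headers({"Host": "localhost", "Accept": "*/*"}, "localhost", 80, "203.0.113.7")
print(sent)
```

The back end can read X-Forwarded-For to recover the client address, which it would otherwise see as the address of the proxy.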


IV. Load Balancer

Load balancing is another commonly used feature of Nginx. Load balancing means spreading work across multiple operating units, such as web servers, FTP servers, enterprise critical application servers, and other mission-critical servers, so that they complete the work together.

Simply put, when there are two or more servers, requests are distributed among them according to a rule. Load balancing is generally configured together with a reverse proxy, which forwards requests to the load-balanced group. This article covers three built-in load balancing policies and two common third-party policies.

Load balancing is configured with the upstream directive.

1. RR (round robin, the default)

Requests are assigned to the back-end servers one by one in order of arrival: the first request goes to the first server, the second request to the second server, and with only two servers the third request goes back to the first, and so on. The servers therefore receive requests in a 1:1 ratio. If a back-end server goes down, it is automatically taken out of rotation. Round robin is the default policy and needs no extra configuration.

Start the same project on ports 8081 and 8082 respectively

upstream web_servers {
    server localhost:8081;
    server localhost:8082;
}

server {
    listen 80;
    server_name localhost;
    #access_log logs/host.access.log main;

    location / {
        proxy_pass http://web_servers;
        # Header Host must be specified
        proxy_set_header Host $host:$server_port;
    }
}

Requests to http://localhost/api/user/login?username=zhangsan&password=111111 still get a response, and successive requests are served by the two back ends in turn.
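
The rotation can be sketched in Python as follows. This is a simplified model rather than Nginx's internal algorithm; the function name pick_round_robin and the explicit set of failed servers are assumptions for illustration.

```python
from typing import List, Set


def pick_round_robin(servers: List[str], down: Set[str], request_no: int) -> str:
    """Return the server for the request_no-th request, skipping servers marked down."""
    alive = [s for s in servers if s not in down]
    if not alive:
        raise RuntimeError("no back-end server available")
    return alive[request_no % len(alive)]


servers = ["localhost:8081", "localhost:8082"]
picks = [pick_round_robin(servers, set(), n) for n in range(4)]
print(picks)

# With 8082 marked as down, every request goes to 8081.
after_failure = [pick_round_robin(servers, {"localhost:8082"}, n) for n in range(3)]
print(after_failure)
```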

2. weight

Specifies the polling ratio. The weight is proportional to the share of requests: each server receives requests in proportion to its configured weight. This is used when the back-end servers have uneven performance, so that weaker servers receive fewer requests and stronger servers handle more.

upstream test {
    server localhost:8081 weight=1;
    server localhost:8082 weight=3;
    # backup: used only when the servers above are unavailable
    server localhost:8083 weight=4 backup;
}

In this example, one of every four requests goes to 8081 and the other three go to 8082. backup marks a hot standby: requests go to 8083 only when both 8081 and 8082 are down.
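
One way to get this 1:3 split while still interleaving the servers is a "smooth" weighted round robin. The Python sketch below shows that scheme under simplifying assumptions: it ignores the backup server and failure handling, the function name smooth_weighted_picks is hypothetical, and it should be read as an illustration of the idea rather than as Nginx's exact code.

```python
from typing import Dict, List


def smooth_weighted_picks(weights: Dict[str, int], count: int) -> List[str]:
    """Pick `count` servers so that each is chosen in proportion to its weight."""
    current = {server: 0 for server in weights}
    total = sum(weights.values())
    picks = []
    for _ in range(count):
        for server, weight in weights.items():
            current[server] += weight            # every server gains its weight
        best = max(current, key=current.get)     # highest running score wins (first on ties)
        current[best] -= total                   # the winner pays back the total weight
        picks.append(best)
    return picks


picks = smooth_weighted_picks({"localhost:8081": 1, "localhost:8082": 3}, 4)
print(picks)
```

Over every four requests, 8081 is chosen once and 8082 three times, and the single 8081 pick falls inside the cycle instead of being pushed to the end.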

3. ip_hash

The two policies above share a problem: the next request from the same client may be sent to a different server. When the application is not stateless (for example, it stores data in the session), this causes trouble: if login information is saved in the session, the user has to log in again after landing on another server. So we often need each client to reach only one server, and ip_hash provides this. With ip_hash, every request is assigned according to the hash of the client IP address, so each visitor consistently reaches the same back-end server, which solves the session problem.

upstream test {
    ip_hash;
    server localhost:8080;
    server localhost:8081;
}
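
The idea behind ip_hash can be sketched in Python. Nginx uses the first three octets of a client IPv4 address as the hashing key; the use of zlib.crc32 below is an assumption made for the example and not the hash function Nginx uses, and pick_by_ip is a hypothetical name.

```python
import zlib
from typing import List


def pick_by_ip(client_ip: str, servers: List[str]) -> str:
    """Choose a server from the first three octets of an IPv4 address (simplified)."""
    key = ".".join(client_ip.split(".")[:3])     # e.g. "192.168.1" for 192.168.1.10
    return servers[zlib.crc32(key.encode()) % len(servers)]


servers = ["localhost:8080", "localhost:8081"]
first = pick_by_ip("192.168.1.10", servers)
again = pick_by_ip("192.168.1.10", servers)
neighbour = pick_by_ip("192.168.1.200", servers)
print(first, again, neighbour)
```

The same client always lands on the same server, and so do clients that share the first three octets, which is why all three calls print the same server.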

4. fair (third party)

Requests are distributed according to the response time of the back-end servers; servers with shorter response times are assigned requests first. This gives users faster responses.

upstream backend {
    fair;
    server localhost:8080;
    server localhost:8081;
}

5. url_hash (third party)

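Requests are distributed by the hash of the requested URL, so that each URL always goes to the same back-end server, which is most effective when the back-end servers cache content. Add the hash statement to the upstream block; the server statements must then not carry other parameters such as weight. hash_method selects the hash algorithm. Since version 1.7.2 Nginx also ships a built-in hash directive that takes only the key (for example hash $request_uri;), in which case hash_method is not used.

upstream backend {
    hash $request_uri;
    # hash_method is specific to the third-party module; omit it with the built-in hash directive
    hash_method crc32;
    server localhost:8080;
    server localhost:8081;
}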

Each of the five load balancing policies above suits different situations, so choose the one that fits your case. Note that fair requires a third-party module, and url_hash requires one on Nginx versions that lack the built-in hash directive.


V. Separation of Static and Dynamic Content

Separating dynamic and static content means splitting a dynamic website's resources, by certain rules, into those that rarely change and those that change often. Once the two are split, static resources can be cached according to their characteristics. This is the core idea behind static resource handling.

upstream web_servers {
    server localhost:8081;
    server localhost:8082;
}

server {
    listen 80;
    server_name localhost;

    set $doc_root /usr/local/var/www;

    # static resources are served directly from disk
    location ~* \.(gif|jpg|jpeg|png|bmp|ico|swf|css|js)$ {
        root $doc_root/img;
    }

    # everything else is proxied to the back-end servers
    location / {
        proxy_pass http://web_servers;
        # Header Host must be specified
        proxy_set_header Host $host:$server_port;
    }

    error_page 500 502 503 504 /50x.html;
    location = /50x.html {
        root $doc_root;
    }
}
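
The routing decision this configuration makes can be modelled in Python. This is a minimal sketch that ignores the /50x.html location and any caching; the function name route and the returned tuples are assumptions for illustration.

```python
import re
from typing import Tuple

STATIC = re.compile(r"\.(gif|jpg|jpeg|png|bmp|ico|swf|css|js)$", re.IGNORECASE)
DOC_ROOT = "/usr/local/var/www"


def route(uri: str) -> Tuple[str, str]:
    """Decide whether a request path is served from disk or proxied (simplified)."""
    if STATIC.search(uri):
        return ("static", DOC_ROOT + "/img" + uri)   # root $doc_root/img;
    return ("proxy", "http://web_servers" + uri)     # proxy_pass http://web_servers;


print(route("/logo.PNG"))
print(route("/api/user/login"))
```

Static file types never reach the application servers, which leaves them free to handle only the dynamic requests.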

VI. Others

1. return directive

Returns an HTTP status code; the optional second parameter can be a redirect URL.

location /permanently/moved/url {
    return 301 http://www.example.com/moved/here;
}

2. rewrite directive

The rewrite directive rewrites the request URI, and it can modify the URI multiple times during request processing. It takes two required parameters and one optional parameter.

The first (required) parameter is a regular expression that the request URI must match.

The second parameter is the URI used to replace the matching URI.

The optional third parameter is a flag that can stop further processing of the rewrite directive or send a redirect (code 301 or 302)

location /users/ {
    rewrite ^/users/(.*)$ /show?user=$1 break;
}
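
The substitution performed by this rewrite rule can be reproduced with Python's re.sub, where \1 plays the role of $1. This sketch covers only the pattern replacement, not the break flag or the rest of request processing; rewrite_users and the sample paths are hypothetical.

```python
import re


def rewrite_users(uri: str) -> str:
    """Apply the equivalent of: rewrite ^/users/(.*)$ /show?user=$1"""
    return re.sub(r"^/users/(.*)$", r"/show?user=\1", uri)


print(rewrite_users("/users/alice"))
print(rewrite_users("/orders/1"))
```

A URI that does not match the regular expression, such as /orders/1, is returned unchanged.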

3. error_page directive

With the error_page directive, you can configure NGINX to return a custom page along with an error code, substitute a different error code in the response, or redirect the browser to another URI. In the following example, the error_page directive specifies the page (/404.html) to return for the 404 error code.

error_page 404 /404.html;

4. logs

Access log: uncomment the log_format and access_log directives to enable it. The gzip on; directive shown with them turns on response compression and is independent of logging.

log_format main '$remote_addr - $remote_user [$time_local] "$request" '
                '$status $body_bytes_sent "$http_referer" '
                '"$http_user_agent" "$http_x_forwarded_for"';

access_log /usr/local/etc/nginx/logs/host.access.log main;

gzip on;

5. deny directive

# Deny access to files with the given extensions
location ~* \.(txt|doc)$ {
    root $doc_root;
    deny all;
}

6. Built-in variables

The built-in variables available in nginx configuration files start with the dollar sign $; some people also call them global variables. The values of some of these predefined variables can be changed.

  • $args : The arguments in the request line, same as $query_string.
  • $content_length : The Content-Length field in the request header.
  • $content_type : The Content-Type field in the request header.
  • $document_root : The value specified in the root directive for the current request.
  • $host : The Host field of the request header, or the server name if that header is absent.
  • $http_user_agent : The client's User-Agent information.
  • $http_cookie : The client's cookie information.
  • $limit_rate : Setting this variable limits the response rate to the client.
  • $request_method : The action requested by the client, usually GET or POST.
  • $remote_addr : The IP address of the client.
  • $remote_port : The port of the client.
  • $remote_user : The user name that has been authenticated by the Auth Basic module.
  • $request_filename : The file path for the current request, generated from the root or alias directive and the request URI.
  • $scheme : The request scheme (http or https).
  • $server_protocol : The protocol used for the request, usually HTTP/1.0 or HTTP/1.1.
  • $server_addr : The server address; determining this value normally requires a system call.
  • $server_name : The name of the server.
  • $server_port : The port number on which the request reached the server.
  • $request_uri : The original URI including the request arguments, without the host name, e.g. /foo/bar.php?arg=baz.
  • $uri : The current URI without request arguments and without the host name, e.g. /foo/bar.html.
  • $document_uri : Same as $uri.
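
The relationship between $request_uri, $uri, and $args can be illustrated by splitting a URL in Python. This is only an approximation: Nginx also normalizes and decodes $uri, which the sketch does not do, and the function name request_variables is hypothetical.

```python
from urllib.parse import urlsplit


def request_variables(url: str) -> dict:
    """Approximate a few built-in variables for a request URL (simplified)."""
    parts = urlsplit(url)
    request_uri = parts.path + ("?" + parts.query if parts.query else "")
    return {
        "$scheme": parts.scheme,
        "$host": parts.hostname,
        "$uri": parts.path,
        "$args": parts.query,
        "$request_uri": request_uri,
    }


variables = request_variables("http://localhost/foo/bar.php?arg=baz")
for name, value in variables.items():
    print(name, "=", value)
```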
