Understanding nginx (proxying, reverse proxying, load balancing)

May 25, 2018 0 Comments

Understanding nginx (proxying, reverse proxying, load balancing)

 

 


This piece was originally posted on my blog at legobox

Many think of Nginx as just a web server, yep it's a web server, and so much more, Nginx is first and foremost a proxy engine, and load balancing server mechanism, which is capable of working as a web server using whichever preprocessor server you’d like, be it uWSGI, php-fpm etc.

Many of us have heard of nginx, and some of us are looking forward to checking out what it is, in this piece we take a good look at how to use nginx as a load balancer and a general proxy passer.

Pre-requiste

Before getting up and running with nginx, it's usually important (but not compulsory) that you meet the following conditions,


  • You are looking to learn and understand some more devOps

  • You have an understanding of what a web server is in general

  • You’ve used a computer before. 😏

Got all these down, then you are good to go.

Proxying

The first scenario and struct we are going to check out is the concept and term of proxying. In general, terms, proxying is simply having one server position in front of another and have all request going to the second server go through the first server.

Illustration here

in nginx, this is achieved using the proxy pass directive.

 // basic proxy pass location / { proxypass http://ourserver2here.com } 

With the proxy pass, there are many tricks and these are important to understanding with respect to setting a proxy pass system, let’s have a look at some of these in the books.

Matching prefixes

 location /match/here { proxypass http://ourserver2here.com; } 

In the above setup, no URI is given at the end of the server in the proxy pass definition. In this case a request such as /match/here/please, This request would be sent to ourserver2here as https://ourserver2here/match/here/please. Pretty convenient right?

Reverse Proxying

This is one type of proxying that’s particularly interesting, let have a look at what Wikipedia describes it as since those guys are incredibly good at describing things.

In computer networks, a reverse proxy is a type of proxy server that retrieves resources on behalf of a client from one or more servers. These resources are then returned to the client as if they originated from the Web server itself.

There’s a virtual description

source - wikipedia - such good people

With nginx, we could achieve this, and even take it further, ever heard of ngrok, maybe you may have, maybe not either way,

Ngrok is a service that allows you server your localhost application via a domain proxy, exposing it via a public URL.

With the reverse proxy, and setting up a subdomain on your site for such businesses, you can have your own ngrok without the constant change in the domain name and stuff.

To pull this off, the first part is to setup nginx on your server to proxy connections from your specific subdomain or domain (whichever you find convenient) to a specific port on the server.

 server { listen 80; servername coolsubdomain.mainname.com; location / { proxypass http://127.0.0.1:5000; #in this case we picked the port 5000 } } 

on your computer, assuming our application is served on our localhost port 8000 (it could be any port really) proceeding to use ssh we run a reverse tunnel to the port on the server.

 // run a reverse proxy to your server via ssh ssh -R 5000:127.0.0.1:8000 youruser@mainname.com 

This sets up the reverse proxy, therefore any request going to coolsubdomain.mainname.com is actually coming to your localhost machine.

Load balancing

Load balancing is another problem nginx has been uniquely positioned to solve, it’s built to handle a lot of requests and if we have several servers running our project, we can balance the load between them using upstreams, load balancing is broad, therefore we are going to dive deep a little but not too much.

Nginx uses upstreams to balance the load as mentioned earlier, and it has a few algorithms by default which are implored to handle the load balancing mechanism.

These algorithms are specified by directives and can be noted as follows.



  • (round robin): The default load balancing algorithm that is used if no other balancing directives are present. Each server defined in the upstream context is passed requests sequentially in turn.


  • leastconn : Specifies that new connections should always be given to the backend that has the least number of active connections. This can be especially useful in situations where connections to the backend may persist for some time.


  • ip
    hash : This balancing algorithm distributes requests to different servers based on the client's IP address. The first three octets are used as a key to deciding on the server to handle the request. The result is that clients tend to be served by the same server each time, which can assist in session consistency.


  • hash : This balancing algorithm is mainly used with memcached proxying. The servers are divided based on the value of an arbitrarily provided hash key. This can be text, variables, or a combination. This is the only balancing method that requires the user to provide data, which is the key that should be used for the hash

When setting up, for load balancing it may look like this.

 # http context upstream backendhosts { leastconn; server host1.example.com; server host2.example.com; server host3.example.com; } # next we set up our server server { listen 80; servername example.com; location /proxy-me { proxypass http://backendhosts; } } 

We could even add weights to a specific host so they handle more connections in general than the rest according to a certain ratio.

 ... upstream backendhosts { least_conn; server host1.example.com weight=3; server host2.example.com; server host3.example.com; } ... 

Conclusion

Where are many other things we can achieve using nginx, this is pretty much just the tip of the iceberg. Setting up web servers and proxy servers can usually be a bit of a hassle, but it doesn’t have to.

In the next post under this category, I’d explore how to get this up and running using, caddy, it's a web server technology which aims to make the process of setting up web servers and proxy engines a whole lot easier.


Tag cloud