=encoding utf-8


=head1 Name


load_balancing - Using nginx as an HTTP load balancer


=head1 Introduction


Load balancing across multiple application instances is a commonly used
technique for optimizing resource utilization, maximizing throughput,
reducing latency, and ensuring fault-tolerant configurations.


nginx can be used as a very efficient HTTP load balancer that
distributes traffic to several application servers and improves the
performance, scalability and reliability of web applications.


=head1 Load balancing methods


The following load balancing mechanisms (or methods) are supported in
nginx:

=over


=item *

round-robin — requests to the application servers are distributed
in a round-robin fashion,


=item *

least-connected — next request is assigned to the server with the
least number of active connections,


=item *

ip-hash — a hash-function is used to determine what server should
be selected for the next request (based on the client’s IP address).


=back


=head1 Default load balancing configuration


The simplest configuration for load balancing with nginx may look
like the following:

    
    http {
        upstream myapp1 {
            server srv1.example.com;
            server srv2.example.com;
            server srv3.example.com;
        }
    
        server {
            listen 80;
    
            location / {
                proxy_pass http://myapp1;
            }
        }
    }


In the example above, there are 3 instances of the same application
running on srv1-srv3.
When the load balancing method is not specifically configured,
it defaults to round-robin.
All requests are
L<proxied|ngx_http_proxy_module> to the server group myapp1, and nginx applies HTTP load
balancing to distribute the requests.


The reverse proxy implementation in nginx includes load balancing for HTTP,
HTTPS, FastCGI, uwsgi, SCGI, memcached, and gRPC.


To configure load balancing for HTTPS instead of HTTP, use “https”
as the protocol in the
L<proxy_pass|ngx_http_proxy_module> URL.
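
As a minimal sketch, assuming the application servers in the myapp1
group accept TLS connections on port 443, only the scheme passed to
proxy_pass changes. The ports are spelled out because a server listed
in an upstream group without a port defaults to port 80:

    http {
        upstream myapp1 {
            # explicit TLS port (assumed for this illustration)
            server srv1.example.com:443;
            server srv2.example.com:443;
            server srv3.example.com:443;
        }
        server {
            listen 80;
            location / {
                # "https" makes nginx use TLS when talking to the group
                proxy_pass https://myapp1;
            }
        }
    }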


When setting up load balancing for FastCGI, uwsgi, SCGI, memcached, or gRPC, use
L<ngx_http_fastcgi_module>,
L<ngx_http_uwsgi_module>,
L<ngx_http_scgi_module>,
L<ngx_http_memcached_module>, and
L<ngx_http_grpc_module>
directives respectively.
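
For example, FastCGI requests could be balanced across the same group
roughly as follows. This is a sketch that assumes each server runs a
FastCGI application on port 9000 (an illustrative port) and that the
fastcgi_params file distributed with nginx is available; both blocks
belong inside the http context:

    upstream myapp1 {
        server srv1.example.com:9000;
        server srv2.example.com:9000;
        server srv3.example.com:9000;
    }
    server {
        listen 80;
        location / {
            # standard FastCGI parameters from the file shipped with nginx
            include fastcgi_params;
            # the group name is given without a scheme
            fastcgi_pass myapp1;
        }
    }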


=head1 Least connected load balancing


Another load balancing discipline is least-connected.
It distributes the load across application instances more fairly
when some of the requests take longer to complete.


With the least-connected load balancing, nginx will try not to overload a
busy application server with excessive requests, distributing the new
requests to a less busy server instead.


Least-connected load balancing in nginx is activated when the
L<least_conn|ngx_http_upstream_module> directive is used as part of the server group configuration:

    
        upstream myapp1 {
            least_conn;
            server srv1.example.com;
            server srv2.example.com;
            server srv3.example.com;
        }


=head1 Session persistence


Please note that with round-robin or least-connected load
balancing, each subsequent request from a client can potentially be
distributed to a different server.
There is no guarantee that the same client will always be
directed to the same server.


If there is the need to tie a client to a particular application server —
in other words, make the client’s session “sticky” or “persistent” in
terms of always trying to select a particular server — the ip-hash load
balancing mechanism can be used.


With ip-hash, the client’s IP address is used as a hashing key to
determine what server in a server group should be selected for the
client’s requests.
This method ensures that the requests from the same client
will always be directed to the same server
except when this server is unavailable.


To configure ip-hash load balancing, just add the
L<ip_hash|ngx_http_upstream_module>
directive to the server (upstream) group configuration:

    
    upstream myapp1 {
        ip_hash;
        server srv1.example.com;
        server srv2.example.com;
        server srv3.example.com;
    }
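
If a server in such a group has to be taken out of rotation
temporarily, the nginx reference documentation recommends marking it
with the down parameter rather than deleting its line, so that the
current hashing of client IP addresses is preserved. A sketch, with
srv2 picked arbitrarily for illustration:

    upstream myapp1 {
        ip_hash;
        server srv1.example.com;
        # temporarily out of rotation; keeping the line preserves the hashing
        server srv2.example.com down;
        server srv3.example.com;
    }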


=head1 Weighted load balancing


It is also possible to influence nginx load balancing algorithms even
further by using server weights.


In the examples above, the server weights are not configured, which means
that all specified servers are treated as equally qualified for a
particular load balancing method.


With round-robin in particular, it also means a more or less equal
distribution of requests across the servers, provided there are enough
requests and the requests are processed in a uniform manner and
completed fast enough.


When the
L<weight|ngx_http_upstream_module>
parameter is specified for a server, the weight is taken into account
in the load balancing decision:

    
        upstream myapp1 {
            server srv1.example.com weight=3;
            server srv2.example.com;
            server srv3.example.com;
        }


With this configuration, every 5 new requests will be distributed across
the application instances as follows: 3 requests will be directed
to srv1, one request will go to srv2, and one to srv3.


It is similarly possible to use weights with least-connected and
ip-hash load balancing in recent versions of nginx.
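
For instance, a sketch that combines least_conn with the weights from
the example above. The weight biases the selection, so srv1 can be
expected to carry a larger share of the active connections than srv2
or srv3:

    upstream myapp1 {
        least_conn;
        # same weight as in the round-robin example
        server srv1.example.com weight=3;
        server srv2.example.com;
        server srv3.example.com;
    }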


=head1 Health checks


The reverse proxy implementation in nginx includes in-band (or passive)
server health checks.
If the response from a particular server fails with an error,
nginx will mark this server as failed, and will try to
avoid selecting this server for subsequent inbound requests for a while.


The
L<max_fails|ngx_http_upstream_module>
parameter sets the number of consecutive unsuccessful attempts to
communicate with the server that must occur within
L<fail_timeout|ngx_http_upstream_module>
for the server to be marked as failed.
By default,
L<max_fails|ngx_http_upstream_module>
is set to 1.
When it is set to 0, health checks are disabled for this server.
The
L<fail_timeout|ngx_http_upstream_module>
parameter also defines how long the server stays marked as failed.
After the
L<fail_timeout|ngx_http_upstream_module>
interval following the server failure, nginx starts to gracefully
probe the server with live client requests.
If the probes succeed, the server is marked as live again.

=head1 Further reading


In addition, there are more directives and parameters that control server
load balancing in nginx, e.g.
L<proxy_next_upstream|ngx_http_proxy_module>,
L<backup|ngx_http_upstream_module>,
L<down|ngx_http_upstream_module>, and
L<keepalive|ngx_http_upstream_module>.
For more information, please check our
L<reference documentation|..>.
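
As a rough sketch of how some of these fit together, the fragment
below uses the backup and down server parameters together with the
keepalive and proxy_next_upstream directives. The specific values
(16 idle connections, retrying on 502 responses) are assumptions
chosen for illustration:

    upstream myapp1 {
        server srv1.example.com;
        # excluded from balancing until the parameter is removed
        server srv2.example.com down;
        # receives requests only when the primary servers are unavailable
        server srv3.example.com backup;
        # idle connections to the group kept open per worker process
        keepalive 16;
    }
    server {
        listen 80;
        location / {
            proxy_pass http://myapp1;
            # needed for keepalive connections to HTTP upstreams
            proxy_http_version 1.1;
            proxy_set_header Connection "";
            # pass the request to the next server in these cases
            proxy_next_upstream error timeout http_502;
        }
    }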


Last but not least,
application load balancing,
application health checks,
activity monitoring and
on-the-fly reconfiguration of server groups are available
as part of our paid NGINX Plus subscriptions.