Anatomy of a Header Filter
A header filter consists of three basic steps:
1. Decide whether to operate on this response
2. Operate on the response
3. Call the next filter
To take an example, here's a simplified version of the "not modified" header filter, which sets the status to 304 Not Modified if the client's If-Modified-Since header matches the response's Last-Modified header. Note that header filters take in the ngx_http_request_t struct as the only argument, which gives us access to both the client headers and the soon-to-be-sent response headers.
static
ngx_int_t ngx_http_not_modified_header_filter(ngx_http_request_t *r)
{
    time_t  if_modified_since;

    /* no If-Modified-Since header from the client: nothing to compare */
    if (r->headers_in.if_modified_since == NULL) {
        return ngx_http_next_header_filter(r);
    }

    if_modified_since = ngx_http_parse_time(r->headers_in.if_modified_since->value.data,
                                            r->headers_in.if_modified_since->value.len);

    /* step 1: decide whether to operate */
    if (if_modified_since != NGX_ERROR &&
        if_modified_since == r->headers_out.last_modified_time) {

        /* step 2: operate on the header */
        r->headers_out.status = NGX_HTTP_NOT_MODIFIED;
        r->headers_out.content_type.len = 0;
        ngx_http_clear_content_length(r);
        ngx_http_clear_accept_ranges(r);
    }

    /* step 3: call the next filter */
    return ngx_http_next_header_filter(r);
}
The headers_out structure is just the same as we saw in the section about handlers (cf. http/ngx_http_request.h), and can be manipulated to no end.
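Step 3 relies on every filter holding a pointer to the filter that comes after it. Here is a toy model of that linkage, runnable outside Nginx. Everything prefixed with mock_ is a stand-in invented for this sketch (the real types and the real header-sending function are different), but the install step mirrors the usual pattern of saving the current top filter as "next" and then putting yourself on top.

```c
#include <stddef.h>
#include <time.h>

/* Stand-in request type: NOT the real ngx_http_request_t. */
typedef struct {
    time_t  if_modified_since;   /* parsed client header, (time_t) -1 if absent */
    time_t  last_modified_time;  /* the response's Last-Modified */
    int     status;
} mock_request_t;

typedef int (*mock_header_filter_pt)(mock_request_t *r);

/* Hypothetical end of the chain: pretends to write the headers out. */
static int mock_send_header(mock_request_t *r)
{
    (void) r;
    return 0;
}

static mock_header_filter_pt  mock_top_header_filter = mock_send_header;
static mock_header_filter_pt  mock_next_header_filter;

static int mock_not_modified_filter(mock_request_t *r)
{
    /* step 1: decide whether to operate */
    if (r->if_modified_since != (time_t) -1
        && r->if_modified_since == r->last_modified_time)
    {
        /* step 2: operate on the response */
        r->status = 304;
    }

    /* step 3: call the next filter */
    return mock_next_header_filter(r);
}

/* Installation: remember the old top filter, then become the top. */
static void mock_install_filter(void)
{
    mock_next_header_filter = mock_top_header_filter;
    mock_top_header_filter = mock_not_modified_filter;
}
```

After mock_install_filter() runs, calling mock_top_header_filter(&r) enters our filter first and falls through to mock_send_header.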
Anatomy of a Body Filter
The buffer chain makes it a little tricky to write a body filter, because the body filter can only operate on one buffer (chain link) at a time. The module must decide whether to overwrite the input buffer, replace the buffer with a newly allocated buffer, or insert a new buffer before or after the buffer in question. To complicate things, sometimes a module will receive several buffers so that it has an incomplete buffer chain that it must operate on. Unfortunately, Nginx does not provide a high-level API for manipulating the buffer chain, so body filters can be difficult to understand (and to write). But, here are some operations you might see in action.
A body filter's prototype might look like this (example taken from the "chunked" filter in the Nginx source):
static ngx_int_t ngx_http_chunked_body_filter(ngx_http_request_t *r, ngx_chain_t *in);
The first argument is our old friend the request struct. The second argument is a pointer to the head of the current partial chain (which could contain 0, 1, or more buffers).
Let's take a simple example. Suppose we want to append the text "<!-- Served by Nginx -->" to the end of every response. First, we need to figure out if the response's final buffer is included in the buffer chain we were given. Like I said, there's not a fancy API, so we'll be rolling our own for loop:
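ngx_chain_t *chain_link;
int chain_contains_last_buffer = 0;

/* an empty chain has nothing for us to append to */
if (in == NULL)
    return ngx_http_next_body_filter(r, in);

/* check every link, and stop ON the final link rather than past it */
for ( chain_link = in; ; chain_link = chain_link->next ) {
    if (chain_link->buf->last_buf)
        chain_contains_last_buffer = 1;
    if (chain_link->next == NULL)
        break;
}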
Now let's bail out if we don't have that last buffer:
if (!chain_contains_last_buffer)
    return ngx_http_next_body_filter(r, in);
Super, now the last buffer is stored in chain_link. Now we allocate a new buffer:
ngx_buf_t *b;

b = ngx_calloc_buf(r->pool);
if (b == NULL) {
    return NGX_ERROR;
}
And put some data in it:
b->pos = (u_char *) "<!-- Served by Nginx -->";
b->last = b->pos + sizeof("<!-- Served by Nginx -->") - 1;
b->memory = 1; /* the data lives in read-only memory (a string literal) */
And hook the buffer into a new chain link:
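ngx_chain_t *added_link;

/* allocate the link from the request pool so it outlives this function */
added_link = ngx_alloc_chain_link(r->pool);
if (added_link == NULL)
    return NGX_ERROR;
added_link->buf = b;
added_link->next = NULL;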
Finally, hook the new chain link to the final chain link we found before:
chain_link->next = added_link;
And reset the "last_buf" variables to reflect reality:
chain_link->buf->last_buf = 0;
added_link->buf->last_buf = 1;
And pass along the modified chain to the next output filter:
return ngx_http_next_body_filter(r, in);
The resulting function takes much more effort than what you'd do with, say, mod_perl ($response->body =~ s/$/<!-- Served by mod_perl -->/), but the buffer chain is a very powerful construct, allowing programmers to process data incrementally so that the client gets something as soon as possible. However, in my opinion, the buffer chain desperately needs a cleaner interface so that programmers can't leave the chain in an inconsistent state. For now, manipulate it at your own risk.
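To see the whole append operation in one place, here is a sketch that runs outside Nginx. The mock_ types are simplified stand-ins I made up for illustration (the real ngx_buf_t and ngx_chain_t carry many more fields), and calloc stands in for the request pool allocators. The logic follows the steps above: find the tail, check for the last buffer, allocate, link, and fix up the last_buf flags.

```c
#include <stddef.h>
#include <stdlib.h>

/* Stand-in types: NOT the real ngx_buf_t / ngx_chain_t. */
typedef struct {
    const unsigned char  *pos;
    const unsigned char  *last;
    unsigned              last_buf:1;
} mock_buf_t;

typedef struct mock_chain_s {
    mock_buf_t           *buf;
    struct mock_chain_s  *next;
} mock_chain_t;

static const char mock_footer[] = "<!-- Served by Nginx -->";

/*
 * Returns 1 if the footer was appended, 0 if this partial chain does not
 * hold the response's last buffer (or is empty), -1 if allocation failed.
 */
static int mock_append_footer(mock_chain_t *in)
{
    int            found = 0;
    mock_buf_t    *b;
    mock_chain_t  *link, *tail = NULL, *added;

    /* walk the chain, remembering the final link */
    for (link = in; link != NULL; link = link->next) {
        if (link->buf->last_buf) {
            found = 1;
        }
        tail = link;
    }

    if (!found) {
        return 0;
    }

    b = calloc(1, sizeof(*b));
    added = calloc(1, sizeof(*added));
    if (b == NULL || added == NULL) {
        free(b);
        free(added);
        return -1;
    }

    /* point the new buffer at the footer text (without the trailing NUL) */
    b->pos = (const unsigned char *) mock_footer;
    b->last = b->pos + sizeof(mock_footer) - 1;
    b->last_buf = 1;

    added->buf = b;
    added->next = NULL;

    /* the old tail is no longer the end of the response */
    tail->buf->last_buf = 0;
    tail->next = added;

    return 1;
}
```

A real filter would hand the modified chain to the next body filter afterwards; this sketch stops at the chain surgery.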
Anatomy of a Load-Balancer
A load-balancer is just a way to decide which backend server will receive a particular request; implementations exist for distributing requests in round-robin fashion or hashing some information about the request. This section will describe both a load-balancer's installation and its invocation, using the upstream_hash module (full source) as an example. upstream_hash chooses a backend by hashing a variable specified in nginx.conf.
A load-balancing module has six pieces:
1. The enabling configuration directive (e.g., hash $request_uri;) will call a registration function
2. The registration function will define the legal server options (e.g., weight=) and register an upstream initialization function
3. The upstream initialization function is called just after the configuration is validated, and it:
* resolves the server names to particular IP addresses
* allocates space for sockets
* sets a callback to the peer initialization function
4. the peer initialization function, called once per request, sets up data structures that the load-balancing function will access and manipulate;
5. the load-balancing function decides where to route requests; it is called at least once per client request (more, if a backend request fails). This is where the interesting stuff happens.
6. and finally, the peer release function can update statistics after communication with a particular backend server has finished (whether successfully or not)
It's a lot, but I'll break it down into pieces.
1. The enabling directive
Directive declarations, recall, specify both where they're valid and a function to call when they're encountered. A directive for a load-balancer should have the NGX_HTTP_UPS_CONF flag set, so that Nginx knows this directive is only valid inside an upstream block. It should provide a pointer to a registration function. Here's the directive declaration from the upstream_hash module:
{ ngx_string("hash"),
  NGX_HTTP_UPS_CONF|NGX_CONF_TAKE1, /* one argument: the variable to hash */
  ngx_http_upstream_hash,
  0,
  0,
  NULL },
Nothing new there.
2. The registration function
The callback ngx_http_upstream_hash above is the registration function, so named (by me) because it registers an upstream initialization function with the surrounding upstream configuration. In addition, the registration function defines which options to the server directive are legal inside this particular upstream block (e.g., weight=, fail_timeout=). Here's the registration function of the upstream_hash module:
static char *
ngx_http_upstream_hash(ngx_conf_t *cf, ngx_command_t *cmd, void *conf)
{
    ngx_http_upstream_srv_conf_t  *uscf;
    ngx_http_script_compile_t      sc;
    ngx_str_t                     *value;
    ngx_array_t                   *vars_lengths, *vars_values;

    value = cf->args->elts;

    /* the following is necessary to evaluate the argument to "hash" as a $variable */
    ngx_memzero(&sc, sizeof(ngx_http_script_compile_t));

    vars_lengths = NULL;
    vars_values = NULL;

    sc.cf = cf;
    sc.source = &value[1];
    sc.lengths = &vars_lengths;
    sc.values = &vars_values;
    sc.complete_lengths = 1;
    sc.complete_values = 1;

    if (ngx_http_script_compile(&sc) != NGX_OK) {
        return NGX_CONF_ERROR;
    }
    /* end of $variable stuff */

    uscf = ngx_http_conf_get_module_srv_conf(cf, ngx_http_upstream_module);

    /* the upstream initialization function */
    uscf->peer.init_upstream = ngx_http_upstream_init_hash;

    uscf->flags = NGX_HTTP_UPSTREAM_CREATE;

    /* OK, more $variable stuff */
    uscf->values = vars_values->elts;
    uscf->lengths = vars_lengths->elts;

    /* set a default value for "hash_method" */
    if (uscf->hash_function == NULL) {
        uscf->hash_function = ngx_hash_key;
    }

    return NGX_CONF_OK;
}
Aside from jumping through hoops so we can evaluate $variable later, it's pretty straightforward: assign a callback, set some flags. What flags are available?
* NGX_HTTP_UPSTREAM_CREATE: let there be server directives in this upstream block. I can't think of a situation where you wouldn't use this.
* NGX_HTTP_UPSTREAM_WEIGHT: let the server directives take a weight= option
* NGX_HTTP_UPSTREAM_MAX_FAILS: allow the max_fails= option
* NGX_HTTP_UPSTREAM_FAIL_TIMEOUT: allow the fail_timeout= option
* NGX_HTTP_UPSTREAM_DOWN: allow the down option
* NGX_HTTP_UPSTREAM_BACKUP: allow the backup option
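The flags are plain bit masks, so a registration function combines them with bitwise OR and anything that reads them tests with bitwise AND. A small sketch of that, runnable outside Nginx: the MOCK_ constants and the struct are stand-ins with made-up values, not the real definitions from the Nginx headers.

```c
/* Stand-in flag values for illustration only. */
#define MOCK_UPSTREAM_CREATE        0x0001
#define MOCK_UPSTREAM_WEIGHT        0x0002
#define MOCK_UPSTREAM_MAX_FAILS     0x0004
#define MOCK_UPSTREAM_FAIL_TIMEOUT  0x0008
#define MOCK_UPSTREAM_DOWN          0x0010
#define MOCK_UPSTREAM_BACKUP        0x0020

/* Stand-in for the upstream server configuration struct. */
typedef struct {
    unsigned  flags;
} mock_upstream_srv_conf_t;

/* A hypothetical module that accepts server lines with weight= and down. */
static void mock_register(mock_upstream_srv_conf_t *uscf)
{
    uscf->flags = MOCK_UPSTREAM_CREATE
                  | MOCK_UPSTREAM_WEIGHT
                  | MOCK_UPSTREAM_DOWN;
}

/* Would the configuration parser accept this server option? */
static int mock_option_allowed(const mock_upstream_srv_conf_t *uscf,
    unsigned flag)
{
    return (uscf->flags & flag) != 0;
}
```

With this registration, mock_option_allowed(&uscf, MOCK_UPSTREAM_WEIGHT) is true and mock_option_allowed(&uscf, MOCK_UPSTREAM_BACKUP) is false.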
Each module will have access to these configuration values. It's up to the module to decide what to do with them. That is, max_fails will not be automatically enforced; all the failure logic is up to the module author. More on that later. For now, we still haven't finished following the trail of callbacks. Next up, we have the upstream initialization function (the init_upstream callback in the previous function).
3. The upstream initialization function
The purpose of the upstream initialization function is to resolve the host names, allocate space for sockets, and assign (yet another) callback. Here's how upstream_hash does it:
ngx_int_t
ngx_http_upstream_init_hash(ngx_conf_t *cf, ngx_http_upstream_srv_conf_t *us)
{
    ngx_uint_t                       i, j, n;
    ngx_http_upstream_server_t      *server;
    ngx_http_upstream_hash_peers_t  *peers;

    /* set the callback */
    us->peer.init = ngx_http_upstream_init_hash_peer;

    if (!us->servers) {
        return NGX_ERROR;
    }

    server = us->servers->elts;

    /* figure out how many IP addresses are in this upstream block. */
    /* remember a domain name can resolve to multiple IP addresses. */
    for (n = 0, i = 0; i < us->servers->nelts; i++) {
        n += server[i].naddrs;
    }

    /* allocate space for sockets, etc */
    peers = ngx_pcalloc(cf->pool, sizeof(ngx_http_upstream_hash_peers_t)
                        + sizeof(ngx_peer_addr_t) * (n - 1));

    if (peers == NULL) {
        return NGX_ERROR;
    }

    peers->number = n;

    /* one port/IP address per peer */
    for (n = 0, i = 0; i < us->servers->nelts; i++) {
        for (j = 0; j < server[i].naddrs; j++, n++) {
            peers->peer[n].sockaddr = server[i].addrs[j].sockaddr;
            peers->peer[n].socklen = server[i].addrs[j].socklen;
            peers->peer[n].name = server[i].addrs[j].name;
        }
    }

    /* save a pointer to our peers for later */
    us->peer.data = peers;

    return NGX_OK;
}
This function is a bit more involved than one might hope. Most of the work seems like it should be abstracted, but it's not, so that's what we live with. One strategy for simplifying things is to call the upstream initialization function of another module, have it do all the dirty work (peer allocation, etc), and then override the us->peer.init callback afterwards. For an example, see http/modules/ngx_http_upstream_ip_hash_module.c.
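The "let another module do the dirty work" strategy boils down to two steps: call the other module's upstream initialization function, then overwrite the peer initialization callback it installed. Here is a toy model of that, runnable outside Nginx. All mock_ names are invented for this sketch; mock_init_round_robin plays the role of whatever existing initializer you borrow.

```c
#include <stddef.h>

typedef struct mock_us_s  mock_us_t;
typedef int (*mock_peer_init_pt)(void *request, mock_us_t *us);

/* Stand-in for the upstream configuration: just the two fields we touch. */
struct mock_us_s {
    mock_peer_init_pt   peer_init;   /* models us->peer.init */
    void               *peer_data;   /* models us->peer.data */
};

static int mock_peer_table;   /* pretend this is the allocated peer list */

static int mock_rr_init_peer(void *request, mock_us_t *us)
{
    (void) request;
    (void) us;
    return 1;
}

/* The borrowed initializer: allocates peers and installs ITS callback. */
static int mock_init_round_robin(mock_us_t *us)
{
    us->peer_data = &mock_peer_table;
    us->peer_init = mock_rr_init_peer;
    return 0;
}

static int mock_custom_init_peer(void *request, mock_us_t *us)
{
    (void) request;
    (void) us;
    return 2;
}

/* Our initializer: delegate the allocation, then take over the callback. */
static int mock_init_custom(mock_us_t *us)
{
    if (mock_init_round_robin(us) != 0) {
        return -1;
    }

    us->peer_init = mock_custom_init_peer;

    return 0;
}
```

The catch, which the sketch glosses over, is that your per-request code must then understand the peer data layout of the module you borrowed from.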
The important bit from our point of view is setting a pointer to the peer initialization function, in this case ngx_http_upstream_init_hash_peer.
4. The peer initialization function
The peer initialization function is called once per request. It sets up a data structure that the module will use as it tries to find an appropriate backend server to service that request; this structure is persistent across backend re-tries, so it's a convenient place to keep track of the number of connection failures, or a computed hash value. By convention, this struct is called ngx_http_upstream_<module name>_peer_data_t.
In addition, the peer initialization function sets up two callbacks:
* get: the load-balancing function
* free: the peer release function (usually just updates some statistics when a connection finishes)
As if that weren't enough, it also initializes a variable called tries. As long as tries is positive, Nginx will keep retrying this load-balancer. When tries is zero, Nginx will give up. It's up to the get and free functions to set tries appropriately.
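To make the interplay between get, free, and tries concrete, here is a toy driver loop that runs outside Nginx. It is only a model of the contract described above, not how the Nginx upstream code is actually structured, and every mock_ name (including the value of MOCK_PEER_FAILED) is an assumption made for illustration. The pretend balancer simply walks the backends in order.

```c
#include <stddef.h>

#define MOCK_PEER_FAILED  4   /* stand-in for NGX_PEER_FAILED */

typedef struct mock_pc_s  mock_pc_t;

/* Stand-in for the peer connection: callbacks, tries, and the choice made. */
struct mock_pc_s {
    unsigned   tries;
    int        current;
    int      (*get_peer)(mock_pc_t *pc, void *data);
    void     (*free_peer)(mock_pc_t *pc, void *data, unsigned state);
};

/* Hypothetical per-request data, persistent across retries. */
typedef struct {
    int  next;
    int  number;
} mock_peer_data_t;

static int mock_get_peer(mock_pc_t *pc, void *data)
{
    mock_peer_data_t  *d = data;

    pc->current = d->next % d->number;
    return 0;
}

static void mock_free_peer(mock_pc_t *pc, void *data, unsigned state)
{
    mock_peer_data_t  *d = data;

    if (state & MOCK_PEER_FAILED) {
        d->next++;               /* move on to another backend */
        if (pc->tries > 0) {
            pc->tries--;         /* one fewer attempt left */
        }
    }
}

/* Models the peer initialization function: callbacks plus tries. */
static void mock_init_peer(mock_pc_t *pc, mock_peer_data_t *d, int number,
    unsigned retries)
{
    d->next = 0;
    d->number = number;
    pc->current = -1;
    pc->get_peer = mock_get_peer;
    pc->free_peer = mock_free_peer;
    pc->tries = retries + 1;
}

/* Toy driver: returns the backend that answered, or -1 once tries hits 0. */
static int mock_run_request(mock_pc_t *pc, void *data, const int *backend_up)
{
    while (pc->tries > 0) {
        if (pc->get_peer(pc, data) != 0) {
            break;
        }
        if (backend_up[pc->current]) {
            pc->free_peer(pc, data, 0);
            return pc->current;
        }
        pc->free_peer(pc, data, MOCK_PEER_FAILED);
    }

    return -1;
}
```

With three backends where only the third is up, two retries are enough to reach it; with one retry the driver gives up.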
Here's a peer initialization function from the upstream_hash module:
static ngx_int_t
ngx_http_upstream_init_hash_peer(ngx_http_request_t *r,
    ngx_http_upstream_srv_conf_t *us)
{
    ngx_http_upstream_hash_peer_data_t  *uhpd;
    ngx_str_t                            val;

    /* evaluate the argument to "hash" */
    if (ngx_http_script_run(r, &val, us->lengths, 0, us->values) == NULL) {
        return NGX_ERROR;
    }

    /* data persistent through the request */
    uhpd = ngx_pcalloc(r->pool, sizeof(ngx_http_upstream_hash_peer_data_t)
            + sizeof(uintptr_t)
              * ((ngx_http_upstream_hash_peers_t *)us->peer.data)->number
                  / (8 * sizeof(uintptr_t)));
    if (uhpd == NULL) {
        return NGX_ERROR;
    }

    /* save our struct for later */
    r->upstream->peer.data = uhpd;

    uhpd->peers = us->peer.data;

    /* set the callbacks and initialize "tries" to "hash_again" + 1*/
    r->upstream->peer.free = ngx_http_upstream_free_hash_peer;
    r->upstream->peer.get = ngx_http_upstream_get_hash_peer;
    r->upstream->peer.tries = us->retries + 1;

    /* do the hash and save the result */
    uhpd->hash = us->hash_function(val.data, val.len);

    return NGX_OK;
}
That wasn't so bad. Now we're ready to pick an upstream server.
5. The load-balancing function
It's time for the main course. The real meat and potatoes. This is where the module picks an upstream. The load-balancing function's prototype looks like:
static ngx_int_t
ngx_http_upstream_get_<module_name>_peer(ngx_peer_connection_t *pc, void *data);
data is our struct of useful information concerning this client connection. pc will have information about the server we're going to connect to. The job of the load-balancing function is to fill in values for pc->sockaddr, pc->socklen, and pc->name. If you know some network programming, then those variable names might be familiar; but they're actually not very important to the task at hand. We don't care what they stand for; we just want to know where to find appropriate values to fill them.
This function must find a list of available servers, choose one, and assign its values to pc. Let's look at how upstream_hash does it.
upstream_hash previously stashed the server list into the ngx_http_upstream_hash_peer_data_t struct back in the call to ngx_http_upstream_init_hash_peer (above). This struct is now available as data:
ngx_http_upstream_hash_peer_data_t *uhpd = data;
The list of peers is now stored in uhpd->peers->peer. Let's pick a peer from this array by taking the computed hash value modulo the number of servers:
ngx_peer_addr_t *peer = &uhpd->peers->peer[uhpd->hash % uhpd->peers->number];
Now for the grand finale:
pc->sockaddr = peer->sockaddr;
pc->socklen = peer->socklen;
pc->name = &peer->name;
return NGX_OK;
That's all! If the load-balancer returns NGX_OK, it means, "go ahead and try this server". If it returns NGX_BUSY, it means all the backend hosts are unavailable, and Nginx should try again.
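If the hash-then-modulo selection feels too terse, here it is as a small program fragment you can run outside Nginx. The hash is a simple multiply-by-31 string hash that I am using as a stand-in; treat it as an assumption, not as the exact function the module calls. The mock_ types are likewise invented for the sketch.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for a peer entry; the real one holds sockaddr, socklen, name. */
typedef struct {
    const char  *name;
} mock_peer_t;

/* Stand-in hash: key = key * 31 + byte, over the whole input. */
static uint32_t mock_hash_key(const unsigned char *data, size_t len)
{
    size_t    i;
    uint32_t  key = 0;

    for (i = 0; i < len; i++) {
        key = key * 31 + data[i];
    }

    return key;
}

/* Pick a peer: the same input always lands on the same array slot. */
static const mock_peer_t *mock_pick_peer(const mock_peer_t *peers,
    size_t number, const unsigned char *data, size_t len)
{
    if (number == 0) {
        return NULL;   /* nothing to choose from */
    }

    return &peers[mock_hash_key(data, len) % number];
}
```

The useful property is determinism: as long as the peer list does not change, a given hashed value is always routed to the same backend.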
But... how do we keep track of what's unavailable? And what if we don't want it to try again?
6. The peer release function
The peer release function operates after an upstream connection takes place; its purpose is to track failures. Here is its function prototype:
void
ngx_http_upstream_free_<module name>_peer(ngx_peer_connection_t *pc, void *data,
ngx_uint_t state);
The first two parameters are just the same as we saw in the load-balancer function. The third parameter is a state variable, which indicates whether the connection was successful. It may contain two values bitwise OR'd together: NGX_PEER_FAILED (the connection failed) and NGX_PEER_NEXT (either the connection failed, or it succeeded but the application returned an error). Zero means the connection succeeded.
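Decoding the state argument is a matter of testing bits. The sketch below, which runs outside Nginx, sorts a state value into the three cases just described. The MOCK_ constants are stand-ins with values picked for the example, and the statistics struct is hypothetical.

```c
/* Stand-in bit values; the real constants come from the Nginx headers. */
#define MOCK_PEER_NEXT    2
#define MOCK_PEER_FAILED  4

/* Hypothetical counters a module might keep in its per-request data. */
typedef struct {
    unsigned  successes;
    unsigned  connect_failures;
    unsigned  app_errors;
} mock_stats_t;

static void mock_record_state(mock_stats_t *s, unsigned state)
{
    if (state == 0) {
        /* zero: the connection succeeded */
        s->successes++;

    } else if (state & MOCK_PEER_FAILED) {
        /* the connection itself failed */
        s->connect_failures++;

    } else if (state & MOCK_PEER_NEXT) {
        /* connected fine, but the application returned an error */
        s->app_errors++;
    }
}
```

Testing the FAILED bit before the NEXT bit matters here, since a failed connection may arrive with both bits set.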
It's up to the module author to decide what to do about these failure events. If they are to be used at all, the results should be stored in data, a pointer to the custom per-request data struct.
But the crucial purpose of the peer release function is to set pc->tries to zero if you don't want Nginx to keep trying this load-balancer during this request. The simplest peer release function would look like this:
pc->tries = 0;
That would ensure that if there's ever an error reaching a backend server, a 502 Bad Gateway error will be returned to the client.
Here's a more complicated example, taken from the upstream_hash module. If a backend connection fails, it marks it as failed in a bit-vector (called tried, an array of type uintptr_t), then keeps choosing a new backend until it finds one that has not failed.
/* fully parenthesized so the macros are safe inside larger expressions */
#define ngx_bitvector_index(index) ((index) / (8 * sizeof(uintptr_t)))
#define ngx_bitvector_bit(index) ((uintptr_t) 1 << ((index) % (8 * sizeof(uintptr_t))))

static void
ngx_http_upstream_free_hash_peer(ngx_peer_connection_t *pc, void *data,
    ngx_uint_t state)
{
    ngx_http_upstream_hash_peer_data_t  *uhpd = data;
    ngx_uint_t                           current;

    if (state & NGX_PEER_FAILED
            && --pc->tries)
    {
        /* the backend that failed */
        current = uhpd->hash % uhpd->peers->number;

        /* mark it in the bit-vector */
        uhpd->tried[ngx_bitvector_index(current)] |= ngx_bitvector_bit(current);

        do { /* rehash until we're out of retries or we find one that hasn't been tried */
            uhpd->hash = ngx_hash_key((u_char *)&uhpd->hash, sizeof(ngx_uint_t));
            current = uhpd->hash % uhpd->peers->number;
        } while ((uhpd->tried[ngx_bitvector_index(current)] & ngx_bitvector_bit(current)) && --pc->tries);
    }
}
This works because the load-balancing function will just look at the new value of uhpd->hash.
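The bit-vector is worth a closer look, since it is the only state the retry logic keeps. Here is a standalone sketch of the same idea that runs outside Nginx. Two things are my own substitutions and not what upstream_hash does: the mock_bitvector_words helper (a rounded-up word count, handy when sizing the array) and the linear probe in mock_next_untried, which I use instead of rehashing so the result is easy to predict.

```c
#include <stddef.h>
#include <stdint.h>

#define MOCK_WORD_BITS  (8 * sizeof(uintptr_t))

/* which array element holds peer i, and which bit inside that element */
#define mock_bitvector_index(i)  ((i) / MOCK_WORD_BITS)
#define mock_bitvector_bit(i)    ((uintptr_t) 1 << ((i) % MOCK_WORD_BITS))

/* hypothetical helper: words needed to hold n bits, rounded up */
#define mock_bitvector_words(n)  (((n) + MOCK_WORD_BITS - 1) / MOCK_WORD_BITS)

static void mock_mark_tried(uintptr_t *tried, size_t peer)
{
    tried[mock_bitvector_index(peer)] |= mock_bitvector_bit(peer);
}

static int mock_was_tried(const uintptr_t *tried, size_t peer)
{
    return (tried[mock_bitvector_index(peer)] & mock_bitvector_bit(peer)) != 0;
}

/*
 * Linear probe (a swapped-in technique, not the module's rehash): starting
 * at `start`, return the first peer not yet tried, or -1 if all were tried.
 */
static long mock_next_untried(const uintptr_t *tried, size_t number,
    size_t start)
{
    size_t  k, candidate;

    for (k = 0; k < number; k++) {
        candidate = (start + k) % number;
        if (!mock_was_tried(tried, candidate)) {
            return (long) candidate;
        }
    }

    return -1;
}
```

One bit per backend keeps the per-request allocation tiny even with hundreds of servers, which is why a bit-vector is a natural fit here.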
Many applications won't need retry or high-availability logic, but it's possible to provide it with just a few lines of code like you see here.