curl-library
DNS Cache
Date: Tue, 8 Jan 2002 05:56:53 +0100
Recently I've added a DNS cache to libcurl which will cache DNS
lookups on a per-process basis, with a few exceptions/special cases
also supported.
By default, libcurl tries its hardest to support the multi-threaded
programming paradigm. That makes a global DNS cache hard to offer by
default, simply because cURL, as it is now, has no concept of
mutexes, and therefore cannot serialize access to a global cache
across multiple concurrent threads. So, by default, the cURL DNS
cache works on a per-handle basis, i.e.:
  #include <curl/curl.h>

  int main(void)
  {
    CURL *c1;
    CURL *c2;

    /* first handle: its DNS cache lives and dies with c1 */
    c1 = curl_easy_init();
    curl_easy_setopt(c1, CURLOPT_URL, "http://catalogs.google.com/");
    curl_easy_perform(c1);
    curl_easy_cleanup(c1);

    /* second handle: starts with an empty cache of its own */
    c2 = curl_easy_init();
    curl_easy_setopt(c2, CURLOPT_URL,
                     "http://catalogs.google.com/catalog_list");
    curl_easy_perform(c2);
    curl_easy_cleanup(c2);
    return 0;
  }
will cause two DNS lookups, because the lookup done for c1 is stored
in a cache local to the c1 handle, and that cache goes away when c1
is cleaned up. The same happens with the c2 handle. This cache is
still useful, simply because with cURL you can re-use a handle many
times, and the cache survives across the individual requests made on
that handle.
When using the new cURL "multi" interface, multiple handles will be
able to share the same transfer and connection space. Furthermore,
these handles are guaranteed to be linked together, so we can safely
share a cache between them. So, re-writing the above example to use
the multi interface:
  #include <sys/time.h>
  #include <sys/select.h>
  #include <curl/curl.h>

  int main(void)
  {
    CURL *c1;
    CURL *c2;
    CURLM *m;
    struct timeval to;
    int running;
    int rc;
    int max;
    fd_set read;
    fd_set write;
    fd_set except;

    c1 = curl_easy_init();
    c2 = curl_easy_init();
    curl_easy_setopt(c1, CURLOPT_URL, "http://catalogs.google.com/");
    curl_easy_setopt(c2, CURLOPT_URL,
                     "http://catalogs.google.com/catalog_list");

    m = curl_multi_init();
    curl_multi_add_handle(m, c1);
    curl_multi_add_handle(m, c2);

    while (CURLM_CALL_MULTI_PERFORM ==
           curl_multi_perform(m, &running));

    while (running) {
      FD_ZERO(&read);
      FD_ZERO(&write);
      FD_ZERO(&except);
      to.tv_sec = 1;
      to.tv_usec = 0;
      curl_multi_fdset(m, &read, &write, &except, &max);
      /* the timeout is select()'s fifth argument */
      rc = select(max+1, &read, &write, &except, &to);
      switch (rc) {
      case -1: /* Error: stop instead of looping forever */
        running = 0;
        break;
      case 0:
      default:
        curl_multi_perform(m, &running);
        break;
      }
    }

    /* detach the easy handles before tearing everything down */
    curl_multi_remove_handle(m, c1);
    curl_multi_remove_handle(m, c2);
    curl_easy_cleanup(c1);
    curl_easy_cleanup(c2);
    curl_multi_cleanup(m);
    return 0;
  }
would cause only one DNS lookup, since the cache is shared between
the members of the CURLM handle (m).
Finally, in many cases you don't care about thread safety because,
well, your application doesn't use threads, and you might therefore
want to exploit the advantages of a global DNS cache. One such use
is in PHP with the Apache webserver. Apache 1.3 & co. use a pool of
pre-forked processes to serve requests. This allows for two separate
Apache states: actions to perform on process startup/shutdown, and
actions to perform on request startup/shutdown. Curl handles cannot
survive the request startup and shutdown; however, a *global* DNS
cache can be maintained on a per-process basis, caching DNS lookups
over quite a few requests and resulting in a very nice performance
gain in many cases. In order to enable global DNS caching, you can
set the CURLOPT_DNS_USE_GLOBAL_CACHE option to non-false (i.e., 1).
Above are the three different ways that cURL caches DNS lookups:
per handle by default with the easy interface, per handle pool with
the multi interface, and globally when global DNS caching is enabled
for non-threaded applications.
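
To make the three scopes concrete, here is a toy model in plain C. It
is only a sketch and not libcurl code: every name in it (toy_cache,
toy_pool, toy_handle, toy_pick_cache, toy_resolve) is made up for
illustration, and the rule that the global flag wins over pool
membership is an assumption of the sketch, not something stated
above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy model only: none of these types or functions belong to libcurl. */
#define TOY_MAX_ENTRIES 8

struct toy_cache {
  const char *hosts[TOY_MAX_ENTRIES]; /* hostnames already resolved */
  size_t count;                       /* number of cached hostnames */
  int lookups;                        /* "real" lookups this cache caused */
};

struct toy_pool {              /* stands in for a multi handle */
  struct toy_cache shared;
};

struct toy_handle {            /* stands in for an easy handle */
  struct toy_cache own;        /* default: private, per-handle cache */
  struct toy_pool *pool;       /* non-NULL once added to a pool */
  int use_global;              /* non-zero: use the process-wide cache */
};

static struct toy_cache toy_global_cache;

/* Pick the cache a handle should use. Precedence (global, then pool,
   then the handle's own cache) is an assumption of this sketch. */
struct toy_cache *toy_pick_cache(struct toy_handle *h)
{
  assert(h != NULL);
  if(h->use_global)
    return &toy_global_cache;
  if(h->pool)
    return &h->pool->shared;
  return &h->own;
}

/* Returns 1 if a "real" lookup was needed, 0 on a cache hit. */
int toy_resolve(struct toy_handle *h, const char *host)
{
  struct toy_cache *c = toy_pick_cache(h);
  size_t i;

  for(i = 0; i < c->count; i++) {
    if(strcmp(c->hosts[i], host) == 0)
      return 0; /* already cached in this scope */
  }
  if(c->count < TOY_MAX_ENTRIES)
    c->hosts[c->count++] = host; /* remember it for next time */
  c->lookups++;
  return 1;
}
```

In this model, two stand-alone handles resolving the same host each
pay for a lookup, while two handles that point at the same toy_pool
(or both set use_global) pay for it only once between them.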
Another problem with DNS caching is that you may have applications
that run for a looooong time, and therefore you may see changes in
DNS information (i.e., catalogs.google.com may resolve to 10.0.0.143
instead of 10.0.0.112). libcurl does not try to be too smart for you
(TTLs from nameservers are a tricky business), but rather leaves it
to the programmer to set the cache expiration (it defaults to 60
seconds) with the CURLOPT_DNS_CACHE_TIMEOUT option, which expects a
timeout in seconds. For example, to have cache entries time out
after 15 seconds, you would use:
  curl_easy_setopt(handle, CURLOPT_DNS_CACHE_TIMEOUT, 15L);
There are two special cases, 0 and -1: 0 will completely disable DNS
caching, and -1 will make it so that the DNS cache *never* expires.
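
The timeout rules can be written down as a small predicate. This is
a sketch in plain C, not libcurl's implementation: the helper name
toy_entry_usable and its parameters are hypothetical, and treating an
entry as expired at exactly `timeout` seconds of age is an assumption
of the sketch.

```c
#include <assert.h>
#include <time.h>

/* Hypothetical helper, not part of libcurl.
   timeout   : value given to CURLOPT_DNS_CACHE_TIMEOUT, in seconds
   stored_at : when the cache entry was created
   now       : current time
   Returns 1 if the cached entry may still be used, 0 otherwise. */
int toy_entry_usable(long timeout, time_t stored_at, time_t now)
{
  assert(now >= stored_at);

  if(timeout == 0)
    return 0;                 /* 0: caching is disabled entirely */
  if(timeout == -1)
    return 1;                 /* -1: entries never expire */

  /* Assumption: an entry aged exactly `timeout` seconds is expired. */
  return (long)(now - stored_at) < timeout;
}
```

With a 15 second timeout, an entry stored at t=100 is still usable at
t=110 but not at t=115; with 0 it is never usable, and with -1 it is
usable no matter how old it gets.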
*phew, long breath* :)
-Sterling
Received on 2002-01-08