[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [patches] Track resolv.conf, avoid need for explicit res_init() or nscd in long-running applications



On Wed, May 25, 2011 at 6:47 PM, Paul Pluzhnikov <ppluzhnikov@xxxxxxxxxx> wrote:
> On Wed, May 25, 2011 at 6:20 PM, Paul Pluzhnikov <ppluzhnikov@xxxxxxxxxx> wrote:
>
>> We've just tracked a bug to this patch:
>> http://www.eglibc.org/cgi-bin/viewcvs.cgi/trunk/libc/resolv/res_libc.c?rev=9102&r1=8608&r2=9102
>
> Oops. I see this has already been noted:
> http://www.eglibc.org/archives/patches/msg00901.html
>
> Sorry about the noise.

It has been pointed out to me that the fix above is still buggy :-(

Effectively it's doing double-checked locking. Consider:

1. two threads (A and B) use the resolver, loading some version of resolv.conf
2. the threads go idle
3. some time later, resolv.conf changes
4. a long time -- hours, maybe -- elapses, with no further changes or activity
5. both threads wake up, and both call __res_maybe_init() at the same time
6. both threads stat() the file and see the new timestamp
7. thread A updates last_mtime to the new timestamp
8. thread B checks last_mtime, sees that it's current
9. thread B checks __res_initstamp, which has not yet changed
10. thread B goes ahead using the hours-old resolv.conf data
11. thread A now updates __res_initstamp, but it's too late

The next time thread B calls __res_maybe_init(), it will _probably_ get
the right data.  (Theoretically, an arbitrary amount of time and number
of requests can elapse after thread A updates last_time and before thread
A updates __res_initstamp, all of which will use the stale resolv.conf data.)

This is somewhat unlikely to happen, but libc should work 100% of the time,
not 99.999%. Given that we are already doing a stat here, the cost of a
lock should be completely negligible.

Patch attached. Google ref: b/4494453.

Thanks,
-- 
Paul Pluzhnikov

2011-06-20  Paul Pluzhnikov  <ppluzhnikov@xxxxxxxxxx>

	* resolv/res_libc.c (__res_initstamp): Declare unconditionally.
	(__res_maybe_init): Avoid double-checked locking.
2011-06-20  Paul Pluzhnikov  <ppluzhnikov@xxxxxxxxxx>

	* resolv/res_libc.c (__res_initstamp): Declare unconditionally.
	(__res_maybe_init): Avoid double-checked locking.


--- resolv/res_libc.c.orig
+++ resolv/res_libc.c
@@ -28,6 +28,7 @@
    out) since res_init() should go into libc.so but the rest of that
    file should not.  */
 
+__libc_lock_define_initialized (static, lock);
 extern unsigned long long int __res_initstamp attribute_hidden;
 /* We have atomic increment operations on 64-bit platforms.  */
 #if __WORDSIZE == 64
@@ -35,7 +36,6 @@ extern unsigned long long int __res_init
 # define atomicincunlock(lock) (void) 0
 # define atomicinc(var) catomic_increment (&(var))
 #else
-__libc_lock_define_initialized (static, lock);
 # define atomicinclock(lock) __libc_lock_lock (lock)
 # define atomicincunlock(lock) __libc_lock_unlock (lock)
 # define atomicinc(var) ++var
@@ -100,12 +100,12 @@ __res_maybe_init (res_state resp, int pr
 
 	if (resp->options & RES_INIT) {
 		ret = stat (_PATH_RESCONF, &statbuf);
+		__libc_lock_lock (lock);
 		if ((ret == 0) && (last_mtime != statbuf.st_mtime)) {
 			last_mtime = statbuf.st_mtime;
-			atomicinclock (lock);
 			atomicinc (__res_initstamp);
-			atomicincunlock (lock);
 		}
+		__libc_lock_unlock (lock);
 		if (__res_initstamp != resp->_u._ext.initstamp) {
 			if (resp->nscount > 0)
 				__res_iclose (resp, true);