Skip site navigation (1)Skip section navigation (2)
Date:      Fri, 14 Mar 2008 19:41:14 -0700
From:      Alfred Perlstein <alfred@freebsd.org>
To:        stable@freebsd.org
Cc:        smp@freebsd.org
Subject:   timeout/untimeout race conditions/crash [patch]
Message-ID:  <20080315024114.GD67856@elvis.mu.org>

next in thread | raw e-mail | index | archive | help

--F8dlzb82+Fcn6AgP
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

We think we tracked down a defect in timeout/untimeout in
FreeBSD.

We have reduced the problem to the following scenario:

2+ cpu system, one cpu is running softclock at the same time
another thread is running on another cpu which makes use of
timeout/untimeout.

CPU 0 is running "softclock"
CPU 1 is running "driver" with Giant held.

softclock: mtx_lock_spin(&callout_lock)
softclock: CACHES the callout structure's fields.
softclock: sees that it's a CALLOUT_LOCAL_ALLOC
softclock: executes this code:
  if (c->c_flags & CALLOUT_LOCAL_ALLOC) {
  	c->c_func = NULL;
  	c->c_flags = CALLOUT_LOCAL_ALLOC;
  	SLIST_INSERT_HEAD(&callfree, c,
  		c_links.sle);
  	curr_callout = NULL;
  } else {
  
  NOTE: that c->c_func has been set to NULL and curr_callout
        is also NULL.
softclock: mtx_unlock_spin(&callout_lock)
driver: calls untimeout(), the following sequence happens:
        mtx_lock_spin(&callout_lock);
        if (handle.callout->c_func == ftn && handle.callout->c_arg == arg)
                callout_stop(handle.callout);
        mtx_unlock_spin(&callout_lock);

  NOTE: untimeout() sees that handle.callout->c_func is not set
        to the function so it does NOT call callout_stop(9)!
driver: free's backing structure for c->c_arg.
softclock: executes callout.
softclock: likely crashes at this point due to access after free.

I have a patch I'm trying out here, but I need feedback on it.

The way the patch works is to treat CALLOUT_LOCAL_ALLOC (timeout/untimeout)
callouts the same as ~CALLOUT_LOCAL_ALLOC allocs, and moves the
freelist manipulation to the end of the callout dispatch.

Some light testing seems to have the system work.

We are doing some testing in-house to also make sure this works.

Please provide feedback.

See attached delta.

-- 
- Alfred Perlstein

--F8dlzb82+Fcn6AgP
Content-Type: text/x-diff; charset=us-ascii
Content-Disposition: attachment; filename="kern_timeout.diff"

Index: kern_timeout.c
===================================================================
RCS file: /cvs/ncvs/src/sys/kern/kern_timeout.c,v
retrieving revision 1.97.2.2
diff -u -r1.97.2.2 kern_timeout.c
--- kern_timeout.c	26 Sep 2005 19:49:12 -0000	1.97.2.2
+++ kern_timeout.c	15 Mar 2008 02:28:48 -0000
@@ -241,17 +241,8 @@
 				c_arg = c->c_arg;
 				c_mtx = c->c_mtx;
 				c_flags = c->c_flags;
-				if (c->c_flags & CALLOUT_LOCAL_ALLOC) {
-					c->c_func = NULL;
-					c->c_flags = CALLOUT_LOCAL_ALLOC;
-					SLIST_INSERT_HEAD(&callfree, c,
-							  c_links.sle);
-					curr_callout = NULL;
-				} else {
-					c->c_flags =
-					    (c->c_flags & ~CALLOUT_PENDING);
-					curr_callout = c;
-				}
+				c->c_flags &= ~CALLOUT_PENDING;
+				curr_callout = c;
 				curr_cancelled = 0;
 				mtx_unlock_spin(&callout_lock);
 				if (c_mtx != NULL) {
@@ -310,6 +301,12 @@
 					mtx_unlock(c_mtx);
 				mtx_lock_spin(&callout_lock);
 done_locked:
+				if (c->c_flags & CALLOUT_LOCAL_ALLOC) {
+					c->c_func = NULL;
+					c->c_flags = CALLOUT_LOCAL_ALLOC;
+					SLIST_INSERT_HEAD(&callfree, c,
+							  c_links.sle);
+				}
 				curr_callout = NULL;
 				if (wakeup_needed) {
 					/*

--F8dlzb82+Fcn6AgP--



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20080315024114.GD67856>