Skip site navigation (1)Skip section navigation (2)
Date:      Wed, 19 Jan 2000 23:15:25 -0800 (PST)
From:      Matthew Dillon <dillon@apollo.backplane.com>
To:        Jason Evans <jasone@canonware.com>
Cc:        Bruce Evans <bde@zeta.org.au>, Alfred Perlstein <bright@wintelcom.net>, cvs-committers@FreeBSD.ORG, cvs-all@FreeBSD.ORG
Subject:   Re: cvs commit: src/sys/kern vfs_aio.c
Message-ID:  <200001200715.XAA43425@apollo.backplane.com>
References:  <20000119210539.F20191@fw.wintelcom.net> <Pine.BSF.4.21.0001201741580.2347-100000@alphplex.bde.org> <20000119225846.W27689@sturm.canonware.com>

next in thread | previous in thread | raw e-mail | index | archive | help
:> And usually necessary to avoid races (Locking 101, part 3: an spl is
:> not a lock...).  However, it may not be necessary in vfs_aio.c, since
:> the wakeups may only come from other processes and not from interrupt
:> handlers.  vfs_aio.c sleeps outside of an spl'ed section in about 6
:> other places, including 2 in very similar code.
:
:The person who sent me the patch was convinced that it was a necessary
:change due to problems he had run into, and he convinced me at the time,
:though my understanding of the code isn't great.
:
:Contrary to a previous statement, I'm going to leave the code as is for now
:and actually write some test code to try to produce a problem with/without
:the change.
:
:Jason

    I've reviewed the code carefully.    The KAIO_WAKEUP/tsleep combination
    can be woken up by aio_physwakeup, which is called from biodone() which
    itself may be called from an interrupt.  To avoid the race you *MUST*
    be at splbio() when setting the flag and calling tsleep().

    There appears to be two places where the flag is set and tsleep is
    called while running at splbio() and one place where it isn't.  The
    place where it isn't is incorrect.  All instances where this code
    sequence occurs:

	    ki->kaio_flags |= KAIO_WAKEUP;
	    error = tsleep(p, PRIBIO | PCATCH, "aiowc", timo);

    *MUST* be called while at splbio() to avoid a potential race.


    I will also note that in the past year I found a number of similar
    races in the VM system and that due to the small window of opportunity
    they were often very hard, but not impossible, to reproduce.  These
    are the kinds of races that might occur once every few months in a 
    medium-loaded system.  Reproduction can depend on many factors including
    system load and device driver latencies.  There were some horrendous 
    windows in the VM system that were never hit and some one or two
    instruction windows that were fairly easy to reproduce.  Go figure.
    
					-Matt
					Matthew Dillon 
					<dillon@backplane.com>


To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe cvs-all" in the body of the message




Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?200001200715.XAA43425>