Date:      Sat, 20 Feb 1999 12:56:15 -0800 (PST)
From:      Matthew Dillon <dillon@apollo.backplane.com>
To:        Doug Rabson <dfr@nlsystems.com>
Cc:        freebsd-hackers@FreeBSD.ORG
Subject:   Re: Panic in FFS/4.0 as of yesterday
Message-ID:  <199902202056.MAA11068@apollo.backplane.com>
References:   <Pine.BSF.4.05.9902201158300.82049-100000@herring.nlsystems.com>

:Jacob's bulk writing test and I can see what is happening (although I'm
:not sure what to do about it).
:
:The system is unresponsive because the root inode is locked virtually all
:of the time and this is because of a lock cascade leading to a single
:process which is trying to rewrite a block of the directory which the test
:is running in (synchronously since the fs is not using softupdates). That
:process is waiting for its i/o to complete before unlocking the directory.
:Unfortunately the buffer is the last on the drive's buffer queue and there
:are 647 (for one instance which I examined in the debugger) buffers ahead
:of it, most of which are writing about 8k. About 4Mb of buffers on the
:queue are from a *single* process which seems extreme.

    There isn't much we can do except to try to fix the lock cascade that
    occurs in namei and lookup.  The problem is that the lower level vnode
    is locked before the parent vnode is released.

    What if we simply bumped the vnode's v_holdcnt or v_usecount in lookup
    instead of locking it, and then had the parent namei unlock the parent
    vnode prior to gaining a lock on the new vnode in its loop?

    This would limit the locking cascade to one vnode in the worst case.  We
    would have to allow independent processes to bump v_usecount up and down
    while an exclusive lock is held on the vnode.
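
    To make the ordering concrete, here is a rough single-threaded toy
    model of that idea.  It is only a sketch: struct toy_vnode, toy_lookup
    and toy_namei are made-up names, not the kernel's struct vnode or the
    real lookup/namei code, and the "lock" is a plain flag rather than a
    real sleep lock.  It only shows the order of operations: reference the
    child, unlock the parent, then lock the child.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a vnode; NOT the kernel's struct vnode. */
struct toy_vnode {
    int locked;               /* models the exclusive vnode lock */
    int holdcnt;              /* models v_holdcnt / v_usecount */
    struct toy_vnode *child;  /* next path component, or NULL */
};

static int locks_held;        /* locks currently held by the walker */
static int max_locks_held;    /* high-water mark over the whole walk */

static void toy_lock(struct toy_vnode *vp)
{
    assert(!vp->locked);
    vp->locked = 1;
    if (++locks_held > max_locks_held)
        max_locks_held = locks_held;
}

static void toy_unlock(struct toy_vnode *vp)
{
    assert(vp->locked);
    vp->locked = 0;
    locks_held--;
}

/* "lookup": hand back the child referenced but NOT locked. */
static struct toy_vnode *toy_lookup(struct toy_vnode *dvp)
{
    struct toy_vnode *vp = dvp->child;

    if (vp != NULL)
        vp->holdcnt++;        /* the count pins vp while it is unlocked */
    return vp;
}

/* "namei" loop: returns the last component, locked and referenced. */
static struct toy_vnode *toy_namei(struct toy_vnode *root)
{
    struct toy_vnode *dvp = root, *vp;

    dvp->holdcnt++;
    toy_lock(dvp);
    while ((vp = toy_lookup(dvp)) != NULL) {
        toy_unlock(dvp);      /* release the parent first ... */
        dvp->holdcnt--;       /* ... and drop its reference */
        toy_lock(vp);         /* only now lock the child */
        dvp = vp;
    }
    return dvp;
}
```

    Walking a three-level chain with toy_namei leaves max_locks_held at 1,
    which is the "one vnode worst case" above.  A real implementation
    would also have to cope with the child changing between the reference
    and the lock, which this model ignores.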

:It seems to me that there should be a mechanism to prevent the queued i/o
:lists from becoming so long (over 5Mb is queued on the machine which I
:have in the debugger), perhaps by throttling the writers if they start too
:much asynchronous i/o.  I wonder if this can be treated as a similar
:problem to the swapper latency issues which John Dyson was talking about.
:--

    Maybe.  The difference is that the I/O topology is not known at the
    higher levels where the I/O gets queued, so it would be more difficult
    to calculate what the async limit should be in a scalable way.
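
    For illustration only, the simplest form of such a throttle would be a
    byte counter with a fixed cap, roughly as sketched below.  Everything
    here is hypothetical (toy_ioq, toy_queue_async, toy_io_done and the
    1MB value are invented for the example), and the single hard-wired
    limit is exactly the part that is hard to choose without knowing the
    I/O topology.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical fixed cap; nobody in this thread proposed a value. */
#define TOY_ASYNC_LIMIT ((size_t)1024 * 1024)

/* Hypothetical accounting for one queue of asynchronous writes. */
struct toy_ioq {
    size_t queued_bytes;      /* async write bytes not yet completed */
};

/*
 * Try to queue an async write of len bytes.
 * Returns 1 if it was accounted for, 0 if the writer should be
 * throttled (wait for completions, or fall back to a sync write).
 */
static int toy_queue_async(struct toy_ioq *q, size_t len)
{
    if (len > TOY_ASYNC_LIMIT - q->queued_bytes)
        return 0;
    q->queued_bytes += len;
    return 1;
}

/* Called when a previously queued write of len bytes completes. */
static void toy_io_done(struct toy_ioq *q, size_t len)
{
    assert(len <= q->queued_bytes);
    q->queued_bytes -= len;
}
```

    With 8k writes this admits 128 buffers and refuses the 129th until a
    completion comes in.  One global number like this cannot tell a single
    slow disk from a striped array, which is the scaling problem.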

					-Matt
					Matthew Dillon 
					<dillon@backplane.com>
	
:Doug Rabson				Mail:  dfr@nlsystems.com
:Nonlinear Systems Ltd.			Phone: +44 181 442 9037






