Date:      Thu, 25 Feb 1999 08:43:25 -0500 (EST)
From:      Luoqi Chen <luoqi@watermarkgroup.com>
To:        dillon@apollo.backplane.com
Cc:        dfr@nlsystems.com, freebsd-hackers@FreeBSD.ORG, luoqi@watermarkgroup.com, mjacob@feral.com
Subject:   Re: Panic in FFS/4.0 as of yesterday - update
Message-ID:  <199902251343.IAA26707@lor.watermarkgroup.com>

>     Oof!  Well, that takes the cake.  I just had a supervisor stack overflow
>     due to the writerecursion junk in getnewbuf().  5 levels of 
>     (FFS + VN + NFS-FILE) resulted in 60+ subroutine levels in DDB's traceback.
>     That is after DDB faulted on itself with just me hitting return.
> 
>     Doh.
> 
>     I think I have a solution.
> 
>     Normally, B_DELWRI buffers are pushed out by the syncer.  Normally 
>     getnewbuf() never needs to deal with B_DELWRI buffers.  It is only
>     during heavy buffered write I/O where the dirty buffers trash the
>     buffer queues.
> 
There seems to be a problem with buffer kva space accounting: buffer_map
is allocated nbuf*BKVASIZE of kva space, but maxbufspace is initialized to
only (nbuf+8)*DFLTBSIZE. With DFLTBSIZE=4096 and BKVASIZE=8192, that seems
to mean about half of the kva space will never be used. Maybe this is why
getblk() didn't block on numfreebuffers and instead thought kva space had
run out and started to flush the B_DELWRI buffers.
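
To make the arithmetic concrete, here is a small user-space sketch of the
two sizes. It is not kernel code: the function names are made up for
illustration, and the nbuf value in the usage note is an assumption; only
the two formulas and the two constants come from the paragraph above.

```c
#include <assert.h>
#include <stdio.h>

/* Constants as quoted in the message. */
#define BKVASIZE  8192L
#define DFLTBSIZE 4096L

/* KVA reserved for buffer_map: nbuf * BKVASIZE. */
long buffer_map_kva(long nbuf)
{
    assert(nbuf > 0);
    return nbuf * BKVASIZE;
}

/* Initial value of maxbufspace: (nbuf + 8) * DFLTBSIZE. */
long initial_maxbufspace(long nbuf)
{
    assert(nbuf > 0);
    return (nbuf + 8) * DFLTBSIZE;
}

/* KVA in buffer_map that bufspace accounting would never hand out. */
long unused_kva(long nbuf)
{
    return buffer_map_kva(nbuf) - initial_maxbufspace(nbuf);
}

/* Print the comparison for a given (hypothetical) nbuf. */
void print_kva_report(long nbuf)
{
    printf("nbuf=%ld map=%ld maxbufspace=%ld unused=%ld\n",
           nbuf, buffer_map_kva(nbuf), initial_maxbufspace(nbuf),
           unused_kva(nbuf));
}
```

With an assumed nbuf of 1024, print_kva_report(1024) reports 8388608 bytes
of map against a 4227072-byte maxbufspace, so a bit under half of the map
is out of reach.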

>     At the moment, getnewbuf() attempts to convert B_DELWRI buffers into
>     async writes in order to push them out.  This is what leads to the
>     recursion problem.  I don't think we can afford to recurse even once,
>     therefore we cannot push out B_DELWRI buffers in getnewbuf().  At all.
> 
The recursion doesn't seem to be avoidable, though, if we have any kind of
layered fs. We recurse either on the getnewbuf() caller's stack or on the
syncer's stack.

>     I think the proper solution is to speed up the syncer when getnewbuf()
>     gets starved.
> 
This won't help if we can't even recurse once.

>     I also see a fairly serious potential low-free-buffer lockup.  At the
>     moment, getblk() ( which calls getnewbuf() ) treats the lo buffer limit
>     as 'lofreebuffers' and blocks if numfreebuffers < lofreebuffers.  I
>     believe we must relax that rule if curproc == updateproc ( i.e. the
>     syncer ), thus guaranteeing sufficient buffers for the syncer to be
>     able to operate in extreme conditions.
> 
Yes, if the syncer has to recurse.
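
For concreteness, the relaxed rule might look something like the sketch
below. This is a user-space illustration only, not the actual getblk()
code: struct buf_limits, syncer_reserve and must_block() are hypothetical
names, and the is_syncer flag stands in for the curproc == updateproc test.

```c
#include <stdbool.h>

/* Hypothetical limits; only lofreebuffers is a name from the thread. */
struct buf_limits {
    int lofreebuffers;   /* ordinary callers block below this */
    int syncer_reserve;  /* assumed number of buffers held back for the syncer */
};

/*
 * Return true if the caller should sleep waiting for a free buffer.
 * Ordinary callers block when numfreebuffers < lofreebuffers; the
 * syncer is allowed to dig into the reserve below that limit.
 */
bool must_block(const struct buf_limits *lim, int numfreebuffers,
                bool is_syncer)
{
    int floor = lim->lofreebuffers;

    if (is_syncer)
        floor -= lim->syncer_reserve;
    if (floor < 0)
        floor = 0;
    return numfreebuffers < floor;
}
```

With lofreebuffers = 16 and a reserve of 8, an ordinary caller blocks at 10
free buffers while the syncer keeps going until fewer than 8 remain.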

>     So, to wit, I am going to start testing changes that remove the async
>     write code from getnewbuf() entirely and implement the emergency reserve
>     for updateproc.  It should be fairly straightforward.  I'll get the thing
>     reliable and then submit a patch for review.
> 
I don't think we have a thorough understanding of the problem yet (or is it
just me? :). For example, we need to answer this question first: what if
we're out of kva space rather than free buffers? (I believe this is what
actually happened, and it could well have been a false alarm.) We may need
to implement hi/lo watermarks for bufspace as well. Another question: do we
want to assume the syncer's stack can withstand all the recursion resulting
from a single async write? I feel it's better to have all these design
issues sorted out before we start to write code.
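
By hi/lo watermarks for bufspace I mean ordinary hysteresis, roughly as in
the sketch below. Everything here is hypothetical (struct bufspace_wm,
bufspace_update(), and the thresholds); it only shows the idea and is not
a proposal for the actual code.

```c
#include <stdbool.h>

/* Hypothetical watermark state for bufspace. */
struct bufspace_wm {
    long lo;        /* stop reclaiming once bufspace drops to this */
    long hi;        /* start reclaiming once bufspace reaches this */
    bool flushing;  /* true while we are between hi and lo on the way down */
};

/*
 * Feed in the current bufspace; returns true while space should be
 * reclaimed. Flushing starts at the hi mark and continues until usage
 * falls back to the lo mark, so we don't flap around one threshold.
 */
bool bufspace_update(struct bufspace_wm *wm, long bufspace)
{
    if (bufspace >= wm->hi)
        wm->flushing = true;
    else if (bufspace <= wm->lo)
        wm->flushing = false;
    return wm->flushing;
}
```

Between the two marks the answer depends on history: climbing from lo it
stays false, and coming down from hi it stays true until lo is reached.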

> 					-Matt
> 					Matthew Dillon 
> 					<dillon@backplane.com>
> 
> 

-lq





