Date:      Tue, 7 Nov 2000 08:08:19 -0800 (PST)
From:      Matt Dillon <dillon@earth.backplane.com>
To:        Bruce Evans <bde@zeta.org.au>
Cc:        Kirk McKusick <mckusick@mckusick.com>, arch@FreeBSD.ORG
Subject:   Re: softdep panic due to blocked malloc (with traceback)
Message-ID:  <200011071608.eA7G8Jb73998@earth.backplane.com>
References:   <Pine.BSF.4.21.0011072254370.3075-100000@besplex.bde.org>

:... but I don't see how using malloc() in low-level i/o routines can be
:safe in general.  Deadlock seems to be possible if completion of output
:is necessary to free some pages.  Deadlock just usually doesn't occur,
:because the system attempts to reserve enough free pages to satisfy 
:low-level memory allocations.
:
:Bruce

    I have a complete solution to the low-memory deadlock problem under
    test with Paul Saab, and DG has approved of the idea.  As soon as both
    Paul's and my machines survive a night of extreme memory strain I'll
    make the patches available generally.

    This is how it works:

	* The problem we face is that giving certain system processes
	  special allocation privileges DOES NOT WORK.  Non-system
	  processes can still block on a low-memory condition while holding
	  a vnode, and this prevents system processes such as pageout
	  from flushing any pages associated with that vnode, whether
	  they can allocate memory or not.

	* We remove all contrived 'low memory' limitations from any code
	  which might be called with a locked vnode, specifically the
	  buffer cache code.

	  - getblk() no longer blocks if it is low on buffers, only if it
	    is out of buffers.

	  - When the buffer cache code allocates a page, it is allowed
	    to dig into the free memory reserve rather than block.

	* All kernel MALLOCs called by filesystem support routines, such
	  as those in ffs_inode.c and ffs_softdep.c, use M_USE_RESERVE,
	  allowing the kernel malloc to dig into the memory reserve.

	* If we are low on memory, the following occurs:

	  - All major delayed write calls, bdwrite(), are turned into 
	    async calls, bawrite().

	  - brelse() and bqrelse() free clean buffers and their underlying
	    VM pages (VM pages go into the CACHE instead of the INACTIVE
	    queue), recovering resources immediately.

	    This allows us to continue issuing I/O without limitation and
	    yet not run out of memory.

	  - The rest of the system uses the normal allocator flags and will
	    block in a low memory situation.  But because the I/O path no
	    longer blocks, our paging I/O still operates to free up memory.
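
    The policy above can be sketched as a small user-space toy model in C.
    This is not the patch and not kernel code: struct mem_state,
    toy_alloc_page(), toy_choose_write() and the ALLOC_* / WRITE_* names
    are all hypothetical, and reserve_pages only stands in for the
    v_free_reserved threshold.  It shows normal callers blocking at the
    reserve while reserve-privileged callers keep allocating, and delayed
    writes becoming async writes once memory is low.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: every name here is hypothetical, not a FreeBSD kernel API. */
struct mem_state {
    int free_pages;     /* pages currently free */
    int reserve_pages;  /* floor for normal allocations (stand-in for v_free_reserved) */
};

enum alloc_flags {
    ALLOC_NORMAL = 0,       /* ordinary caller: must leave the reserve intact */
    ALLOC_USE_RESERVE = 1   /* filesystem / buffer cache caller: may dig into the reserve */
};

/* Returns true if a page was taken, false if the caller would have to block. */
bool toy_alloc_page(struct mem_state *m, int flags)
{
    assert(m != NULL);
    int floor = (flags & ALLOC_USE_RESERVE) ? 0 : m->reserve_pages;
    if (m->free_pages <= floor)
        return false;
    m->free_pages--;
    return true;
}

enum write_mode { WRITE_DELAYED, WRITE_ASYNC };

/* When memory is low, a delayed write is issued as an async write instead. */
enum write_mode toy_choose_write(const struct mem_state *m)
{
    assert(m != NULL);
    return (m->free_pages <= m->reserve_pages) ? WRITE_ASYNC : WRITE_DELAYED;
}
```

    With free_pages == reserve_pages, toy_alloc_page(&m, ALLOC_NORMAL)
    reports that the caller would block, while the same call with
    ALLOC_USE_RESERVE still succeeds until free_pages reaches zero.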

    The gist of the solution is that I/O is able to continue when you hit
    v_free_reserved.  While the rest of the system shudders, I/O still goes
    on, which means the system can recover.

    That's it in a nutshell.  The patches are modest and not complex...
    actually fairly straightforward.  I haven't dealt with networking/NFS
    issues yet, but I believe I have the main filesystem and softupdates
    working under extreme *dirty* mmap and I/O loads.  I'll know when Paul
    gets back to me on the overnight tests he ran.  Note that Paul and I
    have been testing for a week, with failures usually showing up within
    hours, so it may not be today.  Or it may...

						-Matt







