Date: Sun, 24 Jan 1999 09:58:13 -0800 (PST)
From: Matthew Dillon <dillon@apollo.backplane.com>
To: Wilko Bulte <wilko@yedi.iaf.nl>
Cc: Doug Rabson <dfr@nlsystems.com>, current@FreeBSD.ORG
Subject: Re: panic: found dirty cache page 0xf046f1c0
Message-ID: <199901241758.JAA03816@apollo.backplane.com>
References: <199901241457.PAA12682@yedi.iaf.nl>
:FYI: a buildworld of -current including the above on FreeBSD/axp completed
:without any incidents.
:
:Wilko
:...
:... ( other reports )
We are looking good; I've got half a dozen positive reports!
On general principles, I think it is possible to make the FreeBSD
VM system bulletproof. The problem is that there are lots of odd
exceptions and special rules that haven't been black-boxed or even
documented ( other than being in John's head, which isn't all that
useful to me ). The rules tend to be laid out in code at each
occurrence, which inevitably leads to mistakes. The mistakes
are further compounded by a severe lack of enforcement ( KASSERT()s )
and thus propagate from release to release, building up as time passes.
With appropriate black boxing, documentation, and enforcement, it
should be fairly easy to shorten the development cycle for finding
the bugs. __inline procedures are a godsend because there are literally
a hundred places in the code where someone 'optimized' it by doing
a manual expansion of something from some other module in order to avoid
a subroutine call. This cross-module pollination tends to make
things even less readable. Bleh.
So, for example, a few commits ago I added enforcement of the no-dirty-
pages-on-cache-queue rule and systems started to panic. That enforcement
had to be extended to every dirtying of a page before we actually found
the bug ( which turned out to be a -3.x bug ). More recently I have
added enforcement for PG_BUSY state changes to disallow the busying of
an already-busy page, and unbusying of a non-busy page.
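
As a rough illustration of the kind of enforcement described above, here is a
minimal userland sketch. Everything in it is an assumption made for the
example: the struct layout, the flag and queue values, and the helper names
are hypothetical stand-ins, not the kernel's actual definitions. Where the
kernel would use a KASSERT() that panics, the sketch only counts violations
so that it can be run and inspected outside the kernel.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-ins; the values are made up for this sketch. */
#define SK_PG_BUSY   0x0001
#define SK_PQ_NONE   0
#define SK_PQ_CACHE  1

struct sk_page {
    unsigned flags;   /* SK_PG_BUSY or 0 */
    unsigned dirty;   /* nonzero if the page holds modified data */
    int      queue;   /* SK_PQ_NONE or SK_PQ_CACHE */
};

static int sk_violations;   /* a real KASSERT() would panic instead */

#define SK_ENFORCE(cond, msg) do {                      \
    if (!(cond)) {                                      \
        sk_violations++;                                \
        fprintf(stderr, "rule violated: %s\n", (msg));  \
    }                                                   \
} while (0)

/* Busying an already-busy page is not allowed. */
static inline void sk_page_busy(struct sk_page *m)
{
    SK_ENFORCE((m->flags & SK_PG_BUSY) == 0, "busying an already-busy page");
    m->flags |= SK_PG_BUSY;
}

/* Unbusying a page that is not busy is not allowed. */
static inline void sk_page_unbusy(struct sk_page *m)
{
    SK_ENFORCE((m->flags & SK_PG_BUSY) != 0, "unbusying a non-busy page");
    m->flags &= ~SK_PG_BUSY;
}

/* Every dirtying of a page goes through here, so the cache-queue rule
 * is checked in one place rather than at each call site. */
static inline void sk_page_dirty(struct sk_page *m)
{
    SK_ENFORCE(m->queue != SK_PQ_CACHE, "dirtying a page on the cache queue");
    m->dirty = 1;
}

/* Small usage example: one legal sequence, then three rule violations. */
static void sk_busy_demo(void)
{
    struct sk_page p = { 0, 0, SK_PQ_NONE };

    sk_page_busy(&p);
    sk_page_dirty(&p);
    sk_page_unbusy(&p);
    assert(sk_violations == 0);

    sk_page_unbusy(&p);          /* not busy: violation 1 */
    sk_page_busy(&p);
    sk_page_busy(&p);            /* already busy: violation 2 */
    p.queue = SK_PQ_CACHE;
    sk_page_dirty(&p);           /* on cache queue: violation 3 */
    assert(sk_violations == 3);
}
```

The point of the sketch is only that each rule lives in one small inline
procedure, so a violation is caught where it happens instead of much later.
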
In discussions with John, there are a number of other rules that have
been broken and need to be fixed. Pages on PQ_CACHE are supposed to be
unqueued prior to being busied, held, or wired, for example, but the
rule is pretty much ignored and a lot of code was hacked in to check for
and requeue ( to another queue ) the busy-page-on-cache case.
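
The unqueue-before-use rule could be enforced along the following lines.
This is a userland sketch under stated assumptions, not the actual VM code:
the queue ids, the struct fields, and all function names are hypothetical,
and the check counts violations where a kernel KASSERT() would panic.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical queue ids; the real PQ_* values are not given in this mail. */
enum cq_queue { CQ_NONE, CQ_CACHE, CQ_ACTIVE };

struct cq_page {
    enum cq_queue queue;
    int busy;
    int hold_count;
    int wire_count;
};

static int cq_violations;   /* a real KASSERT() would panic instead */

/* Shared entry condition: the page must already be off the cache queue. */
static inline void cq_require_off_cache(const struct cq_page *m, const char *op)
{
    if (m->queue == CQ_CACHE) {
        cq_violations++;
        fprintf(stderr, "%s: page is still on the cache queue\n", op);
    }
}

/* Callers are expected to unqueue a cache page before using it. */
static inline void cq_page_unqueue(struct cq_page *m)
{
    m->queue = CQ_NONE;
}

static inline void cq_page_busy(struct cq_page *m)
{
    cq_require_off_cache(m, "busy");
    m->busy = 1;
}

static inline void cq_page_hold(struct cq_page *m)
{
    cq_require_off_cache(m, "hold");
    m->hold_count++;
}

static inline void cq_page_wire(struct cq_page *m)
{
    cq_require_off_cache(m, "wire");
    m->wire_count++;
}

/* Usage example: the correct order first, then the ignored-rule case. */
static void cq_demo(void)
{
    struct cq_page a = { CQ_CACHE, 0, 0, 0 };
    struct cq_page b = { CQ_CACHE, 0, 0, 0 };

    cq_page_unqueue(&a);     /* unqueue first ... */
    cq_page_wire(&a);        /* ... then wire: no violation */
    assert(cq_violations == 0);

    cq_page_hold(&b);        /* held while still on the cache queue */
    assert(cq_violations == 1);
}
```

With the check in the entry path, the code that detects and requeues a
busy page on the cache queue after the fact would in principle no longer be
needed, though that is an inference from the rule rather than something
the sketch demonstrates.
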
Entry conditions, exit conditions, and side effects for procedures are
mostly undocumented. biodone() sequencing is not well documented, and
struct buf's have a 'kitchen sink' mentality from being hacked up so much.
There are currently too many NFS-specific exceptions strewn all over
the code.
It all works, but it is also a mess.
-Matt
Matthew Dillon
<dillon@backplane.com>
