Date: Tue, 9 Jan 2001 18:19:15 -0800 (PST)
From: Matt Dillon <dillon@earth.backplane.com>
To: Tor.Egge@fast.no, iedowse@maths.tcd.ie
Cc: freebsd-stable@freebsd.org
Subject: Re: Repeated panic in 4.2-stable
Message-ID: <200101100219.f0A2JFf55797@earth.backplane.com>
References: <200101100106.f0A16gH54559@earth.backplane.com> <200101100115.CAA03955@midten.fast.no>

( I think -stable will be interested in this so I'm including -stable
in the thread )
----------> TO EVERYONE RUNNING STABLE!!! Do not use a filesystem
block size greater than 16384. 8192 is ok, 16384 should be ok. Anything
bigger will hit this bug.
:Yes. PR 20609, assigned to you for some time.
:
:> The buffer map is a system map, right? Which means it allocates
:> map_entry elements out of kmapentzone, which is a ZONE_INTERRUPT zone,
:> which can't block.
:
:No. It is not a system map. It should be a system map.
:
:> There is a vm_object_allocate() call, which uses a non-interrupt zone,
:> but I don't think it applies to buffer-cache buffers. Buffer cache
:> buffer KVM is not backed by arbitrary VM objects.
:
:_vm_map_clip_end() and _vm_map_clip_start() call vm_object_allocate()
:if the entry being split has no object. So it certainly applies to
:the buffer cache buffers due to vm objects incorrectly being allocated
:for buffer map entries.
:
:- Tor Egge
Yes, well, there are two dozen PRs assigned to me... sometimes I
lose track :-) Grr. You know what, Tor, your mail from the middle of
last year is *still* in my inbox!
Ok, I'll look into it and not forget this time. The patch you included
in the PR seems reasonable after a quick look.
The VFS/BIO subsystem never calls vm_map_simplify_entry() or
vm_map_lookup(), which means that as long as the filesystem block
size does not exceed the buffer cache KVM granularity the clipping
routines will never be called.
The buffer cache KVM granularity is 16384 bytes. So as long as one
does not use a block size greater than 16384 bytes it should be
reasonably safe (we still have to fix this, of course, I'm trying
to diagnose the likelihood of a problem for people using standard
block sizes). A block size of exactly 16384 bytes should work fine.
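To make the rule above concrete, here is a small Python sketch of the check. It is only an illustration: the 16384-byte figure comes from this message, while the constant and function names are made up here and do not correspond to anything in the kernel source.

```python
# Toy check of the block-size rule described above.
# The 16384 figure is from the message; the names are hypothetical.
BUF_KVA_GRANULARITY = 16384  # bytes


def exceeds_granularity(block_size: int) -> bool:
    """Return True if a filesystem block size is larger than the
    buffer cache KVM granularity, i.e. the case the message warns about."""
    return block_size > BUF_KVA_GRANULARITY


if __name__ == "__main__":
    for bsize in (4096, 8192, 16384, 32768, 65536):
        verdict = "exposed to the bug" if exceeds_granularity(bsize) else "within the limit"
        print(f"{bsize:6d} bytes: {verdict}")
```

Note that the comparison is strictly greater-than, so a block size of exactly 16384 bytes is treated as within the limit, matching the statement above.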
It looks like with the above stricture, the only exposed hole is the
vm_map_entry allocation routine. If we block there it is possible
for a second entry in the VFS/BIO subsystem to allocate the same
KVM address range and create a second buffer sharing the same backing
memory.
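The shape of that race can be sketched with a toy model in Python. This is not the FreeBSD vm_map code: the class, its methods, and the use of a generator `yield` to stand in for "blocking in the map entry allocator" are all assumptions made for illustration. The point is only that if the address range is chosen before the blocking step and recorded after it, two callers can end up with the same range.

```python
class ToyBufferMap:
    """Toy model of a KVA map. Hypothetical; not the kernel's vm_map."""

    def __init__(self, size, granularity=16384):
        self.size = size
        self.granularity = granularity
        self.allocated = []  # start addresses recorded as in use

    def find_free(self):
        # First-fit search over granularity-sized slots.
        for start in range(0, self.size, self.granularity):
            if start not in self.allocated:
                return start
        raise MemoryError("toy map is full")

    def alloc(self):
        """Generator: choose a range, 'block', then record the range."""
        start = self.find_free()
        yield "blocked"  # stands in for blocking while allocating a map entry
        self.allocated.append(start)
        return start


def finish(gen):
    """Run a generator to completion and return its return value."""
    try:
        while True:
            next(gen)
    except StopIteration as done:
        return done.value


def run_interleaved(bmap):
    a, b = bmap.alloc(), bmap.alloc()
    next(a)  # caller A picks a range, then blocks before recording it
    next(b)  # caller B picks the same range, since nothing is recorded yet
    return [finish(a), finish(b)]


def run_sequential(bmap):
    # No overlap: each allocation completes before the next one starts.
    return [finish(bmap.alloc()), finish(bmap.alloc())]


if __name__ == "__main__":
    print("interleaved:", run_interleaved(ToyBufferMap(65536)))
    print("sequential: ", run_sequential(ToyBufferMap(65536)))
```

In the interleaved run both callers come back with the same start address, which is the toy equivalent of two buffers sharing the same backing memory. In the sequential run, where nothing blocks between choosing and recording the range, the two addresses differ.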
(Side note: If bitmap blocks exceed 16384 bytes, this could cause serious
filesystem corruption. I'm not sure whether this is possible or not
but it's something to think about. It could be related to why
filesystems with extreme newfs parameters seem to have more than
their fair share of problems).
I will say that of course we have to fix this bug, but in the
current released systems there is virtually no chance of zalloc()
blocking due to the low memory deadlock fixes I put in after Christmas.
So as long as people are not using a filesystem block size larger
than 16384 bytes I think their systems will be fine.
This is a great example of how fragile code can wind up in a relatively
stable system with no alarm bells ringing.
-Matt
