Date: Wed, 26 Sep 2001 08:15:38 +1000
From: Peter Jeremy <peter.jeremy@alcatel.com.au>
To: Matt Dillon <dillon@earth.backplane.com>
Cc: hackers@FreeBSD.ORG
Subject: Re: VM Corruption - stumped, anyone have any ideas?
Message-ID: <20010926081538.L75481@gsmx07.alcatel.com.au>
In-Reply-To: <bulk.14194.20010925044747@hub.freebsd.org>; from owner-freebsd-hackers-digest@FreeBSD.ORG on Tue, Sep 25, 2001 at 04:47:47AM -0700
References: <bulk.14194.20010925044747@hub.freebsd.org>

On 2001-Sep-25 04:47:47 -0700, freebsd-hackers-digest <owner-freebsd-hackers-digest@FreeBSD.ORG> wrote:
On Mon, 24 Sep 2001 14:13:37 -0700 (PDT), Matt Dillon <dillon@earth.backplane.com> wrote:
> This is very similar to the corruption I found on one of Yahoo's
> machines. Except on that machine two bits were changed. It's as though
> some other subsystem is trying to manipulate a flag in a structure using
> a bad structure pointer.

I'm not sure how practical this is, and it assumes that the machine can
survive with around half the current KVA...

How about changing the kernel memory allocation routines to use only
every second page, leaving the other pages unmapped? Obviously large
arrays would need to be allocated in a single blob without unmapped pages
in the middle of them, but I believe most of the kernel data structures
are handled as linked lists, so even some of the large arrays may be
amenable to having holes in them.

This would increase the chance that a stray off-by-1 index would wind up
hitting an unmapped page and fault immediately, instead of silently
corrupting a neighbouring object.

Of course, it doesn't help if the problem is that the pointer is valid
but isn't pointing to the expected object (careful examination of
explicit casts, combined with lint/gcc, should uncover any of these).
What we need is run-time type identification (RTTI), at least as a
debugging option. The only problem is that I don't know of any tools to
automatically add RTTI to C, and doing it manually would be extremely
expensive in developer time.

It would be technically feasible to modify GCC to add an RTTI field to
every structure and verify it when dereferencing pointers, but that's
not a trivial undertaking - the bounds-checking patches to gcc 2.7.2
comprised about 160K of diffs and new code, together with 380K of support
library code, and I suspect RTTI is of a similar order of complexity.

Peter

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-hackers" in the body of the message
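
The every-second-page idea in the message above can be illustrated in userland. What follows is only a rough sketch under stated assumptions, not kernel code: it uses POSIX mmap() and mprotect() in place of the kernel's own allocation routines, and the names guarded_pool, guarded_pool_init, guarded_pool_slot and guarded_pool_destroy are hypothetical. Placing each object at the end of its page is an extra choice made here so that an overrun lands on the inaccessible page straight away.

```c
#define _DEFAULT_SOURCE         /* exposes MAP_ANON on glibc; harmless elsewhere */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical pool: 2 * nslots pages, of which only the even ones are usable. */
struct guarded_pool {
    unsigned char *base;    /* start of the reserved region */
    size_t page;            /* system page size in bytes */
    size_t nslots;          /* number of usable (even) pages */
};

/* Reserve the region with no access, then open up every second page. */
static int guarded_pool_init(struct guarded_pool *p, size_t nslots)
{
    long pg = sysconf(_SC_PAGESIZE);
    if (pg <= 0)
        return -1;
    p->page = (size_t)pg;
    p->nslots = nslots;

    size_t len = 2 * nslots * p->page;
    void *m = mmap(NULL, len, PROT_NONE, MAP_PRIVATE | MAP_ANON, -1, 0);
    if (m == MAP_FAILED)
        return -1;
    p->base = m;

    for (size_t i = 0; i < nslots; i++) {
        /* Even pages become read/write; odd pages stay PROT_NONE as guards. */
        if (mprotect(p->base + 2 * i * p->page, p->page,
                     PROT_READ | PROT_WRITE) != 0) {
            munmap(p->base, len);
            return -1;
        }
    }
    return 0;
}

/*
 * Return storage for one object of `size` bytes in slot i, or NULL.
 * The size is rounded up to a 16-byte boundary to keep the pointer aligned,
 * and the object is pushed to the end of its page so that the first byte
 * past the (rounded) object lies in the guard page.
 */
static void *guarded_pool_slot(const struct guarded_pool *p, size_t i, size_t size)
{
    assert(p->base != NULL);
    size_t rounded = (size + 15) & ~(size_t)15;
    if (i >= p->nslots || size == 0 || rounded > p->page)
        return NULL;
    return p->base + 2 * i * p->page + (p->page - rounded);
}

static void guarded_pool_destroy(struct guarded_pool *p)
{
    munmap(p->base, 2 * p->nslots * p->page);
    p->base = NULL;
}
```

With this layout a write one element past the end of an object in slot i touches page 2i+1, which is inaccessible, so the process faults at the point of the stray write. The cost is the one noted in the message: half of the reserved address space is never usable.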
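
For the RTTI point in the message above, a hand-written version might look roughly like the following. This is a sketch of the manual approach the message calls expensive, not a compiler-generated scheme: the tag values, the structures vnode_like and proc_like, and the helpers type_ok, check_tag and CHECKED are all hypothetical names invented for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical per-type tags; each tagged structure stores one as its first member. */
#define TAG_VNODE 0x564e4f44u   /* ASCII "VNOD" */
#define TAG_PROC  0x50524f43u   /* ASCII "PROC" */

struct vnode_like {
    uint32_t magic;     /* must be TAG_VNODE */
    int usecount;
};

struct proc_like {
    uint32_t magic;     /* must be TAG_PROC */
    int pid;
};

/* Non-fatal check: does obj start with the expected tag? */
static int type_ok(const void *obj, uint32_t expected)
{
    /* Valid because the tag is the first member of every tagged structure. */
    return obj != NULL && *(const uint32_t *)obj == expected;
}

/* Fatal check used by CHECKED(): abort with a diagnostic on a mismatch. */
static void *check_tag(void *obj, uint32_t expected, const char *type,
                       const char *file, int line)
{
    if (!type_ok(obj, expected)) {
        fprintf(stderr, "%s:%d: pointer %p is not a %s\n",
                file, line, obj, type);
        abort();
    }
    return obj;
}

/* Verify the tag before handing back a typed pointer. */
#define CHECKED(type, tag, ptr) \
    ((type *)check_tag((ptr), (tag), #type, __FILE__, __LINE__))
```

A call such as CHECKED(struct proc_like, TAG_PROC, ptr)->pid then aborts if ptr is valid but refers to some other kind of object. Every structure definition and every dereference has to be edited by hand, which is where the developer-time cost comes from.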
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20010926081538.L75481>