Date:      Wed, 26 Sep 2001 08:15:38 +1000
From:      Peter Jeremy <peter.jeremy@alcatel.com.au>
To:        Matt Dillon <dillon@earth.backplane.com>
Cc:        hackers@FreeBSD.ORG
Subject:   Re: VM Corruption - stumped, anyone have any ideas?
Message-ID:  <20010926081538.L75481@gsmx07.alcatel.com.au>
In-Reply-To: <bulk.14194.20010925044747@hub.freebsd.org>; from owner-freebsd-hackers-digest@FreeBSD.ORG on Tue, Sep 25, 2001 at 04:47:47AM -0700
References:  <bulk.14194.20010925044747@hub.freebsd.org>

On 2001-Sep-25 04:47:47 -0700, freebsd-hackers-digest <owner-freebsd-hackers-digest@FreeBSD.ORG> wrote:
On Mon, 24 Sep 2001 14:13:37 -0700 (PDT), Matt Dillon <dillon@earth.backplane.com> wrote:
>    This is very similar to the corruption I found on one of Yahoo's 
>    machines.  Except on that machine two bits were changed.  It's as though
>    some other subsystem is trying to manipulate a flag in a structure using
>    a bad structure pointer.

I'm not sure how practical this is, and it assumes that the machine
can survive with around half the current KVA, but how about changing
the kernel memory allocation routines to use only every second page,
leaving the pages in between unmapped?  Obviously large arrays would
need to be allocated in a single blob without unmapped pages in the
middle of them, but I believe most of the kernel data structures are
handled as linked lists, so even some of the large arrays may be
amenable to having holes in them.

This would increase the chance that a stray off-by-1 index would wind
up hitting an unmapped page and fault immediately.  Of course, it
doesn't help if the pointer is valid but simply isn't pointing to the
expected object (careful examination of explicit casts, combined with
lint/gcc, should uncover any of these).
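
As a rough illustration of the every-second-page idea, here is a
user-space sketch using POSIX mmap() and mprotect().  It is only an
analogue, not kernel code: the names guarded_arena, guarded_arena_init
and guarded_alloc are made up for this example, and a real kernel
version would have to hook the existing allocation routines instead.
Each object is placed at the end of its own page so that running off
the end lands in the inaccessible page that follows.

```c
#define _DEFAULT_SOURCE          /* expose MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

#ifndef MAP_ANONYMOUS
#define MAP_ANONYMOUS MAP_ANON   /* older BSD spelling */
#endif

/* Hypothetical arena: even pages are usable, odd pages stay inaccessible. */
struct guarded_arena {
    char  *base;    /* start of the reserved region */
    size_t page;    /* system page size in bytes */
    size_t nslots;  /* number of usable pages */
    size_t next;    /* index of the next unused slot */
};

/* Reserve 2 * nslots pages and open up only the even-numbered ones. */
static int guarded_arena_init(struct guarded_arena *a, size_t nslots)
{
    long ps = sysconf(_SC_PAGESIZE);
    if (ps <= 0 || nslots == 0)
        return -1;
    a->page = (size_t)ps;
    if (nslots > SIZE_MAX / (2 * a->page))
        return -1;                       /* region size would overflow */

    /* Start with everything inaccessible. */
    void *p = mmap(NULL, 2 * nslots * a->page, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;
    a->base = p;
    a->nslots = nslots;
    a->next = 0;

    for (size_t i = 0; i < nslots; i++) {
        if (mprotect(a->base + 2 * i * a->page, a->page,
                     PROT_READ | PROT_WRITE) != 0) {
            munmap(a->base, 2 * nslots * a->page);
            return -1;
        }
    }
    return 0;
}

/*
 * Hand out one object per usable page, pushed up against the guard page.
 * The offset is rounded down to 16 bytes for alignment, so an overrun is
 * caught on the very first byte only when size is a multiple of 16.
 */
static void *guarded_alloc(struct guarded_arena *a, size_t size)
{
    if (size == 0 || size > a->page || a->next >= a->nslots)
        return NULL;
    char *slot = a->base + 2 * a->next * a->page;
    a->next++;
    size_t off = (a->page - size) & ~(size_t)15;
    assert(off + size <= a->page);       /* object stays inside its page */
    return slot + off;
}
```

With this layout, two consecutive allocations sit two pages apart and
a write just past the end of an object raises SIGSEGV instead of
silently flipping a bit in a neighbouring structure.  The cost is the
one mentioned above: half of the reserved address space is never used.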

What we need is run-time type identification (RTTI) (at least as a
debugging option).  The only problem is that I don't know of any tools
to automatically add RTTI to C and doing it manually would be
extremely expensive in developer time.  It would be technically
feasible to modify GCC to add an RTTI field to every structure and
verify it when dereferencing pointers, but that's not a trivial
undertaking: the bounds-checking patches to gcc 2.7.2 comprised
about 160K of diffs and new code, together with 380K of support
library code, and I suspect RTTI is of a similar order of complexity.
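
To make the manual variant concrete, here is a minimal sketch of
hand-rolled type tagging in C.  Everything in it is hypothetical
(obj_header, the tag values, demo_vnode, demo_mbuf and the as_*()
helpers are invented for this example and are not existing kernel
interfaces).  It also shows why doing this by hand is expensive: every
structure needs the extra field and every pointer use needs an explicit
check.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical type tags; a freed object gets its tag reset to 0. */
enum obj_type {
    OBJ_FREED = 0,
    OBJ_VNODE = 0x564e4f44,   /* ASCII "VNOD" */
    OBJ_MBUF  = 0x4d425546    /* ASCII "MBUF" */
};

/* Must be the first member of every tagged structure. */
struct obj_header {
    uint32_t type;
};

struct demo_vnode {
    struct obj_header hdr;
    int usecount;
};

struct demo_mbuf {
    struct obj_header hdr;
    size_t len;
};

/* Return nonzero if p is non-NULL and carries the expected tag. */
static int obj_is(const void *p, enum obj_type want)
{
    return p != NULL &&
           ((const struct obj_header *)p)->type == (uint32_t)want;
}

/*
 * Checked casts.  They return NULL on a mismatch so the caller decides
 * what to do; a kernel debugging build would more likely panic here.
 */
static struct demo_vnode *as_vnode(void *p)
{
    return obj_is(p, OBJ_VNODE) ? (struct demo_vnode *)p : NULL;
}

static struct demo_mbuf *as_mbuf(void *p)
{
    return obj_is(p, OBJ_MBUF) ? (struct demo_mbuf *)p : NULL;
}
```

A caller that receives a void * (or a pointer that went through an
explicit cast) would go through as_vnode() or as_mbuf() before
touching any field, so a pointer to the wrong kind of object, or to an
object whose tag was cleared when it was freed, is caught at the point
of use.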

Peter
