Date:      Sun, 7 Jan 2007 18:57:21 +0000
From:      Ceri Davies <ceri@submonkey.net>
To:        Robert Watson <rwatson@FreeBSD.org>
Cc:        stable@FreeBSD.org
Subject:   Re: (audit?) Panic in 6.2-PRERELEASE
Message-ID:  <20070107185721.GA94367@submonkey.net>
In-Reply-To: <20070107180257.I41371@fledge.watson.org>
References:  <20070105111954.GA51511@submonkey.net> <20070105120539.H46119@fledge.watson.org> <20070105131528.GB7088@submonkey.net> <20070105133028.F98541@fledge.watson.org> <20070105150857.GC7088@submonkey.net> <20070106120040.N46119@fledge.watson.org> <20070106132540.GG7088@submonkey.net> <20070107114243.K41371@fledge.watson.org> <20070107170014.GL7088@submonkey.net> <20070107180257.I41371@fledge.watson.org>


On Sun, Jan 07, 2007 at 06:05:39PM +0000, Robert Watson wrote:
>
> On Sun, 7 Jan 2007, Ceri Davies wrote:
>
> >>Could you try printing *td->td_ar?  Maybe this will give us a clue as to
> >>how far it got.  In particular, this may be able to more reliably give us
> >>the file descriptor number, which is audited early in the system call.
> >>You might find that 'td' is corrupted in many layers of the stack, keep
> >>going up until you find one where it's good.  It may well be that
> >>td->td_ar->k_ar.ar_arg_fd is correct, and might confirm that uap->fd is
> >>correct still.  We'd like also to know if ARG_SOCKINFO, ARG_VNODE1, or
> >>ARG_VNODE2 is set in the k_ar.ar_valid_arg field.  This may tell us some
> >>more about the file descriptor even though it appears to have vanished.
> >
> >*td->td_ar is null (0x0) in both cases...
>
> I'm actually beginning to wonder if this is actually audit-related at all.
> Something is clearly not right, and the audit code should not actually have
> been entered at all there.  Perhaps we're being misled by the stack trace
> corruption into thinking audit is involved.

I've wondered the same.
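
The ar_valid_arg check suggested in the quoted text earlier in this
message can be sketched in C roughly as follows.  This is only an
illustration under stated assumptions: the struct is a simplified
stand-in for the k_ar part of the audit record, the flag values are
hypothetical placeholders (the real definitions live in the kernel's
audit headers and may differ), and fd_hint_for() is a made-up helper
name.  It also guards against the NULL td_ar case reported above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical placeholder bit values, NOT the kernel's real ones. */
#define ARG_SOCKINFO 0x1ULL
#define ARG_VNODE1   0x2ULL
#define ARG_VNODE2   0x4ULL

/* Simplified stand-in for the k_ar portion of the audit record. */
struct kar_sketch {
    uint64_t ar_valid_arg; /* bitmask of which arguments were audited */
    int      ar_arg_fd;    /* file descriptor captured early in the syscall */
};

enum fd_hint {
    HINT_NO_RECORD, /* record pointer is NULL, nothing to inspect */
    HINT_SOCKET,    /* ARG_SOCKINFO is set */
    HINT_VNODE,     /* ARG_VNODE1 or ARG_VNODE2 is set */
    HINT_UNKNOWN    /* none of the three flags is set */
};

/* Classify what the audit record can tell us about the descriptor. */
enum fd_hint fd_hint_for(const struct kar_sketch *ar)
{
    if (ar == NULL)
        return HINT_NO_RECORD;
    if (ar->ar_valid_arg & ARG_SOCKINFO)
        return HINT_SOCKET;
    if (ar->ar_valid_arg & (ARG_VNODE1 | ARG_VNODE2))
        return HINT_VNODE;
    return HINT_UNKNOWN;
}
```

With a NULL record, as seen in both dumps here, the helper returns
HINT_NO_RECORD, so the flags cannot be consulted at all.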

> >>I'm quite worried by the fact that the file descriptor seems not to be
> >>present any more -- this suggests a file descriptor related race of the
> >>sort that is both quite difficult to figure out and also quite a risk.
> >>It's strange that it would only trigger with audit, however--perhaps
> >>audit stretches out the race.  Is this an SMP box?
> >
> >It's certainly looking quite nasty.  This system is UP hardware without
> >options SMP.
> >
> >...
> >
> >If it's at all useful, I can provide access to this system and the dumps.
>
> Yeah, I think at this point that would probably be the most helpful thing.

OK, you should be able to log in as rwatson@submonkey.net with your
freefall key.  Details in ~rwatson/README once you're logged in.

> Could you confirm that the kernel.debug you're using definitely matches the
> version of the kernel in the core dump?

Yes, definitely.

Thanks again,

Ceri
-- 
That must be wonderful!  I don't understand it at all.
                                                  -- Moliere
