Date: Wed, 15 Oct 2003 13:21:34 +0930
From: Greg 'groggy' Lehey <grog@FreeBSD.org>
To: Bruce Evans <bde@zeta.org.au>
Cc: FreeBSD current users <FreeBSD-current@freebsd.org>
Subject: Re: Serial debug broken in recent -CURRENT?
Message-ID: <20031015035134.GB13080@wantadilla.lemis.com>
In-Reply-To: <20031008011915.J961@gamplex.bde.org>
References: <20030929083007.GA33083@wantadilla.lemis.com> <20030930074500.GY45668@wantadilla.lemis.com> <20030930204919.A4354@gamplex.bde.org> <200309300932.54682.sam@errno.com> <20031008011915.J961@gamplex.bde.org>

On Wednesday, 8 October 2003 at 2:08:55 +1000, Bruce Evans wrote:
> On Tue, 30 Sep 2003, Sam Leffler wrote:
>
>> It reliably locks up for me when you break into a running system; set a
>> breakpoint; and then continue. Machine is UP+HTT. Haven't tried other
>> machines.
>
> This seems to be because rev.1.75 of db_interface.c disturbed some much
> larger bugs related to the ones that it fixed. It takes miracles for
> entering ddb to even sort of work in the SMP case.

Ah, interesting. I hadn't thought that it might be related to SMP.

> If one of multiple CPUs in kdb_trap() somehow stops the others, then the
> others face different problems when they restart. They can't just return
> because debugger traps are not restartable (by just returning). They can't
> just proceed because the first CPU may changed the state in such a way as
> to make proceeding in the normal way not work (e.g., it may have deleted
> a breakpoint).
>
> These problems are not correctly or completely fixed in:
>
>>>>
> Index: db_interface.c
> ===================================================================
> RCS file: /home/ncvs/src/sys/i386/i386/db_interface.c,v
> retrieving revision 1.75
> diff -u -2 -r1.75 db_interface.c
> --- db_interface.c 7 Sep 2003 13:43:01 -0000 1.75
> +++ db_interface.c 7 Oct 2003 14:11:35 -0000
> ...
> This is supposed to stop the other CPUs either in kdb_trap() or normally.
> The timeouts are hopefully long enough for all the CPUs to stop in 1
> of these ways. But it doesn't always work. 1 possible problem is
> that stop and start IPIs may be delivered out of order, so CPUs stopped
> in kdb_trap() may end up stopped (since we don't wait for them to see
> the stop IPI).

Correct.
This patch doesn't fix the problem on my system. I've built a
single-processor kernel (with SMP and APIC_IO commented out), and that
*does* work with remote gdb, so it's almost certainly an SMP issue. I
have a dump of a partially hanging system if that's of any help.

Greg
--
See complete headers for address and phone numbers.
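
The ordering problem Bruce describes (a stop IPI arriving after the start IPI because nobody waited for the stop to be seen) can be illustrated with a toy model. This is only a sketch under stated assumptions: it is not FreeBSD kernel code, the function name `deliver` and the string event names are hypothetical, and it models a single CPU with one "stopped" flag.

```python
# Toy model (NOT FreeBSD kernel code): a CPU with one "stopped" flag.
# The name deliver() and the "stop"/"start" strings are hypothetical,
# chosen only to illustrate the ordering problem quoted above.

def deliver(ipis):
    """Apply IPIs in arrival order; return True if the CPU ends up stopped."""
    stopped = False
    for ipi in ipis:
        if ipi == "stop":
            stopped = True      # CPU parks itself until a start arrives
        elif ipi == "start":
            stopped = False     # a start for a CPU that is not stopped is a no-op
        else:
            raise ValueError(f"unknown IPI: {ipi!r}")
    return stopped

# Intended order: the CPU is stopped, then released.
in_order = deliver(["stop", "start"])

# Reordered: the start is consumed first, so the late stop is never undone.
reordered = deliver(["start", "stop"])

print("in order, still stopped: ", in_order)
print("reordered, still stopped:", reordered)
```

In this model the reordered case leaves the CPU stopped with no further start pending, which matches the symptom of a partially hanging system. Whether that is what actually happens on the machine above is not established by the thread.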