From: David Siebörger
Date: Sat, 13 Apr 2002 10:12:28 +0200
To: freebsd-stable@freebsd.org
Subject: Re: Understanding a crash dump
Message-ID: <20020413101228.A48671@rucus.ru.ac.za>

On Sat 2002-04-13 (13:03), Greg 'groggy' Lehey wrote:
> On Friday, 12 April 2002 at 22:48:56 +0200, David Siebörger wrote:
> > Can anyone shed any light on this?  Is there anything useful still to
> > be gleaned from this dump, or is there anything that I should do to
> > get better information in future?
>
> Possibly you could get something more out of it, but it would be heavy
> work.  I have some macros in /usr/src/sys/modules/vinum which are
> really intended for debugging Vinum (thus the location), but which
> might help here.  See http://www.vinumvm.org/vinum/how-to-debug.html
> for a general description of how to use them, though you won't need
> the Vinum-specific parts here.
> In particular, though, you should find
> a ps command which may show you what was running at the time.  I say
> "may", because you appear to have memory corruption here.

It seems to me that there was no process running at the time of the
first panic.  The first panic report said:

current process         = Idle
interrupt mask          =

I loaded up the Vinum gdb macros, and ran 'ps'.  Every process listed
had stat = 3 (SSLEEP).

[snip]

> > #6  0xc01cb1a4 in acquire_lock (lk=0xc02a6e3c)
> >     at /usr/build/src/sys/ufs/ffs/ffs_softdep.c:266
>
> This is the second trap.  It might give you some idea about what
> caused the first trap.

Digging a little deeper here, again, I'm drawn to conclude that there
was no process running.

(kgdb) up 6
#6  0xc01cb1a4 in acquire_lock (lk=0xc02a6e3c)
    at /usr/build/src/sys/ufs/ffs/ffs_softdep.c:266
266             lk->lkt_held = CURPROC->p_pid;
(kgdb) p curproc
$1 = 0x0

CURPROC is #defined as curproc in ffs_softdep.c.  So what caused the
second panic was that this function tried to dereference a null
pointer.  It assumes (probably correctly :) ) that there should be a
running process.

[snip]

> > #15 0xc0245753 in trap (frame={tf_fs = 16, tf_es = 16, tf_ds = 16, tf_edi = 0,
> >       tf_esi = -1040807936, tf_ebp = -1071042748, tf_isp = -1071042780,
> >       tf_ebx = -1036232128, tf_edx = -6, tf_ecx = 10, tf_eax = -6, tf_trapno = 12,
> >       tf_err = 0, tf_eip = 0, tf_cs = 8, tf_eflags = 66055, tf_esp = -1040807936,
> >       tf_ss = 40}) at /usr/build/src/sys/i386/i386/trap.c:458
> > #16 0x0 in ?? ()
>
> Somehow you have ended up trying to execute code at address 0.  This
> smells of a smashed stack.  I don't think that it would be an indirect
> function call, since otherwise I'd expect the backtrace to continue.
> You could find out what the current process is (ps will show it) and
> use the btp macro to show a backtrace which may show more.  Usage is
> 'btp pid', where pid is the numeric PID of the process.
>
> Greg
> --
> See complete headers for address and phone numbers

Clearly memory has somehow been corrupted, but it's not apparent why.
This machine doesn't have ECC memory, so it could have been a random
hardware glitch.  I guess I'll just have to see whether the crash
repeats itself in future.

Thank you to Greg and Morten Rodal for your replies.

-- 
David Siebörger
drs@rucus.ru.ac.za
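
For illustration only, the failure in frame #6 can be sketched in
userland C.  This is not the kernel code: struct proc_sketch,
struct lock_sketch, SKETCH_NO_PROC and acquire_lock_sketch() are
hypothetical stand-ins, and only the names curproc, CURPROC, lkt_held
and p_pid come from the backtrace above.  It shows the unguarded
assignment from line 266 rewritten with a null check, assuming that
curproc is null when no process is running.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical stand-ins: only the fields seen in the backtrace are modelled. */
struct proc_sketch {
	pid_t p_pid;
};

struct lock_sketch {
	pid_t lkt_held;
};

/* Assumption: null when no process is running, as kgdb showed ($1 = 0x0). */
struct proc_sketch *curproc = NULL;
#define CURPROC curproc

/* Hypothetical marker meaning "taken with no current process". */
#define SKETCH_NO_PROC ((pid_t)-99)

/*
 * The statement at ffs_softdep.c:266 reads
 *	lk->lkt_held = CURPROC->p_pid;
 * which dereferences CURPROC unconditionally.  This variant checks it first.
 */
void
acquire_lock_sketch(struct lock_sketch *lk)
{
	assert(lk != NULL);
	if (CURPROC == NULL)
		lk->lkt_held = SKETCH_NO_PROC;	/* avoid the null dereference */
	else
		lk->lkt_held = CURPROC->p_pid;
}
```

Whether a guard like this would be appropriate in the real function is
a separate question; the point is only that a null curproc makes the
unguarded assignment fault.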