Date: Mon, 23 Jan 2006 15:51:07 -0500
From: John Baldwin <jhb@freebsd.org>
To: thierry@herbelot.com
Cc: freebsd-current@freebsd.org
Subject: Re: panic: spin lock held too long (while rebooting)
Message-ID: <200601231551.08474.jhb@freebsd.org>
In-Reply-To: <200601210705.11539.thierry@herbelot.com>
References: <200601040806.37953.thierry@herbelot.com> <200601040838.49663.jhb@freebsd.org> <200601210705.11539.thierry@herbelot.com>

On Saturday 21 January 2006 01:05, Thierry Herbelot wrote:
> On Wednesday 4 January 2006 14:38, John Baldwin wrote:
> > On Wednesday 04 January 2006 02:06 am, Thierry Herbelot wrote:
> > [SNIP previous similar panic]
>
> > Next time you get this, can you use 'show threads' to figure out the tid
> > for the thread whose pointer is in the printf (0xc16de480 in this case)
> > and then do a trace of that thread?
>
> Hello,
>
> Here is a more detailed crash session :
>
> is this (zomb) problematic ? (in ps) :
> 8 c182e228 0 1 0 0002204 zomb[INACTIVE] g_mirror gm0s1
>
> I keep the machine in DDB, if there are more detailed commands to
> investigate the panic (the machine is an SMP BP6, runs a GENERIC current
> kernel, and stores its local files in two g_mirror partitions).
>
> The problematic spinlock is held by 0xc16de340 which is cpustop_handler.
>
> TfH
>
> PS : printout of the crash :
>
> # reboot
> Waiting (max 60 seconds) for system process `vnlru' to stop...done
> Waiting (max 60 seconds) for system process `bufdaemon' to stop...done
> Waiting (max 60 seconds) for system process `syncer' to stop...
> Syncing disks, vnodes remaining...3 2 2 2 0 0 done
> All buffers synced.
> Uptime: 39m52s
> GEOM_MIRROR: Device files1: provider mirror/files1 destroyed.
> GEOM_MIRROR: Device files1 destroyed.
> GEOM_MIRROR: Device gm0s1: provider mirror/gm0s1 destroyed.
> GEOM_MIRROR: Device gm0s1 destroyed.
> Rebooting...
> cpu_reset: Stopping other CPUs
> spin lock sched lock held by 0xc16de340 for > 5 seconds
> panic: spin lock held too long

OK, it's not a fatal panic, in that your disks should already be clean at
this point, etc.
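
For context on the message in the log above: "spin lock ... held by ... for
> 5 seconds" is the kind of report a bounded spin-wait produces when the
current owner never releases the lock. What follows is only a minimal
userland sketch of that idea in C11, not the FreeBSD kernel code. The names
spin_lock, spin_lock_bounded and spin_unlock, the integer owner ids, and the
spin budget are all hypothetical and chosen for illustration; a kernel would
panic at the point where this sketch returns false.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical userland model of a spin lock that records its owner. */
struct spin_lock {
	const char *name;
	_Atomic uintptr_t owner;	/* 0 when free, otherwise an owner id */
};

/*
 * Try to take the lock, giving up after max_spins failed attempts.
 * Returns true on success.  On timeout it reports the current owner
 * and returns false; this is where a kernel would panic instead.
 */
static bool
spin_lock_bounded(struct spin_lock *l, uintptr_t self, unsigned long max_spins)
{
	for (unsigned long i = 0; i < max_spins; i++) {
		uintptr_t expected = 0;

		/* Succeeds only if the lock is currently free. */
		if (atomic_compare_exchange_strong(&l->owner, &expected, self))
			return (true);
	}
	fprintf(stderr, "spin lock %s held by %#lx for too long\n",
	    l->name, (unsigned long)atomic_load(&l->owner));
	return (false);
}

/* Release the lock; the caller must currently hold it. */
static void
spin_unlock(struct spin_lock *l)
{
	assert(atomic_load(&l->owner) != 0);
	atomic_store(&l->owner, 0);
}
```

If the owner has been stopped while holding the lock (as a stopped CPU would
be), no amount of spinning helps, and the waiter can only time out and
report who holds it.
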
You can try this hack to see if it fixes it:

Index: vm_machdep.c
===================================================================
RCS file: /usr/cvs/src/sys/i386/i386/vm_machdep.c,v
retrieving revision 1.267
diff -u -r1.267 vm_machdep.c
--- vm_machdep.c	14 Nov 2005 00:43:44 -0000	1.267
+++ vm_machdep.c	23 Jan 2006 20:49:21 -0000
@@ -533,6 +533,7 @@
 		;	/* Wait for other cpu to see that we've started */
 	stop_cpus((1<<cpu_reset_proxyid));
 	printf("cpu_reset_proxy: Stopped CPU %d\n", cpu_reset_proxyid);
+	disable_intr();
 	DELAY(1000000);
 	cpu_reset_real();
 }
@@ -581,6 +582,7 @@
 			/* NOTREACHED */
 		}
+		disable_intr();
 		DELAY(1000000);
 	}
 #endif

The better fix is that we really should take CPUs offline more gracefully
during a shutdown (at least during an orderly shutdown).

-- 
John Baldwin <jhb@FreeBSD.org>  <><  http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve"  =  http://www.FreeBSD.org