Date: Tue, 10 Oct 2006 15:44:43 -0400
From: John Baldwin <jhb@freebsd.org>
To: freebsd-stable@freebsd.org
Cc: stable@freebsd.org, Bruno Ducrot <ducrot@poupinou.org>, Bill Moran <wmoran@collaborativefusion.com>
Subject: Re: Dell 1950 does not properly respond to reboot and shutdown -p
Message-ID: <200610101544.43903.jhb@freebsd.org>
In-Reply-To: <200610101720.k9AHKdMI099668@ambrisko.com>
References: <200610101720.k9AHKdMI099668@ambrisko.com>

On Tuesday 10 October 2006 13:20, Doug Ambrisko wrote:
> John Baldwin writes:
> | On Tuesday 10 October 2006 08:54, Bill Moran wrote:
> | > In response to Doug Ambrisko <ambrisko@ambrisko.com>:
> | > > Bruno Ducrot writes:
> | > > | On Wed, Oct 04, 2006 at 02:07:12PM -0400, Bill Moran wrote:
> | > > | > In response to Bruno Ducrot <ducrot@poupinou.org>:
> | > > | > > Hi,
> | > > | > >
> | > > | > > On Wed, Oct 04, 2006 at 12:28:35PM -0400, Bill Moran wrote:
> | > > | > > >
> | > > | > > > A reboot causes the OS to halt, but the hardware just sits there on the
> | > > | > > > shutdown screen.
> | > > | > > >
> | > > | > > > A shutdown -p does the same.
> | > > | > >
> | > > | > > What exactly are the last few lines?
> | > > | >
> | > > | > (manually copied)
> | > > | >
> | > > | > ...
> | > > | > All buffers synced.
> | > > | > Uptime: 1m16s
> | > > | >
> | > > |
> | > > | Thanks.  Then this happen after print_uptime().
> | > > |
> | > > | I believe one of the drivers register a shutdown_final (or
> | > > | shutdown_post_sync) event that hang your system.  I think (though I
> | > > | may be wrong) mfi may be that one.
> | > > |
> | > > | It would help if you can add some printf in dev/mfi/mfi.c into the
> | > > | mfi_shutdown() function in order to check if that assumption
> | > > | is correct.
> | > >
> | > > Somewhat related to this we have a local hack:
> | > >
> | > > --- sys/kern/subr_bus.c.orig Tue Jun 27 15:49:39 2006
> | > > +++ sys/kern/subr_bus.c Tue Jun 27 15:49:51 2006
> | > > @@ -2906,6 +2906,7 @@ bus_generic_shutdown(device_t dev)
> | > >     device_t child;
> | > >
> | > >     TAILQ_FOREACH(child, &dev->children, link) {
> | > > +       DELAY(1000);
> | > >         device_shutdown(child);
> | > >     }
> | >
> | > This patch seems to "fix" the problem.  I'm going to replace it with
> | > some printfs and see if I can determine which driver is actually
> | > causing the problem (hopefully it's only one).
> | >
> | > Am I wrong in saying that the correct solution would be to identify the
> | > driver that needs more time and implementing some sort of polling
> | > mechanism to ensure the hardware is ready when the driver wants to
> | > shut down?
> |
> | Well, first let's see which driver it is. :)  You might be able to just
> | remove the DELAY and add a printf and see which device is printed last.
>
> I think it was in a different ones.  One of our configs has the base
> HW + bge NIC the other has base HW + 2 x 2 port em NICs.  The more
> NICs the better chance for a problem.
>
> I've removed the hack from our kernel and I'm going to run the reboot
> cycle.  I don't think a printf will work since I recall trying that
> it "fixed" the problem so I put the DELAY in :-(  It could be generic
> problem to the system with a sufficiently fast CPU to beat the
> HW at shutting down.  I'm not sure if his system is Dempsey or Woodcrest.
> We use Woodcrest and they are really faster.  Other machines might be
> "slow" enough that it's not a problem!  We haven't seen it on our older
> platforms with the same kernel and similar HW configs.

Can you break into the debugger when it is broken?  If so, then change the
printf to a KTR trace and enable just that KTR trace and do 'show ktr' in ddb
to see which devices were shut down.

-- 
John Baldwin
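
The idea behind swapping the printf for a trace can be sketched in userspace: each shutdown call is recorded in an in-memory ring, with no console I/O to slow things down and mask the race, and the last entry names the device that was being shut down when the machine hung. This is only a model under stated assumptions, not kernel code and not the KTR API. The ring layout, the function names (trace_record, trace_last, trace_dump, simulate_shutdown) and the device list are all hypothetical, and which device "hangs" in the simulation is arbitrary.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define TRACE_SLOTS 8   /* hypothetical ring size */
#define NAME_LEN    16  /* hypothetical max device-name length */

/* In-memory trace ring, a stand-in for a kernel trace buffer. */
struct trace_ring {
    char     names[TRACE_SLOTS][NAME_LEN];
    unsigned next;      /* total number of records written so far */
};

/* Record one event in memory only; oldest entries are overwritten. */
void trace_record(struct trace_ring *r, const char *name)
{
    assert(name != NULL);
    char *slot = r->names[r->next % TRACE_SLOTS];
    strncpy(slot, name, NAME_LEN - 1);
    slot[NAME_LEN - 1] = '\0';
    r->next++;
}

/* Most recently recorded name, or NULL if nothing was recorded. */
const char *trace_last(const struct trace_ring *r)
{
    if (r->next == 0)
        return NULL;
    return r->names[(r->next - 1) % TRACE_SLOTS];
}

/* Print surviving records oldest to newest, as one would read a
 * trace buffer from a debugger after the hang. */
void trace_dump(const struct trace_ring *r, FILE *out)
{
    unsigned count = r->next < TRACE_SLOTS ? r->next : TRACE_SLOTS;
    unsigned start = r->next - count;
    for (unsigned i = 0; i < count; i++)
        fprintf(out, "%u shutdown %s\n", start + i,
                r->names[(start + i) % TRACE_SLOTS]);
}

/* Stand-in for walking a bus's children at shutdown. hang_at is the
 * index of the device whose shutdown never returns (simulated by
 * stopping the walk there). */
void simulate_shutdown(struct trace_ring *r, const char *const *devs,
                       size_t n, size_t hang_at)
{
    for (size_t i = 0; i < n; i++) {
        trace_record(r, devs[i]);  /* log before calling the driver */
        if (i == hang_at)
            return;                /* pretend this shutdown hung */
    }
}
```

To use it, zero a struct trace_ring, call simulate_shutdown() with a list of device names and the index to hang at, then call trace_dump(&ring, stdout). Logging before the driver call rather than after is the design choice that matters: a record made after the call would never be written for the device that hangs.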