Date:      Tue, 10 Oct 2006 10:20:39 -0700 (PDT)
From:      Doug Ambrisko <ambrisko@ambrisko.com>
To:        John Baldwin <jhb@freebsd.org>
Cc:        stable@freebsd.org, Bruno Ducrot <ducrot@poupinou.org>, freebsd-stable@freebsd.org, Bill Moran <wmoran@collaborativefusion.com>
Subject:   Re: Dell 1950 does not properly respond to reboot and shutdown -p
Message-ID:  <200610101720.k9AHKdMI099668@ambrisko.com>
In-Reply-To: <200610101022.33761.jhb@freebsd.org>

John Baldwin writes:
| On Tuesday 10 October 2006 08:54, Bill Moran wrote:
| > In response to Doug Ambrisko <ambrisko@ambrisko.com>:
| > > Bruno Ducrot writes:
| > > | On Wed, Oct 04, 2006 at 02:07:12PM -0400, Bill Moran wrote:
| > > | > In response to Bruno Ducrot <ducrot@poupinou.org>:
| > > | > > Hi,
| > > | > > 
| > > | > > On Wed, Oct 04, 2006 at 12:28:35PM -0400, Bill Moran wrote:
| > > | > > > 
| > > | > > > A reboot causes the OS to halt, but the hardware just sits there on the
| > > | > > > shutdown screen.
| > > | > > > 
| > > | > > > A shutdown -p does the same.
| > > | > > 
| > > | > > What exactly are the last few lines?
| > > | > 
| > > | > (manually copied)
| > > | > 
| > > | > ...
| > > | > All buffers synced.
| > > | > Uptime: 1m16s
| > > | > 
| > > | 
| > > | Thanks.  Then this happens after print_uptime().
| > > | 
| > > | I believe one of the drivers registers a shutdown_final (or
| > > | shutdown_post_sync) event that hangs your system.  I think (though I
| > > | may be wrong) mfi may be that one.
| > > | 
| > > | It would help if you can add some printf in dev/mfi/mfi.c into the
| > > | mfi_shutdown() function in order to check if that assumption
| > > | is correct.
| > > 
| > > Somewhat related to this, we have a local hack:
| > > 
| > > --- sys/kern/subr_bus.c.orig	Tue Jun 27 15:49:39 2006
| > > +++ sys/kern/subr_bus.c	Tue Jun 27 15:49:51 2006
| > > @@ -2906,6 +2906,7 @@ bus_generic_shutdown(device_t dev)
| > >  	device_t child;
| > >  
| > >  	TAILQ_FOREACH(child, &dev->children, link) {
| > > +		DELAY(1000);
| > >  		device_shutdown(child);
| > >  	}
| > 
| > This patch seems to "fix" the problem.  I'm going to replace it with
| > some printfs and see if I can determine which driver is actually
| > causing the problem (hopefully it's only one).
| > 
| > Am I wrong in saying that the correct solution would be to identify the
| > driver that needs more time and implementing some sort of polling
| > mechanism to ensure the hardware is ready when the driver wants to
| > shut down?
| 
| Well, first let's see which driver it is. :)  You might be able to just
| remove the DELAY and add a printf and see which device is printed last.

I think it was in different ones.  One of our configs has the base
HW + a bge NIC; the other has base HW + 2 x 2-port em NICs.  The more
NICs, the better the chance of a problem.

I've removed the hack from our kernel and I'm going to run the reboot
cycle.  I don't think a printf will work, since I recall trying that and
it "fixed" the problem, so I put the DELAY in :-(  It could be a generic
problem on any system with a CPU fast enough to beat the
HW at shutting down.  I'm not sure if his system is Dempsey or Woodcrest.
We use Woodcrest and they are much faster.  Other machines might be
"slow" enough that it's not a problem!  We haven't seen it on our older
platforms with the same kernel and similar HW configs.
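
For what it's worth, the polling idea Bill raised could look roughly like
the sketch below.  It is a userland simulation, not driver code:
struct fake_dev, fake_dev_idle(), fake_delay() and wait_until_idle() are
all made-up names for illustration.  A real driver would presumably read
a device status register and call DELAY() instead, and which register to
read depends on the hardware that turns out to be at fault.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a device; not a real FreeBSD structure. */
struct fake_dev {
	const char *name;
	int polls_until_idle;	/* simulated: polls before the HW reports idle */
};

/* Total microseconds "waited" so far; stands in for real elapsed time. */
static long simulated_usec;

/* Simulated replacement for the kernel's DELAY(); it only counts. */
static void
fake_delay(int usec)
{
	simulated_usec += usec;
}

/* Simulated "is the hardware idle yet?" check. */
static bool
fake_dev_idle(struct fake_dev *d)
{
	if (d->polls_until_idle > 0) {
		d->polls_until_idle--;
		return (false);
	}
	return (true);
}

/*
 * Poll the device at most max_polls times, waiting step_usec between
 * polls.  Returns the number of failed polls before the device went
 * idle, or -1 if it never did within the bound.
 */
static int
wait_until_idle(struct fake_dev *d, int max_polls, int step_usec)
{
	int i;

	for (i = 0; i < max_polls; i++) {
		if (fake_dev_idle(d))
			return (i);
		fake_delay(step_usec);
	}
	return (-1);
}
```

Unlike the fixed DELAY(1000) per child, a bounded poll only waits as long
as the device needs, and a -1 return identifies the device that never
became ready.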

Doug A.


