Date:      Thu, 12 Oct 2006 16:40:22 +0200
From:      Bruno Ducrot <ducrot@poupinou.org>
To:        Bill Moran <wmoran@collaborativefusion.com>
Cc:        freebsd-stable@freebsd.org, stable@freebsd.org, John Baldwin <jhb@freebsd.org>
Subject:   Re: Dell 1950 does not properly respond to reboot and shutdown -p
Message-ID:  <20061012144022.GV4945@poupinou.org>
In-Reply-To: <20061010145315.cefa9e19.wmoran@collaborativefusion.com>
References:  <200610101022.33761.jhb@freebsd.org> <200610101720.k9AHKdMI099668@ambrisko.com> <20061010145315.cefa9e19.wmoran@collaborativefusion.com>

On Tue, Oct 10, 2006 at 02:53:15PM -0400, Bill Moran wrote:
> In response to Doug Ambrisko <ambrisko@ambrisko.com>:
> 
> > John Baldwin writes:
> > | On Tuesday 10 October 2006 08:54, Bill Moran wrote:
> > | > In response to Doug Ambrisko <ambrisko@ambrisko.com>:
> > | > > Bruno Ducrot writes:
> > | > > | On Wed, Oct 04, 2006 at 02:07:12PM -0400, Bill Moran wrote:
> > | > > | > In response to Bruno Ducrot <ducrot@poupinou.org>:
> > | > > | > > Hi,
> > | > > | > > 
> > | > > | > > On Wed, Oct 04, 2006 at 12:28:35PM -0400, Bill Moran wrote:
> > | > > | > > > 
> > | > > | > > > A reboot causes the OS to halt, but the hardware just sits there on the
> > | > > | > > > shutdown screen.
> > | > > | > > > 
> > | > > | > > > A shutdown -p does the same.
> > | > > | > > 
> > | > > | > > What exactly are the last few lines?
> > | > > | > 
> > | > > | > (manually copied)
> > | > > | > 
> > | > > | > ...
> > | > > | > All buffers synced.
> > | > > | > Uptime: 1m16s
> > | > > | > 
> > | > > | 
> > | > > | Thanks.  Then this happens after print_uptime().
> > | > > | 
> > | > > | I believe one of the drivers registers a shutdown_final (or
> > | > > | shutdown_post_sync) event that hangs your system.  I think (though I
> > | > > | may be wrong) mfi may be that one.
> > | > > | 
> > | > > | It would help if you can add some printf in dev/mfi/mfi.c into the
> > | > > | mfi_shutdown() function in order to check if that assumption
> > | > > | is correct.
> > | > > 
> > | > > Somewhat related to this, we have a local hack:
> > | > > 
> > | > > --- sys/kern/subr_bus.c.orig	Tue Jun 27 15:49:39 2006
> > | > > +++ sys/kern/subr_bus.c	Tue Jun 27 15:49:51 2006
> > | > > @@ -2906,6 +2906,7 @@ bus_generic_shutdown(device_t dev)
> > | > >  	device_t child;
> > | > >  
> > | > >  	TAILQ_FOREACH(child, &dev->children, link) {
> > | > > +		DELAY(1000);
> > | > >  		device_shutdown(child);
> > | > >  	}
> > | > 
> > | > This patch seems to "fix" the problem.  I'm going to replace it with
> > | > some printfs and see if I can determine which driver is actually
> > | > causing the problem (hopefully it's only one).
> > | > 
> > | > Am I wrong in saying that the correct solution would be to identify the
> > | > driver that needs more time and implement some sort of polling
> > | > mechanism to ensure the hardware is ready when the driver wants to
> > | > shut down?
> > | 
> > | Well, first let's see which driver it is. :)  You might be able to just
> > | remove the DELAY and add a printf and see which device is printed last.
> > 
> > I think it was in different ones.  One of our configs has the base
> > HW + bge NIC, the other has base HW + 2 x 2 port em NICs.  The more
> > NICs, the greater the chance of a problem.
> > 
> > I've removed the hack from our kernel and I'm going to run the reboot
> > cycle.  I don't think a printf will work, since I recall that trying it
> > "fixed" the problem, so I put the DELAY in :-(  It could be a generic
> > problem on systems with a CPU fast enough to beat the
> > HW at shutting down.  I'm not sure if his system is Dempsey or Woodcrest.
> > We use Woodcrest and they really are faster.  Other machines might be
> > "slow" enough that it's not a problem!  We haven't seen it on our older
> > platforms with the same kernel and similar HW configs.
> 
> Well, I already did this.  The only printf is the
> device_printf(child, "shutdown\n") that Bruno suggested.  With this
> single change, I'm unable to reproduce the problem.
> 
> Have any commits been made to 6-STABLE that might have inadvertently
> fixed this in the last week or so?
> 

I think the device_printf() call itself takes long enough that you get the
same behaviour as with the DELAY(), which is why the problem disappears.
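
To illustrate the polling approach Bill asked about, as opposed to a fixed
delay, here is a minimal userspace sketch.  Everything in it is hypothetical:
fake_dev, fake_dev_ready(), fake_delay() and wait_until_ready() are made-up
names for illustration only, not anything from mfi(4), bge(4), em(4) or
subr_bus.c.  In a real driver the readiness check would presumably be a read
of whatever status register the hardware provides, and the wait step would
be something like DELAY().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a device: it reports "ready" only after a
 * given number of status reads.  Not a real driver structure. */
struct fake_dev {
	int polls_until_ready;	/* status reads left before "ready" */
	int polls_seen;		/* total status reads so far */
};

/* Hypothetical status check; a real driver would read a register here. */
static bool
fake_dev_ready(struct fake_dev *d)
{
	d->polls_seen++;
	if (d->polls_until_ready > 0)
		d->polls_until_ready--;
	return (d->polls_until_ready == 0);
}

/* Userspace placeholder that does nothing; in a kernel this is where a
 * short busy-wait such as DELAY(us) would go. */
static void
fake_delay(unsigned int us)
{
	(void)us;
}

/*
 * Poll the device up to max_polls times, pausing step_us between
 * attempts.  Returns 0 as soon as the device reports ready, or -1 if
 * it never does, so the caller cannot hang forever.
 */
static int
wait_until_ready(struct fake_dev *d, int max_polls, unsigned int step_us)
{
	assert(d != NULL);
	for (int i = 0; i < max_polls; i++) {
		if (fake_dev_ready(d))
			return (0);
		fake_delay(step_us);
	}
	return (-1);
}
```

The point of the sketch is only that the wait is bounded and ends as soon as
the hardware says it is ready, instead of relying on the side effect of a
printf or a blanket DELAY(1000) for every child device.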

-- 
Bruno Ducrot

--  Which is worse:  ignorance or apathy?
--  Don't know.  Don't care.


