Date:      Tue, 21 Jun 2011 10:17:19 -0600
From:      "Kenneth D. Merry" <ken@freebsd.org>
To:        Andrey Chernov <ache@freebsd.org>, Kostik Belousov <kostikbel@gmail.com>,  "Justin T. Gibbs" <gibbs@freebsd.org>, Eir Nym <eirnym@gmail.com>, current@freebsd.org, will@freebsd.org
Subject:   Re: Exactly that commit (was Re: Latest -current 100% hang at the late boot stage)
Message-ID:  <20110621161719.GA16166@nargothrond.kdm.org>
In-Reply-To: <20110620114656.GA83524@vniz.net>
References:  <20110619160148.GA35431@vniz.net> <BANLkTim5CyWOHThk45tgvMgcivF54QNuHQ@mail.gmail.com> <20110619165328.GA35872@vniz.net> <BANLkTimCegAu0ERJyLaCqen0=F%2BbWMDBNw@mail.gmail.com> <20110619232307.GA57530@vniz.net> <20110620001912.GA60252@vniz.net> <4DFEAD4F.1040603@FreeBSD.org> <20110620070222.GA74009@vniz.net> <20110620080146.GF48734@deviant.kiev.zoral.com.ua> <20110620114656.GA83524@vniz.net>

On Mon, Jun 20, 2011 at 15:46:56 +0400, Andrey Chernov wrote:
> On Mon, Jun 20, 2011 at 11:01:46AM +0300, Kostik Belousov wrote:
> > On Mon, Jun 20, 2011 at 11:02:22AM +0400, Andrey Chernov wrote:
> > > On Sun, Jun 19, 2011 at 08:15:43PM -0600, Justin T. Gibbs wrote:
> > > > On 6/19/11 6:19 PM, Andrey Chernov wrote:
> > > > > Exactly that commit is responsible for boot hang.
> > > > > Please fix.
> > > > > BTW, I have MBR on SATA disk (CAM emulated), ICH9.
> > > > 
> > > > Since it works for me, you'll need to provide more information.  Can you
> > > > at least drop into kdb to determine the likely source of the hang by
> > > > getting a stack trace of all processes to see where they are sleeping
> > > > and dumping lock information?
> > > 
> > > I drop into DDB and put 'bt' console photo in the very first message of 
> > > this thread - nothing unusual seen in the main stack. Could you please 
> > > specify exact DDB commands you want to be issued by me? No dump can be 
> > > provided since nothing is mounted yet including swap,
> > > 
> > > BTW, I remember I saw previously unseen warnings with post Jun 14 kernels:
> > > "xpt_action_default: CCB type 0xe not supported"
> > > 
> > > 'ps' inside DDB shows [xpt_thrd] at "ccb_scan" wmesg state and [g_event]
> > > at "caplck" wmesg state, [kernel] at "g_waitid" state.
> > > Even don't know, if it matters.
> > 
> > Just in case, please try r223277.
> 
> As the second message in the thread states, I try first even 223296 with 
> the same hang and the same 
> xpt_action_default: CCB type 0xe not supported
> As I think, DDB's 'ps' indicates that kernel waits something from geom and 
> geom waits something from ccb_scan forever, just raw guess. I will be glad to 
> issue more specific DDB commands and upload corresponding photos.
> BTW, plugging and unplugging USB devices works at that stage.

Can you do the following when the hang happens:

ps
alltrace
show locks
show msgbuf

Hopefully that will give us something to start looking at...

This would really work a lot better if there is any way to get a serial
console on the machine.  The above will produce a good bit of output, and
would likely need a lot of pictures.
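
One common way to enable a serial console on FreeBSD is through
/boot/loader.conf. A minimal sketch, assuming the machine has a first
serial port (COM1) wired to a second machine with a null-modem cable; the
speed shown is an assumption and must match whatever the terminal program
on the other end uses:

```
# /boot/loader.conf (sketch; assumes the first serial port is usable)
boot_multicons="YES"             # enable both the video and serial consoles
boot_serial="YES"                # prefer the serial console
comconsole_speed="115200"        # assumed speed; match the remote terminal
console="comconsole,vidconsole"  # send loader and kernel console to both
```

With something like this in place, the DDB output can be logged as text on
the second machine instead of being photographed.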

Since we can't reproduce the problem here, some debugging help would be
greatly appreciated.

Thanks,

Ken
-- 
Kenneth Merry
ken@FreeBSD.ORG


