From owner-freebsd-current@FreeBSD.ORG  Mon Apr 16 16:15:45 2012
Return-Path: <owner-freebsd-current@FreeBSD.ORG>
Delivered-To: freebsd-current@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 95C7E106566C
	for <freebsd-current@freebsd.org>; Mon, 16 Apr 2012 16:15:45 +0000 (UTC)
	(envelope-from rhurlin@gwdg.de)
Received: from fmailer.gwdg.de (fmailer.gwdg.de [134.76.11.16])
	by mx1.freebsd.org (Postfix) with ESMTP id 109648FC12
	for <freebsd-current@freebsd.org>; Mon, 16 Apr 2012 16:15:45 +0000 (UTC)
Received: from p508c79a9.dip.t-dialin.net ([80.140.121.169]
	helo=krabat.raven.hur)
	by mailer.gwdg.de with esmtpsa (TLSv1:AES256-SHA:256) (Exim 4.72)
	(envelope-from <rhurlin@gwdg.de>)
	id 1SJoaV-000431-Nl; Mon, 16 Apr 2012 18:15:35 +0200
Message-ID: <4F8C45A4.2050407@gwdg.de>
Date: Mon, 16 Apr 2012 18:15:32 +0200
From: Rainer Hurling <rhurlin@gwdg.de>
User-Agent: Mozilla/5.0 (X11; FreeBSD amd64;
	rv:10.0.3) Gecko/20120317 Thunderbird/10.0.3
MIME-Version: 1.0
To: Konstantin Belousov <kostikbel@gmail.com>
References: <20120415053032.370280f9@cox.net>
	<4F8BDF13.4060903@mail.zedat.fu-berlin.de>
	<4F8C2E2B.20408@gmail.com>
	<20120416145543.GB2358@deviant.kiev.zoral.com.ua>
In-Reply-To: <20120416145543.GB2358@deviant.kiev.zoral.com.ua>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-Authenticated: Id:rhurlin
X-Spam-Level: -
X-Virus-Scanned: (clean) by exiscan+sophie
Cc: matt <sendtomatt@gmail.com>,
	"O. Hartmann" <ohartman@mail.zedat.fu-berlin.de>,
	freebsd-current@freebsd.org, "Conrad J. Sabatier" <conrads@cox.net>
Subject: Re: Kernel builds, but crashes at boot (amd64, Revision: 234306)
X-BeenThere: freebsd-current@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Discussions about the use of FreeBSD-current
	<freebsd-current.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-current>, 
	<mailto:freebsd-current-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-current>
List-Post: <mailto:freebsd-current@freebsd.org>
List-Help: <mailto:freebsd-current-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-current>,
	<mailto:freebsd-current-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 16 Apr 2012 16:15:45 -0000

On 16.04.2012 16:55 (UTC+1), Konstantin Belousov wrote:
> On Mon, Apr 16, 2012 at 07:35:23AM -0700, matt wrote:
>> On 04/16/12 01:57, O. Hartmann wrote:
>>> On 04/15/12 12:30, Conrad J. Sabatier wrote:
>>>> Today I'm suddenly unable to boot a newly built kernel without crashing
>>>> right near the end of the device probes, just before the system is
>>>> about to actually come up:
>>>>
>>>> Fatal trap 18: integer divide fault while in kernel mode
>>>>
>>>> Stopped at 0xffffffff803b2646 = g_label_ufs_taste_common+0x36
>>>> divl 0x50(%rcx),%eax
>>>>
>>>> Backtrace lists this chain of calls:
>>>> g_label_ufs_taste_common
>>>> g_label_taste
>>>> g_new_provider_event
>>>> g_run_events
>>>> g_event_procbody
>>>> fork_exit
>>>> fork_trampoline
>>>>
>>>> Whether built with clang or gcc, CUSTOM config or GENERIC, same results
>>>> on rebooting.  No idea why this suddenly started happening, haven't
>>>> changed anything at all in my setup.
>>> My recent kernel does the same on two "FreeBSD 10.0-CURRENT #1 r234309:
>>> Sun Apr 15 14:14:11 CEST 2012" boxes. What both boxes have in common is
>>> that they are attached to a Dell UltraSharp U2711 screen, which has a
>>> built-in USB/MMC hub. I realized that I could log into my lab's box
>>> remotely when I'm not in the lab, which usually coincides with the
>>> screen being switched off.
>>> This morning I logged in from home, logged out and got to the office,
>>> switched on the screen - and reboot! I wasn't able to get the system
>>> running again, it always got stuck in a
>>>
>>> Fatal trap 18: integer divide fault while in kernel mode
>>>
>>> Unplugging the screen's USB hub makes the system boot again!
>>>
>>> Following is one of the last logged messages from the kernel; I do not
>>> know whether this is useful in looking for the problem.
>>>
>>> Regards,
>>> Oliver
>>>
>>> Apr 12 15:32:33 telesto kernel: hwpmc:
>>> SOFT/16/64/0x67<INT,USR,SYS,REA,WRI>  TSC/1/64/0x20<REA>
>>> IAP/4/48/0x3ff<INT,USR,SYS,EDG,THR,REA,WRI,INV,QUA,PRC>
>>> IAF/3/48/0x61<INT,REA,WRI>  UCP/8/48/0x3f8<EDG,THR,REA,WRI,INV,QUA,PRC>
>>> UCF/1/48/0x60<REA,WRI>
>>> Apr 12 15:32:33 telesto kernel: uhub1: 4 ports with 4 removable, self
>>> powered
>>> Apr 12 15:32:33 telesto kernel: uhub2: 4 ports with 4 removable, self
>>> powered
>>> Apr 12 15:32:33 telesto kernel: uhub3: 2 ports with 2 removable, self
>>> powered
>>> Apr 12 15:32:33 telesto kernel: uhub0: 2 ports with 2 removable, self
>>> powered
>>> Apr 12 15:32:33 telesto kernel: ugen3.2:<vendor 0x8087>  at usbus3
>>> Apr 12 15:32:33 telesto kernel: uhub4:<vendor 0x8087 product 0x0024,
>>> class 9/0, rev 2.00/0.00, addr 2>  on usbus3
>>> Apr 12 15:32:33 telesto kernel: ugen0.2:<vendor 0x8087>  at usbus0
>>> Apr 12 15:32:33 telesto kernel: uhub5:<vendor 0x8087 product 0x0024,
>>> class 9/0, rev 2.00/0.00, addr 2>  on usbus0
>>> Apr 12 15:32:33 telesto kernel: Root mount waiting for: usbus3 usbus0
>>> Apr 12 15:32:33 telesto kernel: uhub5: 6 ports with 6 removable, self
>>> powered
>>> Apr 12 15:32:33 telesto kernel: uhub4: 8 ports with 8 removable, self
>>> powered
>>> Apr 12 15:32:33 telesto kernel: ugen3.3:<Cherry GmbH>  at usbus3
>>> Apr 12 15:32:33 telesto kernel: ukbd0:<Cherry GmbH wired keyboard,
>>> class 0/0, rev 2.00/1.11, addr 3>  on usbus3
>>> Apr 12 15:32:33 telesto kernel: kbd2 at ukbd0
>>> Apr 12 15:32:33 telesto kernel: uhid0:<Cherry GmbH wired keyboard,
>>> class 0/0, rev 2.00/1.11, addr 3>  on usbus3
>>> Apr 12 15:32:33 telesto kernel: Root mount waiting for: usbus3
>>> Apr 12 15:32:33 telesto kernel: ugen3.4:<vendor 0x0424>  at usbus3
>>> Apr 12 15:32:33 telesto kernel: uhub6:<vendor 0x0424 product 0x2514,
>>> class 9/0, rev 2.00/0.00, addr 4>  on usbus3
>>> Apr 12 15:32:33 telesto kernel: Root mount waiting for: usbus3
>>> Apr 12 15:32:33 telesto kernel: uhub6: 3 ports with 2 removable, self
>>> powered
>>> Apr 12 15:32:33 telesto kernel: ugen3.5:<vendor 0x0424>  at usbus3
>>> Apr 12 15:32:33 telesto kernel: uhub7:<vendor 0x0424 product 0x2640,
>>> class 9/0, rev 2.00/0.00, addr 5>  on usbus3
>>> Apr 12 15:32:33 telesto kernel: Root mount waiting for: usbus3
>>> Apr 12 15:32:33 telesto kernel: uhub7: 3 ports with 2 removable, self
>>> powered
>>> Apr 12 15:32:33 telesto kernel: Root mount waiting for: usbus3
>>> Apr 12 15:32:33 telesto kernel: ugen3.6:<Generic>  at usbus3
>>> Apr 12 15:32:33 telesto kernel: umass0:<Generic Ultra Fast Media
>>> Reader, class 0/0, rev 2.00/1.91, addr 6>  on usbus3
>>> Apr 12 15:32:33 telesto kernel: Root mount waiting for: usbus3
>>> Apr 12 15:32:33 telesto kernel: (probe0:umass-sim0:0:0:0): TEST UNIT
>>> READY. CDB: 0 0 0 0 0 0
>>> Apr 12 15:32:33 telesto kernel: (probe0:umass-sim0:0:0:0): CAM status:
>>> SCSI Status Error
>>> Apr 12 15:32:33 telesto kernel: (probe0:umass-sim0:0:0:0): SCSI status:
>>> Check Condition
>>> Apr 12 15:32:33 telesto kernel: (probe0:umass-sim0:0:0:0): SCSI sense:
>>> NOT READY asc:3a,0 (Medium not present)
>>> Apr 12 15:32:33 telesto kernel: da0 at umass-sim0 bus 0 scbus14 target 0
>>> lun 0
>>> Apr 12 15:32:33 telesto kernel: da0:<Generic Ultra HS-SD/MMC 1.91>
>>> Removable Direct Access SCSI-0 device
>>> Apr 12 15:32:33 telesto kernel: da0: 40.000MB/s transfers
>>> Apr 12 15:32:33 telesto kernel: da0: Attempt to query device size
>>> failed: NOT READY, Medium not present
>>> Apr 12 15:32:33 telesto kernel: ugen3.7:<Logitech>  at usbus3
>>> Apr 12 15:32:33 telesto kernel: ums0:<Logitech USB Laser Mouse, class
>>> 0/0, rev 2.00/56.01, addr 7>  on usbus3
>>> Apr 12 15:32:33 telesto kernel: ums0: 8 buttons and [XYZT] coordinates ID=0
>>> Apr 12 15:32:33 telesto kernel: Trying to mount root from
>>> ufs:/dev/gpt/root [rw]...
>>> Apr 12 15:32:33 telesto kernel: nvidia0:<GeForce GTX 570>  on vgapci0
>>> Apr 12 15:32:33 telesto kernel: vgapci0: child nvidia0 requested
>>> pci_enable_io
>>> Apr 12 15:32:33 telesto kernel: vgapci0: child nvidia0 requested
>>> pci_enable_io
>>> Apr 12 15:32:33 telesto kernel: vboxdrv: fAsync=0 offMin=0x2d8 offMax=0x603c
>>> Apr 12 15:32:33 telesto kernel: module_register: module ng_ether already
>>> exists!
>>> Apr 12 15:32:33 telesto kernel: Module ng_ether failed to register: 17
>>>
>> Disconnect the "Generic Ultra HS-SD/MMC" device which is presenting
>> da0...same problem here. The system will boot if da0 is either not present
>> or has media (I think). In my case it was a different card reader that
>> had no cards in it, which seems to be similar to your case.
>>
>> My guess is that this problem is related to recent changes in da, but I
>> couldn't pinpoint in the diff what's going wrong in a quick look.
>
> So did you try reverting r234177 and/or r233963?

I just updated my system to r234342 and downgraded only
/usr/src/sys/cam/scsi/scsi_da.c to r233746, and now the system is
booting again. So something is evidently wrong with the most recent
changes to scsi_da.c.
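
For what it's worth, the faulting instruction in Conrad's trace is a
divl with a memory operand, so the divisor loaded from memory was most
likely zero. One possible reading, and this is only an assumption on my
part, is that it is a size field of a provider whose medium is absent
(da0 reports "Medium not present" in Oliver's log). Below is a minimal
sketch of the kind of guard that would avoid such a trap. The struct,
the constant and the function name are hypothetical and are not taken
from the GEOM sources:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical superblock size used only for this illustration. */
#define SKETCH_SBLOCKSIZE 8192u

/* Hypothetical stand-in for a storage provider; not the real GEOM struct. */
struct provider_sketch {
	uint32_t sectorsize;	/* 0 models a reader with no medium inserted */
	uint64_t mediasize;	/* total size in bytes, 0 if unknown */
};

/*
 * Return true only if it is safe and sensible to look for a label:
 * the sector size must be non-zero (otherwise the modulo below would
 * raise an integer divide fault) and must evenly divide the superblock
 * size.
 */
static bool
taste_is_safe(const struct provider_sketch *pp)
{
	if (pp->sectorsize == 0)
		return (false);
	return ((SKETCH_SBLOCKSIZE % pp->sectorsize) == 0);
}
```

With such a check, a provider reporting sectorsize 0 would simply be
skipped instead of reaching the division. Whether the real fix belongs
in the label code or in scsi_da.c is a separate question.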