From owner-freebsd-current@FreeBSD.ORG Mon Apr 16 17:59:02 2012
Message-ID: <4F8C5DE1.60200@gwdg.de>
Date: Mon, 16 Apr 2012 19:58:57 +0200
From: Rainer Hurling
To: Konstantin Belousov
Cc: matt, "O. Hartmann", ken@freebsd.org, freebsd-current@freebsd.org,
 trasz@freebsd.org, "Conrad J. Sabatier"
Subject: Re: Kernel builds, but crashes at boot (amd64, Revision: 234306)
References: <20120415053032.370280f9@cox.net>
 <4F8BDF13.4060903@mail.zedat.fu-berlin.de> <4F8C2E2B.20408@gmail.com>
 <20120416145543.GB2358@deviant.kiev.zoral.com.ua> <4F8C45A4.2050407@gwdg.de>
 <20120416173150.GH2358@deviant.kiev.zoral.com.ua>
In-Reply-To: <20120416173150.GH2358@deviant.kiev.zoral.com.ua>
List-Id: Discussions about the use of FreeBSD-current

On 16.04.2012 19:31 (UTC+1), Konstantin Belousov wrote:
> On Mon, Apr 16, 2012 at 06:15:32PM +0200, Rainer Hurling wrote:
>> On 16.04.2012 16:55 (UTC+1), Konstantin Belousov wrote:
>>> On Mon, Apr 16, 2012 at 07:35:23AM -0700, matt wrote:
>>>> On 04/16/12 01:57, O. Hartmann wrote:
>>>>> On 04/15/12 12:30, Conrad J. Sabatier wrote:
>>>>>> Today I'm suddenly unable to boot a newly built kernel without crashing
>>>>>> right near the end of the device probes, just before the system is
>>>>>> about to actually come up:
>>>>>>
>>>>>> Fatal trap 18: integer divide fault while in kernel mode
>>>>>>
>>>>>> Stopped at 0xffffffff803b2646 = g_label_ufs_taste_common+0x36
>>>>>> divl 0x50(%rcx),%eax
>>>>>>
>>>>>> Backtrace lists this chain of calls:
>>>>>> g_label_ufs_taste_common
>>>>>> g_label_taste
>>>>>> g_new_provider_event
>>>>>> g_run_events
>>>>>> g_event_procbody
>>>>>> fork_exit
>>>>>> fork_trampoline
>>>>>>
>>>>>> Whether built with clang or gcc, CUSTOM config or GENERIC, same results
>>>>>> on rebooting. No idea why this suddenly started happening, haven't
>>>>>> changed anything at all in my setup.
>>>>> My recent kernel does the same on two "FreeBSD 10.0-CURRENT #1 r234309:
>>>>> Sun Apr 15 14:14:11 CEST 2012" boxes. What both boxes have in common is
>>>>> that they are attached to a Dell UltraSharp U2711 screen which has a
>>>>> built-in USB/MMC hub. I realized that it was possible to log into my
>>>>> lab's box from remote when I'm not in the lab, and that usually
>>>>> coincides with a switched-off screen.
>>>>> This morning I logged in from home, logged out and got to the office,
>>>>> switched on the screen - and reboot!
>>>>> I wasn't able to get the system
>>>>> running again, it always got stuck in a
>>>>>
>>>>> Fatal trap 18: integer divide fault while in kernel mode
>>>>>
>>>>> Unplugging the screen's USB hub makes the system boot again!
>>>>>
>>>>> Following is one of the last logged messages from the kernel; I do not
>>>>> know whether this is useful in looking for the problem.
>>>>>
>>>>> Regards,
>>>>> Oliver
>>>>>
>>>>> Apr 12 15:32:33 telesto kernel: hwpmc:
>>>>> SOFT/16/64/0x67 TSC/1/64/0x20
>>>>> IAP/4/48/0x3ff
>>>>> IAF/3/48/0x61 UCP/8/48/0x3f8
>>>>> UCF/1/48/0x60
>>>>> Apr 12 15:32:33 telesto kernel: uhub1: 4 ports with 4 removable, self
>>>>> powered
>>>>> Apr 12 15:32:33 telesto kernel: uhub2: 4 ports with 4 removable, self
>>>>> powered
>>>>> Apr 12 15:32:33 telesto kernel: uhub3: 2 ports with 2 removable, self
>>>>> powered
>>>>> Apr 12 15:32:33 telesto kernel: uhub0: 2 ports with 2 removable, self
>>>>> powered
>>>>> Apr 12 15:32:33 telesto kernel: ugen3.2: at usbus3
>>>>> Apr 12 15:32:33 telesto kernel: uhub4:
>>>>> class 9/0, rev 2.00/0.00, addr 2> on usbus3
>>>>> Apr 12 15:32:33 telesto kernel: ugen0.2: at usbus0
>>>>> Apr 12 15:32:33 telesto kernel: uhub5:
>>>>> class 9/0, rev 2.00/0.00, addr 2> on usbus0
>>>>> Apr 12 15:32:33 telesto kernel: Root mount waiting for: usbus3 usbus0
>>>>> Apr 12 15:32:33 telesto kernel: uhub5: 6 ports with 6 removable, self
>>>>> powered
>>>>> Apr 12 15:32:33 telesto kernel: uhub4: 8 ports with 8 removable, self
>>>>> powered
>>>>> Apr 12 15:32:33 telesto kernel: ugen3.3: at usbus3
>>>>> Apr 12 15:32:33 telesto kernel: ukbd0:
>>>>> class 0/0, rev 2.00/1.11, addr 3> on usbus3
>>>>> Apr 12 15:32:33 telesto kernel: kbd2 at ukbd0
>>>>> Apr 12 15:32:33 telesto kernel: uhid0:
>>>>> class 0/0, rev 2.00/1.11, addr 3> on usbus3
>>>>> Apr 12 15:32:33 telesto kernel: Root mount waiting for: usbus3
>>>>> Apr 12 15:32:33 telesto kernel: ugen3.4: at usbus3
>>>>> Apr 12 15:32:33 telesto kernel: uhub6:
>>>>> class 9/0, rev 2.00/0.00, addr 4> on usbus3
>>>>> Apr 12 15:32:33 telesto kernel: Root mount waiting for: usbus3
>>>>> Apr 12 15:32:33 telesto kernel: uhub6: 3 ports with 2 removable, self
>>>>> powered
>>>>> Apr 12 15:32:33 telesto kernel: ugen3.5: at usbus3
>>>>> Apr 12 15:32:33 telesto kernel: uhub7:
>>>>> class 9/0, rev 2.00/0.00, addr 5> on usbus3
>>>>> Apr 12 15:32:33 telesto kernel: Root mount waiting for: usbus3
>>>>> Apr 12 15:32:33 telesto kernel: uhub7: 3 ports with 2 removable, self
>>>>> powered
>>>>> Apr 12 15:32:33 telesto kernel: Root mount waiting for: usbus3
>>>>> Apr 12 15:32:33 telesto kernel: ugen3.6: at usbus3
>>>>> Apr 12 15:32:33 telesto kernel: umass0:
>>>>> Reader, class 0/0, rev 2.00/1.91, addr 6> on usbus3
>>>>> Apr 12 15:32:33 telesto kernel: Root mount waiting for: usbus3
>>>>> Apr 12 15:32:33 telesto kernel: (probe0:umass-sim0:0:0:0): TEST UNIT
>>>>> READY. CDB: 0 0 0 0 0 0
>>>>> Apr 12 15:32:33 telesto kernel: (probe0:umass-sim0:0:0:0): CAM status:
>>>>> SCSI Status Error
>>>>> Apr 12 15:32:33 telesto kernel: (probe0:umass-sim0:0:0:0): SCSI status:
>>>>> Check Condition
>>>>> Apr 12 15:32:33 telesto kernel: (probe0:umass-sim0:0:0:0): SCSI sense:
>>>>> NOT READY asc:3a,0 (Medium not present)
>>>>> Apr 12 15:32:33 telesto kernel: da0 at umass-sim0 bus 0 scbus14 target 0
>>>>> lun 0
>>>>> Apr 12 15:32:33 telesto kernel: da0:
>>>>> Removable Direct Access SCSI-0 device
>>>>> Apr 12 15:32:33 telesto kernel: da0: 40.000MB/s transfers
>>>>> Apr 12 15:32:33 telesto kernel: da0: Attempt to query device size
>>>>> failed: NOT READY, Medium not present
>>>>> Apr 12 15:32:33 telesto kernel: ugen3.7: at usbus3
>>>>> Apr 12 15:32:33 telesto kernel: ums0:
>>>>> 0/0, rev 2.00/56.01, addr 7> on usbus3
>>>>> Apr 12 15:32:33 telesto kernel: ums0: 8 buttons and [XYZT] coordinates
>>>>> ID=0
>>>>> Apr 12 15:32:33 telesto kernel: Trying to mount root from
>>>>> ufs:/dev/gpt/root [rw]...
>>>>> Apr 12 15:32:33 telesto kernel: nvidia0: on vgapci0
>>>>> Apr 12 15:32:33 telesto kernel: vgapci0: child nvidia0 requested
>>>>> pci_enable_io
>>>>> Apr 12 15:32:33 telesto kernel: vgapci0: child nvidia0 requested
>>>>> pci_enable_io
>>>>> Apr 12 15:32:33 telesto kernel: vboxdrv: fAsync=0 offMin=0x2d8
>>>>> offMax=0x603c
>>>>> Apr 12 15:32:33 telesto kernel: module_register: module ng_ether already
>>>>> exists!
>>>>> Apr 12 15:32:33 telesto kernel: Module ng_ether failed to register: 17
>>>>>
>>>> Disconnect "Generic Ultra HS-SD/MMC" device which is presenting
>>>> da0...same problem here. System will boot if da0 is either not present
>>>> or has media (I think). In my case it was a different card reader that
>>>> had no cards in it, which seems to be similar to your case.
>>>>
>>>> My guess is that this problem is related to recent changes in da, but I
>>>> couldn't pinpoint in the diff what's going wrong in a quick look.
>>>
>>> So did you try to revert r234177 and/or r233963?
>>
>> I just updated my system to r234342, only downgraded
>> /usr/src/sys/cam/scsi/scsi_da.c to r233746, and now the system is
>> booting again. So obviously there is something wrong with the newest
>> patch to scsi_da.c.
> It is too broad, try to revert exactly one patch and see whether it works.

Sorry for my bad English. I wanted to say that I reverted exactly one
patch (file scsi_da.c, manually, from 234177 back to 233746). The rest of
the tree is at r234342.
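
The faulting instruction reported in this thread (divl 0x50(%rcx),%eax in
g_label_ufs_taste_common), together with the "Medium not present" messages
for da0, would be consistent with a division by a sector size of zero on a
card reader that has no card inserted. Nothing in the thread confirms this.
The following is only a minimal C sketch under that assumption: struct
provider_sketch, sblock_fits() and SBLOCKSIZE_SKETCH are hypothetical names
chosen for illustration, not the actual GEOM code or the eventual fix.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a storage provider; not the real GEOM struct. */
struct provider_sketch {
	uint32_t sectorsize;	/* 0 models a reader with no medium inserted */
};

/* Superblock size in bytes; an assumed value for this sketch only. */
#define SBLOCKSIZE_SKETCH 8192u

/*
 * Return true when the superblock size is a whole multiple of the sector
 * size. The zero check comes first, because an integer division or modulo
 * by zero on amd64 raises a divide fault instead of producing a value.
 */
bool
sblock_fits(const struct provider_sketch *pp)
{
	if (pp->sectorsize == 0)
		return false;	/* nothing to taste on a media-less device */
	return (SBLOCKSIZE_SKETCH % pp->sectorsize) == 0;
}
```

With such a guard, a provider reporting a sector size of 0 would simply be
skipped, while a provider with 512-byte sectors would pass the check as
before.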