From owner-freebsd-current@FreeBSD.ORG Mon Apr 16 17:32:00 2012
Date: Mon, 16 Apr 2012 20:31:50 +0300
From: Konstantin Belousov
To: Rainer Hurling
Cc: matt, "O. Hartmann", ken@freebsd.org, freebsd-current@freebsd.org,
 trasz@freebsd.org, "Conrad J. Sabatier"
Subject: Re: Kernel builds, but crashes at boot (amd64, Revision: 234306)
Message-ID: <20120416173150.GH2358@deviant.kiev.zoral.com.ua>
References: <20120415053032.370280f9@cox.net>
 <4F8BDF13.4060903@mail.zedat.fu-berlin.de> <4F8C2E2B.20408@gmail.com>
 <20120416145543.GB2358@deviant.kiev.zoral.com.ua> <4F8C45A4.2050407@gwdg.de>
In-Reply-To: <4F8C45A4.2050407@gwdg.de>
Mime-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha1;
 protocol="application/pgp-signature"; boundary="uCh/rgkw4ZLIO80s"
Content-Disposition: inline
User-Agent: Mutt/1.4.2.3i
List-Id: Discussions about the use of FreeBSD-current

--uCh/rgkw4ZLIO80s
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

On Mon, Apr 16, 2012 at 06:15:32PM +0200, Rainer Hurling wrote:
> On 16.04.2012 16:55 (UTC+1), Konstantin Belousov wrote:
> >On Mon, Apr 16, 2012 at 07:35:23AM -0700, matt wrote:
> >>On 04/16/12 01:57, O. Hartmann wrote:
> >>>On 04/15/12 12:30, Conrad J. Sabatier wrote:
> >>>>Today I'm suddenly unable to boot a newly built kernel without crashing
> >>>>right near the end of the device probes, just before the system is
> >>>>about to actually come up:
> >>>>
> >>>>Fatal trap 18: integer divide fault while in kernel mode
> >>>>
> >>>>Stopped at 0xffffffff803b2646 = g_label_ufs_taste_common+0x36
> >>>>divl 0x50(%rcx),%eax
> >>>>
> >>>>Backtrace lists this chain of calls:
> >>>>g_label_ufs_taste_common
> >>>>g_label_taste
> >>>>g_new_provider_event
> >>>>g_run_events
> >>>>g_event_procbody
> >>>>fork_exit
> >>>>fork_trampoline
> >>>>
> >>>>Whether built with clang or gcc, CUSTOM config or GENERIC, same results
> >>>>on rebooting. No idea why this suddenly started happening, haven't
> >>>>changed anything at all in my setup.
> >>>My recent kernel does the same on two "FreeBSD 10.0-CURRENT #1 r234309:
> >>>Sun Apr 15 14:14:11 CEST 2012" boxes. What both boxes have in common is
> >>>that they are attached to a Dell UltraSharp U2711 screen which has a
> >>>built-in USB/MMC hub.
> >>>I realized that it was possible to log into my lab's box
> >>>remotely when I'm not in the lab, and that usually coincides
> >>>with a switched-off screen.
> >>>This morning I logged in from home, logged out and got to the office,
> >>>switched on the screen - and reboot! I wasn't able to get the system
> >>>running again, it always got stuck in a
> >>>
> >>>Fatal trap 18: integer divide fault while in kernel mode
> >>>
> >>>Unplugging the screen's USB hub makes the system boot again!
> >>>
> >>>Following is one of the last logged messages from the kernel, I do not
> >>>know whether this is useful in looking for the problem.
> >>>
> >>>Regards,
> >>>Oliver
> >>>
> >>>Apr 12 15:32:33 telesto kernel: hwpmc:
> >>>SOFT/16/64/0x67 TSC/1/64/0x20
> >>>IAP/4/48/0x3ff
> >>>IAF/3/48/0x61 UCP/8/48/0x3f8
> >>>UCF/1/48/0x60
> >>>Apr 12 15:32:33 telesto kernel: uhub1: 4 ports with 4 removable, self
> >>>powered
> >>>Apr 12 15:32:33 telesto kernel: uhub2: 4 ports with 4 removable, self
> >>>powered
> >>>Apr 12 15:32:33 telesto kernel: uhub3: 2 ports with 2 removable, self
> >>>powered
> >>>Apr 12 15:32:33 telesto kernel: uhub0: 2 ports with 2 removable, self
> >>>powered
> >>>Apr 12 15:32:33 telesto kernel: ugen3.2: at usbus3
> >>>Apr 12 15:32:33 telesto kernel: uhub4:
> >>>class 9/0, rev 2.00/0.00, addr 2> on usbus3
> >>>Apr 12 15:32:33 telesto kernel: ugen0.2: at usbus0
> >>>Apr 12 15:32:33 telesto kernel: uhub5:
> >>>class 9/0, rev 2.00/0.00, addr 2> on usbus0
> >>>Apr 12 15:32:33 telesto kernel: Root mount waiting for: usbus3 usbus0
> >>>Apr 12 15:32:33 telesto kernel: uhub5: 6 ports with 6 removable, self
> >>>powered
> >>>Apr 12 15:32:33 telesto kernel: uhub4: 8 ports with 8 removable, self
> >>>powered
> >>>Apr 12 15:32:33 telesto kernel: ugen3.3: at usbus3
> >>>Apr 12 15:32:33 telesto kernel: ukbd0:
> >>>class 0/0, rev 2.00/1.11, addr 3> on usbus3
> >>>Apr 12 15:32:33 telesto kernel: kbd2 at ukbd0
> >>>Apr 12 15:32:33 telesto kernel: uhid0:
> >>>class 0/0, rev 2.00/1.11, addr 3> on usbus3
> >>>Apr 12 15:32:33 telesto kernel: Root mount waiting for: usbus3
> >>>Apr 12 15:32:33 telesto kernel: ugen3.4: at usbus3
> >>>Apr 12 15:32:33 telesto kernel: uhub6:
> >>>class 9/0, rev 2.00/0.00, addr 4> on usbus3
> >>>Apr 12 15:32:33 telesto kernel: Root mount waiting for: usbus3
> >>>Apr 12 15:32:33 telesto kernel: uhub6: 3 ports with 2 removable, self
> >>>powered
> >>>Apr 12 15:32:33 telesto kernel: ugen3.5: at usbus3
> >>>Apr 12 15:32:33 telesto kernel: uhub7:
> >>>class 9/0, rev 2.00/0.00, addr 5> on usbus3
> >>>Apr 12 15:32:33 telesto kernel: Root mount waiting for: usbus3
> >>>Apr 12 15:32:33 telesto kernel: uhub7: 3 ports with 2 removable, self
> >>>powered
> >>>Apr 12 15:32:33 telesto kernel: Root mount waiting for: usbus3
> >>>Apr 12 15:32:33 telesto kernel: ugen3.6: at usbus3
> >>>Apr 12 15:32:33 telesto kernel: umass0:
> >>>Reader, class 0/0, rev 2.00/1.91, addr 6> on usbus3
> >>>Apr 12 15:32:33 telesto kernel: Root mount waiting for: usbus3
> >>>Apr 12 15:32:33 telesto kernel: (probe0:umass-sim0:0:0:0): TEST UNIT
> >>>READY. CDB: 0 0 0 0 0 0
> >>>Apr 12 15:32:33 telesto kernel: (probe0:umass-sim0:0:0:0): CAM status:
> >>>SCSI Status Error
> >>>Apr 12 15:32:33 telesto kernel: (probe0:umass-sim0:0:0:0): SCSI status:
> >>>Check Condition
> >>>Apr 12 15:32:33 telesto kernel: (probe0:umass-sim0:0:0:0): SCSI sense:
> >>>NOT READY asc:3a,0 (Medium not present)
> >>>Apr 12 15:32:33 telesto kernel: da0 at umass-sim0 bus 0 scbus14 target 0
> >>>lun 0
> >>>Apr 12 15:32:33 telesto kernel: da0:
> >>>Removable Direct Access SCSI-0 device
> >>>Apr 12 15:32:33 telesto kernel: da0: 40.000MB/s transfers
> >>>Apr 12 15:32:33 telesto kernel: da0: Attempt to query device size
> >>>failed: NOT READY, Medium not present
> >>>Apr 12 15:32:33 telesto kernel: ugen3.7: at usbus3
> >>>Apr 12 15:32:33 telesto kernel: ums0:
> >>>0/0, rev 2.00/56.01, addr 7> on usbus3
> >>>Apr 12 15:32:33 telesto kernel: ums0: 8 buttons and [XYZT] coordinates
> >>>ID=0
> >>>Apr 12 15:32:33 telesto kernel: Trying to mount root from
> >>>ufs:/dev/gpt/root [rw]...
> >>>Apr 12 15:32:33 telesto kernel: nvidia0: on vgapci0
> >>>Apr 12 15:32:33 telesto kernel: vgapci0: child nvidia0 requested
> >>>pci_enable_io
> >>>Apr 12 15:32:33 telesto kernel: vgapci0: child nvidia0 requested
> >>>pci_enable_io
> >>>Apr 12 15:32:33 telesto kernel: vboxdrv: fAsync=0 offMin=0x2d8
> >>>offMax=0x603c
> >>>Apr 12 15:32:33 telesto kernel: module_register: module ng_ether already
> >>>exists!
> >>>Apr 12 15:32:33 telesto kernel: Module ng_ether failed to register: 17
> >>>
> >>Disconnect "Generic Ultra HS-SD/MMC" device which is presenting
> >>da0...same problem here. System will boot if da0 is either not present
> >>or has media (I think). In my case it was a different card reader that
> >>had no cards in it, which seems to be similar to your case.
> >>
> >>My guess is that this problem is related to recent changes in da, but I
> >>couldn't pinpoint in the diff what's going wrong in a quick look.
> >
> >So did you try to revert r234177 and/or r233963?
>
> I just updated my system to r234342, only downgraded
> /usr/src/sys/cam/scsi/scsi_da.c to r233746, and now the system is
> booting again. So obviously there is something wrong with the newest
> patch to scsi_da.c.

That is too broad. Try to revert exactly one patch and see whether it works.

--uCh/rgkw4ZLIO80s
Content-Type: application/pgp-signature
Content-Disposition: inline

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (FreeBSD)

iEYEARECAAYFAk+MV4UACgkQC3+MBN1Mb4gKeACgvJSRC5AgXXKTKlU4wOuodyJO
5pcAoOQ0CX68lk34iVJ+t+HeZq5XdEVT
=dhHQ
-----END PGP SIGNATURE-----

--uCh/rgkw4ZLIO80s--
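
A note on the trap itself: the faulting instruction in the report is a divl
with a memory operand, and the reports in this thread suggest the crash only
happens when da0 has no medium. One plausible reading, which nothing in this
thread confirms, is that the divisor is a sector size left at zero for an
empty drive. The C sketch below only illustrates the kind of guard that would
avoid such a fault. `struct provider`, `can_taste` and the field names are
hypothetical and are not taken from the GEOM or CAM sources.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for a storage provider. The field names are
 * illustrative and are not copied from the kernel sources.
 */
struct provider {
	const char *name;
	uint32_t sectorsize;	/* assumed to be 0 when no medium is present */
};

/*
 * Return 1 if a superblock of sb_size bytes spans a whole number of
 * sectors, 0 otherwise. The zero check must come before the modulo:
 * on amd64 an integer division by zero raises a divide fault.
 */
static int
can_taste(const struct provider *pp, uint32_t sb_size)
{
	assert(pp != NULL);
	if (pp->sectorsize == 0)
		return (0);	/* no medium: skip instead of dividing */
	return (sb_size % pp->sectorsize == 0);
}
```

With this guard, a call such as can_taste() on a provider whose sectorsize is
0 returns 0 rather than reaching the division.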