From owner-freebsd-hackers@FreeBSD.ORG Tue Apr 2 15:56:37 2013
Date: Tue, 2 Apr 2013 10:47:20 -0500 (CDT)
From: Bryan Venteicher
To: Gleb Smirnoff
Cc: freebsd-hackers@freebsd.org, Chris Torek
Subject: Re: boot time crash in if_detach_internal()
Message-ID: <1227759852.2041.1364917640905.JavaMail.root@daemoninthecloset.org>
In-Reply-To: <20130402085708.GI76816@FreeBSD.org>
References: <201304010945.r319jJw7027369@elf.torek.net> <20130402085708.GI76816@FreeBSD.org>
List-Id: Technical Discussions relating to FreeBSD

Hi,

----- Original Message -----
> From: "Gleb Smirnoff"
> To: "Chris Torek"
> Cc: freebsd-hackers@freebsd.org
> Sent: Tuesday, April 2, 2013 3:57:08 AM
> Subject: Re: boot time crash in if_detach_internal()
>
> On Mon, Apr 01, 2013 at 03:45:19AM -0600, Chris Torek wrote:
> C> I have been poking about with the bhyve virtualization code in
> C> FreeBSD 10-current, and managed to crash FreeBSD during its
> C> bootstrap process due to the fact that if_detach is called
> C> from boot time configuration code, before the internal domain
> C> system initialization has happened.
> C>
> C> I added the following patch to work around the problem. As
> C> the large comment notes, it might not be quite correct but it
> C> does allow the boot to proceed (of course the "dead" network
> C> device is soon a problem anyway...).
> C>
> C> The fix mirrors (more or less) the code in if_attach_internal().
> C> Feel free to accept, ignore, or modify the patch. :-)
> C>
> C> Chris
> C>
> C> diff --git a/sys/net/if.c b/sys/net/if.c
> C> --- a/sys/net/if.c
> C> +++ b/sys/net/if.c
> C> @@ -845,6 +845,15 @@
> C>
> C>  	if_purgeaddrs(ifp);
> C>
> C> +	/*
> C> +	 * torek: it's not entirely clear to me where and how this
> C> +	 * should go, but if domain_init_status < 2 then there should
> C> +	 * be no inet, inet6, etc items, and this is where the crash
> C> +	 * happens during boot, so let's try this:
> C> +	 */
> C> +	if (domain_init_status < 2)
> C> +		return;
> C> +
> C>  #ifdef INET
> C>  	in_ifdetach(ifp);
> C>  #endif
>
> Can you provide a backtrace that leads to this?
>

It is probably along the lines of ...

vtnet0: cannot setup virtqueue interrupts

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x370
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff8088039b
stack pointer           = 0x28:0xffffffff8182c4b0
frame pointer           = 0x28:0xffffffff8182c550
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 0 (swapper)
[ thread pid 0 tid 100000 ]
Stopped at      __rw_rlock+0x23b:       movl    0x370(%r12),%eax
db> bt
Tracing pid 0 tid 100000 td 0xffffffff814fb200
__rw_rlock() at __rw_rlock+0x23b/frame 0xffffffff8182c550
in_pcbpurgeif0() at in_pcbpurgeif0+0x30/frame 0xffffffff8182c5a0
in_ifdetach() at in_ifdetach+0x1c/frame 0xffffffff8182c5d0
if_detach() at if_detach+0x19b/frame 0xffffffff8182c630
vtnet_attach() at vtnet_attach+0xb63/frame 0xffffffff8182c760
device_attach() at device_attach+0x396/frame 0xffffffff8182c7b0
vtpci_probe_and_attach_child() at vtpci_probe_and_attach_child+0x91/frame 0xffffffff8182c7f0
vtpci_attach() at vtpci_attach+0x23b/frame 0xffffffff8182c830
device_attach() at device_attach+0x396/frame 0xffffffff8182c880
bus_generic_attach() at bus_generic_attach+0x4a/frame 0xffffffff8182c8a0
acpi_pci_attach() at acpi_pci_attach+0x15f/frame 0xffffffff8182c8f0
device_attach() at device_attach+0x396/frame 0xffffffff8182c940
bus_generic_attach() at bus_generic_attach+0x4a/frame 0xffffffff8182c960
acpi_pcib_attach() at acpi_pcib_attach+0x24d/frame 0xffffffff8182c9b0
acpi_pcib_acpi_attach() at acpi_pcib_acpi_attach+0x299/frame 0xffffffff8182ca00
device_attach() at device_attach+0x396/frame 0xffffffff8182ca50
bus_generic_attach() at bus_generic_attach+0x4a/frame 0xffffffff8182ca70
acpi_attach() at acpi_attach+0xdd6/frame 0xffffffff8182cb30
device_attach() at device_attach+0x396/frame 0xffffffff8182cb80
bus_generic_attach() at bus_generic_attach+0x4a/frame 0xffffffff8182cba0
nexus_acpi_attach() at nexus_acpi_attach+0x76/frame 0xffffffff8182cbd0
device_attach() at device_attach+0x396/frame 0xffffffff8182cc20
bus_generic_new_pass() at bus_generic_new_pass+0x116/frame 0xffffffff8182cc50
bus_set_pass() at bus_set_pass+0x8f/frame 0xffffffff8182cc80
configure() at configure+0xa/frame 0xffffffff8182cc90
mi_startup() at mi_startup+0x118/frame 0xffffffff8182ccb0
btext() at btext+0x2c

This is from neel@ for vtnet, but I recently saw the same crash at
work on an igb (on 9.1 or 9-STABLE). I haven't had time to look at
it much.

I'm not sure whether the right answer is for drivers to not call
ether_ifattach() until the point of no failure (in which case lots
of drivers are wrong today) or to initialize the other parts earlier.

> --
> Totus tuus, Glebius.
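
To make the first option above concrete, here is a minimal userspace
sketch of the two attach orderings. It is not FreeBSD kernel code:
every name in it (sketch_softc, sketch_setup_intr, sketch_publish,
sketch_unpublish, sketch_attach_early, sketch_attach_late) is
hypothetical, and sketch_publish()/sketch_unpublish() only stand in
for the role that ether_ifattach() and if_detach() play in the trace.
The assumption is that the failing step (here, interrupt setup) can be
moved ahead of the point where the interface is handed to the stack.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a driver softc; not a FreeBSD structure. */
struct sketch_softc {
	bool resources_ready;	/* fallible setup has completed */
	bool published;		/* interface is visible to the network stack */
	int  detach_calls;	/* times the error path had to un-publish */
};

/* Stand-in for a fallible step such as virtqueue interrupt setup. */
static int
sketch_setup_intr(struct sketch_softc *sc, bool fail)
{
	if (fail)
		return (-1);
	sc->resources_ready = true;
	return (0);
}

/* Plays the role of ether_ifattach() in this model. */
static void
sketch_publish(struct sketch_softc *sc)
{
	sc->published = true;
}

/* Plays the role of if_detach() in this model. */
static void
sketch_unpublish(struct sketch_softc *sc)
{
	sc->published = false;
	sc->detach_calls++;
}

/* Ordering suggested by the trace: publish, then fail, then detach. */
int
sketch_attach_early(struct sketch_softc *sc, bool fail)
{
	assert(sc != NULL);
	sketch_publish(sc);
	if (sketch_setup_intr(sc, fail) != 0) {
		/* Error path must undo the publish; this is the risky call. */
		sketch_unpublish(sc);
		return (-1);
	}
	return (0);
}

/* Alternative: finish every fallible step first, publish last. */
int
sketch_attach_late(struct sketch_softc *sc, bool fail)
{
	assert(sc != NULL);
	if (sketch_setup_intr(sc, fail) != 0)
		return (-1);	/* nothing published, nothing to detach */
	sketch_publish(sc);
	return (0);
}
```

With a forced failure, sketch_attach_early() has to run the detach
path once, while sketch_attach_late() returns the same error without
ever touching it. Whether a real driver can reorder its attach routine
this way depends on what its later setup steps need from the ifnet,
which this sketch does not model.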