Date:      Tue, 2 Apr 2013 10:47:20 -0500 (CDT)
From:      Bryan Venteicher <bryanv@daemoninthecloset.org>
To:        Gleb Smirnoff <glebius@FreeBSD.org>
Cc:        freebsd-hackers@freebsd.org, Chris Torek <torek@torek.net>
Subject:   Re: boot time crash in if_detach_internal()
Message-ID:  <1227759852.2041.1364917640905.JavaMail.root@daemoninthecloset.org>
In-Reply-To: <20130402085708.GI76816@FreeBSD.org>
References:  <201304010945.r319jJw7027369@elf.torek.net> <20130402085708.GI76816@FreeBSD.org>

Hi,


----- Original Message -----
> From: "Gleb Smirnoff" <glebius@FreeBSD.org>
> To: "Chris Torek" <torek@torek.net>
> Cc: freebsd-hackers@freebsd.org
> Sent: Tuesday, April 2, 2013 3:57:08 AM
> Subject: Re: boot time crash in if_detach_internal()
> 
> On Mon, Apr 01, 2013 at 03:45:19AM -0600, Chris Torek wrote:
> C> I have been poking about with the bhyve virtualization code in
> C> FreeBSD 10-current, and managed to crash FreeBSD during its
> C> bootstrap process due to the fact that if_detach is called
> C> from boot time configuration code, before the internal domain
> C> system initialization has happened.
> C>
> C> I added the following patch to work around the problem.  As
> C> the large comment notes, it might not be quite correct but it
> C> does allow the boot to proceed (of course the "dead" network
> C> device is soon a problem anyway...).
> C>
> C> The fix mirrors (more or less) the code in if_attach_internal().
> C> Feel free to accept, ignore, or modify the patch. :-)
> C>
> C> Chris
> C>
> C> diff --git a/sys/net/if.c b/sys/net/if.c
> C> --- a/sys/net/if.c
> C> +++ b/sys/net/if.c
> C> @@ -845,6 +845,15 @@
> C>
> C>  	if_purgeaddrs(ifp);
> C>
> C> +	/*
> C> +	 * torek: it's not entirely clear to me where and how this
> C> +	 * should go, but if domain_init_status < 2 then there should
> C> +	 * be no inet, inet6, etc items, and this is where the crash
> C> +	 * happens during boot, so let's try this:
> C> +	 */
> C> +	if (domain_init_status < 2)
> C> +		return;
> C> +
> C>  #ifdef INET
> C>  	in_ifdetach(ifp);
> C>  #endif
> 
> Can you provide a backtrace that leads to this?
> 

It is probably along the lines of:

...
vtnet0: cannot setup virtqueue interrupts

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x370
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff8088039b
stack pointer           = 0x28:0xffffffff8182c4b0
frame pointer           = 0x28:0xffffffff8182c550
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 0 (swapper)
[ thread pid 0 tid 100000 ]
Stopped at      __rw_rlock+0x23b:       movl    0x370(%r12),%eax
db> bt
Tracing pid 0 tid 100000 td 0xffffffff814fb200
__rw_rlock() at __rw_rlock+0x23b/frame 0xffffffff8182c550
in_pcbpurgeif0() at in_pcbpurgeif0+0x30/frame 0xffffffff8182c5a0
in_ifdetach() at in_ifdetach+0x1c/frame 0xffffffff8182c5d0
if_detach() at if_detach+0x19b/frame 0xffffffff8182c630
vtnet_attach() at vtnet_attach+0xb63/frame 0xffffffff8182c760
device_attach() at device_attach+0x396/frame 0xffffffff8182c7b0
vtpci_probe_and_attach_child() at vtpci_probe_and_attach_child+0x91/frame 0xffffffff8182c7f0
vtpci_attach() at vtpci_attach+0x23b/frame 0xffffffff8182c830
device_attach() at device_attach+0x396/frame 0xffffffff8182c880
bus_generic_attach() at bus_generic_attach+0x4a/frame 0xffffffff8182c8a0
acpi_pci_attach() at acpi_pci_attach+0x15f/frame 0xffffffff8182c8f0
device_attach() at device_attach+0x396/frame 0xffffffff8182c940
bus_generic_attach() at bus_generic_attach+0x4a/frame 0xffffffff8182c960
acpi_pcib_attach() at acpi_pcib_attach+0x24d/frame 0xffffffff8182c9b0
acpi_pcib_acpi_attach() at acpi_pcib_acpi_attach+0x299/frame 0xffffffff8182ca00
device_attach() at device_attach+0x396/frame 0xffffffff8182ca50
bus_generic_attach() at bus_generic_attach+0x4a/frame 0xffffffff8182ca70
acpi_attach() at acpi_attach+0xdd6/frame 0xffffffff8182cb30
device_attach() at device_attach+0x396/frame 0xffffffff8182cb80
bus_generic_attach() at bus_generic_attach+0x4a/frame 0xffffffff8182cba0
nexus_acpi_attach() at nexus_acpi_attach+0x76/frame 0xffffffff8182cbd0
device_attach() at device_attach+0x396/frame 0xffffffff8182cc20
bus_generic_new_pass() at bus_generic_new_pass+0x116/frame 0xffffffff8182cc50
bus_set_pass() at bus_set_pass+0x8f/frame 0xffffffff8182cc80
configure() at configure+0xa/frame 0xffffffff8182cc90
mi_startup() at mi_startup+0x118/frame 0xffffffff8182ccb0
btext() at btext+0x2c

This trace is from neel@ for vtnet, but I recently saw the same crash at
work on an igb (on 9.1 or 9-STABLE). I haven't had time to look into it much.

I'm not sure whether the right answer is for drivers to hold off calling
ether_ifattach() until the point of no failure (in which case lots of
drivers are wrong today), or for the other parts to be initialized earlier.
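
To make the first option concrete, here is a rough userland sketch of the
two attach orderings. It is not kernel code: every fake_* name, the
stack_ready flag (standing in for domain_init_status >= 2) and the crashes
counter are made up for illustration, and the real drivers do much more
than this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userland stand-ins; none of these are the real kernel APIs. */
struct fake_ifnet {
	bool attached;
};

static bool stack_ready = false;	/* stand-in for domain_init_status >= 2 */
static int crashes = 0;			/* detaches done before the stack is ready */

static void
fake_ether_ifattach(struct fake_ifnet *ifp)
{
	assert(ifp != NULL);
	ifp->attached = true;
}

static void
fake_ether_ifdetach(struct fake_ifnet *ifp)
{
	assert(ifp != NULL);
	if (!stack_ready)
		crashes++;	/* models the page fault in the trace above */
	ifp->attached = false;
}

/* Ordering seen in the trace: attach early, detach when a later step fails. */
static int
attach_early(struct fake_ifnet *ifp, bool intr_ok)
{
	fake_ether_ifattach(ifp);
	if (!intr_ok) {
		fake_ether_ifdetach(ifp);
		return (-1);
	}
	return (0);
}

/* Alternative: finish every step that can fail, then attach last. */
static int
attach_late(struct fake_ifnet *ifp, bool intr_ok)
{
	if (!intr_ok)
		return (-1);	/* nothing to undo in the network stack */
	fake_ether_ifattach(ifp);
	return (0);
}
```

With intr_ok false and stack_ready still false, attach_early() bumps the
crashes counter while attach_late() fails without ever touching the
interface, which is the whole argument for moving the attach call last.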

> --
> Totus tuus, Glebius.


