Date: Thu, 11 Aug 2016 16:37:55 -0700
From: Adrian Chadd <adrian.chadd@gmail.com>
To: Bryan Drewery <bdrewery@freebsd.org>
Cc: "stable@freebsd.org" <stable@freebsd.org>, Andriy Voskoboinyk <avos@freebsd.org>
Subject: Re: Panic in stable/11 (amd64) @r303903: page fault while in kernel mode
Message-ID: <CAJ-Vmo=tx9NgqQHjnh91rbR5nfx%2BWytBTRC7%2B5hECGL9KY4Wsg@mail.gmail.com>
In-Reply-To: <CAJ-Vmon6Rak07ux%2BZX8ySnxkgn5Sv9Jtcus4SSv=J_sXTWQ%2BgQ@mail.gmail.com>
References: <20160810165458.GB1112@albert.catwhisker.org> <570bda1e-d4d7-42dc-6037-7c321ba9e97d@FreeBSD.org> <CAJ-Vmon6Rak07ux%2BZX8ySnxkgn5Sv9Jtcus4SSv=J_sXTWQ%2BgQ@mail.gmail.com>

.. and maybe we should revert or comment out the code until we figure
out what to do about LLADDR checks.
(I see this in the detach path too; same kind of race. Sigh.)
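
As a rough illustration of the kind of guard that could stand in for the
unconditional IF_LLADDR(ifp) dereference in the quoted diff, here is a minimal
userland sketch. Everything in it is an assumption made for illustration:
model_ifnet, model_vap and model_sync_myaddr are hypothetical stand-ins, not
net80211 or ifnet code. It only shows the "read the pointer once and bail out
if it is NULL" shape, which is consistent with the fault address of 0x0 in the
report. A NULL check alone does not close the race or cover the use-after-free
case; that would still need a lock or reference held across the access.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ADDR_LEN 6	/* length of an 802.11 MAC address in bytes */

/* Hypothetical stand-ins for struct ifnet and struct ieee80211vap. */
struct model_ifnet {
	int		 up;		/* models (if_flags & IFF_UP) != 0 */
	const uint8_t	*lladdr;	/* models IF_LLADDR(ifp); may be NULL */
};

struct model_vap {
	uint8_t		 myaddr[ADDR_LEN];	/* models vap->iv_myaddr */
};

/*
 * Copy the interface's link-level address into the vap only when the
 * interface is down, the address pointer is non-NULL, and the two
 * addresses differ.  Returns true if the vap's address was updated.
 */
static bool
model_sync_myaddr(struct model_vap *vap, const struct model_ifnet *ifp)
{
	const uint8_t *lla;

	if (ifp == NULL || ifp->up)
		return (false);
	lla = ifp->lladdr;	/* read the pointer once, then test it */
	if (lla == NULL)
		return (false);
	if (memcmp(vap->myaddr, lla, ADDR_LEN) == 0)
		return (false);
	memcpy(vap->myaddr, lla, ADDR_LEN);
	return (true);
}
```

In the real kernel the guard would have to be paired with whatever lifecycle
rule ends up being defined for lladdr access; the sketch says nothing about
which lock or reference that should be.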
-adrian
On 11 August 2016 at 16:37, Adrian Chadd <adrian.chadd@gmail.com> wrote:
> Eep. Is this another case where there's a race and ifp is NULL or the
> ll pointer for ifp is NULL or use-after-free'd?
>
> I remember bumping into these here and there because we don't seem to
> have a well defined lifecycle for lladdr access. ;(
>
>
> -adrian
>
>
> On 10 August 2016 at 12:10, Bryan Drewery <bdrewery@freebsd.org> wrote:
>> On 8/10/16 9:54 AM, David Wolfskill wrote:
>>> Happened after a few iterations of {"pkill dhclient" followed by
>>> "dhclient wlan0"}.
>>>
>>> Gory details (both "normal" and gzipped, and including the crash
>>> dump and crashinfo) are in
>>> <http://www.catwhisker.org/~david/FreeBSD/stable_11/2016.08.10/>.
>>>
>>> Summary:
>>> Wed Aug 10 15:56:26 UTC 2016
>>>
>>> FreeBSD 11.0-BETA4 FreeBSD 11.0-BETA4 #69 r303902M/303903:1100120: Wed Aug 10 04:00:09 PDT 2016 root@g1-252.catwhisker.org:/common/S3/obj/usr/src/sys/CANARY amd64
>>>
>>> panic: page fault
>>>
>>> GNU gdb 6.1.1 [FreeBSD]
>>> Copyright 2004 Free Software Foundation, Inc.
>>> GDB is free software, covered by the GNU General Public License, and you are
>>> welcome to change it and/or distribute copies of it under certain conditions.
>>> Type "show copying" to see the conditions.
>>> There is absolutely no warranty for GDB. Type "show warranty" for details.
>>> This GDB was configured as "amd64-marcel-freebsd"...
>>>
>>> Unread portion of the kernel message buffer:
>>>
>>>
>>> Fatal trap 12: page fault while in kernel mode
>>> cpuid = 7; apic id = 07
>>> fault virtual address = 0x0
>>> fault code = supervisor read data, page not present
>>> instruction pointer = 0x20:0xffffffff80bdaaa1
>>> stack pointer = 0x28:0xfffffe060bc956e0
>>> frame pointer = 0x28:0xfffffe060bc957b0
>>> code segment = base 0x0, limit 0xfffff, type 0x1b
>>> = DPL 0, pres 1, long 1, def32 0, gran 1
>>> processor eflags = interrupt enabled, resume, IOPL = 0
>>> current process = 20685 (wpa_supplicant)
>>> trap number = 12
>>> panic: page fault
>>> cpuid = 7
>>> KDB: stack backtrace:
>>> #0 0xffffffff80add787 at kdb_backtrace+0x67
>>> #1 0xffffffff80a950e2 at vpanic+0x182
>>> #2 0xffffffff80a94f53 at panic+0x43
>>> #3 0xffffffff80eead51 at trap_fatal+0x351
>>> #4 0xffffffff80eeaf43 at trap_pfault+0x1e3
>>> #5 0xffffffff80eea4ec at trap+0x26c
>>> #6 0xffffffff80ece0d1 at calltrap+0x8
>>> #7 0xffffffff80b9811c at ifioctl+0x133c
>>> #8 0xffffffff80afc914 at kern_ioctl+0x2d4
>>> #9 0xffffffff80afc5d1 at sys_ioctl+0x171
>>> #10 0xffffffff80eeb6c9 at amd64_syscall+0x4e9
>>> #11 0xffffffff80ece3bb at Xfast_syscall+0xfb
>>> Uptime: 3h0m4s
>>> ...
>>> Reading symbols from /boot/kernel/linux64.ko...Reading symbols from /usr/lib/debug//boot/kernel/linux64.ko.debug...done.
>>> done.
>>> Loaded symbols for /boot/kernel/linux64.ko
>>> #0 doadump (textdump=<value optimized out>) at pcpu.h:221
>>> 221 pcpu.h: No such file or directory.
>>> in pcpu.h
>>> (kgdb) #0 doadump (textdump=<value optimized out>) at pcpu.h:221
>>> #1 0xffffffff80a94b69 in kern_reboot (howto=260)
>>> at /usr/src/sys/kern/kern_shutdown.c:366
>>> #2 0xffffffff80a9511b in vpanic (fmt=<value optimized out>,
>>> ap=<value optimized out>) at /usr/src/sys/kern/kern_shutdown.c:759
>>> #3 0xffffffff80a94f53 in panic (fmt=0x0)
>>> at /usr/src/sys/kern/kern_shutdown.c:690
>>> #4 0xffffffff80eead51 in trap_fatal (frame=0xfffffe060bc95630, eva=0)
>>> at /usr/src/sys/amd64/amd64/trap.c:841
>>> #5 0xffffffff80eeaf43 in trap_pfault (frame=0xfffffe060bc95630, usermode=0)
>>> at /usr/src/sys/amd64/amd64/trap.c:691
>>> #6 0xffffffff80eea4ec in trap (frame=0xfffffe060bc95630)
>>> at /usr/src/sys/amd64/amd64/trap.c:442
>>> #7 0xffffffff80ece0d1 in calltrap ()
>>> at /usr/src/sys/amd64/amd64/exception.S:236
>>> #8 0xffffffff80bdaaa1 in ieee80211_ioctl (ifp=0xfffff80007991800,
>>> cmd=<value optimized out>, data=<value optimized out>)
>>> at /usr/src/sys/net80211/ieee80211_ioctl.c:3398
>>
>> The code crashing is quite recent:
>>
>>> commit c6321695321bae43c0cd024db564c5207a7e8e31
>>> Author: avos <avos@ccf9f872-aa2e-dd11-9fc8-001c23d0bc1f>
>>> Date: Mon May 2 20:46:05 2016 +0000
>>>
>>> net80211: fix MAC address change via SIOCSIFLLADDR ioctl.
>>>
>>> Recheck MAC address on SIOCSIFFLAGS; as a result,
>>> 'ifconfig wlan0 ether <addr>' can be used after interface startup.
>>>
>>> PR: 208933
>>>
>>>
>>> git-svn-id: svn+ssh://svn.freebsd.org/base/head@298941 ccf9f872-aa2e-dd11-9fc8-001c23d0bc1f
>>>
>>> diff --git sys/net80211/ieee80211_ioctl.c sys/net80211/ieee80211_ioctl.c
>>> index c3b02e8..823906b 100644
>>> --- sys/net80211/ieee80211_ioctl.c
>>> +++ sys/net80211/ieee80211_ioctl.c
>>> @@ -3382,8 +3382,18 @@ ieee80211_ioctl(struct ifnet *ifp, u_long cmd, caddr_t data)
>>> }
>>> IEEE80211_UNLOCK(ic);
>>> /* Wait for parent ioctl handler if it was queued */
>>> - if (wait)
>>> + if (wait) {
>>> ieee80211_waitfor_parent(ic);
>>> +
>>> + /*
>>> + * Check if the MAC address was changed
>>> + * via SIOCSIFLLADDR ioctl.
>>> + */
>>> + if ((ifp->if_flags & IFF_UP) == 0 &&
>>> + !IEEE80211_ADDR_EQ(vap->iv_myaddr, IF_LLADDR(ifp)))
>>> + IEEE80211_ADDR_COPY(vap->iv_myaddr,
>>> + IF_LLADDR(ifp));
>>> + }
>>> break;
>>> case SIOCADDMULTI:
>>> case SIOCDELMULTI:
>>
>>
>>> #9 0xffffffff80b9811c in ifioctl (so=<value optimized out>,
>>> cmd=<value optimized out>, data=<value optimized out>,
>>> td=<value optimized out>) at /usr/src/sys/net/if.c:2447
>>> #10 0xffffffff80afc914 in kern_ioctl (td=<value optimized out>,
>>> fd=<value optimized out>, com=2149607696, data=0xfffffe060bc958e0 "wlan0")
>>> at file.h:327
>>> #11 0xffffffff80afc5d1 in sys_ioctl (td=<value optimized out>,
>>> uap=0xfffffe060bc95a40) at /usr/src/sys/kern/sys_generic.c:743
>>> #12 0xffffffff80eeb6c9 in amd64_syscall (td=<value optimized out>,
>>> traced=<value optimized out>) at subr_syscall.c:135
>>> #13 0xffffffff80ece3bb in Xfast_syscall ()
>>> at /usr/src/sys/amd64/amd64/exception.S:396
>>> #14 0x00000008015c448a in ?? ()
>>> Previous frame inner to this frame (corrupt stack?)
>>> Current language: auto; currently minimal
>>> (kgdb)
>>>
>>> This was on my laptop, which I'm actively using at work as I type
>>> -- though it's now connected via wired NIC (em0). I had experienced
>>> no trouble with wlan0 at home (before coming in to work) or on the
>>> bus (en route to work). (I didn't attempt it while cycling to the
>>> bus stop. :-})
>>>
>>> Also, I had no issues running stable/11 (amd64) @303870 -- either
>>> at home or at work -- yesterday. On the other hand, this is (so
>>> far) a one-off, so alleging a "pattern" at this point is not something
>>> I'm willing to do.
>>>
>>> Peace,
>>> david
>>>
>>
>>
>> --
>> Regards,
>> Bryan Drewery
>>
