Date: Sun, 6 Nov 2016 13:20:36 +0100
From: "Hartmann, O." <ohartman@zedat.fu-berlin.de>
To: YongHyeon PYUN <pyunyh@gmail.com>
Cc: FreeBSD CURRENT <freebsd-current@freebsd.org>
Subject: Re: CURRENT: re(4) crashing system
Message-ID: <20161106132036.06add6ca@hermann>
In-Reply-To: <20161031021222.GA1252@michelle.fasterthan.co.kr>
References: <20161023132538.6bf55fb2@hermann>
 <20161024051359.GA1185@michelle.fasterthan.co.kr>
 <20161024140337.47af924e@freyja.zeit4.iv.bundesimmobilien.de>
 <20161025020538.GA1238@michelle.fasterthan.co.kr>
 <20161025070338.76ad6711@hermann>
 <20161027010004.GA1215@michelle.fasterthan.co.kr>
 <20161028212113.5c4a2ca2@hermann>
 <20161031021222.GA1252@michelle.fasterthan.co.kr>
[-- Attachment #1 --]
On Mon, 31 Oct 2016 11:12:22 +0900
YongHyeon PYUN <pyunyh@gmail.com> wrote:
> On Fri, Oct 28, 2016 at 09:21:13PM +0200, Hartmann, O. wrote:
> > On Thu, 27 Oct 2016 10:00:04 +0900
> > YongHyeon PYUN <pyunyh@gmail.com> wrote:
> >
> > > On Tue, Oct 25, 2016 at 07:03:38AM +0200, Hartmann, O. wrote:
> > > > On Tue, 25 Oct 2016 11:05:38 +0900
> > > > YongHyeon PYUN <pyunyh@gmail.com> wrote:
> > > >
> > >
> > > [...]
> > >
> > > > > > I'm not sure, but the issue is likely related to
> > > > > > EEE/Green Ethernet handling. EEE is a feature negotiated
> > > > > > with the link partner. If you connect your laptop directly
> > > > > > to a non-EEE-capable link partner, such as another re(4)
> > > > > > box, without any switch in between, you may be able to tell
> > > > > > whether the issue is EEE/Green Ethernet related or not.
> > > >
> > > > > Me neither: when I first discovered a problem with CURRENT,
> > > > > on the Friday before last week's Friday, there was an
> > > > > unlucky coincidence: I got the new switch, FreeBSD
> > > > > introduced a serious bug, and I changed the NICs.
> > > > >
> > > > > The laptop, the last in the row of re(4)-equipped systems on
> > > > > which I use the Realtek NIC, now does well with Green IT
> > > > > technology, but crashes on plugging/unplugging: not on every
> > > > > event, but at least one time in ten.
> > >
> > > > Hmm, it seems you know how to trigger the issue. When you
> > > > unplugged the UTP cable, was there active network traffic on
> > > > the re(4) device? It would be helpful to know which event
> > > > triggers the crash (e.g. unplugging or plugging). And would
> > > > you show me the backtrace of the panic?
> > > >
> > > > > I guess the Green IT issue is more an unlucky guess of mine
> > > > > that went hand in hand with the problem I face with CURRENT
> > > > > right now on some older, non-UEFI machines.
> > > >
> > >
> > > Ok.
> > >
> > > [...]
> > > >
> > > > > As requested, here is the information about re0 and rgephy0
> > > > > on the laptop (Lenovo E540):
> > > >
> > > > [...]
> > > >
> > > > rgephy0: <RTL8251 1000BASE-T media interface> PHY 1 on miibus0
> > > > rgephy0: none, 10baseT, 10baseT-FDX, 10baseT-FDX-flow,
> > > > 100baseTX, 100baseTX-FDX, 100baseTX-FDX-flow, 1000baseT-FDX,
> > > > 1000baseT-FDX-master, 1000baseT-FDX-flow,
> > > > 1000baseT-FDX-flow-master, auto, auto-flow
> > > >
> > > > > re0: <RealTek 8168/8111 B/C/CP/D/DP/E/F/G PCIe Gigabit Ethernet>
> > > > > port 0x3000-0x30ff mem
> > > > > 0xf0d04000-0xf0d04fff,0xf0d00000-0xf0d03fff at device 0.0 on pci2
> > > > > re0: Using 1 MSI-X message
> > > > > re0: ASPM disabled
> > > > > re0: Chip rev. 0x50800000
> > > > > re0: MAC rev. 0x00100000
> > >
> > > > This looks like an 8168GU controller.
> > >
> > > [...]
> > >
> > > > I use options netmap in kernel config, but the problem is also
> > > > present without this option - just for the record.
> > > >
> > >
> > > Yup, netmap(4) has nothing to do with the crash.
> > >
> > > Thanks.
> >
> > Attached, you'll find the backtrace of the crash. This time it was
> > really easy - just one pull of the LAN cabling - and we are
> > happy :-/
> >
> > > Please let me know if you need anything else. I will return to
> > > normal operations (disabling debugging) because CURRENT is very
> > > unstable at the moment on other hosts beyond r307157.
> >
>
> It seems the attachment was stripped.
This time I hope I got it right!
Attached you'll find the backtrace from the latest CURRENT for the provoked
crash (plug and unplug).
I also saved the kernel and the core dump, so if you need me to do further
investigation, please let me know.
Thanks in advance and kind regards,
oliver
[-- Attachment #2 --]
Sun Nov 6 13:14:13 CET 2016
FreeBSD hermann 12.0-CURRENT FreeBSD 12.0-CURRENT #25 r308359: Sun Nov 6 10:02:27 CET 2016 root@hermann:/usr/obj/usr/src/sys/HERMANN amd64
panic:
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd"...
Unread portion of the kernel message buffer:
Kernel page fault with the following non-sleepable locks held:
shared rw udpinp (udpinp) r = 0 (0xfffff8000bcb08a8) locked @ /usr/src/sys/netinet/udp_usrreq.c:1176
stack backtrace:
#0 0xffffffff806d7530 at witness_debugger+0x70
#1 0xffffffff806d88c7 at witness_warn+0x477
#2 0xffffffff80a04607 at trap_pfault+0x57
#3 0xffffffff80a03d1c at trap+0x28c
#4 0xffffffff809e4c11 at calltrap+0x8
#5 0xffffffff8066b62e at _rw_wlock_cookie+0x6e
#6 0xffffffff80824f7c at ip_output+0x48c
#7 0xffffffff808af8fb at udp_send+0xb4b
#8 0xffffffff8070ea98 at sosend_dgram+0x368
#9 0xffffffff80715946 at kern_sendit+0x296
#10 0xffffffff80715c8f at sendit+0x19f
#11 0xffffffff80715add at sys_sendto+0x4d
#12 0xffffffff80a04daf at amd64_syscall+0x32f
#13 0xffffffff809e4efb at Xfast_syscall+0xfb
Fatal trap 12: page fault while in kernel mode
cpuid = 2; apic id = 02
fault virtual address = 0x8
fault code = supervisor read data, page not present
instruction pointer = 0x20:0xffffffff806d655f
stack pointer = 0x28:0xfffffe046fa63470
frame pointer = 0x28:0xfffffe046fa634e0
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 592 (unbound)
Reading symbols from /boot/kernel/acpi_video.ko...Reading symbols from /usr/lib/debug//boot/kernel/acpi_video.ko.debug...done.
done.
Loaded symbols for /boot/kernel/acpi_video.ko
Reading symbols from /boot/kernel/acpi_wmi.ko...Reading symbols from /usr/lib/debug//boot/kernel/acpi_wmi.ko.debug...done.
done.
Loaded symbols for /boot/kernel/acpi_wmi.ko
Reading symbols from /boot/kernel/drm2.ko...Reading symbols from /usr/lib/debug//boot/kernel/drm2.ko.debug...done.
done.
Loaded symbols for /boot/kernel/drm2.ko
Reading symbols from /boot/kernel/i915kms.ko...Reading symbols from /usr/lib/debug//boot/kernel/i915kms.ko.debug...done.
done.
Loaded symbols for /boot/kernel/i915kms.ko
#0 doadump (textdump=0) at pcpu.h:222
222 pcpu.h: No such file or directory.
in pcpu.h
(kgdb) #0 doadump (textdump=0) at pcpu.h:222
#1 0xffffffff8037952b in db_dump (dummy=<value optimized out>, dummy2=false,
dummy3=0, dummy4=0x0) at /usr/src/sys/ddb/db_command.c:546
#2 0xffffffff80379329 in db_command (cmd_table=<value optimized out>)
at /usr/src/sys/ddb/db_command.c:453
#3 0xffffffff80379084 in db_command_loop ()
at /usr/src/sys/ddb/db_command.c:506
#4 0xffffffff8037c53f in db_trap (type=<value optimized out>,
code=<value optimized out>) at /usr/src/sys/ddb/db_main.c:248
#5 0xffffffff806b5413 in kdb_trap (type=<value optimized out>,
code=<value optimized out>, tf=<value optimized out>)
at /usr/src/sys/kern/subr_kdb.c:654
#6 0xffffffff80a04560 in trap_fatal (frame=0xfffffe046fa633b0, eva=8)
at /usr/src/sys/amd64/amd64/trap.c:796
#7 0xffffffff80a047ad in trap_pfault (frame=0xfffffe046fa633b0, usermode=0)
at /usr/src/sys/amd64/amd64/trap.c:658
#8 0xffffffff80a03d1c in trap (frame=0xfffffe046fa633b0)
at /usr/src/sys/amd64/amd64/trap.c:421
#9 0xffffffff809e4c11 in calltrap ()
at /usr/src/sys/amd64/amd64/exception.S:236
#10 0xffffffff806d655f in witness_checkorder (lock=0xfffff8009a858af8,
flags=9, file=0xffffffff80b5ae9a "/usr/src/sys/netinet/ip_output.c",
line=315, interlock=0x0) at pcpu.h:222
#11 0xffffffff8066b62e in _rw_wlock_cookie (c=0xfffff8009a858b10,
file=0xffffffff80b5ae9a "/usr/src/sys/netinet/ip_output.c", line=315)
at /usr/src/sys/kern/kern_rwlock.c:294
#12 0xffffffff80824f7c in ip_output (m=<value optimized out>,
opt=<value optimized out>, ro=<value optimized out>, flags=0, imo=0x0,
inp=<value optimized out>) at /usr/src/sys/netinet/ip_output.c:315
#13 0xffffffff808af8fb in udp_send (so=<value optimized out>,
flags=<value optimized out>, m=<value optimized out>,
addr=<value optimized out>, control=<value optimized out>,
td=<value optimized out>) at /usr/src/sys/netinet/udp_usrreq.c:1559
#14 0xffffffff8070ea98 in sosend_dgram (so=0xfffff8009a3e4a20,
addr=0xfffff8000b442b20, uio=<value optimized out>,
top=<value optimized out>, control=<value optimized out>,
flags=<value optimized out>, td=<value optimized out>)
at /usr/src/sys/kern/uipc_socket.c:1174
#15 0xffffffff80715946 in kern_sendit (td=<value optimized out>,
s=<value optimized out>, mp=<value optimized out>, flags=0, control=0x0,
segflg=UIO_USERSPACE) at /usr/src/sys/kern/uipc_syscalls.c:811
#16 0xffffffff80715c8f in sendit (td=0xfffff8000b421500,
s=<value optimized out>, mp=0xfffffe046fa63900,
flags=<value optimized out>) at /usr/src/sys/kern/uipc_syscalls.c:736
#17 0xffffffff80715add in sys_sendto (td=0xfffff8009a858af8,
uap=<value optimized out>) at /usr/src/sys/kern/uipc_syscalls.c:853
#18 0xffffffff80a04daf in amd64_syscall (td=0xfffff8000b421500,
traced=<value optimized out>) at subr_syscall.c:135
#19 0xffffffff809e4efb in Xfast_syscall ()
at /usr/src/sys/amd64/amd64/exception.S:396
#20 0x00000008017f5dca in ?? ()
Previous frame inner to this frame (corrupt stack?)
Current language: auto; currently minimal
(kgdb)
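
For anyone reading the trap summary above: a fault virtual address as small
as 0x8 is often what you see when code reads a field at a small offset
through a NULL pointer. The sketch below is only an illustrative heuristic,
not a diagnosis of this panic. The function names and the 4 KiB page size
are assumptions made for the example and do not come from the kernel sources
or from kgdb.

```python
import re

# Assumption for this sketch: 4 KiB pages, as is usual on amd64.
PAGE_SIZE = 4096


def parse_fault_address(panic_text):
    """Return the fault virtual address as an int, or None if it is absent."""
    match = re.search(r"fault virtual address\s*=\s*(0x[0-9a-fA-F]+)",
                      panic_text)
    return int(match.group(1), 16) if match else None


def looks_like_null_deref(address):
    """Heuristic only: an address inside the first page hints at NULL + offset."""
    return address is not None and address < PAGE_SIZE


# Lines copied from the trap summary in the attachment.
sample = """Fatal trap 12: page fault while in kernel mode
fault virtual address = 0x8
fault code = supervisor read data, page not present
"""

addr = parse_fault_address(sample)
print(hex(addr), looks_like_null_deref(addr))
```

Applied to the trap summary in the attachment, this flags 0x8 as a likely
NULL-plus-offset access; which pointer was NULL still has to be worked out
from the core dump.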
