Date: Tue, 21 Dec 2021 12:01:50 -0500
From: Michael Butler via freebsd-current <freebsd-current@freebsd.org>
To: Larry Rosenman <ler@lerctr.org>, Mark Johnston <markj@freebsd.org>
Cc: Alexander Motin <mav@freebsd.org>, Freebsd current <freebsd-current@freebsd.org>
Subject: Re: Panic: Page Fault in Kernel: Yesterday's CURRENT
Message-ID: <311ce6f4-9fa8-0514-193d-6be841af26b2@protected-networks.net>
In-Reply-To: <f6b06d768f31c89931e295da7fff5f6f@lerctr.org>
References: <3d1b5249a2c51670de496fad9e8b054c@lerctr.org> <9852ae04-6dd0-1cd4-13fe-e97c68e71b37@FreeBSD.org> <b73464cb07d071b35590e7ee22bae88a@lerctr.org> <YbzmwWidvjx9%2BjaX@nuc> <f6b06d768f31c89931e295da7fff5f6f@lerctr.org>

I have an old pentium-3 that also won't boot kernels built after Dec 6th.
I suspect the commits listed below but, with the device being remote and
having no DRAC, I'm struggling to test this theory.

The relevant commits ..

commit 553af8f1ec71d397b5b4fd5876622b9269936e63
Author: Mark Johnston <markj@FreeBSD.org>
Date:   Mon Dec 6 10:42:19 2021 -0500

    x86: Perform late TSC calibration before LAPIC timer calibration

commit 62d09b46ad7508ae74d462e49234f0a80f91ff69
Author: Mark Johnston <markj@FreeBSD.org>
Date:   Mon Dec 6 10:42:10 2021 -0500

    x86: Defer LAPIC calibration until after timecounters are available

It's currently running git rev e43d081f352 and I have a kernel at git
rev f06f1d1fdb969fa7a0a6eefa030d8536f365eb6e to test later this evening,

	Michael

On 12/17/21 15:07, Larry Rosenman wrote:
> On 12/17/2021 1:36 pm, Mark Johnston wrote:
>> On Fri, Dec 10, 2021 at 10:43:19AM -0600, Larry Rosenman wrote:
>>> 14-2021_12_07-1217  -   -  1.87G  2021-12-07 12:17
>>> 14-2021_12_09-1957  NR  /  121G   2021-12-09 19:57
>>>
>>> If that's any help
>>
>> I can't tell what this is saying. A kernel built on the 7th does not
>> crash, or...? Which revision did you update from before you started
>> seeing crashes?
>>
>> From a kgdb session it'd be useful to see output from
>>
>> (kgdb) frame 8
>> (kgdb) p/x *tmp
>>
>> to start.
>>
>
> Correct, the 7th didn't panic, but the 9th did, and yesterday's too.
>
> Grrr
> ler in borg in /mnt🔒 on ☁️ (us-east-1)
> ❯ kgdb -c /var/crash/vmcore.0 /mnt/boot/kernel/kernel
> GNU gdb (GDB) 11.1 [GDB v11.1 for FreeBSD]
> Copyright (C) 2021 Free Software Foundation, Inc.
> License GPLv3+: GNU GPL version 3 or later
> <http://gnu.org/licenses/gpl.html>
> This is free software: you are free to change and redistribute it.
> There is NO WARRANTY, to the extent permitted by law.
> Type "show copying" and "show warranty" for details.
> This GDB was configured as "x86_64-portbld-freebsd14.0".
> Type "show configuration" for configuration details.
> For bug reporting instructions, please see:
> <https://www.gnu.org/software/gdb/bugs/>.
> Find the GDB manual and other documentation resources online at:
> <http://www.gnu.org/software/gdb/documentation/>.
>
> For help, type "help".
> Type "apropos word" to search for commands related to "word"...
> Reading symbols from /mnt/boot/kernel/kernel...
> (No debugging symbols found in /mnt/boot/kernel/kernel)
> Failed to open vmcore: /var/crash/vmcore.0: Permission denied
> (kgdb) bt
> No stack.
> quitb)
>
> ler in borg in /mnt🔒 on ☁️ (us-east-1) took 6s
> ❯ sudo chmod +r /var/crash/*
>
> ler in borg in /mnt🔒 on ☁️ (us-east-1)
> ❯ kgdb -c /var/crash/vmcore.0 /mnt/boot/kernel/kernel
> GNU gdb (GDB) 11.1 [GDB v11.1 for FreeBSD]
> Copyright (C) 2021 Free Software Foundation, Inc.
> License GPLv3+: GNU GPL version 3 or later
> <http://gnu.org/licenses/gpl.html>
> This is free software: you are free to change and redistribute it.
> There is NO WARRANTY, to the extent permitted by law.
> Type "show copying" and "show warranty" for details.
> This GDB was configured as "x86_64-portbld-freebsd14.0".
> Type "show configuration" for configuration details.
> For bug reporting instructions, please see:
> <https://www.gnu.org/software/gdb/bugs/>.
> Find the GDB manual and other documentation resources online at:
> <http://www.gnu.org/software/gdb/documentation/>.
>
> For help, type "help".
> Type "apropos word" to search for commands related to "word"...
> Reading symbols from /mnt/boot/kernel/kernel...
> (No debugging symbols found in /mnt/boot/kernel/kernel)
> /wrkdirs/usr/ports/devel/gdb/work-py37/gdb-11.1/gdb/thread.c:1345:
> internal-error: void switch_to_thread(thread_info *): Assertion `thr !=
> NULL' failed.
> A problem internal to GDB has been detected,
> further debugging may prove unreliable.
> Quit this debugging session? (y or n) n
>
> This is a bug, please report it. For instructions, see:
> <https://www.gnu.org/software/gdb/bugs/>.
>
> /wrkdirs/usr/ports/devel/gdb/work-py37/gdb-11.1/gdb/thread.c:1345:
> internal-error: void switch_to_thread(thread_info *): Assertion `thr !=
> NULL' failed.
> A problem internal to GDB has been detected,
> further debugging may prove unreliable.
> Create a core file of GDB? (y or n) n
> Command aborted.
> (kgdb) bt
> No thread selected.
> (kgdb) fr 8
> No thread selected.
> (kgdb)
>
>>> On 12/10/2021 10:36 am, Alexander Motin wrote:
>>> > Hi Larry,
>>> >
>>> > This looks like some use-after-free or otherwise corrupted callout
>>> > structure. Unfortunately the backtrace does not tell what was the
>>> > callout. When was the previous update to look what could change?
>>> >
>>> > On 10.12.2021 11:24, Larry Rosenman wrote:
>>> >> FreeBSD borg.lerctr.org 14.0-CURRENT FreeBSD 14.0-CURRENT #15
>>> >> main-n251537-ab639f2398b: Thu Dec 9 19:45:37 CST 2021
>>> >> root@borg.lerctr.org:/usr/obj/usr/src/amd64.amd64/sys/LER-MINIMAL
>>> >> amd64
>>> >>
>>> >> VMCORE *IS* available.
>>> >>
>>> >>
>>> >>
>>> >>
>>> >> Unread portion of the kernel message buffer:
>>> >> kernel trap 12 with interrupts disabled
>>> >>
>>> >>
>>> >> Fatal trap 12: page fault while in kernel mode
>>> >> cpuid = 0; apic id = 20
>>> >> fault virtual address   = 0x0
>>> >> fault code              = supervisor write data, page not present
>>> >> instruction pointer     = 0x20:0xffffffff804e0db4
>>> >> stack pointer           = 0x0:0xfffffe0434de4e10
>>> >> frame pointer           = 0x0:0xfffffe0434de4e70
>>> >> code segment            = base 0x0, limit 0xfffff, type 0x1b
>>> >>                         = DPL 0, pres 1, long 1, def32 0, gran 1
>>> >> processor eflags        = resume, IOPL = 0
>>> >> current process         = 82990 (c++)
>>> >> trap number             = 12
>>> >> panic: page fault
>>> >> cpuid = 0
>>> >> time = 1639111198
>>> >> KDB: stack backtrace:
>>> >> #0 0xffffffff8050fc95 at kdb_backtrace+0x65
>>> >> #1 0xffffffff804c468f at vpanic+0x17f
>>> >> #2 0xffffffff804c4503 at panic+0x43
>>> >> #3 0xffffffff807a2195 at trap_fatal+0x385
>>> >> #4 0xffffffff807a21ef at trap_pfault+0x4f
>>> >> #5 0xffffffff80779c78 at calltrap+0x8
>>> >> #6 0xffffffff8045ddb8 at handleevents+0x188
>>> >> #7 0xffffffff8045ea3e at timercb+0x24e
>>> >> #8 0xffffffff807ca9eb at lapic_handle_timer+0x9b
>>> >> #9 0xffffffff8077b9b1 at Xtimerint+0xb1
>>> >> Uptime: 2h28m57s
>>> >> Dumping 12829 out of 131023
>>> >> MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91%
>>> >>
>>> >> __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
>>> >> 55              __asm("movq %%gs:%P1,%0" : "=r" (td) : "n"
>>> >> (offsetof(struct pcpu,
>>> >> (kgdb) #0  __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
>>> >> #1  doadump (textdump=<optimized out>)
>>> >>     at /usr/src/sys/kern/kern_shutdown.c:399
>>> >> #2  0xffffffff804c428c in kern_reboot (howto=260)
>>> >>     at /usr/src/sys/kern/kern_shutdown.c:487
>>> >> #3  0xffffffff804c46fe in vpanic (fmt=0xffffffff807e1276 "%s",
>>> >>     ap=<optimized out>) at /usr/src/sys/kern/kern_shutdown.c:920
>>> >> #4  0xffffffff804c4503 in panic (fmt=<unavailable>)
>>> >>     at /usr/src/sys/kern/kern_shutdown.c:844
>>> >> #5  0xffffffff807a2195 in trap_fatal (frame=0xfffffe0434de4d50,
>>> eva=0)
>>> >>     at /usr/src/sys/amd64/amd64/trap.c:946
>>> >> #6  0xffffffff807a21ef in trap_pfault (frame=0xfffffe0434de4d50,
>>> >>     usermode=false, signo=<optimized out>, ucode=<optimized out>)
>>> >>     at /usr/src/sys/amd64/amd64/trap.c:765
>>> >> #7  <signal handler called>
>>> >> #8  0xffffffff804e0db4 in callout_process
>>> >> (now=now@entry=38385536922300)
>>> >>     at /usr/src/sys/kern/kern_timeout.c:488
>>> >> #9  0xffffffff8045ddb8 in handleevents (now=now@entry=38385536922300,
>>> >>     fake=fake@entry=0) at /usr/src/sys/kern/kern_clocksource.c:213
>>> >> #10 0xffffffff8045ea3e in timercb (et=0xffffffff80d475e0 <lapic_et>,
>>> >>     arg=<optimized out>) at /usr/src/sys/kern/kern_clocksource.c:357
>>> >> #11 0xffffffff807ca9eb in lapic_handle_timer
>>> >> (frame=0xfffffe0434de4f40)
>>> >>     at /usr/src/sys/x86/x86/local_apic.c:1364
>>> >> #12 <signal handler called>
>>> >> #13 0x000000080df42bb6 in ?? ()
>>> >> Backtrace stopped: Cannot access memory at address 0x7ffffdef2c90
>>> >> (kgdb)
>