Date: Wed, 3 Jul 2024 08:24:14 +0900
From: Tomoaki AOKI <junchoon@dec.sakura.ne.jp>
To: stable@freebsd.org
Subject: Re: x11/nvidia-driver fails on 14-STABLE/amd64
Message-ID: <20240703082414.572553dabee65d0f6dd129a1@dec.sakura.ne.jp>
In-Reply-To: <2458ffc88ffac503076c06cccafa0dc0@chen.org.nz>
References: <2458ffc88ffac503076c06cccafa0dc0@chen.org.nz>

On Tue, 02 Jul 2024 22:11:45 +0000
jonc@chen.org.nz wrote:

> Hi,
>
> I updated my STABLE-14/amd64 to 1a0314d6e30554fc2b07caa5121b00956f416cc4 (ctladm: Fix a race....), and it appears that the latest kernel update breaks x11/nvidia-driver. The system panics when X starts up. Just to be sure I have rebuilt and reinstalled x11/nvidia-driver with the updated /usr/src present. /var/log/messages has the following errors:
>
> Jul 3 09:50:29 stormbringer kernel: ACPI Warning: \_SB.PC00.PEG1.PEGP._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20221020/nsarguments-212)
> Jul 3 09:50:29 stormbringer kernel: Firmware Error (ACPI): Failure creating named object [\_SB.PC00.PEG1.PEGP._DSM.USRG], AE_ALREADY_EXISTS (20221020/dsfield-352)
> Jul 3 09:50:29 stormbringer kernel: ACPI Error: AE_ALREADY_EXISTS, CreateBufferField failure (20221020/dswload2-639)
> Jul 3 09:50:29 stormbringer kernel: ACPI Error: Aborting method \_SB.PC00.PEG1.PEGP._DSM due to previous error (AE_ALREADY_EXISTS) (20221020/psparse-689)
> Jul 3 09:51:52 stormbringer syslogd: kernel boot file is /boot/kernel/kernel
> Jul 3 09:51:52 stormbringer kernel: NVRM: GPU at PCI:0000:01:00: GPU-db6a2e9b-ba08-3668-c104-d55596af9efb
> Jul 3 09:51:52 stormbringer kernel: NVRM: Xid (PCI:0000:01:00): 79, pid='<unknown>', name=<unknown>, GPU has fallen off the bus.
> Jul 3 09:51:52 stormbringer kernel: NVRM: GPU 0000:01:00.0: GPU has fallen off the bus.
> Jul 3 09:51:52 stormbringer kernel:
> Jul 3 09:51:52 stormbringer syslogd: last message repeated 1 times
> Jul 3 09:51:52 stormbringer kernel: Fatal trap 12: page fault while in kernel mode
> Jul 3 09:51:52 stormbringer kernel: cpuid = 14; apic id = 38
> Jul 3 09:51:52 stormbringer kernel: fault virtual address = 0x0
> Jul 3 09:51:52 stormbringer kernel: fault code = supervisor read data, page not present
> Jul 3 09:51:52 stormbringer kernel: instruction pointer = 0x20:0xffffffff85bae56c
> Jul 3 09:51:52 stormbringer kernel: stack pointer = 0x28:0xfffffe01a894e5e0
> Jul 3 09:51:52 stormbringer kernel: frame pointer = 0x28:0xfffffe01adc85ce0
> Jul 3 09:51:52 stormbringer kernel: code segment = base 0x0, limit 0xfffff, type 0x1b
> Jul 3 09:51:52 stormbringer kernel: = DPL 0, pres 1, long 1, def32 0, gran 1
> Jul 3 09:51:52 stormbringer kernel: processor eflags = interrupt enabled, resume, IOPL = 0
> Jul 3 09:51:52 stormbringer kernel: current process = 1954 (Xorg)
> Jul 3 09:51:52 stormbringer kernel: rdi: fffffe01a951f000 rsi: fffffe01ae26f000 rdx: 0000000000000001
> Jul 3 09:51:52 stormbringer kernel: rcx: 0000000000000000 r8: 00000000000000c0 r9: fffffe01adc858f0
> Jul 3 09:51:52 stormbringer kernel: rax: 0000000000000000 rbx: fffffe01ae26f000 rbp: fffffe01adc85ce0
> Jul 3 09:51:52 stormbringer kernel: r10: 000000005237a738 r11: 0000000066847626 r12: 0000000000000000
> Jul 3 09:51:52 stormbringer kernel: r13: fffffe01a951f000 r14: 0000000000000001 r15: fffffe01ade09008
> Jul 3 09:51:52 stormbringer kernel: trap number = 12
> Jul 3 09:51:52 stormbringer kernel: panic: page fault
> Jul 3 09:51:52 stormbringer kernel: cpuid = 14
> Jul 3 09:51:52 stormbringer kernel: time = 1719957030
> Jul 3 09:51:52 stormbringer kernel: KDB: stack backtrace:
> Jul 3 09:51:52 stormbringer kernel: #0 0xffffffff80b8002d at kdb_backtrace+0x5d
> Jul 3 09:51:52 stormbringer kernel: #1 0xffffffff80b32c51 at vpanic+0x131
> Jul 3 09:51:52 stormbringer kernel: #2 0xffffffff80b32b13 at panic+0x43
> Jul 3 09:51:52 stormbringer kernel: #3 0xffffffff8100194b at trap_fatal+0x40b
> Jul 3 09:51:52 stormbringer kernel: #4 0xffffffff81001996 at trap_pfault+0x46
> Jul 3 09:51:52 stormbringer kernel: #5 0xffffffff80fd8458 at calltrap+0x8
>
> When I reverted to my previous kernel, X started up without any issues.
>
> Cheers
> --
> Jonathan Chen <jonc@chen.org.nz>

Did you try rebuilding x11/nvidia-driver from ports AFTER INSTALLING THE NEW KERNEL AND WORLD?

If yes, one of the commits AFTER commit 620a6a54bb7bb6e1c5607092b6ec49e353e0925f [1] must have broken something. I'm on that commit, and x11/nvidia-driver 555.58 (built with DISTVERSION overridden and NO_CHECKSUM=YES set, to try this new feature branch of the driver) is working fine.

In that case, if your old build is older than this commit and you want the fix for FreeBSD-SA-24:04.openssh, the above-mentioned commit is worth trying.

Additional note:
If you are using the graphics/nvidia-drm-[515|61]-kmod ports, you need to apply the patch attached to Bug 279539 [2] to build them. And if you want to test the 555 series of the driver with nvidia-drm*-kmod, you also need to apply the diff at Differential revision D45400 on Phabricator [3].

[1] https://cgit.freebsd.org/src/commit/?h=stable/14&id=620a6a54bb7bb6e1c5607092b6ec49e353e0925f
[2] https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=279539
[3] https://reviews.freebsd.org/D45400

-- 
Tomoaki AOKI <junchoon@dec.sakura.ne.jp>
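
For reference, the rebuild described in the reply could look roughly like the sketch below. It is an illustration under assumptions, not the exact commands used in the thread: the helper names (`run`, `rebuild_nvidia_driver`) and the `DRY_RUN` switch are hypothetical, the ports tree is assumed to live in /usr/ports, and the target order (`build deinstall install clean`) is just one common ports idiom. DISTVERSION and NO_CHECKSUM=YES are the two knobs named in the message; whether a given overridden version builds cleanly is not guaranteed.

```shell
#!/bin/sh
# Sketch: rebuild x11/nvidia-driver after installing a new kernel and world.
# Assumptions: ports tree in /usr/ports, /usr/src matches the installed kernel.
# DRY_RUN defaults to 1, so commands are only printed, never executed.
PORTSDIR="${PORTSDIR:-/usr/ports}"
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '%s\n' "+ $*"
    else
        "$@"
    fi
}

# Hypothetical helper: optional first argument is a driver version to try.
rebuild_nvidia_driver() {
    version="${1:-}"
    port="$PORTSDIR/x11/nvidia-driver"

    # Show installed kernel, running kernel and userland versions first.
    run freebsd-version -kru

    if [ -n "$version" ]; then
        # Version override as described in the message; checksums are skipped
        # because the port's distinfo does not know the overridden distfile.
        run make -C "$port" DISTVERSION="$version" NO_CHECKSUM=YES \
            build deinstall install clean
    else
        run make -C "$port" build deinstall install clean
    fi
}

rebuild_nvidia_driver "$@"
```

Running the script with no arguments prints the commands for a plain rebuild; passing `555.58` prints the overridden build. Set `DRY_RUN=0` only after booting the new kernel, so the module is built against the sources that match it.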