From nobody Sun Oct 8 02:27:19 2023
X-Original-To: bugs@mlmmj.nyi.freebsd.org
From: bugzilla-noreply@freebsd.org
To: bugs@FreeBSD.org
Subject: [Bug 274346] kernel panic/page fault in nfs_commonkrpc.c::newnfs_request(), due to duplicate hostid's
Date: Sun, 08 Oct 2023 02:27:19 +0000
X-Bugzilla-Reason: AssignedTo
X-Bugzilla-Type: new
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: Base System
X-Bugzilla-Component: kern
X-Bugzilla-Version: 14.0-STABLE
X-Bugzilla-Keywords:
X-Bugzilla-Severity: Affects Only Me
X-Bugzilla-Who: freebsd@kumba.dev
X-Bugzilla-Status: New
X-Bugzilla-Resolution:
X-Bugzilla-Priority: ---
X-Bugzilla-Assigned-To: bugs@FreeBSD.org
X-Bugzilla-Flags:
X-Bugzilla-Changed-Fields: bug_id short_desc product version rep_platform op_sys bug_status bug_severity priority component assigned_to reporter
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/
Auto-Submitted: auto-generated
List-Id: Bug reports
List-Archive: https://lists.freebsd.org/archives/freebsd-bugs
Sender: owner-freebsd-bugs@freebsd.org
MIME-Version: 1.0

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=274346

            Bug ID: 274346
           Summary: kernel panic/page fault in
                    nfs_commonkrpc.c::newnfs_request(), due to duplicate
                    hostid's
           Product: Base System
           Version: 14.0-STABLE
          Hardware: Any
                OS: Any
            Status: New
          Severity: Affects Only Me
          Priority: ---
         Component: kern
          Assignee: bugs@FreeBSD.org
          Reporter: freebsd@kumba.dev

I have managed to trigger a kernel panic in the NFS subsystem on 14.0-BETA5.
It is partly due to a mistake of my own: I did not change the hostid of a
system that is a clone of another active system. The first system is running
13.2-RELEASE-p4 and the cloned system is running 14.0-BETA5, newly upgraded.

There are several elements that lead to this panic:
  - Both systems have the same hostid
  - Both systems mount the same remote NFS share from a third system
  - Have the 13.2-RELEASE-p4 system start doing a job on the remote share,
    like compiling code (e.g., /usr/ports is on this share)
  - Have the cloned system running 14.0-BETA5 attempt to unmount the remote
    share
  - The 14.0-BETA5 system will crash

I know it's due to duplicate hostids, because the message below is printed on
the console immediately before the kernel crashes:
> 
> Initiate recovery. If server has not rebooted, check NFS clients for unique /etc/hostid's
> 

The printf() for that exact string is in the crashing function, right where
GDB says the crash happens: nfs_commonkrpc.c, function newnfs_request(),
line 1212. I'm not sure whether the fault is in the if statement immediately
preceding the printf() call or in the if statement that follows it. The next
call in the machine code is memcmp(), so I am assuming a NULL dereference of
some kind.

My kernel is a custom build, but this can be triggered on a GENERIC kernel as
well. My first crash happened on GENERIC, right before I was set to reboot
into my rebuilt custom kernel after doing the second `freebsd-update install`
phase to upgrade to 14.0-BETA5.
At that time, I had crashdumps disabled. So the below crash info is from that
custom kernel, after I enabled crashdumps and re-triggered the crash (it's at
least reproducible...):

> Unread portion of the kernel message buffer:
> [179]
> [179]
> [179] Fatal trap 12: page fault while in kernel mode
> [179] cpuid = 0; apic id = 00
> [179] fault virtual address = 0x4
> [179] fault code = supervisor read data, page not present
> [179] instruction pointer = 0x20:0xffffffff809e9893
> [179] stack pointer = 0x28:0xfffffe00a233e800
> [179] frame pointer = 0x28:0xfffffe00a233e800
> [179] code segment = base 0x0, limit 0xfffff, type 0x1b
> [179] = DPL 0, pres 1, long 1, def32 0, gran 1
> [179] processor eflags = interrupt enabled, resume, IOPL = 0
> [179] current process = 87256 (umount)
> [179] rdi: fffff800077761e4 rsi: 0000000000000004 rdx: 0000000000000010
> [179] rcx: 0000000000000000 r8: 0000000000000024 r9: fffffe00a233f000
> [179] rax: 0000000000000000 rbx: fffffe00a251b020 rbp: fffffe00a233e800
> [179] r10: 0000000000000585 r11: 000000007ff9687f r12: fffff80007776010
> [180] r13: fffff80003abb800 r14: fffffe00a233ea18 r15: fffff80007776000
> [180] trap number = 12
> [180] panic: page fault
> [180] cpuid = 0
> [180] time = 1696723338
> [180] KDB: stack backtrace:
> [180] #0 0xffffffff806b5edd at kdb_backtrace+0x5d
> [180] #1 0xffffffff8066aa20 at vpanic+0x130
> [180] #2 0xffffffff8066a8e3 at panic+0x43
> [180] #3 0xffffffff809ee34c at trap_fatal+0x40c
> [180] #4 0xffffffff809ee39e at trap_pfault+0x4e
> [180] #5 0xffffffff809c6288 at calltrap+0x8
> [180] #6 0xffffffff8053f804 at newnfs_request+0x10a4
> [180] #7 0xffffffff8054dbad at nfsrpc_destroysession+0x11d
> [180] #8 0xffffffff80557252 at nfscl_umount+0x312
> [180] #9 0xffffffff80589470 at nfs_unmount+0x70
> [180] #10 0xffffffff8073c4ad at vfs_unmount_sigdefer+0x2d
> [180] #11 0xffffffff80741e37 at dounmount+0x787
> [180] #12 0xffffffff80741645 at kern_unmount+0x2f5
> [180] #13 0xffffffff809eeaf9 at amd64_syscall+0x109
> [180] #14 0xffffffff809c6b9b at fast_syscall_common+0xf8
> [180] Timeout initializing vt_vga
> [180] Uptime: 3m0s
> [180] Dumping 447 out of 8077 MB:..4%..11%..22%..33%..43%..51%..61%..72%..83%..93%
> 
> __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:57
> 57 /usr/src/sys/amd64/include/pcpu_aux.h: No such file or directory.
> (kgdb) #0 __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:57
> #1 doadump (textdump=) at ../../../kern/kern_shutdown.c:405
> #2 0xffffffff8066a5b7 in kern_reboot (howto=260)
>     at ../../../kern/kern_shutdown.c:526
> #3 0xffffffff8066aa8d in vpanic (fmt=0xffffffff80a3bcd1 "%s",
>     ap=ap@entry=0xfffffe00a233e680) at ../../../kern/kern_shutdown.c:970
> #4 0xffffffff8066a8e3 in panic (fmt=)
>     at ../../../kern/kern_shutdown.c:894
> #5 0xffffffff809ee34c in trap_fatal (frame=0xfffffe00a233e740, eva=4)
>     at ../../../amd64/amd64/trap.c:952
> #6 0xffffffff809ee39e in trap_pfault (frame=0xfffffe00a233e740,
>     usermode=false, signo=, ucode=)
>     at ../../../amd64/amd64/trap.c:760
> #7
> #8 memcmp () at ../../../amd64/amd64/support.S:115
> #9 0xffffffff8053f804 in newnfs_request (nd=nd@entry=0xfffffe00a233ea18,
>     nmp=nmp@entry=0xfffff80003abb800, clp=clp@entry=0x0,
>     nrp=nrp@entry=0xfffff80003abbcd8, vp=vp@entry=0x0,
>     td=td@entry=0xfffffe00a251b020, cred=0xfffff8000765aa00, prog=100003,
>     vers=4, retsum=0x0, toplevel=1, xidp=0x0, dssep=0x0)
>     at ../../../fs/nfs/nfs_commonkrpc.c:1212
> #10 0xffffffff8054dbad in nfsrpc_destroysession (
>     nmp=nmp@entry=0xfffff80003abb800, tsep=0xfffff80007776010,
>     tsep@entry=0x0, cred=cred@entry=0xfffff8000765aa00,
>     p=p@entry=0xfffffe00a251b020) at ../../../fs/nfs/nfs_commonsubs.c:5151
> #11 0xffffffff80557252 in nfscl_umount (nmp=nmp@entry=0xfffff80003abb800,
>     p=p@entry=0xfffffe00a251b020, dhp=dhp@entry=0x0)
>     at ../../../fs/nfsclient/nfs_clstate.c:2094
> #12 0xffffffff80589470 in nfs_unmount (mp=0xfffffe00a4058000,
>     mntflags=) at ../../../fs/nfsclient/nfs_clvfsops.c:1903
> #13 0xffffffff8073c4ad in vfs_unmount_sigdefer (mp=0xfffffe00a4058000,
>     mntflags=134217728) at ../../../kern/vfs_init.c:185
> #14 0xffffffff80741e37 in dounmount (mp=0xfffff800077761e4,
>     mp@entry=0xfffffe00a4058000, flags=flags@entry=134217728,
>     td=td@entry=0xfffffe00a251b020) at ../../../kern/vfs_mount.c:2327
> #15 0xffffffff80741645 in kern_unmount (td=0xfffffe00a251b020,
>     path=, flags=134217728) at ../../../kern/vfs_mount.c:1785
> #16 0xffffffff809eeaf9 in syscallenter (td=0xfffffe00a251b020)
>     at ../../../amd64/amd64/../../kern/subr_syscall.c:187
> #17 amd64_syscall (td=0xfffffe00a251b020, traced=0)
>     at ../../../amd64/amd64/trap.c:1197
> #18
> #19 0x0000244bc41489ba in ?? ()
> Backtrace stopped: Cannot access memory at address 0x244bc20f4c18
> (kgdb)