Date: Wed, 17 Jan 2024 12:07:13 +0200
From: Konstantin Belousov
To: Ulrich Spörlein
Cc: stable@freebsd.org, Rick Macklem
Subject: Re: Repeatable nfs_readdir kernel panic after upgrade to stable/14
List-Id: Production branch of FreeBSD source code
List-Archive: https://lists.freebsd.org/archives/freebsd-stable

On Wed, Jan 17, 2024 at 10:28:01AM +0100, Ulrich Spörlein wrote:
> Hey there,
> upgraded my NFS server and laptop (NFS client) to stable/14 over the
> weekend and now anything "intensive" that reads from NFS seems to kernel
> panic.
>
> I think this started when I upgraded the server first, shrugged it off as
> some overload on the laptop, finished the laptop upgrade to 14 and now
> everytime I open easytag on the NFS automounted directory, or browsing
> photos with geeqie it locks up hard.
>
> Mounts on the client currently look like so:
>
> map /etc/auto_tank on /tank (autofs)
> map -media on /media (autofs)
> 192.168.0.151:/tank/music on /tank/music (nfs, automounted)
>
> I'm not even sure if I'm using NFS3 or 4 or whether I'm using the ZFS based
> one, I've set this up ages ago.
>
> Fatal trap 12: page fault while in kernel mode
> cpuid = 1; apic id = 02
> fault virtual address = 0x89
> fault code = supervisor read data, page not present
> instruction pointer = 0x20:0xffffffff80eee094
> stack pointer = 0x28:0xfffffe01268c0830
> frame pointer = 0x28:0xfffffe01268c0830
> code segment = base 0x0, limit 0xfffff, type 0x1b
> = DPL 0, pres 1, long 1, def32 0, gran 1
> processor eflags = interrupt enabled, resume, IOPL = 0
> current process = 74673 (easytag)
> rdi: 0000000000000000 rsi: ffffffff819bff08 rdx: 0000000000000000
> rcx: 0000000000000000 r8: fffffe003781e0f0 r9: fffff8001ab51740
> rax: 0000000000000000 rbx: fffff8001ab51740 rbp: fffffe01268c0830
> r10: ffffffff00000000 r11: fffffe01268c07b0 r12: fffffe003781e0f0
> r13: fffff8047ac47700 r14: fffffe012ac1ba38 r15: fffff80437cac000
> trap number = 12
> panic: page fault
> cpuid = 1
> time = 1705480771
> KDB: stack backtrace:
> #0 0xffffffff80b9d68d at kdb_backtrace+0x5d
> #1 0xffffffff80b4f95f at vpanic+0x12f
> #2 0xffffffff80b4f823 at panic+0x43
> #3 0xffffffff8102902f at trap_fatal+0x40f
> #4 0xffffffff8102907f at trap_pfault+0x4f
> #5 0xffffffff80ffef48 at calltrap+0x8
> #6 0xffffffff80a3a3fe at ncl_bioread+0xb7e
> #7 0xffffffff80a2c0a0 at nfs_readdir+0x1f0
> #8 0xffffffff80c217aa at vop_sigdefer+0x2a
> #9 0xffffffff81100280 at VOP_READDIR_APV+0x20
> #10 0xffffffff846af5ae at autofs_readdir+0x2ce
> #11 0xffffffff81100280 at VOP_READDIR_APV+0x20
> #12 0xffffffff80c48501 at kern_getdirentries+0x221
> #13 0xffffffff80c488a9 at sys_getdirentries+0x29
> #14 0xffffffff810298d9 at amd64_syscall+0x109
> #15 0xffffffff80fff85b at fast_syscall_common+0xf8
> Uptime: 3m18s
> Dumping 1242 out of 32368
> MB:..2%..11%..21%..31%..42%..51%..61%..71%..82%..91%
>
> I can still access those NFS mounts just fine, can play music off them with
> audacious or just mpv, but easytag will try to recursively read everything
> and presumably puts a lot of stress on the system.
>
> I see there was chatter about this recently, and kib committed something to
> nfsclient, which got merged to stable/14 on the 11th, but my build is from
> the 14th, so presumably I already have this "fix", and it's not working?
>
> I'm on n266311-299e9fe9709a right now, which _is_ after kib's fixes, maybe
> they are not sufficient for stable/14?

You need 7b49e60227f8 which I just pushed.