From: Peter Blok <pblok@bsd4all.org>
To: FreeBSD User
Cc: Cy Schubert, Rick Macklem, Ronald Klop, FreeBSD CURRENT
Subject: Re: NFSv4 crash of CURRENT
Date: Mon, 15 Jan 2024 11:53:31 +0100
Message-Id: <683EF50F-6665-4664-A7CE-1EFE50076FB0@bsd4all.org>
References: <20240113193324.3fd54295@thor.intern.walstatt.dynvpn.de>
 <1369645989.13766.1705178331205@localhost>
 <20240115043412.B6998C8@slippy.cwsent.com>
 <20240115064704.611fe0c4@thor.intern.walstatt.dynvpn.de>
List-Id: Discussions about the use of FreeBSD-current
List-Archive: https://lists.freebsd.org/archives/freebsd-current

Hi,

Forgot to mention I’m on 13-stable. The fix that is causing the crash with automounted NFS is:

commit cc5cda1dbaa907ce52074f47264cc45b5a7d6c8b
Author: Konstantin Belousov <kib@FreeBSD.org>
Date:   Tue Jan 2 00:22:44 2024 +0200

    nfsclient: limit situations when we do unlocked read-ahead by nfsiod

    (cherry picked from commit 70dc6b2ce314a0f32755005ad02802fca7ed186e)

When I remove the fix, the problem is gone. Add it back and the crash happens.

Peter

> On 15 Jan 2024, at 09:31, Peter Blok <pblok@bsd4all.org> wrote:
>
> Hi,
>
> I have a crash on an NFS client running today's stable
> (4c4633fdffbe8e4b6d328c2bc9bb3edacc9ab50a). It is also autofs related.
> Maybe it is the same problem.
>
> I have ports automounted on /am/ports. When I type cd /am/ports/sys and
> press Tab to autocomplete, it crashes with the stack trace below. If I
> mount ports plainly on /usr/ports and do the same, everything works.
> I am using NFSv3.
>
> Peter




> Fatal trap 12: page fault while in kernel mode
> cpuid = 2; apic id = 04
> fault virtual address   = 0x89
> fault code              = supervisor read data, page not present
> instruction pointer     = 0x20:0xffffffff809645d4
> stack pointer           = 0x28:0xfffffe00acadb830
> frame pointer           = 0x28:0xfffffe00acadb830
> code segment            = base 0x0, limit 0xfffff, type 0x1b
>                         = DPL 0, pres 1, long 1, def32 0, gran 1
> processor eflags        = interrupt enabled, resume, IOPL = 0
> current process         = 6869 (csh)
> trap number             = 12
> panic: page fault
> cpuid = 2
> time = 1705306940
> KDB: stack backtrace:
> #0 0xffffffff806232f5 at kdb_backtrace+0x65
> #1 0xffffffff805d7a02 at vpanic+0x152
> #2 0xffffffff805d78a3 at panic+0x43
> #3 0xffffffff809d58ad at trap_fatal+0x38d
> #4 0xffffffff809d58ff at trap_pfault+0x4f
> #5 0xffffffff809af048 at calltrap+0x8
> #6 0xffffffff804c7a7e at ncl_bioread+0xb7e
> #7 0xffffffff804b9d90 at nfs_readdir+0x1f0
> #8 0xffffffff8069c61a at vop_sigdefer+0x2a
> #9 0xffffffff809f8ae0 at VOP_READDIR_APV+0x20
> #10 0xffffffff81ce75de at autofs_readdir+0x2ce
> #11 0xffffffff809f8ae0 at VOP_READDIR_APV+0x20
> #12 0xffffffff806c3002 at kern_getdirentries+0x222
> #13 0xffffffff806c33a9 at sys_getdirentries+0x29
> #14 0xffffffff809d6180 at amd64_syscall+0x110
> #15 0xffffffff809af95b at fast_syscall_common+0xf8
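
For comparing reports like this one, here is a minimal Python sketch of reading a KDB backtrace mechanically. It is an illustration only and not something used in the thread. It assumes the usual "#N 0xADDR at symbol+0xOFF" frame layout, and it assumes that the frame listed right after calltrap is the code that took the fault. The names parse_frames and faulting_frame are hypothetical helpers.

```python
import re

# Frames #5 to #12 copied from the backtrace above.
BACKTRACE = """\
#5 0xffffffff809af048 at calltrap+0x8
#6 0xffffffff804c7a7e at ncl_bioread+0xb7e
#7 0xffffffff804b9d90 at nfs_readdir+0x1f0
#8 0xffffffff8069c61a at vop_sigdefer+0x2a
#9 0xffffffff809f8ae0 at VOP_READDIR_APV+0x20
#10 0xffffffff81ce75de at autofs_readdir+0x2ce
#11 0xffffffff809f8ae0 at VOP_READDIR_APV+0x20
#12 0xffffffff806c3002 at kern_getdirentries+0x222
"""

# Assumed frame layout: "#N 0xADDR at symbol+0xOFF".
FRAME_RE = re.compile(r"^#(\d+)\s+(0x[0-9a-f]+)\s+at\s+(\w+)\+(0x[0-9a-f]+)$")


def parse_frames(text):
    """Return (index, address, symbol, offset) tuples for each frame line."""
    frames = []
    for line in text.splitlines():
        # Tolerate leading "> " quote markers from mail replies.
        match = FRAME_RE.match(line.lstrip("> ").strip())
        if match:
            idx, addr, sym, off = match.groups()
            frames.append((int(idx), int(addr, 16), sym, int(off, 16)))
    return frames


def faulting_frame(frames):
    """Return the frame after calltrap (assumed to be the faulting code)."""
    symbols = [frame[2] for frame in frames]
    return frames[symbols.index("calltrap") + 1]


frames = parse_frames(BACKTRACE)
print(faulting_frame(frames)[2])
print(any(sym.startswith("autofs_") for _, _, sym, _ in frames))
```

On this trace the sketch reports ncl_bioread as the faulting function and confirms that an autofs frame is on the stack, which is the detail Cy asks about further down.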



>> On 15 Jan 2024, at 06:46, FreeBSD User <freebsd@walstatt-de.de> wrote:
>>
>> On Sun, 14 Jan 2024 20:34:12 -0800,
>> Cy Schubert <Cy.Schubert@cschubert.com> wrote:

>>> In message <CAM5tNy5aat8vUn2fsX9jV=D9yGZdnO20Q0Ea7qtszx+zSES2bw@mail.gmail.com>,
>>> Rick Macklem writes:
>>>> On Sat, Jan 13, 2024 at 12:39 PM Ronald Klop <ronald-lists@klop.ws> wrote:


>>>>> From: FreeBSD User <freebsd@walstatt-de.de>
>>>>> Date: 13 January 2024 19:34
>>>>> To: FreeBSD CURRENT <freebsd-current@freebsd.org>
>>>>> Subject: NFSv4 crash of CURRENT
>>>>>
>>>>> Hello,
>>>>>
>>>>> running CURRENT client (FreeBSD 15.0-CURRENT #4 main-n267556-69748e62e82a:
>>>>> Sat Jan 13 18:08:32 CET 2024 amd64). One NFSv4 server is the same OS
>>>>> revision as the mentioned client, the other is FreeBSD 13.2-RELEASE-p8.
>>>>> Both offer NFSv4 filesystems, non-kerberized.
>>>>>
>>>>> I can crash the client reproducibly by accessing either NFSv4 FS
>>>>> (a simple ls -la). The NFSv4 FS is backed by ZFS (if this matters).
>>>>> I do not have physical access to the client host; luckily the box recovers.
>>>> Did you rebuild both the nfscommon and nfscl modules from the same sources?
>>>> I did a commit to main that changes the interface between these two
>>>> modules and did bump the __FreeBSD_version to 1500010, which should
>>>> cause both to be rebuilt. (If you have "options NFSCL" in your kernel
>>>> config, both should have been rebuilt as a part of the kernel build.)


>>> Is anyone by chance seeing autofs in the backtrace too?



>> Hello Cy Schubert,
>>
>> I forgot to mention that those crashes occur with autofs-mounted filesystems.
>> Good question, by the way; I will check whether crashes also happen when
>> mounting the traditional way.
>>
>> Kind regards,
>>
>> oh
>>
>> -- 
>> O. Hartmann

