Date:      Sun, 22 Sep 2024 16:22:18 -0700
From:      Rick Macklem <rick.macklem@gmail.com>
To:        J David <j.david.lists@gmail.com>
Cc:        FreeBSD FS <freebsd-fs@freebsd.org>
Subject:   Re: panic: nfsv4root ref cnt cpuid = 1
Message-ID:  <CAM5tNy5Hh=6b9ZNseeQsRddLSFehiTsYNZOH==CeAGthie5SQw@mail.gmail.com>
In-Reply-To: <CABXB=RRKvfiwipfaaNA%2BAuA3Ug1VLyNvxa_o-5hWEq1-qjjTbg@mail.gmail.com>
References:  <CABXB=RShoxwT3PuPQK9OdJNBbWrShUuYchK7oVnT7gBbLH5D0w@mail.gmail.com> <CABXB=RRKvfiwipfaaNA%2BAuA3Ug1VLyNvxa_o-5hWEq1-qjjTbg@mail.gmail.com>

On Sun, Sep 22, 2024 at 7:28 AM J David <j.david.lists@gmail.com> wrote:
>
> On Sun, Sep 22, 2024 at 10:17 AM J David <j.david.lists@gmail.com> wrote:
> > #8 0xffffffff8302c3a7 at null_lookup+0xc7
>
> After noticing null_lookup in the crash trace, I realized that this
> must be an nfs filesystem that is then remounted elsewhere via nullfs.
> We've eliminated most of those.
>
> There's only one filesystem that is still used with nullfs
> (specifically to avoid a large number of otherwise identical mounts).
> Here are the "nfsstat -m" mount flags for that filesystem:
>
> nfsv4,minorversion=2,oneopenown,tcp,resvport,nconnect=1,hard,cto,nolockd,sec=sys,acdirmin=3,acdirmax=60,acregmin=5,acregmax=60,nametimeo=60,negnametimeo=60,rsize=65536,wsize=65536,readdirsize=65536,readahead=1,wcommitsize=16777216,timeout=120,retrans=2147483647

I think I know what causes the crashes. The attached trivial patch should
work around them, but if you cannot apply a source kernel patch, the
only workaround would be to get rid of "oneopenown".
(Using nullfs may be a factor, since I think the crash would occur when
the code sleeps for a lock used to serialize opens for oneopenown.
This could result in the "struct nfsclopen *" being bogus, since the mutex
would be released/re-acquired.)
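
The hazard described above (a pointer looked up under a mutex going stale
while the code sleeps with the mutex dropped) can be sketched in userland C.
This is only an illustration, not the kernel code: struct open_rec,
struct open_table, the generation counter and every function name below are
hypothetical stand-ins, and pthread mutexes stand in for the kernel mutex.
The sketch re-checks the list after re-acquiring the mutex instead of
trusting the pointer found before the sleep.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for an open record; NOT the kernel's struct nfsclopen. */
struct open_rec {
	int id;
	struct open_rec *next;
};

/* Hypothetical table protected by one mutex, with a generation counter. */
struct open_table {
	pthread_mutex_t mtx;
	struct open_rec *head;
	unsigned long gen;	/* bumped whenever the list changes */
};

void
open_table_init(struct open_table *t)
{
	pthread_mutex_init(&t->mtx, NULL);
	t->head = NULL;
	t->gen = 0;
}

/* Caller must hold t->mtx. */
struct open_rec *
open_find_locked(struct open_table *t, int id)
{
	struct open_rec *op;

	for (op = t->head; op != NULL; op = op->next)
		if (op->id == id)
			return (op);
	return (NULL);
}

/* Insert a new record at the head of the list; returns 0 on success. */
int
open_add(struct open_table *t, int id)
{
	struct open_rec *op = malloc(sizeof(*op));

	if (op == NULL)
		return (-1);
	op->id = id;
	pthread_mutex_lock(&t->mtx);
	op->next = t->head;
	t->head = op;
	t->gen++;
	pthread_mutex_unlock(&t->mtx);
	return (0);
}

/* Unlink and free the record with this id, if present. */
void
open_remove(struct open_table *t, int id)
{
	struct open_rec **opp, *op;

	pthread_mutex_lock(&t->mtx);
	for (opp = &t->head; (op = *opp) != NULL; opp = &op->next) {
		if (op->id == id) {
			*opp = op->next;
			free(op);
			t->gen++;
			break;
		}
	}
	pthread_mutex_unlock(&t->mtx);
}

/*
 * Find a record, "sleep" with the mutex dropped, then use the record.
 * Returns the record's id, or -1 if it vanished during the sleep.
 */
int
open_use_after_sleep(struct open_table *t, int id,
    void (*during_sleep)(struct open_table *, int))
{
	struct open_rec *op;
	unsigned long gen;
	int ret;

	pthread_mutex_lock(&t->mtx);
	op = open_find_locked(t, id);
	gen = t->gen;

	/* Models sleeping for another lock: the mutex is not held here. */
	pthread_mutex_unlock(&t->mtx);
	if (during_sleep != NULL)
		during_sleep(t, id);
	pthread_mutex_lock(&t->mtx);

	/* The cached pointer is only trustworthy if nothing changed. */
	if (gen != t->gen)
		op = open_find_locked(t, id);
	ret = (op != NULL) ? op->id : -1;
	pthread_mutex_unlock(&t->mtx);
	return (ret);
}
```

Passing open_remove as the during_sleep callback simulates another thread
freeing the record while the mutex is dropped. Without the generation check,
the final dereference of op would touch freed memory.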

If you cannot get rid of the "oneopenown" or apply the kernel source patch,
getting rid of the nullfs mount or enabling delegations might also work around
this.

I will need to work on a correct fix, but it wouldn't make it into an update
for quite a while.

Sorry about the breakage, rick

>
> The server of this filesystem is 14.1-RELEASE-p5 and the exported
> filesystem is a readonly ZFS dataset.
>
> Thanks!
>

Attachment: oneopen.patch

--- sys/fs/nfsclient/nfs_clvnops.c.oneopen	2024-09-22 16:05:56.765959000 -0700
+++ sys/fs/nfsclient/nfs_clvnops.c	2024-09-22 16:06:53.798893000 -0700
@@ -1309,6 +1309,7 @@ nfs_lookup(struct vop_lookup_args *ap)
 	}
 
 	openmode = 0;
+#ifdef notnow
 	/*
 	 * If this an NFSv4.1/4.2 mount using the "oneopenown" mount
 	 * option, it is possible to do the Open operation in the same
@@ -1328,6 +1329,7 @@ nfs_lookup(struct vop_lookup_args *ap)
 			openmode |= NFSV4OPEN_ACCESSWRITE;
 	}
 	NFSUNLOCKMNT(nmp);
+#endif
 
 	newvp = NULLVP;
 	NFSINCRGLOBAL(nfsstatsv1.lookupcache_misses);
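
Decoded, the attached oneopen.patch adds "#ifdef notnow" ... "#endif" around
the block in nfs_lookup() that follows "openmode = 0;". Assuming nothing in
the build defines notnow, the preprocessor drops that block, so openmode
should stay 0 there. The standalone sketch below shows the same technique
outside the kernel; the DEMO_* flag values and demo_openmode() are
hypothetical and are not the kernel's definitions.

```c
/* Hypothetical flag values for illustration; not the kernel's definitions. */
#define DEMO_ACCESSREAD		0x1
#define DEMO_ACCESSWRITE	0x2

/*
 * "notnow" is deliberately left undefined, so the preprocessor drops the
 * guarded block and the function behaves as if that code were deleted.
 */
int
demo_openmode(int want_read, int want_write)
{
	int openmode;

	openmode = 0;
#ifdef notnow
	if (want_read)
		openmode |= DEMO_ACCESSREAD;
	if (want_write)
		openmode |= DEMO_ACCESSWRITE;
#else
	/* Silence unused-parameter warnings while the block is disabled. */
	(void)want_read;
	(void)want_write;
#endif
	return (openmode);
}
```

Guarding with an undefined macro keeps the disabled code in the source, so
it can be restored by deleting the two preprocessor lines once a proper fix
exists.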


