Date:      Fri, 9 Feb 2024 14:20:28 -0800
From:      Rick Macklem <rick.macklem@gmail.com>
To:        Zaphod Beeblebrox <zbeeble@gmail.com>
Cc:        Mark Johnston <markj@freebsd.org>, "Matthew L. Dailey" <Matthew.L.Dailey@dartmouth.edu>,  "freebsd-current@freebsd.org" <freebsd-current@freebsd.org>
Subject:   Re: FreeBSD panics possibly caused by nfs clients
Message-ID:  <CAM5tNy7-RbgFdn8bWcGL1d1MX-q6hW-YYv=EpTB7DhbkBQuxRg@mail.gmail.com>
In-Reply-To: <CACpH0MfvdizKo%2BRA0E6jnMVZSayotA2Vn2znZG8qD1K18dsF6g@mail.gmail.com>
References:  <c5d44484-8660-4b8b-a379-79423cb208f6@dartmouth.edu> <ZcZNDtN1nNJmo8cS@nuc> <c9eca81a-9eff-4b17-9928-bee2c79cef8f@dartmouth.edu> <b3243928-4d66-4c5e-9745-254d57f1cc5e@dartmouth.edu> <ZcaWkUwMlBCZCUhg@nuc> <CACpH0MfvdizKo%2BRA0E6jnMVZSayotA2Vn2znZG8qD1K18dsF6g@mail.gmail.com>

On Fri, Feb 9, 2024 at 2:04 PM Zaphod Beeblebrox <zbeeble@gmail.com> wrote:
>
> Just in case it's relevant, I'm carrying around this patch on my fairly busy little RISC-V machine.
>
> diff --git a/sys/fs/nfsclient/nfs_clvnops.c b/sys/fs/nfsclient/nfs_clvnops.c
> index 0b8c587a542c..85c0ebd7a10f 100644
> --- a/sys/fs/nfsclient/nfs_clvnops.c
> +++ b/sys/fs/nfsclient/nfs_clvnops.c
> @@ -2459,6 +2459,16 @@ nfs_readdir(struct vop_readdir_args *ap)
>                 return (EINVAL);
>         uio->uio_resid -= left;
>
> +       /*
> +        * For readdirplus, if starting to read the directory,
> +        * purge the name cache, since it will be reloaded by
> +        * this directory read.
> +        * This removes potentially stale name cache entries.
> +        */
> +       if (uio->uio_offset == 0 &&
> +           (VFSTONFS(vp->v_mount)->nm_flag & NFSMNT_RDIRPLUS) != 0)
> +               cache_purge(vp);
> +
>         /*
>          * Call ncl_bioread() to do the real work.
>          */
> ... without it, I can panic.
This is not of interest to Matthew, since he is using Linux clients against a
FreeBSD server.  However, it is of interest to me.  This is the first time I've
seen this (unless I just forgot;-) and since readdirplus is not a default, I
suspect few test/use it.

I will take a look at this, since it sounds reasonable.

Thanks for posting it, rick
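
For readers following the quoted patch, the condition it adds can be modelled
in user space. This is only a toy sketch, not the kernel code: every toy_*
name below is hypothetical, the fixed-size array merely stands in for the
name cache of one directory, and the boolean stands in for NFSMNT_RDIRPLUS
being set in nm_flag. It shows the one behaviour the patch comment describes:
the cache is purged only when a directory read starts at offset 0 on a
readdirplus mount.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_CACHE_SLOTS 8
#define TOY_NAME_MAX    32

/* Hypothetical stand-in for the name cache entries of one directory. */
struct toy_namecache {
	char   names[TOY_CACHE_SLOTS][TOY_NAME_MAX];
	size_t count;
};

/* Hypothetical stand-in for the per-mount state the patch inspects. */
struct toy_mount {
	bool rdirplus;	/* models NFSMNT_RDIRPLUS being set in nm_flag */
};

/* Add a name to the toy cache (truncating overly long names). */
static void
toy_cache_enter(struct toy_namecache *nc, const char *name)
{
	assert(nc->count < TOY_CACHE_SLOTS);
	strncpy(nc->names[nc->count], name, TOY_NAME_MAX - 1);
	nc->names[nc->count][TOY_NAME_MAX - 1] = '\0';
	nc->count++;
}

/* Return true if the name is currently cached. */
static bool
toy_cache_lookup(const struct toy_namecache *nc, const char *name)
{
	for (size_t i = 0; i < nc->count; i++)
		if (strcmp(nc->names[i], name) == 0)
			return (true);
	return (false);
}

/* Drop every entry; models what cache_purge(vp) does for the directory. */
static void
toy_cache_purge(struct toy_namecache *nc)
{
	nc->count = 0;
}

/*
 * Models the check added by the patch: purge only when the read starts
 * at the beginning of the directory and readdirplus is enabled.
 * Returns true if a purge happened.
 */
static bool
toy_readdir_prologue(const struct toy_mount *mp, struct toy_namecache *nc,
    long long offset)
{
	if (offset == 0 && mp->rdirplus) {
		toy_cache_purge(nc);
		return (true);
	}
	return (false);
}
```

In this model a possibly stale entry survives a read on a mount without
readdirplus and a read that resumes at a non-zero offset, and is dropped only
by a read from offset 0 on a readdirplus mount, after which the directory read
itself would repopulate the cache.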

>
>
> On Fri, Feb 9, 2024 at 4:18 PM Mark Johnston <markj@freebsd.org> wrote:
>>
>> On Fri, Feb 09, 2024 at 06:23:08PM +0000, Matthew L. Dailey wrote:
>> > I had my first kernel panic with a KASAN kernel after only 01:27. This
>> > first panic was a "double fault," which isn't anything we've seen
>> > previously - usually we've seen trap 9 or trap 12, but sometimes others.
>> > Based on the backtrace, it definitely looks like KASAN caught something,
>> > but I don't have the expertise to know if this points to anything
>> > specific. From the backtrace, it looks like this might have originated
>> > in ipfw code.
>>
>> A double fault is rather unexpected.  I presume you're running
>> releng/14.0?  Is it at all possible to test with FreeBSD-CURRENT?
>>
>> Did you add INVARIANTS etc. to the kernel configuration used here, or
>> just KASAN?
>>
>> > Please let me know what other info I can provide or what I can do to dig
>> > deeper.
>>
>> If you could repeat the test several times, I'd be interested in seeing
>> if you always get the same result.  If you're willing to share the
>> vmcore (or several), I'd be willing to take a look at it.
>>
>> > Thanks!!
>> >
>> > Panic message:
>> > [5674] Fatal double fault
>> > [5674] rip 0xffffffff812f6e32 rsp 0xfffffe014677afe0 rbp 0xfffffe014677b430
>> > [5674] rax 0x1fffffc028cef620 rdx 0xf2f2f2f8f2f2f2f2 rbx 0x1
>> > [5674] rcx 0xdffff7c000000000 rsi 0xfffffe004086a4a0 rdi 0xf8f8f8f8f2f2f2f8
>> > [5674] r8 0xf8f8f8f8f8f8f8f8 r9 0x162a r10 0x835003002d3a64e1
>> > [5674] r11 0 r12 0xfffff78028cef620 r13 0xfffffe004086a440
>> > [5674] r14 0xfffffe01488c0560 r15 0x26f40 rflags 0x10006
>> > [5674] cs 0x20 ss 0x28 ds 0x3b es 0x3b fs 0x13 gs 0x1b
>> > [5674] fsbase 0x95d1d81a130 gsbase 0xffffffff84a14000 kgsbase 0
>> > [5674] cpuid = 4; apic id = 08
>> > [5674] panic: double fault
>> > [5674] cpuid = 4
>> > [5674] time = 1707498420
>> > [5674] KDB: stack backtrace:
>> > [5674] Uptime: 1h34m34s
>> >
>> > Backtrace:
>> > #0  __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:57
>> > #1  doadump (textdump=<optimized out>) at /usr/src/sys/kern/kern_shutdown.c:405
>> > #2  0xffffffff8128b7dc in kern_reboot (howto=howto@entry=260)
>> >      at /usr/src/sys/kern/kern_shutdown.c:526
>> > #3  0xffffffff8128c000 in vpanic (
>> >      fmt=fmt@entry=0xffffffff82589a00 <str> "double fault",
>> >      ap=ap@entry=0xfffffe0040866de0) at /usr/src/sys/kern/kern_shutdown.c:970
>> > #4  0xffffffff8128bd75 in panic (fmt=0xffffffff82589a00 <str> "double fault")
>> >      at /usr/src/sys/kern/kern_shutdown.c:894
>> > #5  0xffffffff81c4b335 in dblfault_handler (frame=<optimized out>)
>> >      at /usr/src/sys/amd64/amd64/trap.c:1012
>> > #6  <signal handler called>
>> > #7  0xffffffff812f6e32 in sched_clock (td=td@entry=0xfffffe01488c0560,
>> >      cnt=cnt@entry=1) at /usr/src/sys/kern/sched_ule.c:2601
>> > #8  0xffffffff8119e2a7 in statclock (cnt=cnt@entry=1,
>> >      usermode=usermode@entry=0) at /usr/src/sys/kern/kern_clock.c:760
>> > #9  0xffffffff8119fb67 in handleevents (now=now@entry=24371855699832,
>> >      fake=fake@entry=0) at /usr/src/sys/kern/kern_clocksource.c:195
>> > #10 0xffffffff811a10cc in timercb (et=<optimized out>, arg=<optimized out>)
>> >      at /usr/src/sys/kern/kern_clocksource.c:353
>> > #11 0xffffffff81dcd280 in lapic_handle_timer (frame=0xfffffe014677b750)
>> >      at /usr/src/sys/x86/x86/local_apic.c:1343
>> > #12 <signal handler called>
>> > #13 __asan_load8_noabort (addr=18446741880219689232)
>> >      at /usr/src/sys/kern/subr_asan.c:1113
>> > #14 0xffffffff851488b8 in ?? () from /boot/thayer/ipfw.ko
>> > #15 0xfffffe0100000000 in ?? ()
>> > #16 0xffffffff8134dcd5 in pcpu_find (cpuid=1238425856)
>> >      at /usr/src/sys/kern/subr_pcpu.c:286
>> > #17 0xffffffff85151f6f in ?? () from /boot/thayer/ipfw.ko
>> > #18 0x0000000000000000 in ?? ()
>>


