Date: Fri, 19 Jan 2024 16:42:53 +0100
From: Ulrich Spörlein <uqs@freebsd.org>
To: Konstantin Belousov <kib@freebsd.org>
Cc: stable@freebsd.org, Rick Macklem <rick.macklem@gmail.com>
Subject: Re: Repeatable nfs_readdir kernel panic after upgrade to stable/14
Message-ID: <CAJ9axoSq7NohixYCfZ%2BhiyKKH5XwF%2B6%2BVoMT2yq2R_ZSJUkQog@mail.gmail.com>
In-Reply-To: <Zaem0abAxKFYG4HY@kib.kiev.ua>
References: <CAJ9axoS5SuZid6dihAWzPgg7xyRj8LX86Pq4ckM5FFaFRBVYOw@mail.gmail.com>
 <Zaem0abAxKFYG4HY@kib.kiev.ua>

Indeed, seems to work now. Thanks for the speedy fix.

On Wed, 17 Jan 2024, 11:07 Konstantin Belousov, <kib@freebsd.org> wrote:

> On Wed, Jan 17, 2024 at 10:28:01AM +0100, Ulrich Spörlein wrote:
> > Hey there,
> > upgraded my NFS server and laptop (NFS client) to stable/14 over the
> > weekend and now anything "intensive" that reads from NFS seems to kernel
> > panic.
> >
> > I think this started when I upgraded the server first, shrugged it off as
> > some overload on the laptop, finished the laptop upgrade to 14 and now
> > everytime I open easytag on the NFS automounted directory, or browsing
> > photos with geeqie it locks up hard.
> >
> > Mounts on the client currently look like so:
> >
> > map /etc/auto_tank on /tank (autofs)
> > map -media on /media (autofs)
> > 192.168.0.151:/tank/music on /tank/music (nfs, automounted)
> >
> > I'm not even sure if I'm using NFS3 or 4 or whether I'm using the ZFS based
> > one, I've set this up ages ago.
> >
> > Fatal trap 12: page fault while in kernel mode
> > cpuid = 1; apic id = 02
> > fault virtual address   = 0x89
> > fault code              = supervisor read data, page not present
> > instruction pointer     = 0x20:0xffffffff80eee094
> > stack pointer           = 0x28:0xfffffe01268c0830
> > frame pointer           = 0x28:0xfffffe01268c0830
> > code segment            = base 0x0, limit 0xfffff, type 0x1b
> >                         = DPL 0, pres 1, long 1, def32 0, gran 1
> > processor eflags        = interrupt enabled, resume, IOPL = 0
> > current process         = 74673 (easytag)
> > rdi: 0000000000000000 rsi: ffffffff819bff08 rdx: 0000000000000000
> > rcx: 0000000000000000  r8: fffffe003781e0f0  r9: fffff8001ab51740
> > rax: 0000000000000000 rbx: fffff8001ab51740 rbp: fffffe01268c0830
> > r10: ffffffff00000000 r11: fffffe01268c07b0 r12: fffffe003781e0f0
> > r13: fffff8047ac47700 r14: fffffe012ac1ba38 r15: fffff80437cac000
> > trap number             = 12
> > panic: page fault
> > cpuid = 1
> > time = 1705480771
> > KDB: stack backtrace:
> > #0 0xffffffff80b9d68d at kdb_backtrace+0x5d
> > #1 0xffffffff80b4f95f at vpanic+0x12f
> > #2 0xffffffff80b4f823 at panic+0x43
> > #3 0xffffffff8102902f at trap_fatal+0x40f
> > #4 0xffffffff8102907f at trap_pfault+0x4f
> > #5 0xffffffff80ffef48 at calltrap+0x8
> > #6 0xffffffff80a3a3fe at ncl_bioread+0xb7e
> > #7 0xffffffff80a2c0a0 at nfs_readdir+0x1f0
> > #8 0xffffffff80c217aa at vop_sigdefer+0x2a
> > #9 0xffffffff81100280 at VOP_READDIR_APV+0x20
> > #10 0xffffffff846af5ae at autofs_readdir+0x2ce
> > #11 0xffffffff81100280 at VOP_READDIR_APV+0x20
> > #12 0xffffffff80c48501 at kern_getdirentries+0x221
> > #13 0xffffffff80c488a9 at sys_getdirentries+0x29
> > #14 0xffffffff810298d9 at amd64_syscall+0x109
> > #15 0xffffffff80fff85b at fast_syscall_common+0xf8
> > Uptime: 3m18s
> > Dumping 1242 out of 32368
> > MB:..2%..11%..21%..31%..42%..51%..61%..71%..82%..91%
> >
> > I can still access those NFS mounts just fine, can play music off them with
> > audacious or just mpv, but easytag will try to recursively read everything
> > and presumably puts a lot of stress on the system.
> >
> > I see there was chatter about this recently, and kib committed something to
> > nfsclient, which got merged to stable/14 on the 11th, but my build is from
> > the 14th, so presumably I already have this "fix", and it's not working?
> >
> > I'm on n266311-299e9fe9709a right now, which _is_ after kib's fixes, maybe
> > they are not sufficient for stable/14?
>
> You need 7b49e60227f8 which I just pushed.

Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?CAJ9axoSq7NohixYCfZ%2BhiyKKH5XwF%2B6%2BVoMT2yq2R_ZSJUkQog>