Date:      Tue, 9 Apr 2024 08:33:06 -0700
From:      Rick Macklem <rick.macklem@gmail.com>
To:        current@freebsd.org, tuexen@freebsd.org,  Gleb Smirnoff <glebius@freebsd.org>
Subject:   Re: Panic after update main-n269202-4e7aa03b7076 -> n269230-f6f67f58c19d
Message-ID:  <CAM5tNy7Rf_cFF11wb0ZnqQ1RQ-F-2owv-qKFTt1bSRPCy6NodA@mail.gmail.com>
In-Reply-To: <CAM5tNy7Oou4WxVGUP_xsaYxBtQ6exVD6bSipjzThpEOO7C1L9A@mail.gmail.com>
References:  <ZhUqu8nXeQK9T2nH@albert.catwhisker.org> <CAM5tNy6EdhCeqLmU_tPJVtUN345s2GsYFZQ3L9aFL3=x%2BgJzzg@mail.gmail.com> <CAM5tNy7Oou4WxVGUP_xsaYxBtQ6exVD6bSipjzThpEOO7C1L9A@mail.gmail.com>

On Tue, Apr 9, 2024 at 8:04 AM Rick Macklem <rick.macklem@gmail.com> wrote:
>
> On Tue, Apr 9, 2024 at 7:46 AM Rick Macklem <rick.macklem@gmail.com> wrote:
> >
> > On Tue, Apr 9, 2024 at 4:47 AM David Wolfskill <david@catwhisker.org> wrote:
> > >
> > > Machine had been running:
> > >
> > > FreeBSD 15.0-CURRENT #43 main-n269202-4e7aa03b7076: Mon Apr  8 11:19:58 UTC 2024     root@freebeast.catwhisker.org:/common/S4/obj/usr/src/amd64.amd64/sys/GENERIC amd64 1500018 1500018
> > >
> > > This was an in-place source update, after updating sources to
> > > main-n269230-f6f67f58c19d.  On reboot (after "make installworld"
> > > completed), I see this on the serial console (copy/pasted):
> > >
> > > ...
> > > Starting lockd.
> > I'd guess this is caused by some recent change to AF_UNIX socket
> > creation. The crash appears to be either the SOCK_LOCK() or
> > SOCKBUF_LOCK(&so->so_rcv) not being initialized.
> > If you can find out what source line# corresponds to
> > clnt_vc_create+0x4f4 you can probably tell which one it is.
> >
> > All local_rpcb() does is a
> >   error = socreate(AF_LOCAL, &so, SOCK_STREAM, 0, curthread->td_ucred,
> >       curthread);
> >   and then calls clnt_vc_create(..so..) with the socket.
> >
> > I think that socreate() is not initializing one of those two mutexes
> > for some reason.
> Looks to me like this was caused by commit 681711b. I've added tuexen@
> to the post, since he committed it.
Oops, my bad, got this wrong.

The commit is d80a97d, which added PR_SOCKBUF to the pr_flags
for AF_UNIX/SOCK_STREAM.
I've added glebius@ to the email.
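
As a rough illustration of why the reported fault address of 0x18 (with
rdi = 0 in __mtx_lock_flags()) looks like a NULL mutex pointer rather than
a mutex holding garbage, here is a small userland sketch. The names
model_lock_object, model_mtx, lock_word_address() and
fault_matches_null_mtx() are hypothetical, and the field layout is only an
assumption about an LP64 build, not copied from the kernel headers.

```c
#include <assert.h>	/* only needed for the usage checks described below */
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for a lock header: one pointer, two 32-bit
 * words, one pointer.  Assumed to occupy 24 bytes on LP64.
 */
struct model_lock_object {
	const char	*lo_name;
	unsigned int	 lo_flags;
	unsigned int	 lo_data;
	void		*lo_witness;
};

/* Hypothetical stand-in for a mutex: the header, then the lock word. */
struct model_mtx {
	struct model_lock_object lock_object;
	volatile uintptr_t	 mtx_lock;
};

/*
 * Compute the address that would be read when loading m->mtx_lock for
 * a mutex located at mtx_base.  Nothing is dereferenced here.
 */
static uintptr_t
lock_word_address(uintptr_t mtx_base)
{
	return (mtx_base + offsetof(struct model_mtx, mtx_lock));
}

/*
 * Return 1 if fault_addr is what a read of the lock word through a
 * NULL mutex pointer would touch, 0 otherwise.
 */
static int
fault_matches_null_mtx(uintptr_t fault_addr)
{
	return (fault_addr == lock_word_address(0));
}
```

Under these assumptions, fault_matches_null_mtx(0x18) returns 1 on an
LP64 machine, which would fit a lock pointer that was never set up. It
does not say which of the two locks is involved; the source line for
clnt_vc_create+0x4f4 is still needed for that.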

rick

>
> rick
>
> >
> > rick
> >
> > >
> > >
> > > Fatal trap 12: page fault while in kernel mode
> > > cpuid = 9; apic id = 09
> > > fault virtual address   = 0x18
> > > fault code              = supervisor read data, page not present
> > > instruction pointer     = 0x20:0xffffffff80b208c5
> > > stack pointer           = 0x28:0xfffffe048c204920
> > > frame pointer           = 0x28:0xfffffe048c204960
> > > code segment            = base 0x0, limit 0xfffff, type 0x1b
> > >                         = DPL 0, pres 1, long 1, def32 0, gran 1
> > > processor eflags        = interrupt enabled, resume, IOPL = 0
> > > current process         = 1208 (rpc.Starting automountd.
> > > lockd)
> > > rdi: 0000000000000000 rsi: fffff801078b0740 rdx: 0000000000000000
> > > rcx: 000000000000010a  r8: ffffffff818d30f0  r9: 0000000000000000
> > > rax: 0000000000000000 rbx: 00000000Starting powerd.00000018 rbp: fffffe048c204960
> > > r10: 0000000000010000 r11: 0000000000000001 r12: fffff80274e32c18
> > > r13: 000000000000010a r14: fffff80274e32c00 r15: ffffffff812ae38a
> > > trap number             = 12
> > > panic: page fault
> > > cpuid = 9
> > > time = 1712662362
> > > KDB: stack backtrace:
> > > db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe048c2045f0
> > > vpanic() at vpanic+0x135/frame 0xfffffe048c204720
> > > panic() at panic+0x43/frame 0xfffffe048c204780
> > > trap_fatal() at trap_fatal+0x40b/frame 0xfffffe048c2047e0
> > > trap_pfault() at trap_pfault+0xa0/frame 0xfffffe048c204850
> > > calltrap() at calltrap+0x8/frame 0xfffffe048c204850
> > > --- trap 0xc, rip = 0xffffffff80b208c5, rsp = 0xfffffe048c204920, rbp = 0xfffffe048c204960 ---
> > > __mtx_lock_flags() at __mtx_lock_flags+0x45/frame 0xfffffe048c204960
> > > clnt_vc_create() at clnt_vc_create+0x4f4/frame 0xfffffe048c204ab0
> > > local_rpcb() at local_rpcb+0x11b/frame 0xfffffe048c204b50
> > > rpcb_unset() at rpcb_unset+0x24/frame 0xfffffe048c204bb0
> > > svc_tp_create() at svc_tp_create+0xee/frame 0xfffffe048c204c90
> > > sys_nlm_syscall() at sys_nlm_syscall+0x3d0/frame 0xfffffe048c204e00
> > > amd64_syscall() at amd64_syscall+0x158/frame 0xfffffe048c204f30
> > > fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe048c204f30
> > > --- syscall (154, FreeBSD ELF64, nlm_syscall), rip = 0x3f00a2dfd2a, rsp = 0x3f0096f7168, rbp = 0x3f0096f7230 ---
> > > KDB: enter: panic
> > > [ thread pid 1208 tid 101107 ]
> > > Stopped at      kdb_enter+0x33: movq    $0,0x104eb92(%rip)
> > > db>
> > >
> > >
> > > Given suitable clues, I can poke at it a bit -- this is my "build
> > > machine," so it doesn't have critical work to do at the moment.  (I
> > > would normally have powered it down for the day: there's no need for
> > > it to be wasting energy.)
> > >
> > > Laptops are still building ports under stable/14 -- something seems
> > > to want the llvm17 port, and they have firefox to build, so they
> > > won't be testing CURRENT/head for a while, yet.
> > >
> > > Peace,
> > > david
> > > --
> > > David H. Wolfskill                              david@catwhisker.org
> > > Alexey Navalny was a courageous man; Putin has made him a martyr.
> > >
> > > See https://www.catwhisker.org/~david/publickey.gpg for my public key.


