From: Michael Schmiedgen
Date: Tue, 4 May 2021 20:38:39 +0200
Subject: Re: page fault while in kernel mode - after upgrade from 12.2 to 13.0
To: Mark Johnston
Cc: freebsd-net@freebsd.org

Hi Mark,

sorry for the delay, I can only test after work. I triggered another 2
panics, this time with a different result (see below).

Can I provide some more information?

Thank you!
Michael

--- #1

Fatal trap 12: page fault while in kernel mode
cpuid = 1; apic id = 01
fault virtual address   = 0x388
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff80d3fa67
stack pointer           = 0x28:0xfffffe0115bea9c0
frame pointer           = 0x28:0xfffffe0115beaa20
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 12 (swi1: netisr 0)
trap number             = 12
panic: page fault
cpuid = 1
time = 1620144777
KDB: stack backtrace:
#0 0xffffffff80c57345 at kdb_backtrace+0x65
#1 0xffffffff80c09d21 at vpanic+0x181
#2 0xffffffff80c09b93 at panic+0x43
#3 0xffffffff8108b187 at trap_fatal+0x387
#4 0xffffffff8108b1df at trap_pfault+0x4f
#5 0xffffffff8108a83d at trap+0x27d
#6 0xffffffff810617a8 at calltrap+0x8
#7 0xffffffff80bcae5d at ithread_loop+0x24d
#8 0xffffffff80bc7c5e at fork_exit+0x7e
#9 0xffffffff8106282e at fork_trampoline+0xe
Uptime: 3m51s
Dumping 2617 out of 65454 MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91%

__curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
55              __asm("movq %%gs:%P1,%0" : "=r" (td) : "n" (offsetof(struct pcpu,
(kgdb) list *0xffffffff80d3fa67
0xffffffff80d3fa67 is in swi_net (/usr/src/sys/net/netisr.c:918).
913                     if (local_npw.nw_head == NULL)
914                             local_npw.nw_tail = NULL;
915                     local_npw.nw_len--;
916                     VNET_ASSERT(m->m_pkthdr.rcvif != NULL,
917                         ("%s:%d rcvif == NULL: m=%p", __func__, __LINE__, m));
918                     CURVNET_SET(m->m_pkthdr.rcvif->if_vnet);
919                     netisr_proto[proto].np_handler(m);
920                     CURVNET_RESTORE();
921             }
922             KASSERT(local_npw.nw_len == 0,
(kgdb) backtrace
#0  __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
#1  doadump (textdump=) at /usr/src/sys/kern/kern_shutdown.c:399
#2  0xffffffff80c09916 in kern_reboot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:486
#3  0xffffffff80c09d90 in vpanic (fmt=, ap=) at /usr/src/sys/kern/kern_shutdown.c:919
#4  0xffffffff80c09b93 in panic (fmt=) at /usr/src/sys/kern/kern_shutdown.c:843
#5  0xffffffff8108b187 in trap_fatal (frame=0xfffffe0115bea900, eva=904) at /usr/src/sys/amd64/amd64/trap.c:915
#6  0xffffffff8108b1df in trap_pfault (frame=frame@entry=0xfffffe0115bea900, usermode=false, signo=, signo@entry=0x0, ucode=, ucode@entry=0x0) at /usr/src/sys/amd64/amd64/trap.c:732
#7  0xffffffff8108a83d in trap (frame=0xfffffe0115bea900) at /usr/src/sys/amd64/amd64/trap.c:398
#8
#9  0xffffffff80d3fa67 in netisr_process_workstream_proto (nwsp=, proto=1) at /usr/src/sys/net/netisr.c:918
#10 swi_net (arg=) at /usr/src/sys/net/netisr.c:966
#11 0xffffffff80bcae5d in intr_event_execute_handlers (p=, ie=0xfffff80003dbb600) at /usr/src/sys/kern/kern_intr.c:1168
#12 ithread_execute_handlers (p=, ie=0xfffff80003dbb600) at /usr/src/sys/kern/kern_intr.c:1181
#13 ithread_loop (arg=arg@entry=0xfffff80003dced40) at /usr/src/sys/kern/kern_intr.c:1269
#14 0xffffffff80bc7c5e in fork_exit (callout=0xffffffff80bcac10 , arg=0xfffff80003dced40, frame=0xfffffe0115beab00) at /usr/src/sys/kern/kern_fork.c:1069
#15

--- #2

Fatal trap 12: page fault while in kernel mode
cpuid = 1; apic id = 01
fault virtual address   = 0x8
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff80ca599c
stack pointer           = 0x28:0xfffffe0115bea6c0
frame pointer           = 0x28:0xfffffe0115bea700
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 12 (swi1: netisr 0)
trap number             = 12
panic: page fault
cpuid = 1
time = 1620152374
KDB: stack backtrace:
#0 0xffffffff80c57345 at kdb_backtrace+0x65
#1 0xffffffff80c09d21 at vpanic+0x181
#2 0xffffffff80c09b93 at panic+0x43
#3 0xffffffff8108b187 at trap_fatal+0x387
#4 0xffffffff8108b1df at trap_pfault+0x4f
#5 0xffffffff8108a83d at trap+0x27d
#6 0xffffffff810617a8 at calltrap+0x8
#7 0xffffffff80dbf0ae at tcp_do_segment+0x10ce
#8 0xffffffff80dbd21e at tcp_input+0xabe
#9 0xffffffff80dafc15 at ip_input+0x125
#10 0xffffffff80d3fa7b at swi_net+0x12b
#11 0xffffffff80bcae5d at ithread_loop+0x24d
#12 0xffffffff80bc7c5e at fork_exit+0x7e
#13 0xffffffff8106282e at fork_trampoline+0xe
Uptime: 2h3m59s
Dumping 2666 out of 65454 MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91%

__curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
55              __asm("movq %%gs:%P1,%0" : "=r" (td) : "n" (offsetof(struct pcpu,
(kgdb)
#0  __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
#1  doadump (textdump=) at /usr/src/sys/kern/kern_shutdown.c:399
#2  0xffffffff80c09916 in kern_reboot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:486
#3  0xffffffff80c09d90 in vpanic (fmt=, ap=) at /usr/src/sys/kern/kern_shutdown.c:919
#4  0xffffffff80c09b93 in panic (fmt=) at /usr/src/sys/kern/kern_shutdown.c:843
#5  0xffffffff8108b187 in trap_fatal (frame=0xfffffe0115bea600, eva=8) at /usr/src/sys/amd64/amd64/trap.c:915
#6  0xffffffff8108b1df in trap_pfault (frame=frame@entry=0xfffffe0115bea600, usermode=false, signo=, signo@entry=0x0, ucode=, ucode@entry=0x0) at /usr/src/sys/amd64/amd64/trap.c:732
#7  0xffffffff8108a83d in trap (frame=0xfffffe0115bea600) at /usr/src/sys/amd64/amd64/trap.c:398
#8
#9  sbcut_internal (sb=0xfffff80522aa09c0, len=203, len@entry=476) at /usr/src/sys/kern/uipc_sockbuf.c:1491
#10 0xffffffff80ca5b8a in sbcut_locked (sb=0xfffff80522aa09c0, len=-743943424, len@entry=476) at /usr/src/sys/kern/uipc_sockbuf.c:1591
#11 0xffffffff80dbf0ae in tcp_do_segment (m=0xfffff8004c2aae00, th=, so=, tp=, drop_hdrlen=52, tlen=, iptos=0 '\000') at /usr/src/sys/netinet/tcp_input.c:2918
#12 0xffffffff80dbd21e in tcp_input (mp=, offp=, proto=) at /usr/src/sys/netinet/tcp_input.c:1382
#13 0xffffffff80dafc15 in ip_input (m=0x0) at /usr/src/sys/netinet/ip_input.c:829
#14 0xffffffff80d3fa7b in netisr_process_workstream_proto (nwsp=, proto=1) at /usr/src/sys/net/netisr.c:919
#15 swi_net (arg=) at /usr/src/sys/net/netisr.c:966
#16 0xffffffff80bcae5d in intr_event_execute_handlers (p=, ie=0xfffff80003bbe500) at /usr/src/sys/kern/kern_intr.c:1168
#17 ithread_execute_handlers (p=, ie=0xfffff80003bbe500) at /usr/src/sys/kern/kern_intr.c:1181
#18 ithread_loop (arg=arg@entry=0xfffff80003cb6d40) at /usr/src/sys/kern/kern_intr.c:1269
#19 0xffffffff80bc7c5e in fork_exit (callout=0xffffffff80bcac10 , arg=0xfffff80003cb6d40, frame=0xfffffe0115beab00) at /usr/src/sys/kern/kern_fork.c:1069
#20

---

On 03.05.2021 21:45, Mark Johnston wrote:
> On Mon, May 03, 2021 at 08:04:30PM +0200, Michael Schmiedgen wrote:
>> Hi List,
>>
>> if I start a Samba jail, after a few seconds the system crashes. Very reproducible.
>>
>> System has ~10 jails and 3 bhyve VMs. Dell server, Xeon E3-1240, 64GB RAM, 3 way mirror ZFS.
>>
>> It also occurs a few seconds after I start a phone call using the SIP VM of that machine,
>> very strange.
>>
>> I got some log messages suggesting raising somaxconn, so I did
>>
>> kern.ipc.somaxconn=4096
>>
>> in sysctl.conf
>>
>>
>> Below some debug information, please let me know if I should provide further information.
>>
>> Should I open a bug or something?
>>
>> Thank you very much!
>> Michael
>>
>>
>>
>> Fatal trap 12: page fault while in kernel mode
>> cpuid = 0; apic id = 00
>> fault virtual address   = 0x0
>> fault code              = supervisor read data, page not present
>> instruction pointer     = 0x20:0xffffffff80ca52c0
>> stack pointer           = 0x28:0xfffffe019d039650
>> frame pointer           = 0x28:0xfffffe019d039690
>> code segment            = base 0x0, limit 0xfffff, type 0x1b
>>                         = DPL 0, pres 1, long 1, def32 0, gran 1
>> processor eflags        = interrupt enabled, resume, IOPL = 0
>> current process         = 649 (devd)
>> trap number             = 12
>> panic: page fault
>> cpuid = 0
>> time = 1620061253
>> KDB: stack backtrace:
>> #0 0xffffffff80c57345 at kdb_backtrace+0x65
>> #1 0xffffffff80c09d21 at vpanic+0x181
>> #2 0xffffffff80c09b93 at panic+0x43
>> #3 0xffffffff8108b187 at trap_fatal+0x387
>> #4 0xffffffff8108b1df at trap_pfault+0x4f
>> #5 0xffffffff8108a83d at trap+0x27d
>> #6 0xffffffff810617a8 at calltrap+0x8
>> #7 0xffffffff80ca51c3 at sbappendaddr_locked+0x93
>> #8 0xffffffff80cb437a at uipc_send+0x73a
>> #9 0xffffffff80ca9053 at sosend_generic+0x633
>> #10 0xffffffff80ca94e0 at sosend+0x50
>> #11 0xffffffff80caff2e at kern_sendit+0x20e
>> #12 0xffffffff80cb032b at sendit+0x1db
>> #13 0xffffffff80cb013d at sys_sendto+0x4d
>> #14 0xffffffff8108ba8c at amd64_syscall+0x10c
>> #15 0xffffffff810620ce at fast_syscall_common+0xf8
>> Uptime: 2m2s
>> Dumping 2373 out of 65454 MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91%
>>
>> __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
>> 55              __asm("movq %%gs:%P1,%0" : "=r" (td) : "n" (offsetof(struct pcpu,
>> (kgdb) list *0xffffffff80ca52c0
>> 0xffffffff80ca52c0 is in sbappendaddr_locked_internal (/usr/src/sys/kern/uipc_sockbuf.c:1169).
>> 1164         if (ctrl_last)
>> 1165                 ctrl_last->m_next = m0; /* concatenate data to control */
>> 1166         else
>> 1167                 control = m0;
>> 1168         m->m_next = control;
>> 1169         for (n = m; n->m_next != NULL; n = n->m_next)
>> 1170                 sballoc(sb, n);
>> 1171         sballoc(sb, n);
>> 1172         nlast = n;
>> 1173         SBLINKRECORD(sb, m);
>
> So we are crashing because "n" is somehow NULL?  That seems difficult to
> explain.  Can you show the local variables in this frame?
>
> Does the panic always have the same stack trace?
>
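
An aside on the quoted question: a read of n->m_next that faults at virtual address 0x0 fits a NULL "n" only if the link pointer sits at offset 0 of the structure. The sketch below illustrates that arithmetic in plain userland C. It is a minimal illustration under stated assumptions, not kernel code and not a proposed fix: struct buf, fault_address() and chain_length() are hypothetical names made up here, and struct buf is not the real struct mbuf layout. Whether the small fault addresses in the newer panics (0x8, 0x388) are also member offsets from a NULL base is an assumption the dumps alone do not confirm.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for a chained buffer. This is NOT the kernel's
 * struct mbuf; the only property borrowed for illustration is that the
 * link pointer is the first member (offset 0).
 */
struct buf {
	struct buf *next;	/* link to the next buffer, at offset 0 */
	int len;		/* payload length, unused in this sketch */
};

/*
 * Address a CPU would touch when reading a member located
 * 'member_offset' bytes into an object whose pointer value is 'base'.
 * With base == 0 (a NULL pointer) the result is just the member offset,
 * which is why NULL dereferences show up as small fault addresses.
 */
uintptr_t
fault_address(uintptr_t base, size_t member_offset)
{
	return (base + member_offset);
}

/*
 * Walk a chain and count its buffers. The loop tests 'n' itself before
 * touching n->next, so a NULL head yields 0 instead of a fault. A loop
 * that tests n->next first dereferences 'n' unconditionally.
 */
int
chain_length(const struct buf *head)
{
	int count = 0;

	for (const struct buf *n = head; n != NULL; n = n->next)
		count++;
	return (count);
}
```

For example, fault_address(0, offsetof(struct buf, next)) evaluates to 0, and chain_length(NULL) returns 0 rather than dereferencing the NULL head.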