From nobody Tue Apr 9 07:18:37 2024
Date: Tue, 9 Apr 2024 09:18:37 +0200
From: Baptiste Daroussin
To: FreeBSD User
Cc: FreeBSD Ports, FreeBSD CURRENT
Subject: Re: pkg-1.21.0: after upgrade 1.20.9_1 -> 1.21.0: pkg core dumps on specific ports
References: <20240406090527.34d84eb9@thor.intern.walstatt.dynvpn.de>
In-Reply-To: <20240406090527.34d84eb9@thor.intern.walstatt.dynvpn.de>
List-Id: Discussions about the use of FreeBSD-current
List-Archive: https://lists.freebsd.org/archives/freebsd-current

On Sat 06 Apr 09:05, FreeBSD User wrote:
> Hello,
>
> after updating (portmaster and make) ports-mgmt/ports from 1.20.9_1 -> 1.21.0 on CURRENT and
> 14-STABLE, I can't update several ports:
>
> www/apache24
> databases/redis
>
> pkg core dumps while performing installation. apache24 and redis are ports I realized this
> misbehaviour on ALL 14-STABLE and CURRENT boxes (both OS variants latest builds, i.e. FreeBSD
> 15.0-CURRENT #32 main-n269135-da2b732288c7: Fri Apr 5 20:30:39 CEST 2024 amd64).
>
> After some updates on a poudriere builder (CURRENT base host, 14.0-RELENG jail with poudriere)
> building packages for 14.0-RELENG, I observed the same behaviour when updating packages on
> target hosts where pkg is first updated, on those hosts, nextcloud-server and icinga2 host
> utilizing also databases/redis and www/apache24, pkg fails the same way.
>
> I do not dare to update our poudriere hosts since the problem seems to pop up when pkg 1.21.0
> is installed, no matter whether I use poudriere built ports (from our own builder hosts) or
> recent source tree with portmaster/make build process.
>
> Looks like a serious bug to me and not a site/user specific problem. Hopefully others do
> realize the same ...
>
> Thanks in advance,
>
> oh
>

See https://github.com/freebsd/pkg/issues/2270

As a workaround, set HANDLE_RC_SCRIPTS=false in your pkg.conf.

A fix was made last Friday. Given that this is a non-default option, I waited
for the weekend to pass to see whether there were other regressions; there
have been no more reports, so I will issue pkg 1.21.1 now.
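
For anyone who wants to script that setting, here is a minimal sketch. It
assumes pkg's configuration lives at the default /usr/local/etc/pkg.conf and
uses the usual "NAME = value;" form; the function name is invented for this
example and is not part of pkg.

```shell
# Hypothetical helper (not part of pkg): set HANDLE_RC_SCRIPTS to false in a
# pkg.conf-style file, replacing an active setting or appending a new one.
set_handle_rc_scripts_false() {
    conf="$1"
    if grep -q '^[[:space:]]*HANDLE_RC_SCRIPTS' "$conf" 2>/dev/null; then
        # Rewrite the existing line; sed leaves a backup copy in "$conf.bak".
        sed -i.bak 's/^[[:space:]]*HANDLE_RC_SCRIPTS.*/HANDLE_RC_SCRIPTS = false;/' "$conf"
    else
        # No active setting (commented-out lines do not count): append one.
        printf 'HANDLE_RC_SCRIPTS = false;\n' >> "$conf"
    fi
}

# Assumed default location; run as root:
#   set_handle_rc_scripts_false /usr/local/etc/pkg.conf
```

Afterwards, "pkg -vv" prints the effective configuration, so the value of
HANDLE_RC_SCRIPTS can be checked there.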

Best regards,
Bapt

From nobody Tue Apr 9 07:20:30 2024
Date: Tue, 9 Apr 2024 09:20:30 +0200
From: Baptiste Daroussin
To: Rainer Hurling
Cc: FreeBSD User, FreeBSD Ports, FreeBSD CURRENT
Subject: Re: pkg-1.21.0: after upgrade 1.20.9_1 -> 1.21.0: pkg core dumps on specific ports
Message-ID: <4dsxkdn2uacxuxnvradf47yvhxx6z6pzbqezpfzn4qvbludhld@ome2p54xt2et>
References: <20240406090527.34d84eb9@thor.intern.walstatt.dynvpn.de>

On Sat 06 Apr 09:23, Rainer Hurling wrote:
> On 06.04.24 at 09:05, FreeBSD User wrote:
> > Hello,
> >
> > after updating (portmaster and make) ports-mgmt/ports from 1.20.9_1 -> 1.21.0 on CURRENT and
> > 14-STABLE, I can't update several ports:
> >
> > www/apache24
> > databases/redis
> >
> > pkg core dumps while performing installation. apache24 and redis are ports I realized this
> > misbehaviour on ALL 14-STABLE and CURRENT boxes (both OS variants latest builds, i.e. FreeBSD
> > 15.0-CURRENT #32 main-n269135-da2b732288c7: Fri Apr 5 20:30:39 CEST 2024 amd64).
> >
> > After some updates on a poudriere builder (CURRENT base host, 14.0-RELENG jail with poudriere)
> > building packages for 14.0-RELENG, I observed the same behaviour when updating packages on
> > target hosts where pkg is first updated, on those hosts, nextcloud-server and icinga2 host
> > utilizing also databases/redis and www/apache24, pkg fails the same way.
> >
> > I do not dare to update our poudriere hosts since the problem seems to pop up when pkg 1.21.0
> > is installed, no matter whether I use poudriere built ports (from our own builder hosts) or
> > recent source tree with portmaster/make build process.
> >
> > Looks like a serious bug to me and not a site/user specific problem. Hopefully others do
> > realize the same ...
> >
> > Thanks in advance,
> >
> > oh
> >
> Hmm, I just tried to reproduce that. Both ports mentioned, databases/redis
> and www/apache24, can be built and installed with Portmaster. The box is a
> 15.0-CURRENT with pkg-1.21.0.
>
> Maybe 'pkg check -Bn' or 'portmaster --check-depends --check-port-dbdir'
> show some inconsistencies?
>
> Best wishes,
> Rainer
>
>

Testing with or without portmaster is very unlikely to be helpful here.

The right way to test is to run with pkg -dddd and report the output, and also
to recommend testing with the default options in pkg.conf.

Best regards,
Bapt

From nobody Tue Apr 9 11:47:07 2024
Date: Tue, 9 Apr 2024 04:47:07 -0700
From: David Wolfskill
To: current@freebsd.org
Subject: Panic after update main-n269202-4e7aa03b7076 -> n269230-f6f67f58c19d
Reply-To: current@freebsd.org
Mail-Followup-To: current@freebsd.org

Machine had been running:

FreeBSD 15.0-CURRENT #43 main-n269202-4e7aa03b7076: Mon Apr 8 11:19:58 UTC 2024 root@freebeast.catwhisker.org:/common/S4/obj/usr/src/amd64.amd64/sys/GENERIC amd64 1500018 1500018

This was an in-place source update, after updating sources to
main-n269230-f6f67f58c19d. On reboot (after "make installworld"
completed), I see this on the serial console (copy/pasted):

...
Starting lockd.


Fatal trap 12: page fault while in kernel mode
cpuid = 9; apic id = 09
fault virtual address = 0x18
fault code = supervisor read data, page not present
instruction pointer = 0x20:0xffffffff80b208c5
stack pointer = 0x28:0xfffffe048c204920
frame pointer = 0x28:0xfffffe048c204960
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 1208 (rpc.Starting automountd.
lockd)
rdi: 0000000000000000 rsi: fffff801078b0740 rdx: 0000000000000000
rcx: 000000000000010a r8: ffffffff818d30f0 r9: 0000000000000000
rax: 0000000000000000 rbx: 00000000Starting powerd.00000018 rbp: fffffe048c204960
r10: 0000000000010000 r11: 0000000000000001 r12: fffff80274e32c18
r13: 000000000000010a r14: fffff80274e32c00 r15: ffffffff812ae38a
trap number = 12
panic: page fault
cpuid = 9
time = 1712662362
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe048c2045f0
vpanic() at vpanic+0x135/frame 0xfffffe048c204720
panic() at panic+0x43/frame 0xfffffe048c204780
trap_fatal() at trap_fatal+0x40b/frame 0xfffffe048c2047e0
trap_pfault() at trap_pfault+0xa0/frame 0xfffffe048c204850
calltrap() at calltrap+0x8/frame 0xfffffe048c204850
--- trap 0xc, rip = 0xffffffff80b208c5, rsp = 0xfffffe048c204920, rbp = 0xfffffe
048c204960 ---
__mtx_lock_flags() at __mtx_lock_flags+0x45/frame 0xfffffe048c204960
clnt_vc_create() at clnt_vc_create+0x4f4/frame 0xfffffe048c204ab0
local_rpcb() at local_rpcb+0x11b/frame 0xfffffe048c204b50
rpcb_unset() at rpcb_unset+0x24/frame 0xfffffe048c204bb0
svc_tp_create() at svc_tp_create+0xee/frame 0xfffffe048c204c90
sys_nlm_syscall() at sys_nlm_syscall+0x3d0/frame 0xfffffe048c204e00
amd64_syscall() at amd64_syscall+0x158/frame 0xfffffe048c204f30
fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe048c204f30
--- syscall (154, FreeBSD ELF64, nlm_syscall), rip = 0x3f00a2dfd2a, rsp = 0x3f00
96f7168, rbp = 0x3f0096f7230 ---
KDB: enter: panic
[ thread pid 1208 tid 101107 ]
Stopped at kdb_enter+0x33: movq $0,0x104eb92(%rip)
db>


Given suitable clues, I can poke at it a bit -- this is my "build
machine," so it doesn't have critical work to do at the moment. (I
would normally have powered it down for the day: there's no need for
it to be wasting energy.)

Laptops are still building ports under stable/14 -- something seems
to want the llvm17 port, and they have firefox to build, so they
won't be testing CURRENT/head for a while, yet.

Peace,
david
-- 
David H. Wolfskill david@catwhisker.org
Alexey Navalny was a courageous man; Putin has made him a martyr.

See https://www.catwhisker.org/~david/publickey.gpg for my public key.

From nobody Tue Apr 9 14:46:28 2024
Date: Tue, 9 Apr 2024 07:46:28 -0700
From: Rick Macklem
To: current@freebsd.org
Subject: Re: Panic after update main-n269202-4e7aa03b7076 -> n269230-f6f67f58c19d

On Tue, Apr 9, 2024 at 4:47 AM David Wolfskill wrote:
>
> Machine had been running:
>
> FreeBSD 15.0-CURRENT #43 main-n269202-4e7aa03b7076: Mon Apr 8 11:19:58 UTC 2024 root@freebeast.catwhisker.org:/common/S4/obj/usr/src/amd64.amd64/sys/GENERIC amd64 1500018 1500018
>
> This was an in-place source update, after updating sources to
> main-n269230-f6f67f58c19d. On reboot (after "make installworld"
> completed), I see this on the serial console (copy/pasted):
>
> ...
> Starting lockd.
I'd guess this is caused by some recent change to AF_UNIX socket
creation. The crash appears to be either the SOCK_LOCK() or
SOCKBUF_LOCK(&so->so_rcv) not being initialized.
If you can find out what source line# corresponds to
clnt_vc_create+0x4f4 you can probably tell which one it is.

All local_rpcb() does is a
    error = socreate(AF_LOCAL, &so, SOCK_STREAM, 0, curthread->td_ucred,
        curthread);
and then calls clnt_vc_create(..so..) with the socket.

I think that socreate() is not initializing one of those two mutexes
for some reason.
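
One way to map clnt_vc_create+0x4f4 back to a source line is to ask gdb about
it using the kernel's debug symbols. The following is only a rough sketch: it
assumes a gdb installed from ports or packages and debug symbols at
/usr/lib/debug/boot/kernel/kernel.debug (both are assumptions about the local
setup), and the helper name is invented for this example. It turns KDB
backtrace lines into gdb "info line" commands.

```shell
# Hypothetical helper: read KDB backtrace lines on stdin and print one gdb
# "info line" command per "func() at func+0xNN/frame ..." entry.
# Lines that are not backtrace frames are ignored.
backtrace_to_gdb() {
    sed -n 's|^.*() at \([A-Za-z_][A-Za-z0-9_]*+0x[0-9a-f]*\)/frame.*$|info line *(\1)|p'
}

# Example with the frame in question; prints: info line *(clnt_vc_create+0x4f4)
echo 'clnt_vc_create() at clnt_vc_create+0x4f4/frame 0xfffffe048c204ab0' | backtrace_to_gdb

# To resolve a whole saved backtrace (file names and paths are assumptions):
#   backtrace_to_gdb < backtrace.txt > frames.gdb
#   gdb -batch -x frames.gdb /usr/lib/debug/boot/kernel/kernel.debug
```

The offset in a backtrace frame is a return address, so gdb may report the
statement just after the call rather than the call itself.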

rick
>
>
> Fatal trap 12: page fault while in kernel mode
> cpuid = 9; apic id = 09
> fault virtual address = 0x18
> fault code = supervisor read data, page not present
> instruction pointer = 0x20:0xffffffff80b208c5
> stack pointer = 0x28:0xfffffe048c204920
> frame pointer = 0x28:0xfffffe048c204960
> code segment = base 0x0, limit 0xfffff, type 0x1b
> = DPL 0, pres 1, long 1, def32 0, gran 1
> processor eflags = interrupt enabled, resume, IOPL = 0
> current process = 1208 (rpc.Starting automountd.
> lockd)
> rdi: 0000000000000000 rsi: fffff801078b0740 rdx: 0000000000000000
> rcx: 000000000000010a r8: ffffffff818d30f0 r9: 0000000000000000
> rax: 0000000000000000 rbx: 00000000Starting powerd.00000018 rbp: fffffe048c204960
> r10: 0000000000010000 r11: 0000000000000001 r12: fffff80274e32c18
> r13: 000000000000010a r14: fffff80274e32c00 r15: ffffffff812ae38a
> trap number = 12
> panic: page fault
> cpuid = 9
> time = 1712662362
> KDB: stack backtrace:
> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe048c2045f0
> vpanic() at vpanic+0x135/frame 0xfffffe048c204720
> panic() at panic+0x43/frame 0xfffffe048c204780
> trap_fatal() at trap_fatal+0x40b/frame 0xfffffe048c2047e0
> trap_pfault() at trap_pfault+0xa0/frame 0xfffffe048c204850
> calltrap() at calltrap+0x8/frame 0xfffffe048c204850
> --- trap 0xc, rip = 0xffffffff80b208c5, rsp = 0xfffffe048c204920, rbp = 0xfffffe
> 048c204960 ---
> __mtx_lock_flags() at __mtx_lock_flags+0x45/frame 0xfffffe048c204960
> clnt_vc_create() at clnt_vc_create+0x4f4/frame 0xfffffe048c204ab0
> local_rpcb() at local_rpcb+0x11b/frame 0xfffffe048c204b50
> rpcb_unset() at rpcb_unset+0x24/frame 0xfffffe048c204bb0
> svc_tp_create() at svc_tp_create+0xee/frame 0xfffffe048c204c90
> sys_nlm_syscall() at sys_nlm_syscall+0x3d0/frame 0xfffffe048c204e00
> amd64_syscall() at amd64_syscall+0x158/frame 0xfffffe048c204f30
> fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe048c204f30
> --- syscall (154, FreeBSD ELF64, nlm_syscall), rip = 0x3f00a2dfd2a, rsp = 0x3f00
> 96f7168, rbp = 0x3f0096f7230 ---
> KDB: enter: panic
> [ thread pid 1208 tid 101107 ]
> Stopped at kdb_enter+0x33: movq $0,0x104eb92(%rip)
> db>
>
>
> Given suitable clues, I can poke at it a bit -- this is my "build
> machine," so it doesn't have critical work to do at the moment. (I
> would normally have powered it down for the day: there's no need for
> it to be wasting energy.)
>
> Laptops are still building ports under stable/14 -- something seems
> to want the llvm17 port, and they have firefox to build, so they
> won't be testing CURRENT/head for a while, yet.
>
> Peace,
> david
> --
> David H. Wolfskill david@catwhisker.org
> Alexey Navalny was a courageous man; Putin has made him a martyr.
>
> See https://www.catwhisker.org/~david/publickey.gpg for my public key.

From nobody Tue Apr 9 15:04:01 2024
Date: Tue, 9 Apr 2024 08:04:01 -0700
From: Rick Macklem
To: current@freebsd.org, tuexen@freebsd.org
Subject: Re: Panic after update main-n269202-4e7aa03b7076 -> n269230-f6f67f58c19d

On Tue, Apr 9, 2024 at 7:46 AM Rick Macklem wrote:
>
> On Tue, Apr 9, 2024 at 4:47 AM David Wolfskill wrote:
> >
> > Machine had been running:
> >
> > FreeBSD 15.0-CURRENT #43 main-n269202-4e7aa03b7076: Mon Apr 8 11:19:58 UTC 2024 root@freebeast.catwhisker.org:/common/S4/obj/usr/src/amd64.amd64/sys/GENERIC amd64 1500018 1500018
> >
> > This was an in-place source update, after updating sources to
> > main-n269230-f6f67f58c19d. On reboot (after "make installworld"
> > completed), I see this on the serial console (copy/pasted):
> >
> > ...
> > Starting lockd.
> I'd guess this is caused by some recent change to AF_UNIX socket
> creation. The crash appears to be either the SOCK_LOCK() or
> SOCKBUF_LOCK(&so->so_rcv) not being initialized.
> If you can find out what source line# corresponds to
> clnt_vc_create+0x4f4 you can probably tell which one it is.
>
> All local_rpcb() does is a
>     error = socreate(AF_LOCAL, &so, SOCK_STREAM, 0, curthread->td_ucred,
>         curthread);
> and then calls clnt_vc_create(..so..) with the socket.
>
> I think that socreate() is not initializing one of those two mutexes
> for some reason.
Looks to me like this was caused by commit 681711b.
I've added tuexen@ to the post, since he committed it.

rick

>
> rick
> >
> >
> > Fatal trap 12: page fault while in kernel mode
> > cpuid = 9; apic id = 09
> > fault virtual address = 0x18
> > fault code = supervisor read data, page not present
> > instruction pointer = 0x20:0xffffffff80b208c5
> > stack pointer = 0x28:0xfffffe048c204920
> > frame pointer = 0x28:0xfffffe048c204960
> > code segment = base 0x0, limit 0xfffff, type 0x1b
> > = DPL 0, pres 1, long 1, def32 0, gran 1
> > processor eflags = interrupt enabled, resume, IOPL = 0
> > current process = 1208 (rpc.Starting automountd.
> > lockd)
> > rdi: 0000000000000000 rsi: fffff801078b0740 rdx: 0000000000000000
> > rcx: 000000000000010a r8: ffffffff818d30f0 r9: 0000000000000000
> > rax: 0000000000000000 rbx: 00000000Starting powerd.00000018 rbp: fffffe048c204960
> > r10: 0000000000010000 r11: 0000000000000001 r12: fffff80274e32c18
> > r13: 000000000000010a r14: fffff80274e32c00 r15: ffffffff812ae38a
> > trap number = 12
> > panic: page fault
> > cpuid = 9
> > time = 1712662362
> > KDB: stack backtrace:
> > db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe048c2045f0
> > vpanic() at vpanic+0x135/frame 0xfffffe048c204720
> > panic() at panic+0x43/frame 0xfffffe048c204780
> > trap_fatal() at trap_fatal+0x40b/frame 0xfffffe048c2047e0
> > trap_pfault() at trap_pfault+0xa0/frame 0xfffffe048c204850
> > calltrap() at calltrap+0x8/frame 0xfffffe048c204850
> > --- trap 0xc, rip = 0xffffffff80b208c5, rsp = 0xfffffe048c204920, rbp = 0xfffffe
> > 048c204960 ---
> > __mtx_lock_flags() at __mtx_lock_flags+0x45/frame 0xfffffe048c204960
> > clnt_vc_create() at clnt_vc_create+0x4f4/frame 0xfffffe048c204ab0
> > local_rpcb() at local_rpcb+0x11b/frame 0xfffffe048c204b50
> > rpcb_unset() at rpcb_unset+0x24/frame 0xfffffe048c204bb0
> > svc_tp_create() at svc_tp_create+0xee/frame 0xfffffe048c204c90
> > sys_nlm_syscall() at sys_nlm_syscall+0x3d0/frame 0xfffffe048c204e00
> > amd64_syscall() at amd64_syscall+0x158/frame 0xfffffe048c204f30
> > fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe048c204f30
> > --- syscall (154, FreeBSD ELF64, nlm_syscall), rip = 0x3f00a2dfd2a, rsp = 0x3f00
> > 96f7168, rbp = 0x3f0096f7230 ---
> > KDB: enter: panic
> > [ thread pid 1208 tid 101107 ]
> > Stopped at kdb_enter+0x33: movq $0,0x104eb92(%rip)
> > db>
> >
> >
> > Given suitable clues, I can poke at it a bit -- this is my "build
> > machine," so it doesn't have critical work to do at the moment. (I
> > would normally have powered it down for the day: there's no need for
> > it to be wasting energy.)
> >
> > Laptops are still building ports under stable/14 -- something seems
> > to want the llvm17 port, and they have firefox to build, so they
> > won't be testing CURRENT/head for a while, yet.
> >
> > Peace,
> > david
> > --
> > David H. Wolfskill david@catwhisker.org
> > Alexey Navalny was a courageous man; Putin has made him a martyr.
> >
> > See https://www.catwhisker.org/~david/publickey.gpg for my public key.

From nobody Tue Apr 9 15:06:02 2024
Date: Tue, 9 Apr 2024 17:06:02 +0200
From: Rainer Hurling
To: FreeBSD User
Cc: FreeBSD Ports, FreeBSD CURRENT
Subject: Re: pkg-1.21.0: after upgrade 1.20.9_1 -> 1.21.0: pkg core dumps on specific ports
Message-ID: <72581cbc-8165-4750-9ec9-2d300152d14b@gwdg.de>
References: <20240406090527.34d84eb9@thor.intern.walstatt.dynvpn.de> <20240406095652.1a5f7acb@thor.intern.walstatt.dynvpn.de>
In-Reply-To: <20240406095652.1a5f7acb@thor.intern.walstatt.dynvpn.de>
X-Spamd-Result: default: False [-5.69 / 15.00]; DWL_DNSWL_LOW(-1.00)[gwdg.de:dkim]; NEURAL_HAM_LONG(-1.00)[-1.000]; NEURAL_HAM_SHORT(-1.00)[-1.000]; NEURAL_HAM_MEDIUM(-1.00)[-0.999]; RCVD_DKIM_ARC_DNSWL_MED(-0.50)[]; DMARC_POLICY_ALLOW(-0.50)[gwdg.de,none]; R_SPF_ALLOW(-0.20)[+ip4:134.76.10.0/23]; R_DKIM_ALLOW(-0.20)[gwdg.de:s=2023-rsa]; RCVD_IN_DNSWL_MED(-0.20)[134.76.10.29:received]; MIME_GOOD(-0.10)[text/plain]; XM_UA_NO_VERSION(0.01)[]; RCVD_COUNT_THREE(0.00)[3]; RCVD_TLS_LAST(0.00)[]; ARC_NA(0.00)[]; TO_DN_ALL(0.00)[]; DKIM_TRACE(0.00)[gwdg.de:+]; FREEFALL_USER(0.00)[rhurlin]; MIME_TRACE(0.00)[0:+]; TO_MATCH_ENVRCPT_SOME(0.00)[]; FROM_EQ_ENVFROM(0.00)[]; RCPT_COUNT_THREE(0.00)[3]; HAS_XOIP(0.00)[]; MID_RHS_MATCH_FROM(0.00)[]; MLMMJ_DEST(0.00)[freebsd-ports@freebsd.org,freebsd-current@freebsd.org]; ASN(0.00)[asn:207592, ipnet:134.76.0.0/16, country:DE]; FROM_HAS_DN(0.00)[] X-Rspamd-Queue-Id: 4VDTms6CtDz4Gld Am 06.04.24 um 09:56 schrieb FreeBSD User: > Am Sat, 6 Apr 2024 09:23:30 +0200 > Rainer Hurling schrieb: > >> Am 06.04.24 um 09:05 schrieb FreeBSD User: >>> Hello, >>> >>> after updating (portmaster and make) ports-mgmt/ports from 1.20.9_1 -> 1.21.0 on CURRENT >>> and 14-STABLE, I can't update several ports: >>> >>> www/apache24 >>> databases/redis >>> >>> pkg core dumps while performing installation. apache24 and redis are ports I realized this >>> misbehaviour on ALL 14-STABLE and CURRENT boxes (both OS variants latest builds, i.e. >>> FreeBSD 15.0-CURRENT #32 main-n269135-da2b732288c7: Fri Apr 5 20:30:39 CEST 2024 amd64). >>> >>> After some updates on a poudriere builder (CURRENT base host, 14.0-RELENG jail with >>> poudriere) building packages for 14.0-RELENG, I observed the same behaviour when updating >>> packages on target hosts where pkg is first updated, on those hosts, nextcloud-server and >>> icinga2 host utilizing also databases/redis and www/apache24, pkg fails the same way. 
>>> >>> I do not dare to update our poudriere hosts since the problem seems to pop up when pkg >>> 1.21.0 is installed, no matter whether I use poudriere built ports (from our own builder >>> hosts) or recent source tree with portmaster/make build process. >>> >>> Looks like a serious bug to me and not a site/user specific problem. Hopefully others do >>> realize the same ... >>> >>> Thanks in advance, >>> >>> oh >> >> >> Hmm, I just tried to reproduce that. Both ports mentioned, >> databases/redis and www/apache24, can be built and installed with >> Portmaster. The box is a 15.0-CURRENT with pkg-1.21.0. >> >> Maybe 'pkg check -Bn' or 'portmaster --check-depends --check-port-dbdir' >> show some inconsistencies? >> >> Best wishes, >> Rainer >> > > Hello, > > thanks for the quick response. > > I checked on the CURRENT systems here at hand and must confess - it is a mess! pkg check -Bn > dropped a lot of missing shared objects missing from autotools and missing guile2 :-( > > Thank you very much, > oh > You're really welcome. I myself have failed several times precisely because some dependencies were not in order. 
And that's not always obvious :) Best wishes, Rainer From nobody Tue Apr 9 15:10:52 2024 X-Original-To: freebsd-current@mlmmj.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mlmmj.nyi.freebsd.org (Postfix) with ESMTP id 4VDTtQ3H1Hz5H9PS; Tue, 9 Apr 2024 15:10:54 +0000 (UTC) (envelope-from rhurlin@gwdg.de) Received: from mx-2023-1.gwdg.de (mx-2023-1.gwdg.de [134.76.10.21]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange ECDHE (P-256) server-signature RSA-PSS (4096 bits) server-digest SHA256) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 4VDTtQ1SHFz4JcX; Tue, 9 Apr 2024 15:10:54 +0000 (UTC) (envelope-from rhurlin@gwdg.de) Authentication-Results: mx1.freebsd.org; none DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=gwdg.de; s=2023-rsa; h=Content-Transfer-Encoding:Content-Type:In-Reply-To:From: Reply-To:References:CC:To:Subject:MIME-Version:Date:Message-ID:Sender: Content-ID:Content-Description:Resent-Date:Resent-From:Resent-Sender: Resent-To:Resent-Cc:Resent-Message-ID:List-Id:List-Help:List-Unsubscribe: List-Subscribe:List-Post:List-Owner:List-Archive; bh=6ubEYTJ51cW2osAuI5sBif/ywX+/CTD8jnyzqlY0bxg=; b=dWU5cdraeIIyCz3c0k/mXFVXvq X8J2/k7rRrGQJsF5WZIF/VN63vQ1SBXru3BJzawIng2hXzjRfSODZQmrEAzVyDWAP83jUmm57gPf9 HwF6XDSn0OIdv/hJ9RsAUHXOTasegaN/eg5I7wPEaSM6IYtf8A2Ssw8DaqMKTN0zVgDeF1MbRQ9Bd Cu0Sx9umR4jnTdJlO4BhrZ3AQVF+uwZ0UVhMLlUmymBCz2nolaE9/Ppvl2hvue3bB5+cdb8iye014 jzN8Fic8G+B/lqqTVnxiQ+UU1S002zIMqRFECbC4qNRCP1vgIsATme519V2KkGHCC5btou3n3bd0T kOvYvc3Q==; Received: from xmailer.gwdg.de ([134.76.10.29]:51815) by mailer.gwdg.de with esmtp (GWDG Mailer) (envelope-from ) id 1ruD7d-007n38-12; Tue, 09 Apr 2024 17:10:53 +0200 Received: from mbx19-gwd-03.um.gwdg.de ([10.108.142.56] helo=email.gwdg.de) by mailer.gwdg.de with esmtps (TLS1.2:ECDHE-RSA-AES256-GCM-SHA384:256) (GWDG Mailer) (envelope-from ) id 1ruD7d-0007Pj-0I; Tue, 09 Apr 2024 17:10:53 
+0200 Received: from [192.168.178.23] (10.250.9.199) by MBX19-GWD-03.um.gwdg.de (10.108.142.56) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256) id 15.2.1544.9; Tue, 9 Apr 2024 17:10:53 +0200 Message-ID: Date: Tue, 9 Apr 2024 17:10:52 +0200 List-Id: Discussions about the use of FreeBSD-current List-Archive: https://lists.freebsd.org/archives/freebsd-current List-Help: List-Post: List-Subscribe: List-Unsubscribe: Sender: owner-freebsd-current@freebsd.org MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: pkg-1.21.0: after upgrade 1.20.9_1 -> 1.21.0: pkg core dumps on specific ports Content-Language: en-US To: Baptiste Daroussin CC: FreeBSD User , FreeBSD Ports , FreeBSD CURRENT References: <20240406090527.34d84eb9@thor.intern.walstatt.dynvpn.de> <4dsxkdn2uacxuxnvradf47yvhxx6z6pzbqezpfzn4qvbludhld@ome2p54xt2et> Reply-To: Rainer Hurling From: Rainer Hurling In-Reply-To: <4dsxkdn2uacxuxnvradf47yvhxx6z6pzbqezpfzn4qvbludhld@ome2p54xt2et> Content-Type: text/plain; charset="UTF-8"; format=flowed Content-Transfer-Encoding: 7bit X-Originating-IP: [10.250.9.199] X-ClientProxiedBy: excmbx-22.um.gwdg.de (134.76.9.232) To MBX19-GWD-03.um.gwdg.de (10.108.142.56) X-Virus-Scanned: (clean) by clamav X-Spam-Level: - X-Spamd-Bar: ---- X-Rspamd-Pre-Result: action=no action; module=replies; Message is reply to one we originated X-Spamd-Result: default: False [-4.00 / 15.00]; REPLY(-4.00)[]; ASN(0.00)[asn:207592, ipnet:134.76.0.0/16, country:DE] X-Rspamd-Queue-Id: 4VDTtQ1SHFz4JcX Am 09.04.24 um 09:20 schrieb Baptiste Daroussin: > On Sat 06 Apr 09:23, Rainer Hurling wrote: >> Am 06.04.24 um 09:05 schrieb FreeBSD User: >>> Hello, >>> >>> after updating (portmaster and make) ports-mgmt/ports from 1.20.9_1 -> 1.21.0 on CURRENT and >>> 14-STABLE, I can't update several ports: >>> >>> www/apache24 >>> databases/redis >>> >>> pkg core dumps while performing installation. 
apache24 and redis are ports I realized this
>>> misbehaviour on ALL 14-STABLE and CURRENT boxes (both OS variants latest builds, i.e. FreeBSD
>>> 15.0-CURRENT #32 main-n269135-da2b732288c7: Fri Apr 5 20:30:39 CEST 2024 amd64).
>>>
>>> After some updates on a poudriere builder (CURRENT base host, 14.0-RELENG jail with poudriere)
>>> building packages for 14.0-RELENG, I observed the same behaviour when updating packages on
>>> target hosts where pkg is first updated, on those hosts, nextcloud-server and icinga2 host
>>> utilizing also databases/redis and www/apache24, pkg fails the same way.
>>>
>>> I do not dare to update our poudriere hosts since the problem seems to pop up when pkg 1.21.0
>>> is installed, no matter whether I use poudriere built ports (from our own builder hosts) or
>>> recent source tree with portmaster/make build process.
>>>
>>> Looks like a serious bug to me and not a site/user specific problem. Hopefully others do
>>> realize the same ...
>>>
>>> Thanks in advance,
>>>
>>> oh
>>
>>
>> Hmm, I just tried to reproduce that. Both ports mentioned, databases/redis
>> and www/apache24, can be built and installed with Portmaster. The box is a
>> 15.0-CURRENT with pkg-1.21.0.
>>
>> Maybe 'pkg check -Bn' or 'portmaster --check-depends --check-port-dbdir'
>> show some inconsistencies?
>>
>> Best wishes,
>> Rainer
>>
>>
> using portmaster or not are strictly unlikely to be helpful here.
>
> The right way to test if to report running with pkg -dddd and also to recommand
> testing with default options in pkg.conf.
>
> Best regards,
> Bapt

This is correct and certainly better. I was not aware of this.

Fortunately, my less optimal suggestions helped O. Hartmann in this case
to find the missing and outdated dependencies.

In any case, many thanks for this helpful advice.
Regards, Rainer From nobody Tue Apr 9 15:33:06 2024 X-Original-To: current@mlmmj.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mlmmj.nyi.freebsd.org (Postfix) with ESMTP id 4VDVNL2zpnz5HBl3 for ; Tue, 9 Apr 2024 15:33:22 +0000 (UTC) (envelope-from rick.macklem@gmail.com) Received: from mail-pj1-x1031.google.com (mail-pj1-x1031.google.com [IPv6:2607:f8b0:4864:20::1031]) (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256 client-signature RSA-PSS (2048 bits) client-digest SHA256) (Client CN "smtp.gmail.com", Issuer "GTS CA 1D4" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 4VDVNK3vCYz4Mqg; Tue, 9 Apr 2024 15:33:21 +0000 (UTC) (envelope-from rick.macklem@gmail.com) Authentication-Results: mx1.freebsd.org; dkim=pass header.d=gmail.com header.s=20230601 header.b=OYFRX4WN; dmarc=pass (policy=none) header.from=gmail.com; spf=pass (mx1.freebsd.org: domain of rick.macklem@gmail.com designates 2607:f8b0:4864:20::1031 as permitted sender) smtp.mailfrom=rick.macklem@gmail.com Received: by mail-pj1-x1031.google.com with SMTP id 98e67ed59e1d1-2a484f772e2so2656942a91.3; Tue, 09 Apr 2024 08:33:21 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1712676798; x=1713281598; darn=freebsd.org; h=content-transfer-encoding:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=GrjiRaul9XgcFBy+gOYKeCS0BUrQn8hq6yEJ7HWnkHQ=; b=OYFRX4WNwTF+jx4yhLglHedoRK/N5ppMNvI6MjJKhacBpV3/QFkb0dPgsxgzEoDF7V unic2M2nCYs+dFk0pJVagPQnv7bZrcu2PhS/sqF5OB1IFYQ98lyRIv8EVnfq+2ZZIg1u uW+fVZLdMaJZlQYR3kPeM442BqjtxJZJC2qhT2RRjj+VwCO6p++mJZ+kLPgJH/dt7Ygi dsdHV1G2rUujE40aDbyTOZtd4A30SVLs1oKk5/YSnC5LFMMCAir5yKbKxSUVd4/CD3HA wPHH2Tv3b/1W+yhhxQ2ECGys92b4w8tc9Z2iTOP4iwXRgSuMhFfNBKNP55LVnOIvaHVZ /DSw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; 
d=1e100.net; s=20230601; t=1712676798; x=1713281598; h=content-transfer-encoding:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=GrjiRaul9XgcFBy+gOYKeCS0BUrQn8hq6yEJ7HWnkHQ=; b=GfNRTs2AvEKadq+Uib4so0bE0urQT+AFxgc6+DFxnVhndEnBwunuc8IG5Yj08xlm5i Zqjf185wb6ZprOuroapnWy9MC7I2HS2cGpPsMUwDnZE3kVGFf9dxm0km4Hjki9k5VBqw s1MBECf6Gy+HE6ptgWsOuToJAvvHcXAOxgyRWZvEoKtinwavQ2/ir+MNadzcS8Ig5D6d e0hkqvXVJ7XnoY3wCRj+YI7TCO0Qz8Z6WuIho4r/ltq13Xu+zN/maehH57SS+U9fyf+5 vdlQo4k1NY6p8ynP25xFnrQRyyyKhJpTQP7Pbny2NHhXrlNFlg+OEhOw6+MmxrmmKv+N TRtA== X-Forwarded-Encrypted: i=1; AJvYcCUngohFjGhL6Y4qkvQHs2+5ktvPSbJlHdP6wmnEqqWbIbqtT6B7N5egUTkFYOR2zrDh3rfWzHeChlE3iip8U7gJV0usRVHC1ISCyGOKKv1QTqVYbgF4SBM= X-Gm-Message-State: AOJu0YzR8g39gi3SldQy+beUF8j8ZgC8fNPD4aD/3Ky04HPUMmhSe1ta n2wi00OqTCNGEcwOr4ijxu4qLGaIuL/Ymm44iZF0sU5RkBF36ifHpmnLhGLuTeefBsvtNANw7IU RewQghORDFAxk2jon6mTLYAEx6AYc3WU= X-Google-Smtp-Source: AGHT+IFWYBc9pbMGrU3xci4dzFb8ueTCsHQkCGsNU8L7Thc3NICgkyHlxEAX+mo8pvAlTMPBH0pHcCiA4I/SH4JOP0E= X-Received: by 2002:a17:90a:a005:b0:2a2:ea20:2074 with SMTP id q5-20020a17090aa00500b002a2ea202074mr11439385pjp.23.1712676797949; Tue, 09 Apr 2024 08:33:17 -0700 (PDT) List-Id: Discussions about the use of FreeBSD-current List-Archive: https://lists.freebsd.org/archives/freebsd-current List-Help: List-Post: List-Subscribe: List-Unsubscribe: Sender: owner-freebsd-current@freebsd.org MIME-Version: 1.0 References: In-Reply-To: From: Rick Macklem Date: Tue, 9 Apr 2024 08:33:06 -0700 Message-ID: Subject: Re: Panic after update main-n269202-4e7aa03b7076 -> n269230-f6f67f58c19d To: current@freebsd.org, tuexen@freebsd.org, Gleb Smirnoff Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Spamd-Bar: --- X-Spamd-Result: default: False [-3.42 / 15.00]; NEURAL_HAM_MEDIUM(-1.00)[-1.000]; NEURAL_HAM_LONG(-1.00)[-1.000]; DMARC_POLICY_ALLOW(-0.50)[gmail.com,none]; 
NEURAL_HAM_SHORT(-0.42)[-0.424]; R_SPF_ALLOW(-0.20)[+ip6:2607:f8b0:4000::/36:c]; R_DKIM_ALLOW(-0.20)[gmail.com:s=20230601]; MIME_GOOD(-0.10)[text/plain]; RCVD_TLS_LAST(0.00)[]; TO_MATCH_ENVRCPT_ALL(0.00)[]; ARC_NA(0.00)[]; FROM_HAS_DN(0.00)[]; TO_DN_SOME(0.00)[]; MIME_TRACE(0.00)[0:+]; DWL_DNSWL_NONE(0.00)[gmail.com:dkim]; FREEMAIL_FROM(0.00)[gmail.com]; TAGGED_FROM(0.00)[]; RCPT_COUNT_THREE(0.00)[3]; RCVD_IN_DNSWL_NONE(0.00)[2607:f8b0:4864:20::1031:from]; MID_RHS_MATCH_FROMTLD(0.00)[]; FROM_EQ_ENVFROM(0.00)[]; DKIM_TRACE(0.00)[gmail.com:+]; MLMMJ_DEST(0.00)[current@freebsd.org]; RCVD_COUNT_ONE(0.00)[1]; ASN(0.00)[asn:15169, ipnet:2607:f8b0::/32, country:US]; MISSING_XM_UA(0.00)[]; FREEMAIL_ENVFROM(0.00)[gmail.com]
X-Rspamd-Queue-Id: 4VDVNK3vCYz4Mqg

On Tue, Apr 9, 2024 at 8:04 AM Rick Macklem wrote:
>
> On Tue, Apr 9, 2024 at 7:46 AM Rick Macklem wrote:
> >
> > On Tue, Apr 9, 2024 at 4:47 AM David Wolfskill wrote:
> > >
> > > Machine had been running:
> > >
> > > FreeBSD 15.0-CURRENT #43 main-n269202-4e7aa03b7076: Mon Apr 8 11:19:58 UTC 2024 root@freebeast.catwhisker.org:/common/S4/obj/usr/src/amd64.amd64/sys/GENERIC amd64 1500018 1500018
> > >
> > > This was an in-place source update, after updating sources to
> > > main-n269230-f6f67f58c19d. On reboot (after "make installworld"
> > > completed, I see this on the serial console (copy/pasted):
> > >
> > > ...
> > > Starting lockd.
> > I'd guess this is caused by some recent change to AF_UNIX socket
> > creation. The crash appears to be either the SOCK_LOCK() or
> > SOCKBUF_LOCK(&so->so_rcv) not being initialized.
> > If you can find out what source line# corresponds to
> > clnt_vc_create+0x4f4 you can probably tell which one it is.
> >
> > All local_rpcb() does is a
> > error = socreate(AF_LOCAL, &so, SOCK_STREAM, 0, curthread->td_ucred,
> > curthread);
> > and then calls clnt_vc_create(..so..) with the socket.
> >
> > I think that socreate() is not initializing one of those two mutexes
> > for some reason.
> Looks to me like this was caused by commit 681711b. I've added tuexen@
> to the post, since he committed it.
Oops, my bad, got this wrong. The commit is d80a97d, which added
PR_SOCKBUF to the pr_flags for AF_UNIX/SOCK_STREAM.
I've added glebius@ to the email.

rick

>
> rick
>
> >
> > rick
> >
> > >
> > >
> > > Fatal trap 12: page fault while in kernel mode
> > > cpuid = 9; apic id = 09
> > > fault virtual address = 0x18
> > > fault code = supervisor read data, page not present
> > > instruction pointer = 0x20:0xffffffff80b208c5
> > > stack pointer = 0x28:0xfffffe048c204920
> > > frame pointer = 0x28:0xfffffe048c204960
> > > code segment = base 0x0, limit 0xfffff, type 0x1b
> > > = DPL 0, pres 1, long 1, def32 0, gran 1
> > > processor eflags = interrupt enabled, resume, IOPL = 0
> > > current process = 1208 (rpc.Starting automountd.
> > > lockd)
> > > rdi: 0000000000000000 rsi: fffff801078b0740 rdx: 0000000000000000
> > > rcx: 000000000000010a r8: ffffffff818d30f0 r9: 0000000000000000
> > > rax: 0000000000000000 rbx: 00000000Starting powerd.00000018 rbp: fffffe048c204960
> > > r10: 0000000000010000 r11: 0000000000000001 r12: fffff80274e32c18
> > > r13: 000000000000010a r14: fffff80274e32c00 r15: ffffffff812ae38a
> > > trap number = 12
> > > panic: page fault
> > > cpuid = 9
> > > time = 1712662362
> > > KDB: stack backtrace:
> > > db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe048c2045f0
> > > vpanic() at vpanic+0x135/frame 0xfffffe048c204720
> > > panic() at panic+0x43/frame 0xfffffe048c204780
> > > trap_fatal() at trap_fatal+0x40b/frame 0xfffffe048c2047e0
> > > trap_pfault() at trap_pfault+0xa0/frame 0xfffffe048c204850
> > > calltrap() at calltrap+0x8/frame 0xfffffe048c204850
> > > --- trap 0xc, rip = 0xffffffff80b208c5, rsp = 0xfffffe048c204920, rbp = 0xfffffe
> > > 048c204960 ---
> > > __mtx_lock_flags() at __mtx_lock_flags+0x45/frame 0xfffffe048c204960
> > > clnt_vc_create() at clnt_vc_create+0x4f4/frame 0xfffffe048c204ab0
> > > local_rpcb() at local_rpcb+0x11b/frame 0xfffffe048c204b50
> > > rpcb_unset() at rpcb_unset+0x24/frame 0xfffffe048c204bb0
> > > svc_tp_create() at svc_tp_create+0xee/frame 0xfffffe048c204c90
> > > sys_nlm_syscall() at sys_nlm_syscall+0x3d0/frame 0xfffffe048c204e00
> > > amd64_syscall() at amd64_syscall+0x158/frame 0xfffffe048c204f30
> > > fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe048c204f30
> > > --- syscall (154, FreeBSD ELF64, nlm_syscall), rip = 0x3f00a2dfd2a, rsp = 0x3f00
> > > 96f7168, rbp = 0x3f0096f7230 ---
> > > KDB: enter: panic
> > > [ thread pid 1208 tid 101107 ]
> > > Stopped at kdb_enter+0x33: movq $0,0x104eb92(%rip)
> > > db>
> > >
> > >
> > > Given suitable clues, I can poke at it a bit -- this is my "build
> > > machine," so it doesn't have critical work to do at the moment. (I
> > > would normally have powered it down for the day: here's no need for
> > > it to be wasting energy.)
> > >
> > > Laptops are still building ports under stable/14 -- something seems
> > > to want the llvm17 port, and they have firefox to build, so they
> > > won't be testing CURRENT/head for a while, yet.
> > >
> > > Peace,
> > > david
> > > --
> > > David H. Wolfskill david@catwhisker.org
> > > Alexey Navalny was a courageous man; Putin has made him a martyr.
> > >
> > > See https://www.catwhisker.org/~david/publickey.gpg for my public key.
From nobody Tue Apr 9 16:06:12 2024 X-Original-To: freebsd-current@mlmmj.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mlmmj.nyi.freebsd.org (Postfix) with ESMTP id 4VDW6y2Bzfz5HFD6; Tue, 9 Apr 2024 16:06:50 +0000 (UTC) (envelope-from freebsd@walstatt-de.de) Received: from smtp052.goneo.de (smtp052.goneo.de [85.220.129.60]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 4VDW6x6wGVz4TKB; Tue, 9 Apr 2024 16:06:49 +0000 (UTC) (envelope-from freebsd@walstatt-de.de) Authentication-Results: mx1.freebsd.org; none Received: from hub2.goneo.de (hub2.goneo.de [85.220.129.53]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits)) (No client certificate requested) by smtp5.goneo.de (Postfix) with ESMTPS id A7BB5240AD8; Tue, 9 Apr 2024 18:06:42 +0200 (CEST) Received: from hub2.goneo.de (localhost [127.0.0.1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits)) (No client certificate requested) by hub2.goneo.de (Postfix) with ESMTPS id 1568B24033A; Tue, 9 Apr 2024 18:06:41 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=walstatt-de.de; s=DKIM001; t=1712678801; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Vr2tM7ufCZIujkuhHYwmFfbAZz0aZ9tYbpeiRfJCqIs=; b=jRNA/Tk8f1ce+pkqIH+kDLqQ+zEAEViG5EKOvUFSc+Q2s8QN4dM8MAgFLhXSLI3At6jt73 DbA/rp5pM0Q7eB2flADyTvZVY95RoZ0jTNjir/rDrtga0//qCrYZBkIEt8GX52QiQhdRpj OJN1KisY025EAFuhONeFtMNeVi/pfrgnCpsguRV+WtmounTb8bsMN8IPWTWn+R8xuUFfGK 
YJfrhJOIJc1sxafyK6zy6yPguObaBMONOSBDD+fYAvgqh5TWPeKEqRfCetd24UaydBSKEJ xei8UjPXCSR1y+0oBqUQR9kCkVh8/GARzIZFhBfEFwR7aa2aQzH38A4omw8V3g== Received: from thor.intern.walstatt.dynvpn.de (dynamic-089-012-184-064.89.12.pool.telefonica.de [89.12.184.64]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange ECDHE (prime256v1) server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by hub2.goneo.de (Postfix) with ESMTPSA id B78D0240308; Tue, 9 Apr 2024 18:06:40 +0200 (CEST) Date: Tue, 9 Apr 2024 18:06:12 +0200 From: FreeBSD User To: Rainer Hurling Cc: Rainer Hurling , Baptiste Daroussin , FreeBSD Ports , FreeBSD CURRENT Subject: Re: pkg-1.21.0: after upgrade 1.20.9_1 -> 1.21.0: pkg core dumps on specific ports Message-ID: <20240409180639.55211744@thor.intern.walstatt.dynvpn.de> In-Reply-To: References: <20240406090527.34d84eb9@thor.intern.walstatt.dynvpn.de> <4dsxkdn2uacxuxnvradf47yvhxx6z6pzbqezpfzn4qvbludhld@ome2p54xt2et> Organization: walstatt-de.de List-Id: Discussions about the use of FreeBSD-current List-Archive: https://lists.freebsd.org/archives/freebsd-current List-Help: List-Post: List-Subscribe: List-Unsubscribe: Sender: owner-freebsd-current@freebsd.org MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Rspamd-UID: 0a7c43 X-Rspamd-UID: ee5b3f X-Spamd-Bar: ---- X-Rspamd-Pre-Result: action=no action; module=replies; Message is reply to one we originated X-Spamd-Result: default: False [-4.00 / 15.00]; REPLY(-4.00)[]; ASN(0.00)[asn:25394, ipnet:85.220.128.0/17, country:DE] X-Rspamd-Queue-Id: 4VDW6x6wGVz4TKB Am Tue, 9 Apr 2024 17:10:52 +0200 Rainer Hurling schrieb: > Am 09.04.24 um 09:20 schrieb Baptiste Daroussin: > > On Sat 06 Apr 09:23, Rainer Hurling wrote: > >> Am 06.04.24 um 09:05 schrieb FreeBSD User: > >>> Hello, > >>> > >>> after updating (portmaster and make) ports-mgmt/ports from 1.20.9_1 -> 1.21.0 on CURRENT > >>> and 14-STABLE, I can't update several 
ports: > >>> > >>> www/apache24 > >>> databases/redis > >>> > >>> pkg core dumps while performing installation. apache24 and redis are ports I realized > >>> this misbehaviour on ALL 14-STABLE and CURRENT boxes (both OS variants latest builds, > >>> i.e. FreeBSD 15.0-CURRENT #32 main-n269135-da2b732288c7: Fri Apr 5 20:30:39 CEST 2024 > >>> amd64). > >>> > >>> After some updates on a poudriere builder (CURRENT base host, 14.0-RELENG jail with > >>> poudriere) building packages for 14.0-RELENG, I observed the same behaviour when > >>> updating packages on target hosts where pkg is first updated, on those hosts, > >>> nextcloud-server and icinga2 host utilizing also databases/redis and www/apache24, pkg > >>> fails the same way. > >>> > >>> I do not dare to update our poudriere hosts since the problem seems to pop up when pkg > >>> 1.21.0 is installed, no matter whether I use poudriere built ports (from our own builder > >>> hosts) or recent source tree with portmaster/make build process. > >>> > >>> Looks like a serious bug to me and not a site/user specific problem. Hopefully others do > >>> realize the same ... > >>> > >>> Thanks in advance, > >>> > >>> oh > >> > >> > >> Hmm, I just tried to reproduce that. Both ports mentioned, databases/redis > >> and www/apache24, can be built and installed with Portmaster. The box is a > >> 15.0-CURRENT with pkg-1.21.0. > >> > >> Maybe 'pkg check -Bn' or 'portmaster --check-depends --check-port-dbdir' > >> show some inconsistencies? > >> > >> Best wishes, > >> Rainer > >> > >> > > using portmaster or not are strictly unlikely to be helpful here. > > > > The right way to test if to report running with pkg -dddd and also to recommand > > testing with default options in pkg.conf. > > > > Best regards, > > Bapt > > This is correct and certainly better. I was not aware of this. > > Fortunately, my less optimal suggestions helped O. Hartmann in this case > to find the missing and outdated dependencies. 
> > In any case, many thanks for this helpfull advice.
> >
> Regards,
> Rainer
>
>

Hello,

@Baptiste: it should be pkg -d, shouldn't it? Or am I missing something again here?

With today's update to pkg 1.21.1 the problem has vanished.

@R. Hurling: Thanks for the tip about using the checks. I had missed that, and it revealed
some problems here that I have hopefully fixed by now.

Kind regards and thanks,

oh

--
O. Hartmann

From nobody Tue Apr 9 16:18:49 2024
X-Original-To: current@mlmmj.nyi.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mlmmj.nyi.freebsd.org (Postfix) with ESMTP id 4VDWNr5zvnz5HGBM for ; Tue, 9 Apr 2024 16:18:52 +0000 (UTC) (envelope-from glebius@freebsd.org)
Received: from smtp.freebsd.org (smtp.freebsd.org [IPv6:2610:1c1:1:606c::24b:4]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256 client-signature RSA-PSS (4096 bits) client-digest SHA256) (Client CN "smtp.freebsd.org", Issuer "R3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 4VDWNr59kWz4XZr; Tue, 9 Apr 2024 16:18:52 +0000 (UTC) (envelope-from glebius@freebsd.org)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=freebsd.org; s=dkim; t=1712679532; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=JEdHmo8ZHYyNTuvI5y7/3js0Pnp88apcwKTDSxRtSeU=; b=nKhPRT58ICdJsYNiXtJV6jSk+AvX7vEs8cCW4GUWUP6L5+nq0bIQwibjUaTuTUB4CMi3uR NUJY8E82X9Kl+yCV/tPFEUoIX8yRnsV2DDW7C8vCS4GFScdQteiiqoC1Hem6u/+DDiYK47 zYn6VDrFZpl1yxnzGb/hSsWZITcIlQTaP2cwEREbmycVOco/p4dQASc8vlZC9GmhgSgZUO 9B2GiFQhXqrM70SX4dJnICJAmGvSkE8TkiI/HOhrKONX3g6NebulL480ARAEuOaFrnB9CK Dtripi7sPj7E2f9dPkjPs+yY+4IGze96nkuzfLFg6UY7rkIYnXbrV+r84UTQMA==
ARC-Seal: i=1; s=dkim; d=freebsd.org; t=1712679532; a=rsa-sha256; cv=none;
b=s1Th4R7pE7BydUP0jtpyPf4KDNuGtviODZMwpA+CEC3vZivqjO2rFiOkIEQNvJYkiHHkxD R8qFNfgzvyMzyWMNNLGXtWJiP0P7VK0muc+EHA0HdFtfp0hdQ1TJsD/ovUbI4ThxNMEoTo Sz7Ku9A2w96J2GJDxZkCMzTIf53QP8O7IsB/f5J0lc8+ZsFa/oJ5cMwOFbRsWpBhXlcxy9 6XHaY8xOudL/kj/dDz0J2nUJHVaFC622RaGQ7i5O1edd6R3bwmZTa864gVOtj3qvPSjpUu 4AcqHH15NmDQoX0q/UYOEy4RES6jXE4v0lgIKX335sU8eDOfXDKuG+SGbhpLSg== ARC-Authentication-Results: i=1; mx1.freebsd.org; none ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=freebsd.org; s=dkim; t=1712679532; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=JEdHmo8ZHYyNTuvI5y7/3js0Pnp88apcwKTDSxRtSeU=; b=GMjIFWNHtyKJwiHrTRXxrrHz3VNiGiNYuycx4rCNZiL2PmCIxCozloGI3Dm0dP/eW3+VOD NwtULgzEGNlOB+IUQBI3FOhfCTv37G6ZIK6kjSU7EmEm566ZLTocf4FBxebYtKUAgGkzBB sd+2iKJpaylGkUa9u1ivAaGbXop6HuHCZ5GpWNGbfEQHI4lzDbK3Ej5hsaOmnhCLS6Jnb4 PQNEeK71lsyFPpHewrRTrJsZjTfV+G0X/4BgkaLHCMOfVF52J5wfkyn3ic0RCqk0Q8pMxX V/ma3EkQcB9XWsEiTV+Pl74n0qa0wQbKobtBk7GlFAqYbkwNs9uJOfMu6HYOrA== Received: from cell.glebi.us (glebi.us [162.251.186.162]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (Client did not present a certificate) (Authenticated sender: glebius) by smtp.freebsd.org (Postfix) with ESMTPSA id 4VDWNr1jqsz1B9Z; Tue, 9 Apr 2024 16:18:52 +0000 (UTC) (envelope-from glebius@freebsd.org) Date: Tue, 9 Apr 2024 09:18:49 -0700 From: Gleb Smirnoff To: current@freebsd.org, David Wolfskill Subject: Re: Panic after update main-n269202-4e7aa03b7076 -> n269230-f6f67f58c19d Message-ID: References: List-Id: Discussions about the use of FreeBSD-current List-Archive: https://lists.freebsd.org/archives/freebsd-current List-Help: List-Post: List-Subscribe: List-Unsubscribe: Sender: owner-freebsd-current@freebsd.org MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: 
inline In-Reply-To:

On Tue, Apr 09, 2024 at 04:47:07AM -0700, David Wolfskill wrote:
D> --- trap 0xc, rip = 0xffffffff80b208c5, rsp = 0xfffffe048c204920, rbp = 0xfffffe048c204960 ---
D> __mtx_lock_flags() at __mtx_lock_flags+0x45/frame 0xfffffe048c204960
D> clnt_vc_create() at clnt_vc_create+0x4f4/frame 0xfffffe048c204ab0
D> local_rpcb() at local_rpcb+0x11b/frame 0xfffffe048c204b50
D> rpcb_unset() at rpcb_unset+0x24/frame 0xfffffe048c204bb0
D> svc_tp_create() at svc_tp_create+0xee/frame 0xfffffe048c204c90
D> sys_nlm_syscall() at sys_nlm_syscall+0x3d0/frame 0xfffffe048c204e00
D> amd64_syscall() at amd64_syscall+0x158/frame 0xfffffe048c204f30
D> fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe048c204f30
D> --- syscall (154, FreeBSD ELF64, nlm_syscall), rip = 0x3f00a2dfd2a, rsp = 0x3f0096f7168, rbp = 0x3f0096f7230 ---
D> KDB: enter: panic
D> [ thread pid 1208 tid 101107 ]
D> Stopped at kdb_enter+0x33: movq $0,0x104eb92(%rip)
D> db>

This should be fixed by the just-pushed commit e205fd318a296ffdb7392486cdcec7f660fcffcf.

Sorry for that!

-- Gleb Smirnoff From nobody Tue Apr 9 16:41:15 2024 X-Original-To: current@mlmmj.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mlmmj.nyi.freebsd.org (Postfix) with ESMTP id 4VDWtk0N0Nz5HHw9 for ; Tue, 9 Apr 2024 16:41:18 +0000 (UTC) (envelope-from david@catwhisker.org) Received: from mx.catwhisker.org (mx.catwhisker.org [107.204.234.170]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 4VDWtj0KFJz4b5g; Tue, 9 Apr 2024 16:41:16 +0000 (UTC) (envelope-from david@catwhisker.org) Authentication-Results: mx1.freebsd.org; none Received: from albert.catwhisker.org (localhost [127.0.0.1]) by albert.catwhisker.org (8.18.1/8.18.1) with ESMTP id 439GfFT9056608; Tue, 9 Apr 2024 16:41:15 GMT (envelope-from david@albert.catwhisker.org) Received: (from david@localhost) by albert.catwhisker.org (8.18.1/8.18.1/Submit) id 439GfFa5056607; Tue, 9 Apr 2024 09:41:15 -0700 (PDT) (envelope-from david) Date: Tue, 9 Apr 2024 09:41:15 -0700 From: David Wolfskill To: Gleb Smirnoff Cc: current@freebsd.org Subject: Re: Panic after update main-n269202-4e7aa03b7076 -> n269230-f6f67f58c19d Message-ID: Mail-Followup-To: David Wolfskill , Gleb Smirnoff , current@freebsd.org References: List-Id: Discussions about the use of FreeBSD-current List-Archive: https://lists.freebsd.org/archives/freebsd-current List-Help: List-Post: List-Subscribe: List-Unsubscribe: Sender: owner-freebsd-current@freebsd.org MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha512; protocol="application/pgp-signature"; boundary="YoqVKM1RoDeO2Cqc" Content-Disposition: inline In-Reply-To: X-Spamd-Bar: ---- X-Rspamd-Pre-Result: action=no action; module=replies; Message is reply to one we originated X-Spamd-Result: default: False [-4.00 / 15.00]; REPLY(-4.00)[]; ASN(0.00)[asn:7018, 
ipnet:107.192.0.0/12, country:US] X-Rspamd-Queue-Id: 4VDWtj0KFJz4b5g

--YoqVKM1RoDeO2Cqc
Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable

On Tue, Apr 09, 2024 at 09:18:49AM -0700, Gleb Smirnoff wrote:
> ...
> D> db>
>
> This should be fixed by the just-pushed commit e205fd318a296ffdb7392486cdcec7f660fcffcf.

Thanks! :-)

> Sorry for that!
> ....

Glad it's identified & addressed.

[Sorry for the delay; commute this morning was a bit more turbulent than usual.]

Peace,
david
-- 
David H. Wolfskill david@catwhisker.org
Alexey Navalny was a courageous man; Putin has made him a martyr.

See https://www.catwhisker.org/~david/publickey.gpg for my public key.

--YoqVKM1RoDeO2Cqc
Content-Type: application/pgp-signature; name="signature.asc"

-----BEGIN PGP SIGNATURE-----

iNUEARYKAH0WIQSTLzOSbomIK53fjFliipiWhXYx5QUCZhVvq18UgAAAAAAuAChp
c3N1ZXItZnByQG5vdGF0aW9ucy5vcGVucGdwLmZpZnRoaG9yc2VtYW4ubmV0OTMy
RjMzOTI2RTg5ODgyQjlEREY4QzU5NjI4QTk4OTY4NTc2MzFFNQAKCRBiipiWhXYx
5fmHAP9NwXQ3W0apApe8UXEOMmNFEJ2MA/7/MrwG2PYoNMlBfQD/YGLCBLC9GUij
FssJCPtLxwWQLtPD6+CJ/ZRbL2sdxAA=
=j7CZ
-----END PGP SIGNATURE-----

--YoqVKM1RoDeO2Cqc--

From nobody Tue Apr 9 17:02:11 2024 X-Original-To: current@mlmmj.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mlmmj.nyi.freebsd.org (Postfix) with ESMTP id 4VDXMc69sCz5HKry for ; Tue, 9 Apr 2024 17:02:52 +0000 (UTC) (envelope-from freebsd@walstatt-de.de) Received: from smtp6.goneo.de (smtp6.goneo.de [85.220.129.31]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 4VDXMc3ygHz4dQc; Tue, 9 Apr 2024 17:02:52 +0000 (UTC) (envelope-from freebsd@walstatt-de.de) Authentication-Results: mx1.freebsd.org; none Received: from hub1.goneo.de (hub1.goneo.de [IPv6:2001:1640:5::8:52]) (using TLSv1.3 with
cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits)) (No client certificate requested) by smtp6.goneo.de (Postfix) with ESMTPS id 94F982407DE; Tue, 9 Apr 2024 19:02:45 +0200 (CEST) Received: from hub1.goneo.de (localhost [127.0.0.1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits)) (No client certificate requested) by hub1.goneo.de (Postfix) with ESMTPS id 708F7240036; Tue, 9 Apr 2024 19:02:39 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=walstatt-de.de; s=DKIM001; t=1712682159; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=7KscEF2ndqArzSsCHnYbY09ed6WIdz8MkOGY978+FGY=; b=TVwMKuJwfiCcrgKZb8hEpFwTFXPFlD9I0NFu8kWp4mlMBE8yxBnSltRP7oExxzMG+/zIGY NcPfRSHaTvz5snULwM1oRSPvIGhA4mEpqj7XcWogT2Gjm8AILQNJfxqVttZU7h7FOZ8h3y pwMeFBf2HVGH+H2+cBNhcWU0XRozoNLEQc6r6LF+tu4CV0ipC0ZkU97DyZRr10r+fRqQFF 11CKiG+C5F4hzOny0Bj+VPt3L1JAjEFp/eHWwEcurPsnLDz0ntnzED49t1nIgn4+OOV81Q Rd/AGC+atZ+ieLJN4wPZ2PeYeccXaGlKv1bFq0qhn9ivqXfUVENeIl5uWKBMoA== Received: from thor.intern.walstatt.dynvpn.de (dynamic-089-012-184-064.89.12.pool.telefonica.de [89.12.184.64]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange ECDHE (prime256v1) server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by hub1.goneo.de (Postfix) with ESMTPSA id 0084B240013; Tue, 9 Apr 2024 19:02:38 +0200 (CEST) Date: Tue, 9 Apr 2024 19:02:11 +0200 From: FreeBSD User To: Gleb Smirnoff Cc: current@freebsd.org, David Wolfskill Subject: Re: Panic after update main-n269202-4e7aa03b7076 -> n269230-f6f67f58c19d Message-ID: <20240409190238.32f2be63@thor.intern.walstatt.dynvpn.de> In-Reply-To: References: Organization: walstatt-de.de List-Id: 
Discussions about the use of FreeBSD-current List-Archive: https://lists.freebsd.org/archives/freebsd-current List-Help: List-Post: List-Subscribe: List-Unsubscribe: Sender: owner-freebsd-current@freebsd.org MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Rspamd-UID: 45db78 X-Rspamd-UID: 6e838f X-Spamd-Bar: ---- X-Rspamd-Pre-Result: action=no action; module=replies; Message is reply to one we originated X-Spamd-Result: default: False [-4.00 / 15.00]; REPLY(-4.00)[]; ASN(0.00)[asn:25394, ipnet:85.220.128.0/17, country:DE] X-Rspamd-Queue-Id: 4VDXMc3ygHz4dQc

On Tue, 9 Apr 2024 09:18:49 -0700
Gleb Smirnoff wrote:

> On Tue, Apr 09, 2024 at 04:47:07AM -0700, David Wolfskill wrote:
> D> --- trap 0xc, rip = 0xffffffff80b208c5, rsp = 0xfffffe048c204920, rbp = 0xfffffe048c204960 ---
> D> __mtx_lock_flags() at __mtx_lock_flags+0x45/frame 0xfffffe048c204960
> D> clnt_vc_create() at clnt_vc_create+0x4f4/frame 0xfffffe048c204ab0
> D> local_rpcb() at local_rpcb+0x11b/frame 0xfffffe048c204b50
> D> rpcb_unset() at rpcb_unset+0x24/frame 0xfffffe048c204bb0
> D> svc_tp_create() at svc_tp_create+0xee/frame 0xfffffe048c204c90
> D> sys_nlm_syscall() at sys_nlm_syscall+0x3d0/frame 0xfffffe048c204e00
> D> amd64_syscall() at amd64_syscall+0x158/frame 0xfffffe048c204f30
> D> fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe048c204f30
> D> --- syscall (154, FreeBSD ELF64, nlm_syscall), rip = 0x3f00a2dfd2a, rsp = 0x3f0096f7168, rbp = 0x3f0096f7230 ---
> D> KDB: enter: panic
> D> [ thread pid 1208 tid 101107 ]
> D> Stopped at kdb_enter+0x33: movq $0,0x104eb92(%rip)
> D> db>
>
> This should be fixed by the just-pushed commit e205fd318a296ffdb7392486cdcec7f660fcffcf.
>
> Sorry for that!
>

Hello all.

The crash is still present on the most recent checked out sources as of minutes ago.
I just checked out on HEAD the latest commits (see below, just for the record and to prevent
being wrong here).

[...]
commit 841cf52595b6a6b98e266b63e54a7cf6fb6ca73e (HEAD -> main, origin/main, origin/HEAD)
Author: Alan Cox
Date: Mon Apr 8 00:05:27 2024 -0500

    arm64 pmap: Add ATTR_CONTIGUOUS support [Part 2]

    Create ATTR_CONTIGUOUS mappings in pmap_enter_object(). As a result,
    when the base page size is 4 KB, the read-only data and text sections
    of large (2 MB+) executables, e.g., clang, can be mapped using 64 KB
    pages. Similarly, when the base page size is 16 KB, the read-only
    data section of large executables can be mapped using 2 MB pages.

    Rename pmap_enter_2mpage(). Given that we have grown support for 16 KB
    base pages, we should no longer include page sizes that may vary, e.g.,
    2mpage, in pmap function names.

    Requested by: andrew
    Co-authored-by: Eliot Solomon
    Differential Revision: https://reviews.freebsd.org/D44575

commit e205fd318a296ffdb7392486cdcec7f660fcffcf
Author: Gleb Smirnoff
Date: Tue Apr 9 09:16:52 2024 -0700

    rpc: use new macros to lock socket buffers

    Fixes: d80a97def9a1db6f07f5d2e68f7ad62b27918947

commit cb20a74ca06381e96c41cb4495d633710cc6cb79
Author: Stephen J. Kiernan
Date: Wed Apr 3 17:04:57 2024 -0400

-- O.
Hartmann From nobody Tue Apr 9 17:08:50 2024 X-Original-To: current@mlmmj.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mlmmj.nyi.freebsd.org (Postfix) with ESMTP id 4VDXVY0GV5z5HLCP for ; Tue, 9 Apr 2024 17:08:53 +0000 (UTC) (envelope-from glebius@freebsd.org) Received: from smtp.freebsd.org (smtp.freebsd.org [IPv6:2610:1c1:1:606c::24b:4]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256 client-signature RSA-PSS (4096 bits) client-digest SHA256) (Client CN "smtp.freebsd.org", Issuer "R3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 4VDXVX6qysz4g01; Tue, 9 Apr 2024 17:08:52 +0000 (UTC) (envelope-from glebius@freebsd.org) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=freebsd.org; s=dkim; t=1712682533; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=Cv8XCv8ka6ZhVki8p8OFPZUdrGJXJjGz0Ut9oIptGF8=; b=Rl+n7D+BE9dBBWv8idJrBKyPnYTaqanW0tEsw95ZDiZ8/ksFRbzM9gI3i83HWqAhopAFXr dp81w2/pVEq3q3+k6lJTlsezjkXdBZrVNAUL26+6iogAvs7IcEkz1GTC2lSTYctGjhGkDx Z1FoqrAeC7CMKP+DlIfmD2zDxUqaLSOdOHnNqQVyaxTiQr/Jla0GhVuSRBH6QmNOgp21sr NgLGM2IzHFpU2n/tAJyX/JHIn3I6j6iZCitVlhtnL7UmyIFwjwd+8sHp0T2rx66mpQLqK8 ovEi02YUSXxHDWqgwKQtaEfTw/vbHuNRRMVwMeCr1zpvGloQ3FPWGYWfvsxjIg== ARC-Seal: i=1; s=dkim; d=freebsd.org; t=1712682533; a=rsa-sha256; cv=none; b=u3jj1KdNw8C4UPdVNzD/bY4rQkSSPiGXIoW2CqUKOduwd1Z06O5jAgi0wqDX1YDB2MldKu +morW/iOtsuTIydc7PeB4Hth5Mr7idNA9A11v4SHeqI3nqecsOiNnDLFywovaKXgFBaefP X+c7O2Aar4I5S9XYNQeIjtIFX0+pvzl4n1NPtU5iqnpS4j7BMq7k2Zr4rjKwStE/peEMb/ Rm/IBAz2OvKKXTGVgBTc2Oz9Vun5KNLdp/v3eqx37LFfYeXP4Q3pb9J/GS0gORwdeW08aP MR9iOz2K0t9hziDh6NNztYPIZIV3yrdIUVaQ+q1O0aFg4E6NyZzYDan33cABNQ== ARC-Authentication-Results: i=1; mx1.freebsd.org; none ARC-Message-Signature: i=1; a=rsa-sha256; 
c=relaxed/relaxed; d=freebsd.org; s=dkim; t=1712682533; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=Cv8XCv8ka6ZhVki8p8OFPZUdrGJXJjGz0Ut9oIptGF8=; b=pOW4tsdBsDXZ0Jl+NqOxNZR17AK8L7ih04eJ8YR48jTbDooHff6E7jlmdsH7dbx1FRjThH 9nvldm5KMWnqzw+aBR10RMi/dh0Md74JJSROw3aa+O2M8wUoJS4MKMfOP3L/abiNtxfwJw 9NjmuBIvle07A6AXrca8Vhhrubj6haA5ZJwb+1+CLpdx5JhH43hThZmGzjvyXBcNRR8ROt y43qswggSvy4YRDrLWO2dVXLjk4fDaJ6cto5+P+9ecfLDBHwyJKddbjPDQpwJJyvzGz3nX h6CUY/GwxrebWyNLCmRwAp891lr+S00kRlD/bzNweUAxIW/IXP+kVIEYJwjxrg== Received: from cell.glebi.us (glebi.us [162.251.186.162]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (Client did not present a certificate) (Authenticated sender: glebius) by smtp.freebsd.org (Postfix) with ESMTPSA id 4VDXVX3QmFz1DCT; Tue, 9 Apr 2024 17:08:52 +0000 (UTC) (envelope-from glebius@freebsd.org) Date: Tue, 9 Apr 2024 10:08:50 -0700 From: Gleb Smirnoff To: FreeBSD User Cc: current@freebsd.org, David Wolfskill Subject: Re: Panic after update main-n269202-4e7aa03b7076 -> n269230-f6f67f58c19d Message-ID: References: <20240409190238.32f2be63@thor.intern.walstatt.dynvpn.de> List-Id: Discussions about the use of FreeBSD-current List-Archive: https://lists.freebsd.org/archives/freebsd-current List-Help: List-Post: List-Subscribe: List-Unsubscribe: Sender: owner-freebsd-current@freebsd.org MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20240409190238.32f2be63@thor.intern.walstatt.dynvpn.de> On Tue, Apr 09, 2024 at 07:02:11PM +0200, FreeBSD User wrote: F> The crash is still present on the most recent checked out sources as of minutes ago. F> I just checked out on HEAD the latest commits (see below, just for the record and to prevent F> being wrong here). F> F> [...] 
F> commit 841cf52595b6a6b98e266b63e54a7cf6fb6ca73e (HEAD -> main, origin/main, origin/HEAD)

Is the crash the same or different? Can you please share the backtrace?

-- Gleb Smirnoff From nobody Tue Apr 9 17:59:08 2024 X-Original-To: current@mlmmj.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mlmmj.nyi.freebsd.org (Postfix) with ESMTP id 4VDZdF6s4Nz5Fkws for ; Tue, 9 Apr 2024 18:44:49 +0000 (UTC) (envelope-from cy.schubert@cschubert.com) Received: from omta001.cacentral1.a.cloudfilter.net (omta001.cacentral1.a.cloudfilter.net [3.97.99.32]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "Client", Issuer "CA" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id 4VDZdF2W5Pz4qTX; Tue, 9 Apr 2024 18:44:49 +0000 (UTC) (envelope-from cy.schubert@cschubert.com) Authentication-Results: mx1.freebsd.org; dkim=none; dmarc=none; spf=none (mx1.freebsd.org: domain of cy.schubert@cschubert.com has no SPF policy when checking 3.97.99.32) smtp.mailfrom=cy.schubert@cschubert.com Received: from shw-obgw-4001a.ext.cloudfilter.net ([10.228.9.142]) by cmsmtp with ESMTPS id u7KtrdHdf2Ui5uGLlrznn9; Tue, 09 Apr 2024 18:37:41 +0000 Received: from spqr.komquats.com ([70.66.152.170]) by cmsmtp with ESMTPSA id uGLjrvqiHpsbguGLkrTNRd; Tue, 09 Apr 2024 18:37:41 +0000 Message-ID: uGLjrvqiHpsbguGLkrTNRd@shw-obgw-4001a.ext.cloudfilter.net X-Authority-Analysis: v=2.4 cv=Ff+Ux4+6 c=1 sm=1 tr=0 ts=66158af5 a=y8EK/9tc/U6QY+pUhnbtgQ==:117 a=y8EK/9tc/U6QY+pUhnbtgQ==:17 a=HpEJnUlJZJkA:10 a=kj9zAlcOel0A:10 a=raytVjVEu-sA:10 a=YxBL1-UpAAAA:8 a=6I5d2MoRAAAA:8 a=EkcXrb_YAAAA:8 a=deZXG2v8ohieStxLYE8A:9 a=tGNgDIr1aq9wFFCi:21 a=CjuIK1q_8ugA:10 a=Njy1M2FgLQYA:10 a=Ia-lj3WSrqcvXOmTRaiG:22 a=IjZwj45LgO3ly-622nXo:22 a=LK5xJRSDVpKd5WXXoEvA:22 Received: from slippy.cwsent.com (slippy [10.1.1.91]) by spqr.komquats.com (Postfix) with ESMTP id F26A6CD; Tue, 09 Apr 2024 11:37:38 -0700 (PDT) Received: by slippy.cwsent.com (Postfix, from
userid 1000) id C1793B7; Tue, 09 Apr 2024 11:37:38 -0700 (PDT) X-Mailer: exmh version 2.9.0 11/07/2018 with nmh-1.8+dev Reply-to: Cy Schubert From: Cy Schubert X-os: FreeBSD X-Sender: cy@cwsent.com X-URL: http://www.cschubert.com/ To: Gleb Smirnoff , FreeBSD User , current@freebsd.org, David Wolfskill Subject: Re: Panic after update main-n269202-4e7aa03b7076 -> n269230-f6f67f58c19d Comments: In-reply-to Cy Schubert message dated "Tue, 09 Apr 2024 10:17:10 -0700." List-Id: Discussions about the use of FreeBSD-current List-Archive: https://lists.freebsd.org/archives/freebsd-current List-Help: List-Post: List-Subscribe: List-Unsubscribe: Sender: owner-freebsd-current@freebsd.org Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Date: Tue, 09 Apr 2024 10:59:08 -0700 Resent-To: Gleb Smirnoff , FreeBSD User , current@freebsd.org, David Wolfskill Resent-From: "Cy Schubert (cy)" Resent-Date: Tue, 09 Apr 2024 11:37:38 -0700 Resent-Message-Id: <20240409183738.C1793B7@slippy.cwsent.com> X-CMAE-Envelope: MS4xfPW3f0SG1kcd3gSgOIWBopP+gEmZApJgiQTdtheqYbDBhlDarA82UCBZrDwz2dnq27WpuR6wq0aR2hvC23/azJhK3MqxbujEvMJM6mczigFW0ewNJuvj czFq/hPPx2uyzL0fBCbdHpBTUyiDddthmqHq2XZnQjArqtUvkh4CtAeIXkn3f1TC3FsQ+9LWwNbGuxuyfMRT0R6n5RMbZS0vHKv3yg+hgKeC7sD323AQEP+G NSnGR1ZF8ApxKujDyIaopd6eOLeM4DmGESJ0mU9NBP4N00m2YyWicmXw739/MUuJ X-Spamd-Bar: + X-Spamd-Result: default: False [1.83 / 15.00]; THREAD_HIJACKING_FROM_INJECTOR(2.00)[]; FAKE_REPLY(1.00)[]; AUTH_NA(1.00)[]; NEURAL_HAM_LONG(-1.00)[-1.000]; NEURAL_HAM_MEDIUM(-1.00)[-1.000]; NEURAL_HAM_SHORT(-0.97)[-0.974]; MID_MISSING_BRACKETS(0.50)[]; MV_CASE(0.50)[]; MIME_GOOD(-0.10)[text/plain]; RCVD_IN_DNSWL_LOW(-0.10)[3.97.99.32:from]; RCVD_TLS_LAST(0.00)[]; TO_DN_SOME(0.00)[]; ARC_NA(0.00)[]; DMARC_NA(0.00)[cschubert.com]; RCVD_COUNT_THREE(0.00)[4]; MIME_TRACE(0.00)[0:+]; GREYLIST(0.00)[pass,body]; RCPT_COUNT_THREE(0.00)[4]; ASN(0.00)[asn:16509, ipnet:3.96.0.0/15, country:US]; RCVD_VIA_SMTP_AUTH(0.00)[]; FROM_EQ_ENVFROM(0.00)[]; 
FROM_HAS_DN(0.00)[]; HAS_REPLYTO(0.00)[Cy.Schubert@cschubert.com]; R_SPF_NA(0.00)[no SPF record]; MLMMJ_DEST(0.00)[current@freebsd.org]; TO_MATCH_ENVRCPT_SOME(0.00)[]; R_DKIM_NA(0.00)[]; REPLYTO_EQ_FROM(0.00)[] X-Rspamd-Queue-Id: 4VDZdF2W5Pz4qTX

Cy Schubert writes:
> In message , Gleb Smirnoff writes:
> > On Tue, Apr 09, 2024 at 07:02:11PM +0200, FreeBSD User wrote:
> > F> The crash is still present on the most recent checked out sources as of minutes ago.
> > F> I just checked out on HEAD the latest commits (see below, just for the record and to prevent
> > F> being wrong here).
> > F>
> > F> [...]
> > F> commit 841cf52595b6a6b98e266b63e54a7cf6fb6ca73e (HEAD -> main, origin/main, origin/HEAD)
> >
> > Is the crash the same or different? Can you please share the backtrace?
>
> The new panic is:
>
> Fatal trap 12: page fault while in kernel mode
> cpuid = 3; apic id = 03
> fault virtual address = 0x28
> fault code = supervisor read data, page not present
> instruction pointer = 0x20:0xffffffff80729d8d
> stack pointer = 0x28:0xfffffe00b59c0a70
> frame pointer = 0x28:0xfffffe00b59c0aa0
> code segment = base 0x0, limit 0xfffff, type 0x1b
> = DPL 0, pres 1, long 1, def32 0, gran 1
> processor eflags = interrupt enabled, resume, IOPL = 0
> current process = 2697 (rpcbind)
> rdi: fffff80004fcd720 rsi: 0000000000000000 rdx: fffffe00b59c0b68
> rcx: 0000000000000000 r8: 0000000000000001 r9: 000000003b9ac9e0
> rax: 000000003b9aca00 rbx: fffffe00b59c0b68 rbp: fffffe00b59c0aa0
> r10: 0000000000000020 r11: 00000000ffffffff r12: 0000000000000000
> r13: 0000000000000020 r14: 0000000000000020 r15: fffff80004fcd720
> trap number = 12
> panic: page fault
> cpuid = 3
> time = 1712682162
> KDB: stack backtrace:
> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe00b59c0760
> vpanic() at vpanic+0x135/frame 0xfffffe00b59c0890
> panic() at panic+0x43/frame 0xfffffe00b59c08f0
> trap_fatal() at trap_fatal+0x40b/frame 0xfffffe00b59c0950
> trap_pfault() at trap_pfault+0x46/frame 0xfffffe00b59c09a0
> calltrap() at calltrap+0x8/frame 0xfffffe00b59c09a0
> --- trap 0xc, rip = 0xffffffff80729d8d, rsp = 0xfffffe00b59c0a70, rbp = 0xfffffe00b59c0aa0 ---
> uiomove_faultflag() at uiomove_faultflag+0x9d/frame 0xfffffe00b59c0aa0
> uipc_soreceive_stream_or_seqpacket() at uipc_soreceive_stream_or_seqpacket+0x38c/frame 0xfffffe00b59c0b30
> soreceive() at soreceive+0x2f/frame 0xfffffe00b59c0b50
> clnt_vc_soupcall() at clnt_vc_soupcall+0x139/frame 0xfffffe00b59c0c00
> sorwakeup_locked() at sorwakeup_locked+0x98/frame 0xfffffe00b59c0c20
> uipc_sosend_stream_or_seqpacket() at uipc_sosend_stream_or_seqpacket+0x58e/frame 0xfffffe00b59c0ce0
> sousrsend() at sousrsend+0x5f/frame 0xfffffe00b59c0d40
> dofilewrite() at dofilewrite+0x7f/frame 0xfffffe00b59c0d90
> sys_write() at sys_write+0xb3/frame 0xfffffe00b59c0e00
> amd64_syscall() at amd64_syscall+0x115/frame 0xfffffe00b59c0f30
> fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe00b59c0f30
> --- syscall (4, FreeBSD ELF64, write), rip = 0x1d82f79281a, rsp = 0x1d82c63be78, rbp = 0x1d82c63bee0 ---
> Uptime: 39s
> Dumping 515 out of 7969 MB:..4%..13%..22%..32%..41%..53%..63%..72%..81%..91%
>
> (kgdb) bt
> #0 __curthread () at /opt/src/git-src/sys/amd64/include/pcpu_aux.h:57
> #1 doadump (textdump=textdump@entry=1) at /opt/src/git-src/sys/kern/kern_shutdown.c:404
> #2 0xffffffff806bd7d9 in kern_reboot (howto=260) at /opt/src/git-src/sys/kern/kern_shutdown.c:524
> #3 0xffffffff806bdcf2 in vpanic (fmt=0xffffffff80ae0f0d "%s", ap=ap@entry=0xfffffe00b59c08d0) at /opt/src/git-src/sys/kern/kern_shutdown.c:976
> #4 0xffffffff806bdb43 in panic (fmt=) at /opt/src/git-src/sys/kern/kern_shutdown.c:892
> #5 0xffffffff80a597fb in trap_fatal (frame=0xfffffe00b59c09b0, eva=40) at /opt/src/git-src/sys/amd64/amd64/trap.c:950
> #6 0xffffffff80a59846 in trap_pfault (frame=, usermode=false, signo=, ucode=) at /opt/src/git-src/sys/amd64/amd64/trap.c:758
> #7
> #8 uiomove_faultflag (cp=0xfffff80004fcd720, n=32, uio=uio@entry=0xfffffe00b59c0b68, nofault=nofault@entry=0) at /opt/src/git-src/sys/kern/subr_uio.c:240
> #9 0xffffffff80729ce9 in uiomove (cp=0xfffff80004fcd720, n=0, uio=uio@entry=0xfffffe00b59c0b68) at /opt/src/git-src/sys/kern/subr_uio.c:193
> #10 0xffffffff80774f1c in uipc_soreceive_stream_or_seqpacket (so=0xfffff800361f4000, psa=, uio=0xfffffe00b59c0b68, mp0=, controlp=0xfffffe00b59c0bc0, flagsp=0xfffffe00b59c0ba8) at /opt/src/git-src/sys/kern/uipc_usrreq.c:1420
> #11 0xffffffff8076d4ff in soreceive (so=0xfffff80004fcd720, so@entry=0xfffff800361f4000, psa=psa@entry=0x0, uio=uio@entry=0xfffffe00b59c0b68, mp0=0x0, mp0@entry=0xfffffe00b59c0bb8, controlp=0x1, controlp@entry=0xfffffe00b59c0bc0, flagsp=0x3b9ac9e0, flagsp@entry=0xfffffe00b59c0ba8) at /opt/src/git-src/sys/kern/uipc_socket.c:2965
> #12 0xffffffff80917719 in clnt_vc_soupcall (so=0xfffff800361f4000, arg=0xfffff80036191c00, waitflag=) at /opt/src/git-src/sys/rpc/clnt_vc.c:991
> #13 0xffffffff80765338 in sowakeup (so=0xfffff800361f4000, which=SO_RCV) at /opt/src/git-src/sys/kern/uipc_sockbuf.c:493
> #14 sorwakeup_locked (so=so@entry=0xfffff800361f4000) at /opt/src/git-src/sys/kern/uipc_sockbuf.c:526
> #15 0xffffffff807758ae in uipc_sosend_stream_or_seqpacket (so=0xfffff800361e4b40, addr=, uio=0xfffffe00b59c0da8, m=, c=, flags=, td=0xfffff8001e73e000) at /opt/src/git-src/sys/kern/uipc_usrreq.c:1154
> #16 0xffffffff8076b2cf in sousrsend (so=0xfffff80004fcd720, addr=0x0, uio=0xfffffe00b59c0b68, control=0x1, flags=0, userproc=0x0) at /opt/src/git-src/sys/kern/uipc_socket.c:1941
> #17 0xffffffff8073106f in fo_write (fp=0xfffff800092800a0, uio=0xfffffe00b59c0da8, active_cred=0xfffffe00b59c0b68, td=0xfffff8001e73e000, flags=) at /opt/src/git-src/sys/sys/file.h:352
> #18 dofilewrite (td=td@entry=0xfffff8001e73e000, fd=fd@entry=14, fp=0xfffff800092800a0, auio=auio@entry=0xfffffe00b59c0da8, offset=offset@entry=-1, flags=flags@entry=0) at /opt/src/git-src/sys/kern/sys_generic.c:562
> #19 0xffffffff80730c23 in kern_writev (td=0xfffff8001e73e000, fd=14, auio=0xfffffe00b59c0da8) at /opt/src/git-src/sys/kern/sys_generic.c:489
> #20 sys_write (td=0xfffff8001e73e000, uap=) at /opt/src/git-src/sys/kern/sys_generic.c:404
> #21 0xffffffff80a5a0b5 in syscallenter (td=0xfffff8001e73e000) at /opt/src/git-src/sys/amd64/amd64/../../kern/subr_syscall.c:189
> #22 amd64_syscall (td=0xfffff8001e73e000, traced=0) at /opt/src/git-src/sys/amd64/amd64/trap.c:1192
> #23
> #24 0x000001d82f79281a in ?? ()
> Backtrace stopped: Cannot access memory at address 0x1d82c63be78
> (kgdb) frame 8
> #8 uiomove_faultflag (cp=0xfffff80004fcd720, n=32, uio=uio@entry=0xfffffe00b59c0b68, nofault=nofault@entry=0) at /opt/src/git-src/sys/kern/subr_uio.c:240
> 240 cnt = iov->iov_len;
> (kgdb) p *iov
> Cannot access memory at address 0x20
> (kgdb) l
> 235         while (n > 0 && uio->uio_resid) {
> 236                 KASSERT(uio->uio_iovcnt > 0,
> 237                     ("%s: uio %p iovcnt underflow", __func__, uio));
> 238
> 239                 iov = uio->uio_iov;
> 240                 cnt = iov->iov_len;
> 241                 if (cnt == 0) {
> 242                         uio->uio_iov++;
> 243                         uio->uio_iovcnt--;
> 244                         continue;
> (kgdb) p *uio
> $1 = {uio_iov = 0x20, uio_iovcnt = 0, uio_offset = 0, uio_resid = 1000000000, uio_segflg = (unknown: 0x80696078), uio_rw = (UIO_WRITE | unknown: 0xfffffffe), uio_td = 0xfffff8001e73e000}
> (kgdb)

uio_iov contains 0x20 at frame 12. Is it because the send buffer is now bypassed, leaving uio_iov uninitialized?

-- Cheers, Cy Schubert FreeBSD UNIX: Web: https://FreeBSD.org NTP: Web: https://nwtime.org e^(i*pi)+1=0 From nobody Tue Apr 9 20:02:45 2024 X-Original-To: freebsd-current@mlmmj.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mlmmj.nyi.freebsd.org (Postfix) with ESMTP id 4VDcMC4zG7z5Fswf; Tue, 9 Apr 2024 20:02:47 +0000 (UTC) (envelope-from bapt@FreeBSD.org) Received: from smtp.freebsd.org (smtp.freebsd.org [IPv6:2610:1c1:1:606c::24b:4]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256 client-signature RSA-PSS (4096 bits) client-digest SHA256) (Client CN "smtp.freebsd.org", Issuer "R3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 4VDcMC4DV9z4xM2; Tue, 9 Apr 2024 20:02:47 +0000 (UTC) (envelope-from bapt@FreeBSD.org) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=freebsd.org; s=dkim; t=1712692967; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=tBmOfO+iBeV5Elw727weh9o1tq1pa3QpZ0QY0hgb8yA=; b=gcSDpkSwu+X+gC8nit2wcZmh8O9ZhYIUDq9r9k1BT0JzzdGXQYrws4OKg9oIVVnsFayeRi hWDp9EX+KfTbwXqqvEmK/VJb7VAm4WP1Z0BLoVglJ9TEUaz/Xn5JjmnHMvU+x7gg92aAxC wbvCtcx3IlPWVv8Yv9R5YqceCg9MOJtUlzrv1+AiqRoHrCg6m0N9u9FFl/DgZAe+1LGV8I epRq+FxK0XQPCm+h0zTp4ox75m6C2jw0+gM5MpOSzJxEGA3pSOb3T2nFD+mOO1iv/A3gfW 4tmDIYryOumMjWdutdJCqM0yP/Iusb0jwYQ9NCYRUzNQQWOypXpCyMyLJ/wldg== ARC-Seal: i=1; s=dkim; d=freebsd.org; t=1712692967; a=rsa-sha256; cv=none; b=rIFsRThgI20GZ1rva3paCDfbzzjM/ZI4nCfVX84NrGFPy0XAPMVS3kbw7+4m0fdtV/zmXP 1nc8wEoffqW79Z3YWlpIiHEzN2qnszpXGpwMmKE05SOxSf1kiE2yBQbvoG5UFZmXqzCKYP fpIrxnnCKQrbDH4WKEeD+iDul6/RBeS5oSbDGoIIT0YcGfDbLkmZW5//qMiwnBViPRu+tn JvE0MT/RVd29MoqZU2rbIkry3vtihPuxccZEUtnZYgurFM/OPs8MZCEZjIyMI6o6Rf416v 
sCIlPfX50aF/gzW4PvKj9/WsRN5LCZvHAhfI8zqLDAC8WV3tG4iu8kcplFfTZA== ARC-Authentication-Results: i=1; mx1.freebsd.org; none ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=freebsd.org; s=dkim; t=1712692967; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=tBmOfO+iBeV5Elw727weh9o1tq1pa3QpZ0QY0hgb8yA=; b=mWL39v06Ypx0+yNvscfnUFeFvH1aiFUI9fpetx1U1SqFkhyhdmD8CQDvEN0Ed5S+6AwlQt FktqVGvFO1IoqNW6uhCo2KZ1jWJrkCe/NVlg6CbO3FfWHV751RjcJMTcFd4YTiq1j8lGvb HPAGJyLCtOP0w7+Cbq2ezd4m5Ln4tMccRys5wZdQL3IzriB8NPXUOx3UR5lhMfEctwiBql 3NBANeMWwj0S0jlQdT/nNWhW8Xs4vzU5SiR7pWO3JGmaM4khqYe/V33Ec9FxTevP3omcjJ fcj0ymYHnxNvUJB8I+4qQ/xYVHkPVrM2iR+EQATB5VtxPnV4BgrpFwZ2+nAwfg== Received: from aniel.nours.eu (nours.eu [IPv6:2001:41d0:8:3a4d::1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (Client did not present a certificate) (Authenticated sender: bapt) by smtp.freebsd.org (Postfix) with ESMTPSA id 4VDcMC3GpRz1Gmm; Tue, 9 Apr 2024 20:02:47 +0000 (UTC) (envelope-from bapt@FreeBSD.org) Received: from [IPv6:::1] (unknown [IPv6:2a01:e0a:39e:77a0:681:381f:ee72:de4f]) (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by aniel.nours.eu (Postfix) with ESMTPSA id 5775513DD1B; Tue, 9 Apr 2024 22:02:44 +0200 (CEST) Date: Tue, 09 Apr 2024 22:02:45 +0200 From: Baptiste Daroussin To: freebsd-current@freebsd.org, FreeBSD User , Rainer Hurling CC: Rainer Hurling , FreeBSD Ports , FreeBSD CURRENT Subject: Re: pkg-1.21.0: after upgrade 1.20.9_1 -> 1.21.0: pkg core dumps on specific ports User-Agent: K-9 Mail for Android In-Reply-To:
<20240409180639.55211744@thor.intern.walstatt.dynvpn.de> References: <20240406090527.34d84eb9@thor.intern.walstatt.dynvpn.de> <4dsxkdn2uacxuxnvradf47yvhxx6z6pzbqezpfzn4qvbludhld@ome2p54xt2et> <20240409180639.55211744@thor.intern.walstatt.dynvpn.de> Message-ID: <0E590DE9-6020-41C3-9DAE-7B3D74D99F07@FreeBSD.org> List-Id: Discussions about the use of FreeBSD-current List-Archive: https://lists.freebsd.org/archives/freebsd-current List-Help: List-Post: List-Subscribe: List-Unsubscribe: Sender: owner-freebsd-current@freebsd.org MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable

On 9 April 2024 18:06:12 GMT+02:00, FreeBSD User wrote:
>On Tue, 9 Apr 2024 17:10:52 +0200
>Rainer Hurling wrote:
>
>> On 09.04.24 at 09:20, Baptiste Daroussin wrote:
>> > On Sat 06 Apr 09:23, Rainer Hurling wrote:
>> >> On 06.04.24 at 09:05, FreeBSD User wrote:
>> >>> Hello,
>> >>>
>> >>> after updating (portmaster and make) ports-mgmt/ports from 1.20.9_1 -> 1.21.0 on CURRENT
>> >>> and 14-STABLE, I can't update several ports:
>> >>>
>> >>> www/apache24
>> >>> databases/redis
>> >>>
>> >>> pkg core dumps while performing installation. apache24 and redis are ports I realized
>> >>> this misbehaviour on ALL 14-STABLE and CURRENT boxes (both OS variants latest builds,
>> >>> i.e. FreeBSD 15.0-CURRENT #32 main-n269135-da2b732288c7: Fri Apr 5 20:30:39 CEST 2024
>> >>> amd64).
>> >>>
>> >>> After some updates on a poudriere builder (CURRENT base host, 14.0-RELENG jail with
>> >>> poudriere) building packages for 14.0-RELENG, I observed the same behaviour when
>> >>> updating packages on target hosts where pkg is first updated, on those hosts,
>> >>> nextcloud-server and icinga2 host utilizing also databases/redis and www/apache24, pkg
>> >>> fails the same way.
>> >>>
>> >>> I do not dare to update our poudriere hosts since the problem seems to pop up when pkg
>> >>> 1.21.0 is installed, no matter whether I use poudriere built ports (from our own builder
>> >>> hosts) or recent source tree with portmaster/make build process.
>> >>>
>> >>> Looks like a serious bug to me and not a site/user specific problem. Hopefully others do
>> >>> realize the same ...
>> >>>
>> >>> Thanks in advance,
>> >>>
>> >>> oh
>> >>
>> >>
>> >> Hmm, I just tried to reproduce that. Both ports mentioned, databases/redis
>> >> and www/apache24, can be built and installed with Portmaster. The box is a
>> >> 15.0-CURRENT with pkg-1.21.0.
>> >>
>> >> Maybe 'pkg check -Bn' or 'portmaster --check-depends --check-port-dbdir'
>> >> show some inconsistencies?
>> >>
>> >> Best wishes,
>> >> Rainer
>> >>
>> >>
>> > using portmaster or not are strictly unlikely to be helpful here.
>> >
>> > The right way to test is to report running with pkg -dddd and also to recommend
>> > testing with default options in pkg.conf.
>> >
>> > Best regards,
>> > Bapt
>>
>> This is correct and certainly better. I was not aware of this.
>>
>> Fortunately, my less optimal suggestions helped O. Hartmann in this case
>> to find the missing and outdated dependencies.
>>
>> In any case, many thanks for this helpful advice.
>>
>> Regards,
>> Rainer
>>
>>
>
>Hello,
>
>@Baptiste: it should be pkg -d, shouldn't it? Or am I missing something again here?

Each d will provide a more verbose level of debug Bapt From nobody Wed Apr 10 11:22:11 2024 X-Original-To: current@mlmmj.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mlmmj.nyi.freebsd.org (Postfix) with ESMTP id 4VF0mC2cfGz5GwYD for ; Wed, 10 Apr 2024 11:22:19 +0000 (UTC) (envelope-from david@catwhisker.org) Received: from mx.catwhisker.org (mx.catwhisker.org [107.204.234.170]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 4VF0mB6zpGz4GbM; Wed, 10 Apr 2024 11:22:18 +0000 (UTC) (envelope-from david@catwhisker.org) Authentication-Results: mx1.freebsd.org; none Received: from albert.catwhisker.org (localhost [127.0.0.1]) by albert.catwhisker.org (8.18.1/8.18.1) with ESMTP id 43ABMBrv015243; Wed, 10 Apr 2024 11:22:11 GMT (envelope-from david@albert.catwhisker.org) Received: (from david@localhost) by albert.catwhisker.org (8.18.1/8.18.1/Submit) id 43ABMBvu015242; Wed, 10 Apr 2024 04:22:11 -0700 (PDT) (envelope-from david) Date: Wed, 10 Apr 2024 04:22:11 -0700 From: David Wolfskill To: Gleb Smirnoff Cc: FreeBSD User , current@freebsd.org Subject: Resolved: Panic after update main-n269202-4e7aa03b7076 -> n269230-f6f67f58c19d Message-ID: Mail-Followup-To: David Wolfskill , Gleb Smirnoff , FreeBSD User , current@freebsd.org References: <20240409190238.32f2be63@thor.intern.walstatt.dynvpn.de> List-Id: Discussions about the use of FreeBSD-current List-Archive: https://lists.freebsd.org/archives/freebsd-current List-Help: List-Post: List-Subscribe: List-Unsubscribe: Sender: owner-freebsd-current@freebsd.org MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha512; protocol="application/pgp-signature"; boundary="1Ng5XyilkHwvhfQ2" Content-Disposition: inline In-Reply-To: X-Spamd-Bar: ---- X-Rspamd-Pre-Result: action=no action; 
module=replies; Message is reply to one we originated X-Spamd-Result: default: False [-4.00 / 15.00]; REPLY(-4.00)[]; ASN(0.00)[asn:7018, ipnet:107.192.0.0/12, country:US] X-Rspamd-Queue-Id: 4VF0mB6zpGz4GbM --1Ng5XyilkHwvhfQ2 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable After the update to main-n269261-1e6db7be6921, head built & booted OK. FreeBSD 15.0-CURRENT #45 main-n269261-1e6db7be6921: Wed Apr 10 11:11:50 UTC= 2024 root@freebeast.catwhisker.org:/common/S4/obj/usr/src/amd64.amd64/= sys/GENERIC amd64 1500018 1500018 Peace, david --=20 David H. Wolfskill david@catwhisker.org Alexey Navalny was a courageous man; Putin has made him a martyr. See https://www.catwhisker.org/~david/publickey.gpg for my public key. --1Ng5XyilkHwvhfQ2 Content-Type: application/pgp-signature; name="signature.asc" -----BEGIN PGP SIGNATURE----- iNUEARYKAH0WIQSTLzOSbomIK53fjFliipiWhXYx5QUCZhZ2Y18UgAAAAAAuAChp c3N1ZXItZnByQG5vdGF0aW9ucy5vcGVucGdwLmZpZnRoaG9yc2VtYW4ubmV0OTMy RjMzOTI2RTg5ODgyQjlEREY4QzU5NjI4QTk4OTY4NTc2MzFFNQAKCRBiipiWhXYx 5eetAP9zZWNaRVn/Mn+0/faRpK/apsrus3hTxi9aRMOJAO+mCQD8CSehRW0whDBO efauLDfnHaUJwjYiWkox5Bx62r/0Pg4= =/3G2 -----END PGP SIGNATURE----- --1Ng5XyilkHwvhfQ2-- From nobody Wed Apr 10 11:40:16 2024 X-Original-To: current@mlmmj.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mlmmj.nyi.freebsd.org (Postfix) with ESMTP id 4VF1986KPBz5GxMl; Wed, 10 Apr 2024 11:40:28 +0000 (UTC) (envelope-from eduardo@freebsd.org) Received: from smtp.freebsd.org (smtp.freebsd.org [96.47.72.83]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256 client-signature RSA-PSS (4096 bits) client-digest SHA256) (Client CN "smtp.freebsd.org", Issuer "R3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 4VF1985JrBz4JTh; Wed, 10 Apr 2024 11:40:28 +0000 (UTC) (envelope-from 
eduardo@freebsd.org) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=freebsd.org; s=dkim; t=1712749228; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=l8vh42v6fmv7XhH8SXV5MwwLBBKllEko3xorlPVnaYw=; b=MDOBjzYfktSQk+MMYo2jokuffnXVLtbqgRQcES8s45ZMOcwvzRpZavXfBunXZXOOQrtbZL 4Hc95bCv9vSP5T4T12Ui4wG1GQfPYaWsjkxweFqYyDOj1PcnEgcS6582b8v7jAgzUHqtZN fixgZLSBkBRpjIE+zZ+fuWvpH8JwyS1v1SrP8xRAc+jB4W4g474TViRROzC6A8GFfUuvmZ a7gaCUl0kKqU8c3wRkXVpvXTmYsEWYMl/OKMiiZMDVlkjeP9pbxFrCpLVFBxX7KfbtMvFy h+wBobI1K7ImLdU7pIwAxDrkUMc60awUtQHhoKo8oBI5I8Vec/V+ShnV+hN8sA== ARC-Seal: i=1; s=dkim; d=freebsd.org; t=1712749228; a=rsa-sha256; cv=none; b=h4t2wQB586gYwqpt9duoGHK8t7x1neb6IewtBZUwOhMaI/yOOnAHmIlsGjWinFt6nM74V0 dMbJgvaHPhJa2UVUneUG9Zq3eh5wD0LL7ZVSvhcLiDb1U2kswtOXCh3magOkMf19feL6vr wQ8kI/QUpbmOTCIWCj1kwVa6Um4+UNuWnHQluSJf6ZvAxf994I6JMam/kQwEnXfVJiGITI qH0+FhtonCweyfXO13EbtJa5C/xSdFU6hMTxt2v6PN1NU4iYzZoj3hBmnICY70whHhgRmD UUrA2uiBjQJ8L6yIBhZCKNXub23gBw9tkLRmEp5E+HIW7PpT6tVB7HqWnhOhMg== ARC-Authentication-Results: i=1; mx1.freebsd.org; none ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=freebsd.org; s=dkim; t=1712749228; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=l8vh42v6fmv7XhH8SXV5MwwLBBKllEko3xorlPVnaYw=; b=VVz9aRM8f3ZHdVbFR/l8ETNeZdhJ4SeDNaAbINRYIMm+A4nbWqOhxgjDj7S+9NYM2Ai/m2 FmivSJfLOkg31ypgw4dd2ct2VkkHajPl+wuzvZWa93t+qgnAxkvblDfeGt5oyUNcMWk9c6 bi60GJyX+DVXKoPPlAoB0zMlbX78BU7ut6XH3jdUhm1OhZNvGsK7TyEZgmW9BVh1Zo9eqx CddWeGNvLPKSfMub+K44qabYCir9j5J/3aphg8lmebct2M4nuNu/5886wGfQuLkEZHUl2K 5KrifwBC4QVtdN/Eq8uWGarlcyhkJAcjNwNEN8IN72e98RbfNDDNb6sMiAodnQ== Received: from mail-yb1-f181.google.com (mail-yb1-f181.google.com [209.85.219.181]) (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 
bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256 client-signature RSA-PSS (2048 bits) client-digest SHA256) (Client CN "smtp.gmail.com", Issuer "GTS CA 1D4" (verified OK)) (Authenticated sender: eduardo) by smtp.freebsd.org (Postfix) with ESMTPSA id 4VF1984kkTzLQn; Wed, 10 Apr 2024 11:40:28 +0000 (UTC) (envelope-from eduardo@freebsd.org) Received: by mail-yb1-f181.google.com with SMTP id 3f1490d57ef6-dc236729a2bso6308233276.0; Wed, 10 Apr 2024 04:40:28 -0700 (PDT) X-Forwarded-Encrypted: i=1; AJvYcCUpEzH1bSXatb9OyFDiw2rmdOiCyQKZ3dap3T9svyfGL/ivBCLS183bJQZJfrnqdM+3CGyZd6TpMhWOavbM1mcWCoL4ftWmMTJYmTwhEqq/td+9dviUhQdDP7ii/noZMbGMBJ2z8SmcV8ELkkt1NBe/qrf0tzkn X-Gm-Message-State: AOJu0Yw8nglrE/rdqV/6XvLzStCInUG+FqpwODverstCQRlIESB7tRF7 5xZo1upotbQ8ueW8PdTbLa11djf3U7A0zAHI0+EvV4Hi0rjkaXMsyaMjsT+W42MlCBGvhObyARV 8QZSv09uzuvcCNxYEgFlD+t2gyZk= X-Google-Smtp-Source: AGHT+IE9OgPBrk3nu4n44r3WDLgcsQdARQYYW6dMCmekc+eLpEAQo51FYbRxDz80TYeEtiOgK5hh+52bpEh64hqiEpc= X-Received: by 2002:a25:d842:0:b0:dcf:bf94:bc30 with SMTP id p63-20020a25d842000000b00dcfbf94bc30mr2503007ybg.34.1712749227563; Wed, 10 Apr 2024 04:40:27 -0700 (PDT) List-Id: Discussions about the use of FreeBSD-current List-Archive: https://lists.freebsd.org/archives/freebsd-current List-Help: List-Post: List-Subscribe: List-Unsubscribe: Sender: owner-freebsd-current@freebsd.org MIME-Version: 1.0 References: <6e795e9c-8de4-4e02-9a96-8fabfaa4e66f@app.fastmail.com> <6047C8EF-B1B0-4286-93FA-AA38F8A18656@karels.net> <8031cd99-ded8-4b06-93b3-11cc729a8b2c@app.fastmail.com> <38c54399-6c96-44d8-a3a2-3cc1bfbe50c2@app.fastmail.com> <27d8144f-0658-46f6-b8f3-35eb60061644@lakerest.net> <5C9863F7-0F1C-4D02-9F6D-9DDC5FBEB368@freebsd.org> In-Reply-To: <5C9863F7-0F1C-4D02-9F6D-9DDC5FBEB368@freebsd.org> From: Nuno Teixeira Date: Wed, 10 Apr 2024 12:40:16 +0100 X-Gmail-Original-Message-ID: Message-ID: Subject: Re: Request for Testing: TCP RACK To: tuexen@freebsd.org Cc: Drew Gallatin , Konstantin Belousov , 
rrs , Mike Karels , garyj@gmx.de, current@freebsd.org, net@freebsd.org, Randall Stewart Content-Type: multipart/alternative; boundary="0000000000004c6d020615bc81cf" --0000000000004c6d020615bc81cf Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable

Hello all,

On CURRENT 1500018, fetching torrents with net-p2p/qbittorrent finished a ~2GB download, and the connection stayed UP until the end:

---
Apr 10 11:26:46 leg kernel: re0: watchdog timeout
Apr 10 11:26:46 leg kernel: re0: link state changed to DOWN
Apr 10 11:26:49 leg dhclient[58810]: New IP Address (re0): 192.168.1.67
Apr 10 11:26:49 leg dhclient[58814]: New Subnet Mask (re0): 255.255.255.0
Apr 10 11:26:49 leg dhclient[58818]: New Broadcast Address (re0): 192.168.1.255
Apr 10 11:26:49 leg kernel: re0: link state changed to UP
Apr 10 11:26:49 leg dhclient[58822]: New Routers (re0): 192.168.1.1
---

In past tests I got more watchdog timeouts; the connection went down and a reboot was needed to bring it back (`service netif restart` didn't work).

Another way to reproduce this is to use sysutils/restic (a backup program) to read/check all files from a remote server via sftp:

`restic -r sftp:user@remote:restic-repo check --read-data` from a 60GB compressed backup.

---
watchdog timeout x3 as above
---

restic check fail log @ 15% progress:
---
<snip>
Load(<data/52e2923dd6>, 17310001, 0) returned error, retrying after 1.7670599s: connection lost
Load(<data/d27a0abe0f>, 17456892, 0) returned error, retrying after 4.619104908s: connection lost
Load(<data/52e2923dd6>, 17310001, 0) returned error, retrying after 5.477648517s: connection lost
List(lock) returned error, retrying after 293.057766ms: connection lost
List(lock) returned error, retrying after 385.206693ms: connection lost
List(lock) returned error, retrying after 1.577594281s: connection lost
<snip>

Connection continues UP.

Cheers,

<tuexen@freebsd.org> wrote (Thursday, 28/03/2024 at 15:53):

> > On 28. Mar 2024, at 15:00, Nuno Teixeira <eduardo@freebsd.org> wrote:
> >
> > Hello all!
> >
> > Running rack @b7b78c1c169 "Optimize HPTS..." very happy on my laptop (amd64)!
> >
> > Thanks all!
> Thanks for the feedback!
>
> Best regards
> Michael
> >
> > Drew Gallatin <gallatin@freebsd.org> wrote (Thursday, 21/03/2024 at 12:58):
> > The entire point is to *NOT* go through the overhead of scheduling something asynchronously, but to take advantage of the fact that a user/kernel transition is going to trash the cache anyway.
> >
> > In the common case of a system which has fewer than the threshold number of connections, we access the tcp_hpts_softclock function pointer, make one function call, and access hpts_that_need_softclock, and then return.  So that's 2 variables and a function call.
> >
> > I think it would be preferable to avoid that call, and to move the declaration of tcp_hpts_softclock and hpts_that_need_softclock so that they are in the same cacheline.  Then we'd be hitting just a single line in the common case.  (I've made comments on the review to that effect).
> >
> > Also, I wonder if the threshold could get higher by default, so that hpts is never called in this context unless we're to the point where we're scheduling thousands of runs of the hpts thread (and taking all those clock interrupts).
> >
> > Drew
> >
> > On Wed, Mar 20, 2024, at 8:17 PM, Konstantin Belousov wrote:
> >> On Tue, Mar 19, 2024 at 06:19:52AM -0400, rrs wrote:
> >>> Ok I have created
> >>>
> >>> https://reviews.freebsd.org/D44420
> >>>
> >>> to address the issue. I also attach a short version of the patch that Nuno
> >>> can try and validate it works. Drew, you may want to try this and validate
> >>> that the optimization does kick in, since I can only test now that it does
> >>> not on my local box :)
> >> The patch still causes access to all CPUs' cachelines on each userret.
> >> It would be much better to inc/check the threshold and only schedule the
> >> call when exceeded.  Then the call can occur in some dedicated context,
> >> like per-CPU thread, instead of userret.
> >>
> >>>
> >>>
> >>> R
> >>>
> >>>
> >>>
> >>> On 3/18/24 3:42 PM, Drew Gallatin wrote:
> >>>> No.  The goal is to run on every return to userspace for every thread.
> >>>>
> >>>> Drew
> >>>>
> >>>> On Mon, Mar 18, 2024, at 3:41 PM, Konstantin Belousov wrote:
> >>>>> On Mon, Mar 18, 2024 at 03:13:11PM -0400, Drew Gallatin wrote:
> >>>>>> I got the idea from
> >>>>>> https://people.mpi-sws.org/~druschel/publications/soft-timers-tocs.pdf
> >>>>>> The gist is that the TCP pacing stuff needs to run frequently, and
> >>>>>> rather than run it out of a clock interrupt, it's more efficient to run
> >>>>>> it out of a system call context at just the point where we return to
> >>>>>> userspace and the cache is trashed anyway. The current implementation
> >>>>>> is fine for our workload, but probably not ideal for a generic system.
> >>>>>> Especially one where something is banging on system calls.
> >>>>>>
> >>>>>> ASTs could be the right tool for this, but I'm super unfamiliar with
> >>>>>> them, and I can't find any docs on them.
> >>>>>>
> >>>>>> Would ast_register(0, ASTR_UNCOND, 0, func) be roughly equivalent to
> >>>>>> what's happening here?
> >>>>> This call would need some AST number added, and then it registers the
> >>>>> ast to run on next return to userspace, for the current thread.
> >>>>>
> >>>>> Is it enough?
> >>>>>>
> >>>>>> Drew
> >>>>>
> >>>>>>
> >>>>>> On Mon, Mar 18, 2024, at 2:33 PM, Konstantin Belousov wrote:
> >>>>>>> On Mon, Mar 18, 2024 at 07:26:10AM -0500, Mike Karels wrote:
> >>>>>>>> On 18 Mar 2024, at 7:04, tuexen@freebsd.org wrote:
> >>>>>>>>
> >>>>>>>>>> On 18. Mar 2024, at 12:42, Nuno Teixeira <eduardo@freebsd.org> wrote:
> >>>>>>>>>>
> >>>>>>>>>> Hello all!
> >>>>>>>>>>
> >>>>>>>>>> It works just fine!
> >>>>>>>>>> System performance is OK.
> >>>>>>>>>> Using patch on main-n268841-b0aaf8beb126(-dirty).
> >>>>>>>>>>
> >>>>>>>>>> ---
> >>>>>>>>>> net.inet.tcp.functions_available:
> >>>>>>>>>> Stack                           D Alias                            PCB count
> >>>>>>>>>> freebsd                           freebsd                          0
> >>>>>>>>>> rack                            * rack                             38
> >>>>>>>>>> ---
> >>>>>>>>>>
> >>>>>>>>>> It would be so nice if we could have a sysctl tunable for this patch
> >>>>>>>>>> so we could do more tests without recompiling the kernel.
> >>>>>>>>> Thanks for testing!
> >>>>>>>>>
> >>>>>>>>> @gallatin: can you come up with a patch that is acceptable for Netflix
> >>>>>>>>> and allows to mitigate the performance regression.
> >>>>>>>>
> >>>>>>>> Ideally, tcphpts could enable this automatically when it starts to be
> >>>>>>>> used (enough?), but a sysctl could select auto/on/off.
> >>>>>>> There is already a well-known mechanism to request execution of the
> >>>>>>> specific function on return to userspace, namely AST.  The difference
> >>>>>>> with the current hack is that the execution is requested for one callback
> >>>>>>> in the context of the specific thread.
> >>>>>>>
> >>>>>>> Still, it might be worth a try to use it; what is the reason to hit a thread
> >>>>>>> that does not do networking, with TCP processing?
> >>>>>>>
> >>>>>>>>
> >>>>>>>> Mike
> >>>>>>>>
> >>>>>>>>> Best regards
> >>>>>>>>> Michael
> >>>>>>>>>>
> >>>>>>>>>> Thanks all!
> >>>>>>>>>> Really happy here :)
> >>>>>>>>>>
> >>>>>>>>>> Cheers,
> >>>>>>>>>>
> >>>>>>>>>> Nuno Teixeira <eduardo@freebsd.org> wrote (Sunday, 17/03/2024 at 20:26):
> >>>>>>>>>>>
> >>>>>>>>>>> Hello,
> >>>>>>>>>>>
> >>>>>>>>>>>> I don't have the full context, but it seems like the complaint is a performance regression in bonnie++ and perhaps other things when tcp_hpts is loaded, even when it is not used.  Is that correct?
> >>>>>>>>>>>>
> >>>>>>>>>>>> If so, I suspect it's because we drive the tcp_hpts_softclock() routine from userret(), in order to avoid tons of timer interrupts and context switches.  To test this theory, you could apply a patch like:
> >>>>>>>>>>>
> >>>>>>>>>>> It's affecting overall system performance, bonnie was just a way to
> >>>>>>>>>>> get some numbers to compare.
> >>>>>>>>>>>
> >>>>>>>>>>> Tomorrow I will test patch.
> >>>>>>>>>>>
> >>>>>>>>>>> Thanks!
> >>>>>>>>>>>
> >>>>>>>>>>> --
> >>>>>>>>>>> Nuno Teixeira
> >>>>>>>>>>> FreeBSD Committer (ports)
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> --
> >>>>>>>>>> Nuno Teixeira
> >>>>>>>>>> FreeBSD Committer (ports)
> >>>>>>>>
> >>>>>>>
> >>>>>
> >>>>
> >>
> >>> diff --git a/sys/netinet/tcp_hpts.c b/sys/netinet/tcp_hpts.c
> >>> index 8c4d2d41a3eb..eadbee19f69c 100644
> >>> --- a/sys/netinet/tcp_hpts.c
> >>> +++ b/sys/netinet/tcp_hpts.c
> >>> @@ -216,6 +216,7 @@ struct tcp_hpts_entry {
> >>>  	void *ie_cookie;
> >>>  	uint16_t p_num; /* The hpts number one per cpu */
> >>>  	uint16_t p_cpu; /* The hpts CPU */
> >>> +	uint8_t hit_callout_thresh;
> >>>  	/* There is extra space in here */
> >>>  	/* Cache line 0x100 */
> >>>  	struct callout co __aligned(CACHE_LINE_SIZE);
> >>> @@ -269,6 +270,11 @@ static struct hpts_domain_info {
> >>>  	int cpu[MAXCPU];
> >>>  } hpts_domains[MAXMEMDOM];
> >>>
> >>> +counter_u64_t hpts_that_need_softclock;
> >>> +SYSCTL_COUNTER_U64(_net_inet_tcp_hpts_stats, OID_AUTO, needsoftclock, CTLFLAG_RD,
> >>> +    &hpts_that_need_softclock,
> >>> +    "Number of hpts threads that need softclock");
> >>> +
> >>>  counter_u64_t hpts_hopelessly_behind;
> >>>
> >>>  SYSCTL_COUNTER_U64(_net_inet_tcp_hpts_stats, OID_AUTO, hopeless, CTLFLAG_RD,
> >>> @@ -334,7 +340,7 @@ SYSCTL_INT(_net_inet_tcp_hpts, OID_AUTO, precision, CTLFLAG_RW,
> >>>      &tcp_hpts_precision, 120,
> >>>      "Value for PRE() precision of callout");
> >>>  SYSCTL_INT(_net_inet_tcp_hpts, OID_AUTO, cnt_thresh, CTLFLAG_RW,
> >>> -    &conn_cnt_thresh, 0,
> >>> +    &conn_cnt_thresh, DEFAULT_CONNECTION_THESHOLD,
> >>>      "How many connections (below) make us use the callout based mechanism");
> >>>  SYSCTL_INT(_net_inet_tcp_hpts, OID_AUTO, logging, CTLFLAG_RW,
> >>>      &hpts_does_tp_logging, 0,
> >>> @@ -1548,6 +1554,9 @@ __tcp_run_hpts(void)
> >>>  	struct tcp_hpts_entry *hpts;
> >>>  	int ticks_ran;
> >>>
> >>> +	if (counter_u64_fetch(hpts_that_need_softclock) == 0)
> >>> +		return;
> >>> +
> >>>  	hpts = tcp_choose_hpts_to_run();
> >>>
> >>>  	if (hpts->p_hpts_active) {
> >>> @@ -1683,6 +1692,13 @@ tcp_hpts_thread(void *ctx)
> >>>  	ticks_ran = tcp_hptsi(hpts, 1);
> >>>  	tv.tv_sec = 0;
> >>>  	tv.tv_usec = hpts->p_hpts_sleep_time * HPTS_TICKS_PER_SLOT;
> >>> +	if ((hpts->p_on_queue_cnt > conn_cnt_thresh) && (hpts->hit_callout_thresh == 0)) {
> >>> +		hpts->hit_callout_thresh = 1;
> >>> +		counter_u64_add(hpts_that_need_softclock, 1);
> >>> +	} else if ((hpts->p_on_queue_cnt <= conn_cnt_thresh) && (hpts->hit_callout_thresh == 1)) {
> >>> +		hpts->hit_callout_thresh = 0;
> >>> +		counter_u64_add(hpts_that_need_softclock, -1);
> >>> +	}
> >>>  	if (hpts->p_on_queue_cnt >= conn_cnt_thresh) {
> >>>  		if(hpts->p_direct_wake == 0) {
> >>>  			/*
> >>> @@ -1818,6 +1834,7 @@ tcp_hpts_mod_load(void)
> >>>  	cpu_top = NULL;
> >>>  #endif
> >>>  	tcp_pace.rp_num_hptss = ncpus;
> >>> +	hpts_that_need_softclock = counter_u64_alloc(M_WAITOK);
> >>>  	hpts_hopelessly_behind = counter_u64_alloc(M_WAITOK);
> >>>  	hpts_loops = counter_u64_alloc(M_WAITOK);
> >>>  	back_tosleep = counter_u64_alloc(M_WAITOK);
> >>> @@ -2042,6 +2059,7 @@ tcp_hpts_mod_unload(void)
> >>>  	free(tcp_pace.grps, M_TCPHPTS);
> >>>  #endif
> >>>
> >>> +	counter_u64_free(hpts_that_need_softclock);
> >>>  	counter_u64_free(hpts_hopelessly_behind);
> >>>  	counter_u64_free(hpts_loops);
> >>>  	counter_u64_free(back_tosleep);
> >>
> >>
> >
> >
> >
> > --
> > Nuno Teixeira
> > FreeBSD Committer (ports)
> >

-- 
Nuno Teixeira
FreeBSD Committer (ports)

--0000000000004c6d020615bc81cf Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
--0000000000004c6d020615bc81cf-- From nobody Wed Apr 10 12:11:54 2024 X-Original-To: current@mlmmj.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mlmmj.nyi.freebsd.org (Postfix) with ESMTP id 4VF1sd6DPgz5H1cg; Wed, 10 Apr 2024 12:12:05 +0000 (UTC) (envelope-from tuexen@freebsd.org) Received: from drew.franken.de (drew.ipv6.franken.de [IPv6:2001:638:a02:a001:20e:cff:fe4a:feaa]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "*.franken.de", Issuer "Sectigo RSA Domain Validation Secure Server CA" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 4VF1sd378Cz4MhP; Wed, 10 Apr 2024 12:12:05 +0000 (UTC) (envelope-from tuexen@freebsd.org) Authentication-Results: mx1.freebsd.org; none Received: from smtpclient.apple (unknown [IPv6:2a02:8109:1140:c3d:fdb8:ab7b:81de:24b7]) (Authenticated sender: micmac) by drew.franken.de (Postfix) with ESMTPSA id B916C7220B801; Wed, 10 Apr 2024 14:11:54 +0200 (CEST) Content-Type: text/plain; charset=utf-8 List-Id: Discussions about the use of FreeBSD-current List-Archive: https://lists.freebsd.org/archives/freebsd-current List-Help: List-Post: List-Subscribe: List-Unsubscribe: Sender: owner-freebsd-current@freebsd.org Mime-Version: 1.0 (Mac OS X Mail 16.0 \(3774.500.171.1.1\)) Subject: Re: Request for Testing: TCP RACK From: tuexen@freebsd.org In-Reply-To: Date: Wed, 10 Apr 2024 14:11:54 +0200 Cc: Drew Gallatin , Konstantin Belousov , rrs , Mike Karels , garyj@gmx.de, current@freebsd.org, net@freebsd.org, Randall Stewart Content-Transfer-Encoding: quoted-printable Message-Id: <52479AA6-04F6-4D4A-ABE0-7142B47E28DF@freebsd.org> References: <6e795e9c-8de4-4e02-9a96-8fabfaa4e66f@app.fastmail.com> <6047C8EF-B1B0-4286-93FA-AA38F8A18656@karels.net> <8031cd99-ded8-4b06-93b3-11cc729a8b2c@app.fastmail.com> <38c54399-6c96-44d8-a3a2-3cc1bfbe50c2@app.fastmail.com> <27d8144f-0658-46f6-b8f3-35eb60061644@lakerest.net> 
<5C9863F7-0F1C-4D02-9F6D-9DDC5FBEB368@freebsd.org> To: Nuno Teixeira X-Mailer: Apple Mail (2.3774.500.171.1.1) X-Spam-Status: No, score=-2.9 required=5.0 tests=ALL_TRUSTED,BAYES_00 autolearn=disabled version=3.4.1 X-Spam-Checker-Version: SpamAssassin 3.4.1 (2015-04-28) on mail-n.franken.de X-Spamd-Bar: ---- X-Rspamd-Pre-Result: action=no action; module=replies; Message is reply to one we originated X-Spamd-Result: default: False [-4.00 / 15.00]; REPLY(-4.00)[]; ASN(0.00)[asn:680, ipnet:2001:638::/32, country:DE] X-Rspamd-Queue-Id: 4VF1sd378Cz4MhP

> On 10. Apr 2024, at 13:40, Nuno Teixeira wrote:
>
> Hello all,
>
> @ current 1500018 and fetching torrents with net-p2p/qbittorrent finished ~2GB download and connection UP until the end:
>
> ---
> Apr 10 11:26:46 leg kernel: re0: watchdog timeout
> Apr 10 11:26:46 leg kernel: re0: link state changed to DOWN
> Apr 10 11:26:49 leg dhclient[58810]: New IP Address (re0): 192.168.1.67
> Apr 10 11:26:49 leg dhclient[58814]: New Subnet Mask (re0): 255.255.255.0
> Apr 10 11:26:49 leg dhclient[58818]: New Broadcast Address (re0): 192.168.1.255
> Apr 10 11:26:49 leg kernel: re0: link state changed to UP
> Apr 10 11:26:49 leg dhclient[58822]: New Routers (re0): 192.168.1.1
> ---
>
> In the past tests, I've got more watchdog timeouts, connection goes down and a reboot needed to put it back (`service netif restart` didn't work).
>
> Other way to reproduce this is using sysutils/restic (backup program) to read/check all files from a remote server via sftp:
>
> `restic -r sftp:user@remote:restic-repo check --read-data` from a 60GB compressed backup.
>
> ---
> watchdog timeout x3 as above
> ---
>
> restic check fail log @ 15% progress:
> ---
> <snip>
> Load(<data/52e2923dd6>, 17310001, 0) returned error, retrying after 1.7670599s: connection lost
> Load(<data/d27a0abe0f>, 17456892, 0) returned error, retrying after 4.619104908s: connection lost
> Load(<data/52e2923dd6>, 17310001, 0) returned error, retrying after 5.477648517s: connection lost
> List(lock) returned error, retrying after 293.057766ms: connection lost
> List(lock) returned error, retrying after 385.206693ms: connection lost
> List(lock) returned error, retrying after 1.577594281s: connection lost
> <snip>
>
> Connection continues UP.
Hi,

I'm not sure what the issue is you are reporting. Could you state
what behavior you are experiencing with the base stack and with
the RACK stack. In particular, what the difference is?

Best regards
Michael
>
> Cheers,
>
> <tuexen@freebsd.org> wrote (Thursday, 28/03/2024 at 15:53):
>> On 28. Mar 2024, at 15:00, Nuno Teixeira wrote:
>>
>> Hello all!
>>
>> Running rack @b7b78c1c169 "Optimize HPTS..." very happy on my laptop (amd64)!
>>
>> Thanks all!
> Thanks for the feedback!
>
> Best regards
> Michael
>>
>> Drew Gallatin <gallatin@freebsd.org> wrote (Thursday, 21/03/2024 at 12:58):
>> The entire point is to *NOT* go through the overhead of scheduling something asynchronously, but to take advantage of the fact that a user/kernel transition is going to trash the cache anyway.
>>
>> In the common case of a system which has less than the threshold number of connections, we access the tcp_hpts_softclock function pointer, make one function call, and access hpts_that_need_softclock, and then return. So that's 2 variables and a function call.
>>
>> I think it would be preferable to avoid that call, and to move the declaration of tcp_hpts_softclock and hpts_that_need_softclock so that they are in the same cacheline. Then we'd be hitting just a single line in the common case. (I've made comments on the review to that effect).
>>=20 >> Also, I wonder if the threshold could get higher by default, so that = hpts is never called in this context unless we're to the point where = we're scheduling thousands of runs of the hpts thread (and taking all = those clock interrupts). >>=20 >> Drew >>=20 >> On Wed, Mar 20, 2024, at 8:17 PM, Konstantin Belousov wrote: >>> On Tue, Mar 19, 2024 at 06:19:52AM -0400, rrs wrote: >>>> Ok I have created >>>>=20 >>>> https://reviews.freebsd.org/D44420 >>>>=20 >>>>=20 >>>> To address the issue. I also attach a short version of the patch = that Nuno >>>> can try and validate >>>>=20 >>>> it works. Drew you may want to try this and validate the = optimization does >>>> kick in since I can >>>>=20 >>>> only now test that it does not on my local box :) >>> The patch still causes access to all cpu's cachelines on each = userret. >>> It would be much better to inc/check the threshold and only schedule = the >>> call when exceeded. Then the call can occur in some dedicated = context, >>> like per-CPU thread, instead of userret. >>>=20 >>>>=20 >>>>=20 >>>> R >>>>=20 >>>>=20 >>>>=20 >>>> On 3/18/24 3:42 PM, Drew Gallatin wrote: >>>>> No. The goal is to run on every return to userspace for every = thread. >>>>>=20 >>>>> Drew >>>>>=20 >>>>> On Mon, Mar 18, 2024, at 3:41 PM, Konstantin Belousov wrote: >>>>>> On Mon, Mar 18, 2024 at 03:13:11PM -0400, Drew Gallatin wrote: >>>>>>> I got the idea from >>>>>>> = https://people.mpi-sws.org/~druschel/publications/soft-timers-tocs.pdf >>>>>>> The gist is that the TCP pacing stuff needs to run frequently, = and >>>>>>> rather than run it out of a clock interrupt, its more efficient = to run >>>>>>> it out of a system call context at just the point where we = return to >>>>>>> userspace and the cache is trashed anyway. The current = implementation >>>>>>> is fine for our workload, but probably not idea for a generic = system. >>>>>>> Especially one where something is banging on system calls. 
>>>>>>>=20 >>>>>>> Ast's could be the right tool for this, but I'm super unfamiliar = with >>>>>>> them, and I can't find any docs on them. >>>>>>>=20 >>>>>>> Would ast_register(0, ASTR_UNCOND, 0, func) be roughly = equivalent to >>>>>>> what's happening here? >>>>>> This call would need some AST number added, and then it registers = the >>>>>> ast to run on next return to userspace, for the current thread. >>>>>>=20 >>>>>> Is it enough? >>>>>>>=20 >>>>>>> Drew >>>>>>=20 >>>>>>>=20 >>>>>>> On Mon, Mar 18, 2024, at 2:33 PM, Konstantin Belousov wrote: >>>>>>>> On Mon, Mar 18, 2024 at 07:26:10AM -0500, Mike Karels wrote: >>>>>>>>> On 18 Mar 2024, at 7:04, tuexen@freebsd.org wrote: >>>>>>>>>=20 >>>>>>>>>>> On 18. Mar 2024, at 12:42, Nuno Teixeira >>>>>> wrote: >>>>>>>>>>>=20 >>>>>>>>>>> Hello all! >>>>>>>>>>>=20 >>>>>>>>>>> It works just fine! >>>>>>>>>>> System performance is OK. >>>>>>>>>>> Using patch on main-n268841-b0aaf8beb126(-dirty). >>>>>>>>>>>=20 >>>>>>>>>>> --- >>>>>>>>>>> net.inet.tcp.functions_available: >>>>>>>>>>> Stack D >>>>>> Alias PCB count >>>>>>>>>>> freebsd freebsd 0 >>>>>>>>>>> rack * >>>>>> rack 38 >>>>>>>>>>> --- >>>>>>>>>>>=20 >>>>>>>>>>> It would be so nice that we can have a sysctl tunnable for >>>>>> this patch >>>>>>>>>>> so we could do more tests without recompiling kernel. >>>>>>>>>> Thanks for testing! >>>>>>>>>>=20 >>>>>>>>>> @gallatin: can you come up with a patch that is acceptable >>>>>> for Netflix >>>>>>>>>> and allows to mitigate the performance regression. >>>>>>>>>=20 >>>>>>>>> Ideally, tcphpts could enable this automatically when it >>>>>> starts to be >>>>>>>>> used (enough?), but a sysctl could select auto/on/off. >>>>>>>> There is already a well-known mechanism to request execution of = the >>>>>>>> specific function on return to userspace, namely AST. The = difference >>>>>>>> with the current hack is that the execution is requested for = one >>>>>> callback >>>>>>>> in the context of the specific thread. 
>>>>>>>>=20 >>>>>>>> Still, it might be worth a try to use it; what is the reason to >>>>>> hit a thread >>>>>>>> that does not do networking, with TCP processing? >>>>>>>>=20 >>>>>>>>>=20 >>>>>>>>> Mike >>>>>>>>>=20 >>>>>>>>>> Best regards >>>>>>>>>> Michael >>>>>>>>>>>=20 >>>>>>>>>>> Thanks all! >>>>>>>>>>> Really happy here :) >>>>>>>>>>>=20 >>>>>>>>>>> Cheers, >>>>>>>>>>>=20 >>>>>>>>>>> Nuno Teixeira escreveu (domingo, >>>>>> 17/03/2024 =C3=A0(s) 20:26): >>>>>>>>>>>>=20 >>>>>>>>>>>> Hello, >>>>>>>>>>>>=20 >>>>>>>>>>>>> I don't have the full context, but it seems like the >>>>>> complaint is a performance regression in bonnie++ and perhaps = other >>>>>> things when tcp_hpts is loaded, even when it is not used. Is = that >>>>>> correct? >>>>>>>>>>>>>=20 >>>>>>>>>>>>> If so, I suspect its because we drive the >>>>>> tcp_hpts_softclock() routine from userret(), in order to avoid = tons >>>>>> of timer interrupts and context switches. To test this theory, = you >>>>>> could apply a patch like: >>>>>>>>>>>>=20 >>>>>>>>>>>> It's affecting overall system performance, bonnie was just >>>>>> a way to >>>>>>>>>>>> get some numbers to compare. >>>>>>>>>>>>=20 >>>>>>>>>>>> Tomorrow I will test patch. >>>>>>>>>>>>=20 >>>>>>>>>>>> Thanks! 
>>>>>>>>>>>>=20 >>>>>>>>>>>> -- >>>>>>>>>>>> Nuno Teixeira >>>>>>>>>>>> FreeBSD Committer (ports) >>>>>>>>>>>=20 >>>>>>>>>>>=20 >>>>>>>>>>>=20 >>>>>>>>>>> -- >>>>>>>>>>> Nuno Teixeira >>>>>>>>>>> FreeBSD Committer (ports) >>>>>>>>>=20 >>>>>>>>=20 >>>>>>=20 >>>>>=20 >>>=20 >>>> diff --git a/sys/netinet/tcp_hpts.c b/sys/netinet/tcp_hpts.c >>>> index 8c4d2d41a3eb..eadbee19f69c 100644 >>>> --- a/sys/netinet/tcp_hpts.c >>>> +++ b/sys/netinet/tcp_hpts.c >>>> @@ -216,6 +216,7 @@ struct tcp_hpts_entry { >>>> void *ie_cookie; >>>> uint16_t p_num; /* The hpts number one per cpu */ >>>> uint16_t p_cpu; /* The hpts CPU */ >>>> + uint8_t hit_callout_thresh; >>>> /* There is extra space in here */ >>>> /* Cache line 0x100 */ >>>> struct callout co __aligned(CACHE_LINE_SIZE); >>>> @@ -269,6 +270,11 @@ static struct hpts_domain_info { >>>> int cpu[MAXCPU]; >>>> } hpts_domains[MAXMEMDOM]; >>>>=20 >>>> +counter_u64_t hpts_that_need_softclock; >>>> +SYSCTL_COUNTER_U64(_net_inet_tcp_hpts_stats, OID_AUTO, = needsoftclock, CTLFLAG_RD, >>>> + &hpts_that_need_softclock, >>>> + "Number of hpts threads that need softclock"); >>>> + >>>> counter_u64_t hpts_hopelessly_behind; >>>>=20 >>>> SYSCTL_COUNTER_U64(_net_inet_tcp_hpts_stats, OID_AUTO, hopeless, = CTLFLAG_RD, >>>> @@ -334,7 +340,7 @@ SYSCTL_INT(_net_inet_tcp_hpts, OID_AUTO, = precision, CTLFLAG_RW, >>>> &tcp_hpts_precision, 120, >>>> "Value for PRE() precision of callout"); >>>> SYSCTL_INT(_net_inet_tcp_hpts, OID_AUTO, cnt_thresh, CTLFLAG_RW, >>>> - &conn_cnt_thresh, 0, >>>> + &conn_cnt_thresh, DEFAULT_CONNECTION_THESHOLD, >>>> "How many connections (below) make us use the callout based = mechanism"); >>>> SYSCTL_INT(_net_inet_tcp_hpts, OID_AUTO, logging, CTLFLAG_RW, >>>> &hpts_does_tp_logging, 0, >>>> @@ -1548,6 +1554,9 @@ __tcp_run_hpts(void) >>>> struct tcp_hpts_entry *hpts; >>>> int ticks_ran; >>>>=20 >>>> + if (counter_u64_fetch(hpts_that_need_softclock) =3D=3D 0) >>>> + return; >>>> + >>>> hpts =3D tcp_choose_hpts_to_run(); 
>>>>=20 >>>> if (hpts->p_hpts_active) { >>>> @@ -1683,6 +1692,13 @@ tcp_hpts_thread(void *ctx) >>>> ticks_ran =3D tcp_hptsi(hpts, 1); >>>> tv.tv_sec =3D 0; >>>> tv.tv_usec =3D hpts->p_hpts_sleep_time * HPTS_TICKS_PER_SLOT; >>>> + if ((hpts->p_on_queue_cnt > conn_cnt_thresh) && = (hpts->hit_callout_thresh =3D=3D 0)) { >>>> + hpts->hit_callout_thresh =3D 1; >>>> + counter_u64_add(hpts_that_need_softclock, 1); >>>> + } else if ((hpts->p_on_queue_cnt <=3D conn_cnt_thresh) && = (hpts->hit_callout_thresh =3D=3D 1)) { >>>> + hpts->hit_callout_thresh =3D 0; >>>> + counter_u64_add(hpts_that_need_softclock, -1); >>>> + } >>>> if (hpts->p_on_queue_cnt >=3D conn_cnt_thresh) { >>>> if(hpts->p_direct_wake =3D=3D 0) { >>>> /* >>>> @@ -1818,6 +1834,7 @@ tcp_hpts_mod_load(void) >>>> cpu_top =3D NULL; >>>> #endif >>>> tcp_pace.rp_num_hptss =3D ncpus; >>>> + hpts_that_need_softclock =3D counter_u64_alloc(M_WAITOK); >>>> hpts_hopelessly_behind =3D counter_u64_alloc(M_WAITOK); >>>> hpts_loops =3D counter_u64_alloc(M_WAITOK); >>>> back_tosleep =3D counter_u64_alloc(M_WAITOK); >>>> @@ -2042,6 +2059,7 @@ tcp_hpts_mod_unload(void) >>>> free(tcp_pace.grps, M_TCPHPTS); >>>> #endif >>>>=20 >>>> + counter_u64_free(hpts_that_need_softclock); >>>> counter_u64_free(hpts_hopelessly_behind); >>>> counter_u64_free(hpts_loops); >>>> counter_u64_free(back_tosleep); >>>=20 >>>=20 >>=20 >>=20 >>=20 >> --=20 >> Nuno Teixeira >> FreeBSD Committer (ports) >=20 >=20 >=20 > --=20 > Nuno Teixeira > FreeBSD Committer (ports) From nobody Wed Apr 10 12:39:17 2024 X-Original-To: current@mlmmj.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mlmmj.nyi.freebsd.org (Postfix) with ESMTP id 4VF2TG09Dlz5H380; Wed, 10 Apr 2024 12:39:30 +0000 (UTC) (envelope-from eduardo@freebsd.org) Received: from smtp.freebsd.org (smtp.freebsd.org [96.47.72.83]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) 
server-digest SHA256 client-signature RSA-PSS (4096 bits) client-digest SHA256) (Client CN "smtp.freebsd.org", Issuer "R3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 4VF2TF6pbSz4R7d; Wed, 10 Apr 2024 12:39:29 +0000 (UTC) (envelope-from eduardo@freebsd.org) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=freebsd.org; s=dkim; t=1712752769; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=DpkPlNoF0Dic+HFl8IAcIYjH1s/pGnxXnG0kc5ttbZM=; b=VXO6sAPz5owi9bvRhnUIsoBeGkaCJWcoUvmZ+Pq8+3/8VVfZli9MhctJZQPlLI9krg97z8 4g2MZqO1ATKgp8ETVsqPEEdJ7z7kxbPk9//fgghq9u/X0Xxoxd0ECGskb+03pPH0yTwP/z kSiq5luCPO+p/ujm53CGjfkKAfNg8U8cOTvgYo/Jtbao7DEeqrSPXzE6dlpp9IugxtAq+l f7cIeCwW6FGahDnX5P/dLowLgKIza6Xj4gD345Z/yLmfOAMlRbS5KQhKclQLhw7XEpHaJa KZ17KalDZOdP4XkhnY4VVy3Pi8/DTBD8Q0v+Ak/3fZadxlzXbp+GFMDp6hF+Vg== ARC-Seal: i=1; s=dkim; d=freebsd.org; t=1712752769; a=rsa-sha256; cv=none; b=TyiRyIsxByXPlIVp5WISYZzdaOZCSbPZTEy8ilqeJ7WfqptPQwNsgrFkficXRxtSCBo4cd av+zLwmXEm/DdvZJUa+vf2U9RwzeB5Bca65AgCL2+OIVFEawS1BaASX3YE1EWWi89FnU5Y fGVcetIOJ75NTNqXNAD+10dEKDZgiO5n/ZocIXSx8nrV5eJwiRpYwq9IW8DqzgpMpomI2A gYSp/Q1CHhp+evK7UA3rjmLdALu02dBOOLnq2HNOh6t35sR2iMeFYAoa2Fz3i/R+URV/Ra Fz261QLiS1TxU8jM0yFojJGFdpYUPixMjBbo9x5Gn0F6dXI5yHtRxYo+FNzqiA== ARC-Authentication-Results: i=1; mx1.freebsd.org; none ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=freebsd.org; s=dkim; t=1712752769; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=DpkPlNoF0Dic+HFl8IAcIYjH1s/pGnxXnG0kc5ttbZM=; b=IC72YzBOQOtfrJH1iu9Aw1NX9NXw/HqUCQ/yqQhZmtpJks15GRcvyDT2dzWqwtiKHYbwBH ft46y18etBkr+1TbGUmJpNU1Q652PsmCV3z4owBTIcEdk9K1v1VGsOrXEsrlnXY1uRsmhL oNY5i6bXc3dReJxjwSizVUQuO5T5wYSFLPfmYJ0tT17SFGYCRxmAeIXya3EYYEvdGo5ZiY 
gzlmL1Y39Ffgn9ZII2ZnwCJ2ZQ3OAPH4yok5ZbxVP8Ngl2UaWRnFsrUgf/ib6yOYM3SmJA w+v/sMukFLH5dmh2w76+B7TYx6BLqO6Jbnq6absbM5/gY8f7n/r478Hjrhvz+g== Received: from mail-qt1-f180.google.com (mail-qt1-f180.google.com [209.85.160.180]) (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256 client-signature RSA-PSS (2048 bits) client-digest SHA256) (Client CN "smtp.gmail.com", Issuer "GTS CA 1D4" (verified OK)) (Authenticated sender: eduardo) by smtp.freebsd.org (Postfix) with ESMTPSA id 4VF2TF6C2mzM4K; Wed, 10 Apr 2024 12:39:29 +0000 (UTC) (envelope-from eduardo@freebsd.org) Received: by mail-qt1-f180.google.com with SMTP id d75a77b69052e-434899d6d2bso13270921cf.0; Wed, 10 Apr 2024 05:39:29 -0700 (PDT) X-Forwarded-Encrypted: i=1; AJvYcCV5+GMBASN1pIuTI+qiCXyeMWT46xWeBCPVyYb5FVWoDOM35OMBb4e1aN/UjZ+lPzZqzaqYjP71JZYjUJUHGzA/Dai7FJqaEZcSk/CdymQfhd+MNA9SE0Th3RiIUIHqjpWEkBHmQgIZUNQ0u+vptbA2PGoLet17 X-Gm-Message-State: AOJu0Yz4dRLEi1tzX8cvrm6KR7iqEtd8TWYKZxpbCGdsS45xJDNK3X/1 cGP+StQxJB+iCwprNukSOmxqMeL78B6Y3wamLbibtveAyrfk7gEHbjyoaAMStlLG4MHSKaTV8jk RABPJzRt9hvSpALIH0k/8sFVLTu0= X-Google-Smtp-Source: AGHT+IELruYdl2HXz5eJ5u2RBZt4dvlQU22q2uwUByxBPplP6lm08HPZaprphMaSnqC7DujvKztmckYLSruG29p5yrc= X-Received: by 2002:ac8:7c52:0:b0:436:4e4c:3cab with SMTP id o18-20020ac87c52000000b004364e4c3cabmr170207qtv.33.1712752769307; Wed, 10 Apr 2024 05:39:29 -0700 (PDT) List-Id: Discussions about the use of FreeBSD-current List-Archive: https://lists.freebsd.org/archives/freebsd-current List-Help: List-Post: List-Subscribe: List-Unsubscribe: Sender: owner-freebsd-current@freebsd.org MIME-Version: 1.0 References: <6e795e9c-8de4-4e02-9a96-8fabfaa4e66f@app.fastmail.com> <6047C8EF-B1B0-4286-93FA-AA38F8A18656@karels.net> <8031cd99-ded8-4b06-93b3-11cc729a8b2c@app.fastmail.com> <38c54399-6c96-44d8-a3a2-3cc1bfbe50c2@app.fastmail.com> <27d8144f-0658-46f6-b8f3-35eb60061644@lakerest.net> 
<5C9863F7-0F1C-4D02-9F6D-9DDC5FBEB368@freebsd.org> <52479AA6-04F6-4D4A-ABE0-7142B47E28DF@freebsd.org> In-Reply-To: <52479AA6-04F6-4D4A-ABE0-7142B47E28DF@freebsd.org> From: Nuno Teixeira Date: Wed, 10 Apr 2024 13:39:17 +0100 X-Gmail-Original-Message-ID: Message-ID: Subject: Re: Request for Testing: TCP RACK To: tuexen@freebsd.org Cc: Drew Gallatin , Konstantin Belousov , rrs , Mike Karels , garyj@gmx.de, current@freebsd.org, net@freebsd.org, Randall Stewart Content-Type: multipart/alternative; boundary="0000000000006723920615bd54db" --0000000000006723920615bd54db Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable With base stack I can complete restic check successfully downloading/reading/checking all files from a "big" remote compressed backup. Changing it to RACK stack, it fails. I run this command often because in the past, compression corruption occured and this is the equivalent of restoring backup to check its integrity. Maybe someone could do a restic test to check if this is reproducible. Thanks, escreveu (quarta, 10/04/2024 =C3=A0(s) 13:12): > > > > On 10. Apr 2024, at 13:40, Nuno Teixeira wrote: > > > > Hello all, > > > > @ current 1500018 and fetching torrents with net-p2p/qbittorrent > finished ~2GB download and connection UP until the end: > > > > --- > > Apr 10 11:26:46 leg kernel: re0: watchdog timeout > > Apr 10 11:26:46 leg kernel: re0: link state changed to DOWN > > Apr 10 11:26:49 leg dhclient[58810]: New IP Address (re0): 192.168.1.67 > > Apr 10 11:26:49 leg dhclient[58814]: New Subnet Mask (re0): 255.255.255= .0 > > Apr 10 11:26:49 leg dhclient[58818]: New Broadcast Address (re0): > 192.168.1.255 > > Apr 10 11:26:49 leg kernel: re0: link state changed to UP > > Apr 10 11:26:49 leg dhclient[58822]: New Routers (re0): 192.168.1.1 > > --- > > > > In the past tests, I've got more watchdog timeouts, connection goes dow= n > and a reboot needed to put it back (`service netif restart` didn't work). 
> > > > Other way to reproduce this is using sysutils/restic (backup program) t= o > read/check all files from a remote server via sftp: > > > > `restic -r sftp:user@remote:restic-repo check --read-data` from a 60GB > compressed backup. > > > > --- > > watchdog timeout x3 as above > > --- > > > > restic check fail log @ 15% progress: > > --- > > > > Load(, 17310001, 0) returned error, retrying after > 1.7670599s: connection lost > > Load(, 17456892, 0) returned error, retrying after > 4.619104908s: connection lost > > Load(, 17310001, 0) returned error, retrying after > 5.477648517s: connection lost > > List(lock) returned error, retrying after 293.057766ms: connection lost > > List(lock) returned error, retrying after 385.206693ms: connection lost > > List(lock) returned error, retrying after 1.577594281s: connection lost > > > > > > Connection continues UP. > Hi, > > I'm not sure what the issue is you are reporting. Could you state > what behavior you are experiencing with the base stack and with > the RACK stack. In particular, what the difference is? > > Best regards > Michael > > > > Cheers, > > > > escreveu (quinta, 28/03/2024 =C3=A0(s) 15:53): > >> On 28. Mar 2024, at 15:00, Nuno Teixeira wrote: > >> > >> Hello all! > >> > >> Running rack @b7b78c1c169 "Optimize HPTS..." very happy on my laptop > (amd64)! > >> > >> Thanks all! > > Thanks for the feedback! > > > > Best regards > > Michael > >> > >> Drew Gallatin escreveu (quinta, 21/03/2024 =C3= =A0(s) > 12:58): > >> The entire point is to *NOT* go through the overhead of scheduling > something asynchronously, but to take advantage of the fact that a > user/kernel transition is going to trash the cache anyway. > >> > >> In the common case of a system which has less than the threshold > number of connections , we access the tcp_hpts_softclock function pointer= , > make one function call, and access hpts_that_need_softclock, and then > return. So that's 2 variables and a function call. 
> >> > >> I think it would be preferable to avoid that call, and to move the > declaration of tcp_hpts_softclock and hpts_that_need_softclock so that th= ey > are in the same cacheline. Then we'd be hitting just a single line in th= e > common case. (I've made comments on the review to that effect). > >> > >> Also, I wonder if the threshold could get higher by default, so that > hpts is never called in this context unless we're to the point where we'r= e > scheduling thousands of runs of the hpts thread (and taking all those clo= ck > interrupts). > >> > >> Drew > >> > >> On Wed, Mar 20, 2024, at 8:17 PM, Konstantin Belousov wrote: > >>> On Tue, Mar 19, 2024 at 06:19:52AM -0400, rrs wrote: > >>>> Ok I have created > >>>> > >>>> https://reviews.freebsd.org/D44420 > >>>> > >>>> > >>>> To address the issue. I also attach a short version of the patch tha= t > Nuno > >>>> can try and validate > >>>> > >>>> it works. Drew you may want to try this and validate the optimizatio= n > does > >>>> kick in since I can > >>>> > >>>> only now test that it does not on my local box :) > >>> The patch still causes access to all cpu's cachelines on each userret= . > >>> It would be much better to inc/check the threshold and only schedule > the > >>> call when exceeded. Then the call can occur in some dedicated contex= t, > >>> like per-CPU thread, instead of userret. > >>> > >>>> > >>>> > >>>> R > >>>> > >>>> > >>>> > >>>> On 3/18/24 3:42 PM, Drew Gallatin wrote: > >>>>> No. The goal is to run on every return to userspace for every > thread. 
> >>>>> > >>>>> Drew > >>>>> > >>>>> On Mon, Mar 18, 2024, at 3:41 PM, Konstantin Belousov wrote: > >>>>>> On Mon, Mar 18, 2024 at 03:13:11PM -0400, Drew Gallatin wrote: > >>>>>>> I got the idea from > >>>>>>> > https://people.mpi-sws.org/~druschel/publications/soft-timers-tocs.pdf > >>>>>>> The gist is that the TCP pacing stuff needs to run frequently, an= d > >>>>>>> rather than run it out of a clock interrupt, its more efficient t= o > run > >>>>>>> it out of a system call context at just the point where we return > to > >>>>>>> userspace and the cache is trashed anyway. The current > implementation > >>>>>>> is fine for our workload, but probably not idea for a generic > system. > >>>>>>> Especially one where something is banging on system calls. > >>>>>>> > >>>>>>> Ast's could be the right tool for this, but I'm super unfamiliar > with > >>>>>>> them, and I can't find any docs on them. > >>>>>>> > >>>>>>> Would ast_register(0, ASTR_UNCOND, 0, func) be roughly equivalent > to > >>>>>>> what's happening here? > >>>>>> This call would need some AST number added, and then it registers > the > >>>>>> ast to run on next return to userspace, for the current thread. > >>>>>> > >>>>>> Is it enough? > >>>>>>> > >>>>>>> Drew > >>>>>> > >>>>>>> > >>>>>>> On Mon, Mar 18, 2024, at 2:33 PM, Konstantin Belousov wrote: > >>>>>>>> On Mon, Mar 18, 2024 at 07:26:10AM -0500, Mike Karels wrote: > >>>>>>>>> On 18 Mar 2024, at 7:04, tuexen@freebsd.org wrote: > >>>>>>>>> > >>>>>>>>>>> On 18. Mar 2024, at 12:42, Nuno Teixeira > >>>>>> wrote: > >>>>>>>>>>> > >>>>>>>>>>> Hello all! > >>>>>>>>>>> > >>>>>>>>>>> It works just fine! > >>>>>>>>>>> System performance is OK. > >>>>>>>>>>> Using patch on main-n268841-b0aaf8beb126(-dirty). 
> >>>>>>>>>>> > >>>>>>>>>>> --- > >>>>>>>>>>> net.inet.tcp.functions_available: > >>>>>>>>>>> Stack D > >>>>>> Alias PCB count > >>>>>>>>>>> freebsd freebsd 0 > >>>>>>>>>>> rack * > >>>>>> rack 38 > >>>>>>>>>>> --- > >>>>>>>>>>> > >>>>>>>>>>> It would be so nice that we can have a sysctl tunnable for > >>>>>> this patch > >>>>>>>>>>> so we could do more tests without recompiling kernel. > >>>>>>>>>> Thanks for testing! > >>>>>>>>>> > >>>>>>>>>> @gallatin: can you come up with a patch that is acceptable > >>>>>> for Netflix > >>>>>>>>>> and allows to mitigate the performance regression. > >>>>>>>>> > >>>>>>>>> Ideally, tcphpts could enable this automatically when it > >>>>>> starts to be > >>>>>>>>> used (enough?), but a sysctl could select auto/on/off. > >>>>>>>> There is already a well-known mechanism to request execution of > the > >>>>>>>> specific function on return to userspace, namely AST. The > difference > >>>>>>>> with the current hack is that the execution is requested for one > >>>>>> callback > >>>>>>>> in the context of the specific thread. > >>>>>>>> > >>>>>>>> Still, it might be worth a try to use it; what is the reason to > >>>>>> hit a thread > >>>>>>>> that does not do networking, with TCP processing? > >>>>>>>> > >>>>>>>>> > >>>>>>>>> Mike > >>>>>>>>> > >>>>>>>>>> Best regards > >>>>>>>>>> Michael > >>>>>>>>>>> > >>>>>>>>>>> Thanks all! > >>>>>>>>>>> Really happy here :) > >>>>>>>>>>> > >>>>>>>>>>> Cheers, > >>>>>>>>>>> > >>>>>>>>>>> Nuno Teixeira escreveu (domingo, > >>>>>> 17/03/2024 =C3=A0(s) 20:26): > >>>>>>>>>>>> > >>>>>>>>>>>> Hello, > >>>>>>>>>>>> > >>>>>>>>>>>>> I don't have the full context, but it seems like the > >>>>>> complaint is a performance regression in bonnie++ and perhaps othe= r > >>>>>> things when tcp_hpts is loaded, even when it is not used. Is that > >>>>>> correct? 
> >>>>>>>>>>>>> > >>>>>>>>>>>>> If so, I suspect its because we drive the > >>>>>> tcp_hpts_softclock() routine from userret(), in order to avoid ton= s > >>>>>> of timer interrupts and context switches. To test this theory, y= ou > >>>>>> could apply a patch like: > >>>>>>>>>>>> > >>>>>>>>>>>> It's affecting overall system performance, bonnie was just > >>>>>> a way to > >>>>>>>>>>>> get some numbers to compare. > >>>>>>>>>>>> > >>>>>>>>>>>> Tomorrow I will test patch. > >>>>>>>>>>>> > >>>>>>>>>>>> Thanks! > >>>>>>>>>>>> > >>>>>>>>>>>> -- > >>>>>>>>>>>> Nuno Teixeira > >>>>>>>>>>>> FreeBSD Committer (ports) > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> -- > >>>>>>>>>>> Nuno Teixeira > >>>>>>>>>>> FreeBSD Committer (ports) > >>>>>>>>> > >>>>>>>> > >>>>>> > >>>>> > >>> > >>>> diff --git a/sys/netinet/tcp_hpts.c b/sys/netinet/tcp_hpts.c > >>>> index 8c4d2d41a3eb..eadbee19f69c 100644 > >>>> --- a/sys/netinet/tcp_hpts.c > >>>> +++ b/sys/netinet/tcp_hpts.c > >>>> @@ -216,6 +216,7 @@ struct tcp_hpts_entry { > >>>> void *ie_cookie; > >>>> uint16_t p_num; /* The hpts number one per cpu */ > >>>> uint16_t p_cpu; /* The hpts CPU */ > >>>> + uint8_t hit_callout_thresh; > >>>> /* There is extra space in here */ > >>>> /* Cache line 0x100 */ > >>>> struct callout co __aligned(CACHE_LINE_SIZE); > >>>> @@ -269,6 +270,11 @@ static struct hpts_domain_info { > >>>> int cpu[MAXCPU]; > >>>> } hpts_domains[MAXMEMDOM]; > >>>> > >>>> +counter_u64_t hpts_that_need_softclock; > >>>> +SYSCTL_COUNTER_U64(_net_inet_tcp_hpts_stats, OID_AUTO, > needsoftclock, CTLFLAG_RD, > >>>> + &hpts_that_need_softclock, > >>>> + "Number of hpts threads that need softclock"); > >>>> + > >>>> counter_u64_t hpts_hopelessly_behind; > >>>> > >>>> SYSCTL_COUNTER_U64(_net_inet_tcp_hpts_stats, OID_AUTO, hopeless, > CTLFLAG_RD, > >>>> @@ -334,7 +340,7 @@ SYSCTL_INT(_net_inet_tcp_hpts, OID_AUTO, > precision, CTLFLAG_RW, > >>>> &tcp_hpts_precision, 120, > >>>> "Value for PRE() precision of callout"); > >>>> 
SYSCTL_INT(_net_inet_tcp_hpts, OID_AUTO, cnt_thresh, CTLFLAG_RW, > >>>> - &conn_cnt_thresh, 0, > >>>> + &conn_cnt_thresh, DEFAULT_CONNECTION_THESHOLD, > >>>> "How many connections (below) make us use the callout based > mechanism"); > >>>> SYSCTL_INT(_net_inet_tcp_hpts, OID_AUTO, logging, CTLFLAG_RW, > >>>> &hpts_does_tp_logging, 0, > >>>> @@ -1548,6 +1554,9 @@ __tcp_run_hpts(void) > >>>> struct tcp_hpts_entry *hpts; > >>>> int ticks_ran; > >>>> > >>>> + if (counter_u64_fetch(hpts_that_need_softclock) =3D=3D 0) > >>>> + return; > >>>> + > >>>> hpts =3D tcp_choose_hpts_to_run(); > >>>> > >>>> if (hpts->p_hpts_active) { > >>>> @@ -1683,6 +1692,13 @@ tcp_hpts_thread(void *ctx) > >>>> ticks_ran =3D tcp_hptsi(hpts, 1); > >>>> tv.tv_sec =3D 0; > >>>> tv.tv_usec =3D hpts->p_hpts_sleep_time * HPTS_TICKS_PER_SLOT; > >>>> + if ((hpts->p_on_queue_cnt > conn_cnt_thresh) && > (hpts->hit_callout_thresh =3D=3D 0)) { > >>>> + hpts->hit_callout_thresh =3D 1; > >>>> + counter_u64_add(hpts_that_need_softclock, 1); > >>>> + } else if ((hpts->p_on_queue_cnt <=3D conn_cnt_thresh) && > (hpts->hit_callout_thresh =3D=3D 1)) { > >>>> + hpts->hit_callout_thresh =3D 0; > >>>> + counter_u64_add(hpts_that_need_softclock, -1); > >>>> + } > >>>> if (hpts->p_on_queue_cnt >=3D conn_cnt_thresh) { > >>>> if(hpts->p_direct_wake =3D=3D 0) { > >>>> /* > >>>> @@ -1818,6 +1834,7 @@ tcp_hpts_mod_load(void) > >>>> cpu_top =3D NULL; > >>>> #endif > >>>> tcp_pace.rp_num_hptss =3D ncpus; > >>>> + hpts_that_need_softclock =3D counter_u64_alloc(M_WAITOK); > >>>> hpts_hopelessly_behind =3D counter_u64_alloc(M_WAITOK); > >>>> hpts_loops =3D counter_u64_alloc(M_WAITOK); > >>>> back_tosleep =3D counter_u64_alloc(M_WAITOK); > >>>> @@ -2042,6 +2059,7 @@ tcp_hpts_mod_unload(void) > >>>> free(tcp_pace.grps, M_TCPHPTS); > >>>> #endif > >>>> > >>>> + counter_u64_free(hpts_that_need_softclock); > >>>> counter_u64_free(hpts_hopelessly_behind); > >>>> counter_u64_free(hpts_loops); > >>>> counter_u64_free(back_tosleep); > >>> 
> >>> > >> > >> > >> > >> -- > >> Nuno Teixeira > >> FreeBSD Committer (ports) > > > > > > > > -- > > Nuno Teixeira > > FreeBSD Committer (ports) > > --=20 Nuno Teixeira FreeBSD Committer (ports) --0000000000006723920615bd54db Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
With base stack I can complete restic check successfully downloading/reading/checking all files from a "big" remote compressed backup.
Changing it to RACK stack, it fails.

I run this command often because in the past, compression corruption occurred and this is the equivalent of restoring backup to check its integrity.

Maybe someone could do a restic test to check if this is reproducible.

Thanks,



<tuexen@freebsd.org> wrote (Wednesday, 10/04/2024 at 13:12):


> On 10. Apr 2024, at 13:40, Nuno Teixeira <eduardo@freebsd.org> wrote:
>
> Hello all,
>
> @ current 1500018 and fetching torrents with net-p2p/qbittorrent finished ~2GB download and connection UP until the end:
>
> ---
> Apr 10 11:26:46 leg kernel: re0: watchdog timeout
> Apr 10 11:26:46 leg kernel: re0: link state changed to DOWN
> Apr 10 11:26:49 leg dhclient[58810]: New IP Address (re0): 192.168.1.67
> Apr 10 11:26:49 leg dhclient[58814]: New Subnet Mask (re0): 255.255.255.0
> Apr 10 11:26:49 leg dhclient[58818]: New Broadcast Address (re0): 192.168.1.255
> Apr 10 11:26:49 leg kernel: re0: link state changed to UP
> Apr 10 11:26:49 leg dhclient[58822]: New Routers (re0): 192.168.1.1
> ---
>
> In the past tests, I've got more watchdog timeouts, connection goes down and a reboot needed to put it back (`service netif restart` didn't work).
>
> Other way to reproduce this is using sysutils/restic (backup program) to read/check all files from a remote server via sftp:
>
> `restic -r sftp:user@remote:restic-repo check --read-data` from a 60GB compressed backup.
>
> ---
> watchdog timeout x3 as above
> ---
>
> restic check fail log @ 15% progress:
> ---
> <snip>
> Load(<data/52e2923dd6>, 17310001, 0) returned error, retrying after 1.7670599s: connection lost
> Load(<data/d27a0abe0f>, 17456892, 0) returned error, retrying after 4.619104908s: connection lost
> Load(<data/52e2923dd6>, 17310001, 0) returned error, retrying after 5.477648517s: connection lost
> List(lock) returned error, retrying after 293.057766ms: connection lost
> List(lock) returned error, retrying after 385.206693ms: connection lost
> List(lock) returned error, retrying after 1.577594281s: connection lost
> <snip>
>
> Connection continues UP.
Hi,

I'm not sure what issue you are reporting. Could you state
what behavior you are experiencing with the base stack and with
the RACK stack? In particular, what is the difference?

Best regards
Michael
>
> Cheers,
>
> <tuexen@freebsd.org> wrote (Thursday, 28/03/2024 at 15:53):
>> On 28. Mar 2024, at 15:00, Nuno Teixeira <eduardo@freebsd.org> wrote:
>>
>> Hello all!
>>
>> Running rack @b7b78c1c169 "Optimize HPTS..." very happy on my laptop (amd64)!
>>
>> Thanks all!
> Thanks for the feedback!
>
> Best regards
> Michael
>>
>> Drew Gallatin <gallatin@freebsd.org> wrote (Thursday, 21/03/2024 at 12:58):
>> The entire point is to *NOT* go through the overhead of scheduling something asynchronously, but to take advantage of the fact that a user/kernel transition is going to trash the cache anyway.
>>
>> In the common case of a system which has fewer than the threshold number of connections, we access the tcp_hpts_softclock function pointer, make one function call, and access hpts_that_need_softclock, and then return.  So that's 2 variables and a function call.
>>
>> I think it would be preferable to avoid that call, and to move the declaration of tcp_hpts_softclock and hpts_that_need_softclock so that they are in the same cacheline.  Then we'd be hitting just a single line in the common case.  (I've made comments on the review to that effect).
>>
>> Also, I wonder if the threshold could get higher by default, so that hpts is never called in this context unless we're to the point where we're scheduling thousands of runs of the hpts thread (and taking all those clock interrupts).
>>
>> Drew
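
To illustrate the cache-line point in the message above, here is a minimal user-space sketch in C. It assumes a 64-byte cache line; the struct name `hpts_userret_hot`, its fields, `fake_softclock()` and `userret_hook()` are hypothetical names chosen for this example and are not taken from the FreeBSD tree or from D44420. It only shows the layout idea: the hook pointer and the counter share one cache line, and the counter is tested before any indirect call is made.

```c
#include <assert.h>   /* static_assert */
#include <stdalign.h> /* alignas */
#include <stddef.h>   /* offsetof, NULL */
#include <stdint.h>

#define CACHE_LINE_SIZE 64 /* assumption: common line size on amd64 */

/*
 * Hypothetical grouping: the hook pointer and the "needs softclock"
 * count sit next to each other at the start of a cache-line-aligned
 * struct, so the common-case check touches a single line.
 */
struct hpts_userret_hot {
	alignas(CACHE_LINE_SIZE) void (*softclock)(void);
	uint64_t need_softclock;
};

/* Compile-time check that both fields land in the same cache line. */
static_assert(offsetof(struct hpts_userret_hot, softclock) / CACHE_LINE_SIZE ==
    offsetof(struct hpts_userret_hot, need_softclock) / CACHE_LINE_SIZE,
    "hook pointer and counter must share a cache line");

struct hpts_userret_hot hpts_hot;
int softclock_calls;

/* Stand-in for the real softclock routine; it only counts invocations. */
void fake_softclock(void)
{
	softclock_calls++;
}

/*
 * Stand-in for the return-to-userspace path: test the counter first and
 * make the indirect call only when some hpts actually needs it.
 */
void userret_hook(void)
{
	if (hpts_hot.need_softclock != 0 && hpts_hot.softclock != NULL)
		hpts_hot.softclock();
}
```

With `need_softclock` at zero, `userret_hook()` returns after reading one cache line and makes no call, which is the common case described above.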
>>
>> On Wed, Mar 20, 2024, at 8:17 PM, Konstantin Belousov wrote:
>>> On Tue, Mar 19, 2024 at 06:19:52AM -0400, rrs wrote:
>>>> Ok I have created
>>>>
>>>> https://reviews.freebsd.org/D44420
>>>>
>>>> to address the issue. I also attach a short version of the patch that Nuno
>>>> can try, to validate that it works. Drew, you may want to try this and
>>>> validate that the optimization does kick in, since I can now only test
>>>> that it does not on my local box :)
>>> The patch still causes access to all cpu's cachelines on each userret.
>>> It would be much better to inc/check the threshold and only schedule the
>>> call when exceeded.  Then the call can occur in some dedicated context,
>>> like per-CPU thread, instead of userret.
>>>
>>>>
>>>>
>>>> R
>>>>
>>>>
>>>>
>>>> On 3/18/24 3:42 PM, Drew Gallatin wrote:
>>>>> No.  The goal is to run on every return to userspace for every thread.
>>>>>
>>>>> Drew
>>>>>
>>>>> On Mon, Mar 18, 2024, at 3:41 PM, Konstantin Belousov wrote:
>>>>>> On Mon, Mar 18, 2024 at 03:13:11PM -0400, Drew Gallatin wrote:
>>>>>>> I got the idea from
>>>>>>> https://people.mpi-sws.org/~druschel/publications/soft-timers-tocs.pdf
>>>>>>> The gist is that the TCP pacing stuff needs to run frequently, and
>>>>>>> rather than run it out of a clock interrupt, it's more efficient to run
>>>>>>> it out of a system call context at just the point where we return to
>>>>>>> userspace and the cache is trashed anyway. The current implementation
>>>>>>> is fine for our workload, but probably not ideal for a generic system.
>>>>>>> Especially one where something is banging on system calls.
>>>>>>>
>>>>>>> ASTs could be the right tool for this, but I'm super unfamiliar with
>>>>>>> them, and I can't find any docs on them.
>>>>>>>
>>>>>>> Would ast_register(0, ASTR_UNCOND, 0, func) be roughly equivalent to
>>>>>>> what's happening here?
>>>>>> This call would need some AST number added, and then it registers the
>>>>>> ast to run on next return to userspace, for the current thread.
>>>>>>
>>>>>> Is it enough?
>>>>>>>
>>>>>>> Drew
>>>>>>
>>>>>>>
>>>>>>> On Mon, Mar 18, 2024, at 2:33 PM, Konstantin Belousov wrote:
>>>>>>>> On Mon, Mar 18, 2024 at 07:26:10AM -0500, Mike Karels wrote:
>>>>>>>>> On 18 Mar 2024, at 7:04, tuexen@freebsd.org wrote:
>>>>>>>>>
>>>>>>>>>>> On 18. Mar 2024, at 12:42, Nuno Teixeira <eduardo@freebsd.org> wrote:
>>>>>>>>>>>
>>>>>>>>>>> Hello all!
>>>>>>>>>>>
>>>>>>>>>>> It works just fine!
>>>>>>>>>>> System performance is OK.
>>>>>>>>>>> Using patch on main-n268841-b0aaf8beb126(-dirty).
>>>>>>>>>>>
>>>>>>>>>>> ---
>>>>>>>>>>> net.inet.tcp.functions_available:
>>>>>>>>>>> Stack                           D Alias                            PCB count
>>>>>>>>>>> freebsd                           freebsd                          0
>>>>>>>>>>> rack                            * rack                             38
>>>>>>>>>>> ---
>>>>>>>>>>>
>>>>>>>>>>> It would be so nice if we could have a sysctl tunable for this patch
>>>>>>>>>>> so we could do more tests without recompiling the kernel.
>>>>>>>>>> Thanks for testing!
>>>>>>>>>>
>>>>>>>>>> @gallatin: can you come up with a patch that is acceptable for Netflix
>>>>>>>>>> and allows us to mitigate the performance regression?
>>>>>>>>>
>>>>>>>>> Ideally, tcphpts could enable this automatically when it starts to be
>>>>>>>>> used (enough?), but a sysctl could select auto/on/off.
>>>>>>>> There is already a well-known mechanism to request execution of the
>>>>>>>> specific function on return to userspace, namely AST.  The difference
>>>>>>>> with the current hack is that the execution is requested for one callback
>>>>>>>> in the context of the specific thread.
>>>>>>>>
>>>>>>>> Still, it might be worth a try to use it; what is the reason to hit a thread
>>>>>>>> that does not do networking, with TCP processing?
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Mike
>>>>>>>>>
>>>>>>>>>> Best regards
>>>>>>>>>> Michael
>>>>>>>>>>>
>>>>>>>>>>> Thanks all!
>>>>>>>>>>> Really happy here :)
>>>>>>>>>>>
>>>>>>>>>>> Cheers,
>>>>>>>>>>>
>>>>>>>>>>> Nuno Teixeira <eduardo@freebsd.org> wrote (Sunday, 17/03/2024 at 20:26):
>>>>>>>>>>>>
>>>>>>>>>>>> Hello,
>>>>>>>>>>>>
>>>>>>>>>>>>> I don't have the full context, but it seems like the complaint is a performance regression in bonnie++ and perhaps other things when tcp_hpts is loaded, even when it is not used.  Is that correct?
>>>>>>>>>>>>>
>>>>>>>>>>>>> If so, I suspect it's because we drive the tcp_hpts_softclock() routine from userret(), in order to avoid tons of timer interrupts and context switches.  To test this theory, you could apply a patch like:
>>>>>>>>>>>>
>>>>>>>>>>>> It's affecting overall system performance, bonnie was just a way to
>>>>>>>>>>>> get some numbers to compare.
>>>>>>>>>>>>
>>>>>>>>>>>> Tomorrow I will test the patch.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks!
>>>>>>>>>>>>
>>>>>>>>>>>> --
>>>>>>>>>>>> Nuno Teixeira
>>>>>>>>>>>> FreeBSD Committer (ports)
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> Nuno Teixeira
>>>>>>>>>>> FreeBSD Committer (ports)
>>>>>>>>>
>>>>>>>>
>>>>>>
>>>>>
>>>
>>>> diff --git a/sys/netinet/tcp_hpts.c b/sys/netinet/tcp_hpts.c
>>>> index 8c4d2d41a3eb..eadbee19f69c 100644
>>>> --- a/sys/netinet/tcp_hpts.c
>>>> +++ b/sys/netinet/tcp_hpts.c
>>>> @@ -216,6 +216,7 @@ struct tcp_hpts_entry {
>>>> void *ie_cookie;
>>>> uint16_t p_num; /* The hpts number one per cpu */
>>>> uint16_t p_cpu; /* The hpts CPU */
>>>> + uint8_t hit_callout_thresh;
>>>> /* There is extra space in here */
>>>> /* Cache line 0x100 */
>>>> struct callout co __aligned(CACHE_LINE_SIZE);
>>>> @@ -269,6 +270,11 @@ static struct hpts_domain_info {
>>>> int cpu[MAXCPU];
>>>> } hpts_domains[MAXMEMDOM];
>>>>
>>>> +counter_u64_t hpts_that_need_softclock;
>>>> +SYSCTL_COUNTER_U64(_net_inet_tcp_hpts_stats, OID_AUTO, needsoftclock, CTLFLAG_RD,
>>>> +    &hpts_that_need_softclock,
>>>> +    "Number of hpts threads that need softclock");
>>>> +
>>>> counter_u64_t hpts_hopelessly_behind;
>>>>
>>>> SYSCTL_COUNTER_U64(_net_inet_tcp_hpts_stats, OID_AUTO, hopeless, CTLFLAG_RD,
>>>> @@ -334,7 +340,7 @@ SYSCTL_INT(_net_inet_tcp_hpts, OID_AUTO, precision, CTLFLAG_RW,
>>>>     &tcp_hpts_precision, 120,
>>>>     "Value for PRE() precision of callout");
>>>> SYSCTL_INT(_net_inet_tcp_hpts, OID_AUTO, cnt_thresh, CTLFLAG_RW,
>>>> -    &conn_cnt_thresh, 0,
>>>> +    &conn_cnt_thresh, DEFAULT_CONNECTION_THESHOLD,
>>>>     "How many connections (below) make us use the callout based mechanism");
>>>> SYSCTL_INT(_net_inet_tcp_hpts, OID_AUTO, logging, CTLFLAG_RW,
>>>>     &hpts_does_tp_logging, 0,
>>>> @@ -1548,6 +1554,9 @@ __tcp_run_hpts(void)
>>>> struct tcp_hpts_entry *hpts;
>>>> int ticks_ran;
>>>>
>>>> + if (counter_u64_fetch(hpts_that_need_softclock) == 0)
>>>> + return;
>>>> +
>>>> hpts = tcp_choose_hpts_to_run();
>>>>
>>>> if (hpts->p_hpts_active) {
>>>> @@ -1683,6 +1692,13 @@ tcp_hpts_thread(void *ctx)
>>>> ticks_ran = tcp_hptsi(hpts, 1);
>>>> tv.tv_sec = 0;
>>>> tv.tv_usec = hpts->p_hpts_sleep_time * HPTS_TICKS_PER_SLOT;
>>>> + if ((hpts->p_on_queue_cnt > conn_cnt_thresh) && (hpts->hit_callout_thresh == 0)) {
>>>> + hpts->hit_callout_thresh = 1;
>>>> + counter_u64_add(hpts_that_need_softclock, 1);
>>>> + } else if ((hpts->p_on_queue_cnt <= conn_cnt_thresh) && (hpts->hit_callout_thresh == 1)) {
>>>> + hpts->hit_callout_thresh = 0;
>>>> + counter_u64_add(hpts_that_need_softclock, -1);
>>>> + }
>>>> if (hpts->p_on_queue_cnt >= conn_cnt_thresh) {
>>>> if(hpts->p_direct_wake == 0) {
>>>> /*
>>>> @@ -1818,6 +1834,7 @@ tcp_hpts_mod_load(void)
>>>> cpu_top = NULL;
>>>> #endif
>>>> tcp_pace.rp_num_hptss = ncpus;
>>>> + hpts_that_need_softclock = counter_u64_alloc(M_WAITOK);
>>>> hpts_hopelessly_behind = counter_u64_alloc(M_WAITOK);
>>>> hpts_loops = counter_u64_alloc(M_WAITOK);
>>>> back_tosleep = counter_u64_alloc(M_WAITOK);
>>>> @@ -2042,6 +2059,7 @@ tcp_hpts_mod_unload(void)
>>>> free(tcp_pace.grps, M_TCPHPTS);
>>>> #endif
>>>>
>>>> + counter_u64_free(hpts_that_need_softclock);
>>>> counter_u64_free(hpts_hopelessly_behind);
>>>> counter_u64_free(hpts_loops);
>>>> counter_u64_free(back_tosleep);
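
The threshold bookkeeping in the quoted patch can be sketched as a small user-space simulation in C. This is only an illustration under stated assumptions: a plain integer stands in for the kernel's per-CPU counter, `struct hpts_sim`, `hpts_update_thresh()` and `run_hpts_has_work()` are hypothetical names, and the threshold value of 100 is assumed for the example. The logic mirrors the hunks above: each hpts raises the shared count once when its queue grows past the threshold, lowers it once when the queue drops back, and the userret-side path bails out early while the count is zero.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the two fields of struct tcp_hpts_entry used here. */
struct hpts_sim {
	int on_queue_cnt;           /* plays the role of p_on_queue_cnt */
	uint8_t hit_callout_thresh; /* 1 once this hpts has crossed the threshold */
};

int conn_cnt_thresh = 100;        /* assumed value, for illustration only */
int64_t hpts_that_need_softclock; /* plain integer instead of a kernel counter */

/* Mirrors the if/else-if added to tcp_hpts_thread() in the patch. */
void hpts_update_thresh(struct hpts_sim *hpts)
{
	if (hpts->on_queue_cnt > conn_cnt_thresh &&
	    hpts->hit_callout_thresh == 0) {
		hpts->hit_callout_thresh = 1;
		hpts_that_need_softclock++;
	} else if (hpts->on_queue_cnt <= conn_cnt_thresh &&
	    hpts->hit_callout_thresh == 1) {
		hpts->hit_callout_thresh = 0;
		hpts_that_need_softclock--;
	}
	/* The per-hpts flag ensures each hpts is counted at most once. */
	assert(hpts_that_need_softclock >= 0);
}

/* Mirrors the early return added to __tcp_run_hpts(): nonzero means "do work". */
int run_hpts_has_work(void)
{
	return (hpts_that_need_softclock != 0);
}
```

On a kernel carrying the patch, the real count should be readable as `net.inet.tcp.hpts.stats.needsoftclock`, judging by the `SYSCTL_COUNTER_U64` declaration in the diff.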
>>>
>>>
>>
>>
>>
>> --
>> Nuno Teixeira
>> FreeBSD Committer (ports)
>
>
>
> --
> Nuno Teixeira
> FreeBSD Committer (ports)



--
Nuno Teixeira
FreeBSD Committer (ports)
--0000000000006723920615bd54db-- From nobody Wed Apr 10 12:44:59 2024 X-Original-To: current@mlmmj.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mlmmj.nyi.freebsd.org (Postfix) with ESMTP id 4VF2br1PBdz5H3fW; Wed, 10 Apr 2024 12:45:12 +0000 (UTC) (envelope-from eduardo@freebsd.org) Received: from smtp.freebsd.org (smtp.freebsd.org [IPv6:2610:1c1:1:606c::24b:4]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256 client-signature RSA-PSS (4096 bits) client-digest SHA256) (Client CN "smtp.freebsd.org", Issuer "R3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 4VF2br16Sjz4THK; Wed, 10 Apr 2024 12:45:12 +0000 (UTC) (envelope-from eduardo@freebsd.org) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=freebsd.org; s=dkim; t=1712753112; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=doNA/KZxMAy9H2qRyYQFZaDJlTs3p0PtQwPZnbswvE0=; b=JphAbMSEVC8ZZZFhwP6xtwOfEOROF/BHSi/D+Om7Mux0qwl/BtHW5ywz9tBvvPOWp3rtlV JI1sLkL1KgaGsU33X2t1PRl7Is3s7KlJXkxlrbgd1M5cr+iRtkAFPTjWGnNvtm2g1RxYvU mpv3pkxLNXvKZRCpt9uhue5GVEkbeNn15oCFp4BCMai6GvTokhQyd5KfXMbV1FUFtz3tQn 4JMqNrC71okwFHyTXFbmkyVnkVz4j/eUzDPqL+bFVjzyF0HbrKW96MxhrK2lEthya65lnk vOlrntCHD/6WFuIAH6yan7crn38WHFb8JUUiH8ER0eNHjIV08kAXfk8N549k6g== ARC-Seal: i=1; s=dkim; d=freebsd.org; t=1712753112; a=rsa-sha256; cv=none; b=E8w3Mtm8iYk8UsPO3ue9/8qfISFftmmrL2mJOrDn9vcYUobCxa2k9E2sBIpXnNVg+H6Fju u+uDE/SXIJRegF/R5eOl3aw4paJGcZEx02Ffu5rNZ0cpkOOag2jPpb5F42WTcGoa3teeo9 vbgwv9YSkCGOZ2NcTZrDnb7SIgkkruJB1pK1touIWWi6CiE5jY3ppHuvTCLHu9hV9UeeVO nQIgNugZUsKD2fdFD4ZpJ+OmOH7JgEVOsuY2p4Hb1VF/HtlKOOMX7NAdY7PMo+QyIZQPS6 RIVkTQL+/bdEstA+kGLxqmaaIhp+fAbxbFD/kxlRDrPZsEgNt6YUKVK9ecK1HQ== ARC-Authentication-Results: i=1; mx1.freebsd.org; none ARC-Message-Signature: i=1; 
a=rsa-sha256; c=relaxed/relaxed; d=freebsd.org; s=dkim; t=1712753112; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=doNA/KZxMAy9H2qRyYQFZaDJlTs3p0PtQwPZnbswvE0=; b=ZZXIQOIVdwGa83vXPnjexa5JqOD9ckzyoSCXE0NZRAq4gOb7Hy97RIA4wbjLi+DyUcgQam MoiCcjQbOvHHWDpDOqY5xWLHCBMl8geC8kLs8d742RV3K5P0y6oBBABq6D+f2/iZWjvCHa 7XDt69SjXbzHILEr8VBlbDe1TrCAjGjYXapcN+mk9kQldEuQa18u0nPDzShXnS0WI31ZJb jNYN9OifIJaaqmZ/3vYhVtNq6rCfxPWbUbw2zZK5Mn+jcGwelO8VZk7YUZlKJx2NY+Lr6R ocjmOueFquNoZjwEwhoQXJZ/io2YC8J77yYerCoLTLlpWy0gGWAJt/ytm/DnSg== Received: from mail-qt1-f176.google.com (mail-qt1-f176.google.com [209.85.160.176]) (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256 client-signature RSA-PSS (2048 bits) client-digest SHA256) (Client CN "smtp.gmail.com", Issuer "GTS CA 1D4" (verified OK)) (Authenticated sender: eduardo) by smtp.freebsd.org (Postfix) with ESMTPSA id 4VF2br0bCqzM4T; Wed, 10 Apr 2024 12:45:12 +0000 (UTC) (envelope-from eduardo@freebsd.org) Received: by mail-qt1-f176.google.com with SMTP id d75a77b69052e-434a5c9b998so11729521cf.0; Wed, 10 Apr 2024 05:45:12 -0700 (PDT) X-Forwarded-Encrypted: i=1; AJvYcCWMb+n1GCBadWwfygFWv59jGFdvPUf4JARMy9xJoF3sVk8UMMzoFie93KCO9xbmeIl/oClhQldrxllYrp+2uMXfAfkz0zfl1oLkbAO6zjAwW+aR28xXaet9E8kbhh67iqp2wTPcSFjpNE/QxPHXLmvqic3Dikmo X-Gm-Message-State: AOJu0YyvBSdAVP7G7MDgkYda5WTrCpjrnKM7LsUSnRA1+9dSFhEb5Ilh rQy8JjWMGO1pk/jyhxV+USmxL97KZMx7Gr4v4GFvf7f5ff0ByQUD8nyQhyDlT99k/pip05nAYIt Ky2HJ/nziQIkuaN5ga3Gp4hfGyfU= X-Google-Smtp-Source: AGHT+IG28PXFtdY1WZ5r6onFzj1ABQIX6rYXb2ZGBE10CiwM7XZOsZlIo+IKvqixNfAvKXVIMzj+/YQu6J19LaTKwJU= X-Received: by 2002:ac8:7fca:0:b0:434:cebd:9551 with SMTP id b10-20020ac87fca000000b00434cebd9551mr2490104qtk.27.1712753111088; Wed, 10 Apr 2024 05:45:11 -0700 (PDT) List-Id: Discussions about the use of 
FreeBSD-current List-Archive: https://lists.freebsd.org/archives/freebsd-current List-Help: List-Post: List-Subscribe: List-Unsubscribe: Sender: owner-freebsd-current@freebsd.org MIME-Version: 1.0 References: <6e795e9c-8de4-4e02-9a96-8fabfaa4e66f@app.fastmail.com> <6047C8EF-B1B0-4286-93FA-AA38F8A18656@karels.net> <8031cd99-ded8-4b06-93b3-11cc729a8b2c@app.fastmail.com> <38c54399-6c96-44d8-a3a2-3cc1bfbe50c2@app.fastmail.com> <27d8144f-0658-46f6-b8f3-35eb60061644@lakerest.net> <5C9863F7-0F1C-4D02-9F6D-9DDC5FBEB368@freebsd.org> <52479AA6-04F6-4D4A-ABE0-7142B47E28DF@freebsd.org> In-Reply-To: From: Nuno Teixeira Date: Wed, 10 Apr 2024 13:44:59 +0100 X-Gmail-Original-Message-ID: Message-ID: Subject: Re: Request for Testing: TCP RACK To: tuexen@freebsd.org Cc: Drew Gallatin , Konstantin Belousov , rrs , Mike Karels , garyj@gmx.de, current@freebsd.org, net@freebsd.org, Randall Stewart Content-Type: multipart/alternative; boundary="000000000000c64a570615bd6820" --000000000000c64a570615bd6820 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable (...) Backup server is https://www.rsync.net/ (free 500GB for FreeBSD developers). Nuno Teixeira escreveu (quarta, 10/04/2024 =C3=A0(s) 13:39): > With base stack I can complete restic check successfully > downloading/reading/checking all files from a "big" remote compressed > backup. > Changing it to RACK stack, it fails. > > I run this command often because in the past, compression corruption > occured and this is the equivalent of restoring backup to check its > integrity. > > Maybe someone could do a restic test to check if this is reproducible. > > Thanks, > > > > escreveu (quarta, 10/04/2024 =C3=A0(s) 13:12): > >> >> >> > On 10. 
Apr 2024, at 13:40, Nuno Teixeira wrote: >> > >> > Hello all, >> > >> > @ current 1500018 and fetching torrents with net-p2p/qbittorrent >> finished ~2GB download and connection UP until the end: >> > >> > --- >> > Apr 10 11:26:46 leg kernel: re0: watchdog timeout >> > Apr 10 11:26:46 leg kernel: re0: link state changed to DOWN >> > Apr 10 11:26:49 leg dhclient[58810]: New IP Address (re0): 192.168.1.6= 7 >> > Apr 10 11:26:49 leg dhclient[58814]: New Subnet Mask (re0): >> 255.255.255.0 >> > Apr 10 11:26:49 leg dhclient[58818]: New Broadcast Address (re0): >> 192.168.1.255 >> > Apr 10 11:26:49 leg kernel: re0: link state changed to UP >> > Apr 10 11:26:49 leg dhclient[58822]: New Routers (re0): 192.168.1.1 >> > --- >> > >> > In the past tests, I've got more watchdog timeouts, connection goes >> down and a reboot needed to put it back (`service netif restart` didn't >> work). >> > >> > Other way to reproduce this is using sysutils/restic (backup program) >> to read/check all files from a remote server via sftp: >> > >> > `restic -r sftp:user@remote:restic-repo check --read-data` from a 60GB >> compressed backup. >> > >> > --- >> > watchdog timeout x3 as above >> > --- >> > >> > restic check fail log @ 15% progress: >> > --- >> > >> > Load(, 17310001, 0) returned error, retrying after >> 1.7670599s: connection lost >> > Load(, 17456892, 0) returned error, retrying after >> 4.619104908s: connection lost >> > Load(, 17310001, 0) returned error, retrying after >> 5.477648517s: connection lost >> > List(lock) returned error, retrying after 293.057766ms: connection los= t >> > List(lock) returned error, retrying after 385.206693ms: connection los= t >> > List(lock) returned error, retrying after 1.577594281s: connection los= t >> > >> > >> > Connection continues UP. >> Hi, >> >> I'm not sure what the issue is you are reporting. Could you state >> what behavior you are experiencing with the base stack and with >> the RACK stack. In particular, what the difference is? 
>> >> Best regards >> Michael >> > >> > Cheers, >> > >> > escreveu (quinta, 28/03/2024 =C3=A0(s) 15:53): >> >> On 28. Mar 2024, at 15:00, Nuno Teixeira wrote: >> >> >> >> Hello all! >> >> >> >> Running rack @b7b78c1c169 "Optimize HPTS..." very happy on my laptop >> (amd64)! >> >> >> >> Thanks all! >> > Thanks for the feedback! >> > >> > Best regards >> > Michael >> >> >> >> Drew Gallatin escreveu (quinta, 21/03/2024 >> =C3=A0(s) 12:58): >> >> The entire point is to *NOT* go through the overhead of scheduling >> something asynchronously, but to take advantage of the fact that a >> user/kernel transition is going to trash the cache anyway. >> >> >> >> In the common case of a system which has less than the threshold >> number of connections , we access the tcp_hpts_softclock function pointe= r, >> make one function call, and access hpts_that_need_softclock, and then >> return. So that's 2 variables and a function call. >> >> >> >> I think it would be preferable to avoid that call, and to move the >> declaration of tcp_hpts_softclock and hpts_that_need_softclock so that t= hey >> are in the same cacheline. Then we'd be hitting just a single line in t= he >> common case. (I've made comments on the review to that effect). >> >> >> >> Also, I wonder if the threshold could get higher by default, so that >> hpts is never called in this context unless we're to the point where we'= re >> scheduling thousands of runs of the hpts thread (and taking all those cl= ock >> interrupts). >> >> >> >> Drew >> >> >> >> On Wed, Mar 20, 2024, at 8:17 PM, Konstantin Belousov wrote: >> >>> On Tue, Mar 19, 2024 at 06:19:52AM -0400, rrs wrote: >> >>>> Ok I have created >> >>>> >> >>>> https://reviews.freebsd.org/D44420 >> >>>> >> >>>> >> >>>> To address the issue. I also attach a short version of the patch >> that Nuno >> >>>> can try and validate >> >>>> >> >>>> it works. 
Drew you may want to try this and validate the >> optimization does >> >>>> kick in since I can >> >>>> >> >>>> only now test that it does not on my local box :) >> >>> The patch still causes access to all cpu's cachelines on each userre= t. >> >>> It would be much better to inc/check the threshold and only schedule >> the >> >>> call when exceeded. Then the call can occur in some dedicated >> context, >> >>> like per-CPU thread, instead of userret. >> >>> >> >>>> >> >>>> >> >>>> R >> >>>> >> >>>> >> >>>> >> >>>> On 3/18/24 3:42 PM, Drew Gallatin wrote: >> >>>>> No. The goal is to run on every return to userspace for every >> thread. >> >>>>> >> >>>>> Drew >> >>>>> >> >>>>> On Mon, Mar 18, 2024, at 3:41 PM, Konstantin Belousov wrote: >> >>>>>> On Mon, Mar 18, 2024 at 03:13:11PM -0400, Drew Gallatin wrote: >> >>>>>>> I got the idea from >> >>>>>>> >> https://people.mpi-sws.org/~druschel/publications/soft-timers-tocs.pdf >> >>>>>>> The gist is that the TCP pacing stuff needs to run frequently, a= nd >> >>>>>>> rather than run it out of a clock interrupt, its more efficient >> to run >> >>>>>>> it out of a system call context at just the point where we retur= n >> to >> >>>>>>> userspace and the cache is trashed anyway. The current >> implementation >> >>>>>>> is fine for our workload, but probably not idea for a generic >> system. >> >>>>>>> Especially one where something is banging on system calls. >> >>>>>>> >> >>>>>>> Ast's could be the right tool for this, but I'm super unfamiliar >> with >> >>>>>>> them, and I can't find any docs on them. >> >>>>>>> >> >>>>>>> Would ast_register(0, ASTR_UNCOND, 0, func) be roughly equivalen= t >> to >> >>>>>>> what's happening here? >> >>>>>> This call would need some AST number added, and then it registers >> the >> >>>>>> ast to run on next return to userspace, for the current thread. >> >>>>>> >> >>>>>> Is it enough? 
>> >>>>>>> >> >>>>>>> Drew >> >>>>>> >> >>>>>>> >> >>>>>>> On Mon, Mar 18, 2024, at 2:33 PM, Konstantin Belousov wrote: >> >>>>>>>> On Mon, Mar 18, 2024 at 07:26:10AM -0500, Mike Karels wrote: >> >>>>>>>>> On 18 Mar 2024, at 7:04, tuexen@freebsd.org wrote: >> >>>>>>>>> >> >>>>>>>>>>> On 18. Mar 2024, at 12:42, Nuno Teixeira >> >>>>>> wrote: >> >>>>>>>>>>> >> >>>>>>>>>>> Hello all! >> >>>>>>>>>>> >> >>>>>>>>>>> It works just fine! >> >>>>>>>>>>> System performance is OK. >> >>>>>>>>>>> Using patch on main-n268841-b0aaf8beb126(-dirty). >> >>>>>>>>>>> >> >>>>>>>>>>> --- >> >>>>>>>>>>> net.inet.tcp.functions_available: >> >>>>>>>>>>> Stack D >> >>>>>> Alias PCB count >> >>>>>>>>>>> freebsd freebsd 0 >> >>>>>>>>>>> rack * >> >>>>>> rack 38 >> >>>>>>>>>>> --- >> >>>>>>>>>>> >> >>>>>>>>>>> It would be so nice that we can have a sysctl tunnable for >> >>>>>> this patch >> >>>>>>>>>>> so we could do more tests without recompiling kernel. >> >>>>>>>>>> Thanks for testing! >> >>>>>>>>>> >> >>>>>>>>>> @gallatin: can you come up with a patch that is acceptable >> >>>>>> for Netflix >> >>>>>>>>>> and allows to mitigate the performance regression. >> >>>>>>>>> >> >>>>>>>>> Ideally, tcphpts could enable this automatically when it >> >>>>>> starts to be >> >>>>>>>>> used (enough?), but a sysctl could select auto/on/off. >> >>>>>>>> There is already a well-known mechanism to request execution of >> the >> >>>>>>>> specific function on return to userspace, namely AST. The >> difference >> >>>>>>>> with the current hack is that the execution is requested for on= e >> >>>>>> callback >> >>>>>>>> in the context of the specific thread. >> >>>>>>>> >> >>>>>>>> Still, it might be worth a try to use it; what is the reason to >> >>>>>> hit a thread >> >>>>>>>> that does not do networking, with TCP processing? >> >>>>>>>> >> >>>>>>>>> >> >>>>>>>>> Mike >> >>>>>>>>> >> >>>>>>>>>> Best regards >> >>>>>>>>>> Michael >> >>>>>>>>>>> >> >>>>>>>>>>> Thanks all! 
>> >>>>>>>>>>> Really happy here :) >> >>>>>>>>>>> >> >>>>>>>>>>> Cheers, >> >>>>>>>>>>> >> >>>>>>>>>>> Nuno Teixeira escreveu (domingo, >> >>>>>> 17/03/2024 =C3=A0(s) 20:26): >> >>>>>>>>>>>> >> >>>>>>>>>>>> Hello, >> >>>>>>>>>>>> >> >>>>>>>>>>>>> I don't have the full context, but it seems like the >> >>>>>> complaint is a performance regression in bonnie++ and perhaps oth= er >> >>>>>> things when tcp_hpts is loaded, even when it is not used. Is tha= t >> >>>>>> correct? >> >>>>>>>>>>>>> >> >>>>>>>>>>>>> If so, I suspect its because we drive the >> >>>>>> tcp_hpts_softclock() routine from userret(), in order to avoid to= ns >> >>>>>> of timer interrupts and context switches. To test this theory, >> you >> >>>>>> could apply a patch like: >> >>>>>>>>>>>> >> >>>>>>>>>>>> It's affecting overall system performance, bonnie was just >> >>>>>> a way to >> >>>>>>>>>>>> get some numbers to compare. >> >>>>>>>>>>>> >> >>>>>>>>>>>> Tomorrow I will test patch. >> >>>>>>>>>>>> >> >>>>>>>>>>>> Thanks! 
>> >>>>>>>>>>>> >> >>>>>>>>>>>> -- >> >>>>>>>>>>>> Nuno Teixeira >> >>>>>>>>>>>> FreeBSD Committer (ports) >> >>>>>>>>>>> >> >>>>>>>>>>> >> >>>>>>>>>>> >> >>>>>>>>>>> -- >> >>>>>>>>>>> Nuno Teixeira >> >>>>>>>>>>> FreeBSD Committer (ports) >> >>>>>>>>> >> >>>>>>>> >> >>>>>> >> >>>>> >> >>> >> >>>> diff --git a/sys/netinet/tcp_hpts.c b/sys/netinet/tcp_hpts.c >> >>>> index 8c4d2d41a3eb..eadbee19f69c 100644 >> >>>> --- a/sys/netinet/tcp_hpts.c >> >>>> +++ b/sys/netinet/tcp_hpts.c >> >>>> @@ -216,6 +216,7 @@ struct tcp_hpts_entry { >> >>>> void *ie_cookie; >> >>>> uint16_t p_num; /* The hpts number one per cpu */ >> >>>> uint16_t p_cpu; /* The hpts CPU */ >> >>>> + uint8_t hit_callout_thresh; >> >>>> /* There is extra space in here */ >> >>>> /* Cache line 0x100 */ >> >>>> struct callout co __aligned(CACHE_LINE_SIZE); >> >>>> @@ -269,6 +270,11 @@ static struct hpts_domain_info { >> >>>> int cpu[MAXCPU]; >> >>>> } hpts_domains[MAXMEMDOM]; >> >>>> >> >>>> +counter_u64_t hpts_that_need_softclock; >> >>>> +SYSCTL_COUNTER_U64(_net_inet_tcp_hpts_stats, OID_AUTO, >> needsoftclock, CTLFLAG_RD, >> >>>> + &hpts_that_need_softclock, >> >>>> + "Number of hpts threads that need softclock"); >> >>>> + >> >>>> counter_u64_t hpts_hopelessly_behind; >> >>>> >> >>>> SYSCTL_COUNTER_U64(_net_inet_tcp_hpts_stats, OID_AUTO, hopeless, >> CTLFLAG_RD, >> >>>> @@ -334,7 +340,7 @@ SYSCTL_INT(_net_inet_tcp_hpts, OID_AUTO, >> precision, CTLFLAG_RW, >> >>>> &tcp_hpts_precision, 120, >> >>>> "Value for PRE() precision of callout"); >> >>>> SYSCTL_INT(_net_inet_tcp_hpts, OID_AUTO, cnt_thresh, CTLFLAG_RW, >> >>>> - &conn_cnt_thresh, 0, >> >>>> + &conn_cnt_thresh, DEFAULT_CONNECTION_THESHOLD, >> >>>> "How many connections (below) make us use the callout based >> mechanism"); >> >>>> SYSCTL_INT(_net_inet_tcp_hpts, OID_AUTO, logging, CTLFLAG_RW, >> >>>> &hpts_does_tp_logging, 0, >> >>>> @@ -1548,6 +1554,9 @@ __tcp_run_hpts(void) >> >>>> struct tcp_hpts_entry *hpts; >> >>>> int ticks_ran; >> >>>> >> >>>> 
+ if (counter_u64_fetch(hpts_that_need_softclock) =3D=3D 0) >> >>>> + return; >> >>>> + >> >>>> hpts =3D tcp_choose_hpts_to_run(); >> >>>> >> >>>> if (hpts->p_hpts_active) { >> >>>> @@ -1683,6 +1692,13 @@ tcp_hpts_thread(void *ctx) >> >>>> ticks_ran =3D tcp_hptsi(hpts, 1); >> >>>> tv.tv_sec =3D 0; >> >>>> tv.tv_usec =3D hpts->p_hpts_sleep_time * HPTS_TICKS_PER_SLOT; >> >>>> + if ((hpts->p_on_queue_cnt > conn_cnt_thresh) && >> (hpts->hit_callout_thresh =3D=3D 0)) { >> >>>> + hpts->hit_callout_thresh =3D 1; >> >>>> + counter_u64_add(hpts_that_need_softclock, 1); >> >>>> + } else if ((hpts->p_on_queue_cnt <=3D conn_cnt_thresh) && >> (hpts->hit_callout_thresh =3D=3D 1)) { >> >>>> + hpts->hit_callout_thresh =3D 0; >> >>>> + counter_u64_add(hpts_that_need_softclock, -1); >> >>>> + } >> >>>> if (hpts->p_on_queue_cnt >=3D conn_cnt_thresh) { >> >>>> if(hpts->p_direct_wake =3D=3D 0) { >> >>>> /* >> >>>> @@ -1818,6 +1834,7 @@ tcp_hpts_mod_load(void) >> >>>> cpu_top =3D NULL; >> >>>> #endif >> >>>> tcp_pace.rp_num_hptss =3D ncpus; >> >>>> + hpts_that_need_softclock =3D counter_u64_alloc(M_WAITOK); >> >>>> hpts_hopelessly_behind =3D counter_u64_alloc(M_WAITOK); >> >>>> hpts_loops =3D counter_u64_alloc(M_WAITOK); >> >>>> back_tosleep =3D counter_u64_alloc(M_WAITOK); >> >>>> @@ -2042,6 +2059,7 @@ tcp_hpts_mod_unload(void) >> >>>> free(tcp_pace.grps, M_TCPHPTS); >> >>>> #endif >> >>>> >> >>>> + counter_u64_free(hpts_that_need_softclock); >> >>>> counter_u64_free(hpts_hopelessly_behind); >> >>>> counter_u64_free(hpts_loops); >> >>>> counter_u64_free(back_tosleep); >> >>> >> >>> >> >> >> >> >> >> >> >> -- >> >> Nuno Teixeira >> >> FreeBSD Committer (ports) >> > >> > >> > >> > -- >> > Nuno Teixeira >> > FreeBSD Committer (ports) >> >> > > -- > Nuno Teixeira > FreeBSD Committer (ports) > --=20 Nuno Teixeira FreeBSD Committer (ports) --000000000000c64a570615bd6820 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
(...)

Backup server is https://www.rsync.net/ (free 500GB for FreeBSD developers).

Nuno Teixeira <eduardo@freebsd.org> wrote (Wednesday, 10/04/2024 at 13:39):
With the base stack I can complete restic check successfully, downloading/reading/checking all files from a "big" remote compressed backup.
Changing to the RACK stack, it fails.

I run this command often because compression corruption occurred in the past, and this is the equivalent of restoring the backup to check its integrity.

Maybe someone could run a restic test to check whether this is reproducible.

Thanks,



<tuexen@freebsd.org> wrote (Wednesday, 10/04/2024 at 13:12):


> On 10. Apr 2024, at 13:40, Nuno Teixeira <eduardo@freebsd.org> wrote:
>
> Hello all,
>
> @ current 1500018 and fetching torrents with net-p2p/qbittorrent finished ~2GB download and connection UP until the end:
>
> ---
> Apr 10 11:26:46 leg kernel: re0: watchdog timeout
> Apr 10 11:26:46 leg kernel: re0: link state changed to DOWN
> Apr 10 11:26:49 leg dhclient[58810]: New IP Address (re0): 192.168.1.67
> Apr 10 11:26:49 leg dhclient[58814]: New Subnet Mask (re0): 255.255.255.0
> Apr 10 11:26:49 leg dhclient[58818]: New Broadcast Address (re0): 192.168.1.255
> Apr 10 11:26:49 leg kernel: re0: link state changed to UP
> Apr 10 11:26:49 leg dhclient[58822]: New Routers (re0): 192.168.1.1
> ---
>
> In past tests I got more watchdog timeouts, the connection went down, and a reboot was needed to bring it back (`service netif restart` didn't work).
>
> Another way to reproduce this is to use sysutils/restic (backup program) to read/check all files from a remote server via sftp:
>
> `restic -r sftp:user@remote:restic-repo check --read-data` from a 60GB compressed backup.
>
> ---
> watchdog timeout x3 as above
> ---
>
> restic check fail log @ 15% progress:
> ---
> <snip>
> Load(<data/52e2923dd6>, 17310001, 0) returned error, retrying after 1.7670599s: connection lost
> Load(<data/d27a0abe0f>, 17456892, 0) returned error, retrying after 4.619104908s: connection lost
> Load(<data/52e2923dd6>, 17310001, 0) returned error, retrying after 5.477648517s: connection lost
> List(lock) returned error, retrying after 293.057766ms: connection lost
> List(lock) returned error, retrying after 385.206693ms: connection lost
> List(lock) returned error, retrying after 1.577594281s: connection lost
> <snip>
>
> Connection continues UP.
Hi,

I'm not sure what issue you are reporting. Could you state
what behavior you are experiencing with the base stack and with
the RACK stack? In particular, what is the difference?

Best regards
Michael
>
> Cheers,
>
> <tuexen@freebsd.org> wrote (Thursday, 28/03/2024 at 15:53):
>> On 28. Mar 2024, at 15:00, Nuno Teixeira <eduardo@freebsd.org> wrote:
>>
>> Hello all!
>>
>> Running rack @b7b78c1c169 "Optimize HPTS..." very happy on my laptop (amd64)!
>>
>> Thanks all!
> Thanks for the feedback!
>
> Best regards
> Michael
>>
>> Drew Gallatin <gallatin@freebsd.org> wrote (Thursday, 21/03/2024 at 12:58):
>> The entire point is to *NOT* go through the overhead of scheduling something asynchronously, but to take advantage of the fact that a user/kernel transition is going to trash the cache anyway.
>>
>> In the common case of a system which has fewer than the threshold number of connections, we access the tcp_hpts_softclock function pointer, make one function call, and access hpts_that_need_softclock, and then return. So that's 2 variables and a function call.
>>
>> I think it would be preferable to avoid that call, and to move the declaration of tcp_hpts_softclock and hpts_that_need_softclock so that they are in the same cacheline. Then we'd be hitting just a single line in the common case. (I've made comments on the review to that effect).
>>
>> Also, I wonder if the threshold could get higher by default, so that hpts is never called in this context unless we're to the point where we're scheduling thousands of runs of the hpts thread (and taking all those clock interrupts).
>>
>> Drew
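
The co-location idea in Drew's message can be sketched in userspace C. This is only an illustration under stated assumptions: `struct hpts_userret_hot`, `userret_hook()`, `fake_softclock()` and the 64-byte `CACHE_LINE_SIZE` are hypothetical names and values, and a plain `uint64_t` stands in for the kernel's `counter_u64_t`, which behaves differently (it is per-CPU).

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE_SIZE 64	/* assumption: typical amd64 cache line size */

/*
 * Hypothetical grouping of the two variables read on the fast path.
 * Aligning the first member to a cache line and placing the second
 * right behind it keeps both in a single line.
 */
struct hpts_userret_hot {
	alignas(CACHE_LINE_SIZE) void (*softclock)(void); /* models tcp_hpts_softclock */
	uint64_t need_softclock;	/* models hpts_that_need_softclock */
};

/* Compile-time check that both hot fields fit in one cache line. */
static_assert(offsetof(struct hpts_userret_hot, need_softclock) +
    sizeof(uint64_t) <= CACHE_LINE_SIZE,
    "hot fields must share one cache line");

static struct hpts_userret_hot hot;

/* Demo helper only: counts how often the stand-in softclock ran. */
static int fake_runs;

static void
fake_softclock(void)
{
	fake_runs++;
}

/*
 * Fast path: read both fields (one cache line) and return without a
 * function call in the common case. Returns 1 if the hook ran.
 */
static int
userret_hook(void)
{
	if (hot.softclock == NULL || hot.need_softclock == 0)
		return (0);
	hot.softclock();
	return (1);
}
```

With `hot.softclock` set to `fake_softclock`, `userret_hook()` keeps returning 0 until `hot.need_softclock` becomes nonzero, so the common case costs two loads from one line and no call.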
>>
>> On Wed, Mar 20, 2024, at 8:17 PM, Konstantin Belousov wrote:
>>> On Tue, Mar 19, 2024 at 06:19:52AM -0400, rrs wrote:
>>>> Ok I have created
>>>>
>>>> https://reviews.freebsd.org/D44420
>>>>
>>>> To address the issue. I also attach a short version of the patch that Nuno
>>>> can try and validate it works. Drew, you may want to try this and validate
>>>> the optimization does kick in, since I can only now test that it does not
>>>> on my local box :)
>>> The patch still causes access to all cpu's cachelines on each userret.
>>> It would be much better to inc/check the threshold and only schedule the
>>> call when exceeded. Then the call can occur in some dedicated context,
>>> like per-CPU thread, instead of userret.
>>>
>>>>
>>>>
>>>> R
>>>>
>>>>
>>>>
>>>> On 3/18/24 3:42 PM, Drew Gallatin wrote:
>>>>> No. The goal is to run on every return to userspace for every thread.
>>>>>
>>>>> Drew
>>>>>
>>>>> On Mon, Mar 18, 2024, at 3:41 PM, Konstantin Belousov wrote:
>>>>>> On Mon, Mar 18, 2024 at 03:13:11PM -0400, Drew Gallatin wrote:
>>>>>>> I got the idea from
>>>>>>> https://people.mpi-sws.org/~druschel/publications/soft-timers-tocs.pdf
>>>>>>>
>>>>>>> The gist is that the TCP pacing stuff needs to run frequently, and
>>>>>>> rather than run it out of a clock interrupt, it's more efficient to run
>>>>>>> it out of a system call context at just the point where we return to
>>>>>>> userspace and the cache is trashed anyway. The current implementation
>>>>>>> is fine for our workload, but probably not ideal for a generic system.
>>>>>>> Especially one where something is banging on system calls.
>>>>>>>
>>>>>>> ASTs could be the right tool for this, but I'm super unfamiliar with
>>>>>>> them, and I can't find any docs on them.
>>>>>>>
>>>>>>> Would ast_register(0, ASTR_UNCOND, 0, func) be roughly equivalent to
>>>>>>> what's happening here?
>>>>>> This call would need some AST number added, and then it registers the
>>>>>> ast to run on next return to userspace, for the current thread.
>>>>>>
>>>>>> Is it enough?
>>>>>>>
>>>>>>> Drew
>>>>>>
>>>>>>>
>>>>>>> On Mon, Mar 18, 2024, at 2:33 PM, Konstantin Belousov wrote:
>>>>>>>> On Mon, Mar 18, 2024 at 07:26:10AM -0500, Mike Karels wrote:
>>>>>>>>> On 18 Mar 2024, at 7:04, tuexen@freebsd.org wrote:
>>>>>>>>>
>>>>>>>>>>> On 18. Mar 2024, at 12:42, Nuno Teixeira <eduardo@freebsd.org> wrote:
>>>>>>>>>>>
>>>>>>>>>>> Hello all!
>>>>>>>>>>>
>>>>>>>>>>> It works just fine!
>>>>>>>>>>> System performance is OK.
>>>>>>>>>>> Using patch on main-n268841-b0aaf8beb126(-dirty).
>>>>>>>>>>>
>>>>>>>>>>> ---
>>>>>>>>>>> net.inet.tcp.functions_available:
>>>>>>>>>>> Stack                           D Alias                            PCB count
>>>>>>>>>>> freebsd                           freebsd                                  0
>>>>>>>>>>> rack                            * rack                                    38
>>>>>>>>>>> ---
>>>>>>>>>>>
>>>>>>>>>>> It would be so nice if we could have a sysctl tunable for this patch
>>>>>>>>>>> so we could do more tests without recompiling the kernel.
>>>>>>>>>> Thanks for testing!
>>>>>>>>>>
>>>>>>>>>> @gallatin: can you come up with a patch that is acceptable for Netflix
>>>>>>>>>> and mitigates the performance regression?
>>>>>>>>>
>>>>>>>>> Ideally, tcphpts could enable this automatically when it starts to be
>>>>>>>>> used (enough?), but a sysctl could select auto/on/off.
>>>>>>>> There is already a well-known mechanism to request execution of the
>>>>>>>> specific function on return to userspace, namely AST. The difference
>>>>>>>> with the current hack is that the execution is requested for one callback
>>>>>>>> in the context of the specific thread.
>>>>>>>>
>>>>>>>> Still, it might be worth a try to use it; what is the reason to hit a thread
>>>>>>>> that does not do networking, with TCP processing?
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Mike
>>>>>>>>>
>>>>>>>>>> Best regards
>>>>>>>>>> Michael
>>>>>>>>>>>
>>>>>>>>>>> Thanks all!
>>>>>>>>>>> Really happy here :)
>>>>>>>>>>>
>>>>>>>>>>> Cheers,
>>>>>>>>>>>
>>>>>>>>>>> Nuno Teixeira <eduardo@freebsd.org> wrote (Sunday, 17/03/2024 at 20:26):
>>>>>>>>>>>>
>>>>>>>>>>>> Hello,
>>>>>>>>>>>>
>>>>>>>>>>>>> I don't have the full context, but it seems like the complaint is
>>>>>>>>>>>>> a performance regression in bonnie++ and perhaps other things when
>>>>>>>>>>>>> tcp_hpts is loaded, even when it is not used. Is that correct?
>>>>>>>>>>>>>
>>>>>>>>>>>>> If so, I suspect it's because we drive the tcp_hpts_softclock()
>>>>>>>>>>>>> routine from userret(), in order to avoid tons of timer interrupts
>>>>>>>>>>>>> and context switches. To test this theory, you could apply a patch like:
>>>>>>>>>>>>
>>>>>>>>>>>> It's affecting overall system performance; bonnie was just a way to
>>>>>>>>>>>> get some numbers to compare.
>>>>>>>>>>>>
>>>>>>>>>>>> Tomorrow I will test the patch.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks!
>>>>>>>>>>>>
>>>>>>>>>>>> --
>>>>>>>>>>>> Nuno Teixeira
>>>>>>>>>>>> FreeBSD Committer (ports)
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> Nuno Teixeira
>>>>>>>>>>> FreeBSD Committer (ports)
>>>>>>>>>
>>>>>>>>
>>>>>>
>>>>>
>>>
>>>> diff --git a/sys/netinet/tcp_hpts.c b/sys/netinet/tcp_hpts.c
>>>> index 8c4d2d41a3eb..eadbee19f69c 100644
>>>> --- a/sys/netinet/tcp_hpts.c
>>>> +++ b/sys/netinet/tcp_hpts.c
>>>> @@ -216,6 +216,7 @@ struct tcp_hpts_entry {
>>>>  	void *ie_cookie;
>>>>  	uint16_t p_num; /* The hpts number one per cpu */
>>>>  	uint16_t p_cpu; /* The hpts CPU */
>>>> +	uint8_t hit_callout_thresh;
>>>>  	/* There is extra space in here */
>>>>  	/* Cache line 0x100 */
>>>>  	struct callout co __aligned(CACHE_LINE_SIZE);
>>>> @@ -269,6 +270,11 @@ static struct hpts_domain_info {
>>>>  	int cpu[MAXCPU];
>>>>  } hpts_domains[MAXMEMDOM];
>>>>
>>>> +counter_u64_t hpts_that_need_softclock;
>>>> +SYSCTL_COUNTER_U64(_net_inet_tcp_hpts_stats, OID_AUTO, needsoftclock, CTLFLAG_RD,
>>>> +    &hpts_that_need_softclock,
>>>> +    "Number of hpts threads that need softclock");
>>>> +
>>>>  counter_u64_t hpts_hopelessly_behind;
>>>>
>>>>  SYSCTL_COUNTER_U64(_net_inet_tcp_hpts_stats, OID_AUTO, hopeless, CTLFLAG_RD,
>>>> @@ -334,7 +340,7 @@ SYSCTL_INT(_net_inet_tcp_hpts, OID_AUTO, precision, CTLFLAG_RW,
>>>>      &tcp_hpts_precision, 120,
>>>>      "Value for PRE() precision of callout");
>>>>  SYSCTL_INT(_net_inet_tcp_hpts, OID_AUTO, cnt_thresh, CTLFLAG_RW,
>>>> -    &conn_cnt_thresh, 0,
>>>> +    &conn_cnt_thresh, DEFAULT_CONNECTION_THESHOLD,
>>>>      "How many connections (below) make us use the callout based mechanism");
>>>>  SYSCTL_INT(_net_inet_tcp_hpts, OID_AUTO, logging, CTLFLAG_RW,
>>>>      &hpts_does_tp_logging, 0,
>>>> @@ -1548,6 +1554,9 @@ __tcp_run_hpts(void)
>>>>  	struct tcp_hpts_entry *hpts;
>>>>  	int ticks_ran;
>>>>
>>>> +	if (counter_u64_fetch(hpts_that_need_softclock) == 0)
>>>> +		return;
>>>> +
>>>>  	hpts = tcp_choose_hpts_to_run();
>>>>
>>>>  	if (hpts->p_hpts_active) {
>>>> @@ -1683,6 +1692,13 @@ tcp_hpts_thread(void *ctx)
>>>>  	ticks_ran = tcp_hptsi(hpts, 1);
>>>>  	tv.tv_sec = 0;
>>>>  	tv.tv_usec = hpts->p_hpts_sleep_time * HPTS_TICKS_PER_SLOT;
>>>> +	if ((hpts->p_on_queue_cnt > conn_cnt_thresh) && (hpts->hit_callout_thresh == 0)) {
>>>> +		hpts->hit_callout_thresh = 1;
>>>> +		counter_u64_add(hpts_that_need_softclock, 1);
>>>> +	} else if ((hpts->p_on_queue_cnt <= conn_cnt_thresh) && (hpts->hit_callout_thresh == 1)) {
>>>> +		hpts->hit_callout_thresh = 0;
>>>> +		counter_u64_add(hpts_that_need_softclock, -1);
>>>> +	}
>>>>  	if (hpts->p_on_queue_cnt >= conn_cnt_thresh) {
>>>>  		if(hpts->p_direct_wake == 0) {
>>>>  			/*
>>>> @@ -1818,6 +1834,7 @@ tcp_hpts_mod_load(void)
>>>>  	cpu_top = NULL;
>>>>  #endif
>>>>  	tcp_pace.rp_num_hptss = ncpus;
>>>> +	hpts_that_need_softclock = counter_u64_alloc(M_WAITOK);
>>>>  	hpts_hopelessly_behind = counter_u64_alloc(M_WAITOK);
>>>>  	hpts_loops = counter_u64_alloc(M_WAITOK);
>>>>  	back_tosleep = counter_u64_alloc(M_WAITOK);
>>>> @@ -2042,6 +2059,7 @@ tcp_hpts_mod_unload(void)
>>>>  	free(tcp_pace.grps, M_TCPHPTS);
>>>>  #endif
>>>>
>>>> +	counter_u64_free(hpts_that_need_softclock);
>>>>  	counter_u64_free(hpts_hopelessly_behind);
>>>>  	counter_u64_free(hpts_loops);
>>>>  	counter_u64_free(back_tosleep);
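
The bookkeeping this patch adds can be modelled in plain userspace C to see how the gate behaves. This is a rough sketch, not kernel code: `struct hpts_sim`, `hpts_update_gate()` and `run_hpts_allowed()` are hypothetical names, the threshold of 100 is an assumed value chosen for illustration, and a plain integer replaces the per-CPU `counter_u64_t`.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for struct tcp_hpts_entry. */
struct hpts_sim {
	int on_queue_cnt;		/* models p_on_queue_cnt */
	uint8_t hit_callout_thresh;	/* flag added by the patch */
};

static int conn_cnt_thresh = 100;	/* assumed value, for illustration */
static int64_t need_softclock;		/* models hpts_that_need_softclock */

/*
 * Models the block added to tcp_hpts_thread(): each hpts contributes
 * at most 1 to the counter, and only while it is above the threshold.
 */
static void
hpts_update_gate(struct hpts_sim *h)
{
	if (h->on_queue_cnt > conn_cnt_thresh && h->hit_callout_thresh == 0) {
		h->hit_callout_thresh = 1;
		need_softclock++;
	} else if (h->on_queue_cnt <= conn_cnt_thresh &&
	    h->hit_callout_thresh == 1) {
		h->hit_callout_thresh = 0;
		need_softclock--;
	}
	assert(need_softclock >= 0);
}

/*
 * Models the early return added to __tcp_run_hpts(): the userret path
 * does real work only while at least one hpts is above the threshold.
 */
static int
run_hpts_allowed(void)
{
	return (need_softclock != 0);
}
```

Because the per-entry flag records whether an hpts has already been counted, calling `hpts_update_gate()` repeatedly on an entry that stays above the threshold does not inflate the counter, and the counter returns to zero once every entry drops back to the threshold or below.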
>>>
>>>
>>
>>
>>
>> --
>> Nuno Teixeira
>> FreeBSD Committer (ports)
>
>
>
> --
> Nuno Teixeira
> FreeBSD Committer (ports)



--
Nuno Teixeira
FreeBSD Committer (ports)
--000000000000c64a570615bd6820--

From nobody Thu Apr 11 13:07:35 2024
Date: Thu, 11 Apr 2024 22:07:35 +0900
From: Hiroo Ono
To: freebsd-current
Subject: llvm and Undefined symbols: ___truncsfbf2 problem
Message-ID: <20240411220735.069cb283@nowhere.oikumene.ukehi.net>
X-Mailer: Claws Mail 4.1.1 (GTK 3.24.41; amd64-portbld-freebsd14.0)
List-Id: Discussions about the use of FreeBSD-current
List-Archive: https://lists.freebsd.org/archives/freebsd-current
Sender: owner-freebsd-current@freebsd.org
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit

Hello,

I am trying to update the lang/julia port to 1.11.0 (currently still in
beta 1). I seem to have run across this problem, initially reported on macOS:
https://github.com/JuliaLang/julia/issues/52067

The llvm team seems to have patched this problem only for Darwin:
https://github.com/llvm/llvm-project/pull/84192

I think the solution is also needed for FreeBSD. Should I report it
directly to the llvm team, or report it here or to FreeBSD bugzilla and ask
the FreeBSD toolchain maintainer to report it upstream?

----
Hiroo Ono