From owner-freebsd-arch@freebsd.org Tue May 3 17:36:35 2016 Return-Path: Delivered-To: freebsd-arch@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 42AFEB2C760 for ; Tue, 3 May 2016 17:36:35 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from bigwig.baldwin.cx (bigwig.baldwin.cx [IPv6:2001:470:1f11:75::1]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 26405198F for ; Tue, 3 May 2016 17:36:35 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from ralph.baldwin.cx (c-73-231-226-104.hsd1.ca.comcast.net [73.231.226.104]) by bigwig.baldwin.cx (Postfix) with ESMTPSA id CCC78B93A for ; Tue, 3 May 2016 13:36:33 -0400 (EDT) From: John Baldwin To: 'freebsd-arch' Subject: Default KMODDIR Date: Tue, 03 May 2016 10:36:28 -0700 Message-ID: <1702887.zov9WbhYWY@ralph.baldwin.cx> User-Agent: KMail/4.14.3 (FreeBSD/10.2-STABLE; KDE/4.14.3; amd64; ; ) MIME-Version: 1.0 Content-Transfer-Encoding: 7Bit Content-Type: text/plain; charset="us-ascii" X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.7 (bigwig.baldwin.cx); Tue, 03 May 2016 13:36:33 -0400 (EDT) X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.22 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 03 May 2016 17:36:35 -0000 I'd like to change KMODDIR's default from /boot/kernel to /boot/modules. Kernel builds already set KMODDIR explicitly in sys/conf/kern.pre.mk, so the only modules affected would be modules built standalone outside of a kernel build. There are two cases to consider I think: 1) Someone has installed a custom kernel and finds they need module 'foo' that wasn't in their custom kernel. They might do: # cd /sys/modules/foo # make all install # kldload foo For myself, if I need this I find myself instead modifying my kernel config to include the module in MODULES_OVERRIDE and then doing a 'make buildkernel && make reinstallkernel' as this will work after my next source upgrade. In this case, foo.ko would now end up in /boot/modules rather than /boot/kernel. Note that if a user just does 'make load' instead of 'make install' then nothing changes. 2) A module lives outside of the tree (or a vendor wishes to ship a newer version as a standalone module). All of these modules (including all of the one in ports) currently have to override KMODDIR explicitly in the module Makefile. It is the 2) case I would like to make more seamless by changing the default. This does change the behavior for 1) if someone is doing 'make install' rather than 'make load' from a module build directory. Strawman diff: Index: share/mk/bsd.own.mk =================================================================== --- share/mk/bsd.own.mk (revision 298711) +++ share/mk/bsd.own.mk (working copy) @@ -49,7 +49,7 @@ # # # KMODDIR Base path for loadable kernel modules -# (see kld(4)). [/boot/kernel] +# (see kld(4)). [/boot/modules] # # KMODOWN Kernel and KLD owner. [${BINOWN}] # @@ -165,11 +165,7 @@ BINMODE?= 555 NOBINMODE?= 444 -.if defined(MODULES_WITH_WORLD) KMODDIR?= /boot/modules -.else -KMODDIR?= /boot/kernel -.endif KMODOWN?= ${BINOWN} KMODGRP?= ${BINGRP} KMODMODE?= ${BINMODE} -- John Baldwin From owner-freebsd-arch@freebsd.org Tue May 3 17:41:03 2016 Return-Path: Delivered-To: freebsd-arch@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 20569B2C81D for ; Tue, 3 May 2016 17:41:03 +0000 (UTC) (envelope-from gjb@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:1900:2254:206c::16:87]) by mx1.freebsd.org (Postfix) with ESMTP id 065E61B65; Tue, 3 May 2016 17:41:03 +0000 (UTC) (envelope-from gjb@FreeBSD.org) Received: from FreeBSD.org (freefall.freebsd.org [IPv6:2001:1900:2254:206c::16:87]) by freefall.freebsd.org (Postfix) with ESMTP id B695412AB; Tue, 3 May 2016 17:41:02 +0000 (UTC) (envelope-from gjb@FreeBSD.org) Date: Tue, 3 May 2016 17:41:01 +0000 From: Glen Barber To: John Baldwin Cc: 'freebsd-arch' Subject: Re: Default KMODDIR Message-ID: <20160503174101.GX1804@FreeBSD.org> References: <1702887.zov9WbhYWY@ralph.baldwin.cx> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha256; protocol="application/pgp-signature"; boundary="gU5RoU0MzIDcUgzM" Content-Disposition: inline In-Reply-To: <1702887.zov9WbhYWY@ralph.baldwin.cx> X-Operating-System: FreeBSD 11.0-CURRENT amd64 X-SCUD-Definition: Sudden Completely Unexpected Dataloss X-SULE-Definition: Sudden Unexpected Learning Event X-PEKBAC-Definition: Problem Exists, Keyboard Between Admin/Computer User-Agent: Mutt/1.5.24 (2015-08-30) X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.22 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 03 May 2016 17:41:03 -0000 --gU5RoU0MzIDcUgzM Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Tue, May 03, 2016 at 10:36:28AM -0700, John Baldwin wrote: > I'd like to change KMODDIR's default from /boot/kernel to /boot/modules. > Kernel builds already set KMODDIR explicitly in sys/conf/kern.pre.mk, so > the only modules affected would be modules built standalone outside of a > kernel build. There are two cases to consider I think: >=20 > 1) Someone has installed a custom kernel and finds they need module > 'foo' that wasn't in their custom kernel. They might do: >=20 > # cd /sys/modules/foo > # make all install > # kldload foo >=20 > For myself, if I need this I find myself instead modifying my kernel > config to include the module in MODULES_OVERRIDE and then doing a > 'make buildkernel && make reinstallkernel' as this will work after my > next source upgrade. >=20 > In this case, foo.ko would now end up in /boot/modules rather than > /boot/kernel. Note that if a user just does 'make load' instead of > 'make install' then nothing changes. >=20 > 2) A module lives outside of the tree (or a vendor wishes to ship a > newer version as a standalone module). All of these modules (including > all of the one in ports) currently have to override KMODDIR explicitly > in the module Makefile. >=20 > It is the 2) case I would like to make more seamless by changing the > default. This does change the behavior for 1) if someone is doing=20 > 'make install' rather than 'make load' from a module build directory. >=20 > Strawman diff: >=20 > Index: share/mk/bsd.own.mk > =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D > --- share/mk/bsd.own.mk (revision 298711) > +++ share/mk/bsd.own.mk (working copy) > @@ -49,7 +49,7 @@ > # > # > # KMODDIR Base path for loadable kernel modules > -# (see kld(4)). [/boot/kernel] > +# (see kld(4)). [/boot/modules] > # > # KMODOWN Kernel and KLD owner. [${BINOWN}] > # > @@ -165,11 +165,7 @@ > BINMODE?=3D 555 > NOBINMODE?=3D 444 > =20 > -.if defined(MODULES_WITH_WORLD) > KMODDIR?=3D /boot/modules > -.else > -KMODDIR?=3D /boot/kernel > -.endif > KMODOWN?=3D ${BINOWN} > KMODGRP?=3D ${BINGRP} > KMODMODE?=3D ${BINMODE} >=20 This seems like a good idea to me. I tend to do (1) quite often, and sometimes forget what modules I had temporarily built for testing something specific, and your proposal would be very helpful in finding the modules I do not always want or need installed. Glen --gU5RoU0MzIDcUgzM Content-Type: application/pgp-signature; name="signature.asc" -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQIcBAEBCAAGBQJXKOKtAAoJEAMUWKVHj+KTA4kP/2azI81ABLmoeRWfUPVHA+hX yH+P0J2qXYSmtugK3jc+JzwJH7jnJNY6vjYEfig//gTZYgJl9Jmo7axYuGn5V9Af XxCRRnVhbe+WZiHZ8/Cey0LpZPPE3nazapLIWSovulMBAgUTcXxRvbRS931Q5s8h N7Xxyv62m1cKgJZLb5pUDiDAKukDK49IeX9oJtPSDBkR9bjHo4wVZhrSnD8h84Au ikuUeTsQg8EzCV2n4iPiY5gL7zsBu1sFDluUfqIijzx4gVThrumL5Cbo6N3pxqnN k5L6Qj1WdrIlp+bfTQ4r0rughv5JNVFkr6IYlvecrZRF8TZSq0+Ge2IKQJU5o4Bm 71gA9yX9LMurnLLHAxYg9H6dZ6KNBoJTEd06ZXQuXq2T7nzYiZm7Es0en6F9oaAw GFekaU+sxc1L3DXs83E1J3/WswnIS9mFofs4MEyAv3Xrf3KNp4Z9pQWAq2i+4ZRi zhk8HA9ffj17/KPTqKPGVGLoAtibzjzK0MgjzxsWWNPuD67tE7RS+AX1Y/h5k1XE I3vXHAMdqz5ig0otHqHFItiTR8wSRMA+ERxdu/Q34XHqVQ2pgg649gcEYvxgGyX5 /BiPlBi1vlw8oLw+FB8lROeOIiHDt8j56pQwCT3zBBm+/raE0ArAYwoqU/xQCc95 62qK0Si1PMuIfGG47p0J =0DnQ -----END PGP SIGNATURE----- --gU5RoU0MzIDcUgzM-- From owner-freebsd-arch@freebsd.org Tue May 3 17:47:50 2016 Return-Path: Delivered-To: freebsd-arch@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 55042B2CB7D for ; Tue, 3 May 2016 17:47:50 +0000 (UTC) (envelope-from kaduk@mit.edu) Received: from dmz-mailsec-scanner-3.mit.edu (dmz-mailsec-scanner-3.mit.edu [18.9.25.14]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id E67F51070; Tue, 3 May 2016 17:47:49 +0000 (UTC) (envelope-from kaduk@mit.edu) X-AuditID: 1209190e-d13ff70000004bd5-da-5728e3109cdd Received: from mailhub-auth-1.mit.edu ( [18.9.21.35]) (using TLS with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by (Symantec Messaging Gateway) with SMTP id 15.69.19413.013E8275; Tue, 3 May 2016 13:42:40 -0400 (EDT) Received: from outgoing.mit.edu (outgoing-auth-1.mit.edu [18.9.28.11]) by mailhub-auth-1.mit.edu (8.13.8/8.9.2) with ESMTP id u43HgdJW026727; Tue, 3 May 2016 13:42:39 -0400 Received: from multics.mit.edu (system-low-sipb.mit.edu [18.187.2.37]) (authenticated bits=56) (User authenticated as kaduk@ATHENA.MIT.EDU) by outgoing.mit.edu (8.13.8/8.12.4) with ESMTP id u43HgaFk018778 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NOT); Tue, 3 May 2016 13:42:39 -0400 Received: (from kaduk@localhost) by multics.mit.edu (8.12.9.20060308) id u43Hga1q013448; Tue, 3 May 2016 13:42:36 -0400 (EDT) Date: Tue, 3 May 2016 13:42:36 -0400 (EDT) From: Benjamin Kaduk To: John Baldwin cc: "'freebsd-arch'" Subject: Re: Default KMODDIR In-Reply-To: <1702887.zov9WbhYWY@ralph.baldwin.cx> Message-ID: References: <1702887.zov9WbhYWY@ralph.baldwin.cx> User-Agent: Alpine 1.10 (GSO 962 2008-03-14) MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Brightmail-Tracker: H4sIAAAAAAAAA+NgFnrOIsWRmVeSWpSXmKPExsUixCmqrCvwWCPc4OR9fovZ06cxWew9cp3Z gcljxqf5LAGMUVw2Kak5mWWpRfp2CVwZ65bdYiuYwFTx6cRGxgbG64xdjJwcEgImEnNb3zJ1 MXJxCAm0MUk8Wn6ZBcLZwCixdPIcKOcgk8S0DfuZQVqEBOol+m9PYwexWQS0JH5NXQlmswmo SMx8s5Gti5GDQ0RASWLqNzWQMLOAvsTdb9tYQWxhARmJ2w/3go3hFDCSuPf7OVicV8BR4uKZ sywQ4w0lLp9oZQOxRQV0JFbvn8ICUSMocXLmExaImVoSy6dvY5nAKDALSWoWktQCRqZVjLIp uVW6uYmZOcWpybrFyYl5ealFusZ6uZkleqkppZsYwWEoybeDcVKD9yFGAQ5GJR7eBQ/Uw4VY E8uKK3MPMUpyMCmJ8k6/rBEuxJeUn1KZkVicEV9UmpNafIhRgoNZSYT3wQ2gHG9KYmVValE+ TEqag0VJnJeRgYFBSCA9sSQ1OzW1ILUIJivDwaEkwcv3CKhRsCg1PbUiLTOnBCHNxMEJMpwH aPjnhyDDiwsSc4sz0yHypxh1ORb8uL2WSYglLz8vVUqcdztIkQBIUUZpHtwccPrYzaT6ilEc 6C1hXiuQdTzA1AM36RXQEiagJdnrVUGWlCQipKQaGG8/bTj4/5BGvKOV3vf8axtX524znrBS rv7vhuXeiw+YLOete2bSP8duwXLR08FiOQZ7Y84dPuhTbsN7aLO4IKv3zw/MBdOYrHqPMJyI KzrUUd0/p1E15N4Wdvk8I23WhzHFq7PMBJ76u3hJ7qmMv3NRtPHXYmYzyY3P26STV3PYHJjy JyClUomlOCPRUIu5qDgRANKd6mj6AgAA X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.22 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 03 May 2016 17:47:50 -0000 On Tue, 3 May 2016, John Baldwin wrote: > I'd like to change KMODDIR's default from /boot/kernel to /boot/modules. As someone claiming to maintain [but not doing a terribly good job at the moment] and out-of-tree module, this seems like a good idea to me. -Ben From owner-freebsd-arch@freebsd.org Tue May 3 17:56:14 2016 Return-Path: Delivered-To: freebsd-arch@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id B6980B2CDCF for ; Tue, 3 May 2016 17:56:14 +0000 (UTC) (envelope-from ohauer@gmx.de) Received: from mout.gmx.net (mout.gmx.net [212.227.17.21]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mout.gmx.net", Issuer "TeleSec ServerPass DE-1" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 1B53F179B; Tue, 3 May 2016 17:56:13 +0000 (UTC) (envelope-from ohauer@gmx.de) Received: from [192.168.100.100] ([87.139.233.65]) by mail.gmx.com (mrgmx101) with ESMTPSA (Nemesis) id 0MYKGj-1bB2gu1wu1-00VBR8; Tue, 03 May 2016 19:56:05 +0200 Subject: Re: Default KMODDIR To: 'freebsd-arch' References: <1702887.zov9WbhYWY@ralph.baldwin.cx> From: olli hauer Message-ID: <5728E634.7010108@gmx.de> Date: Tue, 3 May 2016 19:56:04 +0200 User-Agent: Mozilla/5.0 (Windows NT 5.1; rv:38.0) Gecko/20100101 Thunderbird/38.7.2 MIME-Version: 1.0 In-Reply-To: <1702887.zov9WbhYWY@ralph.baldwin.cx> Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: 7bit X-Provags-ID: V03:K0:s/rclB6Yp9vDZKG7lLyx+ZSygf2wVtjPoBtl7WsBW2xLapfLeKD bafFHb82x+VnbSxdD+nsFqE4WTS5xNOxu3X09CYc/fion6rURDVOrOY9MUnB3v8GvY8mZN6 SSw7RpIIBVczCm+edOcrYKUonXSDhp5ZWc58lKK9PojOsBw9TYJfItTLcrR/HRAQMgrd4IL 9N2uguq9hNo95i2etUBnQ== X-UI-Out-Filterresults: notjunk:1;V01:K0:LnB/xnUf+6c=:M3u+pj2r8HNmZ7GPA4UWz+ 1G39/2aDmfxt7jKAW44FLdTAEhYeQAZAfsVPTq72Oi0TdA4xzsJCDxPfB0csQf7zuomhcMGZF gnl4k1PK86wIWnbalXBYSVSb6Xb8J42avCQljckTPr6krj6bW8QyUWcAPhLDSyaN9VgTTfkN/ SJAHdUUoe76btEcc3HOCCGX370MplrfIkCbngFKRkm8zRTZ8q841/+vm5K/Qu3qGIbhfQT6sw opDJ8f/KvcvrRIcGBJdsIZSkTcEPeAZRP3sBMR9bbPwW746egkjl3ueTrO3g6vqMYVmRoAsfD 9ByMqDAJZD0Jfx7aPPo4bKKJnUJm3CdOeThrvHZYqC8hXjIusnoGpeA8hXzr6CWFJb5i5HRDu tlfcXMFnN/lLxU95qXkMqr0YkBbUtdP2thiWm+vmES4emA5ZsjIWhc1FTG4SIwwOalvbLHTSe 5lxKTrSls3pSwCQCqlyQh9PQTZKBZ9Ukg68Q1Yo4N3kO9kV60oXhC/HyvRQz4uqia3CZLUQ+a EIMahDz2NgXoHi3LTbCQgjLoEVM9ZRFpgYTXDa+t/F0xOPbUYGx12glFJEnwFGNaJGomfZj0h 2q5l13Rj0xER7FK5ugmA2tEswJM0XUywHASbJSBz3MoYfuGiYusC3wOXA17ioHqMC/owvIOxf PsiOYR4eNX/VfxP+NmiGfmQvRWw1VypEWutUES8ObO9pAxvGvZUOZzo7Ue5of8yk5Dg2g5QkK IFH9xGWTHyOyACvKqeyCA7tVEVD6Qo04FDo/5vCNMVsmrXgDUH7MIL2XRfALPW8PMvVJhQTkI Asu46gR X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.22 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 03 May 2016 17:56:14 -0000 On 2016-05-03 19:36, John Baldwin wrote: > I'd like to change KMODDIR's default from /boot/kernel to /boot/modules. > Kernel builds already set KMODDIR explicitly in sys/conf/kern.pre.mk, so > the only modules affected would be modules built standalone outside of a > kernel build. There are two cases to consider I think: > > 1) Someone has installed a custom kernel and finds they need module > 'foo' that wasn't in their custom kernel. They might do: > > # cd /sys/modules/foo > # make all install > # kldload foo > > For myself, if I need this I find myself instead modifying my kernel > config to include the module in MODULES_OVERRIDE and then doing a > 'make buildkernel && make reinstallkernel' as this will work after my > next source upgrade. > > In this case, foo.ko would now end up in /boot/modules rather than > /boot/kernel. Note that if a user just does 'make load' instead of > 'make install' then nothing changes. > > 2) A module lives outside of the tree (or a vendor wishes to ship a > newer version as a standalone module). All of these modules (including > all of the one in ports) currently have to override KMODDIR explicitly > in the module Makefile. > > It is the 2) case I would like to make more seamless by changing the > default. This does change the behavior for 1) if someone is doing > 'make install' rather than 'make load' from a module build directory. ... Not totally sure about, but isn't /boot/modules used by third party drivers like original VMware and others? Will be there any impacts for this third party modules during kernel upgrades, like renaming modules to module.old? -- olli From owner-freebsd-arch@freebsd.org Tue May 3 18:19:51 2016 Return-Path: Delivered-To: freebsd-arch@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id E0C0DB2A61F for ; Tue, 3 May 2016 18:19:51 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from bigwig.baldwin.cx (bigwig.baldwin.cx [IPv6:2001:470:1f11:75::1]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id C32AB1E56 for ; Tue, 3 May 2016 18:19:51 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from ralph.baldwin.cx (c-73-231-226-104.hsd1.ca.comcast.net [73.231.226.104]) by bigwig.baldwin.cx (Postfix) with ESMTPSA id D8FFFB93B; Tue, 3 May 2016 14:19:50 -0400 (EDT) From: John Baldwin To: olli hauer Cc: 'freebsd-arch' Subject: Re: Default KMODDIR Date: Tue, 03 May 2016 11:16:48 -0700 Message-ID: <2852715.sr6RVszBnq@ralph.baldwin.cx> User-Agent: KMail/4.14.3 (FreeBSD/10.2-STABLE; KDE/4.14.3; amd64; ; ) In-Reply-To: <5728E634.7010108@gmx.de> References: <1702887.zov9WbhYWY@ralph.baldwin.cx> <5728E634.7010108@gmx.de> MIME-Version: 1.0 Content-Transfer-Encoding: 7Bit Content-Type: text/plain; charset="us-ascii" X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.7 (bigwig.baldwin.cx); Tue, 03 May 2016 14:19:50 -0400 (EDT) X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.22 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 03 May 2016 18:19:52 -0000 On Tuesday, May 03, 2016 07:56:04 PM olli hauer wrote: > On 2016-05-03 19:36, John Baldwin wrote: > > I'd like to change KMODDIR's default from /boot/kernel to /boot/modules. > > Kernel builds already set KMODDIR explicitly in sys/conf/kern.pre.mk, so > > the only modules affected would be modules built standalone outside of a > > kernel build. There are two cases to consider I think: > > > > 1) Someone has installed a custom kernel and finds they need module > > 'foo' that wasn't in their custom kernel. They might do: > > > > # cd /sys/modules/foo > > # make all install > > # kldload foo > > > > For myself, if I need this I find myself instead modifying my kernel > > config to include the module in MODULES_OVERRIDE and then doing a > > 'make buildkernel && make reinstallkernel' as this will work after my > > next source upgrade. > > > > In this case, foo.ko would now end up in /boot/modules rather than > > /boot/kernel. Note that if a user just does 'make load' instead of > > 'make install' then nothing changes. > > > > 2) A module lives outside of the tree (or a vendor wishes to ship a > > newer version as a standalone module). All of these modules (including > > all of the one in ports) currently have to override KMODDIR explicitly > > in the module Makefile. > > > > It is the 2) case I would like to make more seamless by changing the > > default. This does change the behavior for 1) if someone is doing > > 'make install' rather than 'make load' from a module build directory. > ... > > Not totally sure about, but isn't /boot/modules used by third party drivers like original VMware and others? > > Will be there any impacts for this third party modules during kernel upgrades, like renaming modules to module.old? No, kernel upgrades don't touch /boot/modules at all. Right now third party drivers have to explicitly set KMODDIR. This would mean they no longer have to do that as building a module "standalone" would now DTRT out of the box when installing. -- John Baldwin From owner-freebsd-arch@freebsd.org Tue May 3 18:43:00 2016 Return-Path: Delivered-To: freebsd-arch@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id D7FB5B2AF9D for ; Tue, 3 May 2016 18:43:00 +0000 (UTC) (envelope-from ohauer@gmx.de) Received: from mout.gmx.net (mout.gmx.net [212.227.17.21]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mout.gmx.net", Issuer "TeleSec ServerPass DE-1" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 4DD4C1366; Tue, 3 May 2016 18:42:59 +0000 (UTC) (envelope-from ohauer@gmx.de) Received: from [192.168.100.100] ([87.139.233.65]) by mail.gmx.com (mrgmx103) with ESMTPSA (Nemesis) id 0LjJCt-1bWSvW3Igp-00dUCU; Tue, 03 May 2016 20:37:53 +0200 Subject: Re: Default KMODDIR To: 'freebsd-arch' References: <1702887.zov9WbhYWY@ralph.baldwin.cx> <5728E634.7010108@gmx.de> <2852715.sr6RVszBnq@ralph.baldwin.cx> From: olli hauer Message-ID: <5728F001.50402@gmx.de> Date: Tue, 3 May 2016 20:37:53 +0200 User-Agent: Mozilla/5.0 (Windows NT 5.1; rv:38.0) Gecko/20100101 Thunderbird/38.7.2 MIME-Version: 1.0 In-Reply-To: <2852715.sr6RVszBnq@ralph.baldwin.cx> Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: 7bit X-Provags-ID: V03:K0:OXTlHP3+2T2NEEC3Suz/yoZEAr5oaROudtMYkU5f7K++32ucbvZ ZIuWkBfMW7XOahwtntk8yQ4EsodHpGFgeg/BZeAhdxMNjhsEznTg0032feCEAjk23Nmr/mg TxjzXi+jsAqi593wVxCP8f10b3yNqaJUDl4BZc3KQSJfwPHbJgoXAxt8urWEGKV6KpWcG43 iboNfNDc7s3nvfQGKvBig== X-UI-Out-Filterresults: notjunk:1;V01:K0:fYo9O+xUi9w=:BEovx6dZDJuD/bASWzCEBo 53sSLV8Qacq3Hdp18x6FRP+15UIRh9x/L2fSB7egf4H71UXY1ZvJLwFsN4xmJ/HbEZQ2oilP7 x9nCmc2PW5x04jjRzmZJIc2eB+xlI0VKGJqxKQBtsb+ldLhXW7h1jcFJl1InyiCdZPNIJhDnI 4/0QJD5KuHPcqk69fmzm19KYW3e0Fr1lLh/oik43rofxuOO7BfUVKnu5gD/DvFHIdUYYDR3E0 +zMrCva9EHZbbCTbYBcMuA3YNKHNeisp9wL5LqhyYP2ZF+R0yuGFEvyktXgAgUehnOMwdIkb8 mwKrn0gGdW/EJNBIRF4fESiSsOc8ylUpg5I/1cQ4O42M0Uv12v5ZWBss6NYX6YR8t6cbocHjK /UO1B+0joUZSjgWrXUkqyyTLME33MNHINClbwZMckHQ0IlGcCZkPP35cqsUSomXXLbukNWNgn RyUMtcgHBncpRqw1/qkW7w9uiGpyQhr/dAos7pD3yPQD4qU2pSDjANRyAB2XRbBlxIMiGLQVd 8d0gtHcT4IM42xmWjDYLwhoTbGIJE+R3aYrGgXTNjlU9l5iqPG+LRdxZAt0JUsjjxt5pKeJlb vDXUGh0wLIO08npapIXHUG6H2Lk72KzkN4hcv2stNcyFXsk/qoyWJ1dluEWu6zInoQP/N8lsf PLhEfgOiiGjU4Tl15o582pJJPnWDbbMsOWaCTcbH85m8D0I+F3SSbXpflo8JAqyKiF+ggSTNm 489qjBMoUHkThHHL/xjmuWtTJt7+5TQsoyOqFVMbEBn1MN+gnzKEmY1evwFH2MKdrht/knnOY wklPUSp X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.22 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 03 May 2016 18:43:00 -0000 On 2016-05-03 20:16, John Baldwin wrote: > On Tuesday, May 03, 2016 07:56:04 PM olli hauer wrote: >> On 2016-05-03 19:36, John Baldwin wrote: >>> I'd like to change KMODDIR's default from /boot/kernel to /boot/modules. >>> Kernel builds already set KMODDIR explicitly in sys/conf/kern.pre.mk, so >>> the only modules affected would be modules built standalone outside of a >>> kernel build. There are two cases to consider I think: >>> >>> 1) Someone has installed a custom kernel and finds they need module >>> 'foo' that wasn't in their custom kernel. They might do: >>> >>> # cd /sys/modules/foo >>> # make all install >>> # kldload foo >>> >>> For myself, if I need this I find myself instead modifying my kernel >>> config to include the module in MODULES_OVERRIDE and then doing a >>> 'make buildkernel && make reinstallkernel' as this will work after my >>> next source upgrade. >>> >>> In this case, foo.ko would now end up in /boot/modules rather than >>> /boot/kernel. Note that if a user just does 'make load' instead of >>> 'make install' then nothing changes. >>> >>> 2) A module lives outside of the tree (or a vendor wishes to ship a >>> newer version as a standalone module). All of these modules (including >>> all of the one in ports) currently have to override KMODDIR explicitly >>> in the module Makefile. >>> >>> It is the 2) case I would like to make more seamless by changing the >>> default. This does change the behavior for 1) if someone is doing >>> 'make install' rather than 'make load' from a module build directory. >> ... >> >> Not totally sure about, but isn't /boot/modules used by third party drivers like original VMware and others? >> >> Will be there any impacts for this third party modules during kernel upgrades, like renaming modules to module.old? > > No, kernel upgrades don't touch /boot/modules at all. Right now third > party drivers have to explicitly set KMODDIR. This would mean they no > longer have to do that as building a module "standalone" would now DTRT > out of the box when installing. > Thanks for clarification! -- olli From owner-freebsd-arch@freebsd.org Thu May 5 13:10:39 2016 Return-Path: Delivered-To: freebsd-arch@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id B56E3B2CADC for ; Thu, 5 May 2016 13:10:39 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from mailman.ysv.freebsd.org (unknown [127.0.1.3]) by mx1.freebsd.org (Postfix) with ESMTP id A1E7118FE for ; Thu, 5 May 2016 13:10:39 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: by mailman.ysv.freebsd.org (Postfix) id 9CEB0B2CADA; Thu, 5 May 2016 13:10:39 +0000 (UTC) Delivered-To: arch@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 9C7C2B2CAD9; Thu, 5 May 2016 13:10:39 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from kib.kiev.ua (kib.kiev.ua [IPv6:2001:470:d5e7:1::1]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 45F9618FD; Thu, 5 May 2016 13:10:39 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from tom.home (kib@localhost [127.0.0.1]) by kib.kiev.ua (8.15.2/8.15.2) with ESMTPS id u45DATQK087691 (version=TLSv1 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO); Thu, 5 May 2016 16:10:30 +0300 (EEST) (envelope-from kostikbel@gmail.com) DKIM-Filter: OpenDKIM Filter v2.10.3 kib.kiev.ua u45DATQK087691 Received: (from kostik@localhost) by tom.home (8.15.2/8.15.2/Submit) id u45DATPC087690; Thu, 5 May 2016 16:10:29 +0300 (EEST) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: tom.home: kostik set sender to kostikbel@gmail.com using -f Date: Thu, 5 May 2016 16:10:29 +0300 From: Konstantin Belousov To: threads@freebsd.org Cc: arch@freebsd.org Subject: Robust mutexes implementation Message-ID: <20160505131029.GE2422@kib.kiev.ua> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.6.0 (2016-04-01) X-Spam-Status: No, score=-2.0 required=5.0 tests=ALL_TRUSTED,BAYES_00, DKIM_ADSP_CUSTOM_MED,FREEMAIL_FROM,NML_ADSP_CUSTOM_MED autolearn=no autolearn_force=no version=3.4.1 X-Spam-Checker-Version: SpamAssassin 3.4.1 (2015-04-28) on tom.home X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.22 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 05 May 2016 13:10:39 -0000 I implemented robust mutexes for our libthr. A robust mutex is guaranteed to be cleared by the system upon either thread or process owner termination while the mutex is held. The next mutex locker is then notified about inconsistent mutex state and can execute (or abandon) corrective actions. The patch mostly consists of small changes here and there, adding neccessary checks for the inconsistent and abandoned conditions into existing paths. Additionally, the thread exit handler was extended to iterate over the userspace-maintained list of owned robust mutexes, unlocking and marking as terminated each of them. The list of owned robust mutexes cannot be maintained atomically synchronous with the mutex lock state (it is possible in kernel, but is too expensive). Instead, for the duration of lock or unlock operation, the current mutex is remembered in a special slot that is also checked by the kernel at thread termination. Kernel must be aware about the per-thread location of the heads of robust mutex lists and the current active mutex slot. Initially I tried to extend TCBs with this data, so only a single syscall at the threading library initialization would be needed: for any thread the location of TCB is known by kernel, and the syscall would pass offsets. Unfortunately, on some architectures the size of TCB is part of the fixed ABI and cannot be changed. Instead, when a thread touches a robust mutex for the first time, a new umtx op syscall is issued which informs about location of lists heads. The umtx sleep queues for PP and PI mutexes are split between non-robust and robust. I do not understand the reasoning behind this POSIX requirement. Patch passes all glibc tests for robust mutexes I found in the nptl/ directory. See https://github.com/kostikbel/glibc-robust-tests . Patch is available at https://kib.kiev.ua/kib/pshared/robust.1.patch (beware of self-signed root certificate in the chain). Work was sponsored by The FreeBSD Foundation. Unrelated things in the patch: 1. Style. Since I had to re-read whole sys/kern/kern_umtx.c, lib/libthr/thread/thr_umtx.h and lib/libthr/thread/thr_umtx.c, I started fixing the numerous style violations in these files, which actually made my eyes bleed. 2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared pi mutexes. 3. Removal of the struct pthread_mutex m_owner field. I cannot see why it is useful. The only optimization it provides is the possibility to avoid clearing UMUTEX_CONTESTED bit when reading m_lock.m_owner. The disadvantages of having this duplicated field is that kernel does not know about pthread_mutex, so cannot fix the dup value. Overall it is less work to clear UMUTEX_CONTESTED when checking owner, then to try and handle inconsistencies. I added the PMUTEX_OWNER_ID() macro to simplify code. 4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls the lifetime of the shared mutex associated with a vnode' page. Apparently, there is real code around which expects the following to work: - mmap a file, create a shared mutex in the mapping; - the process exits; - another process starts, mmaps the same file and expects that the previously initialized mutex is still usable. The knob changes the lifetime of such shared off-page from the 'destroy on last unmap' to either 'until vnode is reclaimed' or until 'pthread_mutex_destroy' called, whatever comes first. From owner-freebsd-arch@freebsd.org Thu May 5 17:31:40 2016 Return-Path: Delivered-To: freebsd-arch@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 0E398B2E650 for ; Thu, 5 May 2016 17:31:40 +0000 (UTC) (envelope-from martin@lispworks.com) Received: from mailman.ysv.freebsd.org (mailman.ysv.freebsd.org [IPv6:2001:1900:2254:206a::50:5]) by mx1.freebsd.org (Postfix) with ESMTP id F157518A9 for ; Thu, 5 May 2016 17:31:39 +0000 (UTC) (envelope-from martin@lispworks.com) Received: by mailman.ysv.freebsd.org (Postfix) id F086FB2E64D; Thu, 5 May 2016 17:31:39 +0000 (UTC) Delivered-To: arch@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id F00B5B2E64C; Thu, 5 May 2016 17:31:39 +0000 (UTC) (envelope-from martin@lispworks.com) Received: from lwfs1-cam.cam.lispworks.com (mail.lispworks.com [46.17.166.21]) by mx1.freebsd.org (Postfix) with ESMTP id 9D40218A8; Thu, 5 May 2016 17:31:39 +0000 (UTC) (envelope-from martin@lispworks.com) Received: from higson.cam.lispworks.com (higson.cam.lispworks.com [192.168.1.7]) by lwfs1-cam.cam.lispworks.com (8.14.9/8.14.9) with ESMTP id u45HKZYM011908; Thu, 5 May 2016 18:20:35 +0100 (BST) (envelope-from martin@lispworks.com) Received: from higson.cam.lispworks.com (localhost.localdomain [127.0.0.1]) by higson.cam.lispworks.com (8.14.4) id u45HKZlO021098; Thu, 5 May 2016 18:20:35 +0100 Received: (from martin@localhost) by higson.cam.lispworks.com (8.14.4/8.14.4/Submit) id u45HKZ76021094; Thu, 5 May 2016 18:20:35 +0100 Date: Thu, 5 May 2016 18:20:35 +0100 Message-Id: <201605051720.u45HKZ76021094@higson.cam.lispworks.com> From: Martin Simmons To: Konstantin Belousov CC: threads@freebsd.org, arch@freebsd.org In-reply-to: <20160505131029.GE2422@kib.kiev.ua> (message from Konstantin Belousov on Thu, 5 May 2016 16:10:29 +0300) Subject: Re: Robust mutexes implementation References: <20160505131029.GE2422@kib.kiev.ua> X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.22 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 05 May 2016 17:31:40 -0000 There is a potential bug in enqueue_mutex when it tests m1 == NULL and m1 != NULL. These tests only work because m_lock is the first slot in struct pthread_mutex and hence 0 in curthread->robust_list is converted to NULL (rather than a negative value). Also, is it safe to assume memory ordering between the assignments of m->m_lock.m_rb_lnk and curthread->robust_list? Maybe it is OK because the kernel will never read curthread->robust_list until after the CPU executing enqueue_mutex has passed a memory barrier? __Martin From owner-freebsd-arch@freebsd.org Thu May 5 18:58:17 2016 Return-Path: Delivered-To: freebsd-arch@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 27292B2EA58 for ; Thu, 5 May 2016 18:58:17 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from mailman.ysv.freebsd.org (unknown [127.0.1.3]) by mx1.freebsd.org (Postfix) with ESMTP id 1212F1FAE for ; Thu, 5 May 2016 18:58:17 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: by mailman.ysv.freebsd.org (Postfix) id 0D2DBB2EA56; Thu, 5 May 2016 18:58:17 +0000 (UTC) Delivered-To: arch@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 0CB3CB2EA55; Thu, 5 May 2016 18:58:17 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from kib.kiev.ua (kib.kiev.ua [IPv6:2001:470:d5e7:1::1]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 82B551FAD; Thu, 5 May 2016 18:58:16 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from tom.home (kib@localhost [127.0.0.1]) by kib.kiev.ua (8.15.2/8.15.2) with ESMTPS id u45IwAHm069983 (version=TLSv1 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO); Thu, 5 May 2016 21:58:10 +0300 (EEST) (envelope-from kostikbel@gmail.com) DKIM-Filter: OpenDKIM Filter v2.10.3 kib.kiev.ua u45IwAHm069983 Received: (from kostik@localhost) by tom.home (8.15.2/8.15.2/Submit) id u45IwAcl069982; Thu, 5 May 2016 21:58:10 +0300 (EEST) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: tom.home: kostik set sender to kostikbel@gmail.com using -f Date: Thu, 5 May 2016 21:58:10 +0300 From: Konstantin Belousov To: Martin Simmons Cc: threads@freebsd.org, arch@freebsd.org Subject: Re: Robust mutexes implementation Message-ID: <20160505185810.GF2422@kib.kiev.ua> References: <20160505131029.GE2422@kib.kiev.ua> <201605051720.u45HKZ76021094@higson.cam.lispworks.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <201605051720.u45HKZ76021094@higson.cam.lispworks.com> User-Agent: Mutt/1.6.0 (2016-04-01) X-Spam-Status: No, score=-2.0 required=5.0 tests=ALL_TRUSTED,BAYES_00, DKIM_ADSP_CUSTOM_MED,FREEMAIL_FROM,NML_ADSP_CUSTOM_MED autolearn=no autolearn_force=no version=3.4.1 X-Spam-Checker-Version: SpamAssassin 3.4.1 (2015-04-28) on tom.home X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.22 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 05 May 2016 18:58:17 -0000 On Thu, May 05, 2016 at 06:20:35PM +0100, Martin Simmons wrote: > There is a potential bug in enqueue_mutex when it tests m1 == NULL and m1 != > NULL. These tests only work because m_lock is the first slot in struct > pthread_mutex and hence 0 in curthread->robust_list is converted to NULL > (rather than a negative value). Yes. In fact I wrote the __containerof stuff only later, initial version of the patch did relied on the fact that m_lock is at the beginning of pthread_mutex. I rewrote the code in enqueue, and there is one more similar check in dequeue_mutex(). Updated patch is at https://kib.kiev.ua/kib/pshared/robust.2.patch . It also fixed the lack of userland split for non-robust pp/robust pp queues. > > Also, is it safe to assume memory ordering between the assignments of > m->m_lock.m_rb_lnk and curthread->robust_list? Maybe it is OK because the > kernel will never read curthread->robust_list until after the CPU executing > enqueue_mutex has passed a memory barrier? Inter-CPU ordering (I suppose you mean this, because you mention barriers) only matter when we consider multi-threaded interaction. In case of dequeue_mutex, paired to corresponding enqueue_mutex(), the calls occur in the same thread, and the thread is always self-consistent. WRT possible reordering, we in fact only care that enqueue writes do not become visible before lock is obtained, and dequeue must finish for external observers before lock is released. This is ensured by the umutex lock semantic, which has neccessary acquire barrier on lock and release barrier on unlock. From owner-freebsd-arch@freebsd.org Fri May 6 13:00:53 2016 Return-Path: Delivered-To: freebsd-arch@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id F28F8B2F78C for ; Fri, 6 May 2016 13:00:53 +0000 (UTC) (envelope-from martin@lispworks.com) Received: from mailman.ysv.freebsd.org (unknown [127.0.1.3]) by mx1.freebsd.org (Postfix) with ESMTP id E0C331F1B for ; Fri, 6 May 2016 13:00:53 +0000 (UTC) (envelope-from martin@lispworks.com) Received: by mailman.ysv.freebsd.org (Postfix) id E0087B2F78A; Fri, 6 May 2016 13:00:53 +0000 (UTC) Delivered-To: arch@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id DF885B2F789; Fri, 6 May 2016 13:00:53 +0000 (UTC) (envelope-from martin@lispworks.com) Received: from lwfs1-cam.cam.lispworks.com (mail.lispworks.com [46.17.166.21]) by mx1.freebsd.org (Postfix) with ESMTP id 8AE7A1F18; Fri, 6 May 2016 13:00:52 +0000 (UTC) (envelope-from martin@lispworks.com) Received: from higson.cam.lispworks.com (higson.cam.lispworks.com [192.168.1.7]) by lwfs1-cam.cam.lispworks.com (8.14.9/8.14.9) with ESMTP id u46D0mQI006640; Fri, 6 May 2016 14:00:48 +0100 (BST) (envelope-from martin@lispworks.com) Received: from higson.cam.lispworks.com (localhost.localdomain [127.0.0.1]) by higson.cam.lispworks.com (8.14.4) id u46D0mux008057; Fri, 6 May 2016 14:00:48 +0100 Received: (from martin@localhost) by higson.cam.lispworks.com (8.14.4/8.14.4/Submit) id u46D0mhN008053; Fri, 6 May 2016 14:00:48 +0100 Date: Fri, 6 May 2016 14:00:48 +0100 Message-Id: <201605061300.u46D0mhN008053@higson.cam.lispworks.com> From: Martin Simmons To: Konstantin Belousov CC: threads@freebsd.org, arch@freebsd.org In-reply-to: <20160505185810.GF2422@kib.kiev.ua> (message from Konstantin Belousov on Thu, 5 May 2016 21:58:10 +0300) Subject: Re: Robust mutexes implementation References: <20160505131029.GE2422@kib.kiev.ua> <201605051720.u45HKZ76021094@higson.cam.lispworks.com> <20160505185810.GF2422@kib.kiev.ua> X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.22 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 06 May 2016 13:00:54 -0000 >>>>> On Thu, 5 May 2016 21:58:10 +0300, Konstantin Belousov said: > > Yes. In fact I wrote the __containerof stuff only later, initial > version of the patch did relied on the fact that m_lock is at the > beginning of pthread_mutex. > > I rewrote the code in enqueue, and there is one more similar check in > dequeue_mutex(). > > Updated patch is at https://kib.kiev.ua/kib/pshared/robust.2.patch . OK. > On Thu, May 05, 2016 at 06:20:35PM +0100, Martin Simmons wrote: > > Also, is it safe to assume memory ordering between the assignments of > > m->m_lock.m_rb_lnk and curthread->robust_list? Maybe it is OK because the > > kernel will never read curthread->robust_list until after the CPU executing > > enqueue_mutex has passed a memory barrier? > > Inter-CPU ordering (I suppose you mean this, because you mention > barriers) only matter when we consider multi-threaded interaction. > In case of dequeue_mutex, paired to corresponding enqueue_mutex(), > the calls occur in the same thread, and the thread is always > self-consistent. Agreed. > WRT possible reordering, we in fact only care that enqueue writes do > not become visible before lock is obtained, and dequeue must finish > for external observers before lock is released. This is ensured by > the umutex lock semantic, which has neccessary acquire barrier on > lock and release barrier on unlock. I meant the case where CPU 1 claims the lock, executing this from enqueue_mutex: m->m_lock.m_rb_lnk = 0; /* A */ *rl = (uintptr_t)&m->m_lock; /* B */ and then the thread dies and CPU 2 cleans up the dead thread executing this from umtx_handle_rb: error = copyin((void *)rbp, &m, sizeof(m)); /* C */ *rb_list = m.m_rb_lnk; /* D */ where rbp is the value stored into *rl at B by CPU 1. What ensures that CPU 1's store at A is seen by CPU 2 when it reads the value at D? I'm hoping this is covered by something that I don't understand, but I was worried by the comment "consistent even in the case of thread termination at arbitrary moment" above enqueue_mutex. __Martin From owner-freebsd-arch@freebsd.org Fri May 6 13:20:54 2016 Return-Path: Delivered-To: freebsd-arch@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id A97F2B2FCAE for ; Fri, 6 May 2016 13:20:54 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from mailman.ysv.freebsd.org (mailman.ysv.freebsd.org [IPv6:2001:1900:2254:206a::50:5]) by mx1.freebsd.org (Postfix) with ESMTP id 934DC1A8A for ; Fri, 6 May 2016 13:20:54 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: by mailman.ysv.freebsd.org (Postfix) id 8E7D1B2FCAC; Fri, 6 May 2016 13:20:54 +0000 (UTC) Delivered-To: arch@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 8DDECB2FCAA; Fri, 6 May 2016 13:20:54 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from kib.kiev.ua (kib.kiev.ua [IPv6:2001:470:d5e7:1::1]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 3825B1A88; Fri, 6 May 2016 13:20:54 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from tom.home (kib@localhost [127.0.0.1]) by kib.kiev.ua (8.15.2/8.15.2) with ESMTPS id u46DKi7P025450 (version=TLSv1 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO); Fri, 6 May 2016 16:20:44 +0300 (EEST) (envelope-from kostikbel@gmail.com) DKIM-Filter: OpenDKIM Filter v2.10.3 kib.kiev.ua u46DKi7P025450 Received: (from kostik@localhost) by tom.home (8.15.2/8.15.2/Submit) id u46DKicH025449; Fri, 6 May 2016 16:20:44 +0300 (EEST) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: tom.home: kostik set sender to kostikbel@gmail.com using -f Date: Fri, 6 May 2016 16:20:44 +0300 From: Konstantin Belousov To: Martin Simmons Cc: threads@freebsd.org, arch@freebsd.org Subject: Re: Robust mutexes implementation Message-ID: <20160506132044.GA89104@kib.kiev.ua> References: <20160505131029.GE2422@kib.kiev.ua> <201605051720.u45HKZ76021094@higson.cam.lispworks.com> <20160505185810.GF2422@kib.kiev.ua> <201605061300.u46D0mhN008053@higson.cam.lispworks.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <201605061300.u46D0mhN008053@higson.cam.lispworks.com> User-Agent: Mutt/1.6.1 (2016-04-27) X-Spam-Status: No, score=-2.0 required=5.0 tests=ALL_TRUSTED,BAYES_00, DKIM_ADSP_CUSTOM_MED,FREEMAIL_FROM,NML_ADSP_CUSTOM_MED autolearn=no autolearn_force=no version=3.4.1 X-Spam-Checker-Version: SpamAssassin 3.4.1 (2015-04-28) on tom.home X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.22 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 06 May 2016 13:20:54 -0000 On Fri, May 06, 2016 at 02:00:48PM +0100, Martin Simmons wrote: > >>>>> On Thu, 5 May 2016 21:58:10 +0300, Konstantin Belousov said: > > WRT possible reordering, we in fact only care that enqueue writes do > > not become visible before lock is obtained, and dequeue must finish > > for external observers before lock is released. This is ensured by > > the umutex lock semantic, which has neccessary acquire barrier on > > lock and release barrier on unlock. > > I meant the case where CPU 1 claims the lock, executing this from > enqueue_mutex: > > m->m_lock.m_rb_lnk = 0; /* A */ > *rl = (uintptr_t)&m->m_lock; /* B */ > > and then the thread dies and CPU 2 cleans up the dead thread executing this > from umtx_handle_rb: > > error = copyin((void *)rbp, &m, sizeof(m)); /* C */ > *rb_list = m.m_rb_lnk; /* D */ > > where rbp is the value stored into *rl at B by CPU 1. > > What ensures that CPU 1's store at A is seen by CPU 2 when it reads the value > at D? I'm hoping this is covered by something that I don't understand, but I > was worried by the comment "consistent even in the case of thread termination > at arbitrary moment" above enqueue_mutex. > The cleanup is performed by the same thread which did the lock, in the kernel mode. Thread can only die by explicit kernel action, and this action is executed by the dying thread. In fact, there is one (or two) cases when the cleanup is performed by thread other than the lock owner. Namely, when we execve(2) (or _exit(2) the whole process, but the mechanics is same), we first put the process into so-called single-threading state, where all threads other than the executing the syscall, are put into sleep at the safe moment. Then, the old address space if destroyed and new image activated. Only after that, other threads are killed. It is done to allow the error return, where we need to keep threads around on failed execve. And, right before the old address space is destroyed, robust mutexes for all threads are terminated, since we need access to usermode VA to operate on locks. But there, the single-threading mechanics contains neccessary barriers to ensure visibility in right order. In essence, there are so many locks/unlocks with barrier semantic on single-threading that this is not a problem. From owner-freebsd-arch@freebsd.org Fri May 6 14:43:51 2016 Return-Path: Delivered-To: freebsd-arch@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 9BAE1B31091 for ; Fri, 6 May 2016 14:43:51 +0000 (UTC) (envelope-from martin@lispworks.com) Received: from mailman.ysv.freebsd.org (mailman.ysv.freebsd.org [IPv6:2001:1900:2254:206a::50:5]) by mx1.freebsd.org (Postfix) with ESMTP id 891CE1ADE for ; Fri, 6 May 2016 14:43:51 +0000 (UTC) (envelope-from martin@lispworks.com) Received: by mailman.ysv.freebsd.org (Postfix) id 88742B3108F; Fri, 6 May 2016 14:43:51 +0000 (UTC) Delivered-To: arch@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 87FF0B3108E; Fri, 6 May 2016 14:43:51 +0000 (UTC) (envelope-from martin@lispworks.com) Received: from lwfs1-cam.cam.lispworks.com (mail.lispworks.com [46.17.166.21]) by mx1.freebsd.org (Postfix) with ESMTP id 335211ADD; Fri, 6 May 2016 14:43:50 +0000 (UTC) (envelope-from martin@lispworks.com) Received: from higson.cam.lispworks.com (higson.cam.lispworks.com [192.168.1.7]) by lwfs1-cam.cam.lispworks.com (8.14.9/8.14.9) with ESMTP id u46EhkkQ010464; Fri, 6 May 2016 15:43:46 +0100 (BST) (envelope-from martin@lispworks.com) Received: from higson.cam.lispworks.com (localhost.localdomain [127.0.0.1]) by higson.cam.lispworks.com (8.14.4) id u46EhkFT009474; Fri, 6 May 2016 15:43:46 +0100 Received: (from martin@localhost) by higson.cam.lispworks.com (8.14.4/8.14.4/Submit) id u46EhkfY009470; Fri, 6 May 2016 15:43:46 +0100 Date: Fri, 6 May 2016 15:43:46 +0100 Message-Id: <201605061443.u46EhkfY009470@higson.cam.lispworks.com> From: Martin Simmons To: Konstantin Belousov CC: threads@freebsd.org, arch@freebsd.org In-reply-to: <20160506132044.GA89104@kib.kiev.ua> (message from Konstantin Belousov on Fri, 6 May 2016 16:20:44 +0300) Subject: Re: Robust mutexes implementation References: <20160505131029.GE2422@kib.kiev.ua> <201605051720.u45HKZ76021094@higson.cam.lispworks.com> <20160505185810.GF2422@kib.kiev.ua> <201605061300.u46D0mhN008053@higson.cam.lispworks.com> <20160506132044.GA89104@kib.kiev.ua> X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.22 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 06 May 2016 14:43:51 -0000 >>>>> On Fri, 6 May 2016 16:20:44 +0300, Konstantin Belousov said: > > The cleanup is performed by the same thread which did the lock, in the > kernel mode. Thread can only die by explicit kernel action, and this > action is executed by the dying thread. Thanks, that's the part I didn't know about. __Martin From owner-freebsd-arch@freebsd.org Fri May 6 23:30:14 2016 Return-Path: Delivered-To: freebsd-arch@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id E2910B31AF4 for ; Fri, 6 May 2016 23:30:14 +0000 (UTC) (envelope-from jilles@stack.nl) Received: from mailman.ysv.freebsd.org (unknown [127.0.1.3]) by mx1.freebsd.org (Postfix) with ESMTP id D1D0E108C for ; Fri, 6 May 2016 23:30:14 +0000 (UTC) (envelope-from jilles@stack.nl) Received: by mailman.ysv.freebsd.org (Postfix) id CDB74B31AF2; Fri, 6 May 2016 23:30:14 +0000 (UTC) Delivered-To: arch@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id CAE71B31AF1; Fri, 6 May 2016 23:30:14 +0000 (UTC) (envelope-from jilles@stack.nl) Received: from mx1.stack.nl (relay04.stack.nl [IPv6:2001:610:1108:5010::107]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (Client CN "mailhost.stack.nl", Issuer "CA Cert Signing Authority" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id 6D574108B; Fri, 6 May 2016 23:30:14 +0000 (UTC) (envelope-from jilles@stack.nl) Received: from toad2.stack.nl (toad2.stack.nl [IPv6:2001:610:1108:5010::161]) by mx1.stack.nl (Postfix) with ESMTP id 1734BB80ED; Sat, 7 May 2016 01:30:11 +0200 (CEST) Received: by toad2.stack.nl (Postfix, from userid 1677) id D8576892AE; Sat, 7 May 2016 01:30:11 +0200 (CEST) Date: Sat, 7 May 2016 01:30:11 +0200 From: Jilles Tjoelker To: Konstantin Belousov Cc: threads@freebsd.org, arch@freebsd.org Subject: Re: Robust mutexes implementation Message-ID: <20160506233011.GA99994@stack.nl> References: <20160505131029.GE2422@kib.kiev.ua> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20160505131029.GE2422@kib.kiev.ua> User-Agent: Mutt/1.5.23 (2014-03-12) X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.22 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 06 May 2016 23:30:15 -0000 On Thu, May 05, 2016 at 04:10:29PM +0300, Konstantin Belousov wrote: > I implemented robust mutexes for our libthr. A robust mutex is > guaranteed to be cleared by the system upon either thread or process > owner termination while the mutex is held. The next mutex locker is > then notified about inconsistent mutex state and can execute (or > abandon) corrective actions. > The patch mostly consists of small changes here and there, adding > neccessary checks for the inconsistent and abandoned conditions into > existing paths. Additionally, the thread exit handler was extended > to iterate over the userspace-maintained list of owned robust mutexes, > unlocking and marking as terminated each of them. > The list of owned robust mutexes cannot be maintained atomically > synchronous with the mutex lock state (it is possible in kernel, but > is too expensive). Instead, for the duration of lock or unlock > operation, the current mutex is remembered in a special slot that is > also checked by the kernel at thread termination. > Kernel must be aware about the per-thread location of the heads of > robust mutex lists and the current active mutex slot. Initially I tried > to extend TCBs with this data, so only a single syscall at the > threading library initialization would be needed: for any thread the > location of TCB is known by kernel, and the syscall would pass > offsets. Unfortunately, on some architectures the size of TCB is part > of the fixed ABI and cannot be changed. Instead, when a thread touches > a robust mutex for the first time, a new umtx op syscall is issued which > informs about location of lists heads. > The umtx sleep queues for PP and PI mutexes are split between > non-robust and robust. I do not understand the reasoning behind this > POSIX requirement. > Patch passes all glibc tests for robust mutexes I found in the nptl/ > directory. See https://github.com/kostikbel/glibc-robust-tests . > Patch is available at https://kib.kiev.ua/kib/pshared/robust.1.patch > (beware of self-signed root certificate in the chain). Work was > sponsored by The FreeBSD Foundation. > Unrelated things in the patch: > 1. Style. Since I had to re-read whole sys/kern/kern_umtx.c, > lib/libthr/thread/thr_umtx.h and lib/libthr/thread/thr_umtx.c, I > started fixing the numerous style violations in these files, which > actually made my eyes bleed. > 2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared > pi mutexes. > 3. Removal of the struct pthread_mutex m_owner field. I cannot see > why it is useful. The only optimization it provides is the > possibility to avoid clearing UMUTEX_CONTESTED bit when reading > m_lock.m_owner. The disadvantages of having this duplicated field > is that kernel does not know about pthread_mutex, so cannot fix the > dup value. Overall it is less work to clear UMUTEX_CONTESTED when > checking owner, then to try and handle inconsistencies. > I added the PMUTEX_OWNER_ID() macro to simplify code. > 4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls > the lifetime of the shared mutex associated with a vnode' page. > Apparently, there is real code around which expects the following > to work: > - mmap a file, create a shared mutex in the mapping; > - the process exits; > - another process starts, mmaps the same file and expects that the > previously initialized mutex is still usable. > The knob changes the lifetime of such shared off-page from the > 'destroy on last unmap' to either 'until vnode is reclaimed' or > until 'pthread_mutex_destroy' called, whatever comes first. The 'until vnode is reclaimed' bit sounds like a recipe for hard to reproduce bugs. I do think it is related to the robust mutex patch, though. Without robust mutexes and assuming threads do not unmap the memory while having a mutex locked or while waiting on a condition variable, it is sufficient to create the off-page mutex/condvar automatically in its initial state when pthread_mutex_lock() or pthread_cond_*wait() are called and no off-page object exists. With robust mutexes, we need to store somewhere whether the next thread should receive [EOWNERDEAD] or not, and this should persist even if no processes have the memory mapped. This might be done by replacing THR_PSHARED_PTR with a different value in the pthread_mutex_t. I'm not sure I like that memory write being done from the kernel though. The below is a review of https://kib.kiev.ua/kib/pshared/robust.2.patch > diff --git a/lib/libc/gen/Symbol.map b/lib/libc/gen/Symbol.map It is not necessary to export _pthread_mutex_consistent, _pthread_mutexattr_getrobust and _pthread_mutexattr_setrobust (under FBSDprivate_1.0 symbol version). They are not used by name outside the DSO, only via the jump table. The same thing is true of many other FBSDprivate_1.0 symbols but there is a difference between adding new unnecessary exports and removing existing unnecessary exports. > + if ((error2 == 0 || error2 == EOWNERDEAD) && cancel) > _thr_testcancel(curthread); I don't think [EOWNERDEAD] should be swept under the carpet like this. The cancellation cleanup handler will use the protected state and unlock the mutex without making the state consistent and the state will be unrecoverable. > +void > +_mutex_leave_robust(struct pthread *curthread, struct pthread_mutex *m) > +{ > + > + if (!is_robust_mutex(m)) > + return; This accesses the mutex after writing a value to the lock word which allows other threads to lock it. A use after free may result, since it is valid for another thread to lock, unlock and destroy the mutex (assuming the mutex is not used otherwise later). Memory ordering may permit the load of m->m_lock.m_flags to be moved before the actual unlock, so this issue may not actually appear. Given that the overhead of a system call on every robust mutex unlock is not desired, the kernel's unlock of a terminated thread's mutexes will unavoidably have this use after free. However, for non-robust mutexes the previous guarantees should be kept. > + int defered, error; Typo, should be "deferred". > +.It Bq Er EOWNERDEAD > +The argument > +.Fa mutex > +points to the robust mutex and the previous owning thread terminated > +while holding the mutex lock. > +The lock was granted to the caller and it is up to the new owner > +to make the state consistent. "points to a robust mutex". Perhaps add a See .Xr pthread_mutex_consistent 3 here. > diff --git a/share/man/man3/pthread_mutex_consistent.3 b/share/man/man3/pthread_mutex_consistent.3 This man page should mention that pthread_mutex_consistent() can be called when a mutex lock or condition variable wait failed with [EOWNERDEAD]. > @@ -37,7 +37,11 @@ struct umutex { > + __uintptr_t m_rb_lnk; /* Robust linkage */ Although Linux also stores the robust list nodes in the mutexes like this, I think it increases the chance of strange memory corruption. Making the robust list an array in the thread's memory would be more reliable. If the maximum number of owned robust mutexes can be small, this can have a fixed size; otherwise, it needs to grow as needed (which does add an allocation that may fail to the pthread_mutex_lock path, bad). > + * The umutex.m_lock values and bits. The m_owner is the word which > + * serves as the lock. It's high bit is the contention indicator, > + * rest of bits records the owner TID. TIDs values start with PID_MAX > + * + 2 and ends by INT32_MAX, the low range [1..PID_MAX] is guaranteed > + * to be useable as the special markers. Typo, "It's" should be "Its" and "ends" should be "end". Bruce Evans would probably complain about comma splice (two times). > +#define UMTX_OP_ROBUST 26 The name is rather vague. Perhaps this can be something like UMTX_OP_INIT_ROBUST. -- Jilles Tjoelker From owner-freebsd-arch@freebsd.org Sat May 7 17:00:03 2016 Return-Path: Delivered-To: freebsd-arch@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id E37E8B31C9A for ; Sat, 7 May 2016 17:00:03 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from mailman.ysv.freebsd.org (unknown [127.0.1.3]) by mx1.freebsd.org (Postfix) with ESMTP id CDFD51F4F for ; Sat, 7 May 2016 17:00:03 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: by mailman.ysv.freebsd.org (Postfix) id BDD06B31C96; Sat, 7 May 2016 17:00:03 +0000 (UTC) Delivered-To: arch@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id BB1B0B31C94; Sat, 7 May 2016 17:00:03 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from kib.kiev.ua (kib.kiev.ua [IPv6:2001:470:d5e7:1::1]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 3A96D1F4D; Sat, 7 May 2016 17:00:03 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from tom.home (kib@localhost [127.0.0.1]) by kib.kiev.ua (8.15.2/8.15.2) with ESMTPS id u47GxuRJ024297 (version=TLSv1 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO); Sat, 7 May 2016 19:59:56 +0300 (EEST) (envelope-from kostikbel@gmail.com) DKIM-Filter: OpenDKIM Filter v2.10.3 kib.kiev.ua u47GxuRJ024297 Received: (from kostik@localhost) by tom.home (8.15.2/8.15.2/Submit) id u47GxujY024296; Sat, 7 May 2016 19:59:56 +0300 (EEST) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: tom.home: kostik set sender to kostikbel@gmail.com using -f Date: Sat, 7 May 2016 19:59:56 +0300 From: Konstantin Belousov To: Jilles Tjoelker Cc: threads@freebsd.org, arch@freebsd.org Subject: Re: Robust mutexes implementation Message-ID: <20160507165956.GC89104@kib.kiev.ua> References: <20160505131029.GE2422@kib.kiev.ua> <20160506233011.GA99994@stack.nl> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20160506233011.GA99994@stack.nl> User-Agent: Mutt/1.6.1 (2016-04-27) X-Spam-Status: No, score=-2.0 required=5.0 tests=ALL_TRUSTED,BAYES_00, DKIM_ADSP_CUSTOM_MED,FREEMAIL_FROM,NML_ADSP_CUSTOM_MED autolearn=no autolearn_force=no version=3.4.1 X-Spam-Checker-Version: SpamAssassin 3.4.1 (2015-04-28) on tom.home X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.22 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 07 May 2016 17:00:04 -0000 On Sat, May 07, 2016 at 01:30:11AM +0200, Jilles Tjoelker wrote: > > 4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls > > the lifetime of the shared mutex associated with a vnode' page. > > Apparently, there is real code around which expects the following > > to work: > > - mmap a file, create a shared mutex in the mapping; > > - the process exits; > > - another process starts, mmaps the same file and expects that the > > previously initialized mutex is still usable. > > > The knob changes the lifetime of such shared off-page from the > > 'destroy on last unmap' to either 'until vnode is reclaimed' or > > until 'pthread_mutex_destroy' called, whatever comes first. > > The 'until vnode is reclaimed' bit sounds like a recipe for hard to > reproduce bugs. > > I do think it is related to the robust mutex patch, though. > > Without robust mutexes and assuming threads do not unmap the memory > while having a mutex locked or while waiting on a condition variable, it > is sufficient to create the off-page mutex/condvar automatically in its > initial state when pthread_mutex_lock() or pthread_cond_*wait() are > called and no off-page object exists. > > With robust mutexes, we need to store somewhere whether the next thread > should receive [EOWNERDEAD] or not, and this should persist even if no > processes have the memory mapped. This might be done by replacing > THR_PSHARED_PTR with a different value in the pthread_mutex_t. I'm not > sure I like that memory write being done from the kernel though. In principle, I agree with this. Note that if we go with something like THR_OWNERDEAD_PTR, the kernel write to set the value would be not much different from the kernel write to unlock robust mutex with inlined lock structures. Still, I would prefer to to implement this now. For the local purposes, the knob was enough, and default value will be 'disabled'. > > The below is a review of https://kib.kiev.ua/kib/pshared/robust.2.patch > > > diff --git a/lib/libc/gen/Symbol.map b/lib/libc/gen/Symbol.map > > It is not necessary to export _pthread_mutex_consistent, > _pthread_mutexattr_getrobust and _pthread_mutexattr_setrobust (under > FBSDprivate_1.0 symbol version). They are not used by name outside the > DSO, only via the jump table. Removed from the export. > > The same thing is true of many other FBSDprivate_1.0 symbols but there > is a difference between adding new unnecessary exports and removing > existing unnecessary exports. > > > + if ((error2 == 0 || error2 == EOWNERDEAD) && cancel) > > _thr_testcancel(curthread); > > I don't think [EOWNERDEAD] should be swept under the carpet like this. > The cancellation cleanup handler will use the protected state and unlock > the mutex without making the state consistent and the state will be > unrecoverable. So your argument there is to return EOWNERDEAD and not cancelling, am I right ? I reused part of your text as the comment. > > > +void > > +_mutex_leave_robust(struct pthread *curthread, struct pthread_mutex *m) > > +{ > > + > > + if (!is_robust_mutex(m)) > > + return; > > This accesses the mutex after writing a value to the lock > word which allows other threads to lock it. A use after free may result, > since it is valid for another thread to lock, unlock and destroy the > mutex (assuming the mutex is not used otherwise later). > > Memory ordering may permit the load of m->m_lock.m_flags to be moved > before the actual unlock, so this issue may not actually appear. > > Given that the overhead of a system call on every robust mutex unlock is > not desired, the kernel's unlock of a terminated thread's mutexes will > unavoidably have this use after free. However, for non-robust mutexes > the previous guarantees should be kept. I agree that this is a bug, and agree that the kernel accesses to the curthread->inact_mtx are potentially unsafe. I also did not wanted to issue a syscall for unlock of a robust mutex, as you noted. I fixed the bug with the is_robust_mutex() test in _mutex_leave_robust() by caching the robust status. I was indeed worried by the kernel access issue you mentioned, but kernel is immune to 'bad' memory accesses. What bothered me is the ill ABA situation, where the lock memory is freed and repurposed, and then the lock word is written with the one of two specific values which give the same state as for locked mutex. This would cause kernel to 'unlock' it (but not to follow the invalid m_rb_link). But for this to happen, we must have a situation where a thread is being terminated before mutex_unlock_common() reached the _mutex_leave_robust() call. This is async thread termination, which then must be either process termination (including execve()), or a call to thr_exit() from a signal handler in our thread (including async cancellation). I am sure that the thr_exit() situation is non-conforming, so the only concern is the process exit, and then, shared robust mutex, because for private mutex, only the exiting process state is affected. I can verify in umtx_handle_rb(), for instance, that for USYNC_PROCESS_SHARED object, the underlying memory is backed by the umtx shm page. This would close the race. But this would interfere with libthr2, if that ever happen. > > > + int defered, error; > > Typo, should be "deferred". I also changed PMUTEX_FLAG_DEFERED. > > > +.It Bq Er EOWNERDEAD > > +The argument > > +.Fa mutex > > +points to the robust mutex and the previous owning thread terminated > > +while holding the mutex lock. > > +The lock was granted to the caller and it is up to the new owner > > +to make the state consistent. > > "points to a robust mutex". > > Perhaps add a See .Xr pthread_mutex_consistent 3 here. Both done. > > > diff --git a/share/man/man3/pthread_mutex_consistent.3 b/share/man/man3/pthread_mutex_consistent.3 > > This man page should mention that pthread_mutex_consistent() can be > called when a mutex lock or condition variable wait failed with > [EOWNERDEAD]. I took introductory text from the POSIX page for the function. > > > @@ -37,7 +37,11 @@ struct umutex { > > + __uintptr_t m_rb_lnk; /* Robust linkage */ > > Although Linux also stores the robust list nodes in the mutexes like > this, I think it increases the chance of strange memory corruption. > Making the robust list an array in the thread's memory would be more > reliable. If the maximum number of owned robust mutexes can be small, > this can have a fixed size; otherwise, it needs to grow as needed (which > does add an allocation that may fail to the pthread_mutex_lock path, > bad). I gave this proposal some thought. I very much dislike an idea of calling memory allocator on the lock, and esp. the trylock, path. The later could need to obtain allocator locks which (sometimes partially) defeat the trylock purpose. I can use mmap(2) directly there, similarly how pthread_setspecific() was changed recently, which would avoid the issue of calling userspace allocator. Still, the problem of an addiitonal syscall, resulting ENOMEM and also the time to copy the current robust owned list to grown location are there (I do not see that using chunked allocations is reasonable, since it would be the same list as current m_rb_lnk, but at different level. I prefer to keep the robust linked list for these reasons. In fact, the deciding argument would be actual application usage of the robustness. I thought, when writing the patch, when and how would I use the feature, and did not see compelling arguments to even try to use it. My stumbling block is the user data consistency recovery: for instance, I tried to write a loop which would increment shared variable N times, and I was not able to end up with any simple recovery mechanism from the aborted iteration, except writing iteration in assembly and have a parallel tick variable which enumerate each iteration action. > > > + * The umutex.m_lock values and bits. The m_owner is the word which > > + * serves as the lock. It's high bit is the contention indicator, > > + * rest of bits records the owner TID. TIDs values start with PID_MAX > > + * + 2 and ends by INT32_MAX, the low range [1..PID_MAX] is guaranteed > > + * to be useable as the special markers. > > Typo, "It's" should be "Its" and "ends" should be "end". > > Bruce Evans would probably complain about comma splice (two times). I tried to reword this. > > > +#define UMTX_OP_ROBUST 26 > > The name is rather vague. Perhaps this can be something like > UMTX_OP_INIT_ROBUST. I renamed this to UMTX_OP_ROBUST_LISTS, together with the parameters structure. Updated patch is at https://kib.kiev.ua/kib/pshared/robust.3.patch I did not added the check for umtx shm into the umtx_handle_rb() yet, waiting for your opinion.