From nobody Mon Aug 2 10:11:33 2021
Date: Mon, 2 Aug 2021 12:11:33 +0200
From: Michael Gmelin
To: James Gritton
Cc: jail@freebsd.org, Michael Gmelin
Subject: Re: POSIX shared memory and dying jails
Message-ID: <20210802121133.4456fb99@bsd64.grem.de>
In-Reply-To: <8d9eb169d7b0072cd6f7ff00f5757842@freebsd.org>
References: <20210625164100.73c71055@bsd64.grem.de> <03809b2655a40134dd802386afa6be7d@freebsd.org> <20210625185859.40fead46@bsd64.grem.de> <8d9eb169d7b0072cd6f7ff00f5757842@freebsd.org>
List-Id: Discussion about FreeBSD jail(8)
List-Archive: https://lists.freebsd.org/archives/freebsd-jail

On Fri, 25 Jun 2021 20:18:39 -0700 James Gritton wrote:
> On 2021-06-25 09:58, Michael Gmelin wrote:
> > Another problem caused by the lack of jail ownership is that access
> > semantics are a bit strange. E.g., a jail based on / can easily list
> > (and remove) all memory allocations in the system, while for other
> > jails it depends. They can stat their own allocations like in:
> >
> > # posixshmcontrol stat /xyz
> > output as expected...
> >
> > But not list them:
> >
> > # posixshmcontrol ls
> > posixshmcontrol: cannot get kern.ipc.posix_shm_list length:
> > Operation not permitted
> >
> > Probably related to matching the path of the allocation, I didn't
> > look into the code.
>
> That's just a case of the sysctl not being marked as jail-safe.
> Looking at the code, it's clear that it needs to be altered when
> called from within a jail, but preventing it is definitely not the
> right thing.
>
> > but having something automatic in the OS would be nice.
> > Or being
> > able to run `posixshmcontrol -j shmtest ls`. Seems like this would
> > be quite some effort though to get it right - also in terms of who
> > can access what - right now, it's simply based on the path, which
> > also gives a lot of flexibility.
>
> Since access to the shared memory segments themselves is only on file
> permissions and pathnames, just making a "posixshmcontrol -j" also
> rely on pathnames actually makes sense.
>
> Put this into a bug report, and I'll take a closer look. Probably two
> different bugs for different issues (listing and automatic removal).
>

Hi Jamie,

I *finally* found the time to write the bug reports:

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=257554
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=257555
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=257556

I took the liberty to assign them to you.

Best,
Michael

--
Michael Gmelin

From nobody Mon Aug 2 12:19:00 2021
Date: Mon, 2 Aug 2021 14:19:00 +0200
From: Michael Gmelin
To: jail@freebsd.org
Subject: POSIX shared memory, jails, and (lack of) limits
Message-ID: <20210802141900.069d0051@bsd64.grem.de>

Hi,

I've been playing a bit with POSIX shared memory and, unlike for SysV
shared memory, I couldn't find any way to limit its use by jails.

First, I looked at racct/rctl, but there is no resource for POSIX shared
memory and memoryuse/vmemoryuse don't seem to have an effect (which
makes sense).
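
For readers less familiar with the mechanism under discussion: a named POSIX shared memory object is created with shm_open(2) and sized with ftruncate(2), and it keeps existing until it is unlinked, whether or not any process still has it mapped. The following is only a minimal sketch of that life cycle using Python's `multiprocessing.shared_memory` wrapper; the helper names `allocate_segments` and `release_segments`, the `shmtest_` prefix, and the sizes are assumptions made for illustration and do not come from the thread.

```python
import os
from multiprocessing import shared_memory


def allocate_segments(count, size, prefix=None):
    """Create `count` named POSIX shm objects of `size` bytes each.

    Hypothetical helper for illustration. Every byte is written so the
    pages are actually backed by memory or swap, not merely reserved.
    """
    if prefix is None:
        # Assumed naming scheme; the pid avoids clashes between runs.
        prefix = f"shmtest_{os.getpid()}_"
    names = []
    for i in range(count):
        seg = shared_memory.SharedMemory(
            name=f"{prefix}{i}", create=True, size=size
        )
        seg.buf[:size] = b"\xff" * size  # touch all requested bytes
        names.append(seg.name)
        # close() only drops this process's mapping; the named object
        # itself persists until somebody unlinks it.
        seg.close()
    return names


def release_segments(names):
    """Attach to each named object again and unlink it."""
    for name in names:
        seg = shared_memory.SharedMemory(name=name)
        seg.close()
        seg.unlink()


if __name__ == "__main__":
    created = allocate_segments(count=4, size=1024 * 1024)
    print("created:", created)
    release_segments(created)
```

Between `allocate_segments` and `release_segments` no process has the objects mapped, yet they still occupy memory, which is the situation the rest of this thread is concerned with.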

Then I checked if there are jail parameters that could help, but there
doesn't seem to be anything like "allow.sysvshm" for POSIX shared
memory to limit access to the feature.

So, unless I'm missing something, it seems like all jails on a system
have unlimited access to POSIX shared memory and therefore any single
jail can use up the jailhost's virtual memory until the jailhost comes
to a grinding halt.

I wrote a little test program that keeps allocating POSIX shared memory
inside of a jail and it can easily bring the host down to its knees:

login: Aug 2 12:12:09 test kernel: pid 11825 (getty), jid 0, uid 0, was killed: out of swap space
Aug 2 12:12:10 test init[11827]: getty repeating too quickly on port /dev/ttyu0, sleeping 30 secs
Aug 2 12:12:10 test kernel: pid 11826 (getty), jid 0, uid 0, was killed: out of swap space

Best,
Michael

--
Michael Gmelin

From nobody Mon Aug 2 13:55:48 2021
Date: Mon, 2 Aug 2021 16:55:48 +0300
From: Konstantin Belousov
To: Michael Gmelin
Cc: jail@freebsd.org
Subject: Re: POSIX shared memory, jails, and (lack of) limits
In-Reply-To: <20210802141900.069d0051@bsd64.grem.de>
References: <20210802141900.069d0051@bsd64.grem.de>

On Mon, Aug 02, 2021 at 02:19:00PM +0200, Michael Gmelin wrote:
> Hi,
>
> I've been playing a bit with POSIX shared memory and, unlike for SysV
> shared memory, I couldn't find any way to limit its use by jails.
>
> First, I looked at racct/rctl, but there is no resource for POSIX shared
> memory and memoryuse/vmemoryuse don't seem to have an effect (which
> makes sense).
>
> Then I checked if there are jail parameters that could help, but there
> doesn't seem to be anything like "allow.sysvshm" for POSIX shared
> memory to limit access to the feature.
>
> So, unless I'm missing something, it seems like all jails on a system
> have unlimited access to POSIX shared memory and therefore any single
> jail can use up the jailhost's virtual memory until the jailhost comes
> to a grinding halt.
>
> I wrote a little test program that keeps allocating POSIX shared memory
> inside of a jail and it can easily bring the host down to its knees:
>
> login: Aug 2 12:12:09 test kernel: pid 11825 (getty), jid 0, uid 0,
> was killed: out of swap space
> Aug 2 12:12:10 test init[11827]: getty repeating too quickly on port
> /dev/ttyu0, sleeping 30 secs
> Aug 2 12:12:10 test kernel: pid 11826 (getty), jid 0, uid 0, was
> killed: out of swap space

Posix shm is limited by the swap accounting. For non-jail consumers,
it is per-uid RLIMIT_SWAP. I do not know if other mechanisms make
RLIMIT_SWAP per-jail per-uid.

From nobody Mon Aug 2 15:06:43 2021
Date: Mon, 2 Aug 2021 17:06:43 +0200
From: Michael Gmelin
To: Konstantin Belousov
Cc: jail@freebsd.org
Subject: Re: POSIX shared memory, jails, and (lack of) limits

> On 2. Aug 2021, at 15:56, Konstantin Belousov wrote:
>
> On Mon, Aug 02, 2021 at 02:19:00PM +0200, Michael Gmelin wrote:
>> Hi,
>>
>> I've been playing a bit with POSIX shared memory and, unlike for SysV
>> shared memory, I couldn't find any way to limit its use by jails.
>>
>> First, I looked at racct/rctl, but there is no resource for POSIX shared
>> memory and memoryuse/vmemoryuse don't seem to have an effect (which
>> makes sense).
>>
>> Then I checked if there are jail parameters that could help, but there
>> doesn't seem to be anything like "allow.sysvshm" for POSIX shared
>> memory to limit access to the feature.
>>
>> So, unless I'm missing something, it seems like all jails on a system
>> have unlimited access to POSIX shared memory and therefore any single
>> jail can use up the jailhost's virtual memory until the jailhost comes
>> to a grinding halt.
>>
>> I wrote a little test program that keeps allocating POSIX shared memory
>> inside of a jail and it can easily bring the host down to its knees:
>>
>> login: Aug 2 12:12:09 test kernel: pid 11825 (getty), jid 0, uid 0,
>> was killed: out of swap space
>> Aug 2 12:12:10 test init[11827]: getty repeating too quickly on port
>> /dev/ttyu0, sleeping 30 secs
>> Aug 2 12:12:10 test kernel: pid 11826 (getty), jid 0, uid 0, was
>> killed: out of swap space
>
> Posix shm is limited by the swap accounting. For non-jail consumers,
> it is per-uid RLIMIT_SWAP. I do not know if other mechanisms make
> RLIMIT_SWAP per-jail per-uid.

Unfortunately it seems like POSIX shared memory is not linked to the
jail it was created in (we discussed this on this list in June and I
created a few PRs about that), so per jail rctl rules don’t apply (and
limiting uid 0 won’t have the desired effect ^_^).
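
As a rough illustration of the per-uid limit mentioned in the quoted reply, the sketch below reads RLIMIT_SWAP for the current process. The helpers `swap_limit` and `describe` are hypothetical names chosen here. Python only exposes `resource.RLIMIT_SWAP` on FreeBSD, so on other systems `swap_limit` returns None. Per getrlimit(2), the limit is enforced only when bit 1 of the vm.overcommit sysctl is set.

```python
import resource


def swap_limit():
    """Return (soft, hard) RLIMIT_SWAP in bytes, or None if unsupported."""
    # RLIMIT_SWAP is a FreeBSD-only constant in the resource module.
    rlimit = getattr(resource, "RLIMIT_SWAP", None)
    if rlimit is None:
        return None
    return resource.getrlimit(rlimit)


def describe(limit):
    """Format a (soft, hard) pair as returned by swap_limit()."""
    if limit is None:
        return "RLIMIT_SWAP not available on this platform"

    def fmt(value):
        if value == resource.RLIM_INFINITY:
            return "unlimited"
        return f"{value} bytes"

    soft, hard = limit
    return f"soft={fmt(soft)} hard={fmt(hard)}"


if __name__ == "__main__":
    print(describe(swap_limit()))
```

Because the limit is accounted per uid, a value set this way covers all processes of that user and is not a per-jail cap.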

Best
Michael

From nobody Mon Aug 2 19:03:27 2021
Date: Mon, 2 Aug 2021 22:03:27 +0300
From: Konstantin Belousov
To: Michael Gmelin
Cc: jail@freebsd.org
Subject: Re: POSIX shared memory, jails, and (lack of) limits

On Mon, Aug 02, 2021 at 05:06:43PM +0200, Michael Gmelin wrote:
>
>
> > On 2. Aug 2021, at 15:56, Konstantin Belousov wrote:
> >
> > On Mon, Aug 02, 2021 at 02:19:00PM +0200, Michael Gmelin wrote:
> >> Hi,
> >>
> >> I've been playing a bit with POSIX shared memory and, unlike for SysV
> >> shared memory, I couldn't find any way to limit its use by jails.
> >>
> >> First, I looked at racct/rctl, but there is no resource for POSIX shared
> >> memory and memoryuse/vmemoryuse don't seem to have an effect (which
> >> makes sense).
> >>
> >> Then I checked if there are jail parameters that could help, but there
> >> doesn't seem to be anything like "allow.sysvshm" for POSIX shared
> >> memory to limit access to the feature.
> >>
> >> So, unless I'm missing something, it seems like all jails on a system
> >> have unlimited access to POSIX shared memory and therefore any single
> >> jail can use up the jailhost's virtual memory until the jailhost comes
> >> to a grinding halt.
> >>
> >> I wrote a little test program that keeps allocating POSIX shared memory
> >> inside of a jail and it can easily bring the host down to its knees:
> >>
> >> login: Aug 2 12:12:09 test kernel: pid 11825 (getty), jid 0, uid 0,
> >> was killed: out of swap space
> >> Aug 2 12:12:10 test init[11827]: getty repeating too quickly on port
> >> /dev/ttyu0, sleeping 30 secs
> >> Aug 2 12:12:10 test kernel: pid 11826 (getty), jid 0, uid 0, was
> >> killed: out of swap space
> >
> > Posix shm is limited by the swap accounting. For non-jail consumers,
> > it is per-uid RLIMIT_SWAP. I do not know if other mechanisms make
> > RLIMIT_SWAP per-jail per-uid.
>
> Unfortunately it seems like POSIX shared memory is not linked to the
> jail it was created in (we discussed this on this list in June and I
> created a few PRs about that), so per jail rctl rules don’t apply (and
> limiting uid 0 won’t have the desired effect ^_^).
>

In what sense 'not linked'? The backing vm_object is created with the
current process credentials, which are jailed if creator belongs to a jail.

From nobody Mon Aug 2 19:40:00 2021
Date: Mon, 2 Aug 2021 15:40:00 -0400
From: Mark Johnston
To: Konstantin Belousov
Cc: Michael Gmelin, jail@freebsd.org
Subject: Re: POSIX shared memory, jails, and (lack of) limits

On Mon, Aug 02, 2021 at 10:03:27PM +0300, Konstantin Belousov wrote:
> On Mon, Aug 02, 2021 at 05:06:43PM +0200, Michael Gmelin wrote:
> >
> >
> > > On 2. Aug 2021, at 15:56, Konstantin Belousov wrote:
> > >
> > > On Mon, Aug 02, 2021 at 02:19:00PM +0200, Michael Gmelin wrote:
> > >> Hi,
> > >>
> > >> I've been playing a bit with POSIX shared memory and, unlike for SysV
> > >> shared memory, I couldn't find any way to limit its use by jails.
> > >>
> > >> First, I looked at racct/rctl, but there is no resource for POSIX shared
> > >> memory and memoryuse/vmemoryuse don't seem to have an effect (which
> > >> makes sense).

Cyril has written a few patches for racct, including one which includes
POSIX shared memory objects in rctl's "nshm" and "shmsize" resources,
which currently only apply to SysV shm objects:
https://reviews.freebsd.org/D30775
We plan to get them committed in the next couple of weeks.

"memoryuse" and "vmemoryuse" only count objects that are mapped into
some process' address space, so they're not the right way to limit
allocations of POSIX shm objects, see below.

> > >>
> > >> Then I checked if there are jail parameters that could help, but there
> > >> doesn't seem to be anything like "allow.sysvshm" for POSIX shared
> > >> memory to limit access to the feature.
> > >>
> > >> So, unless I'm missing something, it seems like all jails on a system
> > >> have unlimited access to POSIX shared memory and therefore any single
> > >> jail can use up the jailhost's virtual memory until the jailhost comes
> > >> to a grinding halt.
> > >>
> > >> I wrote a little test program that keeps allocating POSIX shared memory
> > >> inside of a jail and it can easily bring the host down to its knees:
> > >>
> > >> login: Aug 2 12:12:09 test kernel: pid 11825 (getty), jid 0, uid 0,
> > >> was killed: out of swap space
> > >> Aug 2 12:12:10 test init[11827]: getty repeating too quickly on port
> > >> /dev/ttyu0, sleeping 30 secs
> > >> Aug 2 12:12:10 test kernel: pid 11826 (getty), jid 0, uid 0, was
> > >> killed: out of swap space
> > >
> > > Posix shm is limited by the swap accounting.
For non-jail consumers, > > > it is per-uid RLIMIT_SWAP. I do not know if other mechanisms make > > > RLIMIT_SWAP per-jail per-uid. racct/rctl provides the "swapuse" resource which should account for this. It does not apply to largepage objects, though. > > Unfortunately it seems like POSIX shared memory is not linked to the jail it was created in (we discussed this on this list in June and I created a few PRs about that), so per jail rctl rules don’t apply (and limiting uid 0 won’t have the desired effect ^_^). > > > > In what sense 'not linked'? The backing vm_object is created with the > current process credentials, which are jailed if creator belongs to a jail. I believe the problem that Michael is referring to is that named POSIX shm objects created within a jail do not disappear when the jail is destroyed, and the vm object cred reference is leaked. But this is unrelated to swap space accounting. From nobody Mon Aug 2 19:58:08 2021 X-Original-To: jail@mlmmj.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mlmmj.nyi.freebsd.org (Postfix) with ESMTP id 0920012B9F70 for ; Mon, 2 Aug 2021 19:58:20 +0000 (UTC) (envelope-from thomas@gibfest.dk) Received: from smtp2.servers.tyknet.dk (smtp2.servers.tyknet.dk [89.233.43.78]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 4Gdphq6KCzz3CKd; Mon, 2 Aug 2021 19:58:19 +0000 (UTC) (envelope-from thomas@gibfest.dk) Subject: Re: POSIX shared memory, jails, and (lack of) limits DKIM-Filter: OpenDKIM Filter v2.10.3 smtp2.servers.tyknet.dk 879FC85A DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=gibfest.dk; s=default; t=1627934291; bh=bYPWq0pRB74nVJX9EWcBCN0VzZZFNJ6RcVBrhHkCrw0=; h=Subject:To:Cc:References:From:Date:In-Reply-To; b=of/N8YYQDg61+yad8gbC/ca7aheJp777hm+2HHC57eStWpP/ZvKEsdAWFgs6n6Cg1 
UCx4vA021oYTyBH3aRwTAewZDlt/rwIq/rAD2X+s3frvlxLVNF50B8p8V1q/ZIASF7 VNqsl/9eog6axRXECZhLI9M9xyCJvT6CuaZZWi0IxHLUYzH8COibBTBX4fd9/4MPya 6IBaxp00xt34xOHOPX1/B7asHr5sHy1iiKdFYoo6Xz1t8KYQdaYoecdBfx6mPdI2AF 6OKOqMtiY7QLlqxVVrK4YIS1ZWUSu/0bviJFfkN2fO/aqVy27SPhQAqxXe0xNoH6it pJVFHv8lygptA==
To: Mark Johnston
Cc: Michael Gmelin , jail@freebsd.org, Konstantin Belousov
References:
Message-ID: <51d4462f-1958-3380-9973-365e018e533f@gibfest.dk>
Date: Mon, 2 Aug 2021 21:58:08 +0200
List-Id: Discussion about FreeBSD jail(8)
List-Archive: https://lists.freebsd.org/archives/freebsd-jail
List-Help:
List-Post:
List-Subscribe:
List-Unsubscribe:
Sender: owner-freebsd-jail@freebsd.org
MIME-Version: 1.0
In-Reply-To:
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Language: en-US
Content-Transfer-Encoding: 7bit
X-Rspamd-Queue-Id: 4Gdphq6KCzz3CKd
X-Spamd-Bar: ----
Authentication-Results: mx1.freebsd.org; none
X-Spamd-Result: default: False [-4.00 / 15.00]; REPLY(-4.00)[]
Reply-To: thomas@gibfest.dk
From: Thomas Steen Rasmussen via jail
X-Original-From: Thomas Steen Rasmussen
X-ThisMailContainsUnwantedMimeParts: N

On 8/2/21 9:40 PM, Mark Johnston wrote:
> Cyril has written a few patches for racct, including one which includes
> POSIX shared memory objects in rctl's "nshm" and "shmsize" resources,
> which currently only apply to SysV shm objects:
> https://reviews.freebsd.org/D30775
> We plan to get them committed in the next couple of weeks.
>
Hello,

I haven't looked at it for a bit, but the last time I tried to use
sysutils/jail_exporter to get graphs for jail resource usage the graphs
for Postgres jails were hilariously wrong, which I believe I tracked
down to shared memory being counted more than once.

I gave up trying to figure out how to fix it and just lived with Grafana
telling me a postgres jail on a 128gb jailhost used 900gb of memory.

But it sounds like the above might fix this?

Thanks!
Best regards, Thomas Steen Rasmussen From nobody Mon Aug 2 20:38:54 2021 X-Original-To: jail@mlmmj.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mlmmj.nyi.freebsd.org (Postfix) with ESMTP id 1EEC212BCF7B for ; Mon, 2 Aug 2021 20:39:05 +0000 (UTC) (envelope-from freebsd@grem.de) Received: from mail.evolve.de (mail.evolve.de [213.239.217.29]) (using TLSv1.3 with cipher TLS_CHACHA20_POLY1305_SHA256 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA512 client-signature RSA-PSS (4096 bits) client-digest SHA256) (Client CN "mail.evolve.de", Issuer "R3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 4Gdqbr5BKpz3Gm5; Mon, 2 Aug 2021 20:39:04 +0000 (UTC) (envelope-from freebsd@grem.de) Received: by mail.evolve.de (OpenSMTPD) with ESMTP id b5cd62cc; Mon, 2 Aug 2021 20:38:55 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=grem.de; h=content-type :content-transfer-encoding:mime-version:subject:from:in-reply-to :date:cc:message-id:references:to; s=20180501; bh=RjfMQ9ZSSoTygL CnVKqm0Dfud4M=; b=cXl+ggoyulzmtN5hhk+t8QOqNyjUqfR9wV/F+9FDZQz9CA U5l+CabntfcSBV3CfMBNw9HUuFTBmxHfKi2dw4sN3tKnCMjCJaN7Pdk1DprgqiN2 oTI1OC3rC3H29IRLSruuzY300xNSbsXJbxWv8POHCUbpqm0J/GC6V2fFTFgOD6Va tsOv6Vm/QWqX3M0BOkzrf7ydOduAvMWMczI5rEOHLfAm929yit2vhYQwVeZAlfwR qU7ykizBoIa20bSPTDMZ5/mFz8sLqy5J4opsoDAGIepDRCXfeLAddypDPRFCSxcX mMflAICY4OVQBz+nnWWhSALVBUS/vSt96w/LWYSQ== DomainKey-Signature: a=rsa-sha1; c=nofws; d=grem.de; h=content-type :content-transfer-encoding:mime-version:subject:from:in-reply-to :date:cc:message-id:references:to; q=dns; s=20180501; b=MTaglQ8l fabzJvhrGenVsPJ+BngkRrUA3QtBD3fGgSJ14bdsfsTL9GS3f7Agn75aAjrQq0HS hsreBDiHN49b5tfL3+GqtO1CsDcRrTIvyvdWhlsb8GjltGmpIIvJ1pT6Vx2dIdE9 HxLyLul/TktLxT22WHIlQy42IcjCOE5WatjGuqnYCfETlFVGS4WU5a5gff/ERHQT p+n57RWArBY2NnXIvRU2bPrr2INvoa7isLdXJ1JMKNbbEw67siILhQW/lm1JAAEK LPW6tUe3Ttal8q1CjPDmlkJZRaMNBZnr1Jj5frsIcJ1LfoFvIDZNNPRjSosx6QZc bllQE9zU+hCxSA== 
Received: by mail.evolve.de (OpenSMTPD) with ESMTPSA id 90364455 (TLSv1.3:AEAD-CHACHA20-POLY1305-SHA256:256:NO); Mon, 2 Aug 2021 20:38:55 +0000 (UTC)
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable
List-Id: Discussion about FreeBSD jail(8)
List-Archive: https://lists.freebsd.org/archives/freebsd-jail
List-Help:
List-Post:
List-Subscribe:
List-Unsubscribe:
Sender: owner-freebsd-jail@freebsd.org
Mime-Version: 1.0 (1.0)
Subject: Re: POSIX shared memory, jails, and (lack of) limits
From: Michael Gmelin
In-Reply-To:
Date: Mon, 2 Aug 2021 22:38:54 +0200
Cc: Konstantin Belousov , jail@freebsd.org
Message-Id: <26D98CA9-B4ED-4BCB-935D-1EB8EBDA8F5D@grem.de>
References:
To: Mark Johnston
X-Mailer: iPhone Mail (18F72)
X-Rspamd-Queue-Id: 4Gdqbr5BKpz3Gm5
X-Spamd-Bar: ----
Authentication-Results: mx1.freebsd.org; none
X-Spamd-Result: default: False [-4.00 / 15.00]; REPLY(-4.00)[]
X-Spam: Yes
X-ThisMailContainsUnwantedMimeParts: N

> On 2. Aug 2021, at 21:40, Mark Johnston wrote:
>
> On Mon, Aug 02, 2021 at 10:03:27PM +0300, Konstantin Belousov wrote:
>>> On Mon, Aug 02, 2021 at 05:06:43PM +0200, Michael Gmelin wrote:
>>>
>>>
>>>> On 2. Aug 2021, at 15:56, Konstantin Belousov wrote:
>>>>
>>>> On Mon, Aug 02, 2021 at 02:19:00PM +0200, Michael Gmelin wrote:
>>>>> Hi,
>>>>>
>>>>> I've been playing a bit with POSIX shared memory and, unlike for SysV
>>>>> shared memory, I couldn't find any way to limit its use by jails.
>>>>>
>>>>> First, I looked at racct/rctl, but there is no resource for POSIX shared
>>>>> memory and memoryuse/vmemoryuse don't seem to have an effect (which
>>>>> makes sense).
>
> Cyril has written a few patches for racct, including one which includes
> POSIX shared memory objects in rctl's "nshm" and "shmsize" resources,
> which currently only apply to SysV shm objects:
> https://reviews.freebsd.org/D30775
> We plan to get them committed in the next couple of weeks.
>
> "memoryuse" and "vmemoryuse" only count objects that are mapped into
> some process' address space, so they're not the right way to limit
> allocations of POSIX shm objects, see below.
>
>>>>>
>>>>> Then I checked if there are jail parameters that could help, but there
>>>>> doesn't seem to be anything like "allow.sysvshm" for POSIX shared
>>>>> memory to limit access to the feature.
>>>>>
>>>>> So, unless I'm missing something, it seems like all jails on a system
>>>>> have unlimited access to POSIX shared memory and therefore any single
>>>>> jail can use up the jailhost's virtual memory until the jailhost comes
>>>>> to a grinding halt.
>>>>>
>>>>> I wrote a little test program that keeps allocating POSIX shared memory
>>>>> inside of a jail and it can easily bring the host down to its knees:
>>>>>
>>>>> login: Aug 2 12:12:09 test kernel: pid 11825 (getty), jid 0, uid 0,
>>>>> was killed: out of swap space
>>>>> Aug 2 12:12:10 test init[11827]: getty repeating too quickly on port
>>>>> /dev/ttyu0, sleeping 30 secs
>>>>> Aug 2 12:12:10 test kernel: pid 11826 (getty), jid 0, uid 0, was
>>>>> killed: out of swap space
>>>>
>>>> Posix shm is limited by the swap accounting. For non-jail consumers,
>>>> it is per-uid RLIMIT_SWAP. I do not know if other mechanisms make
>>>> RLIMIT_SWAP per-jail per-uid.
>
> racct/rctl provides the "swapuse" resource which should account for
> this. It does not apply to largepage objects, though.

I tried to limit swapuse for a jail and it doesn’t limit posix shared memory created within the jail (I can still create shared memory segments within the jail until the machine runs out of virtual memory).

Should I share the test case to make sure I didn’t mess up?
-m

>
>>> Unfortunately it seems like POSIX shared memory is not linked to the jail it was created in (we discussed this on this list in June and I created a few PRs about that), so per jail rctl rules don’t apply (and limiting uid 0 won’t have the desired effect ^_^).
>>>
>>
>> In what sense 'not linked'? The backing vm_object is created with the
>> current process credentials, which are jailed if creator belongs to a jail.
>
> I believe the problem that Michael is referring to is that named POSIX
> shm objects created within a jail do not disappear when the jail is
> destroyed, and the vm object cred reference is leaked. But this is
> unrelated to swap space accounting.

From nobody Tue Aug 3 13:50:55 2021
X-Original-To: jail@mlmmj.nyi.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mlmmj.nyi.freebsd.org (Postfix) with ESMTP id CEC9212DADCF for ; Tue, 3 Aug 2021 13:50:54 +0000 (UTC) (envelope-from markjdb@gmail.com)
Received: from mail-qv1-xf2e.google.com (mail-qv1-xf2e.google.com [IPv6:2607:f8b0:4864:20::f2e]) (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256 client-signature RSA-PSS (2048 bits) client-digest SHA256) (Client CN "smtp.gmail.com", Issuer "GTS CA 1O1" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 4GfGVQ5Bk9z4X0w for ; Tue, 3 Aug 2021 13:50:54 +0000 (UTC) (envelope-from markjdb@gmail.com)
Received: by mail-qv1-xf2e.google.com with SMTP id f91so10583464qva.9 for ; Tue, 03 Aug 2021 06:50:54 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=sender:date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=71eB9m6CTkeEylcrGs1ZJ5I/qfbvFJlcHKutTKu1YxI=; b=DRhOZ3UA5B2Fyopvw4kuGYc/RU3aTNdXMnFwmPcl4HvpqJwsI/BHIxbNaWCPiuXlDb rVAJPibQSm3OFtl8ut3ksV+EXZCIxL0sI2lnYxmrQteEZRBNBdA+HAR4H56lxOeG4u6L
hkoUDbdrf/7aLRRUvSPYvBxMP58V0/btrg1FVo0b5xOHYBtYFWS0ax6qgBJGGoFW7r8H AvCscJdcNWs2/oHRO2eoqsKy60A9ikk9ge2wdQhErGg57ddQ/WpRqtQMhQTPT2gYK1WT JRO0eWuKVDcqxIDTEv2ljO6v2SijdJ0a+R6zlwz7vTbi2ZFLwPD9hfvraCId6WwHLCNw ZHzQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:sender:date:from:to:cc:subject:message-id :references:mime-version:content-disposition:in-reply-to; bh=71eB9m6CTkeEylcrGs1ZJ5I/qfbvFJlcHKutTKu1YxI=; b=agUhCZcFP02037XiZSh356g99Evs7jD6mGUarsl9nJrU3BV772G6jKK4YqfnKeULDX cb94S902G5kFllELOu4N8yVOQ3o/2ND3QrM86mzs+F270405tdIHakJgtMxqypZZASXN pNBqJHTeGUfqvSYk0SjHPor1FsxuffeRSq/4cOcPiH3DLlK5DUJQqoN3DZHvdAAtXj7Z 0uASXz/nqSm/zqy8rqsmCK2RXot3Psw2We0wXcA3V44RNRef+cx4Pivo9zGVpsRQrtBl gG8mY3sirXIvTtPIJzyK4DJj0P0pLkHnjUxCkJ3cV+0K4NovCSaYyZOxa0dLMgPZcLgN yOqQ== X-Gm-Message-State: AOAM531Eev6T6nby0UobKUzQRBjsUUbW89RHcj61rKEliUNg8F0imGuj 832AkwbvD7WtZfYFf5LIRmw= X-Google-Smtp-Source: ABdhPJyWQMAO3o7kAEl/UMOQWadFAuDpPP/WIrweyazr/ageVIYrHfenCOgYdfz1NCXAFoZTSdoBKQ== X-Received: by 2002:a0c:ff01:: with SMTP id w1mr21658635qvt.28.1627998654415; Tue, 03 Aug 2021 06:50:54 -0700 (PDT) Received: from nuc ([142.126.162.193]) by smtp.gmail.com with ESMTPSA id r5sm5974326qtm.75.2021.08.03.06.50.53 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 03 Aug 2021 06:50:54 -0700 (PDT) Date: Tue, 3 Aug 2021 09:50:55 -0400 From: Mark Johnston To: Thomas Steen Rasmussen Cc: Michael Gmelin , jail@freebsd.org, Konstantin Belousov Subject: Re: POSIX shared memory, jails, and (lack of) limits Message-ID: References: <51d4462f-1958-3380-9973-365e018e533f@gibfest.dk> List-Id: Discussion about FreeBSD jail(8) List-Archive: https://lists.freebsd.org/archives/freebsd-jail List-Help: List-Post: List-Subscribe: List-Unsubscribe: Sender: owner-freebsd-jail@freebsd.org MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <51d4462f-1958-3380-9973-365e018e533f@gibfest.dk> 
X-Rspamd-Queue-Id: 4GfGVQ5Bk9z4X0w
X-Spamd-Bar: ----
Authentication-Results: mx1.freebsd.org; none
X-Spamd-Result: default: False [-4.00 / 15.00]; REPLY(-4.00)[]
X-ThisMailContainsUnwantedMimeParts: N

On Mon, Aug 02, 2021 at 09:58:08PM +0200, Thomas Steen Rasmussen wrote:
> On 8/2/21 9:40 PM, Mark Johnston wrote:
> > Cyril has written a few patches for racct, including one which includes
> > POSIX shared memory objects in rctl's "nshm" and "shmsize" resources,
> > which currently only apply to SysV shm objects:
> > https://reviews.freebsd.org/D30775
> > We plan to get them committed in the next couple of weeks.
> >
> Hello,
>
> I haven't looked at it for a bit, but the last time I tried to use
> sysutils/jail_exporter to get graphs for jail resource usage the graphs
> for Postgres jails were hilariously wrong, which I believe I tracked
> down to shared memory being counted more than once.
>
> I gave up trying to figure out how to fix it and just lived with Grafana
> telling me a postgres jail on a 128gb jailhost used 900gb of memory.
>
> But it sounds like the above might fix this?

I'm not familiar with jail_exporter, which resource counters is it
fetching? I doubt this change on its own will solve the problem, it
simply includes POSIX shm objects in racct's shared memory accounting.
The mechanisms we use for memory usage are not changed there.
From nobody Fri Aug 6 15:53:34 2021 X-Original-To: jail@mlmmj.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mlmmj.nyi.freebsd.org (Postfix) with ESMTP id 734791347A0C for ; Fri, 6 Aug 2021 15:53:58 +0000 (UTC) (envelope-from freebsd@grem.de) Received: from mail.evolve.de (mail.evolve.de [213.239.217.29]) (using TLSv1.3 with cipher TLS_CHACHA20_POLY1305_SHA256 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA512 client-signature ECDSA (P-384) client-digest SHA384) (Client CN "mail.evolve.de", Issuer "R3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 4Gh9510jxgz4nMf; Fri, 6 Aug 2021 15:53:56 +0000 (UTC) (envelope-from freebsd@grem.de) Received: by mail.evolve.de (OpenSMTPD) with ESMTP id 7825ba30; Fri, 6 Aug 2021 15:53:47 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=grem.de; h=date:from:to:cc :subject:message-id:in-reply-to:references:mime-version :content-type:content-transfer-encoding; s=20180501; bh=lFRwa7d7 cCbioBUQo/c2wQ+Kpcg=; b=isnMXD07fUJsFUxOJiBtE3l4YxzjdVUZWq5o2iVM rJ2UYM6GQX2lNfF8w3Mp7fcQTnxJn343M1nKw5zJAuGxKuic1glYqL7ngtlMnYE6 ruQ4mDIxQJDF5h5Zwu/k/X0lVOYd6FUfcPVSmvK3FrytAZ14O6fSRZKH9jwJa1XI MS7ECbxSyP1GMJUn0FN6/wJ9Ff52N3XBK7TiENEXOjnN5+Jksl4mf73n6AW9e52p W/3c6Cl4Zfc5bDFzNJTOXJKtrZ5Q53ZqCS/YHAoVct65HDeBmtJU0j1XN1mgGDga 2dgEylS1eGQdLPNIk1+Cw5rkoosGMZddH2ktrw6mLhmcoA== DomainKey-Signature: a=rsa-sha1; c=nofws; d=grem.de; h=date:from:to:cc :subject:message-id:in-reply-to:references:mime-version :content-type:content-transfer-encoding; q=dns; s=20180501; b=nS 7OcPGnHHYjxy/zkGh6Ar8ZoiMwwj+eg9H5w8qAsd0dE8i9n6vA1SYV9YEEhJQ7An umv+eiNbWGD6rY2zfGKDPiTY6Jddfgd7P+ZZiLK8pr6rvdK84YoLn2ibq1Kdw4ol GOYysGhXxthC6+ExcZsS8MdGjAwMvyGJiECFHNUwGFRIpWuXyp4vkTVTvI4TDBGP uoGnb1wZqWCBiykHKU3veKTHlmx6oOeGDAfIjizB/3prMk0QXgW+782glVpZB+4Y rnYUbAepzUBtcERILlmX0LD83a+87pFVQnyVeXXr6KqUhs4BuURNhosHlbFpB3hR lB9Phmug0ZMNbAN5amJw== Received: by mail.evolve.de (OpenSMTPD) with 
ESMTPSA id b9997e4b (TLSv1.3:AEAD-CHACHA20-POLY1305-SHA256:256:NO); Fri, 6 Aug 2021 15:53:46 +0000 (UTC) Date: Fri, 6 Aug 2021 17:53:34 +0200 From: Michael Gmelin To: Mark Johnston Cc: Konstantin Belousov , jail@freebsd.org Subject: Re: POSIX shared memory, jails, and (lack of) limits Message-ID: <20210806175334.12d5d943@bsd64.grem.de> In-Reply-To: <26D98CA9-B4ED-4BCB-935D-1EB8EBDA8F5D@grem.de> References: <26D98CA9-B4ED-4BCB-935D-1EB8EBDA8F5D@grem.de> X-Face: $wrgCtfdVw_H9WAY?S&9+/F"!41z'L$uo*WzT8miX?kZ~W~Lr5W7v?j0Sde\mwB&/ypo^}> +a'4xMc^^KroE~+v^&^#[B">soBo1y6(TW6#UZiC]o>C6`ej+i Face: iVBORw0KGgoAAAANSUhEUgAAADAAAAAwBAMAAAClLOS0AAAAJFBMVEWJBwe5BQDl LASZU0/LTEWEfHbyj0Txi32+sKrp1Mv944X8/fm1rS+cAAAACXBIWXMAAAsTAAAL EwEAmpwYAAAAB3RJTUUH3wESCxwC7OBhbgAAACFpVFh0Q29tbWVudAAAAAAAQ3Jl YXRlZCB3aXRoIFRoZSBHSU1QbbCXAAAAAghJREFUOMu11DFvEzEUAGCfEhBVFzuq AKkLd0O6VrIQsLXVSZXoWE5N1K3DobBBA9fQpRWc8OkWouaIjedWKiyREOKs+3PY fvalCNjgLVHeF7/3bMtBzV8C/VsQ8tecEgCcDgrzjekwKZ7TwsJZd/ywEKwwP+ZM 8P3drTsAwWn2mpWuDDuYiK1bFs6De0KUUFw0tWxm+D4AIhuuvZqtyWYeO7jQ4Aea 7jUqI+ixhQoHex4WshEvSXdood7stlv4oSuFOC4tqGcr0NjEqXgV4mMJO38nld4+ xKNxRDon7khyKVqY7YR4d+Cg0OMrkWXZOM7YDkEfKiilCn1qYv4mighZiynuHHOA Wq9QJq+BIES7lMFUtcikMnkDGHUoncA+uHgrP0ctIEqfwLHzeSo+eUA66AqzwN6n 2ZHJhw6Qh/PoyC/QENyEyC/AyNjq74Bs+3UH0xYwzDUC4B97HgLocg1QLYgDDO1v f3UX9Y307Ew4AHh67YAFFsxEpkXwpXY3eIgMhAAE3R19L919nNnuD2wlPcDE3UeT L2ytEICQib9BXgS2fU8PrD82ToYO1OEmMSnYTjSqSv9wdC0tPYC+rQRQD9ESnldF CyqfmiYW+tlALt8gH2xrMdC/youbjzPXEun+/ReXsMCDyve3dZc09fn2Oas8oXGc Jj6/fOeK5UmSMPmf/jL+GD8BEj0k/Fn6IO4AAAAASUVORK5CYII= List-Id: Discussion about FreeBSD jail(8) List-Archive: https://lists.freebsd.org/archives/freebsd-jail List-Help: List-Post: List-Subscribe: List-Unsubscribe: Sender: owner-freebsd-jail@freebsd.org MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable X-Rspamd-Queue-Id: 4Gh9510jxgz4nMf X-Spamd-Bar: --- Authentication-Results: mx1.freebsd.org; dkim=pass header.d=grem.de header.s=20180501 
header.b=isnMXD07; dmarc=none; spf=pass (mx1.freebsd.org: domain of freebsd@grem.de designates 213.239.217.29 as permitted sender) smtp.mailfrom=freebsd@grem.de
X-Spamd-Result: default: False [-3.31 / 15.00]; RCVD_VIA_SMTP_AUTH(0.00)[]; ARC_NA(0.00)[]; R_DKIM_ALLOW(-0.20)[grem.de:s=20180501]; NEURAL_HAM_MEDIUM(-1.00)[-1.000]; FROM_HAS_DN(0.00)[]; RCPT_COUNT_THREE(0.00)[3]; TO_DN_SOME(0.00)[]; R_SPF_ALLOW(-0.20)[+ip4:213.239.217.29/32]; MIME_GOOD(-0.10)[text/plain]; DMARC_NA(0.00)[grem.de]; NEURAL_HAM_LONG(-1.00)[-1.000]; RCVD_COUNT_THREE(0.00)[3]; TO_MATCH_ENVRCPT_SOME(0.00)[]; MID_RHS_MATCH_FROMTLD(0.00)[]; DKIM_TRACE(0.00)[grem.de:+]; NEURAL_HAM_SHORT(-0.81)[-0.811]; FROM_EQ_ENVFROM(0.00)[]; MIME_TRACE(0.00)[0:+]; ASN(0.00)[asn:24940, ipnet:213.239.192.0/18, country:DE]; RCVD_TLS_ALL(0.00)[]; FREEMAIL_CC(0.00)[gmail.com,freebsd.org]
X-ThisMailContainsUnwantedMimeParts: N

On Mon, 2 Aug 2021 22:38:54 +0200
Michael Gmelin wrote:

> > On 2. Aug 2021, at 21:40, Mark Johnston wrote:
> > ...
> > racct/rctl provides the "swapuse" resource which should account for
> > this. It does not apply to largepage objects, though.
>
> I tried to limit swapuse for a jail and it doesn’t limit posix shared
> memory created within the jail (I can still create shared memory
> segments within the jail until the machine runs out of virtual
> memory).
>
> Should I share the test case to make sure I didn’t mess up?

See a stripped down example below (originally I did this in a proper
jail).
Cheers
Michael

(resent, originally off-list)

cat >/tmp/shmtest.c <<EOF
/* Header names were lost in the archive; these are what the calls below need. */
#include <sys/types.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define SEG_LEN (1024 * 1024 * 10)

int main(int argc, char** argv) {
  /* Create (or open) the named object and grow it to SEG_LEN bytes. */
  int fd = shm_open(argv[1], O_CREAT | O_RDWR, S_IRUSR | S_IWUSR);
  ftruncate(fd, SEG_LEN);
  char* ptr = mmap(NULL, SEG_LEN,
    PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
  /* Touch every page so the memory is really allocated. */
  memset(ptr, 0xff, SEG_LEN);
}
EOF
cc -o /tmp/shmtest /tmp/shmtest.c
rctl -a jail:test:vmemoryuse:deny=500M
rctl -a jail:test:memoryuse:deny=400M
rctl -a jail:test:swapuse:deny=200M
jail -c path=/ name=test \
  command=sh -c 'for name in $(jot 1000); do /tmp/shmtest /$name; done'

-- 
Michael Gmelin