Date: Wed, 8 Aug 2012 22:10:03 GMT
From: Konstantin Belousov <konstantin.belousov@zoral.com.ua>
To: freebsd-amd64@FreeBSD.org
Subject: Re: amd64/170351: [patch] amd64: 64-bit process can't always get unlimited rlimit
Message-ID: <201208082210.q78MA3kS036202@freefall.freebsd.org>
The following reply was made to PR amd64/170351; it has been noted by GNATS.

From: Konstantin Belousov <konstantin.belousov@zoral.com.ua>
To: Ming Qiao <mqiao@juniper.net>
Cc: Erin MacNeil <emacneil@juniper.net>, freebsd-gnats-submit@freebsd.org
Subject: Re: amd64/170351: [patch] amd64: 64-bit process can't always get unlimited rlimit
Date: Thu, 9 Aug 2012 01:06:31 +0300

Do not strip public lists from the discussion. There is nothing private.

On Tue, Aug 07, 2012 at 05:52:07PM -0400, Ming Qiao wrote:
> Hi Konstantin,
>
> Thanks for your quick response. Actually I'm not very clear about
> the second approach you mentioned. Some questions here:
> 1) Could you please elaborate the idea of "tracking rlimits set to
> ABI infinity"? If I understand correctly, you are referring to a
> model where a process can have it rlimit set multiple times by
> different ABI? But what does it mean exactly? Could you give a
> simple example here?
> 2) What do you mean by "per-struct rlimit"? Do you mean each memory
> segment as a struct? such as datasize, stacksize, etc.

I mean that in addition to the existing pl_rlimit array in struct
plimit, you also create a bitmap with one bit per entry of that array.
A set bit in this new bitmap indicates that the corresponding limit was
set (either implicitly, or explicitly by usermode) to infinity. The bit
keeps its meaning regardless of the actual numeric value written into
pl_rlimit, either by the syscall or by sv_fixup.

Then, the 64bit sysent should also grow an sv_fixup for resource limits,
which sets the limit accordingly for the host ABI if the bitmap
indicates that the resource is logically 'infinite'.

For completeness, I should note that the bit is cleared if a syscall
sets the resource to a non-infinite value.

Per-struct rlimit means that there is a bit for each resource.

Is it clear now?
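
As a rough illustration of the bitmap idea described above, here is a small userland model in C. It is only a sketch under stated assumptions: struct model_plimit, model_set(), model_fixup(), model_clamp() and the MODEL_* constants are hypothetical names invented for this example, not the kernel's struct plimit or any existing kernel interface. The model also ignores the soft/hard distinction and the privilege checks that the real setrlimit() performs.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins; none of these names exist in the kernel. */
#define MODEL_NLIMITS  16          /* number of resources tracked */
#define MODEL_DATA     2           /* index used here for the datasize limit */
#define MODEL_INFINITY UINT64_MAX  /* "unlimited" as requested by usermode */

struct model_plimit {
	uint64_t value[MODEL_NLIMITS]; /* numeric limits, like pl_rlimit */
	uint32_t inf_bits;             /* bit i set: limit i is logically infinite */
};

/* Clamp a value to the ABI maximum; abi_max == 0 means "no ABI cap". */
static uint64_t
model_clamp(uint64_t val, uint64_t abi_max)
{
	return ((abi_max != 0 && val > abi_max) ? abi_max : val);
}

/* setrlimit-like path: remember the logical infinity, store the clamped number. */
static void
model_set(struct model_plimit *pl, int which, uint64_t val, uint64_t abi_max)
{
	assert(which >= 0 && which < MODEL_NLIMITS);
	if (val == MODEL_INFINITY)
		pl->inf_bits |= UINT32_C(1) << which;
	else
		pl->inf_bits &= ~(UINT32_C(1) << which); /* non-infinite clears the bit */
	pl->value[which] = model_clamp(val, abi_max);
}

/* exec-time fixup for the new ABI: logically infinite limits get its maximum. */
static void
model_fixup(struct model_plimit *pl, uint64_t abi_max)
{
	for (int i = 0; i < MODEL_NLIMITS; i++) {
		if (pl->inf_bits & (UINT32_C(1) << i))
			pl->value[i] = model_clamp(MODEL_INFINITY, abi_max);
		else
			pl->value[i] = model_clamp(pl->value[i], abi_max);
	}
}
```

With the numbers from the PR (3GB for the 32-bit cap, 32GB for the 64-bit one), setting MODEL_DATA to MODEL_INFINITY through model_set() with the 32-bit cap stores 3GB but sets the bit, and a later model_fixup() with the 64-bit cap raises the stored value to 32GB.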
>
> Thanks,
> Ming
>
> -----Original Message-----
> From: Konstantin Belousov [mailto:kostikbel@gmail.com]
> Sent: Friday, August 03, 2012 1:39 PM
> To: Ming Qiao
> Cc: freebsd-gnats-submit@freebsd.org
> Subject: Re: amd64/170351: [patch] amd64: 64-bit process can't always get unlimited rlimit
>
> On Fri, Aug 03, 2012 at 03:35:20PM +0000, Ming Qiao wrote:
> >
> > >Number:         170351
> > >Category:       amd64
> > >Synopsis:       [patch] amd64: 64-bit process can't always get unlimited rlimit
> > >Confidential:   no
> > >Severity:       non-critical
> > >Priority:       low
> > >Responsible:    freebsd-amd64
> > >State:          open
> > >Quarter:
> > >Keywords:
> > >Date-Required:
> > >Class:          sw-bug
> > >Submitter-Id:   current-users
> > >Arrival-Date:   Fri Aug 03 15:40:08 UTC 2012
> > >Closed-Date:
> > >Last-Modified:
> > >Originator:     Ming Qiao
> > >Release:        FreeBSD 9.0-RC2
> > >Organization:
> > Juniper Networks
> > >Environment:
> > FreeBSD neys 9.0-RC2 FreeBSD 9.0-RC2 #0: Thu Jul 26 01:27:46 UTC 2012
> > root@neys:/usr/obj/usr/src/sys/GENERIC amd64
> > >Description:
> > On the amd64 platform, if a 32-bit process ever manually set its
> > rlimit, none of its 64-bit child or offspring will be able to get the
> > full 64-bit rlimit anymore, even if they explicitly set the limit to
> > unlimited.
> >
> > Note that for the sake of simplicity, only datasize limit is referred
> > in this report. But the same logic applies to all other memory segment
> > (i.e. stacksize, etc.).
> >
> > Take the following scenario as an example:
> > 1) Let's say we have a 32-bit process p1 whose hard limit is set to
> > 500MB by calling setrlimit().
> > 2) p1 then exec another 32-bit process p2.
> > 3) p2 set its hard limit to unlimited by calling setrlimit().
> > 4) p2 exec a 64-bit process p3.
> > 5) check the hard limit of p3, we can see that it only has 3GB (value
> > of ia32_maxdsiz) instead of 32GB which is the global kernel limit
> > (value of maxdsiz) for a 64-bit process.
> >
> > The root cause is that on step 3, p2 didn't actually set its limit to
> > the correct value when calling setrlimit(). Instead the limit is set
> > to ia32_maxdsiz since ia32_fixlimit() is called in
> > kern_proc_setrlimit().
> > >How-To-Repeat:
> > There are 3 test programs attached in this report: 32_p1.c, 32_p2.c,
> > and 64_p3.c. They can be used to reproduce the problem.
> >
> > 1) Compile 32_p1.c and 32_p2.c into 32-bit binaries. Compile 64_p3.c
> > into 64-bit binary.
> > 2) Put all 3 binaries into the same directory on a machine running
> > FreeBSD amd64 version.
> > 3) Run 32_p1 which will exec 32_p2 and 64_p3. The output of 64_p3 will
> > show its limit is capped at ia32_maxdsiz.
> > >Fix:
> > The proposed fix is to change kern_proc_setrlimit() so that
> > sv_fixlimit() will not be called if the caller wants to set the new
> > limit to RLIM_INFINITY.
> > Please refer to the attached diff file for the proposed fix.
> The 'fix' is wrong and does not address the issue.
> Instead, it uses some arbitrary properties of the scenario you
> considered and adapts kernel code to suit your scenario. Your deny the
> correction of the infinity limit, I do not see how it can be right.
>
> The problem you described is architectural. By design, Unix resource
> limits cannot be increased after they were decreased, except by root.
> In your scenario, the limits were decreased by mere fact of running the
> 32bit process which have lower 'infinity' limits then 64bit processes.
>
> That said, I see two possible solutions.
>
> First is to manually set compat.ia32.max* sysctls to 0. Then you get
> desired behaviour for 64bit processes execed from 32bit, it seems.
> It does not require code change.
> Since you are fine with denying fix for infinity, this setting gives
> the same effect as the patch.
>
> Second approach (which is essentially a correction to your approach
> from fix.diff) is to track the fact that corresponding rlimits are set
> to 'ABI infinity', in some per-struct rlimit flag. Then, get/setrlimit
> should first test the 'ABI infinity' flag and behave as if rlimit is
> set to infinity for current bitness even if the actual value of rlimit
> is not infinity. Flag is set when rlimit is set to infinity by current
> ABI.
>
> The second approach would provide 'correct' fix, but it is not trivial
> amount of work for very rare situation (execing 64bit process from
> 32bit), and current behaviour of inheriting 32bit limits may be argued
> as right.
> If you want, feel free to develop such patch, I will review and commit
> it, but I do not want to spend efforts on developing it myself ATM.
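
To see which datasize limit a process actually ended up with (step 5 of the scenario quoted above), something along the following lines could be used. This is a minimal sketch built on the standard getrlimit(2) call. It is not one of the test programs attached to the PR, and read_data_limit() is a hypothetical helper name chosen for this example.

```c
#include <sys/types.h>
#include <sys/time.h>
#include <sys/resource.h>
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: fetch the soft and hard RLIMIT_DATA values of the
 * calling process. Returns 0 on success, -1 if getrlimit() fails.
 */
static int
read_data_limit(rlim_t *soft, rlim_t *hard)
{
	struct rlimit rl;

	assert(soft != NULL && hard != NULL);
	if (getrlimit(RLIMIT_DATA, &rl) != 0)
		return (-1);
	*soft = rl.rlim_cur;	/* current (soft) limit */
	*hard = rl.rlim_max;	/* maximum (hard) limit */
	return (0);
}
```

Calling this helper from the 64-bit process at the end of the exec chain and printing the hard value would show whether it is still capped at the 32-bit maximum.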
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?201208082210.q78MA3kS036202>