From owner-freebsd-emulation@FreeBSD.ORG Sat Jun 11 18:40:10 2005
Return-Path:
X-Original-To: freebsd-emulation@hub.freebsd.org
Delivered-To: freebsd-emulation@hub.freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 6E9CD16A41C
	for ; Sat, 11 Jun 2005 18:40:10 +0000 (GMT)
	(envelope-from gnats@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org [216.136.204.21])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 3396043D49
	for ; Sat, 11 Jun 2005 18:40:10 +0000 (GMT)
	(envelope-from gnats@FreeBSD.org)
Received: from freefall.freebsd.org (gnats@localhost [127.0.0.1])
	by freefall.freebsd.org (8.13.3/8.13.3) with ESMTP id j5BIe8Xa072872
	for ; Sat, 11 Jun 2005 18:40:08 GMT
	(envelope-from gnats@freefall.freebsd.org)
Received: (from gnats@localhost)
	by freefall.freebsd.org (8.13.3/8.13.1/Submit) id j5BIe8nO072871;
	Sat, 11 Jun 2005 18:40:08 GMT
	(envelope-from gnats)
Date: Sat, 11 Jun 2005 18:40:08 GMT
Message-Id: <200506111840.j5BIe8nO072871@freefall.freebsd.org>
To: freebsd-emulation@FreeBSD.org
From: Bruce Evans
Cc:
Subject: Re: kern/81951: [patch] linux emulation: getpriority() returns incorrect value
X-BeenThere: freebsd-emulation@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
Reply-To: Bruce Evans
List-Id: Development of Emulators of other operating systems
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Sat, 11 Jun 2005 18:40:10 -0000

The following reply was made to PR kern/81951; it has been noted by GNATS.

From: Bruce Evans
To: Andriy Gapon
Cc: freebsd-gnats-submit@freebsd.org, freebsd-emulation@freebsd.org
Subject: Re: kern/81951: [patch] linux emulation: getpriority() returns incorrect value
Date: Sun, 12 Jun 2005 04:37:24 +1000 (EST)

On Thu, 9 Jun 2005, Andriy Gapon wrote:

> on 09.06.2005 16:17 Bruce Evans said the following:
>>> on 08.06.2005 23:49 Maxim Sobolev said the following:
>>>> Committed, thanks!
>>>>
>>>> I wonder if the setpriority(2) needs the same cure. Please clarify and
>>>> let me know. I'll keep the PR open till your reply.
>>
>> I wonder why committers commit patches without fully understanding them.
>
> I wonder if you fully understood the patch, the issue and the
> getpriority/setpriority.

I thought I did, but I read POSIX partly backwards.

>> POSIX specifies that the non-error range of values returned by
>> getpriority()
>> is [0, 2*{NZERO}-1]; -1 is the error indicator. Applications must subtract
>> NZERO to get the actual priority value.

> I think you have misread POSIX specification and you are confusing two
> things: (1) priority - priority inside the blackbox that schedules
> processes versus values that should be passed to setpriority() and
> returned from getpriority(); (2) syscall internal implementation versus
> user-visible libc function.

Priority in the black box is td->td_priority.  p->p_nice is supposed to
be the user-visible priority offset by NZERO in FreeBSD, and it is, but
things are made confusing by "fixing" the historical value of NZERO so
that NZERO is 0.  Biases of 0 are subtle, and POSIX has made the
NZERO = 0 bias wrong by over-specifying the behaviour as the historical
behaviour.

> Regarding #1, here's a direct quote:
> http://www.opengroup.org/onlinepubs/009695399/functions/getpriority.html
>
> "Upon successful completion, getpriority() shall return an integer in
> the range -{NZERO} to {NZERO}-1. Otherwise, -1 shall be returned and
> errno set to indicate the error."
> Also:
> "The getpriority() and setpriority() functions work with an offset nice
> value (nice value -{NZERO}). The nice value is in the range [0,2*{NZERO}
> -1], while the return value for getpriority() and the third parameter
> for setpriority() are in the range [-{NZERO},{NZERO} -1]."

This is the part that I misread.  I only saw the "Also" part and I read
it backwards as specifying Linux-like behaviour to avoid the in-band
error indicator.
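Since -1 is both a legal return value and the error indicator, a caller
has to clear errno before the call and test it afterwards.  A minimal
sketch of that idiom in C follows; read_own_nice is a hypothetical helper
name used only for illustration, not something from the PR.

```c
#include <errno.h>
#include <sys/resource.h>

/*
 * Fetch the calling process's nice value (read_own_nice is a
 * hypothetical helper name).  Returns 0 and stores the value in *out
 * on success, or -1 with errno set on failure.
 */
int
read_own_nice(int *out)
{
	int pri;

	errno = 0;	/* -1 is a valid priority, so clear errno first */
	pri = getpriority(PRIO_PROCESS, 0);	/* who == 0: the caller */
	if (pri == -1 && errno != 0)
		return (-1);	/* a real error, not priority -1 */
	*out = pri;
	return (0);
}
```

A caller would treat a 0 return as success even when the stored value
is -1.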
> So this is a difference between priority as it is seen in user-land
> (above libc layer) and priority inside the POSIX blackbox of OS (the one
> in [0,2*{NZERO} -1] range).

It is a bug in POSIX for POSIX to specify the black box.  The FreeBSD
black box doesn't actually use this range, and applications and users
hardly notice since they mostly see the adjusted priorities (with
default priority 0 instead of NZERO).

> My understanding is that FreeBSD and Linux are very close to POSIXly
> correct implementations with NZERO=20. In fact, Linux's implementation is
> completely compliant and FreeBSD allows +20 which is beyond the POSIX range.
> Also, -1 return value from getpriority() is a problematic point of POSIX
> specification not implementations.

To conform, FreeBSD would need to expand or shrink the priority range by
1 to cover or drop +20, and change NZERO from 0 to 20 or 21, and move the
priorities in the grey box up by NZERO.

> Regarding #2, both FreeBSD and Linux in their unique ways correctly
> return errno/priority level from kernel-land to user-land. FreeBSD
> syscall returns priority already in [-{NZERO},{NZERO} -1] range; Linux

Except NZERO is 0 in FreeBSD.

> syscall returns priority in [1,2*{NZERO}] range and with reversed
> comparison, and then (g)libc stub of getpriority performs 20-X
> conversion to return a correct value to application.

>> I think the reason that setpriority(2) is not affected is actually that
>> Linux applications know to use (20 - pri) to recover the actual priority.

It is actually the library stub that does this.  So getpriority(2)
doesn't give POSIX getpriority in Linux, but getpriority(3) does.
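The stub's conversion can be sketched roughly as below.  This is an
illustration modelled on the description above, not the glibc source:
stub_getpriority and LINUX_PRIO_BIAS are hypothetical names, and the raw
syscall result is passed in as a plain int (negative meaning failure) so
that the sketch does not depend on a Linux kernel.

```c
/* Hypothetical name for the bias the stub removes. */
#define LINUX_PRIO_BIAS	20

/*
 * Convert a raw Linux getpriority syscall result to the value seen by
 * the application.  A successful raw result is 20 - nice, i.e. in
 * [1, 40]; a negative raw result is passed through as the error value.
 */
int
stub_getpriority(int raw)
{
	if (raw >= 0)
		raw = LINUX_PRIO_BIAS - raw;	/* undo the kernel's 20 - nice */
	return (raw);
}
```

With this stub, a kernel or emulator that returns the biased value 20
for the default priority yields 0, while one that returns the unbiased
value 0 yields 20.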
>> Fixing getpriority() in FreeBSD and all emulators should involve much the
>> same code: map the range of internal priorities [PRIO_MIN, PRIO_MAX] to
>> getpriority()'s range [0, 2*{SUBSYSTEM_SPECIFIC_NZERO}-1] as linearly
>> as possible (something like:
>>
>>     pri |-> (pri - PRIO_MIN) * (2 * SUBSYSTEM_SPECIFIC_NZERO - 1) /
>>             (PRIO_MAX - PRIO_MIN)
>>
>> but more complicated, since if SUBSYSTEM_SPECIFIC_NZERO == 20 the
>> above maps the default priority 0 to (20 * 39 / 40) = 19, but 20 is
>> required; also for Linux there must be a negation).
>
> I think you have greatly overcomplicated things because of your original
> misunderstanding. Just compile a small program using
> getpriority/setpriority for FreeBSD, Linux and any other Unix available
> to you, run it and you will see how simple things are in reality and
> that NZERO is not visible to userland. Read the man pages too.
> Yes, and try Linux emulation with and without my patch to understand
> what the problem with emulation really is.

This part of my previous mail is almost correct.  There is an internal
range [PRIO_MIN, PRIO_MAX] which should be mapped to the
[-{NZERO}, {NZERO} -1] range (not the [0, 2*{NZERO} - 1] range like I
said previously).  setpriority() should invert this mapping.  Matching
the range of the emulated system is actually more important for
setpriority(), since applications probably treat values returned by
getpriority() as cookies and don't notice if they are out of bounds, but
the kernel does range checking on the values passed by setpriority().

In addition, for Linux getpriority() the values must be mapped by
pri |-> 20 - pri so that the library stub can restore the previous
values.  The magic 20 is spelled 20 in the Linux kernel (2.6.10 at
least) and as PZERO in glibc (2.3.2 at least).
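The two mappings for Linux getpriority() emulation could be sketched as
follows.  This is not the committed code: it assumes clamping instead of
true scaling for the first mapping (the two ranges differ by only one
value), and all macro and function names are hypothetical.

```c
#define LINUX_NZERO	20	/* what Linux's NZERO should be (assumed name) */
#define LINUX_BIAS	20	/* 1 + Linux's maximum priority of 19 */

/*
 * First mapping: squeeze a FreeBSD priority in [-20, 20] into the
 * Linux range [-LINUX_NZERO, LINUX_NZERO - 1] by clamping.
 */
int
fbsd_to_linux_nice(int pri)
{
	if (pri < -LINUX_NZERO)
		pri = -LINUX_NZERO;
	if (pri > LINUX_NZERO - 1)
		pri = LINUX_NZERO - 1;
	return (pri);
}

/*
 * Second mapping: bias the value so that the Linux library stub can
 * recover it with 20 - x.  The result is always in [1, 40].
 */
int
linux_getpriority_retval(int fbsd_pri)
{
	return (LINUX_BIAS - fbsd_to_linux_nice(fbsd_pri));
}
```

Keeping the two steps separate gives each constant a name, and the clamp
keeps the biased result positive for every FreeBSD priority.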
This secondary mapping makes scaling in the first mapping more
important, since if FreeBSD had +21 in its priority range, then 20 - pri
would give a value of -1 and the library stub would consider this to be
an error.

Summary: I don't like the committed version since it has many subtle
magic numbers in its 20 - X formula:

  20: part of Linux adjustment.  20 = 1 + Linux's maximum priority.
  -1: another part of Linux adjustment.
   1: factor of 20/20 for the scaling step, where the first 20 is what
      should be Linux's NZERO and the second 20 is what should be
      FreeBSD's NZERO (= (PRIO_MAX - PRIO_MIN) / 2).  Note that these
      20's are subtly different from the 20 in Linux's adjustment.
   0: bias for the scaling step (= FreeBSD NZERO).

Bruce