From: Peter Jeremy <PeterJeremy@optushome.com.au>
To: Jun Su
Cc: hackers@freebsd.org, current@freebsd.org, David Schultz, Olivier Houchard
Date: Sun, 1 Feb 2004 08:50:06 +1100
Subject: Re: Call for testers: New PID allocator patch for -CURRENT
Message-ID: <20040131215006.GP908@cirb503493.alcatel.com.au>
In-Reply-To: <01bd01c3e742$093d83b0$56dea1d3@shasujunmv>

On Fri, Jan 30, 2004 at 11:02:08PM +0800, Jun Su wrote:
>I think the algorithms for proc_alloc, pfind and zpfind are all O(1).
>The worst situation is that it reaches PID_MAX.  In that situation we
>need to expand our pidtbl, which may bring some delay.  However, this
>situation will only occur a few times.
>If a machine needs 1000 concurrent processes, it will only need a
>2^10-slot process table.  From the original 2^5 table, we need only
>5 doublings to expand the table that far.

What was the reason for picking an initial value of 32?  A typical
-CURRENT system will have more than this many processes running by the
time rc.d is finished.  A more reasonable initial value would seem to
be 2^7.  Compaq/HP Tru64 uses a similar pid allocation strategy and it
appears to start with a table of around 2^10.  The memory requirements
for embedded systems could be reduced by making the initial value
tunable - either at boot time or config time.

>> [2] Many systems have a high enough fork rate that pids recycle
>> every few minutes or hours with the present algorithm.  These
>> systems don't necessarily have lots of processes running at any
>> given time, so the table (and thus the cycle length) in your
>> patch could remain relatively small if I'm interpreting the
>> code correctly.  I think the code would have to be changed to
>> prevent reuse from happening too quickly in wall time.
>
>Reusing the proc slot doesn't mean reusing the pid.  Every time we
>reuse a proc slot, we add pidtbl_mask to the pid.  We only reuse the
>pid when it reaches max_pid.  Therefore, if a user wants, he can
>increase max_pid to lengthen the period during which a pid is not
>reused.  I will add a sysctl to allow the user to adjust max_pid.

I don't believe it's reasonable to just create a max_pid sysctl and
expect users to tune it to avoid obscure system misbehaviour.  If the
maximum number of simultaneous processes is just below a power of two
then there are very few free slots, and those slots will therefore be
reused very rapidly.  Even taking the pid_tbl_mask into account, pids
could be recycled quite quickly - especially since pid_max may be only
twice pid_tbl_mask.
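As I understand the scheme described above (reusing a slot adds the
table size to the slot's previous pid, wrapping once it passes the
maximum), a toy model in C would look something like this.  The names
pid_table, pid_tbl_mask and pid_max come from this thread; everything
else is my own simplification, not code from the actual patch:

```c
#include <stdint.h>
#include <stdlib.h>

/*
 * Toy model of the table-based pid allocator under discussion.
 * This is a sketch of the behaviour described in the thread, not
 * the patch itself.
 */
#define INITIAL_SIZE 32              /* the 2^5 initial table questioned above */

static int pid_max = 99999;          /* 5-digit limit noted later in the mail */
static int *pid_table;               /* last pid issued per slot, -1 = never used */
static uint32_t pid_tbl_mask = INITIAL_SIZE - 1;

static void pid_table_init(void)
{
    pid_table = malloc((pid_tbl_mask + 1) * sizeof(*pid_table));
    for (uint32_t i = 0; i <= pid_tbl_mask; i++)
        pid_table[i] = -1;
}

/*
 * Hand out a pid from slot i.  Reusing a slot adds the table size to
 * the previous pid, so the same pid value only recurs after the
 * sequence climbs past pid_max and wraps - the behaviour Jun Su
 * describes above.
 */
static int pid_alloc_slot(uint32_t i)
{
    int pid;

    if (pid_table[i] < 0)
        pid = (int)i + 1;                         /* first use of this slot */
    else
        pid = pid_table[i] + (int)(pid_tbl_mask + 1);
    if (pid > pid_max)
        pid = (int)i + 1;                         /* wrapped: pid recycled */
    pid_table[i] = pid;
    return pid;
}
```

Note how short the cycle gets when the table is nearly full: each busy
slot marches through pid space in steps of only table-size, so with a
small table and a small pid_max the wrap comes around quickly.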
The code does include a comment "ensure pids cycle through 2000+
values", but it's very unclear how this actually works: pid_alloc_cnt
is just a count of the used slots in pid_table, and pid_alloc_lim
starts off as the size of pid_table and either doubles or triples
whenever the pid_table size doubles.  I can't see any mechanism that
ensures a minimum pid cycle length longer than 2.

Note that many system utilities "know" that pids can be represented in
5 digits, so letting max_pid exceed 99999 will disrupt output from
top, ps, lsof, pstat etc.  This places an upper limit on how high
max_pid can realistically be tuned.

Rather than doubling the size of pid_table only when the existing
table is full, you need a mechanism to also double the pid_table size
and/or increase max_pid if pids are being reused too quickly.  A
simple check would be
((pid_tbl_mask - pid_alloc_cnt) * pid_max / pid_tbl_mask < N)
[for some suitable N].  It would be nice if N depended on the fork()
rate, but that might make the code sensitive to fork bombs.

Peter
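P.S. The early-grow check suggested above could be sketched as a small
helper.  should_grow and its parameter names are placeholders of mine,
not identifiers from the patch:

```c
#include <stdint.h>

/*
 * Sketch of the proposed heuristic: grow pid_table (or raise pid_max)
 * once the free slots, scaled across the pid space, drop below some
 * threshold N.  A small result means pids are about to recycle fast.
 */
static int should_grow(uint32_t pid_tbl_mask, uint32_t pid_alloc_cnt,
                       int pid_max, int n)
{
    /* pids remaining before slots must be reused */
    long spread = (long)(pid_tbl_mask - pid_alloc_cnt) * pid_max
        / pid_tbl_mask;

    return spread < n;
}
```

With a 2^10 table that is almost full (say 1020 of 1023 maskable slots
used) and pid_max = 99999, the spread is only a few hundred pids, so
the check fires; with the table mostly empty it does not.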