From owner-freebsd-current@freebsd.org Fri Dec 18 16:38:15 2015 Return-Path: Delivered-To: freebsd-current@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 1A0DDA4C99B for ; Fri, 18 Dec 2015 16:38:15 +0000 (UTC) (envelope-from mjguzik@gmail.com) Received: from mail-wm0-x232.google.com (mail-wm0-x232.google.com [IPv6:2a00:1450:400c:c09::232]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id BE98312A0; Fri, 18 Dec 2015 16:38:14 +0000 (UTC) (envelope-from mjguzik@gmail.com) Received: by mail-wm0-x232.google.com with SMTP id l126so71584113wml.0; Fri, 18 Dec 2015 08:38:14 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=date:from:to:cc:subject:message-id:mail-followup-to:references :mime-version:content-type:content-disposition:in-reply-to :user-agent; bh=3L0X+IcMFjJtxNL1WJx6xVINCgs9ofF5RAS8V4Do9qI=; b=0Q5trPrfMdj91IEOfyW8RLpKro3yRYeeYcx8fLjPuCNAM2UgqGEPhK87LHxMFOL45M mP0Re77dRhr+WPmBsb1uidtUsn2QUpGHOXVBJawOBNDZyroH7cLfaD1RSc65icc68uSp 93P55/1IO+4gy26MD5pjvEYJDVCuJaWUVUhRecIAxxxpWBlmypbDID7Cuyl0w2Os3IeX 9QiEfFfZMXF6PrUPFfbmqZvGj/evWBpCanKuRRdNt3mRo2g20+ZznG3Q9QHl35evczKy MxTPhIRbUpqRU9nj3sRdmlVBnAHjQ5xjvxC6IB/U3Dpajj0O2d5WKps1pk5t+MT9KDCG 4HPw== X-Received: by 10.28.94.1 with SMTP id s1mr4129237wmb.60.1450456693271; Fri, 18 Dec 2015 08:38:13 -0800 (PST) Received: from dft-labs.eu (n1x0n-1-pt.tunnel.tserv5.lon1.ipv6.he.net. [2001:470:1f08:1f7::2]) by smtp.gmail.com with ESMTPSA id a63sm7342842wmc.5.2015.12.18.08.38.12 (version=TLS1_2 cipher=AES128-SHA bits=128/128); Fri, 18 Dec 2015 08:38:12 -0800 (PST) Date: Fri, 18 Dec 2015 17:38:10 +0100 From: Mateusz Guzik To: Don Lewis Cc: kostikbel@gmail.com, freebsd-current@freebsd.org Subject: Re: fork_findpid() - Fatal trap 12: page fault while in kernel mode Message-ID: <20151218163810.GB830@dft-labs.eu> Mail-Followup-To: Mateusz Guzik , Don Lewis , kostikbel@gmail.com, freebsd-current@freebsd.org References: <20151217195802.GA29200@dft-labs.eu> <201512172233.tBHMXkNR096011@gw.catspoiler.org> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: <201512172233.tBHMXkNR096011@gw.catspoiler.org> User-Agent: Mutt/1.5.21 (2010-09-15) X-BeenThere: freebsd-current@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Discussions about the use of FreeBSD-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 18 Dec 2015 16:38:15 -0000 On Thu, Dec 17, 2015 at 02:33:46PM -0800, Don Lewis wrote: > On 17 Dec, Mateusz Guzik wrote: > > On Thu, Dec 17, 2015 at 11:48:08AM -0800, Don Lewis wrote: > >> On 17 Dec, Konstantin Belousov wrote: > >> > On Wed, Dec 16, 2015 at 11:08:02AM -0800, Don Lewis wrote: > >> >> I used to have a patch the deferred linking the new process into > >> >> proctree/allproc until it was fully formed. The motivation was to get > >> >> rid of all of the PRS_NEW stuff scattered around the source. > >> >> Unfortunately the patch bit-rotted and I'm pretty sure that I lost it. > >> > > >> > I had similar tought for a second as one of the possibilities to fix the > >> > issue, but rejected it outright due to the way the pid allocator works. > >> > The loop which faulted is the allocator, it depends on the new pid being > >> > linked early to detect the duplicated alloc. > >> > > >> > What you wrote could be done, but this restructuring requires the separate > >> > pid allocator, and probably it must repeat all quirks and subtle behaviour > >> > of the current algorithm. But I do not object, PRS_NEW is a trouble > >> > on its own. > >> > >> I don't think it requires any changes to the allocater. It should only > >> be necessary to delay the call to fork_findpid() until we are ready to > >> link the new proc into allproc. Basically, drop the locks at the > >> beginning of do_fork(), then grab them again somewhere near the end > >> (probably where we are currently mark the process as PRS_NORMAL) and > >> move the call to fork_findpid(), the p2->p_pid assignment, and the list > >> manipulation code to a location after that. > >> > >> It's probably not quite that simple though ... > > > > That would mean you would need to be able to deconstruct the process > > because you cannot guarantee there are any pids left, which may or may > > not be easily doable. > > It doesn't look like we handle that properly in the current code. I > think fork_findpid() will loop forever. It shouldn't be possible if > maxproc < pid_max / 3, or maybe pid_max / 2. It might be a good idea to > enforce this. > Not sure I follow, can you rephrase/elaborate? > > The current method is going to bite us performance-wise anyway and an > > allocater which does not require a walk over the tree is necessary in > > the long run. Seems like a bitmap (or a bunch of bitmaps) is the way to > > go here. > > I think that separate bitmaps for process, process group, and session > ids would be needed. It would waste some space, but it's probably more > efficent to use a byte array and store all the bits for the pid > together. > Well I had such separate bitmaps in mind with addition of a combined "the id is in use bitmap". This would make the common case of finding a new pid reasonably fast. Access to all bitmaps would be protected with proctree lock, which matches current locking scheme anyway. -- Mateusz Guzik