Date: Wed, 16 Dec 2015 11:08:02 -0800 (PST)
From: Don Lewis
Subject: Re: fork_findpid() - Fatal trap 12: page fault while in kernel mode
To: kostikbel@gmail.com
cc: freebsd-current@freebsd.org
Message-Id: <201512161908.tBGJ8286089845@gw.catspoiler.org>
In-Reply-To: <20151216121000.GV3625@kib.kiev.ua>

On 16 Dec, Konstantin Belousov wrote:
> On Wed, Dec 16, 2015 at 12:21:16PM +0100, Fabian Keil wrote:
>> Konstantin Belousov wrote:
>> > It is the values of *p and *(p->p_pgrp) that are needed, from the frame 8.
>>
>> Unfortunately it's not available and apparently I removed the attempts
>> to get it from the previous output.
>
>> allproc is available and the first one matches lastpid and has an invalid
>> p_pgrp, but due to trypid being optimized out as well, it's not obvious
>> (to me) that it's the right process.
>
> p_suspcount = 0,
> p_xthread = 0xfffff801162819a0,
> p_boundary_count = 0,
> p_pendingcnt = 0,
> p_itimers = 0x0,
> p_procdesc = 0x0,
> p_treeflag = 0,
> p_magic = 3203398350,
> p_osrel = 1100090,
>> p_comm = 0xfffff800304df3c4 "privoxy",
> p_pgrp = 0x618b0080,
>
>> I've changed p's declaration to static so hopefully its value will
>> be available the next time the panic occurs, but it may take a while
>> until that happens.
>
> From the state of the process you provided, it is a new (zigote) of the
> forking process, which was already linked into allproc list. Also,
> it seems that bzero part of the forking procedure was finished, but bcopy
> was not yet. The p_pgrp cannot be a pointer, it is not yet initialized.
>
> There, we have at least one issue, since zigote is linked before the
> p_pgrp is initialized, and the proctree/allproc locks are dropped.
> As result, fork_findpid() accesses memory with undefined content.
>
> It seems that the least morbid solution is to slightly extend the scope
> of the allproc lock in do_fork(), to prevent fork_findpid() from working
> while we did not finished copying data from old to new process.

I used to have a patch that deferred linking the new process into
proctree/allproc until it was fully formed. The motivation was to get
rid of all of the PRS_NEW stuff scattered around the source.
Unfortunately the patch bit-rotted and I'm pretty sure that I lost it.
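
For illustration only, here is a small user-space sketch of the general
idea of not publishing a new entry until it is fully formed. It is not
the lost patch and not the real do_fork() or fork_findpid() code. Every
name in it (toy_proc, toy_pgrp, toy_allproc, toy_allproc_lock, toy_fork,
toy_pid_in_use) is a hypothetical stand-in, and a pthread mutex stands
in for the allproc lock.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; these are NOT the kernel's struct pgrp/proc. */
struct toy_pgrp {
	int pg_id;
};

struct toy_proc {
	struct toy_proc *p_next;	/* next entry on the toy "allproc" list */
	int p_pid;
	struct toy_pgrp *p_pgrp;
};

static pthread_mutex_t toy_allproc_lock = PTHREAD_MUTEX_INITIALIZER;
static struct toy_proc *toy_allproc;	/* head of the shared list */

/*
 * Create a child entry.  All fields are filled in while the entry is
 * still private; only then is it linked into the shared list, under the
 * lock.  A scanner can therefore never see a half-initialized p_pgrp.
 */
struct toy_proc *
toy_fork(const struct toy_proc *parent, int newpid)
{
	struct toy_proc *child;

	child = calloc(1, sizeof(*child));	/* the "bzero" step */
	if (child == NULL)
		return (NULL);
	child->p_pid = newpid;
	child->p_pgrp = parent->p_pgrp;		/* the "bcopy" step */

	pthread_mutex_lock(&toy_allproc_lock);
	child->p_next = toy_allproc;		/* publish last */
	toy_allproc = child;
	pthread_mutex_unlock(&toy_allproc_lock);
	return (child);
}

/*
 * Scanner loosely analogous to a pid search: reports whether the id is
 * used as a pid or a process group id by any published entry.  It
 * dereferences p_pgrp, which is safe only because entries are complete
 * before they become visible.
 */
int
toy_pid_in_use(int pid)
{
	struct toy_proc *p;
	int used = 0;

	pthread_mutex_lock(&toy_allproc_lock);
	for (p = toy_allproc; p != NULL; p = p->p_next) {
		if (p->p_pid == pid || p->p_pgrp->pg_id == pid) {
			used = 1;
			break;
		}
	}
	pthread_mutex_unlock(&toy_allproc_lock);
	return (used);
}
```

The alternative suggested in the quoted text, holding the lock across
the copy, would give the scanner the same guarantee in this toy model;
the sketch only shows the ordering, not which approach fits the kernel
better.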