Message-ID: <54676BA6.7000202@FreeBSD.org>
Date: Sat, 15 Nov 2014 17:05:10 +0200
From: Andriy Gapon
To: Konstantin Belousov
Subject: Re: suspending threads before devices
In-Reply-To: <20141115105819.GJ17068@kib.kiev.ua>
Cc: Jung-uk Kim, "freebsd-arch@freebsd.org"
List-Id: Discussion related to FreeBSD architecture

On 15/11/2014 12:58, Konstantin Belousov wrote:
> On Fri, Nov 14, 2014 at 11:10:45PM +0200, Andriy Gapon
> wrote:
>> On 22/03/2012 16:14, Konstantin Belousov wrote:
>>> I already noted this to Jung-uk, I think that current suspend handling
>>> is (somewhat) wrong. We shall not stop other CPUs for suspension when
>>> they are executing some random kernel code. Rather, CPUs should be safely
>>> stopped at the kernel->user boundary, or at sleep point, or at designated
>>> suspend point like idle loop.
>>>
>>> We already are engaged into somewhat doubtful actions like restoring of %cr2,
>>> since we might, for instance, preempt page fault handler with suspend IPI.
>>
>> I recently revisited this issue in the context of some suspend+resume problems
>> that I am having with radeonkms driver. What surprised me is that the driver's
>> suspend code has no synchronization whatsoever with its other code paths. So, I
>> looked first at the Linux code and then at the illumos code to see how suspend
>> is implemented there.
>> As far as I can see, those kernels do exactly what you suggest that we do.
>> Before suspending devices they first suspend all threads except for one that
>> initiates the suspend. For userland threads a signal-like mechanism is used to
>> put them in a state similar to SIGSTOP-ed one. With the kernel threads
>> mechanisms are different between the kernels. Also, illumos freezes kernel
>> threads after suspending the devices, not before.
>>
>> I think that we could start with only the userland threads initially. Do you
>> think the SIGSTOP-like approach would be hard to implement for us?
> We have most, if not all, parts of the stopping code
> already implemented. I mean the single-threading code, see
> thread_single(SINGLE_BOUNDARY). The code ensures that other threads in
> the current process are stopped either at the kernel->user boundary, or
> at the safe kernel sleep point.
>
> This is not immediately applicable, since the caller is supposed to be
> a thread in the suspended process, but modifications to allow external
> process to do the same are really small compared with the complexity
> of the code. I suspect that all that is needed is change of
>     while/if (remaining != 1)
> to
>     while/if ((p == curproc && remaining != 1) ||
>         (p != curproc && remaining != 0))
> together with explicit passing of struct proc *p to thread_single.

Thank you for the pointer!
I think that maybe even more changes are required for that code to be usable for
suspending. E.g. maybe a different p_flag bit should be used, because I think
that we would like to avoid interaction between the process level suspend and
the global suspend. I.e. the global suspend might encounter a multi-threaded
process in a single thread mode and would need to suspend its remaining thread.

-- 
Andriy Gapon
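
For reference, the condition change suggested in the quoted text can be modelled in isolation roughly as below. This is only a userland sketch under my own assumptions, not code from kern_thread.c: must_keep_waiting() and threads_left_after_wait() are hypothetical names, "caller_in_target" stands in for the (p == curproc) test, and "remaining" is taken to be the number of threads in the target process that have not stopped yet.

```c
#include <stdbool.h>

/*
 * Hypothetical model of the proposed wait condition; this is not the
 * actual thread_single() code.  When the caller is a thread of the
 * target process it keeps running itself, so the wait ends with one
 * thread left.  An external caller has to wait until none are left.
 */
bool
must_keep_waiting(bool caller_in_target, int remaining)
{
	if (caller_in_target)
		return (remaining != 1);
	return (remaining != 0);
}

/*
 * Hypothetical helper: pretend that one thread stops per iteration and
 * report how many threads are still running when the wait loop exits.
 */
int
threads_left_after_wait(bool caller_in_target, int nthreads)
{
	int remaining = nthreads;

	while (remaining > 0 && must_keep_waiting(caller_in_target, remaining))
		remaining--;	/* one more thread reached a stop point */
	return (remaining);
}
```

With four threads the model leaves one thread running when the caller belongs to the process and none when the caller is external, which I believe is the distinction the modified condition is meant to express. It says nothing about the p_flag question raised above.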