From: Andriy Gapon
Date: Fri, 14 Nov 2014 23:10:45 +0200
To: Konstantin Belousov, John Baldwin
Cc: "src-committers@freebsd.org", Jung-uk Kim, "freebsd-arch@freebsd.org"
Subject: suspending threads before devices [Was: svn commit: r233249 - head/sys/amd64/acpica]
Message-ID: <54666FD5.6080705@FreeBSD.org>
In-Reply-To: <20120322141436.GC2358@deviant.kiev.zoral.com.ua>
References: <201203202037.q2KKbNfK037014@svn.freebsd.org>
 <201203211502.14353.jkim@FreeBSD.org>
 <4F6AF1CB.80902@FreeBSD.org>
 <201203220748.49635.jhb@freebsd.org>
 <20120322141436.GC2358@deviant.kiev.zoral.com.ua>

On 22/03/2012 16:14, Konstantin Belousov wrote:
> I already noted this to Jung-uk, I think that current suspend handling
> is (somewhat) wrong. We shall not stop other CPUs for suspension when
> they are executing some random kernel code. Rather, CPUs should be safely
> stopped at the kernel->user boundary, or at sleep point, or at designated
> suspend point like idle loop.
>
> We already are engaged into somewhat doubtful actions like restoring of %cr2,
> since we might, for instance, preempt page fault handler with suspend IPI.

I recently revisited this issue in the context of some suspend+resume
problems that I am having with the radeonkms driver. What surprised me is
that the driver's suspend code has no synchronization whatsoever with its
other code paths.

So, I looked first at the Linux code and then at the illumos code to see
how suspend is implemented there. As far as I can see, those kernels do
exactly what you suggest that we do. Before suspending devices they first
suspend all threads except for the one that initiates the suspend.
For userland threads a signal-like mechanism is used to put them in a
state similar to a SIGSTOP-ed one. For kernel threads the mechanisms
differ between the two kernels. Also, illumos freezes kernel threads after
suspending the devices, not before.

I think that we could start with only the userland threads.
Do you think the SIGSTOP-like approach would be hard to implement for us?

References:
http://src.illumos.org/source/xref/illumos-gate/usr/src/uts/common/cpr/cpr_main.c#425
http://src.illumos.org/source/xref/illumos-gate/usr/src/uts/common/cpr/cpr_uthread.c#80
http://lxr.free-electrons.com/source/kernel/power/suspend.c#L388
http://lxr.free-electrons.com/source/kernel/power/suspend.c#L207
http://lxr.free-electrons.com/source/kernel/power/power.h#L235
http://lxr.free-electrons.com/source/kernel/power/process.c#L118
http://lxr.free-electrons.com/source/kernel/power/process.c#L27
http://lxr.free-electrons.com/source/kernel/freezer.c#L115

-- 
Andriy Gapon