From: Justin Hibbits
To: Warner Losh
Cc: Roger Pau Monné, FreeBSD Arch
Subject: Re: Order of device suspend/resume
Date: Thu, 22 Dec 2016 14:37:04 -0500
Message-Id: <6C1FBD30-8301-4C6D-8C8B-653C6C096A93@gmail.com>
List-Id: Discussion related to FreeBSD architecture

On Dec 16, 2016, at 12:25 AM, Warner Losh wrote:

> On Thu, Dec 15, 2016 at 8:34 PM, Justin Hibbits wrote:
>>
>> On Dec 15, 2016, at 3:38 PM, John Baldwin wrote:
>>
>>> On Thursday, December 15, 2016 11:40:33 AM Roger Pau Monné wrote:
>>>>
>>>> Hello,
>>>>
>>>> I'm currently dealing with a bug in the Xen suspend/resume sequence,
>>>> and I've found that lacking a way to order device priority during
>>>> suspend/resume is proving quite harmful for Xen (and maybe other
>>>> systems too).
>>>> The current suspend/resume code simply scans the root bus, and
>>>> suspends/resumes every device based on the order they are attached
>>>> to their parents. The problem here is that there's no way to tell
>>>> that some devices should be resumed before others, for example the
>>>> event timers/time counters/uarts should definitely be resumed before
>>>> other devices, but that seems to happen mostly by chance.
>>>>
>>>> Currently most time related devices are attached directly to the
>>>> nexus, which means they will get resumed first, but for example the
>>>> uart is currently attached to the pci bus IIRC, which means it gets
>>>> resumed quite late. On Xen systems, this is even worse. The Xen PV
>>>> bus (that contains all Xen-related devices) is attached last
>>>> (because it tends to pick up unused memory regions for its own
>>>> usage) and this bus also contains the PV timecounter, which should
>>>> be resumed _before_ other devices, or else timecounting will be
>>>> completely screwed and things can get stuck in indefinitely long
>>>> loops (due to the fact that the timecounter is implemented based on
>>>> the uptime of the host, and that changes from host to host).
>>>>
>>>> In order to solve this I could add a hack to the Xen resume process
>>>> (which is already different from the ACPI one), but this looks
>>>> gross. I could also attach the Xen PV timer to the nexus directly
>>>> (as it was done before), but I also prefer to keep all Xen-related
>>>> devices in the same bus for coherency. The last option would be to
>>>> add some kind of suspend/resume priorities to the devices, and do
>>>> more than one suspend/resume pass.
>>>> This is more complex and requires more changes, so I would like to
>>>> know if it would be helpful for other systems, or if someone has
>>>> already attempted to do it.
>>>
>>> I think Justin Hibbits had some patches to make use of the boot-time
>>> new-bus passes for suspend and resume which I think would help with
>>> this. You suspend things in the reverse order of boot and resume
>>> operates in the same order as boot.
>>>
>>> --
>>> John Baldwin
>>
>> John is right. I have a (somewhat abandoned due to time and focus)
>> branch, https://svnweb.freebsd.org/base/projects/pmac_pmu/ which has
>> the necessary code working mostly on PowerPC. The diff can be found
>> at https://reviews.freebsd.org/D203 too.
>
> Cool. Does it have a mechanism similar to the attach code that lets
> you run again at each pass?
>
> Warner

Not exactly. The code calls BUS_SUSPEND_CHILD() as it rolls back the
pass levels, and stops on errors. The meat is in a rewrite of
bus_generic_suspend() in that review.

- Justin
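
For readers following the thread, here is a rough userland sketch of the
ordering idea discussed above: suspend walks the pass levels from the
highest (latest at boot) down to the lowest and stops at the first
error, and resume walks them in boot order. This is not the code from
D203 and it does not use the kernel API. The struct, the pass numbers,
and the sketch_* function names are all made up for illustration, and
what should happen to already-suspended devices after an error is left
open because the thread does not say.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical pass numbering: 0 is earliest at boot, 2 is latest. */
#define SKETCH_PASS_MAX 2

/* Hypothetical device record; not a new-bus device_t. */
struct sketch_dev {
    const char *name;
    int pass;          /* pass level the device attached at */
    int fail_suspend;  /* errno to return from suspend, 0 for success */
    int suspended;     /* 1 while the device is suspended */
    int seq;           /* position in the most recent suspend/resume walk */
};

/*
 * Suspend in reverse boot order: highest pass first, and within a pass
 * the most recently listed device first. Stops at the first error and
 * returns it; devices suspended so far are left suspended.
 */
static int
sketch_suspend_all(struct sketch_dev *devs, size_t ndev)
{
    int seq = 0;

    assert(devs != NULL || ndev == 0);
    for (int pass = SKETCH_PASS_MAX; pass >= 0; pass--) {
        for (size_t i = ndev; i-- > 0; ) {
            struct sketch_dev *d = &devs[i];

            if (d->pass != pass)
                continue;
            if (d->fail_suspend != 0)
                return (d->fail_suspend);
            d->suspended = 1;
            d->seq = seq++;
        }
    }
    return (0);
}

/* Resume in boot order: lowest pass first, so timers precede consumers. */
static void
sketch_resume_all(struct sketch_dev *devs, size_t ndev)
{
    int seq = 0;

    assert(devs != NULL || ndev == 0);
    for (int pass = 0; pass <= SKETCH_PASS_MAX; pass++) {
        for (size_t i = 0; i < ndev; i++) {
            struct sketch_dev *d = &devs[i];

            if (d->pass != pass || !d->suspended)
                continue;
            d->suspended = 0;
            d->seq = seq++;
        }
    }
}
```

With a table holding a late-pass uart, a timer at a middle pass, and a
bus at pass 0, sketch_suspend_all() touches the uart first and the bus
last, and sketch_resume_all() brings them back in the opposite order,
which is the property Roger was asking for with the PV timecounter.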