Date:      Thu, 22 Dec 2016 14:37:04 -0500
From:      Justin Hibbits <chmeeedalf@gmail.com>
To:        Warner Losh <imp@bsdimp.com>
Cc:        Roger Pau Monné <roger.pau@citrix.com>, FreeBSD Arch <freebsd-arch@freebsd.org>
Subject:   Re: Order of device suspend/resume
Message-ID:  <6C1FBD30-8301-4C6D-8C8B-653C6C096A93@gmail.com>
In-Reply-To: <CANCZdfp_Hx7AXT1N5zjjwMxBMvRM5F_VdkJo+=XmUhXj_zxhNA@mail.gmail.com>
References:  <20161215114033.r33nt3fqhnfi7hqw@dhcp-3-221.uk.xensource.com> <7469755.xT5lfhErkd@ralph.baldwin.cx> <CDAA6577-C325-4691-9317-8CB0CE30959D@gmail.com> <CANCZdfp_Hx7AXT1N5zjjwMxBMvRM5F_VdkJo+=XmUhXj_zxhNA@mail.gmail.com>

On Dec 16, 2016, at 12:25 AM, Warner Losh wrote:

> On Thu, Dec 15, 2016 at 8:34 PM, Justin Hibbits <chmeeedalf@gmail.com> wrote:
>>
>> On Dec 15, 2016, at 3:38 PM, John Baldwin wrote:
>>
>>> On Thursday, December 15, 2016 11:40:33 AM Roger Pau Monné wrote:
>>>>
>>>> Hello,
>>>>
>>>> I'm currently dealing with a bug in the Xen suspend/resume sequence,
>>>> and I've found that lacking a way to order device priority during
>>>> suspend/resume is proving quite harmful for Xen (and maybe other
>>>> systems too). The current suspend/resume code simply scans the root
>>>> bus, and suspends/resumes every device based on the order they are
>>>> attached to their parents. The problem here is that there's no way
>>>> to tell that some devices should be resumed before others. For
>>>> example, the event timers/time counters/uarts should definitely be
>>>> resumed before other devices, but that seems to happen mostly by
>>>> chance.
>>>>
>>>> Currently most time-related devices are attached directly to the
>>>> nexus, which means they will get resumed first, but for example the
>>>> uart is currently attached to the pci bus IIRC, which means it gets
>>>> resumed quite late. On Xen systems, this is even worse. The Xen PV
>>>> bus (which contains all Xen-related devices) is attached last
>>>> (because it tends to pick up unused memory regions for its own
>>>> usage), and this bus also contains the PV timecounter, which should
>>>> be resumed _before_ other devices, or else timecounting will be
>>>> completely screwed and things can get stuck in indefinitely long
>>>> loops (because the timecounter is implemented based on the uptime
>>>> of the host, and that changes from host to host).
>>>>
>>>> In order to solve this I could add a hack to the Xen resume process
>>>> (which is already different from the ACPI one), but this looks
>>>> gross. I could also attach the Xen PV timer to the nexus directly
>>>> (as it was done before), but I would prefer to keep all Xen-related
>>>> devices in the same bus for coherency. The last option would be to
>>>> add some kind of suspend/resume priorities to the devices, and do
>>>> more than one suspend/resume pass. This is more complex and requires
>>>> more changes, so I would like to know if it would be helpful for
>>>> other systems, or if someone has already attempted to do it.
>>>
>>>
>>> I think Justin Hibbits had some patches to make use of the boot-time
>>> new-bus passes for suspend and resume, which I think would help with
>>> this.  You suspend things in the reverse order of boot, and resume
>>> operates in the same order as boot.
>>>
>>> --
>>> John Baldwin
>>
>>
>> John is right.  I have a (somewhat abandoned due to time and focus)
>> branch, https://svnweb.freebsd.org/base/projects/pmac_pmu/ which has
>> the necessary code working mostly on PowerPC.  The diff can be found
>> at https://reviews.freebsd.org/D203 too.
>
> Cool. Does it have a mechanism similar to the attach code that lets
> you run again at each pass?
>
> Warner

Not exactly.  The code will call BUS_SUSPEND_CHILD() as it rolls back
the pass levels, and stop on errors.  The meat is in a rewrite of
bus_generic_suspend() in that review.
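
For illustration only, the ordering can be sketched as a small standalone
userland model.  This is not the code from D203: the pass names,
struct dev, and the suspend_child()/resume_child() helpers are
hypothetical stand-ins (suspend_child() plays the role of
BUS_SUSPEND_CHILD()), and the real new-bus pass levels and device tree
are more involved.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical pass levels, loosely modeled on the boot-time passes. */
enum pass { PASS_BUS = 0, PASS_TIMER = 1, PASS_DEFAULT = 2, PASS_COUNT = 3 };

struct dev {
	const char *name;
	enum pass   pass;      /* pass at which the device attached */
	int         fail;      /* nonzero: simulate a suspend error */
	int         suspended; /* 1 while the device is suspended */
	int         seq;       /* sequence number of the last transition */
};

static int seq; /* global event counter, used only to observe ordering */

/* Stand-in for BUS_SUSPEND_CHILD(): returns 0 or an errno-style value. */
static int
suspend_child(struct dev *d)
{
	if (d->fail != 0)
		return (d->fail);
	d->suspended = 1;
	d->seq = ++seq;
	printf("suspend %s\n", d->name);
	return (0);
}

/* Stand-in for the matching resume call. */
static void
resume_child(struct dev *d)
{
	d->suspended = 0;
	d->seq = ++seq;
	printf("resume  %s\n", d->name);
}

/* Suspend in reverse boot order: roll back from the highest pass. */
static int
suspend_all(struct dev *devs, size_t n)
{
	assert(devs != NULL);
	for (int p = PASS_COUNT - 1; p >= 0; p--) {
		for (size_t i = n; i-- > 0;) {
			if ((int)devs[i].pass != p)
				continue;
			int error = suspend_child(&devs[i]);
			if (error != 0)
				return (error); /* stop on the first error */
		}
	}
	return (0);
}

/* Resume in boot order: lowest pass first, only what was suspended. */
static void
resume_all(struct dev *devs, size_t n)
{
	assert(devs != NULL);
	for (int p = 0; p < PASS_COUNT; p++) {
		for (size_t i = 0; i < n; i++) {
			if ((int)devs[i].pass == p && devs[i].suspended)
				resume_child(&devs[i]);
		}
	}
}
```

With a uart in the default pass and a timer in an earlier pass,
suspend_all() suspends the uart first and the timer later, and
resume_all() brings the timer back before the uart, whatever the order
of the array.  In this sketch a failed suspend_all() leaves the devices
that were already suspended for the caller to undo with resume_all();
that is an assumption of the model, not a statement about the review.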

- Justin


