From owner-freebsd-acpi@FreeBSD.ORG Thu Jan 10 15:56:25 2008 Return-Path: Delivered-To: freebsd-acpi@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 7BA4B16A417 for ; Thu, 10 Jan 2008 15:56:25 +0000 (UTC) (envelope-from robert.moore@intel.com) Received: from mga11.intel.com (mga11.intel.com [192.55.52.93]) by mx1.freebsd.org (Postfix) with ESMTP id 5955C13C459 for ; Thu, 10 Jan 2008 15:56:24 +0000 (UTC) (envelope-from robert.moore@intel.com) Received: from fmsmga002.fm.intel.com ([10.253.24.26]) by fmsmga102.fm.intel.com with ESMTP; 10 Jan 2008 07:54:58 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="4.24,267,1196668800"; d="scan'208";a="283095795" Received: from orsmsx334.jf.intel.com ([10.22.226.45]) by fmsmga002.fm.intel.com with ESMTP; 10 Jan 2008 07:53:28 -0800 Received: from orsmsx415.amr.corp.intel.com ([10.22.226.49]) by orsmsx334.jf.intel.com with Microsoft SMTPSVC(6.0.3790.1830); Thu, 10 Jan 2008 07:54:06 -0800 X-MimeOLE: Produced By Microsoft Exchange V6.5 Content-class: urn:content-classes:message MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable Date: Thu, 10 Jan 2008 07:53:16 -0800 Message-ID: In-Reply-To: X-MS-Has-Attach: X-MS-TNEF-Correlator: Thread-Topic: GPE handler livelock [fixed?] Thread-Index: AchTiUeLz6GxwdkQTS23flA3PBzpwAAFwjAg References: <200712311556.lBVFuVZf030567@freefall.freebsd.org><477916E0.2090702@root.org><200712311243.18123.jhb@freebsd.org><47802510.3040203@root.org><47826AAA.6040101@root.org><47827291.60405@suse.de> <478272C3.5080704@suse.de><47829C03.3010802@root.org> <4782A051.3050107@suse.de><4782BA24.8030703@root.org> From: "Moore, Robert" To: "Yousif Hassan" , "Nate Lawson" , "Alexey Starikovskiy" X-OriginalArrivalTime: 10 Jan 2008 15:54:06.0110 (UTC) FILETIME=[0785ABE0:01C853A1] Cc: freebsd-acpi@FreeBSD.org Subject: RE: GPE handler livelock [fixed?] X-BeenThere: freebsd-acpi@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: ACPI and power management development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 10 Jan 2008 15:56:25 -0000 This is great news. Special thanks to Alexey. We've been going around and around for almost 2 years attempting to solve these kinds of notify/GPE issues. It sounds like we may have solved the issue(s). We will of course integrate the changes that were made to the core ACPICA code. Bob >-----Original Message----- >From: owner-freebsd-acpi@freebsd.org [mailto:owner-freebsd- >acpi@freebsd.org] On Behalf Of Yousif Hassan >Sent: Thursday, January 10, 2008 4:57 AM >To: Nate Lawson; Alexey Starikovskiy >Cc: freebsd-acpi@FreeBSD.org; Moore, Robert >Subject: Re: GPE handler livelock [fixed?] > >Nate, everyone: > >Well guess what? The patch appears to WORK!!!! > >I tested it first with a normal startup with ACPI enabled, and it didn't >freeze. >Not satisfied, I turned on verbose booting to watch the thermal change >messages, >and in fact, for the first time, 7.0 successfully changed AC states and TZ >zones >on this laptop. > >Still not satisfied, I began some tasks that generate lots of heat (say, >compiling the kernel), and watched as the temperature rose. No freeze. > >So, I'd like to give a huge thank-you to Nate and the gang, and cautiously, >I'd like to say congratulations, too. > >I'm happy to run any additional tests you need to confirm. > >Is this fix too late for 7.0 or is it a candidate for an RC? I'd >definitely >be >willing to test any new release w/ this fix to make sure it works >out-of-the-box. > >--Yousif > >----- Original Message ----- >From: "Nate Lawson" >To: "Alexey Starikovskiy" >Cc: "Moore, Robert" ; "Yousif Hassan" >; >Sent: Monday, January 07, 2008 6:47 PM >Subject: Re: GPE handler livelock > > >> This makes sense. For the _L00 method in question, it runs 2 notifies >> before it completes. So in our queue, that would be: >> >> T1: GPE:_L00 >> [_LOO method runs, GPE removed] >> [_L00 queues 2 more Notifies] >> T2: Notify, Notify >> [_L00 completes] >> >> Right now, there is nothing in our code to synchronize _L00's completion >> with completion of the Notifys. In fact, _L00 finishes before the >> Notifys run. Also, if another GPE comes in while this GPE is running, >> it will be queued in front of the Notifys and so would execute first. >> >> So I think with just your evgpe.c change, we will defer re-enabling the >> GPE until after the queued Notifys complete since the re-enable task >> will be queued at the same priority. >> >> The only remaining concern I have is if another GPE comes in before one >> or both Notifys run, it will be inserted at the head of the queue. So >> as a hack, I'm setting the priority of these to be equal. >> >> Yousif, please try the attached patch and don't load a custom ASL. >> >> Thanks, >> Nate >> >> Alexey Starikovskiy wrote: >>> Nate, >>> >>> There are no debugger events in Linux, and all other deferred execution >>> happens >>> in kacpid (GPE, EC and GL included). Notify events used to be executed >>> on this same >>> queue until we noticed deadlocks with HP machines. All events are not >>> prioritized in >>> any way -- it is a simple FIFO. To avoid deadlock, we moved notify >>> events to separate queue, >>> but it had a drawback of enabling level GPE too early, thus I inserted a >>> reschedule call to each completion on first queue, giving the notify >>> queue chance to complete. >>> Later, I was reminded that this approach is not bulletproof, so enabling >>> of the level events was >>> moved to notify queue as well. As it happens after all notify events for >>> the gpe event were called, >>> (but, probably, not executed), enabling of GPE will be deferred until >>> these notify events have chance to >>> complete. >>> >>> So, essentially, we had no priority for any event, but now notify event >>> could preempt execution of >>> any other event, and level GPE event does a flush of notify queue. >>> >>> Hope this helps, >>> Alex. >>> >>> Nate Lawson wrote: >>>> Alex, >>>> >>>> I had one question about your approach. It maintains two single-thread >>>> task queues (kacpid and kacpi_notify). It inserts each type of event >on >>>> its own queue. So there is no strict ordering of handling notifies in >>>> priority to other acpi tasks unless you're assuming something about the >>>> linux task priority model. Do you have any expectation that notify >>>> tasks run before other tasks (perhaps by a special priority assigned to >>>> the kacpi_notify work queue)? >>>> >>>> In FreeBSD, we have a single task queue. However, we prioritize events >>>> in the queue in the following order (highest to lowest priority): >>>> >>>> * GPE >>>> * EC/global lock >>>> * Notify >>>> * Debugger >>>> >>>> If an event is inserted on the queue with a higher priority and a >>>> previous event has not started executing yet, this priority determines >>>> the order of insertion. Thus if GPEs keep arriving, the Notify won't >be >>>> executed until they're done. >>>> >>>> Thanks, >>>> Nate >>>> >>>> Alexey Starikovskiy wrote: >>>>> Here is the patch... >>>>> Alexey Starikovskiy wrote: >>>>>> I proposed this patch some time ago, it moves _Lxx enabling to the >end >>>>>> of Notify queue, thus all notifies must complete before event becomes >>>>>> enabled again. >>>>>> Hope it is readable to non-Linux people... >>>>>> >>>>>> Regards, >>>>>> Alex. >>>>>> >>>>>> Moore, Robert wrote: >>>>>>> No changes that I know of before 20070508. >>>>>>> >>>>>>> You'll need to figure out why you are getting another GPE before the >>>>>>> _Lxx method completes. There was something like this on Linux with >>>>>>> an HP >>>>>>> machine, perhaps Alexey can help. >>>>>>> >>>>>>> As I recall, there was something nasty happening where the TZ trip >>>>>>> points had to be reset before the Notify() handler completed, but >>>>>>> this >>>>>>> ended up causing another GPE, etc. etc. >>>>>>> >>>>>>> Bob >>>>>>> >>>>>>> >>>>>>>> -----Original Message----- >>>>>>>> From: Nate Lawson [mailto:nate@root.org] >>>>>>>> Sent: Monday, January 07, 2008 10:09 AM >>>>>>>> To: Moore, Robert >>>>>>>> Cc: Yousif Hassan; freebsd-acpi@FreeBSD.org >>>>>>>> Subject: Re: GPE handler livelock >>>>>>>> >>>>>>>> Bob, thanks for the reply. That's exactly what my investigation is >>>>>>>> showing also. It appears we're still on 20070320 so I'm not sure >>>>>>>> why >>>>>>>> this would affect us though. Perhaps a similar change was already >>>>>>>> present? In any case, we should see if an import fixes this. >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Nate >>>>>>>> >>>>>>>> Moore, Robert wrote: >>>>>>>>> This sounds suspiciously like the changes we made to the Notify() >>>>>>>>> handling last year. We attempted to make the notify handler run >>>>>>>>> synchronously with the caller to Notify(), but this created more >>>>>>>>> problems than it solved. We ended up returning the behavior of >>>>>>>>> Notify >>>>>>>>> handlers to be asynchronous: >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> 19 October 2007. Summary of changes for version 20071019: >>>>>>>>> >>>>>>>>> 1) ACPI CA Core Subsystem: >>>>>>>>> >>>>>>>>> Reverted a change to Notify handling that was introduced in >version >>>>>>>>> 20070508. This version changed the Notify handling from >>>>>>>>> asynchronous >>>>>>> to >>>>>>>>> fully synchronous (Device driver Notify handling with respect to >>>>>>>>> the >>>>>>>>> Notify >>>>>>>>> ASL operator). It was found that this change caused more problems >>>>>>> than >>>>>>>>> it >>>>>>>>> solved and was removed by most users. >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>>> -----Original Message----- >>>>>>>>>> From: owner-freebsd-acpi@freebsd.org [mailto:owner-freebsd- >>>>>>>>>> acpi@freebsd.org] On Behalf Of Yousif Hassan >>>>>>>>>> Sent: Sunday, January 06, 2008 12:18 PM >>>>>>>>>> To: Nate Lawson >>>>>>>>>> Cc: freebsd-acpi@FreeBSD.org >>>>>>>>>> Subject: Re: GPE handler livelock >>>>>>>>>> >>>>>>>>>> Nate wrote: >>>>>>>>>>> Thanks for digging into this. I reviewed this and am trying to >>>>>>>>> figure >>>>>>>>>>> out why the _L00 handler never completes. It keeps getting >>>>>>> preempted >>>>>>>>> by >>>>>>>>>>> the next one. To help track this down, try removing these two >>>>>>> lines >>>>>>>>>>> from the _L00 method and recompile your ASL: >>>>>>>>>>> >>>>>>>>>>> Acquire (\_TZ.C173, 0xFFFF) >>>>>>>>>>> ... >>>>>>>>>>> Release (\_TZ.C173) >>>>>>>>>>> >>>>>>>>>>> For others who have this problem, instructions on how to >>>>>>>>>>> recompile >>>>>>>>> and >>>>>>>>>>> load your custom ASL can be found here (11.16.4 and 5): >>>>>>>>>>> >>>>>>> http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/acpi- >debug.htm >>>>>>> >>>>>>>>> l >>>> >> >> > > >----------------------------------------------------------------------- ---- >----- > > >> Index: sys/dev/acpica/Osd/OsdSchedule.c >> = =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D >> RCS file: /home/ncvs/src/sys/dev/acpica/Osd/OsdSchedule.c,v >> retrieving revision 1.39 >> diff -u -r1.39 OsdSchedule.c >> --- sys/dev/acpica/Osd/OsdSchedule.c 22 Mar 2007 18:16:41 -0000 1.39 >> +++ sys/dev/acpica/Osd/OsdSchedule.c 7 Jan 2008 23:46:47 -0000 >> @@ -114,7 +114,7 @@ >> pri =3D 5; >> break; >> case OSL_NOTIFY_HANDLER: >> - pri =3D 3; >> + pri =3D 10; // XXX testing >> break; >> case OSL_DEBUGGER_THREAD: >> pri =3D 0; >> Index: sys/contrib/dev/acpica/evgpe.c >> = =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D >> RCS file: /home/ncvs/src/sys/contrib/dev/acpica/evgpe.c,v >> retrieving revision 1.1.1.12 >> diff -u -r1.1.1.12 evgpe.c >> --- sys/contrib/dev/acpica/evgpe.c 22 Mar 2007 17:23:30 -0000 1.1.1.12 >> +++ sys/contrib/dev/acpica/evgpe.c 7 Jan 2008 21:17:28 -0000 >> @@ -123,6 +123,10 @@ >> >> /* Local prototypes */ >> >> +static void >> +AcpiEvAsynchEnableGpe ( >> + void *Context); >> + >> static void ACPI_SYSTEM_XFACE >> AcpiEvAsynchExecuteGpeMethod ( >> void *Context); >> @@ -684,14 +688,26 @@ >> } >> } >> >> - if ((LocalGpeEventInfo.Flags & ACPI_GPE_XRUPT_TYPE_MASK) =3D=3D >> + /* Defer enabling of GPE until all notify handlers are done */ >> + AcpiOsExecute(OSL_NOTIFY_HANDLER, AcpiEvAsynchEnableGpe, >> GpeEventInfo); >> + return_VOID; >> +} >> + >> +static void >> +AcpiEvAsynchEnableGpe ( >> + void *Context) >> +{ >> + ACPI_GPE_EVENT_INFO *GpeEventInfo =3D (void *) Context; >> + ACPI_STATUS Status; >> + >> + if ((GpeEventInfo->Flags & ACPI_GPE_XRUPT_TYPE_MASK) =3D=3D >> ACPI_GPE_LEVEL_TRIGGERED) >> { >> /* >> * GPE is level-triggered, we clear the GPE status bit after >> * handling the event. >> */ >> - Status =3D AcpiHwClearGpe (&LocalGpeEventInfo); >> + Status =3D AcpiHwClearGpe (GpeEventInfo); >> if (ACPI_FAILURE (Status)) >> { >> return_VOID; >> @@ -700,7 +716,7 @@ >> >> /* Enable this GPE */ >> >> - (void) AcpiHwWriteGpeEnableReg (&LocalGpeEventInfo); >> + (void) AcpiHwWriteGpeEnableReg (GpeEventInfo); >> return_VOID; >> } >> >> > >_______________________________________________ >freebsd-acpi@freebsd.org mailing list >http://lists.freebsd.org/mailman/listinfo/freebsd-acpi >To unsubscribe, send any mail to "freebsd-acpi-unsubscribe@freebsd.org"