From owner-freebsd-stable@FreeBSD.ORG Thu Jul 19 14:13:48 2012 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 937041065672; Thu, 19 Jul 2012 14:13:48 +0000 (UTC) (envelope-from eric@vangyzen.net) Received: from aussmtpmrkpc120.us.dell.com (aussmtpmrkpc120.us.dell.com [143.166.82.159]) by mx1.freebsd.org (Postfix) with ESMTP id 541728FC1D; Thu, 19 Jul 2012 14:13:48 +0000 (UTC) X-Loopcount0: from 64.238.244.148 X-IronPort-AV: E=Sophos;i="4.77,615,1336366800"; d="scan'208";a="508717179" Received: from mail.compellent.com ([64.238.244.148]) by aussmtpmrkpc120.us.dell.com with ESMTP; 19 Jul 2012 09:12:40 -0500 Message-ID: <500815D7.4030407@vangyzen.net> Date: Thu, 19 Jul 2012 09:12:39 -0500 From: Eric van Gyzen User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:12.0) Gecko/20120531 Thunderbird/12.0.1 MIME-Version: 1.0 To: Steve McCoy References: <4FDABA0B.5030702@greatbaysoftware.com> <4FFF34BA.9030002@greatbaysoftware.com> <4FFF9A50.40006@greatbaysoftware.com> <201207130939.54311.jhb@freebsd.org> <5005CD83.306@greatbaysoftware.com> In-Reply-To: <5005CD83.306@greatbaysoftware.com> Content-Type: text/plain; charset="UTF-8"; format=flowed Content-Transfer-Encoding: 8bit Cc: Charles Owens , freebsd-stable@freebsd.org, John Baldwin Subject: Re: mfi(4) IO performance regression, post 8.1 X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 19 Jul 2012 14:13:48 -0000 On 07/17/12 15:39, Steve McCoy wrote: > On 7/13/12 9:39 AM, John Baldwin wrote: >> On Thursday, July 12, 2012 11:47:28 pm Steve McCoy wrote: >>> On 7/12/12 4:34 PM, Steve McCoy wrote: >>>>>> John Baldwin wrote: >>>>> >>>>> Barring that, can you do a binary search of kernels from stable/8 >>>>> between 8.1 >>>>> and 8.2 on an 8.1 world to see which commit caused the change in write >>>>> performance? >>>>> >>>> >>>> Hi John, I'm working with Charles to narrow this down. >>>> >>>> Looks like revision 212229 is the culprit, or at least around the same >>>> time to it, if this change isn't what slowed things down. The change to >>>> sys/kern/vfs_bio.c modifies some synchronization in dev_strategy(): >>>> >>> >>> Actually, hold that thought. I had a hunch that I wasn't thorough >>> enough, so I decided to try 212228 — the performance is the same as with >>> 212229, so vfs_bio seems to be out of the picture. I'm going to binary >>> search between 209459 and 212229, and see what I find. >> >> Ok. Please let me know what you find. Thanks! >> > > Alright, I've finally narrowed it down to r209897, which only affects > acpi_cpu_idle(): > > --- stable/8/sys/dev/acpica/acpi_cpu.c 2010/06/23 17:04:42 209471 > +++ stable/8/sys/dev/acpica/acpi_cpu.c 2010/07/11 11:58:46 209897 > @@ -930,12 +930,16 @@ > > /* > * Execute HLT (or equivalent) and wait for an interrupt. We can't > - * calculate the time spent in C1 since the place we wake up is an > - * ISR. Assume we slept half of quantum and return. > + * precisely calculate the time spent in C1 since the place we wake up > + * is an ISR. Assume we slept no more then half of quantum. > */ > if (cx_next->type == ACPI_STATE_C1) { > - sc->cpu_prev_sleep = (sc->cpu_prev_sleep * 3 + 500000 / hz) / 4; > + AcpiHwRead(&start_time, &AcpiGbl_FADT.XPmTimerBlock); > acpi_cpu_c1(); > + AcpiHwRead(&end_time, &AcpiGbl_FADT.XPmTimerBlock); > + end_time = acpi_TimerDelta(end_time, start_time); > + sc->cpu_prev_sleep = (sc->cpu_prev_sleep * 3 + > + min(PM_USEC(end_time), 500000 / hz)) / 4; > return; > } > > My current guess is that AcpiHwRead() is a problem on our hardware. It's > an isolated change and, to my desperate eyes, the commit message implies > that it isn't critical — Do you think we could buy ourselves some time > by pulling it out of our version of the kernel? Or is this essential for > correctness? Any thoughts are appreciated, thanks! You might simply try a different idle function. See these sysctls: machdep.idle: acpi machdep.idle_available: spin, mwait, mwait_hlt, hlt, acpi, Eric