From: Doug Ambrisko
To: Anton Yuzhaninov
Cc: freebsd-stable@freebsd.org
Subject: Re: Problem with IPMI KCS driver
Date: Thu, 18 Oct 2012 14:43:02 -0700
Message-ID: <20121018214302.GA8009@ambrisko.com>

On Thu, Oct 18, 2012 at 01:44:59PM +0400, Anton Yuzhaninov wrote:
| On 28.09.2012 16:48, John Baldwin wrote:
| >>kcs_wait_for_obf() at kcs_wait_for_obf+0xb6 point to
| >>> /usr/src/sys/dev/ipmi/ipmi_kcs.c:94
| >>>
| >>> 91 while (ticks - start < MAX_TIMEOUT &&
| >>> 92     !(status & KCS_STATUS_OBF)) {
| >>> 93         DELAY(100);
| >>> 94         status = INB(sc, KCS_CTL_STS);
| >>> 95 }
| >Hummm. I'm a bit out of ideas then. Even the volatile change is a bug that
| >could have been confirmed (to see if volatile was preventing the compiler
| >from caching the value of 'ticks') by examining the assembly.
| >
| >Well, maybe this. This just avoids using 'ticks' altogether and depends on
| >DELAY(100) doing what it says:
|
| New patch also don't solve my problem.
|
| My guess was wrong. Loop in kcs_wait_for_obf() is not endless, at least
| with last patch.
| Whole function called in some loop, but because loop in kcs_wait_for_obf()
| takes much CPU time, backtrace always point to loop kcs_wait_for_obf().

Yep, the IPMI local interfaces are polled, so they use a lot of CPU. Once
a command is submitted, the driver is pretty much always going to be
checking "are you done yet".

We have local patches here that change the DELAY into a tsleep when the
system is running. It has the bad feature of making it a lot slower, but
it uses far less CPU, so for us it is a good trade-off.

One reason to put it into a loop is so things happen in order and are not
interrupted. I guess a different approach might be to take a "big" lock
around the entire submit-and-get-response code fragment. Then the CPU time
would be charged to the application thread running in the kernel.

We also have local changes to allow it to run in polled mode without the
kernel thread when we are dumping a kernel backtrace into the IPMI system
event log. That's nice when the kernel core dump hasn't worked on a remote
machine but we can still see the backtrace in the SEL.

| This problem need further investigation.

It might be good to instrument the code in ipmi.c where it sends a command
and then gets the status. If that actually looks okay, then maybe some
application is doing something bad.

Doug A.
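For anyone following the thread, here is a rough user-space sketch of the idea quoted above: bound the wait by counting iterations instead of reading 'ticks', and pass in the wait routine so a busy delay can be swapped for a sleeping one. This is not the driver code or either patch. The names fake_kcs, fake_read_status, fake_wait, wait_fn and the max_polls parameter are made up for illustration, and the KCS_STATUS_OBF value (bit 0) is an assumption here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed value for illustration: OBF as bit 0 of the KCS status byte. */
#define KCS_STATUS_OBF 0x01

/* Hypothetical stand-in for the BMC: reports OBF on the Nth status read. */
struct fake_kcs {
	int polls_until_ready;	/* status read on which OBF first appears */
	int polls;		/* status reads performed so far */
};

static uint8_t
fake_read_status(struct fake_kcs *k)
{
	k->polls++;
	return ((k->polls >= k->polls_until_ready) ? KCS_STATUS_OBF : 0);
}

/* Hypothetical wait hook: a busy delay or a sleep could be plugged in. */
typedef void (*wait_fn)(unsigned int usec);

/* Test double that only records how long the caller asked to wait. */
static unsigned long waited_usec;

static void
fake_wait(unsigned int usec)
{
	waited_usec += usec;
}

/*
 * Poll for OBF at most max_polls times after the initial status read.
 * The bound does not depend on a tick counter, only on each wait(100)
 * call taking roughly the time it is asked to take.
 * Returns true if OBF was seen, false if the poll budget ran out.
 */
static bool
wait_for_obf(struct fake_kcs *k, int max_polls, wait_fn wait)
{
	uint8_t status;
	int i;

	assert(max_polls >= 0);
	status = fake_read_status(k);
	for (i = 0; i < max_polls && !(status & KCS_STATUS_OBF); i++) {
		wait(100);
		status = fake_read_status(k);
	}
	return ((status & KCS_STATUS_OBF) != 0);
}
```

With a fake device that becomes ready on the third status read, wait_for_obf() returns true after two waits. With one that never becomes ready, it gives up after max_polls waits. Since the wait routine is a parameter, the spin-versus-sleep trade-off stays a local choice in the caller rather than being baked into the loop.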