From: "Dieter BSD"
Date: Wed, 26 Dec 2012 03:48:04 -0500
Subject: Re: FreeBSD for serious performance?
To: freebsd-hackers@freebsd.org
Message-ID: <20121226084805.91840@gmx.com>

> If the driver is doing something daft like DELAY(x) in a fast
> interrupt handler which would lead to that behaviour, it should be
> fixed.
>
> If it's doing a DELAY(x) in a critical section, it should be fixed.

They are doing *something* that completely locks out everything else.
It is always a device driver.

> Now, it's quite likely you hit some kind of ata(4) bug which kept it
> in a tight loop

Hard to imagine locking everything out for 19 minutes without being
in a loop.

> So it was likely just spun
> in some high priority loop that nothing lower-priority could really do
> anything about.

Would several different drivers have this same bug?

> The next time it happens, please break into the debugger and grab some
> debugging output. Show alllocks, ps, should be a good couple of things
> to start with.

I've only caught it hanging forever once. It only takes a few
milliseconds to cause incoming data to be lost, so I usually don't
know about it until looking at the log file later. Not that I could
jump into the debugger and gather data in a few milliseconds even
if I knew when it was happening.

BTW, how do I break into the debugger and gather data when all of
the devices are locked out, including the console? I assume that
once it recovers, there is no point in gathering data.

> Alternately - please find a currently actively maintained SATA chipset.

The ata controller is soldered to the mainboard, a gazillion pins
I'm sure, and no doubt requires very specialized equipment to replace,
and I don't know of any pin-compatible replacements. Besides, the
hardware itself has never caused any problems. The problem is caused
by the software; it is the software that needs to be fixed.

Ata isn't maintained? Why the bleep not? Disk drivers are essential.

I was under the impression that siis(4) and ahci(4) were actively
maintained? I'm running four SATA controllers using three different
drivers, and all three drivers lock out other drivers for too long
when something unusual happens. And other, non-disk drivers have
the same problem of locking out other drivers, even during normal
operation. And this happens with yet other drivers on other people's
hardware, not just mine.

> help migrate the nvidia chipset support out of ata(4)

I've looked at several of FreeBSD's device drivers (including, as
you might expect, ata, siis, and ahci) and I can't make heads or
tails out of any of them.
Back before FreeBSD existed, I did manage to make a significant
improvement to a driver in a BSD-derived system, so I'm not a
complete idiot.

Several different drivers cause the same problem. Are they all
making the same mistake? Or is there a problem in something they
all use? Whether a design problem or an implementation bug.
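
Since the short lockouts are only noticed later in a log file, one
way to at least timestamp them might be a userland heartbeat that
records when each wake-up actually happened and flags gaps. The
following is only a rough sketch in Python, not anything from the
FreeBSD tree: the names find_stalls and sample_heartbeat, the 1 ms
interval, and the tolerance are all made up for illustration. It
assumes the lockout also stalls userland scheduling, and it cannot
say which driver was responsible.

```python
import time


def find_stalls(timestamps, expected_interval, tolerance):
    """Return (start_time, gap) pairs, in seconds, for every pair of
    consecutive timestamps that are further apart than
    expected_interval + tolerance."""
    stalls = []
    for earlier, later in zip(timestamps, timestamps[1:]):
        gap = later - earlier
        if gap > expected_interval + tolerance:
            stalls.append((earlier, gap))
    return stalls


def sample_heartbeat(duration, interval):
    """Sleep `interval` seconds in a loop for roughly `duration`
    seconds, recording the monotonic clock at each wake-up.
    A system-wide lockout shows up as a large gap between entries."""
    stamps = []
    deadline = time.monotonic() + duration
    while time.monotonic() < deadline:
        stamps.append(time.monotonic())
        time.sleep(interval)
    return stamps
```

Running sample_heartbeat(60.0, 0.001) and passing the result to
find_stalls(stamps, 0.001, 0.005) would list every wake-up that came
more than about 5 ms late. Those times could then be matched against
the lost-data entries in the log. Ordinary scheduling jitter will
produce some false positives, so the tolerance would need tuning.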