From: Miroslav Lachman <000.fbsd@quip.cz>
Date: Fri, 17 Oct 2008 16:09:17 +0200
To: Jeremy Chadwick
Cc: Gavin Atkinson, freebsd-stable@FreeBSD.org
Subject: Re: Recommendations for servers running SATA drives [hot-swap]
Message-ID: <48F89C8D.5020301@quip.cz>
In-Reply-To: <20081017120858.GA20746@icarus.home.lan>

Jeremy Chadwick wrote:
> On Fri, Oct 17, 2008 at 01:50:38PM +0200, Miroslav Lachman wrote:
>
>>Jeremy Chadwick wrote:
>>
>>>On Thu, Oct 16, 2008 at 09:30:20PM +0200, Miroslav Lachman wrote:
>>>
>>>>Today I was replacing disk in one Sun Fire X2100 M2 so I tried
>>>>hot-swapping. It was as you said: atacontrol detach ata3, replace the
>>>>HDD, atacontrol attach ata3 and new disk is in the system. I tried
>>>>it 3 times to be sure that it was not coincidence - no panic was
>>>>produced ;o)
>>>>So in this case, hot-swapping on Sun Fire X2100 M2 with FreeBSD 7.0
>>>>i386 works.
>>>
>>>That's excellent news. So it seems possibly the problem I was seeing
>>>was with "reinit" causing some sort of chaos. I'll have to check things
>>>on my testbox here at home to see how I caused the panic last time.
>>>
>>>Thanks for providing feedback, as usual! :-)
>>
>>Unfortunately there is one problem - I see a lot of interrupts after
>>disk swapping (about 193k of atapci1)
>>
>>Interrupts
>>197k total
>>     ohci0 21
>>     ehci0 22
>>193k atapci1 23
>>2001 cpu0: time
>>   1 bge1 273
>>2001 cpu1: time
>
> Okay, so it looks like the interrupt rate on atapci1 after swapping is
> going crazy. What you're showing there looks like heavily modified
> vmstat -i output.

The output shown was manually cropped from systat -vm; I'll try
vmstat -i next time. ;)

>>Full output of systat -vm 2 is attached.
>>
>>It is shown in top as 50% interrupt (CPU state) and load 1 until I
>>rebooted the machine (I can provide MRTG graphs). The system was not in
>>production load, but almost idle. (I will put it in production tomorrow).
>>After reboot, everything is OK.
>
> And this box is running the ATA patch Andrey provided, yes?

It is a clean install of FreeBSD 7.0-RELEASE-p5 amd64 without patches.

>>Can somebody test hot-swapping with SATA drives and confirm this
>>behavior? (I can't test it now, because machine is in datacenter)
>
> I can test it on my P4SCE box.
>
> I'll check the interrupt rates after each step of the hot-swap to see
> if/when the problem starts.

I'll check the interrupts next time too and will post the results to
this thread.

Miroslav Lachman
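
For checking the interrupt rate after each hot-swap step (after atacontrol detach, after replacing the disk, after atacontrol attach), a small parser for vmstat -i output could help. This is only a rough sketch: it assumes the usual three-column "interrupt / total / rate" layout of vmstat -i, the function names and the 10000/s threshold are my own inventions, and the sample text is made up for illustration (the rates echo the systat figures quoted above, the totals are arbitrary).

```python
import re

# One data row of `vmstat -i`: "<source name>  <total>  <rate>".
# The header line has no numeric columns, so it never matches.
ROW = re.compile(r"^(?P<name>.+?)\s+(?P<total>\d+)\s+(?P<rate>\d+)\s*$")


def parse_vmstat_i(text):
    """Return {source_name: (total, rate)} parsed from `vmstat -i` output."""
    rows = {}
    for line in text.splitlines():
        m = ROW.match(line)
        if m is None:  # header or blank line
            continue
        name = m.group("name").strip()
        if name == "Total":  # skip the summary row
            continue
        rows[name] = (int(m.group("total")), int(m.group("rate")))
    return rows


def noisy_sources(text, max_rate=10000):
    """List sources whose rate exceeds max_rate (an arbitrary, assumed threshold)."""
    parsed = parse_vmstat_i(text)
    return sorted(name for name, (_, rate) in parsed.items() if rate > max_rate)


# Made-up sample in the assumed vmstat -i layout; not real measurements.
SAMPLE = """\
interrupt                          total       rate
irq23: atapci1                 579000000     193000
cpu0: timer                      6003000       2001
cpu1: timer                      6003000       2001
Total                          591006000     197002
"""

if __name__ == "__main__":
    print(noisy_sources(SAMPLE))
```

Capturing vmstat -i before the detach and again after each step, then running each capture through noisy_sources(), should show at which step atapci1 starts misbehaving, if it does.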