From owner-freebsd-stable@FreeBSD.ORG Thu Feb 9 20:24:39 2006
Message-ID: <43EBA4F7.7040407@deepcore.dk>
Date: Thu, 09 Feb 2006 21:24:23 +0100
From: Søren Schmidt
User-Agent: Thunderbird 1.5 (X11/20060130)
To: Wilko Bulte
Cc: stable@FreeBSD.ORG
Subject: Re: Showstopper ATA bug in 6.1-PRE?
References: <20060208194603.GA689@freebie.xs4all.nl>
 <43EA5C50.5020804@deepcore.dk> <20060208213704.GA703@freebie.xs4all.nl>
 <43EA6625.2070106@deepcore.dk> <20060208221056.GA1299@freebie.xs4all.nl>
 <43EB5393.5090502@deepcore.dk> <20060209144250.GB4874@freebie.xs4all.nl>
 <43EB55A1.9040405@deepcore.dk> <20060209201912.GA680@freebie.xs4all.nl>
In-Reply-To: <20060209201912.GA680@freebie.xs4all.nl>
List-Id: Production branch of FreeBSD source code

Wilko Bulte wrote:
> On Thu, Feb 09, 2006 at 03:45:53PM +0100, Søren Schmidt wrote..
>> Wilko Bulte wrote:
>>> On Thu, Feb 09, 2006 at 03:37:07PM +0100, Søren Schmidt wrote..
>>>> Wilko Bulte wrote:
>>>>> On Wed, Feb 08, 2006 at 10:44:05PM +0100, Søren Schmidt wrote..
>>>>>> Wilko Bulte wrote:
>>>>>>> On Wed, Feb 08, 2006 at 10:02:08PM +0100, Søren Schmidt wrote..
>>>>>>>> Wilko Bulte wrote:
>>>>>>>>> Hi Soren,
>>>>>>>>>
>>>>>>>>> I just went to 6.1-PRE on my main machine, coming from 6.0-STABLE
>>>>>>>>> of roughly end of December.
>>>>>>>>>
>>>>>>>>> And I hit some stuff that really worries me:
>>>>>>>>>
>>>>>>>>> - the freshly built kernel keels over with (hand transcribed):
>>>>>>>>>
>>>>>>>>> ata3: reiniting channel SATA connect ...
>>>>>>>>> SATA connected
>>>>>>>>> sata_connect_devices 0x1
>>>>>>>>>
>>>>>>>>> ad6: req=0xC35ba0c8 SETFEATURES SETTRANSFERMODE semaphore timeout
>>>>>>>>> !! DANGER Will RObinson !!
>>>>>>>>>
>>>>>>>>> (... is where I cannot read my own handwriting, it scrolled quite
>>>>>>>>> fast on the screen..)
>>>>>>>>>
>>>>>>>>> Boot device is a SATA RAID1 on a Promise 2300.
>>>>>>>> Hmm, that should not happen. Could you try to backstep just ATA to
>>>>>>>> before the MFC, that is 24/1/06, and let me know if that helps please?
>>>>>>> First impression is that the problem is gone. None of the previously
>>>>>>> reported errors are seen. I am running a level 0 dump from disk to
>>>>>>> disk to see if the box remains stable. Given that this is my primary
>>>>>>> machine I sure hope it will be :-)
>>>>>>>
>>>>>>>>> Another snag is that my ad10 disk on 6.0-STABLE suddenly became ad12
>>>>>>>>> on 6.1-PRE
>>>>>>>> Hmm, that is because there are only 2 ports on your Promise, which is
>>>>>>>> now correctly identified; before it was erroneously found as 3 ports.
>>>>>>> Ah, OK.
>>>>>>> I would suggest a note to the Release Note writers would be a good
>>>>>>> thing, devices changing location after an upgrade in the -stable branch
>>>>>>> is unnerving ;-)
>>>>>> Well, the good thing is that I can reproduce the error here, the bad
>>>>>> thing is that it slipped through testing on -current...
>>>>>> Oh, well, I'll look into it ASAP...
>>>>> Thank you Soren!
>>>> OK, had a few this afternoon, could you try this patch and let me know
>>>> if it helps, at least it makes the problem go away on my testbed..
>>> Is this relative to HEAD or RELENG_6? I cannot / will not go to HEAD
>>> with this machine (my main production box.. :-)
>> Doesn't matter, ATA is the same on both...
>
> OK, I was not sure if they were 100% identical.
>
> The patch at first impression seems to have eliminated the problem.

Good, seems I'm on the right track at least.

> Interestingly enough ad10 remained ad10 with the patch applied?

Yeah, that's intentional, I thought we better not break POLA here..

> I'll put some load on to see what happens.

Let me know how that turns out, I'll clean things up a bit and get it
committed to -current, then get permission to MFC when we are sure it
fixes the problem...

-Søren