From: Ade Lovett
Date: Sat, 17 Sep 2005 00:19:09 -0700
To: Scott Long
Cc: freebsd-scsi@freebsd.org
Subject: Re: CAM tags / reset problem

On Sep 17, 2005, at 00:13, Scott Long wrote:

> It's a bit late tonight for me to think coherently about this and give
> the patch a good review, but I promise that it will happen. Maybe
> Justin, Ken, Matt, or Nate can also step in and look at it.
> Are you saying that there are multiple bus resets happening in
> relatively quick succession on each bus during the initial probe?
> Are you observing this with only a particular controller family,
> or with multiple ones?

This is fallout from the well-documented issues with multiple Seagate
drives on an Adaptec 39320 controller:

ahd0: port 0x3400-0x34ff, 0x3000-0x30ff mem 0xfc200000-0xfc201fff irq 48 at device 1.0 on pci2
ahd0: [GIANT-LOCKED]
aic7902: Ultra320 Wide Channel A, SCSI Id=7, PCI-X 101-133Mhz, 512 SCBs

After the recent nswbuf patch, we were still noticing performance
anomalies on the chain.

This is certainly a degenerate case (Seagate drives on a U320 chain),
but as far as I can tell it applies to any controller/disk combination
if multiple bus resets occur on startup. I've got another test box
with Hitachi drives (same controller) and am trying to figure out a way
to engineer a set of events that would cause such multiple resets, but
have so far been unsuccessful.

However, with this patch, we now have chains of 7/8 Seagate drives
running at U320 with overall bus throughput around 312MBps. Before the
patch, on random reads/writes, we were observing less than 100MBps.

It definitely merits a look to see whether it breaks anything elsewhere
(particularly with "well-behaved" SCSI controller/disk combinations),
hence the request for review.

-aDe