From: Scott Long
Date: Thu, 29 Apr 2010 23:07:15 -0600
To: Alexander Motin
Cc: freebsd-scsi@FreeBSD.org, FreeBSD Stable, Pete French, Robert Noland
Subject: Re: MFC of "Large set of CAM improvements" breaks I/O to Adaptec 29160 SCSI controller
Message-Id: <221F6444-3102-4CD7-A1A7-1DF4352E7F50@samsco.org>

On Apr 29, 2010, at 10:56 PM, Alexander Motin wrote:

> Scott Long wrote:
>> On Apr 29, 2010, at 7:47 AM, Robert Noland wrote:
>>>
>>> Scott Long wrote:
>>>> On Apr 29, 2010, at 2:50 AM, Pete French wrote:
>>>>>> Thanks. First step successful - I can steadily reproduce problem on
>>>>>> CURRENT. raidtest with 200 I/O streams over gmirror of two disks on same
>>>>>> channel triggers issue in seconds. Any I/O on channel dying after both
>>>>>> disks report "Queue full" error same time. The rest of system works
>>>>>> fine. If I preliminarily manually adjust queue depth of one disk -
>>>>>> everything works fine. I'll investigate it tomorrow.
>>>>> Glad you have managed to duplicate it - the queue depth thing is
>>>>> interesting, what changes did you make? I can try them here and see
>>>>> if they improve the situation on either of my two machines.
>>>>>
>>>> For the record, queue-full is a common, expected condition in CAM. It's not something that should be avoided =-)
>>> Should we maybe have a counter in sysctl rather than flooding the console with these messages then?
>>
>> That's a pretty good idea. I'll make it happen.
>
> It is already hidden behind bootverbose. Hiding it deeper will make
> unclear why CAM requeues the rest of commands (also reported under
> bootverbose). I've tuned log messages a bit recently and they seem to be
> more consistent and readable now IMHO.
>

We used to run FreeBSD at Yahoo with bootverbose turned on in order to help with debugging. After years of doing this, I finally turned bootverbose off last year, partially because of the excessive console spam produced by these queue-full messages. Even when we were writing the ahc/ahd drivers at Adaptec years ago, I never really liked these messages, and we rarely ran with bootverbose turned on unless we were actively developing code or debugging a problem. I like Robert's suggestion because not only does it make running with bootverbose less painful, it can also provide counters and calculate and report rate measurements that might be more useful than just the printf.

If you feel strongly against it, I won't push it, but I do like the suggestion.

Scott