Date: Wed, 16 Dec 2009 12:10:59 -0500
Message-ID: <3c0b01820912160910i35e12112s4d6412d6cb174f3b@mail.gmail.com>
In-Reply-To: <978BBD51-222D-42F0-9D3A-FFACCBCC886D@samsco.org>
From: Alexander Sack
To: Scott Long
Cc: freebsd-scsi@freebsd.org,
    freebsd-current@freebsd.org
Subject: Re: aac(4) handling of probe when no devices are there
List-Id: Discussions about the use of FreeBSD-current

On Tue, Dec 15, 2009 at 4:54 AM, Scott Long wrote:
> On Dec 14, 2009, at 2:47 PM, Alexander Sack wrote:
>>
>> Hello Again:
>>
>> I guess I have a technical question/concern that I was looking for
>> feedback on.  During the probe sequence, aac(4) conditionally responds
>> to INQUIRY commands depending on target LUN:
>>
>> aac_cam.c/aac_cam_complete():
>>     if (command == INQUIRY) {
>>         if (ccb->ccb_h.status == CAM_REQ_CMP) {
>>             device = ccb->csio.data_ptr[0] & 0x1f;
>>             /*
>>              * We want DASD and PROC devices to only be
>>              * visible through the pass device.
>>              */
>>             if ((device == T_DIRECT) ||
>>                 (device == T_PROCESSOR) ||
>>                 (sc->flags & AAC_FLAGS_CAM_PASSONLY))
>>                 ccb->csio.data_ptr[0] =
>>                     ((device & 0xe0) | T_NODEVICE);
>>         } else if (ccb->ccb_h.status == CAM_SEL_TIMEOUT &&
>>             ccb->ccb_h.target_lun != 0) {
>>             /* fix for INQUIRYs on Lun>0 */
>>             ccb->ccb_h.status = CAM_DEV_NOT_THERE;
>>         }
>>     }
>>
>> Why is CAM_DEV_NOT_THERE skipped on LUN 0?
>
> In the parallel SCSI world, a selection timeout means that all LUNs within
> the entire target do not (or no longer) exist.  So returning
> CAM_SEL_TIMEOUT for LUN 1 would tell CAM to invalidate LUN 0 as well.
>
> If you look higher up in this function, you'll see a note about the
> error/status codes from the AAC firmware coincidentally matching CAM's
> status codes.  My guess is that somewhere along the line, someone at Adaptec
> stopped reading the SCSI spec and started returning CAM_SEL_TIMEOUT for
> LUNs greater than 0, which is why this work-around is now in the driver.

Interesting. Learn something every day.
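For anyone following along, here is a minimal sketch of that rule as I understand it, pulled out of the driver so it can be compiled on its own. The enum values and the names fixup_inquiry_status() and fixup_inquiry_selfcheck() are made up for illustration and are not the real CAM definitions; only the decision mirrors the quoted work-around.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for CAM status codes (not the real cam.h values). */
enum sketch_status {
    SK_REQ_CMP,         /* command completed normally */
    SK_SEL_TIMEOUT,     /* target did not answer selection */
    SK_DEV_NOT_THERE    /* only this one device (LUN) is absent */
};

/*
 * Adjust the status of a completed INQUIRY the way the quoted work-around
 * does: a selection timeout reported for a LUN other than 0 is downgraded
 * to "device not there", so the upper layer does not conclude that the
 * whole target, LUN 0 included, has gone away.
 */
static enum sketch_status
fixup_inquiry_status(enum sketch_status status, uint32_t lun)
{
    if (status == SK_SEL_TIMEOUT && lun != 0)
        return SK_DEV_NOT_THERE;
    return status;
}

/* Small self-check that doubles as a usage example. */
static void
fixup_inquiry_selfcheck(void)
{
    /* LUN 1 timing out must not look like the whole target vanished. */
    assert(fixup_inquiry_status(SK_SEL_TIMEOUT, 1) == SK_DEV_NOT_THERE);
    /* LUN 0 timing out is passed through unchanged. */
    assert(fixup_inquiry_status(SK_SEL_TIMEOUT, 0) == SK_SEL_TIMEOUT);
}
```

A timeout on LUN 0 is left alone on purpose, since in that case "the whole target is gone" is exactly the message CAM should get.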
I did not know that a selection timeout on a non-zero LUN meant no
other LUN was available. As a colleague noted, "Has Adaptec ever read
the SCSI spec?" Just kidding (somewhat)...

>> This is true on my target
>> 6.1-amd64 machine as well as CURRENT.  The reason why I ask this is
>> because now that aac(4) is scanned sequentially, there are a lot of CAM
>> interrupts that come in on my 6.x machine where the threshold is only
>> 500 and I get the interrupt storm threshold warning for swi2 pretty
>> quickly:
>>
>> Interrupt storm detected on "swi2:"; throttling interrupt source
>>
>> Obviously it's contingent on the number of adapters you have on your
>> system.  On CURRENT I didn't see this because the threshold is double
>> (I think it's 1000 by default).
>>
>> The issue is the number of xpt_async(AC_LOST_DEVICE, ..) calls during
>> the scan.  The probe sequence in CURRENT as well as 6.1 handles
>> CAM_SEL_TIMEOUT a little differently depending on context.

Yeah, I spoke too soon. I think that is a red herring, though, and a
misinterpretation of what that was really doing (in this case, just
seeing the device as unconfigured and moving on). But I STILL don't
understand why it's treated as an AC_LOST_DEVICE event at scan time.
It seems like more overhead than is really necessary: why create a
path and then call xpt_async(), all just to set the flag as
unconfigured? Perhaps I am not thinking of all the possibilities down
this code path, or perhaps it is more about staying aligned with the
model than anything else and I'm reading too much into it.

> It's not at all clear to me what is going on here.  Can you instrument the
> code to record the status of everything that is being issued to the aac_cam
> module?

Yes, surely. I think what might be happening is that after the INQUIRY
fails, xpt_release_ccb() is called, which I think will also check to
see if any more CCBs should be sent to the device and send them.
Basically, the boot -v output shows that I am getting a CAM_SEL_TIMEOUT
for each target and simply hitting the default interrupt storm
threshold of 500 on 6.1.

Let me investigate further... I'm on the right track, but I need to
instrument more. Scott, it's my first time playing with CAM (be
gentle). :D

-aps
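
P.S. A rough sketch of the kind of instrumentation requested above: a small ring buffer that records the target, LUN, opcode and completion status of each command, plus a helper to count how often a given status shows up. Everything here (TRACE_SLOTS, struct trace_log, trace_record(), trace_count_status()) is hypothetical and standalone; it is not existing aac(4) or CAM code, and a real version would be called from the completion path and dumped via printf or a sysctl.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TRACE_SLOTS 8           /* hypothetical ring size, must be > 0 */

struct trace_entry {
    uint32_t target;
    uint32_t lun;
    uint8_t  opcode;            /* SCSI opcode, e.g. 0x12 for INQUIRY */
    uint32_t status;            /* completion status seen by the driver */
};

struct trace_log {
    struct trace_entry slot[TRACE_SLOTS];
    size_t total;               /* number of records ever written */
};

/* Record one completion, overwriting the oldest entry once the ring is full. */
static void
trace_record(struct trace_log *log, uint32_t target, uint32_t lun,
    uint8_t opcode, uint32_t status)
{
    struct trace_entry *e;

    assert(log != NULL);
    e = &log->slot[log->total % TRACE_SLOTS];
    e->target = target;
    e->lun = lun;
    e->opcode = opcode;
    e->status = status;
    log->total++;
}

/* Count the entries still held in the ring whose status equals `status`. */
static size_t
trace_count_status(const struct trace_log *log, uint32_t status)
{
    size_t held, i, hits = 0;

    assert(log != NULL);
    held = log->total < TRACE_SLOTS ? log->total : TRACE_SLOTS;
    for (i = 0; i < held; i++) {
        if (log->slot[i].status == status)
            hits++;
    }
    return hits;
}
```

Zero-initialize a struct trace_log, call trace_record() once per completed command, and compare the per-status counts against the number of targets scanned to see whether every probe really ends in a selection timeout.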