From: "Kenneth D. Merry"
To: John
Cc: FreeBSD iSCSI
Date: Fri, 14 Sep 2012 22:39:38 -0600
Subject: Re: How to force a reset of a device (disk) in an enclosre slot
Message-ID: <20120915043938.GA71754@nargothrond.kdm.org>

On Sat, Sep 15, 2012 at 04:09:07 +0000, John wrote:
> ----- Kenneth D. Merry's Original Message -----
> > On Sat, Sep 15, 2012 at 03:13:05 +0000, John wrote:
> > > ----- Kenneth D. Merry's Original Message -----
> > > > On Sat, Sep 15, 2012 at 02:24:37 +0000, John wrote:
> > > > > Hi Folks,
> > > > >
> > > > >    I've been poking around and can't seem to find a way to reset and
> > > > > hopefully acquire access to a disk device in an enclosure. For instance:
> > > > >
> > > > > FreeBSD 9.1-PRERELEASE
> > > > >
> > > > > # camcontrol smpphylist ses4
> > > > > 37 PHYs:
> > > > > PHY  Attached SAS Address
> > > > >   0  0x5000039368233602   (pass105,da98)
> > > > >   1  0x5000039368238e3e   (pass106,da99)
> > > > >   2  0x500003936823bca2   (pass107,da100)
> > > > >   3  0x500003936819507e   (pass108,da101)
> > > > >   4  0x5000039368197d5a   (pass109,da102)
> > > > >   5  0x5000039368197c6e   (pass110,da103)
> > > > >   6  0x500003936818770e   (pass111,da104)
> > > > >   7  0x5000039368238eba   (pass112,da105)
> > > > >   8  0x5000039368232f42   (pass113,da106)
> > > > >   9  0x0000000000000000
> > > > >  10  0x500003936813c31e
> > > > >  11  0x5000039368233892   (pass114,da107)
> > > > >  12  0x500003936813c2ca   (pass115,da108)
> > > > >  ...
> > > > >
> > > > > Note, bay/slot 10 has a listed device address. If I were to pull the
> > > > > drive and re-insert it, it would show up (as da390 in this case).
> > > > > The above is after a fresh reboot. Note da106 to da107 skipping
> > > > > slot 10 (slot 9 is empty).
> > > > >
> > > > > The smp utils provide a similar view:
> > > > >
> > > > > # smp_discover /dev/ses4
> > > > >   phy   0:D:attached:[5000039368233602:00 t(SSP)] 6 Gbps
> > > > >   phy   1:D:attached:[5000039368238e3e:00 t(SSP)] 6 Gbps
> > > > >   phy   2:D:attached:[500003936823bca2:00 t(SSP)] 6 Gbps
> > > > >   phy   3:D:attached:[500003936819507e:00 t(SSP)] 6 Gbps
> > > > >   phy   4:D:attached:[5000039368197d5a:00 t(SSP)] 6 Gbps
> > > > >   phy   5:D:attached:[5000039368197c6e:00 t(SSP)] 6 Gbps
> > > > >   phy   6:D:attached:[500003936818770e:00 t(SSP)] 6 Gbps
> > > > >   phy   7:D:attached:[5000039368238eba:00 t(SSP)] 6 Gbps
> > > > >   phy   8:D:attached:[5000039368232f42:00 t(SSP)] 6 Gbps
> > > > >   phy  10:D:attached:[500003936813c31e:00 t(SSP)] 6 Gbps
> > > > >   phy  11:D:attached:[5000039368233892:00 t(SSP)] 6 Gbps
> > > > >   phy  12:D:attached:[500003936813c2ca:00 t(SSP)] 6 Gbps
> > > > >  ...
> > > > >
> > > > > The address of slot 10 matches. There is a disk in the slot - just
> > > > > isn't recognized and attached.
> > > > >
> > > > > Back to the basic question. How can I issue a command to the enclosure
> > > > > to force a re-initialization of the device to recover it without
> > > > > having to physically pull & insert it. Even if the device numbers
> > > > > are not sequential, I need access to the drive...
> > > >
> > > > You can try sending a link reset:
> > > >
> > > >     camcontrol smppc ses4 -p 10 -o linkreset
> > > >
> > > > It may or may not work. You can also try disabling the PHY (-o disable)
> > > > and then sending a link reset to re-enable the link. You can also try a
> > > > hard reset (-o hardreset)
> > >
> > > Hi Ken,
> > >
> > >    Well, I hadn't tried to actually disable the device. That did bring some
> > > reaction:
> > >
> > > # camcontrol smppc ses4 -p 10 -o disable
> > > # camcontrol smpphylist ses4
> > > 37 PHYs:
> > > PHY  Attached SAS Address
> > >   0  0x5000039368233602   (pass105,da98)
> > >  ....
> > >   8  0x5000039368232f42   (pass113,da106)
> > >   9  0x0000000000000000
> > >  10  0x0000000000000000
> > >  11  0x5000039368233892   (pass114,da107)
> > >  ...
> > >
> > > The device is gone.
> > >
> > > # camcontrol smppc ses4 -p 10 -o hardreset
> > > root@vprzfs01p:/root # camcontrol smpphylist ses4
> > > 37 PHYs:
> > > PHY  Attached SAS Address
> > >   0  0x5000039368233602   (pass105,da98)
> > >  ....
> > >   8  0x5000039368232f42   (pass113,da106)
> > >   9  0x0000000000000000
> > >  10  0x500003936813c31e
> > >  11  0x5000039368233892   (pass114,da107)
> > >  ...
> > >
> > > The device is back, but not attached - This msg:
> > >
> > > kernel: mps1: mpssas_alloc_tm freezing simq
> > > kernel: mps1: mpssas_remove_complete on handle 0x0069, IOCStatus= 0x0
> > > kernel: mps1: mpssas_free_tm releasing simq
> > > kernel: _mapping_add_new_device: failed to add the device with handle 0x0069 to persistent table because there is no free space available - entry 0
> >
> > That message is harmless, it won't prevent the drive from attaching.
> >
> > > From a debug statement in the driver: MaxPersistentEntries == 128, but I
> > > have more than 128 devices per LSI card and they normally all show up -
> > > though I do get a bunch of the above messages in dmesg..
> >
> > You might try turning on some of the debugging in the mps(4) driver and
> > disabling and resetting the link again.
> >
> > Try:
> >
> >     sysctl -w dev.mps.0.debug_level=0xf
> >
> > You might get a lot of output, so be prepared to reset it back to 4:
> >
> >     sysctl -w dev.mps.0.debug_level=4
>
> Hi Ken,
>
>    I don't see anything obvious. Hopefully you're more familiar with the
> code and have better eyes than I do... Here's everything from messages
> after the -o disable. There are some "unknown/unhandled"s showing up.

Here is where the drive shows up:

> kernel: mps_intr_locked sc 0xffffff8001353000 writing postindex 243
> kernel: mps_enqueue_request SMID 653 cm 0xffffff80013ca4a8 ccb 0
> kernel: mps_intr_locked sc 0xffffff8001353000 starting with replypostindex 243
> kernel: mps_intr_locked sc 0xffffff8001353000 writing postindex 244
> kernel: SAS Address from SAS device page0 = 500003936811feae
> kernel: Found device <401,End Device> <6.0Gbps> <0x0078> <4/36>
> kernel: mpssas_rescan_target targetid 255
> kernel: mpssas_rescan
> kernel:
> kernel: Target id 0xff added

It finds the device, with target ID 255 (which is a little suspicious) and
queues a rescan, but nothing happens after that.

You might try doing a manual rescan of that device to see what happens:

    camcontrol rescan X:255:0

Where X is the scbus number from camcontrol devlist.

If that doesn't work, then we need to figure out what the maximum number of
targets supported by the adapter is. To do that, set this in
/boot/loader.conf and reboot:

    hw.mps.debug_level=1

That should result in the IOCFacts page getting printed on boot.

How many drives and other devices are currently attached to that
controller? What controller model is it, and do you have IT or IR firmware
on it?

Ken
-- 
Kenneth Merry
ken@FreeBSD.ORG
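
The manual rescan above needs the scbus number for the controller. A minimal
sketch of pulling it out of `camcontrol devlist -v` output follows. It
assumes the bus header lines have the form "scbus7 on mps1 bus 0:". The
helper name scbus_for_sim is hypothetical, and mps1 is simply the instance
named in the kernel messages quoted earlier, so adjust it to whichever
controller owns the enclosure.

```shell
#!/bin/sh
# Hypothetical helper: read `camcontrol devlist -v` output on stdin and
# print the scbus number of the bus attached to the given controller
# instance (e.g. mps1). Assumes header lines like "scbus7 on mps1 bus 0:".
scbus_for_sim() {
    awk -v sim="$1" '
        # Match only bus header lines: "scbusN on <sim> bus M:"
        $1 ~ /^scbus[0-9]+$/ && $2 == "on" && $3 == sim {
            sub(/^scbus/, "", $1)   # keep just the bus number
            print $1
            exit                    # first matching bus only
        }'
}

# Intended use on the affected host (left commented out in this sketch):
#   bus=$(camcontrol devlist -v | scbus_for_sim mps1)
#   camcontrol rescan "${bus}:255:0"
```

If the controller exposes more than one bus, this prints only the first
match, so check the devlist output by hand before rescanning.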