From owner-freebsd-scsi@FreeBSD.ORG Mon Sep 20 11:07:04 2010
Return-Path:
Delivered-To: freebsd-scsi@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id EDD4E1065741 for ; Mon, 20 Sep 2010 11:07:03 +0000 (UTC) (envelope-from owner-bugmaster@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id DC1288FC2E for ; Mon, 20 Sep 2010 11:07:03 +0000 (UTC)
Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id o8KB73dO015067 for ; Mon, 20 Sep 2010 11:07:03 GMT (envelope-from owner-bugmaster@FreeBSD.org)
Received: (from gnats@localhost) by freefall.freebsd.org (8.14.4/8.14.4/Submit) id o8KB73Kk015065 for freebsd-scsi@FreeBSD.org; Mon, 20 Sep 2010 11:07:03 GMT (envelope-from owner-bugmaster@FreeBSD.org)
Date: Mon, 20 Sep 2010 11:07:03 GMT
Message-Id: <201009201107.o8KB73Kk015065@freefall.freebsd.org>
X-Authentication-Warning: freefall.freebsd.org: gnats set sender to owner-bugmaster@FreeBSD.org using -f
From: FreeBSD bugmaster
To: freebsd-scsi@FreeBSD.org
Cc:
Subject: Current problem reports assigned to freebsd-scsi@FreeBSD.org
X-BeenThere: freebsd-scsi@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: SCSI subsystem
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Mon, 20 Sep 2010 11:07:04 -0000

Note: to view an individual PR, use:
  http://www.freebsd.org/cgi/query-pr.cgi?pr=(number).

The following is a listing of current problems submitted by FreeBSD users.
These represent problem reports covering all versions including
experimental development code and obsolete releases.

S Tracker       Resp.      Description
--------------------------------------------------------------------------------
o kern/149502   scsi       [mpt] Latent buglet in debug print code
o kern/148083   scsi       [aac] Strange device reporting
o kern/147704   scsi       [mpt] sys/dev/mpt: new chip revision, partially unsupp
o kern/146287   scsi       [ciss] ciss(4) cannot see more than one SmartArray con
o kern/145768   scsi       [mpt] can't perform I/O on SAS based SAN disk in freeb
o kern/144648   scsi       [aac] Strange values of speed and bus width in dmesg
o kern/144301   scsi       [ciss] [hang] HP proliant server locks when using ciss
o kern/142351   scsi       [mpt] LSILogic driver performance problems
o kern/141934   scsi       [cam] [patch] add support for SEAGATE DAT Scopion 130
o kern/134488   scsi       [mpt] MPT SCSI driver probes max. 8 LUNs per device
o kern/132250   scsi       [ciss] ciss driver does not support more then 15 drive
o kern/132206   scsi       [mpt] system panics on boot when mirroring and 2nd dri
p kern/130735   scsi       [cam] [patch] pass M_NOWAIT to the malloc() call insid
o kern/130621   scsi       [mpt] tranfer rate is inscrutable slow when use lsi213
o kern/129602   scsi       [ahd] ahd(4) gets confused and wedges SCSI bus
o kern/128452   scsi       [sa] [panic] Accessing SCSI tape drive randomly crashe
o kern/128245   scsi       [scsi] "inquiry data fails comparison at DV1 step" [re
o kern/127927   scsi       [isp] isp(4) target driver crashes kernel when set up
o kern/124667   scsi       [amd] [panic] FreeBSD-7 kernel page faults at amd-scsi
o kern/123674   scsi       [ahc] ahc driver dumping
o sparc/121676  scsi       [iscsi] iscontrol do not connect iscsi-target on sparc
o kern/120487   scsi       [sg] scsi_sg incompatible with scanners
o kern/120247   scsi       [mpt] FreeBSD 6.3 and LSI Logic 1030 = only 3.300MB/s
o kern/119668   scsi       [cam] [patch] certain errors are too verbose comparing
o kern/114597   scsi       [sym] System hangs at SCSI bus reset with dual HBAs
o kern/110847   scsi       [ahd] Tyan U320 onboard problem with more than 3 disks
o kern/99954    scsi       [ahc] reading from DVD failes on 6.x [regression]
o kern/94838    scsi       Kernel panic while mounting SD card with lock switch o
o kern/92798    scsi       [ahc] SCSI problem with timeouts
o kern/90282    scsi       [sym] SCSI bus resets cause loss of ch device
o kern/76178    scsi       [ahd] Problem with ahd and large SCSI Raid system
o kern/74627    scsi       [ahc] [hang] Adaptec 2940U2W Can't boot 5.3
s kern/61165    scsi       [panic] kernel page fault after calling cam_send_ccb
o kern/60641    scsi       [sym] Sporadic SCSI bus resets with 53C810 under load
o kern/60598    scsi       wire down of scsi devices conflicts with config
s kern/57398    scsi       [mly] Current fails to install on mly(4) based RAID di
o kern/52638    scsi       [panic] SCSI U320 on SMP server won't run faster than
o kern/44587    scsi       dev/dpt/dpt.h is missing defines required for DPT_HAND
o kern/40895    scsi       wierd kernel / device driver bug
o kern/39388    scsi       ncr/sym drivers fail with 53c810 and more than 256MB m
o kern/35234    scsi       World access to /dev/pass? (for scanner) requires acce

41 problems total.

From owner-freebsd-scsi@FreeBSD.ORG Tue Sep 21 16:14:51 2010
Return-Path:
Delivered-To: scsi@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 479801065670 for ; Tue, 21 Sep 2010 16:14:51 +0000 (UTC) (envelope-from aboyer@averesystems.com)
Received: from zimbra.averesystems.com (75-149-8-243-Pennsylvania.hfc.comcastbusiness.net [75.149.8.243]) by mx1.freebsd.org (Postfix) with ESMTP id 9A8458FC15 for ; Tue, 21 Sep 2010 16:14:50 +0000 (UTC)
Received: from localhost (localhost [127.0.0.1]) by zimbra.averesystems.com (Postfix) with ESMTP id EEF638BC031; Tue, 21 Sep 2010 11:52:58 -0400 (EDT)
X-Virus-Scanned: amavisd-new at averesystems.com
Received: from zimbra.averesystems.com ([127.0.0.1]) by localhost (zimbra.averesystems.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id tD3OfAutRxGX; Tue, 21 Sep 2010 11:52:58 -0400 (EDT)
Received: from riven.arriad.com (fw.arriad.com [10.0.0.16]) by zimbra.averesystems.com (Postfix) with ESMTPSA id 0E1948BC030; Tue, 21 Sep 2010
11:52:58 -0400 (EDT)
Mime-Version: 1.0 (Apple Message framework v1081)
Content-Type: text/plain; charset=us-ascii
From: Andrew Boyer
In-Reply-To: <20100910150438.GA64519@nargothrond.kdm.org>
Date: Tue, 21 Sep 2010 11:57:30 -0400
Content-Transfer-Encoding: quoted-printable
Message-Id:
References: <20100910150438.GA64519@nargothrond.kdm.org>
To: Kenneth D. Merry
X-Mailer: Apple Mail (2.1081)
Cc: scsi@freebsd.org
Subject: Re: LSI 6Gb SAS driver committed
X-BeenThere: freebsd-scsi@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: SCSI subsystem
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Tue, 21 Sep 2010 16:14:51 -0000

Folks,

This driver advertises support for PCI IDs 0x74, 0x76, and 0x77, LSI SAS2108-based products judging by the description. Are these RAID-on-chip devices? How do they work with this driver? Has anyone tried it?

Does anybody know how those devices differ from the SAS2108-based MegaSAS Gen2 ID 0x79 supported by mfi?

I would really like to discard the hardware RAID functions of the MegaSAS and treat it like an HBA, but after adding the 0x79 ID to mps it wasn't able to initialize the card.

Thanks,
Andrew

On Sep 10, 2010, at 11:04 AM, Kenneth D. Merry wrote:

> Hey folks,
>
> I have committed the mps driver (LSI Logic 6Gb SAS controller driver) to the
> FreeBSD perforce server (//depot/projects/mps/...) and FreeBSD-current.
>
> The driver works with SAS and SATA drives, directly attached or attached
> through expanders. Basic error recovery works as well (i.e. timeouts and
> aborts).
>
> There are some known issues, including:
>
> - No support for integrated RAID (IR) arrays.
>
> - Devices tend to disappear and come back in one of my configurations.
> I also see some phantom devices, and events that don't make sense:
>
> mps0: Unhandled event 0x0
> mps0: Unhandled event 0x0
> mps0: Unhandled event 0x0
> mps0: Unhandled event 0x0
> mps0: Unhandled event 0x0
> mps0: Unhandled event 0x0
> (da2:mps0:0:6:0): SCSI command timeout on device handle 0x0017 SMID 90
> mps0: mpssas_abort_complete: abort request on handle 0x17 SMID 90 complete
> mps0: Unhandled event 0x0
> (probe2:mps0:0:2:0): AutoSense failed
> mps0: Unhandled event 0x0
> (da10:mps0:0:0:0): unsupportable block size 0
> (da10:mps0:0:0:0): lost device
> (da10:mps0:0:0:0): removing device entry
> (da2:mps0:0:6:0): lost device
> (da2:mps0:0:6:0): removing device entry
> da2 at mps0 bus 0 scbus0 target 6 lun 0
> da2: Fixed Direct Access SCSI-5 device
> da2: 150.000MB/s transfers
> da2: Command Queueing enabled
> da2: 152627MB (312581808 512 byte sectors: 255H 63S/T 19457C)
> mps0: Unhandled event 0x0
> mps0: Unhandled event 0x0
> mps0: Unhandled event 0x0
> mps0: Unhandled event 0x0
> mps0: Unhandled event 0x0
> mps0: Unhandled event 0x0
> mps0: Unhandled event 0x0
> mps0: Unhandled event 0x0
> mps0: Unhandled event 0x0
> mps0: Unhandled event 0x0
> mps0: Unhandled event 0x0
> mps0: Unhandled event 0x0
>
>
> - Sometimes you'll run into a device that fails part of the probe on boot,
> and you'll end up running into the run_interrupt_driven_config_hooks
> timeout. You see some aborts during probe, and then the 5 minute probe
> timeout kicks in and panics the kernel.
> For instance:
>
> (probe4:mps0:0:20:0): SCSI command timeout on device handle 0x0012 SMID 81
> mps0: mpssas_abort_complete: abort request on handle 0x12 SMID 81 complete
> run_interrupt_driven_hooks: still waiting after 60 seconds for xpt_config
> (probe4:mps0:0:20:0): SCSI command timeout on device handle 0x0012 SMID 214
> mps0: mpssas_abort_complete: abort request on handle 0x12 SMID 214 complete
> run_interrupt_driven_hooks: still waiting after 120 seconds for xpt_config
> run_interrupt_driven_hooks: still waiting after 180 seconds for xpt_config
> (probe4:mps0:0:20:0): SCSI command timeout on device handle 0x0012 SMID 281
> mps0: mpssas_abort_complete: abort request on handle 0x12 SMID 281 complete
> run_interrupt_driven_hooks: still waiting after 240 seconds for xpt_config
> (probe4:mps0:0:20:0): SCSI command timeout on device handle 0x0012 SMID 348
> mps0: mpssas_abort_complete: abort request on handle 0x12 SMID 348 complete
> run_interrupt_driven_hooks: still waiting after 300 seconds for xpt_config
> (probe4:mps0:0:20:0): SCSI command timeout on device handle 0x0012 SMID 415
> mps0: mpssas_abort_complete: abort request on handle 0x12 SMID 415 complete
> panic: run_interrupt_driven_config_hooks: waited too long
> cpuid = 0
> KDB: enter: panic
> [ thread pid 0 tid 100000 ]
> Stopped at kdb_enter+0x3d: movq $0,0x4c70b0(%rip)
> db>
>
> - ioctl support isn't complete, and there is no userland utility.
>
> - There is no man page.
>
> The driver is in the tree at this point to allow people to test it out,
> report any problems, and hopefully contribute bug fixes.
>
> LSI has some developers working on this driver, and we hope to get them to
> put some of their work-in-progress in the FreeBSD Perforce repo. So, in
> view of that, if you make any changes to the driver, please make them in
> the FreeBSD Perforce repository first (in //depot/projects/mps/...) and
> then merge them into FreeBSD-current.
>
> Thanks to Scott Long for writing the driver, and to Yahoo and Spectra Logic
> for sponsoring the work.
>
> Ken
> --
> Kenneth Merry
> ken@FreeBSD.ORG
> _______________________________________________
> freebsd-scsi@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-scsi
> To unsubscribe, send any mail to "freebsd-scsi-unsubscribe@freebsd.org"

--------------------------------------------------
Andrew Boyer aboyer@averesystems.com

From owner-freebsd-scsi@FreeBSD.ORG Tue Sep 21 17:44:54 2010
Return-Path:
Delivered-To: scsi@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 19CB01065695; Tue, 21 Sep 2010 17:44:54 +0000 (UTC) (envelope-from michael@fuckner.net)
Received: from dedihh.fuckner.net (dedihh.fuckner.net [81.209.183.161]) by mx1.freebsd.org (Postfix) with ESMTP id 964E48FC1B; Tue, 21 Sep 2010 17:44:53 +0000 (UTC)
Received: from dedihh.fuckner.net (localhost [127.0.0.1]) by dedihh.fuckner.net (Postfix) with ESMTP id D1C3216590; Tue, 21 Sep 2010 19:25:42 +0200 (CEST)
X-Virus-Scanned: amavisd-new at fuckner.net
Received: from dedihh.fuckner.net ([127.0.0.1]) by dedihh.fuckner.net (dedihh.fuckner.net [127.0.0.1]) (amavisd-new, port 10024) with SMTP id 8PERJYtX6qEK; Tue, 21 Sep 2010 19:25:36 +0200 (CEST)
Received: from dedihh.fuckner.net (localhost [127.0.0.1]) by dedihh.fuckner.net (Postfix) with ESMTP id 890E816588; Tue, 21 Sep 2010 19:25:35 +0200 (CEST)
Received: from 85.176.131.130 (SquirrelMail authenticated user molli123) by dedihh.fuckner.net with HTTP; Tue, 21 Sep 2010 19:25:35 +0200
Message-ID:
In-Reply-To:
References: <20100910150438.GA64519@nargothrond.kdm.org>
Date: Tue, 21 Sep 2010 19:25:35 +0200
From: "Michael Fuckner"
To: "Andrew Boyer"
User-Agent: SquirrelMail/1.4.21 [SVN]
MIME-Version: 1.0
Content-Type: text/plain;charset=iso-8859-1
X-Priority: 3 (Normal)
Importance: Normal
Content-Transfer-Encoding: quoted-printable
Cc: "Kenneth
D. Merry" , scsi@freebsd.org
Subject: Re: LSI 6Gb SAS driver committed
X-BeenThere: freebsd-scsi@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: SCSI subsystem
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Tue, 21 Sep 2010 17:44:54 -0000

> Folks,
> This driver advertises support for PCI IDs 0x74, 0x76, and 0x77, LSI
> SAS2108-based products judging by the description. Are these RAID-on-chip
> devices? How do they work with this driver? Has anyone tried it?
>
> Does anybody know how those devices differ from the SAS2108-based MegaSAS
> Gen2 ID 0x79 supported by mfi?
>
> I would really like to discard the hardware RAID functions of the MegaSAS
> and treat it like an HBA, but after adding the 0x79 ID to mps it wasn't
> able to initialize the card.

Hi all,

I hope to clarify a few things

LSI 2008: SAS2 (6GBit) HBA, can do Integrated Raid, but currently not supported, Driver: mps (Linux: mpt2sas)
LSI 2108: SAS2 (6GBit) Raidcontroller with Cache, optional battery... Driver mfi? (Linux: megaraid_sas)
LSI2208: Dual Core Raid Chip?!? I haven't seen this one until today.

The driver claims to support 2[0-2]08.

I hope I have time this week to plug some cards into a FreeBSD-Machine to let you know what exactly works.

Regards,
Michael!
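
For quick reference, the chip-to-driver summary in the message above can be written down as a small lookup table. This is only a sketch in Python: the names `CHIP_DRIVERS` and `freebsd_driver` are made up for illustration, the entries simply restate what the message says (they are not checked against the driver sources), and the `mfi` entry for the 2108 keeps the question mark from the text by being flagged as uncertain.

```python
# Chip -> driver summary as stated in the message above (hypothetical table,
# not verified against the mps/mfi sources). None means the message named
# no driver for that chip.
CHIP_DRIVERS = {
    "2008": {"freebsd": "mps", "linux": "mpt2sas", "certain": True},
    "2108": {"freebsd": "mfi", "linux": "megaraid_sas", "certain": False},
    "2208": {"freebsd": None, "linux": None, "certain": False},
}


def freebsd_driver(chip):
    """Return (driver, certain) for an LSI chip number such as "2108".

    Unknown chips yield (None, False).
    """
    entry = CHIP_DRIVERS.get(chip)
    if entry is None:
        return (None, False)
    return (entry["freebsd"], entry["certain"])


if __name__ == "__main__":
    for chip in ("2008", "2108", "2208"):
        print(chip, freebsd_driver(chip))
```

The `certain` flag is there so that a caller cannot mistake the guessed 2108 mapping for a confirmed one.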
From owner-freebsd-scsi@FreeBSD.ORG Tue Sep 21 19:49:31 2010
Return-Path:
Delivered-To: freebsd-scsi@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 46B4A1065672 for ; Tue, 21 Sep 2010 19:49:31 +0000 (UTC) (envelope-from lambert@lambertfam.org)
Received: from sysmon.tcworks.net (sysmon.tcworks.net [65.66.76.4]) by mx1.freebsd.org (Postfix) with ESMTP id EE4F98FC26 for ; Tue, 21 Sep 2010 19:49:30 +0000 (UTC)
Received: from sysmon.tcworks.net (localhost [127.0.0.1]) by sysmon.tcworks.net (8.13.1/8.13.1) with ESMTP id o8LJa3Qo071405 for ; Tue, 21 Sep 2010 14:36:03 -0500 (CDT) (envelope-from lambert@lambertfam.org)
Received: (from lambert@localhost) by sysmon.tcworks.net (8.13.1/8.13.1/Submit) id o8LJa37p071404 for freebsd-scsi@freebsd.org; Tue, 21 Sep 2010 14:36:03 -0500 (CDT) (envelope-from lambert@lambertfam.org)
X-Authentication-Warning: sysmon.tcworks.net: lambert set sender to lambert@lambertfam.org using -f
Date: Tue, 21 Sep 2010 14:36:03 -0500
From: Scott Lambert
To: freebsd-scsi@freebsd.org
Message-ID: <20100921193603.GA18674@sysmon.tcworks.net>
Mail-Followup-To: freebsd-scsi@freebsd.org
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.4.2.2i
Subject: Controller is no longer running
X-BeenThere: freebsd-scsi@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: SCSI subsystem
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Tue, 21 Sep 2010 19:49:31 -0000

I've had this problem occur about five times in the last year since we've been on 8.x. It happened with 7.x also, but it wasn't as critical a machine back then and I didn't care as much and hoped 8 would make it all better. The machine wasn't loaded as heavily and it probably happened three times in two years. The problem may happen two days, two hours, or 5 months apart.
I haven't been able to figure out a set of conditions which apply every time it happens. It does tend to happen while the backups are running, amanda dump or tar. I think that just provides the critical disk I/O load level to make the problem more likely.

I swear I took a picture of the error messages on the console the time before this when it happened, but can't find them now. This morning I had remote hands power cycle it while I was en route to the office. The message on-screen was, or was very close to, "The controller is no longer running". I remember messages about timing out commands to the raid controller after something like 15 seconds from the last time.

The firmware on the controller is from 2006 and is the latest I found to be available.

Is this a known problem with the Adaptec 2120S type RAID cards? Or do I just have bad hardware?

The array is always intact after a power cycle. But fsck has to fix many things. It is now a cyrus-imapd mail server.

FreeBSD 8.1-STABLE #0: Thu Aug 19 19:41:51 CDT 2010
    root@cyrus.example.com:/usr/obj/usr/src/sys/GENERIC i386
CPU: Intel(R) Xeon(TM) CPU 2.80GHz (2793.02-MHz 686-class CPU)
  Origin = "GenuineIntel"  Id = 0xf48  Family = f  Model = 4  Stepping = 8
  Features=0xbfebfbff
  Features2=0x649d
  AMD Features=0x20100000
  AMD Features2=0x1
  TSC: P-state invariant
real memory  = 2147483648 (2048 MB)
Physical memory chunk(s):
0x0000000000001000 - 0x000000000009dfff, 643072 bytes (157 pages)
0x0000000000100000 - 0x00000000003fffff, 3145728 bytes (768 pages)
0x0000000001026000 - 0x000000007db8afff, 2092322816 bytes (510821 pages)
avail memory = 2090995712 (1994 MB)
aac0: mem 0xf8000000-0xfbffffff irq 50 at device 9.0 on pci3
aac0: Reserved 0x4000000 bytes for rid 0x10 type 3 at 0xf8000000
aac0: Enable Raw I/O
aac0: New comm. interface enabled
ioapic2: routing intpin 2 (PCI IRQ 50) to lapic 0 vector 51
aac0: [MPSAFE]
aac0: [ITHREAD]
aac0: i960 80303 100MHz, 64MB memory (48MB cache, 16MB execution), optional battery present
aac0: Kernel 4.2-0, Build 8205, S/N 503926
aac0: Supported Options=31d7e
aac0: Adaptec 2120S, aac driver 2.1.9-1
aacp0: on aac0
aacd0: on aac0
aacd0: 279962MB (573362176 sectors)
GEOM: new disk aacd0
(probe0:aacp0:0:0:0): Data overrun
(probe0:aacp0:0:0:0): Retrying command
(probe0:aacp0:0:0:0): Data overrun
(probe0:aacp0:0:0:0): Retrying command
(probe0:aacp0:0:0:0): Data overrun
(probe0:aacp0:0:0:0): Retrying command
(probe0:aacp0:0:0:0): Data overrun
(probe0:aacp0:0:0:0): Retrying command
(probe0:aacp0:0:0:0): Data overrun
(probe0:aacp0:0:0:0): Error 5, Retries exhausted
(probe0:aacp0:0:2:0): Data overrun
(probe0:aacp0:0:2:0): Retrying command
(probe0:aacp0:0:2:0): Data overrun
(probe0:aacp0:0:2:0): Retrying command
(probe0:aacp0:0:2:0): Data overrun
(probe0:aacp0:0:2:0): Retrying command
(probe0:aacp0:0:2:0): Data overrun
(probe0:aacp0:0:2:0): Retrying command
(probe0:aacp0:0:2:0): Data overrun
(probe0:aacp0:0:2:0): Error 5, Retries exhausted
(probe0:aacp0:0:3:0): Data overrun
(probe0:aacp0:0:3:0): Retrying command
(probe0:aacp0:0:3:0): Data overrun
(probe0:aacp0:0:3:0): Retrying command
(probe0:aacp0:0:3:0): Data overrun
(probe0:aacp0:0:3:0): Retrying command
(probe0:aacp0:0:3:0): Data overrun
(probe0:aacp0:0:3:0): Retrying command
(probe0:aacp0:0:3:0): Data overrun
(probe0:aacp0:0:3:0): Error 5, Retries exhausted
(probe0:aacp0:0:4:0): Data overrun
(probe0:aacp0:0:4:0): Retrying command
(probe0:aacp0:0:4:0): Data overrun
(probe0:aacp0:0:4:0): Retrying command
(probe0:aacp0:0:4:0): Data overrun
(probe0:aacp0:0:4:0): Retrying command
(probe0:aacp0:0:4:0): Data overrun
(probe0:aacp0:0:4:0): Retrying command
(probe0:aacp0:0:4:0): Data overrun
(probe0:aacp0:0:4:0): Error 5, Retries exhausted
(probe0:aacp0:0:6:0): Data overrun
(probe0:aacp0:0:6:0): Retrying command
(probe0:aacp0:0:6:0): Data overrun
(probe0:aacp0:0:6:0): Retrying command
(probe0:aacp0:0:6:0): Data overrun
(probe0:aacp0:0:6:0): Retrying command
(probe0:aacp0:0:6:0): Data overrun
(probe0:aacp0:0:6:0): Retrying command
(probe0:aacp0:0:6:0): Data overrun
(probe0:aacp0:0:6:0): Error 5, Retries exhausted
pass0 at aacp0 bus 0 scbus0 target 0 lun 0
pass0: Fixed Uninstalled SCSI-3 device
pass0: 3.300MB/s transfers
pass1 at aacp0 bus 0 scbus0 target 2 lun 0
pass1: Fixed Uninstalled SCSI-3 device
pass1: 3.300MB/s transfers
pass2 at aacp0 bus 0 scbus0 target 3 lun 0
pass2: Fixed Uninstalled SCSI-3 device
pass2: 3.300MB/s transfers
pass3 at aacp0 bus 0 scbus0 target 4 lun 0
pass3: Fixed Uninstalled SCSI-3 device
pass3: 3.300MB/s transfers
pass4 at aacp0 bus 0 scbus0 target 6 lun 0
pass4: Fixed Uninstalled SCSI-2 device
pass4: 3.300MB/s transfers
ses0 at aacp0 bus 0 scbus0 target 6 lun 0
ses0: Fixed Uninstalled SCSI-2 device
ses0: 3.300MB/s transfers
ses0: SAF-TE Compliant Device
pass0 at aacp0 bus 0 scbus0 target 0 lun 0
pass0: Fixed Uninstalled SCSI-3 device
pass0: 3.300MB/s transfers
pass1 at aacp0 bus 0 scbus0 target 2 lun 0
pass1: Fixed Uninstalled SCSI-3 device
pass1: 3.300MB/s transfers
pass2 at aacp0 bus 0 scbus0 target 3 lun 0
pass2: Fixed Uninstalled SCSI-3 device
pass2: 3.300MB/s transfers
pass3 at aacp0 bus 0 scbus0 target 4 lun 0
pass3: Fixed Uninstalled SCSI-3 device
pass3: 3.300MB/s transfers
Trying to mount root from ufs:/dev/aacd0s1a
WARNING: / was not properly dismounted

--
Scott Lambert KC5MLE Unix SysAdmin
lambert@lambertfam.org

From owner-freebsd-scsi@FreeBSD.ORG Wed Sep 22 00:33:54 2010
Return-Path:
Delivered-To: freebsd-scsi@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id A186F10656A5 for ; Wed, 22 Sep 2010 00:33:54 +0000 (UTC) (envelope-from emaste@freebsd.org)
Received: from mail1.sandvine.com (Mail1.sandvine.com [64.7.137.134]) by mx1.freebsd.org (Postfix) with ESMTP
id 63F4E8FC13 for ; Wed, 22 Sep 2010 00:33:54 +0000 (UTC)
Received: from labgw2.phaedrus.sandvine.com (192.168.222.22) by WTL-EXCH-1.sandvine.com (192.168.196.31) with Microsoft SMTP Server id 14.0.694.0; Tue, 21 Sep 2010 20:23:07 -0400
Received: by labgw2.phaedrus.sandvine.com (Postfix, from userid 10332) id 7169D33C00; Tue, 21 Sep 2010 20:23:06 -0400 (EDT)
Date: Tue, 21 Sep 2010 20:23:06 -0400
From: Ed Maste
To:
Message-ID: <20100922002306.GA24983@sandvine.com>
References: <20100921193603.GA18674@sysmon.tcworks.net>
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Disposition: inline
In-Reply-To: <20100921193603.GA18674@sysmon.tcworks.net>
User-Agent: Mutt/1.4.2.1i
Subject: Re: Controller is no longer running
X-BeenThere: freebsd-scsi@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: SCSI subsystem
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Wed, 22 Sep 2010 00:33:54 -0000

On Tue, Sep 21, 2010 at 02:36:03PM -0500, Scott Lambert wrote:

> [Adaptec 2120 controller problems snipped]
>
> aac0: Kernel 4.2-0, Build 8205, S/N 503926

Unfortunately I think you're out of luck unless you can get Adaptec to
release a new firmware for this card; this version has a lot of bugs.
The "controller is no longer running" message means the firmware on the
card has crashed.

We had a ton of problems with 8205 firmware (on a 2130 controller, but
I presume it's built from the same source tree). Firmware 15611 has
been stable for us with the 2130.
-Ed

From owner-freebsd-scsi@FreeBSD.ORG Wed Sep 22 05:43:12 2010
Return-Path:
Delivered-To: freebsd-scsi@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 7968F1065670 for ; Wed, 22 Sep 2010 05:43:12 +0000 (UTC) (envelope-from lambert@lambertfam.org)
Received: from sysmon.tcworks.net (sysmon.tcworks.net [65.66.76.4]) by mx1.freebsd.org (Postfix) with ESMTP id 440778FC0A for ; Wed, 22 Sep 2010 05:43:11 +0000 (UTC)
Received: from sysmon.tcworks.net (localhost [127.0.0.1]) by sysmon.tcworks.net (8.13.1/8.13.1) with ESMTP id o8M5hB1U010092 for ; Wed, 22 Sep 2010 00:43:11 -0500 (CDT) (envelope-from lambert@lambertfam.org)
Received: (from lambert@localhost) by sysmon.tcworks.net (8.13.1/8.13.1/Submit) id o8M5hBSe010091 for freebsd-scsi@freebsd.org; Wed, 22 Sep 2010 00:43:11 -0500 (CDT) (envelope-from lambert@lambertfam.org)
X-Authentication-Warning: sysmon.tcworks.net: lambert set sender to lambert@lambertfam.org using -f
Date: Wed, 22 Sep 2010 00:43:11 -0500
From: Scott Lambert
To: freebsd-scsi@freebsd.org
Message-ID: <20100922054311.GB18674@sysmon.tcworks.net>
Mail-Followup-To: freebsd-scsi@freebsd.org
References: <20100921193603.GA18674@sysmon.tcworks.net> <20100922002306.GA24983@sandvine.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20100922002306.GA24983@sandvine.com>
User-Agent: Mutt/1.4.2.2i
Subject: Re: Controller is no longer running
X-BeenThere: freebsd-scsi@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: SCSI subsystem
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Wed, 22 Sep 2010 05:43:12 -0000

On Tue, Sep 21, 2010 at 08:23:06PM -0400, Ed Maste wrote:
> On Tue, Sep 21, 2010 at 02:36:03PM -0500, Scott Lambert wrote:
>
> > [Adaptec 2120 controller problems snipped]
> >
> > aac0: Kernel 4.2-0, Build 8205, S/N 503926
>
> Unfortunately I think you're out of luck unless you can get Adaptec to
> release a new firmware for this card; this version has a lot of bugs.
> The "controller is no longer running" message means the firmware on the
> card has crashed.
>
> We had a ton of problems with 8205 firmware (on a 2130 controller, but
> I presume it's built from the same source tree). Firmware 15611 has
> been stable for us with the 2130.

That's what I was afraid of. I suppose we'll build a new box and transfer the load to it. Then I'll rip the adaptec card out and jump up and down on it for a while. After that, we'll take the motherboard's SCSI controller for a spin with ZFS, or gmirror, and use the server as a hot spare.

Thanks for the reply.

--
Scott Lambert KC5MLE Unix SysAdmin
lambert@lambertfam.org

From owner-freebsd-scsi@FreeBSD.ORG Thu Sep 23 10:42:11 2010
Return-Path:
Delivered-To: freebsd-scsi@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 05096106566B for ; Thu, 23 Sep 2010 10:42:11 +0000 (UTC) (envelope-from pluknet@gmail.com)
Received: from mail-qy0-f175.google.com (mail-qy0-f175.google.com [209.85.216.175]) by mx1.freebsd.org (Postfix) with ESMTP id B4CC38FC0A for ; Thu, 23 Sep 2010 10:42:10 +0000 (UTC)
Received: by qyk31 with SMTP id 31so7353833qyk.13 for ; Thu, 23 Sep 2010 03:42:09 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:received:date:message-id :subject:from:to:content-type; bh=IkUwa/nuiWLgr8nfSkcltwE3OVXo/9tgynfp1kR90Ug=; b=thC/7bO8qfPhhoBszVo3ii213U+YFRMdhf0mdu5OOW/fcWXouwXWyJZ0UVmkU6IDdk EbZx9XASJnsKkC2cMDG6eVPB+jRW6t0LQYmKOaI9JOPsSfS1I8OP4TkUbksxnHoq1H+O 5N8bhr74SEC8wJPPUvGZyy3MRUncUVB47rPiQ=
DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:date:message-id:subject:from:to:content-type; b=XDZs2O+pJOkZINgdncwVv843S+LgsiEmnGtW1Js4H7ze0X/IP5WBr5pWEpL2ZjJEp/
zUDqCJPG8EHyZPal4SycJ6q3ueuwl+uaNx1r2SC47CSJJhXtbRJiIF9bFrgcIkLq+8tH qjVYkA9drknAOAorZ39cAYTwdxyPj7OJrmbBc=
MIME-Version: 1.0
Received: by 10.229.235.6 with SMTP id ke6mr1170458qcb.101.1285237102701; Thu, 23 Sep 2010 03:18:22 -0700 (PDT)
Received: by 10.229.50.8 with HTTP; Thu, 23 Sep 2010 03:18:22 -0700 (PDT)
Date: Thu, 23 Sep 2010 14:18:22 +0400
Message-ID:
From: pluknet
To: freebsd-scsi
Content-Type: text/plain; charset=ISO-8859-1
Subject: mptutil shows two volumes as with index 0
X-BeenThere: freebsd-scsi@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: SCSI subsystem
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Thu, 23 Sep 2010 10:42:11 -0000

Hi.

Seen on xSeries x3550 M2 with IBM SR-BR10i (C1068E).
This is FreeBSD 8.1 amd64.
HW config and further info below.

# dmesg | grep mpt
mpt0: port 0x1000-0x10ff mem 0x97910000-0x97913fff,0x97900000-0x9790ffff irq 16 at device 0.0 on pci1
mpt0: [ITHREAD]
mpt0: MPI Version=1.5.20.0
mpt0: Capabilities: ( RAID-0 RAID-1E RAID-1 )
mpt0: 2 Active Volumes (2 Max)
mpt0: 4 Hidden Drive Members (14 Max)
mpt0:vol0(mpt0:0:0): Settings ( Member-WCE Hot-Plug-Spares )
mpt0:vol0(mpt0:0:0): Using Spare Pool: 0
mpt0:vol0(mpt0:0:0): 2 Members:
(mpt0:1:5:0): Primary Online
(mpt0:1:6:0): Secondary Online
mpt0:vol0(mpt0:0:0): RAID-1 - Optimal
mpt0:vol0(mpt0:0:0): Status ( Enabled )
(mpt0:vol0:1): Physical (mpt0:0:6:0), Pass-thru (mpt0:1:0:0)
(mpt0:vol0:1): Online
(mpt0:vol0:0): Physical (mpt0:0:5:0), Pass-thru (mpt0:1:1:0)
(mpt0:vol0:0): Online
(mpt0:0:7): Physical (mpt0:0:7:0), Pass-thru (mpt0:1:2:0)
(mpt0:0:7): Online
(mpt0:0:4): Physical (mpt0:0:4:0), Pass-thru (mpt0:1:3:0)
(mpt0:0:4): Online
(xpt0:mpt0:1:-1:-1): rescan already queued
da0 at mpt0 bus 0 scbus0 target 0 lun 0
da1 at mpt0 bus 0 scbus0 target 2 lun 0

mpt0@pci0:1:0:0: class=0x010000 card=0x03941014 chip=0x00581000 rev=0x08 hdr=0x00
    vendor     = 'LSI Logic (Was: Symbios Logic, NCR)'
    device     = 'SAS 3000 series, 8-port with 1068E -StorPort'
    class      = mass storage
    subclass   = SCSI
    cap 01[50] = powerspec 2 supports D0 D1 D2 D3 current D0
    cap 10[68] = PCI-Express 1 endpoint max data 128(4096) link x4(x8)
    cap 05[98] = MSI supports 1 message, 64 bit
    cap 11[b0] = MSI-X supports 1 message in map 0x14

# mptutil show config
mpt0 Configuration: 2 volumes, 4 drives
    volume 0 (136G) RAID-1 OPTIMAL spans:
        drive 1 (137G) ONLINE SAS
        drive 0 (137G) ONLINE SAS
    spare pools: 0
    volume 0 (29G) RAID-1 OPTIMAL spans:
        drive 2 (30G) ONLINE SATA
        drive 3 (30G) ONLINE SATA
    spare pools: 0

# mptutil show volumes
mpt0 Volumes:
  Id     Size    Level   Stripe  State    Write-Cache  Name
   0 (  136G) RAID-1          OPTIMAL  Enabled
   0 (   29G) RAID-1          OPTIMAL  Disabled

# mptutil show drives
mpt0 Physical Drives:
 0 (  137G) ONLINE SAS bus 0 id 6
 1 (  137G) ONLINE SAS bus 0 id 5
 2 (   30G) ONLINE SATA bus 0 id 7
 3 (   30G) ONLINE SATA bus 0 id 4

# camcontrol devlist
at scbus0 target 0 lun 0 (da0,pass0)
at scbus0 target 2 lun 0 (da1,pass1)

--
wbr,
pluknet

From owner-freebsd-scsi@FreeBSD.ORG Thu Sep 23 12:55:07 2010
Return-Path:
Delivered-To: freebsd-scsi@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id EBAF11065695 for ; Thu, 23 Sep 2010 12:55:07 +0000 (UTC) (envelope-from niklas@saers.com)
Received: from mail-ew0-f54.google.com (mail-ew0-f54.google.com [209.85.215.54]) by mx1.freebsd.org (Postfix) with ESMTP id 73D718FC1C for ; Thu, 23 Sep 2010 12:55:07 +0000 (UTC)
Received: by ewy22 with SMTP id 22so509758ewy.13 for ; Thu, 23 Sep 2010 05:55:06 -0700 (PDT)
Received: by 10.213.22.200 with SMTP id o8mr1859451ebb.62.1285246506362; Thu, 23 Sep 2010 05:55:06 -0700 (PDT)
Received: from [10.32.100.120] (webmail.danskscanning.dk [89.184.151.254]) by mx.google.com with ESMTPS id u9sm1167619eeh.17.2010.09.23.05.55.04 (version=TLSv1/SSLv3 cipher=RC4-MD5); Thu, 23 Sep 2010 05:55:04 -0700 (PDT)
From: Niklas Saers
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding:
quoted-printable
Date: Thu, 23 Sep 2010 14:55:02 +0200
Message-Id: <2EA9CBBC-3F97-4AF2-BFB5-96DF39FDE376@saers.com>
To: freebsd-scsi@freebsd.org
Mime-Version: 1.0 (Apple Message framework v1081)
X-Mailer: Apple Mail (2.1081)
Subject: mfi - setting up disks
X-BeenThere: freebsd-scsi@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: SCSI subsystem
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Thu, 23 Sep 2010 12:55:08 -0000

Hi guys,

In the SuperMicro system where I had problems with the mpt controller, I switched it for a mfi-based controller. I had it set up with 36x RAID0 volumes with each their own disk (no way to access the disk otherwise I found), and added them to a ZFS system. The numbering became a bit weird, so I pulled the disks out one by one and put them back to figure out and note down what disk number was in what slot. Only test data on my ZFS volume, so I didn't mind that crashing.

Now that all disks have been taken out and put back in one by one, I do:

# mfiutil show volumes
mfi0 Volumes:
  Id     Size    Level   Stripe  State   Cache   Name

Whoops, no volumes? That can't be good.
I check up on the disks, they're all there:

mfiutil show drives
mfi0 Physical Drives:
( 1863G) UNCONFIGURED GOOD SATA enclosure 1, slot 2
( 1863G) UNCONFIGURED GOOD SATA enclosure 1, slot 3
( 1863G) UNCONFIGURED GOOD SATA enclosure 1, slot 4
( 1863G) UNCONFIGURED GOOD SATA enclosure 1, slot 14
( 1863G) UNCONFIGURED GOOD SATA enclosure 1, slot 15
( 1863G) UNCONFIGURED GOOD SATA enclosure 1, slot 17
( 1863G) UNCONFIGURED GOOD SATA enclosure 1, slot 23
( 1863G) UNCONFIGURED GOOD SATA enclosure 2, slot 1
( 1863G) UNCONFIGURED GOOD SATA enclosure 2, slot 4
( 1863G) UNCONFIGURED GOOD SATA enclosure 2, slot 5
( 1863G) UNCONFIGURED GOOD SATA enclosure 2, slot 6
( 1863G) UNCONFIGURED GOOD SATA enclosure 2, slot 7
( 1863G) UNCONFIGURED GOOD SATA enclosure 2, slot 8
( 1863G) UNCONFIGURED GOOD SATA enclosure 2, slot 10
( 1863G) UNCONFIGURED GOOD SATA enclosure 2, slot 11
( 1863G) UNCONFIGURED GOOD SATA enclosure 1, slot 0
( 1863G) UNCONFIGURED GOOD SATA enclosure 1, slot 5
( 1863G) UNCONFIGURED GOOD SATA enclosure 1, slot 6
( 1863G) UNCONFIGURED GOOD SATA enclosure 1, slot 7
( 1863G) UNCONFIGURED GOOD SATA enclosure 1, slot 8
( 1863G) UNCONFIGURED GOOD SATA enclosure 1, slot 9
( 1863G) UNCONFIGURED GOOD SATA enclosure 1, slot 10
( 1863G) UNCONFIGURED GOOD SATA enclosure 1, slot 11
( 1863G) UNCONFIGURED GOOD SATA enclosure 1, slot 12
( 1863G) UNCONFIGURED GOOD SATA enclosure 1, slot 13
( 1863G) UNCONFIGURED GOOD SATA enclosure 1, slot 16
( 1863G) UNCONFIGURED GOOD SATA enclosure 1, slot 19
( 1863G) UNCONFIGURED GOOD SATA enclosure 1, slot 21
( 1863G) UNCONFIGURED GOOD SATA enclosure 1, slot 22
( 1863G) UNCONFIGURED GOOD SATA enclosure 2, slot 2
( 1863G) UNCONFIGURED GOOD SATA enclosure 2, slot 3
( 1863G) UNCONFIGURED GOOD SATA enclosure 2, slot 9
( 1863G) UNCONFIGURED GOOD SATA enclosure 1, slot 1
( 1863G) UNCONFIGURED GOOD SATA enclosure 1, slot 20
( 1863G) UNCONFIGURED GOOD SATA enclosure 2, slot 0
( 1863G) UNCONFIGURED GOOD SATA enclosure 1, slot 18

Well, that's an excellent way of adding the volumes in the sequence they
appear physically, so I start with the first one from top left:

# mfiutil create raid0 -v E01:S05
Adding drive 26 to array 0
Adding array 0 to volume 0
mfiutil: Command failed: Wrong firmware or drive state
mfiutil: Failed to add volume: Input/output error

Firmware error? Invalid drive state?

# mfiutil good E01:S05
mfiutil: Drive 26 is already in the desired state

Seems to be good... do I have a firmware issue? Check dmesg:

mfi0: port 0xc000-0xc0ff mem 0xfad7c000-0xfad7ffff,0xfadc0000-0xfadfffff irq 16 at device 0.0 on pci5
mfi0: Megaraid SAS driver Ver 3.00
mfi0: 1966 (338565595s/0x0020/info) - Shutdown command received from host
mfi0: 1967 (boot + 3s/0x0020/info) - Firmware initialization started (PCI ID 0079/1000/9261/1000)
mfi0: 1968 (boot + 3s/0x0020/info) - Firmware version 2.0.03-0673
mfi0: 1969 (boot + 4s/0x0020/info) - Board Revision
mfi0: 1970 (boot + 24s/0x0004/info) - Enclosure (SES) discovered on PD 08(c Port 0 - 3/p1)
mfi0: 1971 (boot + 24s/0x0004/info) - Enclosure (SES) discovered on PD 09(c Port 0 - 3/p2)
mfi0: 1972 (boot + 24s/0x0004/info) - Enclosure PD 08(c Port 0 - 3/p1) communication restored
mfi0: 1973 (boot + 24s/0x0004/info) - Enclosure PD 08(c Port 0 - 3/p1) fan 1 speed changed
mfi0: 1974 (boot + 24s/0x0004/info) - Enclosure PD 08(c Port 0 - 3/p1) fan 2 speed changed
mfi0: 1975 (boot + 24s/0x0004/info) - Enclosure PD 08(c Port 0 - 3/p1) fan 3 speed changed
mfi0: 1976 (boot + 24s/0x0004/info) - Enclosure PD 09(c Port 0 - 3/p2) communication restored
mfi0: 1977 (boot + 24s/0x0004/info) - Enclosure PD 09(c Port 0 - 3/p2) fan 1 speed changed
mfi0: 1978 (boot + 24s/0x0004/info) - Enclosure PD 09(c Port 0 - 3/p2) fan 2 speed changed
mfi0: 1979 (boot + 24s/0x0004/info) - Enclosure PD 09(c Port 0 - 3/p2) fan 3 speed changed
mfi0: 1980 (boot + 24s/0x0002/info) - Inserted: Encl PD 08
mfi0: 1981 (boot + 24s/0x0002/info) - Inserted: PD 08(c Port 0 - 3/p1) Info: enclPd=08, scsiType=d, portMap=00, sasAddr=50030480008fb0fd,0000000000000000
mfi0: 1982 (boot + 24s/0x0002/info) - Inserted: Encl PD 09
mfi0: 1983 (boot + 24s/0x0002/info) - Inserted: PD 09(c Port 0 - 3/p2) Info: enclPd=09, scsiType=d, portMap=00, sasAddr=50030480008e7b7d,0000000000000000
mfi0: 1984 (boot + 24s/0x0002/info) - Inserted: PD 0a(e0x08/s2)

and then lots of disks.... looks fine, right?

What am I missing? I badly want mfid0-mfid35 back so that I can recreate
my ZFS and get to work :-)

On a side note, the ZFS I'll make is this, any comments on the
configuration?

zpool create tank \
    raidz2 /dev/mfid0 /dev/mfid1 /dev/mfid2 /dev/mfid3 /dev/mfid4 /dev/mfid5 /dev/mfid30 \
    raidz2 /dev/mfid6 /dev/mfid7 /dev/mfid8 /dev/mfid9 /dev/mfid10 /dev/mfid11 /dev/mfid31 \
    raidz2 /dev/mfid12 /dev/mfid13 /dev/mfid14 /dev/mfid15 /dev/mfid16 /dev/mfid17 /dev/mfid32 \
    raidz2 /dev/mfid18 /dev/mfid19 /dev/mfid20 /dev/mfid21 /dev/mfid22 /dev/mfid23 /dev/mfid33 \
    raidz2 /dev/mfid24 /dev/mfid25 /dev/mfid26 /dev/mfid27 /dev/mfid28 /dev/mfid29 /dev/mfid34 \
    spare /dev/mfid35

Cheers

Nik

From owner-freebsd-scsi@FreeBSD.ORG Thu Sep 23 13:56:04 2010
Return-Path:
Delivered-To: freebsd-scsi@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
        by hub.freebsd.org (Postfix) with ESMTP id 7D2FA106564A for ;
        Thu, 23 Sep 2010 13:56:04 +0000 (UTC)
        (envelope-from darksoul@darkbsd.org)
Received: from zelretch.yomi.darkbsd.org (fnttkyo035000.tkyo.fnt.ftth.ppp.infoweb.ne.jp [58.1.242.192])
        by mx1.freebsd.org (Postfix) with ESMTP id 8ED5F8FC25 for ;
        Thu, 23 Sep 2010 13:56:03 +0000 (UTC)
Received: from zelretch.yomi.darkbsd.org (localhost [127.0.0.1])
        by zelretch.yomi.darkbsd.org (Postfix) with ESMTP id 0DBE22840C7;
        Thu, 23 Sep 2010 22:36:11 +0900 (JST)
DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=darkbsd.org; h=message-id
:date:from:mime-version:to:cc:subject:references:in-reply-to :content-type:content-transfer-encoding; s=selector1; bh=andHIrl bifbbrThP7xuPamKtvEE=; b=UQaXm9x5bImvis3hx/2Gm6OrilaQp6eY/+DAZgo 79Wp9pIhKcvUeL+NIycrKsJD8orOWaPaiDW/XJM/KGCDBKENCOm7GGZVXkDOkUQ2 F6CXZkCcvM0L3BG/cwy5MMBC41Z/Ve9LFmMcJR+T7jGxBN6MjZ4oj0aVBnEbqZfs flYA= DomainKey-Signature: a=rsa-sha1; c=nofws; d=darkbsd.org; h=message-id :date:from:mime-version:to:cc:subject:references:in-reply-to :content-type:content-transfer-encoding; q=dns; s=selector1; b=Y dn/OhszlOk3fYmGRCtCCDpEBapq1P5jzF32xwtZJbUDA7UJXP+UIaa3+Y8rbJowp s/wkNuyvJWczZLSDHwTh+n2GX3empEq8x+EdSzLUK1njUBipvSDmUquYDXuA8Vq4 hGDDo0aAYn4AuXbsLCThv52az3Y0MQcfJ40UFDL688= DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=darkbsd.org; h= content-transfer-encoding:content-type:content-type:in-reply-to :references:subject:subject:mime-version:user-agent:from:from :date:date:message-id:received:received; s=selector1; t= 1285248960; bh=xENIcoOsSaxrFNUCjH05G9v/Oe0MmuuQ8IrO+8IN2nw=; b=C aHcTxQGp32U2GLNdX1nQBEwQk32OdYMro4sHOkGtRs6BsZa9KzEHRQmzFsgHEdbO pPwvyKH1XmPIbwbdc0QqBnZQd070y0Ir/V9F6WH7mCOjX2aKnD/eWMpsIjTixkzd RG4iwhnZZBOoEKyk1w3F+ltEaO51A+9V49nEr2JiHE= Received: from zelretch.yomi.darkbsd.org ([127.0.0.1]) by zelretch.yomi.darkbsd.org (zelretch.yomi.darkbsd.org [127.0.0.1]) (amavisd-new, port 10026) with ESMTP id H1948hLH4BcZ; Thu, 23 Sep 2010 22:36:00 +0900 (JST) Received: from [192.168.3.42] (archer.yomi.darkbsd.org [192.168.3.42]) (Authenticated sender: darksoul) by zelretch.yomi.darkbsd.org (Postfix) with ESMTPSA id 060882840C4; Thu, 23 Sep 2010 22:35:59 +0900 (JST) Message-ID: <4C9B57DF.4090905@darkbsd.org> Date: Thu, 23 Sep 2010 22:36:31 +0900 From: DarkSoul User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.9pre) Gecko/20100821 Lanikai/3.1.3pre MIME-Version: 1.0 To: Niklas Saers References: <2EA9CBBC-3F97-4AF2-BFB5-96DF39FDE376@saers.com> In-Reply-To: <2EA9CBBC-3F97-4AF2-BFB5-96DF39FDE376@saers.com> Content-Type: text/plain; 
        charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: freebsd-scsi@freebsd.org
Subject: Re: mfi - setting up disks
X-List-Received-Date: Thu, 23 Sep 2010 13:56:04 -0000

Hello,

I already tinkered with that kind of controller too (it seems we faced
the same problems in the same order too...).

For this kind of purpose, you really want a controller that does real
JBOD and tells you about what is going on with the disk. :/

I bumped into exactly the same problem as you, meaning that removing the
drive will destroy the RAID-0 volume, and frankly, even if the tools
were working for recreating the volumes, having to do that to see the
disk again is a "dirty hack"(tm) that is bound to blow up in your face
at some point, or to be really hard to maintain for any upcoming
upgrade.

Sorry for not providing any additional help with your problem. :/

On 09/23/2010 09:55 PM, Niklas Saers wrote:
> Hi guys,
> In the SuperMicro system where I had problems with the mpt controller,
> I switched it for a mfi-based controller. I had it set up with 36x
> RAID0 volumes with each their own disk (no way to access the disk
> otherwise I found), and added them to a ZFS system. The numbering
> became a bit weird, so I pulled the disks out one by one and put them
> back to figure out and note down what disk number was in what slot.
> Only test data on my ZFS volume, so I didn't mind that crashing.
>
> Now that all disks have been taken out and put back in one by one, I do:
>
> # mfiutil show volumes
> mfi0 Volumes:
>   Id     Size    Level   Stripe  State   Cache   Name
>
> Whoops, no volumes? That can't be good. I check up on the disks,
> they're all there:
>
> mfiutil show drives
> mfi0 Physical Drives:
> ( 1863G) UNCONFIGURED GOOD SATA enclosure 1, slot 2
> ( 1863G) UNCONFIGURED GOOD SATA enclosure 1, slot 3
> ( 1863G) UNCONFIGURED GOOD SATA enclosure 1, slot 4
> ( 1863G) UNCONFIGURED GOOD SATA enclosure 1, slot 14
> ( 1863G) UNCONFIGURED GOOD SATA enclosure 1, slot 15
> ( 1863G) UNCONFIGURED GOOD SATA enclosure 1, slot 17
> ( 1863G) UNCONFIGURED GOOD SATA enclosure 1, slot 23
> ( 1863G) UNCONFIGURED GOOD SATA enclosure 2, slot 1
> ( 1863G) UNCONFIGURED GOOD SATA enclosure 2, slot 4
> ( 1863G) UNCONFIGURED GOOD SATA enclosure 2, slot 5
> ( 1863G) UNCONFIGURED GOOD SATA enclosure 2, slot 6
> ( 1863G) UNCONFIGURED GOOD SATA enclosure 2, slot 7
> ( 1863G) UNCONFIGURED GOOD SATA enclosure 2, slot 8
> ( 1863G) UNCONFIGURED GOOD SATA enclosure 2, slot 10
> ( 1863G) UNCONFIGURED GOOD SATA enclosure 2, slot 11
> ( 1863G) UNCONFIGURED GOOD SATA enclosure 1, slot 0
> ( 1863G) UNCONFIGURED GOOD SATA enclosure 1, slot 5
> ( 1863G) UNCONFIGURED GOOD SATA enclosure 1, slot 6
> ( 1863G) UNCONFIGURED GOOD SATA enclosure 1, slot 7
> ( 1863G) UNCONFIGURED GOOD SATA enclosure 1, slot 8
> ( 1863G) UNCONFIGURED GOOD SATA enclosure 1, slot 9
> ( 1863G) UNCONFIGURED GOOD SATA enclosure 1, slot 10
> ( 1863G) UNCONFIGURED GOOD SATA enclosure 1, slot 11
> ( 1863G) UNCONFIGURED GOOD SATA enclosure 1, slot 12
> ( 1863G) UNCONFIGURED GOOD SATA enclosure 1, slot 13
> ( 1863G) UNCONFIGURED GOOD SATA enclosure 1, slot 16
> ( 1863G) UNCONFIGURED GOOD SATA enclosure 1, slot 19
> ( 1863G) UNCONFIGURED GOOD SATA enclosure 1, slot 21
> ( 1863G) UNCONFIGURED GOOD SATA enclosure 1, slot 22
> ( 1863G) UNCONFIGURED GOOD SATA enclosure 2, slot 2
> ( 1863G) UNCONFIGURED GOOD SATA enclosure 2, slot 3
> ( 1863G) UNCONFIGURED GOOD SATA enclosure 2, slot 9
> ( 1863G) UNCONFIGURED GOOD SATA enclosure 1, slot 1
> ( 1863G) UNCONFIGURED GOOD SATA enclosure 1, slot 20
> ( 1863G) UNCONFIGURED GOOD SATA enclosure 2, slot 0
> ( 1863G) UNCONFIGURED GOOD SATA enclosure 1, slot 18
>
> Well, that's an excellent way of adding the volumes in the sequence
> they appear physically, so I start with the first one from top left:
> # mfiutil create raid0 -v E01:S05
> Adding drive 26 to array 0
> Adding array 0 to volume 0
> mfiutil: Command failed: Wrong firmware or drive state
> mfiutil: Failed to add volume: Input/output error
>
> Firmware error? Invalid drive state?
>
> # mfiutil good E01:S05
> mfiutil: Drive 26 is already in the desired state
>
> Seems to be good... do I have a firmware issue? Check dmesg:
>
> mfi0: port 0xc000-0xc0ff mem 0xfad7c000-0xfad7ffff,0xfadc0000-0xfadfffff irq 16 at device 0.0 on pci5
> mfi0: Megaraid SAS driver Ver 3.00
> mfi0: 1966 (338565595s/0x0020/info) - Shutdown command received from host
> mfi0: 1967 (boot + 3s/0x0020/info) - Firmware initialization started (PCI ID 0079/1000/9261/1000)
> mfi0: 1968 (boot + 3s/0x0020/info) - Firmware version 2.0.03-0673
> mfi0: 1969 (boot + 4s/0x0020/info) - Board Revision
> mfi0: 1970 (boot + 24s/0x0004/info) - Enclosure (SES) discovered on PD 08(c Port 0 - 3/p1)
> mfi0: 1971 (boot + 24s/0x0004/info) - Enclosure (SES) discovered on PD 09(c Port 0 - 3/p2)
> mfi0: 1972 (boot + 24s/0x0004/info) - Enclosure PD 08(c Port 0 - 3/p1) communication restored
> mfi0: 1973 (boot + 24s/0x0004/info) - Enclosure PD 08(c Port 0 - 3/p1) fan 1 speed changed
> mfi0: 1974 (boot + 24s/0x0004/info) - Enclosure PD 08(c Port 0 - 3/p1) fan 2 speed changed
> mfi0: 1975 (boot + 24s/0x0004/info) - Enclosure PD 08(c Port 0 - 3/p1) fan 3 speed changed
> mfi0: 1976 (boot + 24s/0x0004/info) - Enclosure PD 09(c Port 0 - 3/p2) communication restored
> mfi0: 1977 (boot + 24s/0x0004/info) - Enclosure PD 09(c Port 0 - 3/p2) fan 1 speed changed
> mfi0: 1978 (boot + 24s/0x0004/info) - Enclosure PD 09(c Port 0 - 3/p2) fan 2 speed changed
> mfi0: 1979 (boot + 24s/0x0004/info) - Enclosure PD 09(c Port 0 - 3/p2) fan 3 speed changed
> mfi0: 1980 (boot + 24s/0x0002/info) - Inserted: Encl PD 08
> mfi0: 1981 (boot + 24s/0x0002/info) - Inserted: PD 08(c Port 0 - 3/p1) Info: enclPd=08, scsiType=d, portMap=00, sasAddr=50030480008fb0fd,0000000000000000
> mfi0: 1982 (boot + 24s/0x0002/info) - Inserted: Encl PD 09
> mfi0: 1983 (boot + 24s/0x0002/info) - Inserted: PD 09(c Port 0 - 3/p2) Info: enclPd=09, scsiType=d, portMap=00, sasAddr=50030480008e7b7d,0000000000000000
> mfi0: 1984 (boot + 24s/0x0002/info) - Inserted: PD 0a(e0x08/s2)
>
> and then lots of disks.... looks fine, right?
>
> What am I missing? I badly want mfid0-mfid35 back so that I can
> recreate my ZFS and get to work :-)
>
> On a side note, the ZFS I'll make is this, any comments on the
> configuration?
>
> zpool create tank \
> raidz2 /dev/mfid0 /dev/mfid1 /dev/mfid2 /dev/mfid3 /dev/mfid4 /dev/mfid5 /dev/mfid30 \
> raidz2 /dev/mfid6 /dev/mfid7 /dev/mfid8 /dev/mfid9 /dev/mfid10 /dev/mfid11 /dev/mfid31 \
> raidz2 /dev/mfid12 /dev/mfid13 /dev/mfid14 /dev/mfid15 /dev/mfid16 /dev/mfid17 /dev/mfid32 \
> raidz2 /dev/mfid18 /dev/mfid19 /dev/mfid20 /dev/mfid21 /dev/mfid22 /dev/mfid23 /dev/mfid33 \
> raidz2 /dev/mfid24 /dev/mfid25 /dev/mfid26 /dev/mfid27 /dev/mfid28 /dev/mfid29 /dev/mfid34 \
> spare /dev/mfid35
>
>
> Cheers
>
> Nik_______________________________________________
> freebsd-scsi@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-scsi
> To unsubscribe, send any mail to "freebsd-scsi-unsubscribe@freebsd.org"

-- 
Stephane LAPIE, EPITA SRS, Promo 2005
"Even when they have digital readouts, I can't understand them."
--MegaTokyo

From owner-freebsd-scsi@FreeBSD.ORG Thu Sep 23 14:30:42 2010
Return-Path:
Delivered-To: freebsd-scsi@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
        by hub.freebsd.org (Postfix) with ESMTP id 039E41065693 for ;
        Thu, 23 Sep 2010 14:30:42 +0000 (UTC)
        (envelope-from mjohnston@sandvine.com)
Received: from mail1.sandvine.com (Mail1.sandvine.com [64.7.137.134])
        by mx1.freebsd.org (Postfix) with ESMTP id 998538FC1F for ;
        Thu, 23 Sep 2010 14:30:41 +0000 (UTC)
Received: from WTL-EXCH-2.sandvine.com ([fe80::8959:ede3:2dbe:c1b])
        by wtl-exch-1.sandvine.com ([fe80::f523:8e57:71d7:5206%14]) with mapi;
        Thu, 23 Sep 2010 10:19:54 -0400
From: Mark Johnston
To: "freebsd-scsi@freebsd.org"
Thread-Topic: [PATCH] trigger cam rescans in aac
Thread-Index: ActbJ+sd+JKJPIh2QYClvJChBIisVAAAmR0A
Date: Thu, 23 Sep 2010 14:19:47 +0000
Message-ID: <649630C24D5E884F85DD13645FE903A7ACA7@wtl-exch-2.sandvine.com>
Accept-Language: en-US
Content-Language: en-US
X-MS-Has-Attach:
X-MS-TNEF-Correlator:
x-cr-puzzleid: {C9EA6ECB-B36C-40D7-8E69-DD753925B556}
x-cr-hashedpuzzle: hCc= AIVR BaGD BkGe CUuX Ddbi DxBp EDwo EjTh FL63 FYvy
        Fm/d GL2D GfU2 HWtE IIPu; 1;
        ZgByAGUAZQBiAHMAZAAtAHMAYwBzAGkAQABmAHIAZQBlAGIAcwBkAC4AbwByAGcA;
        Sosha1_v1; 7; {C9EA6ECB-B36C-40D7-8E69-DD753925B556};
        bQBqAG8AaABuAHMAdABvAG4AQABzAGEAbgBkAHYAaQBuAGUALgBjAG8AbQA=;
        Thu, 23 Sep 2010 14:19:47 GMT;
        WwBQAEEAVABDAEgAXQAgAHQAcgBpAGcAZwBlAHIAIABjAGEAbQAgAHIAZQBzAGMAYQBuAHMAIABpAG4AIABhAGEAYwA=
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Subject: [PATCH] trigger cam rescans in aac
X-List-Received-Date: Thu, 23 Sep 2010 14:30:42 -0000

Hi,

Previously, the aac driver did not handle enclosure management AIFs,
which were raised during hot-swap events.
Now such events trigger cam rescans, as is done in the mps driver.

Any comments or suggestions would be appreciated.

Thanks,
-Mark

--- tmp/aac/aac.c	2010-09-16 13:24:53.000000000 -0400
+++ aac.c	2010-09-21 13:40:51.000000000 -0400
@@ -3209,6 +3209,7 @@ aac_handle_aif(struct aac_softc *sc, str
 	struct aac_mntinforesp *mir;
 	int next, current, found;
 	int count = 0, added = 0, i = 0;
+	uint32_t channel;
 
 	fwprintf(sc, HBA_FLAGS_DBG_FUNCTION_ENTRY_B, "");
 
@@ -3316,7 +3317,25 @@ aac_handle_aif(struct aac_softc *sc, str
 			}
 
 			break;
-
+		case AifEnEnclosureManagement:
+			switch (aif->data.EN.data.EEE.eventType) {
+			case AIF_EM_DRIVE_INSERTION:
+			case AIF_EM_DRIVE_REMOVAL:
+				channel = aif->data.EN.data.EEE.unitID;
+				if (sc->cam_rescan_cb != NULL)
+					sc->cam_rescan_cb(sc,
+					    (channel >> 24) & 0xF,
+					    (channel & 0xFFFF));
+				break;
+			}
+			break;
+		case AifEnAddJBOD:
+		case AifEnDeleteJBOD:
+			channel = aif->data.EN.data.ECE.container;
+			if (sc->cam_rescan_cb != NULL)
+				sc->cam_rescan_cb(sc, (channel >> 24) & 0xF,
+				    AAC_CAM_TARGET_WILDCARD);
+			break;
 		default:
 			break;
 		}
--- tmp/aac/aac_cam.c	2010-09-16 13:24:53.000000000 -0400
+++ aac_cam.c	2010-09-21 13:40:21.000000000 -0400
@@ -37,12 +37,15 @@ __FBSDID("$FreeBSD: src/sys/dev/aac/aac_
 #include
 #include
 #include
+#include
 #include
 #include
+#include
 
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -76,6 +79,8 @@ static int aac_cam_detach(device_t dev);
 static void aac_cam_action(struct cam_sim *, union ccb *);
 static void aac_cam_poll(struct cam_sim *);
 static void aac_cam_complete(struct aac_command *);
+static void aac_cam_rescan(struct aac_softc *sc, uint32_t channel,
+    uint32_t target_id);
 static u_int32_t aac_cam_reset_bus(struct cam_sim *, union ccb *);
 static u_int32_t aac_cam_abort_ccb(struct cam_sim *, union ccb *);
 static u_int32_t aac_cam_term_io(struct cam_sim *, union ccb *);
@@ -101,6 +106,39 @@ MODULE_DEPEND(aacp, cam, 1, 1, 1);
 MALLOC_DEFINE(M_AACCAM, "aaccam", "AAC CAM info");
 
 static void
+aac_cam_rescan(struct aac_softc *sc, uint32_t channel, uint32_t target_id)
+{
+	union ccb *ccb;
+	struct aac_sim *sim;
+	struct aac_cam *camsc;
+
+	TAILQ_FOREACH(sim, &sc->aac_sim_tqh, sim_link) {
+		camsc = sim->aac_cam;
+		if (camsc == NULL || camsc->inf == NULL ||
+		    camsc->inf->BusNumber != channel)
+			continue;
+
+		ccb = xpt_alloc_ccb_nowait();
+		if (ccb == NULL) {
+			device_printf(sc->aac_dev,
+			    "Cannot allocate ccb for bus rescan.\n");
+			return;
+		}
+
+		if (xpt_create_path(&ccb->ccb_h.path, xpt_periph,
+		    cam_sim_path(camsc->sim),
+		    target_id, CAM_LUN_WILDCARD) != CAM_REQ_CMP) {
+			xpt_free_ccb(ccb);
+			device_printf(sc->aac_dev,
+			    "Cannot create path for bus rescan.\n");
+			return;
+		}
+		xpt_rescan(ccb);
+		break;
+	}
+}
+
+static void
 aac_cam_event(struct aac_softc *sc, struct aac_event *event, void *arg)
 {
 	union ccb *ccb;
@@ -141,6 +179,7 @@ aac_cam_detach(device_t dev)
 
 	camsc = (struct aac_cam *)device_get_softc(dev);
 	sc = camsc->inf->aac_sc;
+	camsc->inf->aac_cam = NULL;
 
 	mtx_lock(&sc->aac_io_lock);
 
@@ -149,6 +188,8 @@ aac_cam_detach(device_t dev)
 	xpt_bus_deregister(cam_sim_path(camsc->sim));
 	cam_sim_free(camsc->sim, /*free_devq*/TRUE);
 
+	sc->cam_rescan_cb = NULL;
+
 	mtx_unlock(&sc->aac_io_lock);
 
 	return (0);
@@ -171,6 +212,7 @@ aac_cam_attach(device_t dev)
 	camsc = (struct aac_cam *)device_get_softc(dev);
 	inf = (struct aac_sim *)device_get_ivars(dev);
 	camsc->inf = inf;
+	camsc->inf->aac_cam = camsc;
 
 	devq = cam_simq_alloc(inf->TargetsPerBus);
 	if (devq == NULL)
@@ -198,6 +240,7 @@ aac_cam_attach(device_t dev)
 		mtx_unlock(&inf->aac_sc->aac_io_lock);
 		return (EIO);
 	}
+	inf->aac_sc->cam_rescan_cb = aac_cam_rescan;
 	mtx_unlock(&inf->aac_sc->aac_io_lock);
 
 	camsc->sim = sim;
@@ -611,4 +654,3 @@ aac_cam_term_io(struct cam_sim *sim, uni
 {
 	return (CAM_UA_TERMIO);
 }
-
--- tmp/aac/aacvar.h	2010-09-16 13:24:53.000000000 -0400
+++ aacvar.h	2010-09-21 13:41:14.000000000 -0400
@@ -120,6 +120,7 @@ struct aac_sim
 	int BusNumber;
 	int InitiatorBusId;
 	struct aac_softc *aac_sc;
+	struct aac_cam *aac_cam;
 	TAILQ_ENTRY(aac_sim) sim_link;
 };
 
@@ -383,6 +384,9 @@ struct aac_softc
 	struct selinfo rcv_select;
 	struct proc *aifthread;
 	int aifflags;
+#define AAC_CAM_TARGET_WILDCARD ~0
+	void (*cam_rescan_cb)(struct aac_softc *, uint32_t,
+	    uint32_t);
 #define AAC_AIFFLAGS_RUNNING (1 << 0)
 #define AAC_AIFFLAGS_AIF (1 << 1)
 #define AAC_AIFFLAGS_EXIT (1 << 2)
--- tmp/aac/aacreg.h	2010-09-16 13:24:53.000000000 -0400
+++ aacreg.h	2010-09-20 14:27:52.000000000 -0400
@@ -885,6 +885,8 @@ typedef enum {
 	AifEnBatteryNeedsRecond, /* The battery needs reconditioning */
 	AifEnClusterEvent, /* Some cluster event */
 	AifEnDiskSetEvent, /* A disk set event occured. */
+	AifEnAddJBOD=31, /* A new JBOD type drive was created. */
+	AifEnDeleteJBOD, /* A JBOD type drive was deleted. */
 	AifDriverNotifyStart=199, /* Notifies for host driver go here */
 	/* Host driver notifications start here */
 	AifDenMorphComplete, /* A morph operation completed */
@@ -922,6 +924,11 @@ struct aac_AifEnsEnclosureEvent {
 	u_int32_t eventType; /* event type */
 } __packed;
 
+typedef enum {
+	AIF_EM_DRIVE_INSERTION=31,
+	AIF_EM_DRIVE_REMOVAL
+} aac_AifEMEventType;
+
 struct aac_AifEnsBatteryEvent {
 	AAC_NVBATT_TRANSITION transition_type; /* eg from low to ok */
 	AAC_NVBATTSTATUS current_state; /* current batt state */

From owner-freebsd-scsi@FreeBSD.ORG Fri Sep 24 22:18:31 2010
Return-Path:
Delivered-To: freebsd-scsi@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
        by hub.freebsd.org (Postfix) with ESMTP id EB069106566C for ;
        Fri, 24 Sep 2010 22:18:31 +0000 (UTC)
        (envelope-from tomas.hlavacek@elfove.cz)
Received: from manwe.elfove.cz (unknown [IPv6:2001:1ab0:7e1e:d150:21e:bff:febc:5fe8])
        by mx1.freebsd.org (Postfix) with ESMTP id 7AD758FC15 for ;
        Fri, 24 Sep 2010 22:18:31 +0000 (UTC)
Received: from [192.168.1.24] (ip-78-45-42-7.net.upcbroadband.cz [78.45.42.7])
        (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits))
        (No client certificate requested)
        by manwe.elfove.cz (Postfix) with ESMTPSA id A897674277 for ;
        Sat, 25 Sep 2010 00:18:29 +0200 (CEST)
Message-ID: <4C9D23B5.5050303@elfove.cz>
Date: Sat, 25 Sep 2010 00:18:29 +0200
From: Tomas Hlavacek
User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.1.12) Gecko/20100917 Icedove/3.0.8
MIME-Version: 1.0
To: freebsd-scsi@freebsd.org
X-Enigmail-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Subject: Re: mpt0 and removing disks
X-List-Received-Date: Fri, 24 Sep 2010 22:18:32 -0000

Hi!

I had the same problem and tried to look into it a bit. I tried to read
the Linux mptsas driver, but I have only found out that lots of messages
from the HW are ignored in FreeBSD. It would be a month-long job (at
least for me) to understand what they mean and how to port the
hotplugging-related code to FreeBSD.

So I have made a simple hack or workaround to make the FreeBSD kernel
notice that a device is lost when I physically disconnect the disk, to
prevent a freeze and/or kernel panic (that's what happened to me when I
disconnected one or more disks being used at the moment by ZFS).

I have to warn that I am not a FreeBSD developer at all; actually I am
pretty new here in *BSD. But anyway, this hack (for FreeBSD-stable)
worked for me:

--- sys/dev/mpt/mpt_cam.c.orig	2010-07-22 17:38:36.000000000 +0200
+++ sys/dev/mpt/mpt_cam.c	2010-09-24 23:19:30.000000000 +0200
@@ -2415,6 +2415,12 @@
 		xpt_async(AC_BUS_RESET, mpt->path, NULL);
 		break;
 
+// Hacked MPI_EVENT_SAS_PHY_LINK_STATUS handler to react on SAS device removal.
+	case MPI_EVENT_SAS_PHY_LINK_STATUS:
+		mpt_prt(mpt, "Bus reset due to SAS PHY link status change.\n");
+		xpt_async(AC_BUS_RESET, mpt->path, NULL);
+		break;
+
 	case MPI_EVENT_RESCAN:
 #if __FreeBSD_version >= 600000
 	{

It seems that this prevents ending up in the loop waiting for command
completion on SATA disks, since the device is kicked out before that.
With SAS disks it behaves a bit differently: a few timeouts and
"completing timeouted/..." messages are printed, but after a while it
resets the driver and realizes that the disk is lost.

I think it would be much better to parse the data0 and/or data1
variables in order to react only on device removal, but I have not
managed to understand the meaning of each bit there, even though I tried
to consult the (much more advanced) Linux mptsas driver.

BTW.: Sorry for starting a new thread for this. I just subscribed to the
mailing list, and in the archives I cannot see the Message-ID and other
headers of the original message to reply to it properly.

Tomas

-- 
Tomáš Hlaváček

From owner-freebsd-scsi@FreeBSD.ORG Fri Sep 24 22:45:47 2010
Return-Path:
Delivered-To: freebsd-scsi@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
        by hub.freebsd.org (Postfix) with ESMTP id 8A5BC106566C for ;
        Fri, 24 Sep 2010 22:45:47 +0000 (UTC)
        (envelope-from des@des.no)
Received: from smtp.des.no (smtp.des.no [194.63.250.102])
        by mx1.freebsd.org (Postfix) with ESMTP id 4BAF38FC17 for ;
        Fri, 24 Sep 2010 22:45:47 +0000 (UTC)
Received: from ds4.des.no (des.no [84.49.246.2])
        by smtp.des.no (Postfix) with ESMTP id 777101FFC53;
        Fri, 24 Sep 2010 22:45:46 +0000 (UTC)
Received: by ds4.des.no (Postfix, from userid 1001) id 3D9E6844A3;
        Sat, 25 Sep 2010 00:45:46 +0200 (CEST)
From: Dag-Erling Smørgrav
To: Niklas Saers
References: <2EA9CBBC-3F97-4AF2-BFB5-96DF39FDE376@saers.com>
Date: Sat, 25 Sep 2010 00:45:46 +0200
In-Reply-To: <2EA9CBBC-3F97-4AF2-BFB5-96DF39FDE376@saers.com>
        (Niklas Saers's message of "Thu, 23 Sep 2010 14:55:02 +0200")
Message-ID: <86aan67obp.fsf@ds4.des.no>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/23.2 (berkeley-unix)
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable
Cc: freebsd-scsi@freebsd.org
Subject: Re: mfi - setting up disks
X-List-Received-Date: Fri, 24 Sep 2010 22:45:47 -0000

Niklas Saers writes:
> In the SuperMicro system where I had problems with the mpt controller,
> I switched it for a mfi-based controller. I had it set up with 36x
> RAID0 volumes with each their own disk (no way to access the disk
> otherwise I found), and added them to a ZFS system. The numbering
> became a bit weird, so I pulled the disks out one by one and put them
> back to figure out and note down what disk number was in what
> slot. Only test data on my ZFS volume, so I didn't mind that crashing.

You can wire down SCSI buses and disks in /boot/device.hints so each
disk always gets the same device number regardless of the order in which
the disks spin up. The syntax is documented in /sys/conf/NOTES (search
for "SCSI DEVICE CONFIGURATION").

It's a CAM feature, and mfi uses CAM, so I *think* it should work for
mfi as well, but what you'll actually be wiring down are mfi volumes,
not individual disks, so it's up to you to assign the right disk to the
right volume.

I am very close to suggesting that you just let the controller handle
the RAID part of things and just build your zfs pool on top of that,
i.e. use mfiutil to divide your disks into 5 x 6+1, 5 x 5+2 or 7 x 4+1
volumes plus one spare, and use each volume as a separate vdev in your
zfs pool. You may lose a small amount of performance, and rebuilds will
be slower, but the main argument in favor of zfs, the write hole, is
moot if your controller has battery-backed cache.

DES
-- 
Dag-Erling Smørgrav - des@des.no
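
For reference, the wiring DES describes might look roughly like the
fragment below. This is only a sketch in the style of the "SCSI DEVICE
CONFIGURATION" examples in /sys/conf/NOTES. The controller name mpt0 and
the targets 0 and 2 are borrowed from the camcontrol devlist output
earlier in this archive purely for illustration, and it is an assumption
(one DES himself only "thinks" holds) that the same approach carries
over to mfi volumes.

```
# /boot/device.hints: illustrative fragment, not a tested configuration.
# Assumption: the controller attaches as mpt0 and owns CAM bus 0.
hint.scbus.0.at="mpt0"

# Wire da0 to target 0, LUN 0 on that bus.
hint.da.0.at="scbus0"
hint.da.0.target="0"
hint.da.0.unit="0"

# Wire da1 to target 2, LUN 0 on the same bus.
hint.da.1.at="scbus0"
hint.da.1.target="2"
hint.da.1.unit="0"
```

With hints like these the da unit numbers follow bus and target rather
than probe order, so a disk that spins up late should keep its name. The
hints are read at boot, so a reboot is needed before they take effect.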