From owner-freebsd-scsi@FreeBSD.ORG Mon Aug 8 10:30:40 2005 Return-Path: X-Original-To: freebsd-scsi@freebsd.org Delivered-To: freebsd-scsi@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id E492016A41F for ; Mon, 8 Aug 2005 10:30:40 +0000 (GMT) (envelope-from danny@cs.huji.ac.il) Received: from cs1.cs.huji.ac.il (cs1.cs.huji.ac.il [132.65.16.10]) by mx1.FreeBSD.org (Postfix) with ESMTP id 7D43143D45 for ; Mon, 8 Aug 2005 10:30:40 +0000 (GMT) (envelope-from danny@cs.huji.ac.il) Received: from pampa.cs.huji.ac.il ([132.65.80.32]) by cs1.cs.huji.ac.il with esmtp id 1E24tu-0000Hl-1P; Mon, 08 Aug 2005 13:30:34 +0300 X-Mailer: exmh version 2.7.0 06/18/2004 with nmh-1.0.4 To: freebsd-scsi@freebsd.org Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Date: Mon, 08 Aug 2005 13:30:33 +0300 From: Danny Braniss Message-ID: Cc: Subject: CAM, SCSIn/iSCSI & LUNs X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 08 Aug 2005 10:30:41 -0000 hi, it seems that one of the differences between the SCSI1/2/3/4/i is the size of the LUN :-) Now, it seems that the CAM will search sequencially for LUNs, from 0 -> max_lun which i don't think will scale nicely. is there a way to tell the cam to do a scsi report luns command, instead of the sequential search? danny From owner-freebsd-scsi@FreeBSD.ORG Mon Aug 8 11:02:04 2005 Return-Path: X-Original-To: freebsd-scsi@freebsd.org Delivered-To: freebsd-scsi@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 096A216A42A for ; Mon, 8 Aug 2005 11:02:04 +0000 (GMT) (envelope-from owner-bugmaster@freebsd.org) Received: from freefall.freebsd.org (freefall.freebsd.org [216.136.204.21]) by mx1.FreeBSD.org (Postfix) with ESMTP id 6ACD343D46 for ; Mon, 8 Aug 2005 11:02:02 +0000 (GMT) (envelope-from owner-bugmaster@freebsd.org) Received: from freefall.freebsd.org (peter@localhost [127.0.0.1]) by freefall.freebsd.org (8.13.3/8.13.3) with ESMTP id j78B22sM006937 for ; Mon, 8 Aug 2005 11:02:02 GMT (envelope-from owner-bugmaster@freebsd.org) Received: (from peter@localhost) by freefall.freebsd.org (8.13.3/8.13.1/Submit) id j78B21Y8006928 for freebsd-scsi@freebsd.org; Mon, 8 Aug 2005 11:02:01 GMT (envelope-from owner-bugmaster@freebsd.org) Date: Mon, 8 Aug 2005 11:02:01 GMT Message-Id: <200508081102.j78B21Y8006928@freefall.freebsd.org> X-Authentication-Warning: freefall.freebsd.org: peter set sender to owner-bugmaster@freebsd.org using -f From: FreeBSD bugmaster To: freebsd-scsi@FreeBSD.org Cc: Subject: Current problem reports assigned to you X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 08 Aug 2005 11:02:05 -0000 Current FreeBSD problem reports Critical problems Serious problems S Submitted Tracker Resp. Description ------------------------------------------------------------------------------- o [2001/05/03] kern/27059 scsi (symbios) SCSI subsystem hangs under heav o [2001/06/29] kern/28508 scsi problems with backup to Tandberg SLR40 st o [2002/06/17] kern/39388 scsi ncr/sym drivers fail with 53c810 and more o [2002/07/22] kern/40895 scsi wierd kernel / device driver bug s [2003/09/30] kern/57398 scsi Current fails to install on mly(4) based o [2003/12/26] kern/60598 scsi wire down of scsi devices conflicts with a [2004/01/10] kern/61165 scsi [panic] kernel page fault after calling c o [2004/09/15] kern/71778 scsi 5.3 BETA3 doesnt see Adaptec 2015S FW Rev o [2004/12/02] kern/74607 scsi FreeBSD 5.3 install CD crashes on SCSI de o [2004/12/02] kern/74627 scsi Adaptec 2940U2W Can't boot 5.3 10 problems total. Non-critical problems S Submitted Tracker Resp. Description ------------------------------------------------------------------------------- o [2000/12/06] kern/23314 scsi aic driver fails to detect Adaptec 1520B o [2001/08/15] kern/29727 scsi [amr] [patch] amr_enquiry3 structure in a o [2002/02/23] kern/35234 scsi World access to /dev/pass? (for scanner) o [2002/06/02] kern/38828 scsi [feature request] DPT PM2012B/90 doesn't o [2002/10/29] kern/44587 scsi dev/dpt/dpt.h is missing defines required o [2003/10/01] kern/57469 scsi [patch] Quirk for Conner CP3500 6 problems total. From owner-freebsd-scsi@FreeBSD.ORG Mon Aug 8 15:06:14 2005 Return-Path: X-Original-To: freebsd-scsi@freebsd.org Delivered-To: freebsd-scsi@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 8901116A41F for ; Mon, 8 Aug 2005 15:06:14 +0000 (GMT) (envelope-from lydianconcepts@gmail.com) Received: from rproxy.gmail.com (rproxy.gmail.com [64.233.170.200]) by mx1.FreeBSD.org (Postfix) with ESMTP id 1D78A43D46 for ; Mon, 8 Aug 2005 15:06:13 +0000 (GMT) (envelope-from lydianconcepts@gmail.com) Received: by rproxy.gmail.com with SMTP id i8so914720rne for ; Mon, 08 Aug 2005 08:06:13 -0700 (PDT) DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=beta; d=gmail.com; h=received:message-id:date:from:to:subject:cc:in-reply-to:mime-version:content-type:content-transfer-encoding:content-disposition:references; b=aBOhH0RkO1nRyVovrrXbfhu3n1MN87DMLVOdWHusLB0QF+0IGp5i5nXTeRfi7dKuoZYK/7Qjg0k7uVBzFLBvD66/76l6KtEg6s8kZ1yDu0s0Y8LQw1pH7yLJgA9TY0PYmjEjDf403ZLdREG13+BukMl8q63uLLlT8STujFRpOwc= Received: by 10.39.3.19 with SMTP id f19mr2592733rni; Mon, 08 Aug 2005 08:06:13 -0700 (PDT) Received: by 10.38.104.60 with HTTP; Mon, 8 Aug 2005 08:06:13 -0700 (PDT) Message-ID: <7579f7fb05080808066c205a08@mail.gmail.com> Date: Mon, 8 Aug 2005 08:06:13 -0700 From: Matthew Jacob To: Danny Braniss In-Reply-To: Mime-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Content-Disposition: inline References: Cc: freebsd-scsi@freebsd.org Subject: Re: CAM, SCSIn/iSCSI & LUNs X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 08 Aug 2005 15:06:14 -0000 People have yacked for years about implementing REPORT_LUNS within FreeBSD/CAM. I mean, Windows has only used it for the last 8 years, so it's probably okay. It's been nothing but talk really so far. On 8/8/05, Danny Braniss wrote: > hi, > it seems that one of the differences between the SCSI1/2/3/4/i > is the size of the LUN :-) >=20 > Now, it seems that the CAM will search sequencially for LUNs, from 0 -> m= ax_lun > which i don't think will scale nicely. >=20 > is there a way to tell the cam to do a scsi report luns command, instead = of > the sequential search? >=20 > danny >=20 >=20 > _______________________________________________ > freebsd-scsi@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-scsi > To unsubscribe, send any mail to "freebsd-scsi-unsubscribe@freebsd.org" > From owner-freebsd-scsi@FreeBSD.ORG Mon Aug 8 16:10:49 2005 Return-Path: X-Original-To: freebsd-scsi@freebsd.org Delivered-To: freebsd-scsi@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id B55C716A41F for ; Mon, 8 Aug 2005 16:10:49 +0000 (GMT) (envelope-from scottl@samsco.org) Received: from pooker.samsco.org (pooker.samsco.org [168.103.85.57]) by mx1.FreeBSD.org (Postfix) with ESMTP id 4E5B143D58 for ; Mon, 8 Aug 2005 16:10:46 +0000 (GMT) (envelope-from scottl@samsco.org) Received: from [192.168.254.14] (imini.samsco.home [192.168.254.14]) (authenticated bits=0) by pooker.samsco.org (8.13.3/8.13.3) with ESMTP id j78GMQEk013071; Mon, 8 Aug 2005 10:22:27 -0600 (MDT) (envelope-from scottl@samsco.org) Message-ID: <42F783FC.7090904@samsco.org> Date: Mon, 08 Aug 2005 10:10:36 -0600 From: Scott Long User-Agent: Mozilla/5.0 (Macintosh; U; PPC Mac OS X Mach-O; en-US; rv:1.7.7) Gecko/20050416 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Danny Braniss References: In-Reply-To: Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-Spam-Status: No, score=-2.8 required=3.8 tests=ALL_TRUSTED autolearn=failed version=3.0.2 X-Spam-Checker-Version: SpamAssassin 3.0.2 (2004-11-16) on pooker.samsco.org Cc: freebsd-scsi@freebsd.org Subject: Re: CAM, SCSIn/iSCSI & LUNs X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 08 Aug 2005 16:10:49 -0000 Danny Braniss wrote: > hi, > it seems that one of the differences between the SCSI1/2/3/4/i > is the size of the LUN :-) > > Now, it seems that the CAM will search sequencially for LUNs, from 0 -> max_lun > which i don't think will scale nicely. > > is there a way to tell the cam to do a scsi report luns command, instead of > the sequential search? > > danny > > I, Matt Jacob, and Ken Merry talked about implementing REPORT_LUNS a couple of months ago, but it stalled for a reason that I cannot recall. Scott From owner-freebsd-scsi@FreeBSD.ORG Mon Aug 8 17:46:03 2005 Return-Path: X-Original-To: freebsd-scsi@freebsd.org Delivered-To: freebsd-scsi@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id D1CDE16A422 for ; Mon, 8 Aug 2005 17:46:03 +0000 (GMT) (envelope-from dbaukus@chiaro.com) Received: from rchss002.chiaro.com (rchss002.chiaro.com [63.88.196.82]) by mx1.FreeBSD.org (Postfix) with ESMTP id 5572343E72 for ; Mon, 8 Aug 2005 17:35:36 +0000 (GMT) (envelope-from dbaukus@chiaro.com) Received: from rchst007.cus.chiaro.com ([192.168.8.120]) by rchss002.chiaro.com (8.12.11/8.12.11) with SMTP id j78HWEe1015547 for ; Mon, 8 Aug 2005 12:32:14 -0500 (CDT) (envelope-from dbaukus@chiaro.com) Received: from chiaro.com ([192.168.25.95]) by rchst007.cus.chiaro.com with Microsoft SMTPSVC(5.0.2195.6713); Mon, 8 Aug 2005 12:35:34 -0500 Message-ID: <42F79956.8070401@chiaro.com> Date: Mon, 08 Aug 2005 12:41:42 -0500 From: dave baukus Organization: Chiaro Networks User-Agent: Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.6) Gecko/20040429 X-Accept-Language: en-us, en MIME-Version: 1.0 To: dave baukus References: <42C06470.9080700@chiaro.com> In-Reply-To: <42C06470.9080700@chiaro.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 8bit X-OriginalArrivalTime: 08 Aug 2005 17:35:34.0810 (UTC) FILETIME=[9516CFA0:01C59C3F] Cc: freebsd-scsi@freebsd.org Subject: Re: iur crash ? -- long X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 08 Aug 2005 17:46:04 -0000 I have hit the same iir crash again; that is gdt->sc_gccbs[]->gc_ucmd referencing freed memory. Can anyone tell me why iir_ioctl() is not broken in the GDT_IOCTL_GENERAL case ? The bug I see is that ucmd references the ioctl arguments which are either on the stack or malloc()ed by ioctl(). The code queues the command to the driver for processing at interrupt level. The code then checks a flag for completion; if not complete it tsleeps(). When tsleep() returns the code assumes the command is done and returns. If there is a signal pending tsleep() never sleeps and ioctl() frees the memory referenced by ucmd. Finally, the iir interrupt handler runs and attempts to process the command --- crash ! case GDT_IOCTL_GENERAL: { gdt_ucmd_t *ucmd; struct gdt_softc *gdt; int lock; ucmd = (gdt_ucmd_t *)cmdarg; gdt = gdt_minor2softc(ucmd->io_node); if (gdt == NULL) return (ENXIO); lock = splcam(); TAILQ_INSERT_TAIL(&gdt->sc_ucmd_queue, ucmd, links); ucmd->complete_flag = FALSE; splx(lock); gdt_next(gdt); if (!ucmd->complete_flag) (void) tsleep((void *)ucmd, PCATCH | PRIBIO, "iirucw", 0); break; } dave baukus wrote: > I have a crash on BSD4.10 w/ a heavily modified network stack, but > the disk/scsi subsystem is ostensively unmodified. > > I'm reasonably certain that the crash is caused by > iir_intr(void *arg) passing 0xcOdedead to bcopy() as a length. > We have INVARIANTS enabled. > > The quick question is: has this been seen ? > > The long question goes like this: > ---------------------- > Here's the stack trace: > #0 dumpsys () at ../../kern/kern_shutdown.c:519 > #1 0xc01e7253 in boot (howto=0x100) at ../../kern/kern_shutdown.c:331 > #2 0xc01e76cb in panic (fmt=0xc04cb120 "vm_fault: fault on nofault > entry, addr: %lx") > at ../../kern/kern_shutdown.c:635 > #3 0xc03c810c in vm_fault (map=0xc059fa78, vaddr=0xf5315000, > fault_type=0x1, fault_flags=0x0) > at ../../vm/vm_fault.c:240 > #4 0xc041ca5e in trap_pfault (frame=0xf8225d90, usermode=0x0, > eva=0xf5315000) at ../../i386/i386/trap.c:921 > #5 0xc041c5e0 in trap (frame={tf_fs = 0xc01e0010, tf_es = 0xffff0010, > tf_ds = 0xcb3c0010, tf_edi = 0xcbaeb52e, > tf_esi = 0xf5315000, tf_ebp = 0xf8225e0c, tf_isp = 0xf8225dbc, > tf_ebx = 0xc0dedead, tf_edx = 0xf5222b20, > tf_ecx = 0x3033ee73, tf_eax = 0xd67d652e, tf_trapno = 0xc, tf_err > = 0x0, tf_eip = 0xc041b502, tf_cs = 0x8, > tf_eflags = 0x10202, tf_esp = 0xcadf9000, tf_ss = 0xcb9f9000}) at > ../../i386/i386/trap.c:500 > #6 0xc041b502 in generic_bcopy () > #7 0xc04252fd in intr_mux (arg=x) at ../../i386/isa/intr_machdep.c:609 > #8 0xc040e74e in vec9 () > #9 0xc01df10b in exit1 (p=0xf80c88a0, rv=0xf) at > ../../kern/kern_exit.c:225 > #10 0xc01e9262 in sigexit (p=0xf80c88a0, sig=0xf) at > ../../kern/kern_sig.c:1519 > #11 0xc01e8fa4 in postsig (sig=0xf) at ../../kern/kern_sig.c:1422 > #12 0xc041d254 in syscall2 (frame={tf_fs = 0xbfbf002f, tf_es = > 0x8fe002f, tf_ds = 0xbfbf002f, tf_edi = 0xbfbff800, > tf_esi = 0x8fe5540, tf_ebp = 0xbfbff800, tf_isp = 0xf8225fd4, > tf_ebx = 0x64, tf_edx = 0xbfbff780, > tf_ecx = 0xbfbff700, tf_eax = 0x4, tf_trapno = 0x7, tf_err = 0x2, > tf_eip = 0x88fe6ef4, tf_cs = 0x1f, > tf_eflags = 0x203, tf_esp = 0xbfbff604, tf_ss = 0x2f}) at > ../../i386/i386/trap.c:177 > > -------------------------------- > Since intr_mux() does not call generic_bcopy()/bcopy(), the stack frames > must be a mangled. > The frames 4 to 7 look like: > > 0xf8225d10: 0xf8225d40 0xc01eabd3 0xffffffff > frame 4 > 0xf8225d44 > trap_pfault > 0xf8225d20: 0xc041ca5e 0xc059fa78 0xf5315000 0x00000001 > 0xf8225d30: 0x00000000 0x0000000c 0xf80c88a0 0xf5315000 > frame 5 trap > 0xf8225d40: 0xcb3cdf01 0xf8225d88 0xc041c5e0 0xf8225d90 > 0xf8225d50: 0x00000000 0xf5315000 0x006c0200 0xf5315000 > 0xf8225d60: 0xcbaeb52e 0x00000000 0xc0dedead 0x00000000 > 0xf8225d70: 0xcbae4e5a 0xf8225d80 0xc0421783 0xc04217e9 > calltrap > 0xf8225d80: 0xf8225e0c 0xc040dcb4 0xf8225e0c 0xc040d0c0 > 0xf8225d90: 0xc01e0010 0xffff0010 0xcb3c0010 0xcbaeb52e > 0xf8225da0: 0xf5315000 0xf8225e0c 0xf8225dbc 0xc0dedead > 0xf8225db0: 0xf5222b20 0x3033ee73 0xd67d652e 0x0000000c > 0xf8225dc0: 0x00000000 0xc041b502 0x00000008 0x00010202 > 0xf8225dd0: 0xcadf9000 0xcb9f9000 0xc01b7e11 0xf5222b20 > 0xf8225de0: 0xcb9f904e 0xc0dedead 0xc6048460 0x00400200 > 0xf8225df0: 0x00000000 0xf5222b20 0x006c0200 0x00000000 > 0xf8225e00: 0x00000000 0x00090001 0x006fc67b frame 7 > 0xf8225e24 > intr_mux > 0xf8225e10: 0xc04252fd 0xcadf9000 0x006c0200 0x00000000 > 0xf8225e20: 0xc2ee216c 0xf8225e90 0xc040e1d7 0xc6048460 > > ------------------------------------ > It's between frame 7 (intr_mux()) and frame 5 (trap()), that I begin > guessing at the sequence of events. > > Based on the 0xcadf9000 at 0xf8225e14 I speculate that iir_intr was the > last interrupt routine called. > Here is the intrec * list passed to intr_mux() > > set $P=(intrec *)0xc6048460 > (kgdb) intrecwalk $P > $186 = {mask = 0x6c0200, handler = 0xc01b7b90 , argument = > 0xcadf9000, next = 0xc60483e0, > name = 0xcadf3b80 "iir0", intr = 0x9, maskptr = 0xc0552b54, flags = 0x0} > $187 = {mask = 0x660200, handler = 0xc03ef350 , argument = > 0xcadfc000, next = 0xc6048260, > name = 0xcadf3a60 "em0", intr = 0x9, maskptr = 0xc0552b50, flags = 0x0} > $188 = {mask = 0x660200, handler = 0xc03ef350 , argument = > 0xcadfd000, next = 0xcae01e60, > name = 0xcadf3950 "em1", intr = 0x9, maskptr = 0xc0552b50, flags = 0x0} > $189 = {mask = 0x660200, handler = 0xc03ef350 , argument = > 0xcadff000, next = 0xcae01ce0, > name = 0xcadf3810 "em2", intr = 0x9, maskptr = 0xc0552b50, flags = 0x0} > $190 = {mask = 0x660200, handler = 0xc03ef350 , argument = > 0xcae03000, next = 0xcae01ba0, > name = 0xcadf3700 "em3", intr = 0x9, maskptr = 0xc0552b50, flags = 0x0} > $191 = {mask = 0x630212, handler = 0xc03b6570 , argument = > 0x0, next = 0xcae01b00, name = 0xcadf3620 "ics0", > intr = 0x9, maskptr = 0xc0552b48, flags = 0x0} > $192 = {mask = 0x660200, handler = 0xc03ef350 , argument = > 0xcae04000, next = 0xcae01a00, > name = 0xcadf3520 "em4", intr = 0x9, maskptr = 0xc0552b50, flags = 0x0} > $193 = {mask = 0x660200, handler = 0xc03ef350 , argument = > 0xcae06000, next = 0xcae01880, > name = 0xcadf3410 "em5", intr = 0x9, maskptr = 0xc0552b50, flags = 0x0} > $194 = {mask = 0x660200, handler = 0xc03ef350 , argument = > 0xcae07000, next = 0xcae01760, > name = 0xcadf3300 "em6", intr = 0x9, maskptr = 0xc0552b50, flags = 0x0} > $195 = {mask = 0x68c640, handler = 0xc03d74b0 , argument = > 0xcae09000, next = 0x0, > name = 0xcadf32c0 "uhci0", intr = 0x9, maskptr = 0xc0552b4c, flags = 0x0} > > ----------------------------------------------- > Since I think its iir_intr that was called, I poke around in the stack > frames > between frame 7 and 5. > At 0xf8225dd8 I see the value 0xc01b7e11 > > (kgdb) x 0xc01b7e11 > 0xc01b7e11 : 0x830cc483 > > (kgdb) disass : > ... > ... > ... > 0xc01b7dff : test %ebx,%ebx > 0xc01b7e01 : je 0xc01b7e14 > 0xc01b7e03 : push %ebx > 0xc01b7e04 : lea 0x4e(%esi),%eax > 0xc01b7e07 : push %eax > 0xc01b7e08 : mov 0xffffffe8(%ebp),%edx > 0xc01b7e0b : push %edx > 0xc01b7e0c : call 0xc041b4d8 > 0xc01b7e11 : add $0xc,%esp > 0xc01b7e14 : cmpl $0x0,0x42(%esi) > > ----------------------------------------------------- > Therefore, 0xcadf9000 is the struct gdt_softc * argument to iir_intr() > > > (kgdb) set $SC=(struct gdt_softc *)0xcadf9000 > (kgdb) p *$SC > $217 = {sc_hanum = 0x0, sc_class = 0x5, sc_bus = 0x4, sc_slot = 0x8, > sc_device = 0x600, sc_subdevice = 0x1af, > sc_fw_vers = 0x22a, sc_init_level = 0x6, sc_state = 0x0, sc_dev = > 0xcadf4000, sc_dpmemt = 0x1, > sc_dpmemh = 0xf31c7000, sc_dpmembase = 0xf8000000, sc_parent_dmat = > 0xcadfad00, sc_buffer_dmat = 0xcadfacc0, > sc_gccb_dmat = 0xcadfac80, sc_gccb_dmamap = 0x0, sc_gccb_busbase = > 0x1d000, sc_gccbs = 0xf51c7000, sc_free_gccb = { > slh_first = 0xf51e1860}, sc_pending_gccb = {slh_first = 0xf5211440}, > sc_ccb_queue = {tqh_first = 0x0, > tqh_last = 0xcadf9050}, sc_ucmd_queue = {tqh_first = 0x0, tqh_last = > 0xcadf9058}, sc_ic_all_size = 0x2fc0, > sc_cmd_len = 0x24, sc_cmd_off = 0x24, sc_cmd_cnt = 0x1, > sc_cmd = > "\000\000\000\000d\000\000\000\002\000\000\000¿#\235\000\200\000\000\000ÿÿÿÿ\001\000\000\000\000\000\006â\000\000\001", > '\000' , sc_info = 0x0, sc_info2 = 0x0, sc_status = > 0x1000, sc_service = 0x0, > sc_bus_cnt = 0x3, sc_virt_bus = 0x2, sc_bus_id = "\a\a\000\000\000", > sc_more_proc = 0x0, sc_hdr = {{ > hd_present = 0x1, hd_is_logdrv = 0x0, hd_is_arraydrv = 0x0, > hd_is_master = 0x0, hd_is_parity = 0x0, > hd_is_hotfix = 0x0, hd_master_no = 0x0, hd_lock = 0x0, hd_heads = > 0xff, hd_secs = 0x3f, hd_devtype = 0x0, > hd_size = 0x88efe6a, hd_ldr_no = 0x0, hd_rw_attribs = 0x0, > hd_start_sec = 0x0}, {hd_present = 0x0, > hd_is_logdrv = 0x0, hd_is_arraydrv = 0x0, hd_is_master = 0x0, > hd_is_parity = 0x0, hd_is_hotfix = 0x0, > hd_master_no = 0x0, hd_lock = 0x0, hd_heads = 0x0, hd_secs = 0x0, > hd_devtype = 0x0, hd_size = 0x0, > hd_ldr_no = 0x0, hd_rw_attribs = 0x0, hd_start_sec = 0x0} 99 times>}, sc_raw_feat = 0x1, > sc_cache_feat = 0x101, sc_dvr = {size = 0x0, eu = {stream = '\000' > , driver = {ionode = 0x0, > service = 0x0, index = 0x0}, async = {ionode = 0x0, service = > 0x0, status = 0x0, info = 0x0, > scsi_coord = "\000\000"}, sync = {ionode = 0x0, service = 0x0, > status = 0x0, info = 0x0, hostdrive = 0x0, > scsi_coord = "\000\000", sense_key = 0x0}, test = {l1 = 0x0, l2 > = 0x0, l3 = 0x0, l4 = 0x0}}, severity = 0x0, > event_string = '\000' }, sims = {0xcadfac00, > 0xcadfab40, 0xcadfaa80, 0x0, 0x0, 0x0}, paths = { > 0xcadf3c20, 0xcadf3bf0, 0xcadf3bc0, 0x0, 0x0, 0x0}, sc_copy_cmd = > 0xc01b90d4 , > sc_get_status = 0xc01b9190 , sc_intr = 0xc01b91b4 > , > sc_release_event = 0xc01b92d0 , sc_set_sema0 = > 0xc01b92f0 , > sc_test_busy = 0xc01b9310 , links = {tqe_next = > 0x0, tqe_prev = 0xc04ffe80}} > > ----------------------------------------------------- > Now I try to figure which iir_intr() code path was executed. > Only the case GDT_GCF_IOCTL: code path leads to a bcopy(). > > I walked all the struct gdt_ccb * in gdt->sc_gccbs[], > Only 1 has a non-zero gccb->gc_flags value; its > value is 4 (GDT_GCF_IOCTL) > > (kgdb) set $SCBS=(struct gdt_ccb *)&$SC->sc_gccbs[121] > (kgdb) p *$SCBS > $218 = {gc_scratch = "\001\000\0013", '\000' , > gc_ccb = 0xcb16c400, gc_ucmd = 0xcb9f9000, > gc_dmamap = 0x0, gc_map_flag = 0x1, gc_timeout = 0x0, gc_state = 0x0, > gc_service = 0x9, gc_cmd_index = 0x7b, > gc_flags = 0x4, sle = {sle_next = 0x0}} > > --------------------------------------- > Down to the bcopy(): > the bcopy() decission is made off of values in gc_ucmd, > and nothing good can come from using most of these values: > > (kgdb) set $UCMD=(gdt_ucmd_t *)$SCBS->gc_ucmd > (kgdb) p *$UCMD > $219 = {io_node = 0xc0de, service = 0xdead, timeout = 0xc05076a0, status > = 0x1, info = 0x0, BoardNode = 0xc0ded8b2, > CommandIndex = 0xc0dedead, OpCode = 0xdead, u = {cache = {DeviceNo = > 0xc0de, BlockNo = 0xc0dedead, > BlockCnt = 0xc0dedead, DestAddr = 0xc0dedead}, ioctl = {param_size > = 0xc0de, subfunc = 0xc0dedead, > channel = 0xc0dedead, p_param = 0xc0dedead}, raw = {reserved = > 0xc0de, direction = 0xc0dedead, > mdisc_time = 0xc0dedead, mcon_time = 0xc0dedead, sdata = > 0xc0dedead, sdlen = 0xc0dedead, clen = 0xc0dedead, > cmd = "­ÞÞÀ­ÞÞÀ­ÞÞÀ", target = 0xad, lun = 0xde, bus = 0x1, > priority = 0x0, sense_len = 0x0, sense_data = 0x0, > link_p = 0x10}}, data = "\001\000\0013", '\000' times>, complete_flag = 0xcb16c400, links = { > tqe_next = 0xcb9f9000, tqe_prev = 0x0}} > > ---------------------------- > Not knowing anything about iir/scsi, it appears to me > that gdt->sc_gccbs[121]->gc_ucmd has been freed and yet is still > referenced and > in use. > > How is this ddt_ucmd_t * gc_ucmd data managed ? > Is it actively malloc()ed and free()d ? > > Any clues or pointers will be appreciated. > > -- Dave Baukus dbaukus@chiaro.com Chiaro Networks Ltd. Richardson, Texas USA From owner-freebsd-scsi@FreeBSD.ORG Tue Aug 9 13:47:25 2005 Return-Path: X-Original-To: freebsd-scsi@freebsd.org Delivered-To: freebsd-scsi@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id B5A0116A41F for ; Tue, 9 Aug 2005 13:47:25 +0000 (GMT) (envelope-from danny@cs.huji.ac.il) Received: from cs1.cs.huji.ac.il (cs1.cs.huji.ac.il [132.65.16.10]) by mx1.FreeBSD.org (Postfix) with ESMTP id 3502343D53 for ; Tue, 9 Aug 2005 13:47:24 +0000 (GMT) (envelope-from danny@cs.huji.ac.il) Received: from pampa.cs.huji.ac.il ([132.65.80.32]) by cs1.cs.huji.ac.il with esmtp id 1E2URp-000BjS-VA; Tue, 09 Aug 2005 16:47:17 +0300 X-Mailer: exmh version 2.7.0 06/18/2004 with nmh-1.0.4 To: Scott Long In-Reply-To: Message from Scott Long of "Mon, 08 Aug 2005 10:10:36 MDT." <42F783FC.7090904@samsco.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Date: Tue, 09 Aug 2005 16:47:17 +0300 From: Danny Braniss Message-ID: Cc: freebsd-scsi@freebsd.org Subject: Re: CAM, SCSIn/iSCSI & LUNs X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 09 Aug 2005 13:47:25 -0000 > Danny Braniss wrote: > > hi, > > it seems that one of the differences between the SCSI1/2/3/4/i > > is the size of the LUN :-) > > > > Now, it seems that the CAM will search sequencially for LUNs, from 0 -> max_lun > > which i don't think will scale nicely. > > > > is there a way to tell the cam to do a scsi report luns command, instead of > > the sequential search? > > > > danny > > > > > > I, Matt Jacob, and Ken Merry talked about implementing REPORT_LUNS a > couple of months ago, but it stalled for a reason that I cannot recall. > > Scott i was wondering, it would be nice, if setting in XPT_PATH_INQ response cpi->max_lun = -1, which would trigger the REPORT_LUNS ... next question: when a target/unit is lost it seems that the luns > 0 have to be removed/cleared by the sim, shouldn't it be done by the CAM? danny From owner-freebsd-scsi@FreeBSD.ORG Tue Aug 9 15:53:58 2005 Return-Path: X-Original-To: freebsd-scsi@freebsd.org Delivered-To: freebsd-scsi@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 20BA816A41F for ; Tue, 9 Aug 2005 15:53:58 +0000 (GMT) (envelope-from scottl@samsco.org) Received: from pooker.samsco.org (pooker.samsco.org [168.103.85.57]) by mx1.FreeBSD.org (Postfix) with ESMTP id B9AE943D55 for ; Tue, 9 Aug 2005 15:53:56 +0000 (GMT) (envelope-from scottl@samsco.org) Received: from [192.168.254.14] (imini.samsco.home [192.168.254.14]) (authenticated bits=0) by pooker.samsco.org (8.13.3/8.13.3) with ESMTP id j79G64HP002663; Tue, 9 Aug 2005 10:06:04 -0600 (MDT) (envelope-from scottl@samsco.org) Message-ID: <42F8D188.4020700@samsco.org> Date: Tue, 09 Aug 2005 09:53:44 -0600 From: Scott Long User-Agent: Mozilla/5.0 (Macintosh; U; PPC Mac OS X Mach-O; en-US; rv:1.7.7) Gecko/20050416 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Danny Braniss References: In-Reply-To: Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-Spam-Status: No, score=-2.8 required=3.8 tests=ALL_TRUSTED autolearn=failed version=3.0.2 X-Spam-Checker-Version: SpamAssassin 3.0.2 (2004-11-16) on pooker.samsco.org Cc: freebsd-scsi@freebsd.org Subject: Re: CAM, SCSIn/iSCSI & LUNs X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 09 Aug 2005 15:53:58 -0000 Danny Braniss wrote: >>Danny Braniss wrote: >> >>>hi, >>> it seems that one of the differences between the SCSI1/2/3/4/i >>>is the size of the LUN :-) >>> >>>Now, it seems that the CAM will search sequencially for LUNs, from 0 -> max_lun >>>which i don't think will scale nicely. >>> >>>is there a way to tell the cam to do a scsi report luns command, instead of >>>the sequential search? >>> >>>danny >>> >>> >> >>I, Matt Jacob, and Ken Merry talked about implementing REPORT_LUNS a >>couple of months ago, but it stalled for a reason that I cannot recall. >> >>Scott > > > i was wondering, it would be nice, if setting in XPT_PATH_INQ response > cpi->max_lun = -1, which would trigger the REPORT_LUNS ... > That's not a very good general purpose solution. While it probably would work without problems for an iSCSI SIM, it doesn't work well for a normal SCSI SIM. The ability of a target to accept a REPORT_LUNS command is a property of the target, not a property of the SIM, and normal SCSI targets have a wide range of capabilities and limitations. So it should be up to the scsi probe code to determine on a target-by-target basis whether a REPORT_LUNS should be sent or if the luns should be linearly scanned. I think that our previous discussion on how to implement REPORT_LUNS got tangled up in trying to guess the proper heuristics for knowing whether to do a scan or a REPORT_LUNS command, how to fall back if one or the other fails, etc, i.e. we got caught up in the details. If people have FC, SCSI, and iSCSI targets that support multiple luns then we should just write a prototype implementation and refine it from there. Actually, now that OpenSolaris is available, it might be good to peek in there for ideas on the heuristics. I'd trust that code a whole lot more than I'd trust Linux. > next question: > when a target/unit is lost it seems that the luns > 0 have to be > removed/cleared by the sim, shouldn't it be done by the CAM? I would think that sending an AC_LOST_DEVICE event with a path containing the proper bus and target but a wildcard lun/device id would do the right thing. If not then it's definitely something that should be looked into. Scott From owner-freebsd-scsi@FreeBSD.ORG Wed Aug 10 01:44:28 2005 Return-Path: X-Original-To: freebsd-scsi@freebsd.org Delivered-To: freebsd-scsi@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 44AD916A423 for ; Wed, 10 Aug 2005 01:44:28 +0000 (GMT) (envelope-from spork@fasttrackmonkey.com) Received: from angryfist.fasttrackmonkey.com (angryfist.fasttrackmonkey.com [216.223.196.4]) by mx1.FreeBSD.org (Postfix) with ESMTP id 101E6457E5 for ; Wed, 10 Aug 2005 01:14:06 +0000 (GMT) (envelope-from spork@fasttrackmonkey.com) Received: (qmail 93632 invoked by uid 2003); 10 Aug 2005 01:11:46 -0000 Received: from spork@fasttrackmonkey.com by angryfist.fasttrackmonkey.com by uid 1001 with qmail-scanner-1.20 (clamscan: 0.65. Clear:RC:1(216.220.116.154):. Processed in 0.097011 secs); 10 Aug 2005 01:11:46 -0000 Received: from unknown (HELO gee5.nat.fasttrackmonkey.com) (216.220.116.154) by 0 with (DHE-RSA-AES256-SHA encrypted) SMTP; 10 Aug 2005 01:11:45 -0000 Date: Tue, 9 Aug 2005 21:14:02 -0400 (EDT) From: Charles Sprickman X-X-Sender: spork@gee5.nat.fasttrackmonkey.com To: freebsd-scsi@freebsd.org Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Subject: Adaptec management tools X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 10 Aug 2005 01:44:28 -0000 Hi, I think we have a few Adaptec people here... Having been bit in the ass by "raidutil" in the past (if there's one app I'd like to see with a "Are you sure? (y/n)" dialogue, it's that one) I'm wondering what the other options are. Some background... I'm working with about a dozen boxes that have some flavor or Adaptec RAID controller. It seems that we are reaching a point where more and more drives are reaching the end of their useful lives. This means that I'm having to use the management tools more often. So far even just replacing a failed drive is a pain: -yank old drive -mount new drive in carrier and install -reboot -enter SMOR -fiddle with interface until it's convinced to rebuild onto new drive -hope for the best :) I would really like to avoid the reboot step, there's no reason for it other than to get to that BIOS screen. I do not want to install X on these servers to run the full GUI. I simply will not trust "raidutil" until I see Adaptec post some more detailed documentation, specifically in a "HOWTO" format. I don't like parsing dense docs when in a panic (who really does?). There's also quite a disconnect between the paper docs packaged with the units (mostly ZCR units in SuperMicro SuperServers). There's instructions for a program that I don't think even exists for FreeBSD (Storage Manager Pro?), yet that's what they suggest. I could possibly stomach an X app if I could connect to it remotely. There are hints of this in the manuals, but I don't see any such thing in the asr-utils package. For the Adaptec folks: Is there anything you can do to get more information online? Will there be any updates to existing tools (RAIDUTIL Version: 3.04 Date: 9/27/2000 - 5 years old)? Will there be a way to config in a GUI without X libs on the server anytime soon? For the FreeBSD folks: What are you using? Do you trust the tools you can find? Are you using something besides Adaptec for hardware RAID? If so, how do you like it? What are the management tools like? Does it tend to randomly mark drives as "bad" for no apparent reason then later decide they are indeed "good"? I'm ranting because Adaptec support has been next to useless. All total over the last few years I've installed more than two dozen Adaptec RAID products on FreeBSD and I'm struggling to understand why I should continue. Don't get me wrong, the asr driver "just works", but the management is opaque at best, dangerous at worst. Thanks, Charles From owner-freebsd-scsi@FreeBSD.ORG Wed Aug 10 05:55:44 2005 Return-Path: X-Original-To: freebsd-scsi@freebsd.org Delivered-To: freebsd-scsi@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id BB79B16A8BD for ; Wed, 10 Aug 2005 05:54:58 +0000 (GMT) (envelope-from scottl@samsco.org) Received: from pooker.samsco.org (pooker.samsco.org [168.103.85.57]) by mx1.FreeBSD.org (Postfix) with ESMTP id 70786461E0 for ; Wed, 10 Aug 2005 05:47:00 +0000 (GMT) (envelope-from scottl@samsco.org) Received: from [192.168.254.14] (imini.samsco.home [192.168.254.14]) (authenticated bits=0) by pooker.samsco.org (8.13.3/8.13.3) with ESMTP id j7A5x2To006108; Tue, 9 Aug 2005 23:59:04 -0600 (MDT) (envelope-from scottl@samsco.org) Message-ID: <42F994C7.6060400@samsco.org> Date: Tue, 09 Aug 2005 23:46:47 -0600 From: Scott Long User-Agent: Mozilla/5.0 (Macintosh; U; PPC Mac OS X Mach-O; en-US; rv:1.7.7) Gecko/20050416 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Charles Sprickman References: In-Reply-To: Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-Spam-Status: No, score=-2.8 required=3.8 tests=ALL_TRUSTED autolearn=failed version=3.0.2 X-Spam-Checker-Version: SpamAssassin 3.0.2 (2004-11-16) on pooker.samsco.org Cc: freebsd-scsi@freebsd.org Subject: Re: Adaptec management tools X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 10 Aug 2005 05:55:48 -0000 Charles Sprickman wrote: > Hi, > > I think we have a few Adaptec people here... > > Having been bit in the ass by "raidutil" in the past (if there's one app > I'd like to see with a "Are you sure? (y/n)" dialogue, it's that one) > I'm wondering what the other options are. > > Some background... I'm working with about a dozen boxes that have some > flavor or Adaptec RAID controller. It seems that we are reaching a > point where more and more drives are reaching the end of their useful > lives. This means that I'm having to use the management tools more > often. So far even just replacing a failed drive is a pain: > > -yank old drive > -mount new drive in carrier and install > -reboot > -enter SMOR > -fiddle with interface until it's convinced to rebuild onto new drive > -hope for the best :) > > I would really like to avoid the reboot step, there's no reason for it > other than to get to that BIOS screen. I do not want to install X on > these servers to run the full GUI. I simply will not trust "raidutil" > until I see Adaptec post some more detailed documentation, specifically > in a "HOWTO" format. I don't like parsing dense docs when in a panic > (who really does?). At one point Adaptec released the source to raidutil. I'm not sure if it's still available for download, though someone on this list might have it archived somewhere. I was always skeptical of the legality of distributing modifications to it since the licensing in the source files seemed ambiguous at best. Others may have more positive opinions on it. > > There's also quite a disconnect between the paper docs packaged with the > units (mostly ZCR units in SuperMicro SuperServers). There's > instructions for a program that I don't think even exists for FreeBSD > (Storage Manager Pro?), yet that's what they suggest. > > I could possibly stomach an X app if I could connect to it remotely. > There are hints of this in the manuals, but I don't see any such thing > in the asr-utils package. > A lesson on Adaptec apps: Storage Manager - X/Motif app for managing DPT Gen5+6 and Adaptec 2005/2015/2100/2110/3xxx controllers (note that this list does not include the 2120 and 2200 or any of the newer SATA controllers). I think that there might have been a FreeBSD version at some point, but it's honestly been too many years for me to remember or care. The sources had proprietary pieces that could not be released. raidutil - CLI for the same family of controllers as Storage Manager. Ported to FreeBSD of course. Storage Manager Pro - Java/Swing app for managing every Adaptec RAID controller circa 2001 except for the old Dell zero channel UW/U2/UDMA controllers. I ported this to FreeBSD, but the lack of working native threads at the time severly limited it so I never released it. aaccli - CLI for the 2020/2120/2200/5400 and the more recent SATA offerings, as well as the Dell PERC series. I ported this to FreeBSD, though I recommend using the Linux version under emulation these days since it is more up to date. Storage Manager Browser Edition (SMBE) - web server/client app for managing most of Adaptec's controllers circa 2003. This was never ported to FreeBSD in any form, but I personally don't consider this to be a loss. > For the Adaptec folks: Is there anything you can do to get more > information online? Will there be any updates to existing tools > (RAIDUTIL Version: 3.04 Date: 9/27/2000 - 5 years old)? Will there be > a way to config in a GUI without X libs on the server anytime soon? raidutil is abandonware. Again, if you can find the released source tarball and do something with it, more power to you. I doubt that you'll be able to get much technical assistance out of Adaptec unless you represent a significant amount of future revenue. You could try experimenting with SMBE under Linux emulation. The server part can run without X libs, though it contains its own complete private web server and can often times chew a lot of CPU while idle. Some people don't care about these details, others do. YMMV. > > For the FreeBSD folks: What are you using? Do you trust the tools you > can find? Are you using something besides Adaptec for hardware RAID? > If so, how do you like it? What are the management tools like? Does it > tend to randomly mark drives as "bad" for no apparent reason then later > decide they are indeed "good"? > LSI, 3ware, Areca, and Highpoint all have decent controllers with management apps for FreeBSD. Each has its strengths and weaknesses, but I'd recommend LSI MegaRAID as the best high-end choice, followed by Areca for mid-range SATA. > I'm ranting because Adaptec support has been next to useless. All total > over the last few years I've installed more than two dozen Adaptec RAID > products on FreeBSD and I'm struggling to understand why I should > continue. Don't get me wrong, the asr driver "just works", but the > management is opaque at best, dangerous at worst. I spent several years writing management apps for Adaptec, and your comments don't surprise me at all. Scott From owner-freebsd-scsi@FreeBSD.ORG Wed Aug 10 05:56:27 2005 Return-Path: X-Original-To: freebsd-scsi@freebsd.org Delivered-To: freebsd-scsi@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id A876216AA65 for ; Wed, 10 Aug 2005 05:55:36 +0000 (GMT) (envelope-from nikolai@net24.co.nz) Received: from netmon.net24.net.nz (netmon.net24.net.nz [210.55.4.6]) by mx1.FreeBSD.org (Postfix) with ESMTP id 239D543E07 for ; Wed, 10 Aug 2005 05:54:00 +0000 (GMT) (envelope-from nikolai@net24.co.nz) Received: from [210.55.30.50] ([210.55.30.50]) by netmon.net24.net.nz (8.11.6/8.11.6) with ESMTP id j7A5rcK97505; Wed, 10 Aug 2005 17:53:45 +1200 (NZST) (envelope-from nikolai@net24.co.nz) Message-ID: <42F995FD.2000705@net24.co.nz> Date: Wed, 10 Aug 2005 17:51:57 +1200 From: Nikolai Schupbach User-Agent: Mozilla Thunderbird 1.0 (Windows/20041206) X-Accept-Language: en-us, en MIME-Version: 1.0 To: Charles Sprickman References: In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-scsi@freebsd.org Subject: Re: Adaptec management tools X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 10 Aug 2005 05:56:27 -0000 Hi Charles, I hear where you are coming from. I also have had the same headache with xSeries IBM servers, with the integrated LSI RAID controllers. At the end of the day, we opted to buy hardware that is specifically supported by the manufacture for FreeBSD, rather than trying to make hardware work. For us this was the AMCC 3ware 9000 series cards, which are very well supported in FreeBSD and have excellent tools, including stand-alone web server (no apache needed!) for browser based management and also CLI tools. We have had zero problems with them, everything works just like you would and should expect. If you want no hassle, then the solution is simple; drop hardware that is not properly supported, that's what we did, and I sleep so much better now :) Cheers, Nikolai. Charles Sprickman wrote: > Hi, > > I think we have a few Adaptec people here... > > Having been bit in the ass by "raidutil" in the past (if there's one > app I'd like to see with a "Are you sure? (y/n)" dialogue, it's that > one) I'm wondering what the other options are. > > Some background... I'm working with about a dozen boxes that have > some flavor or Adaptec RAID controller. It seems that we are reaching > a point where more and more drives are reaching the end of their > useful lives. This means that I'm having to use the management tools > more often. So far even just replacing a failed drive is a pain: > > -yank old drive > -mount new drive in carrier and install > -reboot > -enter SMOR > -fiddle with interface until it's convinced to rebuild onto new drive > -hope for the best :) > > I would really like to avoid the reboot step, there's no reason for it > other than to get to that BIOS screen. I do not want to install X on > these servers to run the full GUI. I simply will not trust "raidutil" > until I see Adaptec post some more detailed documentation, > specifically in a "HOWTO" format. I don't like parsing dense docs > when in a panic (who really does?). > > There's also quite a disconnect between the paper docs packaged with > the units (mostly ZCR units in SuperMicro SuperServers). There's > instructions for a program that I don't think even exists for FreeBSD > (Storage Manager Pro?), yet that's what they suggest. > > I could possibly stomach an X app if I could connect to it remotely. > There are hints of this in the manuals, but I don't see any such thing > in the asr-utils package. > > For the Adaptec folks: Is there anything you can do to get more > information online? Will there be any updates to existing tools > (RAIDUTIL Version: 3.04 Date: 9/27/2000 - 5 years old)? Will there > be a way to config in a GUI without X libs on the server anytime soon? > > For the FreeBSD folks: What are you using? Do you trust the tools > you can find? Are you using something besides Adaptec for hardware > RAID? If so, how do you like it? What are the management tools > like? Does it tend to randomly mark drives as "bad" for no apparent > reason then later decide they are indeed "good"? > > I'm ranting because Adaptec support has been next to useless. All > total over the last few years I've installed more than two dozen > Adaptec RAID products on FreeBSD and I'm struggling to understand why > I should continue. Don't get me wrong, the asr driver "just works", > but the management is opaque at best, dangerous at worst. > > Thanks, > > Charles > _______________________________________________ > freebsd-scsi@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-scsi > To unsubscribe, send any mail to "freebsd-scsi-unsubscribe@freebsd.org" > > !DSPAM:42f95c4a622361847719832! > > > From owner-freebsd-scsi@FreeBSD.ORG Wed Aug 10 19:15:03 2005 Return-Path: X-Original-To: freebsd-scsi@freebsd.org Delivered-To: freebsd-scsi@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 4FA9216A41F for ; Wed, 10 Aug 2005 19:15:03 +0000 (GMT) (envelope-from volker@vwsoft.com) Received: from mail.vtec.ipme.de (Ad6bc.a.pppool.de [213.6.214.188]) by mx1.FreeBSD.org (Postfix) with ESMTP id 5223643D46 for ; Wed, 10 Aug 2005 19:14:58 +0000 (GMT) (envelope-from volker@vwsoft.com) Received: from [192.168.16.3] (cesar.sz.vwsoft.com [192.168.16.3]) by mail.vtec.ipme.de (Postfix) with ESMTP id D3A825C77 for ; Wed, 10 Aug 2005 21:14:53 +0200 (CEST) Message-ID: <42FA522D.2050508@vwsoft.com> Date: Wed, 10 Aug 2005 21:14:53 +0200 From: Volker User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.7.6) Gecko/20050317 Thunderbird/1.0.2 Mnenhy/0.6.0.101 X-Accept-Language: en-us, en MIME-Version: 1.0 To: freebsd-scsi@freebsd.org X-Enigmail-Version: 0.91.0.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-VWSoft-MailScanner: Found to be clean X-MailScanner-From: volker@vwsoft.com Subject: DDS trouble - device hanging X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 10 Aug 2005 19:15:03 -0000 Hi guys, on a system with 5.4-STABLE I'm experiencing trouble with a DAT drive. I've used a Sony SDT-7000 (DDS-2) attached to an Adaptec 3985 for the last years without trouble (narrow SCSI). A year ago I've changed the controller to an Adaptec AAA-133B. When trying to get a backup some time later, the backup was hanging in the middle of the backup and I thought the SDT-7000 would be faulty. Now I've changed the tape drive to a Sony SDT-11000 (DDS-4) and the same thing happens, it stops _in the middle_ of the backup (whether using tar or bacula doesn't make a difference). The cabling has been changed when changing the tape drive. SCSI bus termination is ok, tape drive firmware and jumper settings have been double checked. When the error occours, the tape drive doesn't respond to any commands being sent (camcontrol) and will not eject media (emergency eject). In the meantime (a year ago) I've changed the server from RELENG_4 to RELENG_5 but while the backup has been disabled over months I can't say for sure if the fBSD version update or the hardware change was causing the fault. The following error messages are taken from the console after the backup has been aborted. Please note a manual `camcontrol reset 1:6:0' at the end (which did _not_ solve the frozen device). How do I debug, what can be read from the card dump state or the debug messages? Is the controller at fault? Any hints? I'm near of pulling out the controller and going single channel (that would be possible in my setup without causing additional trouble). Thanks, Volker #uname -v FreeBSD 5.4-STABLE #10: Fri May 13 16:12:28 CEST 2005 ahc2: Recovery Initiated >>>>>>>>>>>>>>>>>> Dump Card State Begins >>>>>>>>>>>>>>>>>> <<<<<<<<<<<<<<<<< ahc2: Dumping Card State while idle, at SEQADDR 0x7 Card was paused ACCUM = 0xd9, SINDEX = 0x67, DINDEX = 0x27, ARG_2 = 0x3 HCNT = 0x0 SCBPTR = 0x0 SCSISIGI[0x0] ERROR[0x0] SCSIBUSL[0x0] LASTPHASE[0x1]:(P_BUSFREE) SCSISEQ[0x12]:(ENAUTOATNP|ENRSELI) SBLKCTL[0x2]:(SELWIDE) SCSIRATE[0x0] SEQCTL[0x10]:(FASTMODE) SEQ_FLAGS[0xc0]:(NO_CDB_SENT|NOT_IDENTIFIED) SSTAT0[0x5]:(DMADONE|SDONE) SSTAT1[0xa]:(PHASECHG|BUSFREE) SSTAT2[0x0] SSTAT3[0x0] SIMODE0[0x0] SIMODE1[0xa4]:(ENSCSIPERR|ENSCSIRST|ENSELTIMO) SXFRCTL0[0x80]:(DFON) DFCNTRL[0x0] DFSTATUS[0x2d]:(FIFOEMP|DFTHRESH|HDONE|FIFOQWDEMP) STACK: 0xcc 0x151 0x192 0x3 SCB count = 20 Kernel NEXTQSCB = 17 Card NEXTQSCB = 17 QINFIFO entries: Waiting Queue entries: Disconnected Queue entries: 0:14 QOUTFIFO entries: Sequencer Free SCB List: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 Sequencer SCB Info: 0 SCB_CONTROL[0x4c]:(DISCONNECTED|ULTRAENB|DISCENB) SCB_SCSIID[0x67] SCB_LUN[0x0] SCB_TAG[0xe] 1 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID) SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff] 2 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID) SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff] 3 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID) SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff] 4 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID) SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff] 5 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID) SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff] 6 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID) SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff] 7 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID) SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff] 8 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID) SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff] 9 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID) SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff] 10 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID) SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff] 11 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID) SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff] 12 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID) SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff] 13 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID) SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff] 14 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID) SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff] 15 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID) SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff] Pending list: 14 SCB_CONTROL[0x48]:(ULTRAENB|DISCENB) SCB_SCSIID[0x67] SCB_LUN[0x0] Kernel Free SCB list: 18 9 8 6 5 3 2 0 19 16 15 1 7 4 13 12 11 10 Untagged Q(6): 14 <<<<<<<<<<<<<<<<< Dump Card State Ends >>>>>>>>>>>>>>>>>> (sa0:ahc2:0:6:0): SCB 0xe - timed out sg[0] - Addr 0x219d028 : Length 4056 sg[1] - Addr 0x96de000 : Length 4096 sg[2] - Addr 0x15d3f000 : Length 4096 sg[3] - Addr 0x3c00000 : Length 4096 sg[4] - Addr 0xbe61000 : Length 4096 sg[5] - Addr 0xb9e2000 : Length 4096 sg[6] - Addr 0xeca3000 : Length 4096 sg[7] - Addr 0xc064000 : Length 4096 sg[8] - Addr 0x7485000 : Length 4096 sg[9] - Addr 0x41e6000 : Length 4096 sg[10] - Addr 0xdde7000 : Length 4096 sg[11] - Addr 0xb488000 : Length 4096 sg[12] - Addr 0x7689000 : Length 4096 sg[13] - Addr 0xb7ca000 : Length 4096 sg[14] - Addr 0xceb000 : Length 4096 sg[15] - Addr 0x3a4c000 : Length 3112 (sa0:ahc2:0:6:0): Queuing a BDR SCB (sa0:ahc2:0:6:0): Bus Device Reset Message Sent ahc2: Timedout SCBs already complete. Interrupts may not be functioning. (sa0:ahc2:0:6:0): no longer in timeout, status = 24b ahc2: Bus Device Reset on A:6. 1 SCBs aborted (sa0:ahc2:0:6:0): MODE SENSE(06). CDB: 1a 0 f 0 1c 0 (sa0:ahc2:0:6:0): Sense Error Code 0x0 (sa0:ahc2:0:6:0): MODE SENSE(06). CDB: 1a 0 f 0 1c 0 (sa0:ahc2:0:6:0): Sense Error Code 0x75 (sa0:ahc2:0:6:0): MODE SENSE(06). CDB: 1a 0 f 0 1c 0 (sa0:ahc2:0:6:0): NO SENSE ILI (length mismatch): -2048 asc:0,0 (sa0:ahc2:0:6:0): No additional sense information (sa0:ahc2:0:6:0): MODE SENSE(06). CDB: 1a 0 f 0 1c 0 (sa0:ahc2:0:6:0): NO SENSE ILI (length mismatch): -2048 asc:0,0 (sa0:ahc2:0:6:0): No additional sense information (sa0:ahc2:0:6:0): MODE SENSE(06). CDB: 1a 0 f 0 1c 0 (sa0:ahc2:0:6:0): NO SENSE ILI (length mismatch): -2048 asc:0,0 (sa0:ahc2:0:6:0): No additional sense information (sa0:ahc2:0:6:0): MODE SENSE(06). CDB: 1a 0 f 0 1c 0 (sa0:ahc2:0:6:0): Sense Error Code 0x0 (sa0:ahc2:0:6:0): MODE SENSE(06). CDB: 1a 0 f 0 1c 0 (sa0:ahc2:0:6:0): Sense Error Code 0x0 (sa0:ahc2:0:6:0): MODE SENSE(06). CDB: 1a 0 f 0 1c 0 (sa0:ahc2:0:6:0): Sense Error Code 0x75 (sa0:ahc2:0:6:0): MODE SENSE(06). CDB: 1a 0 f 0 1c 0 (sa0:ahc2:0:6:0): Sense Error Code 0x75 (sa0:ahc2:0:6:0): MODE SENSE(06). CDB: 1a 0 f 0 1c 0 (sa0:ahc2:0:6:0): Sense Error Code 0x75 (sa0:ahc2:0:6:0): MODE SENSE(06). CDB: 1a 0 f 0 1c 0 (sa0:ahc2:0:6:0): Sense Error Code 0x75 (sa0:ahc2:0:6:0): MODE SENSE(06). CDB: 1a 0 f 0 1c 0 (sa0:ahc2:0:6:0): Sense Error Code 0x75 (sa0:ahc2:0:6:0): MODE SENSE(06). CDB: 1a 0 f 0 1c 0 (sa0:ahc2:0:6:0): Sense Error Code 0x75 (sa0:ahc2:0:6:0): MODE SENSE(06). CDB: 1a 0 f 0 1c 0 (sa0:ahc2:0:6:0): Sense Error Code 0x75 (sa0:ahc2:0:6:0): MODE SENSE(06). CDB: 1a 0 f 0 1c 0 (sa0:ahc2:0:6:0): Sense Error Code 0x75 (pass3:ahc2:0:6:0): Bus Device Reset Message Sent ahc2: Bus Device Reset on A:6. 1 SCBs aborted -- GPG/PGP fingerprint: FF93 13A1 2477 B631 E953 06DF 4C49 ADD9 E4BF 79B1