From owner-freebsd-hackers Thu Jul 13 07:57:03 1995
Return-Path: hackers-owner
Received: (from majordom@localhost)
          by freefall.cdrom.com (8.6.10/8.6.6) id HAA10330
          for hackers-outgoing; Thu, 13 Jul 1995 07:57:03 -0700
Received: from kitten.mcs.com (Kitten.mcs.com [192.160.127.90])
          by freefall.cdrom.com (8.6.10/8.6.6) with ESMTP id HAA10318
          for ; Thu, 13 Jul 1995 07:56:59 -0700
Received: from Jupiter.mcs.net (Jupiter.mcs.net [192.160.127.89])
          by kitten.mcs.com (8.6.10/8.6.9) with ESMTP id JAA04601;
          Thu, 13 Jul 1995 09:56:55 -0500
Received: (from karl@localhost)
          by Jupiter.mcs.net (8.6.11/8.6.9) id JAA01679;
          Thu, 13 Jul 1995 09:56:55 -0500
From: Karl Denninger
Message-Id: <199507131456.JAA01679@Jupiter.mcs.net>
Subject: Re: SCSI disk wedge
To: terry@cs.weber.edu (Terry Lambert)
Date: Thu, 13 Jul 1995 09:56:52 -0500 (CDT)
Cc: tom@misery.sdf.com, karl@Mcs.Net, rgrimes@gndrsh.aac.dev.com,
    freebsd-hackers@FreeBSD.ORG
In-Reply-To: <9507130522.AA22828@cs.weber.edu> from "Terry Lambert" at Jul 12, 95 11:22:01 pm
X-Mailer: ELM [version 2.4 PL24]
Content-Type: text
Content-Length: 2932
Sender: hackers-owner@FreeBSD.ORG
Precedence: bulk

> > The 1742 driver has been around for a long time and is very similar to
> > the driver in NetBSD. However, the 2742/2842/2942 driver is quite
> > recent. It is _very_ odd that you have problems with both adapters.

Yep. But I do.

> Clearly, it is an issue of tagged command queuing, very large transfers,
> bus on time, or one of the many issues in the shared SCSI, VM, block I/O,
> and/or user space code above them all, only the last of which is shared
> with BSDI (and therefore possible to rule out).
>
> Another possibility is that BSDI, to my knowledge, does not support
> bounce buffers, instead allocating them at boot time and as a result
> allocating them in low memory.
> It is entirely possible that you are
> running a HiNT EISA chipset or other chipset that does not support
> bus mastering DMA transfers to memory regions above the 16M limit
> (which would put your board in violation of the EISA standard, but
> still claiming to be EISA in the ROM tag location).

No, that's not possible. These machines are 64MB P90s running on ASUS
dual Pentia motherboards (one processor on the card). They are completely
stable with BSDI and a number of other platforms, including Win/NT and
other Unices, and are *certified* as Novell 4.x compatible (by Novell
themselves).

This is a *standardized* configuration here; we're not playing with
whatever we can get our hands on. Remember, we're a quite-large ISP -- we
have had hardware standardized for over a year now, and haven't had any
real reason to move that configuration around much.

> There is insufficient information presented to diagnose your problem.

Since there are no errors presented to us when the 1742 hangs, and the
2742 starts complaining about timeouts, I don't know where to go next.

Tagged queueing is not at issue; I have tried with it both enabled and
disabled. With it *ON* the incidence of the hangs is reduced, but not
eliminated.

It LOOKS like something has requested an action on the SCSI bus which is
causing problems (i.e., disconnect sequencing, etc.) for devices, and/or
the adapter itself, causing a wedge condition. Why this is not detectable
and correctable (or at least abortable with a panic) in the driver is
unknown to me.

The kernel IS running -- I can telnet to the machine affected and get
connected, but any disk I/O attempt goes nowhere.

There is a difference -- the 2742 is MORE stable than the 1742. The 1742
machines run about 8 hours before dying -- the 2742 with MUCH heavier load
on it can, in some cases, run for 2-3 days.
--
--
Karl Denninger (karl@MCS.Net)| MCSNet - The Finest Internet Connectivity
Modem: [+1 312 248-0900]     | (shell, PPP, SLIP, leased) in Chicagoland
Voice: [+1 312 248-8649]     | 7 Chicagoland POPs, ISDN, 28.8, much more
Fax: [+1 312 248-9865]       | Email to "info@mcs.net" WWW: http://www.mcs.net
ISDN - Get it here TODAY!    | Home of Chicago's only FULL AP Clarinet feed!