Date:      Thu, 13 Jul 1995 09:56:52 -0500 (CDT)
From:      Karl Denninger <karl@Mcs.Net>
To:        terry@cs.weber.edu (Terry Lambert)
Cc:        tom@misery.sdf.com, karl@Mcs.Net, rgrimes@gndrsh.aac.dev.com, freebsd-hackers@FreeBSD.ORG
Subject:   Re: SCSI disk wedge
Message-ID:  <199507131456.JAA01679@Jupiter.mcs.net>
In-Reply-To: <9507130522.AA22828@cs.weber.edu> from "Terry Lambert" at Jul 12, 95 11:22:01 pm

> > The 1742 driver has been around for a long time and is very similar to 
> > the driver in NetBSD.  However, the 2742/2842/2942 driver is quite 
> > recent.  It is _very_ odd that you have problems with both adapters.

Yep.  But I do.

> Clearly, it is an issue of tagged command queuing, very large transfers,
> bus on time, or one of the many issues in the shared SCSI, VM, block I/O,
> and/or user space code above them all, only the last of which is shared
> with BSDI (and therefore possible to rule out).
>
> Another possibility is that BSDI, to my knowledge, does not support
> bounce buffers, instead allocating them at boot time and as a result
> allocating them in low memory.  It is entirely possible that you are
> running a HiNT EISA chipset or other chipset that does not support
> bus mastering DMA transfers to memory regions above the 16M limit
> (which would put your board in violation of the EISA standard, but
> still claiming to be EISA in the ROM tag location).

No, that's not possible.

These machines are 64MB P90s running on ASUS dual Pentia motherboards (one
processor on the card).  They are completely stable with BSDI and a number 
of other platforms, including Win/NT and other Unices, and are *certified*
as Novell 4.x compatible (by Novell themselves).  This is a *standardized*
configuration here; we're not playing with whatever we can get our hands on.

Remember, we're a quite-large ISP -- we have had hardware standardized for
over a year now, and haven't had any real reason to move that configuration
around much.

> There is insufficient information presented to diagnose your problem.

Since there are no errors presented to us when the 1742 hangs, and the 
2742 starts complaining about timeouts, I don't know where to go next.
Tagged queueing is not at issue; I have tried with it both enabled and
disabled.  With it *ON* the incidence of the hangs is reduced, but not
eliminated.

It LOOKS like something has requested an action on the SCSI bus which is
causing problems (e.g., disconnect sequencing) for devices, and/or the
adapter itself, causing a wedge condition.  Why this is not detectable and
correctable (or at least abortable with a panic) in the driver is unknown 
to me.  The kernel IS running -- I can telnet to the machine affected and
get connected, but any disk I/O attempt goes nowhere.
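
The kind of detection asked for above can be pictured as a per-command
watchdog: every request sent to the adapter is timestamped, and anything
still outstanding past its deadline is reported so that it can be aborted
or escalated instead of hanging silently. What follows is a minimal sketch
of that idea only. It is not taken from the 1742 or 2742 driver, and the
names `PendingCommand`, `CommandWatchdog`, and `timeout_s` are hypothetical.

```python
import time
from dataclasses import dataclass, field


@dataclass
class PendingCommand:
    """One outstanding request to a (hypothetical) SCSI target."""
    tag: int
    issued_at: float   # monotonic timestamp when the command was sent
    timeout_s: float   # how long we are willing to wait for completion


@dataclass
class CommandWatchdog:
    """Tracks outstanding commands and reports the ones that are overdue."""
    pending: dict = field(default_factory=dict)

    def issue(self, tag, timeout_s, now=None):
        # Record the command at the moment it is handed to the adapter.
        now = time.monotonic() if now is None else now
        self.pending[tag] = PendingCommand(tag, now, timeout_s)

    def complete(self, tag):
        # Normal completion: the adapter answered, so stop tracking it.
        self.pending.pop(tag, None)

    def expired(self, now=None):
        # Tags whose deadline has passed; the caller decides whether to
        # abort the command, reset the bus, or give up.
        now = time.monotonic() if now is None else now
        return sorted(tag for tag, cmd in self.pending.items()
                      if now - cmd.issued_at > cmd.timeout_s)
```

A periodic timer would call `expired()` and act on whatever it returns. A
wedge like the one described here would show up as tags that never leave
the pending set.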

There is a difference -- the 2742 is MORE stable than the 1742.  The
1742 machines run about 8 hours before dying -- the 2742 with MUCH 
heavier load on it can, in some cases, run for 2-3 days.

--
Karl Denninger (karl@MCS.Net)| MCSNet - The Finest Internet Connectivity
Modem: [+1 312 248-0900]     | (shell, PPP, SLIP, leased) in Chicagoland
Voice: [+1 312 248-8649]     | 7 Chicagoland POPs, ISDN, 28.8, much more
Fax: [+1 312 248-9865]       | Email to "info@mcs.net" WWW: http://www.mcs.net
ISDN - Get it here TODAY!    | Home of Chicago's only FULL AP Clarinet feed!


