From owner-aic7xxx  Fri Jul  9 10:50:59 1999
Delivered-To: aic7xxx@freebsd.org
Received: from ronly.co.uk (rongw1.ronly.co.uk [194.126.70.65])
	by hub.freebsd.org (Postfix) with ESMTP id 103BB14C3F
	for <AIC7xxx@FreeBSD.ORG>; Fri,  9 Jul 1999 10:50:52 -0700 (PDT)
	(envelope-from nt@dataskill.co.uk)
Received: by ronly.co.uk
	id m112enk-00013mC
	(Debian Smail-3.2 1996-Jul-4 #3); Fri, 9 Jul 1999 18:51:08 +0100 (BST)
Received: from dsl1.dataskill.co.uk(10.1.1.1) by rongw1.ronly.co.uk via smap (V1.3)
	id sma025008; Fri Jul  9 18:51:01 1999
Received: from dataskill.co.uk ([10.1.1.51]) by dataskill.co.uk
	 with esmtp id m112enE-000H4RC
	(Debian Smail-3.2 1996-Jul-4 #1); Fri, 9 Jul 1999 18:50:36 +0100 (BST)
Message-ID: <37863674.C14847BC@dataskill.co.uk>
Date: Fri, 09 Jul 1999 18:50:44 +0100
From: Nick Taylor <nt@dataskill.co.uk>
Organization: Dataskill
X-Mailer: Mozilla 4.5 [en] (X11; I; Linux 2.0.36 i686)
X-Accept-Language: en
MIME-Version: 1.0
To: Doug Ledford <dledford@redhat.com>
Cc: Stephan Loescher <loescher@leo.org>, AIC7xxx@FreeBSD.ORG
Subject: 3940W / Kernel 2.0.36+ fix
References: <m3vhbtj1sl.fsf@sl.sl.de> <37862400.921473D7@dataskill.co.uk> <378619DC.7E22CD80@redhat.com>
Content-Type: multipart/alternative;
 boundary="------------03544987E7EA8D9080B6A618"
Sender: owner-aic7xxx@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org


--------------03544987E7EA8D9080B6A618
Content-Type: text/plain; charset=iso-8859-1
Content-Transfer-Encoding: 8bit

Hi again

Have tried disabling the MMAPIO as suggested and that seems to have done the
trick :-)  Many thanks.

I don't know whether you feel it's worth building in some checks so that these
3940s don't break. The BIOS on mine is 1.24; maybe this has been superseded
by one that does work.

If there's any info that would help....

Nick
---

Doug Ledford wrote:

> Nick Taylor wrote:
> >
> > Hi
> >
> > I am still seeing 3940 problems, I get a similar lock up. For me it is as
> > soon as I try to access 2 hds at the same time.
> >
> > However it appears that not all 3940Ws fail as some people have indicated
> > that they are using them OK.
> >
> > My problem also appeared with kernel 2.0.36, 2.0.33 being OK. I am also
> > convinced that something has been broken, but sadly am not a C hacker so
> > don't know how this problem can be resolved.
> >
> > Nick
> > ---
> >
> > Stephan Loescher wrote:
> >
> > > Hi!
> > >
> > > I have found a bug that first appears in the aic7xxx code in Linux
> > > 2.0.34 (5.0.14/3.2.4) and is still there in recent 2.3.xx kernels! The
> > > aic7xxx code in Linux 2.0.33 (4.1.1/3.2.1) runs stable for me.
> > >
> > > The symptoms:
> > > When I copy a lot of large files from my hard disk (IBM DCAS-34330W) to
> > > my magneto-optical (MO) drive, then after some time (5 seconds to
> > > several minutes) the Linux kernel stops. The system is frozen (locked
> > > up) and the SCSI-bus LED, the MO LED and the hard-disk LED are all lit. I
> > > can't log into my system. Mouse and keyboard are "dead".
> > > The source and target filesystems are ext2.
> > > I can reproduce this behaviour.
> > > I can copy files between all my harddisks without any error.
> > > With kernel 2.0.33 there are no problems!
> > >
> > > I nailed it down with linux/Documentation/BUG-HUNTING to the
> > > aic7xxx-Code, because when I replace the aic7xxx-files in 2.0.34 with
> > > the files from 2.0.33, then the system runs stable.
> > >
> > > I have tried the following kernels:
> > > 2.0.34
> > > 2.0.35
> > > 2.0.36
> > > 2.1.128
> > > 2.2.2
> > > 2.2.5
> > > 2.2.6
> > > 2.2.7
> > > 2.2.10
> > > 2.3.4
> > > (with and without all AC-patches)
> > >
> > > Disabling all aic7xxx features does not help either.
> > > I tried these options:
> > > aic7xxx=verbose, aic7xxx=pci_parity, aic7xxx=verbose:0x1ffff
> > > and disabled TAGGED_QUEUEING entirely.
> > >
> > > To help you find the bug, I tried all aic7xxx patches for Linux
> > > 2.0.33 from the last 4.x.x up to 5.0.13. The results are:
> > >
> > > 5.0.0 /3.2.2: OK
> > > 5.0.1 /3.2.2: does not boot, seems _very_ unstable
> > > 5.0.10/3.2.2: OK
> > > 5.0.11/3.2.2: Makes endless SCSI-resets after issuing commands like
> > >               echo "scsi remove-single-device 0 0 1 0 " >/proc/scsi/scsi
> > > 5.0.12/3.2.2: locks up the system as 5.0.14 does!
> > > 5.0.13/3.2.2: locks up the system as 5.0.14 does!
> > >
> > > My system:
> > > Pentium-200 (single-CPU)
> > > SCSI-HA: Adaptec 3940U, BIOS 1.24
> > > Channel A:
> > > 0 : CD Sony CDU-76S
> > > 1 : HD Seagate ST32430N
> > > 3 : CDRW Yamaha CRW4416S 1.0f
> > > 4 : Streamer Tandberg NS20 Pro
> > > 5 : HD IBM DCAS-34330
> > > 6 : HD IBM DCAS-34330W
> > > (End of SCSI-bus with active termination, and AHA with auto-termination.)
> > > Channel B:
> > > 0 : Olympus Deltis-MOS320 (MO)
> > > 3 : HP ScanJet
> > > (End of SCSI-bus with passive termination, and AHA with auto-termination.)
> > >
> > > What was changed in the aic7xxx-code after 5.0.10/3.2.2?
> > >
> > > What can I do to help you find the bug?
> > >
> > > Stephan.
>
> OK, I must have missed this report somehow.  Anyway, the big item of change
> between the 5.0.10 and later versions is that all later versions default to
> using MMAPIO instead of PIO.  So, if you want to test things out, go into the
> aic7xxx.c file, find the line that reads:
>
> #define MMAPIO
>
> and comment that line out then recompile.  That should disable MMAPed I/O on
> your system and that will let us know if your problem is related to
> simultaneous I/O to different MMAP regions on the card.  Note, there may be
> more than one line of the #define MMAPIO in your source code, but assuming you
> are using an Intel based machine, you need only find the one in the #ifdef
> __i386__ block of code.  You can ignore the other architectures.  I would give
> an exact line number, but depending on which patch you use this could be
> greatly different as that section of code was in a state of flux during the
> 5.0.10->14 days.
>
> --
>   Doug Ledford   <dledford@redhat.com>
>    Opinions expressed are my own, but
>       they should be everybody's.
>
> To Unsubscribe: send mail to majordomo@FreeBSD.org
> with "unsubscribe aic7xxx" in the body of the message
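
[For anyone following the thread: the change Doug describes can be sketched
roughly as below. This is NOT the real aic7xxx.c. The names fake_regs,
reg_read, reg_write and uses_mmapio are made up for illustration, and the
card's registers are faked with an array so the sketch builds in user space.
In the driver the PIO path would go through port I/O and the MMAPIO path
through a mapped memory region; only the compile-time switch is the point.]

```c
#include <stdint.h>

/*
 * Illustrative sketch only, not the driver source.
 * With the #define commented out inside the i386 block, MMAPIO is
 * undefined and the port-I/O (PIO) branch below is compiled instead.
 */
#if defined(__i386__)
/* #define MMAPIO */   /* commented out, as suggested in the thread */
#endif

/* Hypothetical stand-in for the adapter's register window. */
static uint8_t fake_regs[256];

#ifdef MMAPIO

/* Memory-mapped branch: registers are accessed through a pointer. */
int uses_mmapio(void) { return 1; }

uint8_t reg_read(unsigned off)
{
    volatile uint8_t *maddr = fake_regs;
    return maddr[off & 0xffu];
}

void reg_write(unsigned off, uint8_t val)
{
    volatile uint8_t *maddr = fake_regs;
    maddr[off & 0xffu] = val;
}

#else

/* PIO branch: a real driver would use port I/O here; the array
 * access only emulates that so the example is self-contained. */
int uses_mmapio(void) { return 0; }

uint8_t reg_read(unsigned off)
{
    return fake_regs[off & 0xffu];
}

void reg_write(unsigned off, uint8_t val)
{
    fake_regs[off & 0xffu] = val;
}

#endif
```

[Built as shown, uses_mmapio() returns 0, i.e. the PIO branch is the one
that got compiled. Removing the comment markers around the #define on an
i386 build would select the other branch.]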

--
Nick Taylor   mailto://nt@dataskill.co.uk   Dataskill, London, England
mailto://webmaster@reflexology.org
HOME OF REFLEXOLOGY   http://www.reflexology.org


--------------03544987E7EA8D9080B6A618--
