Date: Fri, 09 Jul 1999 18:50:44 +0100
From: Nick Taylor <nt@dataskill.co.uk>
To: Doug Ledford <dledford@redhat.com>
Cc: Stephan Loescher <loescher@leo.org>, AIC7xxx@FreeBSD.ORG
Subject: 3940W / Kernel 2.0.36+ fix
Message-ID: <37863674.C14847BC@dataskill.co.uk>
References: <m3vhbtj1sl.fsf@sl.sl.de> <37862400.921473D7@dataskill.co.uk> <378619DC.7E22CD80@redhat.com>

Hi again

Have tried disabling the MMAPIO as suggested and that seems to have done the
trick :-) Many thanks.

I don't know whether you feel it's worth building in some checks so that these
3940s don't break. The BIOS on mine is 1.24; maybe this has been superseded by
one that does work.

If there's any info that would help....

Nick
---

Doug Ledford wrote:

> Nick Taylor wrote:
> >
> > Hi
> >
> > I am still seeing 3940 problems; I get a similar lock-up. For me it
> > happens as soon as I try to access 2 hds at the same time.
> >
> > However, it appears that not all 3940Ws fail, as some people have
> > indicated that they are using them OK.
> >
> > My problem also appeared with kernel 2.0.36, 2.0.33 being OK. I am also
> > convinced that something has been broken, but sadly am not a C hacker so
> > don't know how this problem can be resolved.
> >
> > Nick
> > ---
> >
> > Stephan Loescher wrote:
> >
> > > Hi!
> > >
> > > I have found a bug that first appears in the aic7xxx code in Linux
> > > 2.0.34 (5.0.14/3.2.4) and is still there in recent 2.3.xx kernels! The
> > > aic7xxx code in Linux 2.0.33 (4.1.1/3.2.1) runs stable for me.
> > >
> > > The symptoms:
> > > When I copy a lot of large files from my harddisk (IBM DCAS-34330W) to
> > > my magneto-optical (MO) drive, then after some time (5 seconds to
> > > several minutes) the Linux kernel stops. The system is frozen (locked
> > > up) and the SCSI-bus LED, the MO LED and the harddisk LED are lit. I
> > > can't log into my system. Mouse and keyboard are "dead".
> > > The source and target filesystems are ext2.
> > > I can reproduce this behaviour.
> > > I can copy files between all my harddisks without any error.
> > > With kernel 2.0.33 there are no problems!
> > >
> > > I nailed it down with linux/Documentation/BUG-HUNTING to the
> > > aic7xxx code, because when I replace the aic7xxx files in 2.0.34 with
> > > the files from 2.0.33, the system runs stable.
> > >
> > > I have tried the following kernels:
> > > 2.0.34
> > > 2.0.35
> > > 2.0.36
> > > 2.1.128
> > > 2.2.2
> > > 2.2.5
> > > 2.2.6
> > > 2.2.7
> > > 2.2.10
> > > 2.3.4
> > > (with and without all AC-patches)
> > >
> > > Disabling all aic7xxx features does not help either.
> > > I tried these options:
> > > aic7xxx=verbose, aic7xxx=pci_parity, aic7xxx=verbose:0x1ffff
> > > and disabled TAGGED_QUEUEING altogether.
> > >
> > > To help you find the bug, I tried all aic7xxx patches for Linux
> > > 2.0.33 from the last 4.x.x up to 5.0.13. The results are:
> > >
> > > 5.0.0 /3.2.2: OK
> > > 5.0.1 /3.2.2: does not boot, seems _very_ unstable
> > > 5.0.10/3.2.2: OK
> > > 5.0.11/3.2.2: Makes endless SCSI-resets after issuing commands like
> > >               echo "scsi remove-single-device 0 0 1 0 " >/proc/scsi/scsi
> > > 5.0.12/3.2.2: locks up the system as 5.0.14 does!
> > > 5.0.13/3.2.2: locks up the system as 5.0.14 does!
> > >
> > > My system:
> > > Pentium-200 (single-CPU)
> > > SCSI-HA: Adaptec 3490U, Bios 1.24
> > > Channel A:
> > >   0 : CD Sony CDU-76S
> > >   1 : HD Seagate ST32430N
> > >   3 : CDRW Yamaha CRW4416S 1.0f
> > >   4 : Streamer Tandberg NS20 Pro
> > >   5 : HD IBM DCAS-34330
> > >   6 : HD IBM DCAS-34330W
> > >   (End of SCSI-bus with active termination, and AHA with auto-termination.)
> > > Channel B:
> > >   0 : Olympus Deltis-MOS320 (MO)
> > >   3 : HP ScanJet
> > >   (End of SCSI-bus with passive termination, and AHA with auto-termination.)
> > >
> > > What was changed in the aic7xxx code after 5.0.10/3.2.2?
> > >
> > > What can I do to help you find the bug?
> > >
> > > Stephan.
>
> OK, I must have missed this report somehow. Anyway, the big item of change
> between the 5.0.10 and later versions is that all later versions default to
> using MMAPIO instead of PIO. So, if you want to test things out, go into the
> aic7xxx.c file, find the line that reads:
>
> #define MMAPIO
>
> and comment that line out, then recompile. That should disable MMAPed I/O on
> your system, and that will let us know if your problem is related to
> simultaneous I/O to different MMAP regions on the card. Note, there may be
> more than one #define MMAPIO line in your source code, but assuming you
> are using an Intel-based machine, you need only find the one in the #ifdef
> __i386__ block of code. You can ignore the other architectures. I would give
> an exact line number, but depending on which patch you use this could be
> greatly different, as that section of code was in a state of flux during the
> 5.0.10->14 days.
>
> --
> Doug Ledford <dledford@redhat.com>
> Opinions expressed are my own, but
> they should be everybody's.
>
> To Unsubscribe: send mail to majordomo@FreeBSD.org
> with "unsubscribe aic7xxx" in the body of the message

--
Nick Taylor                  mailto://nt@dataskill.co.uk
Dataskill, London, England   mailto://webmaster@reflexology.org
HOME OF REFLEXOLOGY          http://www.reflexology.org
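
The change Doug describes in the message above is a compile-time switch. What follows is a minimal, self-contained sketch of that pattern, assuming nothing about the real driver beyond the `MMAPIO` and `__i386__` names quoted in the thread. It is not code from aic7xxx.c: the helpers `io_mode_name`, `reg_read` and `reg_write`, and the two fake register arrays, are hypothetical stand-ins that only show how commenting out the define makes the code take the PIO path.

```c
/*
 * Sketch only. Nothing here is copied from aic7xxx.c; every name except
 * MMAPIO and __i386__ is hypothetical and exists purely for illustration.
 */

#ifdef __i386__
/* #define MMAPIO */   /* commented out as suggested: fall back to PIO */
#endif

/* Stand-ins for the card's memory-mapped window and its I/O port space. */
unsigned char fake_mmio_window[256];
unsigned char fake_port_space[256];

/* Report which access method was selected at compile time. */
const char *io_mode_name(void)
{
#ifdef MMAPIO
    return "MMAPIO";
#else
    return "PIO";
#endif
}

/* Write one byte to a (fake) controller register. */
void reg_write(unsigned int reg, unsigned char val)
{
#ifdef MMAPIO
    fake_mmio_window[reg & 0xff] = val;   /* memory-mapped path */
#else
    fake_port_space[reg & 0xff] = val;    /* port I/O path */
#endif
}

/* Read one byte back from a (fake) controller register. */
unsigned char reg_read(unsigned int reg)
{
#ifdef MMAPIO
    return fake_mmio_window[reg & 0xff];
#else
    return fake_port_space[reg & 0xff];
#endif
}
```

With the define commented out, `io_mode_name()` returns "PIO" and both register helpers touch only the port-space array. Restoring the `#define` on a 32-bit x86 build would switch both helpers to the memory-mapped array; on other architectures this sketch always uses the PIO path, which mirrors Doug's remark that only the `__i386__` block matters on an Intel machine.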
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?37863674.C14847BC>