FreeBSD Mail Archives

Date:      Tue, 12 Nov 1996 15:05:48 -0800
From:      "Justin T. Gibbs" <gibbs@freefall.freebsd.org>
To:        "Андрей Чернов" (Andrey A. Chernov) <ache@nagual.ru>
Cc:        current@freebsd.org, scsi@freebsd.org
Subject:   Re: SCB paging is most dangerous option now! 
Message-ID:  <199611122305.PAA02805@freefall.freebsd.org>
In-Reply-To: Your message of "Tue, 12 Nov 1996 18:41:16 +0300." <199611121541.SAA00746@nagual.ru>

>> What were the error messages?
>> 
>
>They are not stored anywhere now, because it seems that ANY disk write,
>including syslog writes, immediately destroys the inode table.
>As I remember, there was something like:
>
>data overrun of XXXX bytes detected
>
>followed by various failed retraining/resetting attempts.
>As I remember, not a single write succeeded.

This sounds like a cache coherency bug with your motherboard.  What kind
is it?

The reason I believe this to be the case is that:

1) SCB paging causes the same piece of memory to be DMA'ed in and out
in rapid succession, much more often than in the non-paging case.  The
amount of DMA will see a dramatic increase when you switch from one to two
active targets.

2) After I saw your bug report last night, I again attempted to reproduce
the error.  I made my 2940 look as much like a 2842 as I could by making
the driver believe that it only has 4 SCBs.  After about 30 minutes of
pounding my two disks with as many as 30 outstanding transactions at a
time, I gave up.  I will try again tonight with my aic7850 card (3 SCBs)
as soon as I can rip the machine apart and rearrange my disks.

Now I don't have access to a Rev E board anywhere, and the driver does take
advantage of undocumented features of that revision of the aic7770.  I can
send you a little snippet of code that can verify that the one important
feature, being able to store full 8-bit values in the QINFIFO and QOUTFIFO,
works on your card without you having to turn on SCB paging.  I don't
believe that feature is what is failing, though, since if it were broken
even a single drive would not work at all.

If someone has either a 2742A(T) or 2842A that they'd like to send me, I
may be able to debug this further.

If it is DMA related, it should be easy to see that by playing with your
cache settings and trying to reproduce the problem.  If you are going to do
this, attempt to reproduce it *only in single user mode*, with your
filesystems mounted read only, by starting multiple processes accessing the
disks.  I
have yet to lose any disk data with this kind of testing, and this will
usually fail easily if the problem you are reporting still exists.  If the
system starts to go south, note what the error messages are and hit the
reset button.  Multiple dds (at least 8 to each drive) from the raw
partitions of your disks to /dev/null will work nicely.
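
A minimal sketch of that read-only dd test, under some assumptions: the
function name stress_read, the 64k block size, and the device names in the
example line are placeholders and not anything the driver requires.
Substitute the raw partitions of your own disks.

```shell
#!/bin/sh
# Hypothetical helper: start READERS concurrent dd processes per device,
# each reading the device and discarding the data, then wait for all of them.
# Usage: stress_read READERS DEVICE...
stress_read() {
    readers=$1
    shift
    pids=""
    started=0
    for dev in "$@"; do
        i=0
        while [ "$i" -lt "$readers" ]; do
            # Read-only access: the data goes straight to /dev/null.
            dd if="$dev" of=/dev/null bs=64k 2>/dev/null &
            pids="$pids $!"
            started=$((started + 1))
            i=$((i + 1))
        done
    done
    failed=0
    for pid in $pids; do
        # Count every reader that exits with an error.
        wait "$pid" || failed=$((failed + 1))
    done
    echo "readers started: $started, failed: $failed"
    [ "$failed" -eq 0 ]
}

# Example (device names are assumptions; use your own raw partitions):
# stress_read 8 /dev/rsd0 /dev/rsd1
```

The dd error output is discarded here only to keep the summary line
readable; the messages that matter for the bug report are the driver's
messages on the console, so note those before hitting reset.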

--
Justin T. Gibbs
===========================================
  FreeBSD: Turning PCs into workstations
===========================================

Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?199611122305.PAA02805>
