From owner-freebsd-scsi  Wed May 21 14:50:29 1997
Return-Path: <owner-freebsd-scsi>
Received: (from root@localhost)
          by hub.freebsd.org (8.8.5/8.8.5) id OAA08700
          for freebsd-scsi-outgoing; Wed, 21 May 1997 14:50:29 -0700 (PDT)
Received: from wall.jhs.no_domain (vector.muc.ditec.de [194.120.126.35])
          by hub.freebsd.org (8.8.5/8.8.5) with ESMTP id OAA08594;
          Wed, 21 May 1997 14:49:41 -0700 (PDT)
Received: (from jhs@localhost)
	by desk.jhs.no_domain (8.8.5/8.8.5) id XAA03201;
	Wed, 21 May 1997 23:37:17 +0200 (MET DST)
Date: Wed, 21 May 1997 23:37:17 +0200 (MET DST)
Message-Id: <199705212137.XAA03201@desk.jhs.no_domain>
To: tomppa@fidata.fi
cc: scsi@freebsd.org
Cc: fabio@cesar.unicamp.br, fty@mcnc.org, gcrutchr@nightflight.com,
        j@uriah.heep.sax.de, jc@irbs.com, julian@freebsd.org,
        kuku@gilberto.physik.rwth-aachen.de, lehey.pad@sni.de, mrm@Sceard.com,
        nikm@ixa.net, tomppa@fidata.fi, wilko@yedi.iaf.nl,
        Scott Kelly <scott@relay.forest.com>
Subject: 8 * 0xFF bytes at intermittent multiples of 0x1000
From: "Julian H. Stacey" <jhs@freebsd.org>
Reply-To: "Julian H. Stacey" <jhs@freebsd.org>
Web: http://www.freebsd.org/~jhs/
Sender: owner-freebsd-scsi@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

To scsi@freebsd.org
Cc Adaptec 1542A SCSI Adapter People.
	I & some other 1542A people are probably not on scsi@ list,
	so please be careful if trimming CC line.

Ref. my earlier mail on [at least my] Adaptec 1542A misbehaving,
	writing 8 * 0xFF bytes at intermittent multiples of 0x1000.

I had thought it was my (dmesg:) "HP 97548S 8928" type 0 fixed SCSI 1 drive.
It's certainly not though: that drive was dismantled just last week
	 (500M 5" individual platters make shiny paperweights ;-)
& the problem remains on 2 replacement drives (physically labelled Seagate, but
	dmesg: "CDC 94191-15 5376" type 0 fixed SCSI 1).

I'm sure it is [at least my] 1542A card at fault.
  I've tried moving 1542C & 1542A cards & discs between systems, & deduced
  it's not motherboard, RAM, cache or overclocking (I don't, never have),
  nor is it termination.

Either just my card is faulty, or the on board EPROM code or FreeBSD driver
does not cope with some 1542A weirdness. As tomppa@fidata.fi
has seen a similar problem, I don't think it's just my card.

I don't know which, so I need some 1542A users to run a simple test to find out.
Please :-)

NO need to dismantle boxes, no need to be root, even guest will do !
Just compile & run my
	http://www.freebsd.org/~jhs//src/bsd/jhs/bin/public/testblock/
	(char sccsID[] = "@(#) testblock.c V2.3" is my latest version locally)
_simultaneously_ on 2 separate discs with
	cd /usr/tmp;testblock -v -l 20000000 foobar
	cd /usr1/tmp;testblock -v -l 20000000 foobar
If you get an error such as:
	Error at byte 32769 (0x8001), after 1,536,000 (0x177000)previously read;
	Read 0xff, expected 0x19.
Or even worse ....
	Error at byte 32769 (0x8001), after 0 previously read;
	Read 0xff, expected 0x0.
you can see the bad data with hexdump -C foobar | more   # then search: /^00008000
  00008000  ff ff ff ff ff ff ff ff  08 09 0a 0b 0c 0d 0e 0f  |ÿÿÿÿÿÿÿÿ........|
  00008010  10 11 12 13 14 15 16 17  18 19 1a 1b 1c 1d 1e 1f  |................|
  00008020  20 21 22 23 24 25 26 27  28 29 2a 2b 2c 2d 2e 2f  | !"#$%&'()*+,-./|
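
If you'd rather not page through hexdump output, here is a minimal sketch
(NOT part of testblock.c) of how a buffer read from foobar could be scanned
for this signature: 8 bytes of 0xFF starting on a multiple of 0x1000.
The names find_ff_runs, PAGE_STEP & RUN_LEN are made up for illustration,
& it assumes the buffer itself starts on a multiple of 0x1000 in the file.

```c
#include <stddef.h>

#define PAGE_STEP 0x1000u  /* corruption was seen at multiples of this */
#define RUN_LEN   8u       /* ... as a run of this many 0xFF bytes */

/*
 * Scan buf[0..len) for RUN_LEN consecutive 0xFF bytes starting at a
 * multiple of PAGE_STEP. base is the file offset of buf[0] & is assumed
 * (hypothetically) to be a multiple of PAGE_STEP itself.
 * Up to max_hits file offsets are stored in hits[]; the return value is
 * the total number of matches, which may exceed max_hits.
 */
size_t find_ff_runs(const unsigned char *buf, size_t len, unsigned long base,
                    unsigned long *hits, size_t max_hits)
{
    size_t found = 0;
    size_t off, i;

    for (off = 0; off + RUN_LEN <= len; off += PAGE_STEP) {
        for (i = 0; i < RUN_LEN; i++)
            if (buf[off + i] != 0xFF)
                break;
        if (i == RUN_LEN) {          /* all RUN_LEN bytes were 0xFF */
            if (found < max_hits)
                hits[found] = base + off;
            found++;
        }
    }
    return found;
}
```

Only the first 8 bytes at each 0x1000 step are looked at, so a 0xFF that
legitimately occurs elsewhere in the test data is not reported.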

The fault does not materialise on read, only on write (ie you can write good
data on a 1542C system & check it is still good on a 1542A, but not the other
way around).

The fault does not materialise if you only test one disc at a time !
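
For anyone who would rather drive both runs from one program than from two
shells, here is a rough sketch, assuming a POSIX system, of starting two
jobs at the same time & collecting both results. run_pair & placeholder_job
are hypothetical names; placeholder_job is only a stand-in (it just checks
that the path exists) for whatever write-and-verify test you actually run.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Stand-in job, for illustration only: 0 if path exists, else 1. */
int placeholder_job(const char *path)
{
    return access(path, F_OK) == 0 ? 0 : 1;
}

/*
 * Run job(path_a) & job(path_b) at the same time in two child processes.
 * job must return 0 on success. Returns the number of children that
 * failed (0, 1 or 2), or -1 if fork() or waitpid() itself failed.
 */
int run_pair(int (*job)(const char *), const char *path_a, const char *path_b)
{
    const char *paths[2];
    pid_t pids[2];
    int i, status, failures = 0;

    paths[0] = path_a;
    paths[1] = path_b;

    for (i = 0; i < 2; i++) {
        pids[i] = fork();
        if (pids[i] < 0) {
            if (i == 1)                 /* reap the child already started */
                waitpid(pids[0], &status, 0);
            return -1;
        }
        if (pids[i] == 0)               /* child: run the job, then exit */
            _exit(job(paths[i]) == 0 ? 0 : 1);
    }
    for (i = 0; i < 2; i++) {
        if (waitpid(pids[i], &status, 0) < 0)
            return -1;
        if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
            failures++;
    }
    return failures;
}
```

The two paths would be on different discs, eg one under /usr/tmp & one
under /usr1/tmp as in the commands above.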

The fault usually seems more noticeable on sd1 & sd2, less on sd0
(yes, I know: that suggests bus length & termination, but I really don't think so !)

The test program: testblock/ .c & .1
merely reads & writes a large file in user mode, you don't need to be root
& it does nothing nasty (except it will fill your file system with a single
very large file, if you don't use `-l number_of_bytes' ).
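
To show the general idea (this is NOT testblock.c, get the real thing from
the URL above), here is a minimal sketch of a write-then-verify test in
plain C. It assumes a pattern where each byte equals the low 8 bits of its
file offset, which is only my guess at a pattern consistent with the
hexdump above; fill_pattern, write_pattern, verify_pattern & CHUNK are
names invented for this sketch.

```c
#include <stdio.h>

#define CHUNK 4096

/* Assumed pattern: each byte is the low 8 bits of its file offset. */
static void fill_pattern(unsigned char *buf, size_t n, unsigned long offset)
{
    size_t i;

    for (i = 0; i < n; i++)
        buf[i] = (unsigned char)((offset + i) & 0xFF);
}

/* Write nbytes of pattern to path. Returns 0 on success, -1 on I/O error. */
int write_pattern(const char *path, unsigned long nbytes)
{
    unsigned char buf[CHUNK];
    unsigned long done = 0;
    FILE *fp = fopen(path, "wb");

    if (fp == NULL)
        return -1;
    while (done < nbytes) {
        size_t n = (nbytes - done < CHUNK) ? (size_t)(nbytes - done) : CHUNK;

        fill_pattern(buf, n, done);
        if (fwrite(buf, 1, n, fp) != n) {
            fclose(fp);
            return -1;
        }
        done += n;
    }
    return fclose(fp) == 0 ? 0 : -1;
}

/*
 * Read path back & compare with the pattern. Returns 0 if all nbytes
 * match, 1 at the first mismatch (its offset is stored in *bad_offset),
 * -1 on I/O error or if the file is shorter than nbytes.
 */
int verify_pattern(const char *path, unsigned long nbytes,
                   unsigned long *bad_offset)
{
    unsigned char got[CHUNK], want[CHUNK];
    unsigned long done = 0;
    FILE *fp = fopen(path, "rb");

    if (fp == NULL)
        return -1;
    while (done < nbytes) {
        size_t n = (nbytes - done < CHUNK) ? (size_t)(nbytes - done) : CHUNK;
        size_t i;

        if (fread(got, 1, n, fp) != n) {
            fclose(fp);
            return -1;
        }
        fill_pattern(want, n, done);
        for (i = 0; i < n; i++) {
            if (got[i] != want[i]) {
                *bad_offset = done + i;
                fclose(fp);
                return 1;
            }
        }
        done += n;
    }
    fclose(fp);
    return 0;
}
```

A caller would do write_pattern("foobar", 20000000UL) & then
verify_pattern("foobar", 20000000UL, &bad). Note a read-back straight after
the write may be satisfied from the buffer cache rather than the disc, so a
file comfortably bigger than RAM is probably more telling.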

This could be a good reason for you to run my testblock.c even if you think
you have no problem - think of it as a free disk check that doesn't disrupt:
no need to run DOS, drop to a debugger, be root, repartition, back up the file
system or any other hassle :-).

This problem exists on FreeBSD-2.1 & 2.2.1 & 2.2-stable

------------

For reference, I'll append parts of my old <jhs> mail:
> Tomi Vainio <tomppa@fidata.fi>
> Has confirmed he sees the same Adaptec 1542A SCSI adapter bug that I do.
> 
> > I connected sd1 to my 1542A and here are results:
> > 
> > 1. No problems if testblock is only one that generates disk activity.
> > 2. I launched couple find processes to sd0 and at same time I
> >    run testblock. Testblock failed only 1/10 of test runs.
> > 3. I copied files with cp to sd1 when running testblock on
> >    sd1. Testblock failed on every time.
> > 
> >   Tomppa

Remember, if you have swap partitions on sd*, & you swapped,
the swap may be damaged so you might crash. So although my testblock.c
is entirely well behaved (read the source if you doubt me :-) ,
just in case, & certainly if testblock reports data corruption,
a reboot might be wise.

Comments &/or test results please, Thanks :-)

Julian
--
Julian H. Stacey	jhs@freebsd.org  	http://www.freebsd.org/~jhs/