Skip site navigation (1)Skip section navigation (2)
Date:      Sun, 14 Apr 1996 04:18:15 +0200
From:      "Julian H. Stacey" <jhs@freebsd.org>
To:        bugs@freebsd.org
Cc:        fabio@cesar.unicamp.br, fty@mcnc.org, gcrutchr@nightflight.com, j@uriah.heep.sax.de, jc@irbs.com, julian@freebsd.org, kuku@gilberto.physik.rwth-aachen.de, lehey.pad@sni.de, mrm@Sceard.com, nikm@ixa.net, tomppa@fidata.fi, wilko@yedi.iaf.nl
Subject:   Re: Adaptec 1542A Users 
Message-ID:  <199604140218.EAA10729@vector.jhs.no_domain>
In-Reply-To: Your message of "Sat, 13 Apr 1996 02:39:33 %2B0300." <199604122339.CAA00591@zeta.fidata.fi> 

next in thread | previous in thread | raw e-mail | index | archive | help
Hi bugs@freebsd.org
Cc Adaptec 1542A SCSI Adapter People, & Julian Elischer (SCSI Guru :-)

Tomi Vainio <tomppa@fidata.fi>
Has confirmed he sees the same Adaptec 1542A SCSI adapter bug that I do.

To quote his 2 mails to me:
======
> Date: Fri, 12 Apr 1996 23:37:12 +0300 (EET DST)
> From: Tomi Vainio <tomppa@fidata.fi>
> To: "Julian H. Stacey" <jhs@freebsd.org>
> Subject: Re. Adaptec 1542A Users
> 
> Julian H. Stacey writes:
>  > 
>  > Since my posting,
>  > I have written & read a 190 meg image on sd3 using a system with a 1542B,
>  > then I took it to the 1542A system, as sd1 ... it blew within a few K.
>  > 
>  > The test on a 1542A system is simple:
>  > 	make
>  > 	./testblock -v -l 10000000 /usr2/tmp/rubbish
>  > (where /usr2/tmp must be on /dev/sd1 or 2 or 3, but not /dev/sd0 or /dev/wd*
>  > & wait to see if you get something like:
>  > 	data mismatch at byte 40961 (0xa001), after 0 (0x0) previously checked ok.
> 
> I connected sd1 to my 1542A and here are results:
> 
> 1. No problems if testblock is only one that generates disk activity.
> 2. I launched couple find processes to sd0 and at same time I
>    run testblock. Testblock failed only 1/10 of test runs.
> 3. I copied files with cp to sd1 when running testblock on
>    sd1. Testblock failed on every time.
> 
>   Tomppa
> 
> ../testblock -v -l 10000000 /v/fish
> ../testblock: Neither -w or -r specified, so will both write then read.
> Using a block size of 61440, to a limit of 10000000.
> ../testblock writing then reading /v/fish.
> ../testblock: Started rewinding /v/fish.
> ../testblock: Finished rewinding /v/fish.
> ../testblock: In /v/fish, data mismatch at byte 49153 (0xc001), after 0 (0x0) previously checked ok.
> Byte read 255, byte expected 0
> ../testblock: With /v/fish, only checked 0 bytes, 10,014,720 failed.
> ../testblock: Finished.
> }{
======
> Date: Sat, 13 Apr 1996 02:39:33 +0300 (EET DST)
> From: Tomi Vainio <tomppa@fidata.fi>
> To: "Julian H. Stacey" <jhs@freebsd.org>
> 
> Julian H. Stacey writes:
>  > 
>  > Please tell me in what way the data is corrupted.
>  > Do you get a block of 8 0xFF bytes aligned at the beginning of random
>  > (not all) 0x1000 boundary aligned blocks ?
>  > 
>  > Please run the enclosed 8f.c on some corrupt files, 
>  > maybe with 
>  > 	script
>  > 	find /usr.flaky.drive -exec 8f {} /dev/null \; 
>  > I am very interested to know if you see the same bad data I do !
>  >
> 
> fish is file that testblock made and zip files are copied with cp
> 
>   Tomppa
> 
> worm:/v(13)# find . -exec 8f/8f {} /dev/null \;
> ../fish:	byte (dec) 147457	(hex) 24001,	line 577	character 247
> ../fish:	byte (dec) 2203649	(hex) 21a001,	line 8608	character 25
> ../fish:	byte (dec) 2383873	(hex) 246001,	line 9312	character 28
> ../herit201.zip:	byte (dec) 147457	(hex) 24001,	line 570	character 21
> ../herit209.zip:	byte (dec) 212993	(hex) 34001,	line 833	character 53
> ../herit209.zip:	byte (dec) 1245185	(hex) 130001,	line 5132	character 122
> ../herit208.zip:	byte (dec) 53249	(hex) d001,	line 233	character 482
> ../herit208.zip:	byte (dec) 147457	(hex) 24001,	line 619	character 355
> ../herit207.zip:	byte (dec) 49153	(hex) c001,	line 178	character 39
> ../herit204.zip:	byte (dec) 999425	(hex) f4001,	line 3874	character 275
> ../herit204.zip:	byte (dec) 1458177	(hex) 164001,	line 5863	character 11
> ../herit203.zip:	byte (dec) 147457	(hex) 24001,	line 406	character 56
> ../herit203.zip:	byte (dec) 868353	(hex) d4001,	line 3138	character 72
> ../herit202.zip:	byte (dec) 49153	(hex) c001,	line 194	character 239
> ../herit210.zip:	byte (dec) 49153	(hex) c001,	line 192	character 497
> ../herit210.zip:	byte (dec) 147457	(hex) 24001,	line 611	character 491
> ../8f/8f.c:	byte (dec) 173	(hex) ad,	line 10	character 32
> 

So it looks like a generic bug in FreeBSD code:
	With a 1542A (& not a 1542B, which seems OK),
	In simultaneous multiple task write mode to sd1 (or 2 or 3 or 4),
	At random multiples of 0x1000 bytes,
	The first 8 bytes of a block get forced to 0xFF.
(Of course it may well be that FreeBSD code is not `in error' but merely
doesnt allow for some wart in the 1542A, that's fixed in the 1542B,
but whatever, we need a fix).

Those who have not yet proven this on their system might like to try something
like this:
	sync ; echo maybe even dump sd1 to tape # See below
        cd <<<sd1_mount_point>>>/tmp
        testblock -l 10000000 rubbish1 &
        testblock -l 10000000 rubbish2 &
        testblock -l 10000000 rubbish3 &
        & do some other sd0 to sd1 copying in parallel.
        Then run my 8f on all the data files youve run.
        
Remember if you have a swap partition on sd1, & you swapped,
the swap may be damaged so you might crash.
If you'r really unlucky, while the system is creating new inodes for the 
rubbish files, & is manipulating the file system, 8 bytes (out of several 0x1000)
bytes of file system structure data may get mangled.

Here's hoping our SCSI Guru, Julian Elischer (or anyone else come to that)
can suggest some code changes (ideally diffs) for me to compile here,
& test on my machines, & that I can report back on.
(Others too, naturally, if they want).
I guess Im in an ideal test set up here:
	I have a 1542B current system with 3 discs, & a 1542A 2.1 system with
	3 or 4 discs, (each also with scsi tape, & I think Ive seen this bug
	on tape too, on my 1542A)

I have supplied CC readers with testblock.c & 8f.c,
for others interested, I'll toss them in http://www.freebsd.org/~jhs/src/

Julian
--
Julian H. Stacey	jhs@freebsd.org  	http://www.freebsd.org/~jhs/



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?199604140218.EAA10729>