FreeBSD Mail Archives

Date:      Sun, 25 May 2008 13:34:16 +0200
From:      "Martin Laabs" <martin.laabs@mailbox.tu-dresden.de>
To:        "Bruce Evans" <brde@optusnet.com.au>
Cc:        "freebsd-gnats-submit@freebsd.org" <freebsd-bugs@freebsd.org>
Subject:   Re: misc/123939: msdosfs corruptes new files
Message-ID:  <op.ubpjreaw724k7f@martin>
In-Reply-To: <20080525100023.D17089@besplex.bde.org>
References:  <200805231916.m4NJGVXP001708@www.freebsd.org> <20080524134012.L69478@delplex.bde.org> <op.ubnw4xiy724k7f@martin> <20080525100023.D17089@besplex.bde.org>


Hi,

> This thread switched to private mail.  Did you mean that?  I don't mind,
> but sometimes useful PR info gets lost because it is not public.

Oh no, that was not my intention. I have now added the two CCs again,
and for that reason I have quoted the last mail in full below.

Using "cmp -lx" directly on the device does not work, I think because
of the wrong block size.
Under the assumption that writing and reading with bs=2k work
properly, I tried to find out whether reading, writing, or both are
affected by the bug.

su:~$ dd if=/boot/kernel/kernel of=/dev/da4 bs=2k count=200
200+0 records in
200+0 records out
409600 bytes transferred in 1.399798 secs (292614 bytes/sec)
su:~$ dd if=/dev/da4 bs=2k|cmp -lx /boot/kernel/kernel - |head -n 15
00064000 f6 55
00064001 46 89
00064002 30 e5
[...]

This is OK since I wrote exactly 0x64000 bytes.
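
The byte count can be checked with a little arithmetic. The Python
sketch below multiplies block size by record count and prints the total
in hex; the helper name bytes_written is made up for illustration. If
everything written reads back correctly, the first difference reported
by cmp should sit exactly at this total.

```python
def bytes_written(bs_kib: int, count: int) -> int:
    """Total bytes dd transfers for bs=<bs_kib>k count=<count>."""
    return bs_kib * 1024 * count


total = bytes_written(2, 200)
print(total, hex(total))  # 409600 0x64000
```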

Next I tried 4k and 8k, which also worked fine.

With a block size of 10k I get a mismatch at address 0:

su:~$ dd if=/dev/da4 bs=10k|cmp -lx /boot/kernel/kernel - |head -n 15
00000000 7f 1d
00000001 45 04
00000002 4c 00
00000003 46 00
00000004 01 c4
00000005 01 16
00000006 01 00
00000007 09 00
[...]

I tried to determine the offset of the data that is read with
bs>8k and bs<=16k. It is exactly 0x2000 (8k).
With bs>16k, bs<=24k the offset is 0x4000; with bs>24k, bs<=32k
it is 0x6000.
So far I have only checked the data around address 0.
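
The three ranges above fit a simple step function in units of 8k. The
Python sketch below is only an empirical fit to the values reported
here (block sizes up to 32k, data around address 0); the function name
observed_read_offset is hypothetical and the formula says nothing about
what the driver actually does.

```python
CHUNK = 0x2000  # 8k, the step size seen in the measurements above


def observed_read_offset(bs: int) -> int:
    """Offset of the data that shows up at address 0 when reading with
    block size bs (in bytes). Empirical fit to the reported ranges only."""
    return ((bs - 1) // CHUNK) * CHUNK


for bs_k in (2, 8, 10, 16, 24, 32):
    print(f"bs={bs_k}k -> offset {observed_read_offset(bs_k * 1024):#x}")
```

Block sizes up to 8k give offset 0 in this fit, which agrees with the
2k, 4k and 8k reads that compared fine.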

Now the writing experiment:

As seen above, bs=2k works OK. Now I try 4k:

su:~$ dd if=/boot/kernel/kernel of=/dev/da4 bs=4k count=100
100+0 records in
100+0 records out
409600 bytes transferred in 1.002872 secs (408427 bytes/sec)
su:~$ dd if=/dev/da4 bs=2k|cmp -lx /boot/kernel/kernel - |head -n 15
00064000 f6 00
00064001 46 00
00064002 30 00
[...]

And now 8k:

su:~$ dd if=/boot/kernel/kernel of=/dev/da4 bs=8k count=50
50+0 records in
50+0 records out
409600 bytes transferred in 0.748899 secs (546936 bytes/sec)
su:~$ dd if=/dev/da4 bs=2k|cmp -lx /boot/kernel/kernel - |head -n 15
00064000 f6 00
00064001 46 00
00064002 30 00
[...]

Both are OK.

With a bs of 10k I get the first byte mismatch at 0x2000 (8k):

su:~$ dd if=/boot/kernel/kernel of=/dev/da4 bs=10k count=40
40+0 records in
40+0 records out
409600 bytes transferred in 0.699931 secs (585201 bytes/sec)
su:~$ dd if=/dev/da4 bs=2k|cmp -lx /boot/kernel/kernel - |head -n 15
00002000 1d 7f
00002001 04 45
00002002 00 4c
00002003 00 46
00002004 c4 01
00002005 16 01
[...]

The offset of the read-back data is -0x2000. This means the data at 0x2000
on the stick originally belongs at 0x0. Since it *is* already there (cmp
did not report any difference between the file and the first 0x2000 bytes),
it is on the stick a second time. This means that the data that originally
belongs at 0x2000 is lost.
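
One way to picture this is a toy model in which only the first 8k of
each written block arrives intact and the rest of the block is filled
with a repeat of the block's beginning. This is purely an assumption
made to illustrate the observation for bs=10k, not a diagnosis of umass
or da; the names faulty_write and first_mismatch and the synthetic test
data are invented for the sketch.

```python
CHUNK = 0x2000  # 8k: assumed largest piece that is transferred correctly


def faulty_write(data: bytes, bs: int) -> bytes:
    """Hypothetical model: in each bs-sized block, the bytes past the
    first CHUNK are replaced by a repeat of the start of that block."""
    out = bytearray()
    for start in range(0, len(data), bs):
        block = data[start:start + bs]
        good = block[:CHUNK]
        tail_len = len(block) - len(good)
        out += good + block[:tail_len]
    return bytes(out)


def first_mismatch(a: bytes, b: bytes):
    """Offset of the first differing byte, or None if none differs."""
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return i
    return None


# Synthetic stand-in for the kernel file (40k of non-repeating data).
source = bytes((i * 7 + i // 251) % 256 for i in range(40 * 1024))
device = faulty_write(source, 10 * 1024)
print(hex(first_mismatch(source, device)))  # 0x2000
```

In this model the 0x800 bytes at 0x2000 on the "device" equal the first
0x800 bytes of the source, and the source bytes that belong at 0x2000
are gone, as described above. With bs=8k the model writes the data
unchanged.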

The length of this "discontinuity" is 0x800, with not really regular
spacing (the writing bs was 10k).

su:~$ dd if=/dev/da4 bs=2k|cmp -lx /boot/kernel/kernel - |less

00002000 1d 7f
00002001 04 45
00002002 00 4c
[...]
000027f9 13 00
000027fc 00 49
000027fd 00 1e
00004800 de 00
00004801 0f 00
00004804 00 4f
[...]
00004ff9 00 02
00004ffc 00 3d
00004ffd 00 1e
00007000 00 bc
00007001 00 15
00007004 a0 be
[...]
000077f8 8a 83
000077f9 0f 1a
000077fc 12 dd
00009800 00 90
00009801 00 1d
00009804 00 83
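
At least for the four runs visible in this excerpt, the start addresses
fit a simple pattern: each 10k (0x2800) write block compares equal for
its first 0x2000 bytes and differs in its last 0x800 bytes. cmp -l
lists only the bytes that differ, so gaps inside a run are expected.
The Python sketch below just reproduces that arithmetic; whether the
pattern holds beyond the excerpt is not shown here, and the name
bad_run is made up.

```python
BS = 0x2800      # 10k write block size used above
GOOD = 0x2000    # bytes per block that compared equal
BAD_LEN = 0x800  # length of each mismatching run


def bad_run(n: int):
    """First and last offset of the mismatching run in write block n."""
    start = n * BS + GOOD
    return start, start + BAD_LEN - 1


for n in range(4):
    first, last = bad_run(n)
    print(f"{first:08x} - {last:08x}")
```

The computed starts are 00002000, 00004800, 00007000 and 00009800, the
same addresses at which the runs in the cmp output begin.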


So far,
  Martin


--------------------:<----------------------------

>>> This is probably a bug in the umass or da driver. da claims to support
>>> i/o's of DFLTPHYS = 64K, so lower level drivers must support this even if
>>> the hardware doesn't, but apparently some usb drives have a lower limit.
>>
>> Hey - you are right. First I tried direct copy with bs=2k (which
>> is the sector size of that device.) This was OK:
>>
>> su:~$ dd if=/boot/kernel/kernel of=/dev/da7 bs=2k
>> dd: /dev/da7: Invalid argument
>> 4501+1 records in
>> 4501+0 records out
>> 9218048 bytes transferred in 31.502305 secs (292615 bytes/sec)

> It's another bug that gives the EINVAL error for writing at EOF.
> This complicates debugging a little.  I think the disk size is not
> a multiple of the block size (2K here), so the last block would
> strictly cross the boundary at the end of the disk, and none of
> it is written, but the error handling would be different/better
> if the block were at the boundary, and maybe different/worse if
> the block were strictly beyond the boundary.  For larger blocks,
> the last one would be more likely to strictly cross the boundary.
> So just note the error above so as to ignore similar errors for
> larger blocks.
>
>> su:~$ dd if=/boot/kernel/kernel bs=2k count=4501 of=test.fs
>> 4501+0 records in
>> 4501+0 records out
>> 9218048 bytes transferred in 0.134802 secs (68382200 bytes/sec)
>
> No errors since it didn't go near EOF for either the input or output.
>
>> su:~$ diff test.umass test.fs
>>
>> Now I tried this with a block size of 128k and it did not work
>> anymore:
>>
>> su:~$ dd if=/boot/kernel/kernel of=/dev/da7 bs=128k
>> dd: /dev/da7: Invalid argument
>> 70+1 records in
>> 70+0 records out
>> 9175040 bytes transferred in 12.484369 secs (734922 bytes/sec)
>
> Better write only 70 blocks to avoid secondary errors.  I think you
> eliminated the secondary error above by checking only 70 blocks later.
>
>> su:~$ dd if=/dev/da7 of=test.umass bs=128k count=70
>> 70+0 records in
>> 70+0 records out
>> 9175040 bytes transferred in 9.297371 secs (986842 bytes/sec)
>>
>> su:~$ dd if=/boot/kernel/kernel of=test.fs bs=128k count=70
>> 70+0 records in
>> 70+0 records out
>> 9175040 bytes transferred in 0.127474 secs (71975736 bytes/sec)
>>
>> su:~$ diff test.umass test.fs
>> Files test.umass and test.fs differ
>
> Use cmp -lx to locate the error(s), especially the first one (expect
> a lot).  Copying from the disk using dd is good for eliminating
> secondary errors, but cmp -lx directly on the disk should work for
> a quick check, with a better chance of working than for diff.  (It
> depends on cmp's block size working (the block size only needs
> to be a multiple of the sector size, with no partial block at EOF) and
> not being large enough to cause the suspected error on input.  The
> suspected bug may affect input, output, or both.  I suspect both.)
>
> Bruce
>
