Date: Sun, 25 May 2008 13:34:16 +0200
From: "Martin Laabs" <martin.laabs@mailbox.tu-dresden.de>
To: "Bruce Evans" <brde@optusnet.com.au>
Cc: "freebsd-gnats-submit@freebsd.org" <freebsd-bugs@freebsd.org>
Subject: Re: misc/123939: msdosfs corruptes new files
Message-ID: <op.ubpjreaw724k7f@martin>
In-Reply-To: <20080525100023.D17089@besplex.bde.org>
References: <200805231916.m4NJGVXP001708@www.freebsd.org> <20080524134012.L69478@delplex.bde.org> <op.ubnw4xiy724k7f@martin> <20080525100023.D17089@besplex.bde.org>

Hi,

> This thread switched to private mail.  Did you mean that?  I don't mind,
> but sometimes useful PR info gets lost because it is not public.

Oh no, this was not my intention. I have now added the two CCs, and
for that reason I full-quoted the last mail below.

Using "cmp -lx" directly on the device does not work, I think because
of the wrong block size.

Under the assumption that writing and reading with bs=2k work properly,
I tried to discover whether read, write, or both are affected by the
bug.

su:~$ dd if=/boot/kernel/kernel of=/dev/da4 bs=2k count=200
200+0 records in
200+0 records out
409600 bytes transferred in 1.399798 secs (292614 bytes/sec)

su:~$ dd if=/dev/da4 bs=2k|cmp -lx /boot/kernel/kernel - |head -n 15
00064000 f6 55
00064001 46 89
00064002 30 e5
[...]

This is OK since I wrote exactly 0x64000 bytes.
I then tried reading with 4k and 8k, which also worked fine. With a
block size of 10k I get a mismatch at address 0:

su:~$ dd if=/dev/da4 bs=10k|cmp -lx /boot/kernel/kernel - |head -n 15
00000000 7f 1d
00000001 45 04
00000002 4c 00
00000003 46 00
00000004 01 c4
00000005 01 16
00000006 01 00
00000007 09 00
[...]

I tried to determine the offset of the data that is read with
8k < bs <= 16k. It is exactly 0x2000 (8k). With 16k < bs <= 24k the
offset is 0x4000, and with 24k < bs <= 32k it is 0x6000.
So far I have only checked the data around address 0.

Now the writing experiment:
As seen above, bs=2k works OK. Now I try 4k:

su:~$ dd if=/boot/kernel/kernel of=/dev/da4 bs=4k count=100
100+0 records in
100+0 records out
409600 bytes transferred in 1.002872 secs (408427 bytes/sec)

su:~$ dd if=/dev/da4 bs=2k|cmp -lx /boot/kernel/kernel - |head -n 15
00064000 f6 00
00064001 46 00
00064002 30 00
[...]
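
The read offsets reported above (0x2000 for 8k < bs <= 16k, 0x4000 for 16k < bs <= 24k, 0x6000 for 24k < bs <= 32k, and no offset at 2k, 4k and 8k) fit a simple rule: the offset grows by 8k for every full 8k that the block size exceeds. The following Python sketch only restates that observation. The function name `observed_read_offset` and the constant `CHUNK` are illustrative names, the formula is an empirical fit to the figures in this mail rather than anything taken from the umass or da driver source, and values above 32k are an untested extrapolation.

```python
KIB = 1024
CHUNK = 8 * KIB  # 0x2000: the step between the offsets reported in the mail


def observed_read_offset(bs: int) -> int:
    """Return the data offset seen when reading /dev/da4 with block size bs.

    Empirical fit to the reported measurements only; this is an assumption
    about the pattern, not a description of what the driver actually does.
    """
    if bs <= 0:
        raise ValueError("block size must be positive")
    # ceil(bs / CHUNK) - 1 full 8k steps, each shifting the data by 8k.
    return ((bs + CHUNK - 1) // CHUNK - 1) * CHUNK


if __name__ == "__main__":
    # Block sizes mentioned in the mail, in KiB.
    for bs_k in (2, 4, 8, 10, 16, 24, 32):
        offset = observed_read_offset(bs_k * KIB)
        print(f"bs={bs_k}k -> offset {offset:#06x}")

    # Sanity check on the transfer size: 200 blocks of 2k is 0x64000 bytes.
    print(hex(200 * 2 * KIB))
```

If the rule holds, any read larger than 8k returns data starting at the last 8k boundary inside the requested block, which would be consistent with the suspicion in the quoted mail that the device has a lower transfer limit than the driver assumes.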

And now 8k:

su:~$ dd if=/boot/kernel/kernel of=/dev/da4 bs=8k count=50
50+0 records in
50+0 records out
409600 bytes transferred in 0.748899 secs (546936 bytes/sec)

su:~$ dd if=/dev/da4 bs=2k|cmp -lx /boot/kernel/kernel - |head -n 15
00064000 f6 00
00064001 46 00
00064002 30 00
[...]

Both are OK. With a bs of 10k I get the first byte mismatch at 0x2000 (8k):

su:~$ dd if=/boot/kernel/kernel of=/dev/da4 bs=10k count=40
40+0 records in
40+0 records out
409600 bytes transferred in 0.699931 secs (585201 bytes/sec)

su:~$ dd if=/dev/da4 bs=2k|cmp -lx /boot/kernel/kernel - |head -n 15
00002000 1d 7f
00002001 04 45
00002002 00 4c
00002003 00 46
00002004 c4 01
00002005 16 01
[...]

The offset of the read-back data is -0x2000. This means the data at 0x2000
on the stick was originally at 0x0. Since it *is* already there (cmp
did not report any difference between the file and the first 0x2000 bytes),
it is on the stick a second time. This means that the data that was
originally at 0x2000 is lost.
The length of this "discontinuity" is 0x800, with not really regular
spacing. (The write bs was 10k.)

su:~$ dd if=/dev/da4 bs=2k|cmp -lx /boot/kernel/kernel - |less
00002000 1d 7f
00002001 04 45
00002002 00 4c
[...]
000027f9 13 00
000027fc 00 49
000027fd 00 1e
00004800 de 00
00004801 0f 00
00004804 00 4f
[...]
00004ff9 00 02
00004ffc 00 3d
00004ffd 00 1e
00007000 00 bc
00007001 00 15
00007004 a0 be
[...]
000077f8 8a 83
000077f9 0f 1a
000077fc 12 dd
00009800 00 90
00009801 00 1d
00009804 00 83

So far,
 Martin

--------------------:<----------------------------

>>> This is probably a bug in the umass or da driver.  da claims to support i/o's
>>> of DFLTPHYS = 64K, so lower level drivers must support this even if the
>>> hardware doesn't, but apparently some usb drives have a lower limit.
>>
>> Hey - you are right. First I tried direct copy with bs=2k (which
>> is the sector size of that device.)
>> This was OK:
>>
>> u:~$ dd if=/boot/kernel/kernel of=/dev/da7 bs=2k
>> dd: /dev/da7: Invalid argument
>> 4501+1 records in
>> 4501+0 records out
>> 9218048 bytes transferred in 31.502305 secs (292615 bytes/sec)
> It's another bug that gives the EINVAL error for writing at EOF.
> This complicates debugging a little.  I think the disk size is not
> a multiple of the block size (2K here), so the last block would
> strictly cross the boundary at the end of the disk, and none of
> it is written, but the error handling would be different/better
> if the block were at the boundary, and maybe different/worse if
> the block were strictly beyond the boundary.  For larger blocks,
> the last one would be more likely to strictly cross the boundary.
> So just note the error above so as to ignore similar errors for
> larger blocks.
>
>> su:~$ dd if=/boot/kernel/kernel bs=2k count=4501 of=test.fs
>> 4501+0 records in
>> 4501+0 records out
>> 9218048 bytes transferred in 0.134802 secs (68382200 bytes/sec)
>
> No errors since it didn't go near EOF for either the input or output.
>
>> su:~$ diff test.umass test.fs
>>
>> Now I tried this with a block size of 128k and it did not work
>> anymore:
>>
>> su:~$ dd if=/boot/kernel/kernel of=/dev/da7 bs=128k
>> dd: /dev/da7: Invalid argument
>> 70+1 records in
>> 70+0 records out
>> 9175040 bytes transferred in 12.484369 secs (734922 bytes/sec)
>
> Better write only 70 blocks to avoid secondary errors.
> I think you
> eliminated the secondary error above by checking only 70 blocks later.
>
>> su:~$ dd if=/dev/da7 of=test.umass bs=128k count=70
>> 70+0 records in
>> 70+0 records out
>> 9175040 bytes transferred in 9.297371 secs (986842 bytes/sec)
>>
>> su:~$ dd if=/boot/kernel/kernel of=test.fs bs=128k count=70
>> 70+0 records in
>> 70+0 records out
>> 9175040 bytes transferred in 0.127474 secs (71975736 bytes/sec)
>>
>> su:~$ diff test.umass test.fs
>> Files test.umass and test.fs differ
>
> Use cmp -lx to locate the error(s), especially the first one (expect
> a lot).  Copying from the disk using dd is good for eliminating
> secondary errors, but cmp -lx directly on the disk should work for
> a quick check, with a better chance of working than for diff.  (It
> depends on whether cmp's block size working (the block size only needs
> to be a multiple of the sector size, with no partial block at EOF) and
> not being large enough to cause the suspected error on input.  The
> suspected bug may affect input, output, or both.  I suspect both.)
>
> Bruce
>
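
As a cross-check on the bs=10k write experiment in this mail, the mismatch addresses from the "cmp -lx ... |less" listing can be reduced modulo the write block size. The Python sketch below does only that arithmetic. The addresses are copied from the listing; because the listing is elided with "[...]", only the first and last visible address of each run is known, and the variable names (`runs`, `next_start`, `WRITE_BS`) are illustrative. On the four visible run starts, the spacing comes out as a constant 0x2800, which equals the 10k write size, and every visible mismatch falls at 0x2000 or later within its 10k block. This is an observation about the listed numbers, not a statement about the cause.

```python
WRITE_BS = 10 * 1024  # 0x2800, the block size used for the write

# (first, last) mismatching address visible in each run of the cmp listing.
# The listing is elided, so these are the visible bounds only.
runs = [
    (0x2000, 0x27FD),
    (0x4800, 0x4FFD),
    (0x7000, 0x77FC),
]
# Start of the fourth run; its end is not shown in the mail.
next_start = 0x9800

starts = [first for first, _ in runs] + [next_start]

# Distance between consecutive run starts.
spacings = [b - a for a, b in zip(starts, starts[1:])]

# Position of each run start and visible end inside its own write block.
start_in_block = [s % WRITE_BS for s in starts]
end_in_block = [last % WRITE_BS for _, last in runs]

if __name__ == "__main__":
    print("spacings:      ", [hex(v) for v in spacings])
    print("start in block:", [hex(v) for v in start_in_block])
    print("end in block:  ", [hex(v) for v in end_in_block])
```

Read this way, the damaged 0x800 bytes would be the tail of each 10k write, that is, the part of every write that lies beyond its first 8k. That matches the 8k step seen in the read experiment, but with only four runs visible it remains a hypothesis.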