Date: Wed, 19 Oct 2016 15:10:22 +0200
From: Arrigo Marchiori <ardovm@yahoo.it>
To: Poul-Henning Kamp <phk@phk.freebsd.dk>
Cc: Arrigo Marchiori via freebsd-fs <freebsd-fs@freebsd.org>
Subject: Re: Random truncated files on USB hard disk with timeouts; how to debug?
Message-ID: <20161019131022.GE93031@nuvolo>
In-Reply-To: <23735.1476876382@critter.freebsd.dk>
References: <20161018152715.GC89691@nuvolo> <51997.1476812624@critter.freebsd.dk>
 <20161019062812.GA93031@nuvolo> <7759.1476858801@critter.freebsd.dk>
 <20161019064315.GB93031@nuvolo> <7924.1476861738@critter.freebsd.dk>
 <20161019080005.GD93031@nuvolo> <23735.1476876382@critter.freebsd.dk>

Hello Poul-Henning,

On Wed, Oct 19, 2016 at 11:26:22AM +0000, Poul-Henning Kamp wrote:
> --------
> In message <20161019080005.GD93031@nuvolo>, Arrigo Marchiori writes:
>
> >> If the drive has bad power supply, that may not happen.
> >
> >Yes, I understand. But, forgive me for insisting: there is an
> >inconsistency that is _at filesystem level_ and _temporary_, and this
> >really puzzles me.
>
> Because the drive returns wrong data every so often and when
> power is better returns correct data ?
>
> End-to-End arguments in system design applies here:
>
> Either you trust your drive, or you check everything it tells you
> (ie: RAID with parity, ZFS or similar).

Ok, but I cannot understand why read() returns plain zero bytes. If
``bad'' data were received from a USB read operation, I would expect
it to make no sense to the kernel, not to show up as an empty file.

While fiddling with a funny file, I found that read(2) and mmap(2)
behave differently. While cat(1) shows an empty file, cp(1) was able
to read its contents. The file was in fact
/usr/src/usr.bin/clang/clang/clang.1, the source of the clang(1)
manual page.

On the other hand, mv(1) does not alter the ``readability'' of the
file.

# mv clang.1 a
# truss cat a
[snip]
openat(AT_FDCWD,"a",O_RDONLY,00) = 3 (0x3)
fstat(1,{ mode=crw--w---- ,inode=146,size=0,blksize=4096 }) = 0 (0x0)
__sysctl(0x7fffffffe5e0,0x2,0x7fffffffe5c4,0x7fffffffe5c8,0x0,0x0) = 0 (0x0)
__sysctl(0x7fffffffe5e0,0x2,0x7fffffffe5c4,0x7fffffffe5c8,0x0,0x0) = 0 (0x0)
mmap(0x0,2097152,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,-1,0x0) = 34374418432 (0x800e00000)
read(3,0x800e16000,4096) = 0 (0x0)
close(3) = 0 (0x0)
[snip]
# truss cp a b
[snip]
stat("b",0x7fffffffe9d8) ERR#2 'No such file or directory'
lstat("a",{ mode=-rw-r--r-- ,inode=6510202,size=16993,blksize=32768 }) = 0 (0x0)
umask(0x1ff) = 18 (0x12)
umask(0x12) = 511 (0x1ff)
mmap(0x0,2097152,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,-1,0x0) = 34374418432 (0x800e00000)
fstatat(AT_FDCWD,"a",{ mode=-rw-r--r-- ,inode=6510202,size=16993,blksize=32768 },0x0) = 0 (0x0)
stat("b",0x7fffffffea50) ERR#2 'No such file or directory'
openat(AT_FDCWD,"a",O_RDONLY,00) = 3 (0x3)
openat(AT_FDCWD,"b",O_WRONLY|O_CREAT|O_TRUNC,0100644) = 4 (0x4)
mmap(0x0,16993,PROT_READ,MAP_SHARED,3,0x0) = 34366304256 (0x800643000)
write(4,".\\" $FreeBSD: stable/11/usr.bin"...,16993) = 16993 (0x4261)
munmap(0x800643000,16993) = 0 (0x0)
close(4) = 0 (0x0)
close(3) = 0 (0x0)
[snip]

Please also consider that these commands are repeatable (on the same
file): cat always sees the file as empty, cp always succeeds.

# cp a c
# cat a
# cat c
[data]

I think this also narrows the problem down to read operations: the
file was successfully installed with yesterday's buildworld. Only
today, at this time, it started to behave ``funny''.

Best regards,
-- 
rigo

http://rigo.altervista.org
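
The cat/cp difference in the traces above comes down to read(2)
returning 0 bytes while an mmap(2) of the same file yields 16993
bytes. A minimal sketch for checking any suspect file the same way
without truss follows. It is not part of the original report: it
assumes Python 3 is installed on the affected machine, and the
function names (read_via_read, read_via_mmap, compare) are
hypothetical helpers chosen for illustration.

```python
import mmap
import os
import sys


def read_via_read(path, bufsize=4096):
    """Return the file contents using plain read(2) calls, like cat(1) in the trace."""
    fd = os.open(path, os.O_RDONLY)
    try:
        chunks = []
        while True:
            chunk = os.read(fd, bufsize)
            if not chunk:  # read(2) returned 0 bytes: treated as end of file
                break
            chunks.append(chunk)
        return b"".join(chunks)
    finally:
        os.close(fd)


def read_via_mmap(path):
    """Return the file contents through a read-only shared mapping, like cp(1) in the trace."""
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        if size == 0:
            return b""  # an empty file cannot be mapped; nothing to read
        with mmap.mmap(fd, size, access=mmap.ACCESS_READ) as mapping:
            return mapping[:]
    finally:
        os.close(fd)


def compare(path):
    """Report the size from stat(2) next to the byte counts seen by both access paths."""
    by_read = read_via_read(path)
    by_mmap = read_via_mmap(path)
    return {
        "st_size": os.stat(path).st_size,
        "read_bytes": len(by_read),
        "mmap_bytes": len(by_mmap),
        "identical": by_read == by_mmap,
    }


if __name__ == "__main__":
    for name in sys.argv[1:]:
        print(name, compare(name))
```

Saved under any name and run with one or more file names as
arguments, it prints one line per file. On a healthy file all three
numbers agree; for a file behaving like the one above, one would
expect read_bytes to be 0 while st_size and mmap_bytes are not.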