From owner-freebsd-bugs@FreeBSD.ORG Mon Dec 5 15:07:05 2011
Date: Tue, 6 Dec 2011 02:07:02 +1100 (EST)
From: Bruce Evans
To: Petr Salinger
Cc: freebsd-bugs@FreeBSD.org, freebsd-gnats-submit@FreeBSD.org
In-Reply-To: <201112050800.pB580uKZ014901@red.freebsd.org>
Message-ID: <20111206004901.Q1446@besplex.bde.org>
References: <201112050800.pB580uKZ014901@red.freebsd.org>
Subject: Re: kern/163076: It is not possible to read in chunks from linprocfs and procfs.

On Mon, 5 Dec 2011, Petr Salinger wrote:

>> Description:
> It is not possible to read in chunks from linprocfs and procfs.
> It is a regression against stable-8.
> I suspect it is due to changes of sbuf implementation between 8 and 9.
>
> Some files are rather big (over 4KB) and it is really standard to read them in blocks.
>> How-To-Repeat:
> "dd if=$FILE bs=1", with FILE any file in procfs or linprocfs
> The result is empty output.

I don't remember this ever working. The correct way to fix it is unclear (start by not claiming that the highly irregular files in procfs are regular), but empty output is unnecessarily bad - I would expect to get at least 1 byte. Under FreeBSD-~5.2, I get the following file sizes:

    file      dd (1 byte)  dd (10k)  dd (1m)  wc | cut...  wc -c   stat
    --------  -----------  --------  -------  -----------  ------  ------
    cmdline   0            6         EIO      6            0       0
    ctl       EBADF        EBADF     EBADF    EBADF        ctl     0
    dbregs    hangs        hangs     hangs    hangs        0       0
    etype     0            14        EIO      14           0       0
    file@     575712       575712    575712   575712       575712  575712
    fpregs    hangs        hangs     hangs    hangs        0       0
    map       0            1150      EIO      1150         0       0
    mem       EBADF        EBADF     EBADF    EBADF        0       0
    note      EBADF        EBADF     EBADF    EBADF        0       0
    notepg    EBADF        EBADF     EBADF    EBADF        0       0
    regs      hangs        hangs     hangs    hangs        0       0
    rlimit    0            65        EIO      65           0       0
    status    0            94        EIO      94           0       0

The irregularity is so large that it confuses wc -c into not working, while plain wc works. This is apparently because wc -c believes the claim that the file is regular, so it stats the file to get its size and finds 0, while plain wc reads the whole file using block size 64K.

(md5 is another utility that is broken on such files, but it is broken even for files that don't claim to be regular. E.g., md5 on /dev/zero (or any device file that you can open) gives the same result as md5 on /dev/null, because it just stats the file, although this is completely wrong for device files. md5 is unbroken on pipes, so you can apply it to device files using the apparent beginner's pessimization "cat /dev/foo | md5". This method works for the irregular regular files in procfs too. You would have to use dd instead of cat to control the block size, and choose a size that is large enough to work and small enough to avoid EIO.)

The *regs files don't block doing the read(), but just loop endlessly trying to read an infinite amount. This is because the uio offset is reset to 0 after each read. ISTR this being done for some other file types.
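
The endless loop on the *regs files can be modelled in a few lines. This is only a userland sketch in Python, not the procfs code: make_resetting_reader(), read_until_eof(), the max_reads cap and the 176-byte snapshot size in the usage note are all hypothetical and exist only for illustration.

```python
def make_resetting_reader(snapshot):
    """Model a *regs-style file: every read starts at offset 0 again.

    `snapshot` stands in for the register contents (an assumption; the real
    files are generated by the kernel on each read).
    """
    def read(length):
        # The offset is reset to 0 after each read, so a zero-length
        # (EOF) read is never returned for a non-empty snapshot.
        return snapshot[:length]
    return read


def read_until_eof(read, block_size, max_reads):
    """Loop like dd or cat: read until a zero-length read signals EOF.

    `max_reads` is a safety cap for this sketch only; a real reader has
    none and would spin forever. Returns (bytes_read, saw_eof).
    """
    total = 0
    for _ in range(max_reads):
        chunk = read(block_size)
        if not chunk:
            return total, True
        total += len(chunk)
    return total, False
```

With a 176-byte snapshot, read_until_eof(make_resetting_reader(b"\x00" * 176), 10 * 1024, 5) gives up after 5 reads without ever seeing EOF, whatever the block size. That matches the "hangs" entries in every dd column of the table.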
Resetting the offset is a different feeble attempt to fix the problem in this PR. The basic problem is that seeking is not implemented for many files, so there is no way to continue reading from the previous uio offset, so the new offset must be either infinity (for most files) or 0 (for regs files). I can now explain more of the above irregularities:

- for tiny files, seeking is easy to implement by sprintf()ing the whole file and using an offset in the string. Either the string must be invariant, or the previously generated string must be saved across reads (saving the string is only reasonable if it is tiny). This (except possibly for sufficient invariance/saving) is done. But some bug breaks reads of size 1. Perhaps this is fixed in -current, or was fixed and has been broken again. dd seems to work with block sizes between 2 and 128k inclusive in cases where it works with a block size of 10k in the above. The 128k limit would be explained by the misimplementation of attempting to malloc() the user-specified read size instead of the tiny size actually needed. The user must not be allowed to malloc() large sizes and there is an arbitrary limit of 128k.

- the regs files are small although not tiny. But they are highly variable so they should be read atomically using read() syscalls. Thus seeking in them is not useful. This should probably be enforced by only allowing the uio offset to be 0 or EOF. Instead, it is only partially enforced by resetting the offset to 0 after each read (I think applications can mess this up by lseek()ing between reads), so callers don't need to do an lseek() for this. This API was invented before pread() existed. pread() should be used now. This API results in casual observers reading the same data endlessly. I sometimes look at these files using hd and would prefer that EOF worked normally for them.

Bruce
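
For reference, the tiny-file case above (generate the whole file, then copy out from an offset in the string) can be sketched as follows. This is a Python model under stated assumptions, not the kernel implementation: read_generated(), read_in_chunks() and the sample status text in the usage note are hypothetical, and the generator is assumed to return the same bytes on every call.

```python
def read_generated(generate, offset, length):
    """Model of a read handler for a tiny generated file.

    The whole file is regenerated on every call and the requested window
    [offset, offset + length) is copied out of it. Only the tiny generated
    text is buffered, never `length` bytes, so a huge user-supplied read
    size costs nothing extra.
    """
    data = generate()
    # Slicing past the end yields b"", which the caller sees as EOF.
    return data[offset:offset + length]


def read_in_chunks(generate, block_size):
    """Reader loop like `dd bs=block_size`: advance the offset until EOF."""
    offset = 0
    out = bytearray()
    while True:
        chunk = read_generated(generate, offset, block_size)
        if not chunk:
            return bytes(out)
        out += chunk
        offset += len(chunk)
```

With a generator such as lambda: b"sh 1234 1 1234 1234 ttyv0 ctty\n", read_in_chunks() returns the same bytes for a block size of 1 as for 10k or 1m. If the generated text changed between calls, the chunks would no longer fit together, which is why the string has to be invariant or saved across reads.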