Date: Wed, 1 Aug 2012 08:12:06 -0600
From: Warner Losh <imp@bsdimp.com>
To: davidxu@freebsd.org
Cc: Konstantin Belousov <kostikbel@gmail.com>, arch@freebsd.org
Subject: Re: short read/write and error code
Message-ID: <D7DC1F82-6CAA-4359-847C-EE89357D8538@bsdimp.com>
In-Reply-To: <5018E1FC.4080609@gmail.com>
References: <5018992C.8000207@freebsd.org> <20120801071934.GJ2676@deviant.kiev.zoral.com.ua> <5018E1FC.4080609@gmail.com>

On Aug 1, 2012, at 1:59 AM, David Xu wrote:

> On 2012/8/1 15:19, Konstantin Belousov wrote:
>> On Wed, Aug 01, 2012 at 10:49:16AM +0800, David Xu wrote:
>>> POSIX requires write() to return actually bytes written, same rule is
>>> applied to read().
>>>
>>> http://pubs.opengroup.org/onlinepubs/009695399/functions/write.html
>>>> RETURN VALUE
>>>>
>>>> Upon successful completion, write() [XSI] and pwrite() shall
>>>> return the number of bytes actually written to the file associated
>>>> with fildes. This number shall never be greater than nbyte.
>>>> Otherwise, -1 shall be returned and errno set to indicate the error.
>>>
>>> http://pubs.opengroup.org/onlinepubs/009695399/functions/read.html
>>>> RETURN VALUE
>>>>
>>>> Upon successful completion, read() [XSI] and pread() shall return
>>>> a non-negative integer indicating the number of bytes actually read.
>>>> Otherwise, the functions shall return -1 and set errno to indicate
>>>> the error.
>> Note that the wording is only about successful return, not for the case
>> when error occured. I do think that if fo_read() returned an error, and
>> error is not of the kind 'interruption', then the error shall be returned
>> as is.
> I do think data is more important than error code. Do you think if a 512 bytes block is bad,
> all bytes in the block should be thrown away while you could really get some bytes from it,
> this might be very important to someone, such as a password or a bank account, this
> is just an example, whether filesystem works in this way is irrelevant.

You do know that with disk drives it is an all or nothing sort of thing
at the sector level. Either you get the whole thing, or you get none of
it. There's no partial sector reads, and there's no way to get the data
generally. Some drives sometimes allow you to access raw tracks, but
those interfaces are never connected to read, but usually an ioctl that
issues the special command and returns the results. And even then, it
returns everything (perhaps including the ECC bytes)

> While program continues to execute, next read()/write() should return -1 and errno will be
> set, I think both socket and pipe already work in this way, it is dofileread/dofilewrite have
> made it not happen.

Usually it is up to the driver to make this decision. Most drivers
already return 0 when they've put any data into the buffer. The case
where there's an error returned from the driver and also data indicated
by resid would be vanishingly small.

>>> I have following patch to fix our code to be compatible with POSIX:
>> ...
>>
>>> -current only resets error code to zero for short write when code is
>>> ERESTART, EINTR or EWOULDBLOCK.
>>> But this is incorrect, at least for pipe, when EPIPE is returned,
>>> some bytes may have already been written. For a named pipe, I may don't
>>> care a reader is disappeared or not, because for named pipe, a new
>>> reader can come in and talk with writer again, so I need to know
>>> how many bytes have been written, same is applied to reader, I don't
>>> care writer is gone, it can come in again and talk with reader. So I
>>> suggest to remove surplus code in -current's dofilewrite() and
>>> dofileread().
>> Then fix the pipe code, and not introduce the behaviour change for all
>> file types ?
> see above, I think data is more important than error code, and next read/write will
> get the error.
>
>>> For EPIPE, We still deliver SIGPIPE to current thread, but returns
>>> actually bytes written.
>> And this sounds wrong. I think that fixing the code for pipes would also
>> semi-magically makes this correct.

Yes. Pipes are too magical and don't match devices very well.

Warner
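
For readers following the thread from the application side, here is a
minimal userland sketch of how a caller can cope with short writes and
still learn how many bytes went out before an error. It is not the
kernel patch under discussion, and it does not touch dofilewrite().
write_all() is a hypothetical helper invented for illustration; only
write(2) and errno are real interfaces here.

```c
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * write_all() is a hypothetical helper, not part of any library.
 * It calls write() until all of buf has been accepted or an error
 * other than EINTR occurs.  *written always receives the number of
 * bytes accepted so far, so the caller sees both the byte count and
 * the error instead of having to choose one.
 */
static int
write_all(int fd, const void *buf, size_t len, size_t *written)
{
	const char *p = buf;
	size_t done = 0;

	while (done < len) {
		ssize_t n = write(fd, p + done, len - done);

		if (n < 0) {
			if (errno == EINTR)
				continue;	/* interrupted: retry this call */
			*written = done;	/* report partial progress */
			return (-1);		/* errno describes the failure */
		}
		done += (size_t)n;	/* short write: loop for the rest */
	}
	*written = done;
	return (0);
}
```

A caller that has SIGPIPE ignored and writes to a pipe with no reader
would get -1 with errno set to EPIPE, and *written would tell it how
much had been accepted by earlier write() calls in the loop.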