From owner-freebsd-arch@FreeBSD.ORG Wed Aug 1 16:28:44 2012
Date: Wed, 1 Aug 2012 19:28:36 +0300
From: Konstantin Belousov
To: Bruce Evans
Cc: arch@freebsd.org, David Xu
Message-ID: <20120801162836.GO2676@deviant.kiev.zoral.com.ua>
In-Reply-To: <20120801183240.K1291@besplex.bde.org>
References: <5018992C.8000207@freebsd.org> <20120801071934.GJ2676@deviant.kiev.zoral.com.ua> <20120801183240.K1291@besplex.bde.org>
Mime-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="uD6Il+FtNNOLHt4d"
Content-Disposition: inline
Subject:
Re: short read/write and error code
List-Id: Discussion related to FreeBSD architecture

--uD6Il+FtNNOLHt4d
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Wed, Aug 01, 2012 at 07:23:09PM +1000, Bruce Evans wrote:
> On Wed, 1 Aug 2012, Konstantin Belousov wrote:
>
> >On Wed, Aug 01, 2012 at 10:49:16AM +0800, David Xu wrote:
> >>POSIX requires write() to return the bytes actually written, same rule is
> >>applied to read().
> >>
> >>http://pubs.opengroup.org/onlinepubs/009695399/functions/write.html
> >>>RETURN VALUE
> >>>
> >>>Upon successful completion, write() [XSI] and pwrite() shall
> >>>return the number of bytes actually written to the file associated
> >>>with fildes. This number shall never be greater than nbyte.
> >>>Otherwise, -1 shall be returned and errno set to indicate the error.
> >>
> >>http://pubs.opengroup.org/onlinepubs/009695399/functions/read.html
> >>>RETURN VALUE
> >>>
> >>>Upon successful completion, read() [XSI] and pread() shall return
> >>>a non-negative integer indicating the number of bytes actually read.
> >>>Otherwise, the functions shall return -1 and set errno to indicate
> >>>the error.
> >Note that the wording is only about successful return, not for the case
> >when error occurred. I do think that if fo_read() returned an error, and
> >error is not of the kind 'interruption', then the error shall be returned
> >as is.
>
> That is clearly not what is intended. write() is unusable if it won't
> tell you how many bytes it wrote. According to your interpretation,
> recalcitrantix would conform to POSIX if all its writes wrote whatever
> they could and then returned -1 after detecting the error EPOSIXFUZZY.
I think this is obvious pull, because no useful implementation would
insert an _artificial_ error.
>
> The usability is specified for signals. From an old POSIX draft:
>
> % 51235 If write( ) is interrupted by a signal before it writes any data, it shall return -1 with errno set to
> % 51236 [EINTR].
> % 51237 If write( ) is interrupted by a signal after it successfully writes some data, it shall return the
> % 51238 number of bytes written.
This is exactly what existing code does.
>
> POSIX formally defines "Successfully Transferred", mainly for aio. I
> couldn't find any formal definition of "successfully writes", but clearly
> it is nonsense for a write to be unsuccessful if a reader on the local
> system or on an external system has successfully read some of the data
> written by the write.
>
> FreeBSD does try to convert EINTR to 0 after some data has been written,
> to conform to the above. SIGPIPE should return EINTR to be returned to
> dofilewrite(), so there should be no problem for SIGPIPE. But we were
> reminded of this old FreeBSD bug by problems with SIGPIPE.
Sorry, I do not understand this, esp. the second sentence. As I said, the
patch behaviour in regard of SIGPIPE is just wrong.
>
> POSIX contradicts itself by disallowing successful completion if _any_
> error is detected:
>
> % 435 RETURN VALUE
> % 436 This section indicates the possible return values, if any.
> % 437 If the implementation can detect errors, ``successful completion'' means that no error
> % 438 has been detected during execution of the function. If the implementation does detect
>
> Recalcitrantix has 2 versions according to which of these contradictions
> has precedence. In one version, writes do as much as possible before
> returning -1/EPOSIXFUZZY, as above. In the other version, this still
> happens for most writes.
> But ones that are interrupted by a signal after
> having written some data return the number of bytes written, according to
> the "shall" for the interrupted case. Perhaps there are some other weird
> cases where writes are required to work :-).
>
> >>I have following patch to fix our code to be compatible with POSIX:
> >...
> >
> >>-current only resets error code to zero for short write when code is
> >>ERESTART, EINTR or EWOULDBLOCK.
> >>But this is incorrect, at least for pipe, when EPIPE is returned,
> >>some bytes may have already been written. For a named pipe, I may not
> >>care a reader is disappeared or not, because for named pipe, a new
> >>reader can come in and talk with writer again, so I need to know
> >>how many bytes have been written, same is applied to reader, I don't
> >>care writer is gone, it can come in again and talk with reader. So I
> >>suggest to remove surplus code in -current's dofilewrite() and
> >>dofileread().
> >Then fix the pipe code, and not introduce the behaviour change for all
> >file types ?
>
> Because returning the error to userland breaks all file types that
> want to return a short i/o (mainly special files whose i/o cannot be
> backed out of). They are just detecting and returning an error as a
> courtesy to upper layers, and to simplify the implementation. The
> syscall API doesn't permit returning both the error code (the reason
> for the short i/o) and the short count, so the error code must be
> cleared to allow the short count to be returned.
No, there is only one sane behaviour for fo_read and fo_write: to
either return no error (or an interruption error) and adjust resid, or
return an error. Returning an error and also adjusting resid is just wrong.
The proposed patch makes the generic i/o layer much less flexible, and
probably prevents implementation of things like transactional writes.
We should fix sys_pipe.c and not require filesystems to roll back uio
into inconsistent state to report errors (since rolling back into
consistent state is typically impossible but is required after the
patch).
>
> Bruce

--uD6Il+FtNNOLHt4d
Content-Type: application/pgp-signature
Content-Disposition: inline

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (FreeBSD)

iEYEARECAAYFAlAZWTQACgkQC3+MBN1Mb4ieaACg5Jt2PwJqw/VtVZ7ovRPGbUZw
ec0AniWjpRP6WRWOaXO9GZxEGZAiJy2M
=D1/n
-----END PGP SIGNATURE-----

--uD6Il+FtNNOLHt4d--
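
For readers following the thread, the two policies under discussion can be sketched in plain C. This is only an illustration of the behaviour as described in the messages above, not the actual dofilewrite() code: the names io_result, current_policy, patched_policy and SKETCH_ERESTART are hypothetical, and the "patched" variant reflects one reading of the proposal (clear any error once some bytes have been transferred).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* ERESTART is kernel-internal and not exported to userland;
 * this stand-in value is an assumption made for the sketch. */
#define SKETCH_ERESTART (-1)

/* Hypothetical model of what the syscall layer hands back to userland. */
struct io_result {
    long retval;  /* byte count on success, -1 on failure */
    int  error;   /* errno value when retval is -1, otherwise 0 */
};

/* Behaviour attributed to -current in the thread: a short transfer
 * hides the error only for ERESTART, EINTR and EWOULDBLOCK. */
static struct io_result
current_policy(int error, size_t nbyte, size_t resid)
{
    struct io_result r;
    size_t done;

    assert(resid <= nbyte);
    done = nbyte - resid;  /* bytes actually transferred */
    if (error != 0 && done > 0 &&
        (error == SKETCH_ERESTART || error == EINTR || error == EWOULDBLOCK))
        error = 0;
    r.error = error;
    r.retval = (error != 0) ? -1 : (long)done;
    return r;
}

/* One reading of the proposed patch: any error is cleared as soon as
 * at least one byte was transferred, so the short count is returned. */
static struct io_result
patched_policy(int error, size_t nbyte, size_t resid)
{
    struct io_result r;
    size_t done;

    assert(resid <= nbyte);
    done = nbyte - resid;
    if (error != 0 && done > 0)
        error = 0;
    r.error = error;
    r.retval = (error != 0) ? -1 : (long)done;
    return r;
}
```

With a 100-byte write that moved 40 bytes before EPIPE, current_policy reports -1/EPIPE while patched_policy reports 40. The objection in the reply above is that the second form forces every fo_read/fo_write implementation to leave resid meaningful even when it returns a hard error.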