From: Warner Losh
Date: Wed, 1 Aug 2012 08:12:06 -0600
To: davidxu@freebsd.org
Cc: Konstantin Belousov, arch@freebsd.org
Subject: Re: short read/write and error code

On Aug 1, 2012, at 1:59 AM, David Xu wrote:

> On 2012/8/1 15:19, Konstantin Belousov wrote:
>> On Wed, Aug 01, 2012 at 10:49:16AM +0800, David Xu wrote:
>>> POSIX requires write() to return actually bytes written, same rule is
>>> applied to read().
>>>
>>> http://pubs.opengroup.org/onlinepubs/009695399/functions/write.html
>>>> RETURN VALUE
>>>>
>>>> Upon successful completion, write() [XSI] and pwrite() shall
>>>> return the number of bytes actually written to the file associated
>>>> with fildes. This number shall never be greater than nbyte.
>>>> Otherwise, -1 shall be returned and errno set to indicate the error.
>>>
>>> http://pubs.opengroup.org/onlinepubs/009695399/functions/read.html
>>>> RETURN VALUE
>>>>
>>>> Upon successful completion, read() [XSI] and pread() shall return
>>>> a non-negative integer indicating the number of bytes actually read.
>>>> Otherwise, the functions shall return -1 and set errno to indicate
>>>> the error.
>> Note that the wording is only about successful return, not for the case
>> when error occurred. I do think that if fo_read() returned an error, and
>> error is not of the kind 'interruption', then the error shall be returned
>> as is.
> I do think data is more important than error code. Do you think if a
> 512 bytes block is bad, all bytes in the block should be thrown away
> while you could really get some bytes from it, this might be very
> important to someone, such as a password or a bank account, this is
> just an example, whether filesystem works in this way is irrelevant.

You do know that with disk drives it is an all-or-nothing sort of thing
at the sector level? Either you get the whole sector, or you get none
of it. There are no partial sector reads, and generally there is no way
to get at the data in a bad sector. Some drives allow you to access raw
tracks, but those interfaces are never connected to read(); they are
usually reached through an ioctl that issues the special command and
returns the results. And even then, it returns everything (perhaps
including the ECC bytes).

> While program continues to execute, next read()/write() should return
> -1 and errno will be set, I think both socket and pipe already work in
> this way, it is dofileread/dofilewrite that have made it not happen.

Usually it is up to the driver to make this decision. Most drivers
already return 0 (no error) when they've put any data into the buffer.
The cases where the driver returns an error and resid also indicates
that data was transferred would be vanishingly rare.

>>> I have following patch to fix our code to be compatible with POSIX:
>> ...
>>
>>> -current only resets error code to zero for short write when code is
>>> ERESTART, EINTR or EWOULDBLOCK.
>>> But this is incorrect, at least for pipe, when EPIPE is returned,
>>> some bytes may have already been written. For a named pipe, I may not
>>> care whether a reader has disappeared or not, because for named pipe,
>>> a new reader can come in and talk with writer again, so I need to know
>>> how many bytes have been written, same is applied to reader, I don't
>>> care writer is gone, it can come in again and talk with reader. So I
>>> suggest to remove surplus code in -current's dofilewrite() and
>>> dofileread().
>> Then fix the pipe code, and not introduce the behaviour change for all
>> file types?
> see above, I think data is more important than error code, and next
> read/write will get the error.
>
>>> For EPIPE, We still deliver SIGPIPE to current thread, but returns
>>> actually bytes written.
>> And this sounds wrong. I think that fixing the code for pipes would also
>> semi-magically make this correct.

Yes. Pipes are too magical and don't match devices very well.

Warner
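
As a rough illustration of the caller-visible behaviour David is arguing
for, here is a minimal Python sketch of a write loop that keeps the count
of bytes already transferred when a later write() fails, so both the
partial count and the error are available to the caller. The helper name
write_all and its (count, errno) return shape are assumptions made for
this example; this is not FreeBSD code and says nothing about how
dofilewrite() is actually implemented.

```python
import errno
import os


def write_all(fd, data):
    """Write data to fd and return (bytes_written, errno_or_None).

    Hypothetical helper: the partial count is preserved even when a
    later os.write() call fails, instead of being discarded in favour
    of the error.
    """
    view = memoryview(data)
    written = 0
    while written < len(view):
        try:
            # os.write() may transfer fewer bytes than requested
            # (a short write); it returns the number actually written.
            n = os.write(fd, view[written:])
        except OSError as e:
            # Report the error together with what got through so far.
            return written, e.errno
        written += n
    return written, None
```

A caller can act on the count first and the error second, for example
resuming at offset bytes_written once a new reader has opened a named
pipe. CPython ignores SIGPIPE by default, so a write to a pipe with no
reader surfaces as EPIPE rather than killing the process.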