Date: Mon, 3 Mar 2003 20:08:59 -0800
From: Sean Chittenden
To: Terry Lambert
Cc: Hiten Pandya, arch@FreeBSD.ORG
Subject: Re: Should sendfile() to return ENOBUFS?
Message-ID: <20030304040859.GB79234@perrin.int.nxad.com>
References: <20030303224418.GU79234@perrin.int.nxad.com>
    <20030304001230.GC36475@unixdaemons.com>
    <20030304002218.GY79234@perrin.int.nxad.com>
    <3E641131.431A0BA8@mindspring.com>
In-Reply-To: <3E641131.431A0BA8@mindspring.com>

> sendfile:
>
>      When using a socket marked for non-blocking I/O, sendfile() may
>      send fewer bytes than requested.  In this case, the number of
>      bytes successfully written is returned in *sbytes (if
>      specified), and the error EAGAIN is returned.
>
> This seems to indicate several things:
>
> 1) The correct error is EAGAIN, *not* ENOBUFS

EAGAIN/EWOULDBLOCK, I'm inclined to agree...

> 2) You need to be damn sure you can guarantee a correct update
>    of *sbytes; I believe this is very difficult in the case in
>    question, which is why it blocks

I'm not convinced of this.  Have you poked through
src/sys/kern/uipc_syscalls.c?  It's not that ugly or hard; nothing's
impossible with a bit of refactoring.

> 3) If sbytes is NULL, you should probably block, even on a
>    non-blocking call.  The reason for this is that there is
>    no way for the application to restart without *sbytes

This degrades terribly, though, and when you get a spike in traffic,
how performance degrades is critical.  Going from a non-blocking
application to a blocking call simply because of high use is
murderous, and is justification enough in itself for me to move away
from the really nice zero-copy sockets that sendfile() affords me,
back to the sluggish writev() syscall.  If a system is busy, it's
stuck in an sfbufa state and blocks the server from servicing
thousands of connections.  The symptoms are common and synonymous
with mbuf exhaustion or any other kind of buffer exhaustion... my
point is that having this block is the worst way that sendfile() can
degrade under high load.

> 4) If you get rid of the blocking with (sbytes == NULL), you
>    better add a BUGS section to the manual page.

There's nothing that says that sbytes can't be set to 0 if errno is
EAGAIN; in fact, that's what it does right now.

> Frankly I'm really surprised that you are blocking in this place; it
> indicates an inability to get a page in the kernel map in the sf
> zone, which, in turn, indicates that your NSFBUFS is improperly
> tuned; if you are using sendfile, and tune up your other kernel
> parameters for your system, don't forget NSFBUFS.

Well, it's set to 65535 at the moment.  How much higher do you think
I should set it?
:-]  At some point I have to say, "it's high enough and I just need to
get the application to degrade gracefully."  :-]

> While you could *technically* make sf_buf_alloc() non-blocking, in
> general this would be a bad idea, given that the one place it's
> called is in an interior loop that can be the subject of a "goto"
> (so it's an embedded interior loop) in sendfile() itself.  I think
> it would be very hard to satisfy #2, to allow it to be restartable
> by the application, in the face of failure, and since *sbytes is not
> a mandatory parameter, likely your application will end up barfing
> (e.g. sending partial FTP files or HTML documents down, with no way
> to recover from a failure, other than closing the client socket, and
> hoping the client can recover).

Frankly, if a developer is stupid enough to pass in NULL for sbytes,
they get what they deserve.  Returning -1 and setting errno to EAGAIN
in the event that there aren't any sf_bufs available isn't what I'd
call the programming exercise of the decade.  :-P

> In a "flash crowd" case on an HTTP server, this basically means that
> you will continuously get retries, and the situation will worsen,
> exponentially, as people retry getting the same page.  In the FTP
> case, or some other protocol without automatic retry on session
> abandonment, of course, it will be fatal.

Hrm, let me redefine "fatal" as "changing the behavior of a system
call to go from returning in less than 0.001ms, to returning in 2-15s
for every connection when trying to make over ~500K sendfile(2) calls
a second."  I'd call that a catastrophic failure to degrade
successfully.
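
To make the restart argument concrete, here is a minimal sketch (not
code from uipc_syscalls.c or from any real server) of how an
application could resume from *sbytes when a non-blocking sendfile()
fails with EAGAIN.  The names send_state, send_some, sendfile_fn and
fake_sendfile are all hypothetical; fake_sendfile only stands in for
the kernel so the sketch runs anywhere.  On FreeBSD the callback would
presumably be a thin wrapper around
sendfile(fd, s, offset, nbytes, NULL, &sbytes, 0).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical callback with the shape of FreeBSD's sendfile(2), minus
 * hdtr and flags: returns 0 or -1 with errno set, and reports the bytes
 * actually sent through *sbytes. */
typedef int (*sendfile_fn)(int fd, int s, off_t offset, size_t nbytes,
    off_t *sbytes);

enum send_status { SEND_DONE, SEND_AGAIN, SEND_ERROR };

/* Per-connection progress, kept by the application between attempts. */
struct send_state {
	off_t  offset;    /* next file offset to send */
	size_t remaining; /* bytes still to send */
};

/* Try to push the rest of the file.  Partial progress is always recorded,
 * so after SEND_AGAIN the caller can go back to its event loop and call
 * again later with the same state. */
static enum send_status
send_some(sendfile_fn fn, int fd, int s, struct send_state *st)
{
	assert(st != NULL);
	for (;;) {
		off_t sbytes = 0;
		int rc, saved;

		if (st->remaining == 0)
			return SEND_DONE; /* never pass nbytes == 0 ("to EOF") */

		rc = fn(fd, s, st->offset, st->remaining, &sbytes);
		saved = errno;

		if (sbytes > 0) {
			size_t sent = (size_t)sbytes;
			if (sent > st->remaining)
				sent = st->remaining;
			st->offset += (off_t)sent;
			st->remaining -= sent;
		}
		if (rc == 0 || st->remaining == 0)
			return SEND_DONE;
		if (saved == EAGAIN)
			return SEND_AGAIN; /* out of buffers or socket full */
		if (saved == EINTR)
			continue;          /* interrupted, retry immediately */
		errno = saved;
		return SEND_ERROR;
	}
}

/* Stand-in for the kernel (assumption, for illustration only): "sends"
 * at most 4 bytes per call and fails with EAGAIN if more was requested. */
static int
fake_sendfile(int fd, int s, off_t offset, size_t nbytes, off_t *sbytes)
{
	size_t chunk = nbytes < 4 ? nbytes : 4;

	(void)fd; (void)s; (void)offset;
	if (sbytes != NULL)
		*sbytes = (off_t)chunk;
	if (chunk < nbytes) {
		errno = EAGAIN;
		return -1;
	}
	return 0;
}
```

The point of the sketch is only that a caller which passes a non-NULL
sbytes has everything it needs to pick up where it left off; a caller
that passes NULL has no offset to resume from.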

-sc

-- 
Sean Chittenden