Date: Mon, 9 Oct 2017 17:01:03 -0700 From: Jason Eggleston <eggnet@gmail.com> To: Gleb Smirnoff <glebius@freebsd.org> Cc: Sean Bruno <sbruno@freebsd.org>, src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org Subject: Re: svn commit: r324405 - head/sys/kern Message-ID: <CAC27FbXMinj_7sLmMP9XpixiwYkiUd9RYQO2KiuvgXXP5P6EXQ@mail.gmail.com> In-Reply-To: <20171009232052.GU1055@FreeBSD.org> References: <201710072330.v97NUv7E040636@repo.freebsd.org> <20171009232052.GU1055@FreeBSD.org>
next in thread | previous in thread | raw e-mail | index | archive | help
In tcp_input.c, this is where a RST is handled. so_error gets ECONNRESET, then tcp_close() is called. case TCPS_ESTABLISHED: case TCPS_FIN_WAIT_1: case TCPS_FIN_WAIT_2: case TCPS_CLOSE_WAIT: case TCPS_CLOSING: case TCPS_LAST_ACK: so->so_error = ECONNRESET; close: /* FALLTHROUGH */ default: tp = tcp_close(tp); tcp_close() calls soisdisconnected() which sets this flag: so->so_state |= SS_ISDISCONNECTED; Followed by: ... socantrcvmore_locked(so); ... socantsendmore_locked(so); ... Those two functions set these flags: so->so_rcv.sb_state |= SBS_CANTRCVMORE; so->so_snd.sb_state |= SBS_CANTSENDMORE; A TCP session cannot survive a RST, and i/o calls will not work afterword, no matter how many times you try. Having said that, there's a race condition between when so_error is set and when the socket and socket buffer flags are set, and I agree with your strategy of: - have sendfile() match send() - The current send() behavior is probably fine as is I agree sendfile() should match write() and send(), by checking in this order: SBS_CANTSENDMORE so_error SS_ISCONNECTED and will prep a new review with your patch. On Mon, Oct 9, 2017 at 4:20 PM, Gleb Smirnoff <glebius@freebsd.org> wrote: > Sean & Jason, > > On Sat, Oct 07, 2017 at 11:30:57PM +0000, Sean Bruno wrote: > S> Author: sbruno > S> Date: Sat Oct 7 23:30:57 2017 > S> New Revision: 324405 > S> URL: https://svnweb.freebsd.org/changeset/base/324405 > S> > S> Log: > S> Check so_error early in sendfile() call. Prior to this patch, if a > S> connection was reset by the remote end, sendfile() would just report > S> ENOTCONN instead of ECONNRESET. > S> > S> Submitted by: Jason Eggleston <jason@eggnet.com> > S> Reviewed by: glebius > S> Sponsored by: Limelight Networks > S> Differential Revision: https://reviews.freebsd.org/D12575 > S> > S> Modified: > S> head/sys/kern/kern_sendfile.c > S> > S> Modified: head/sys/kern/kern_sendfile.c > S> ============================================================ > ================== > S> --- head/sys/kern/kern_sendfile.c Sat Oct 7 23:10:16 2017 > (r324404) > S> +++ head/sys/kern/kern_sendfile.c Sat Oct 7 23:30:57 2017 > (r324405) > S> @@ -514,6 +514,11 @@ sendfile_getsock(struct thread *td, int s, struct > file > S> *so = (*sock_fp)->f_data; > S> if ((*so)->so_type != SOCK_STREAM) > S> return (EINVAL); > S> + if ((*so)->so_error) { > S> + error = (*so)->so_error; > S> + (*so)->so_error = 0; > S> + return (error); > S> + } > S> if (((*so)->so_state & SS_ISCONNECTED) == 0) > S> return (ENOTCONN); > S> return (0); > > Despite my initial positive review, now I am quite unsure on that. > > Problem is that sendfile(2) isn't defined by SUS, so there is no > distinctive final answer on that. Should we match other OSes? > Should we match our historic behaviour? Or should we match > write(2)/send(2) to socket, which are closest analogy. I probably > believe in the latter: sendfile(2) belongs to write(2)/send(2) > family. > > SUS specifies that write may return ECONNRESET. It also documents > EPIPE. However, our manual page documents only EPIPE for both > send(2) and write(2). For write we have: > > SOCKBUF_LOCK(&so->so_snd); > if (so->so_snd.sb_state & SBS_CANTSENDMORE) { > SOCKBUF_UNLOCK(&so->so_snd); > error = EPIPE; > goto out; > } > if (so->so_error) { > error = so->so_error; > so->so_error = 0; > SOCKBUF_UNLOCK(&so->so_snd); > goto out; > } > > Indeed, EPIPE will be returned prior to return/clear of so_error, > which supposedly is ECONNRESET. > > In the sendfile(2) implementation we see exactly same code: > > if (so->so_snd.sb_state & SBS_CANTSENDMORE) { > error = EPIPE; > SOCKBUF_UNLOCK(&so->so_snd); > goto done; > } else if (so->so_error) { > error = so->so_error; > so->so_error = 0; > SOCKBUF_UNLOCK(&so->so_snd); > goto done; > } > > But it isn't reached. Before due to SS_ISCONNECTED check, now > due to your change. Now we got two spots where so_error is > returned/cleared. > > For the receive family (read(2) and recv(2)), SUS specifies > ECONNRESET and our code does that. It also clears so_error. > > So, after your change at least one standard case is broken: an > application that polls/selects for both readability and > writeability of a socket. It discovers that socket is available > for writing, does sendfile(2), receives ECONNRESET. Then it does > read on the socket, and blocks(?) or returns a bogus error(?), > not sure. But definitely not ECONNRESET, since it is now cleared. > > Now, where does this ENOTCONN come from? These two lines: > > if (((*so)->so_state & SS_ISCONNECTED) == 0) > return (ENOTCONN); > > can be traced back all the way to original 1998 commit of sendfile. > > Here comes difference between sendfile(2) and send(2). In send(2), > this check is done _after_ so_error check, so your change is correct > in placing of so_error check wrt so_state check, but is incorrect > in placing wrt socket buffer state check. > > After all this considerations, I would suggest to revert your change, > and apply this patch. It would make behavior of sendfile(2) to match > send(2). I would appreciate more reviews around it. > > -- > Gleb Smirnoff >
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?CAC27FbXMinj_7sLmMP9XpixiwYkiUd9RYQO2KiuvgXXP5P6EXQ>