Date: Thu, 18 Feb 2010 11:59:40 +0000 (GMT) From: Robert Watson <rwatson@FreeBSD.org> To: Mikolaj Golub <to.my.trociny@gmail.com> Cc: freebsd-hackers@FreeBSD.ORG Subject: Re: unix socket: race on close? Message-ID: <alpine.BSF.2.00.1002181157450.59664@fledge.watson.org> In-Reply-To: <86ocjn3rue.fsf@zhuzha.ua1> References: <86ocjn3rue.fsf@zhuzha.ua1>
next in thread | previous in thread | raw e-mail | index | archive | help
On Thu, 18 Feb 2010, Mikolaj Golub wrote: > Below is a simple test code with unix sockets: the client does > connect()/close() in loop and the server -- accept()/close(). > > Sometimes close() fails with 'Socket is not connected' error: Hi Mikolaj: Thanks for this report, and sorry about not spotting your earlier post to freebsd-net. I've been fairly preoccupied the last month and not keeping up with the mailing lists. Could I ask you to file a PR on this, and forward me the PR number so I can claim ownership? This should prevent it from getting lost while I catch up. In short, your evaluation seems reasonable to me -- have you tried tweaking soclose() to ignore ENOTCONN from sodisconnect() to confirm this diagnosis fixes all the instances you've been seeing? Robert N M Watson Computer Laboratory University of Cambridge > > a.out: parent: close error: 57 > > or > > a.out: child: close error: 57 > > It looks for me like some race in close(). Looking at uipc_socket.c:soclose(): > > int > soclose(struct socket *so) > { > int error = 0; > > KASSERT(!(so->so_state & SS_NOFDREF), ("soclose: SS_NOFDREF on enter")); > > CURVNET_SET(so->so_vnet); > funsetown(&so->so_sigio); > if (so->so_state & SS_ISCONNECTED) { > if ((so->so_state & SS_ISDISCONNECTING) == 0) { > error = sodisconnect(so); > if (error) > goto drop; > } > > Isn't the problem here? so_state is checked for SS_ISCONNECTED and > SS_ISDISCONNECTING without locking and then sodisconnect() is called, which > closes both sockets of the connection. So it looks for me that if the close() > is called for both ends simultaneously it is possible that sodisconnect() will > be called for both ends and for one ENOTCONN will be returned. Or may I have > missed something? > > We have been observing periodically ENOTCONN errors on unix socket close in > our applications, so it is not just curiosity :-) (I posted about our problem > to freebsd-net@ some time ago but then did not attract any attention > http://lists.freebsd.org/pipermail/freebsd-net/2009-December/024047.html). > > #include <sys/types.h> > #include <sys/socket.h> > #include <sys/un.h> > #include <netinet/in.h> > #include <arpa/inet.h> > #include <errno.h> > #include <fcntl.h> > #include <stdio.h> > #include <strings.h> > #include <string.h> > #include <unistd.h> > #include <sys/select.h> > #include <err.h> > > #define UNIXSTR_PATH "/tmp/mytest.socket" > #define USLEEP 100 > > int main(int argc, char **argv) > { > int listenfd, connfd, pid; > struct sockaddr_un servaddr; > > pid = fork(); > if (-1 == pid) > errx(1, "fork(): %d", errno); > > if (0 != pid) { /* parent */ > > if ((listenfd = socket(AF_LOCAL, SOCK_STREAM, 0)) < 0) > errx(1, "parent: socket error: %d", errno); > > unlink(UNIXSTR_PATH); > bzero(&servaddr, sizeof(servaddr)); > servaddr.sun_family = AF_LOCAL; > strcpy(servaddr.sun_path, UNIXSTR_PATH); > > if (bind(listenfd, (struct sockaddr *) &servaddr, sizeof(servaddr)) < 0) > errx(1, "parent: bind error: %d", errno); > > if (listen(listenfd, 1024) < 0) > errx(1, "parent: listen error: %d", errno); > > for ( ; ; ) { > if ((connfd = accept(listenfd, (struct sockaddr *) NULL, NULL)) < 0) > errx(1, "parent: accept error: %d", errno); > > //usleep(USLEEP / 2); // (I) uncomment this or (II) below to avoid the race > > if (close(connfd) < 0) > errx(1, "parent: close error: %d", errno); > } > > } else { /* child */ > > sleep(1); /* give the parent some time to create the socket */ > > for ( ; ; ) { > > if ((connfd = socket(AF_LOCAL, SOCK_STREAM, 0)) < 0) > errx(1, "child: socket error: %d", errno); > > bzero(&servaddr, sizeof(servaddr)); > servaddr.sun_family = AF_LOCAL; > strcpy(servaddr.sun_path, UNIXSTR_PATH); > > if (connect(connfd, (struct sockaddr *) &servaddr, sizeof(servaddr)) < 0) > errx(1, "child: connect error %d", errno); > > // usleep(USLEEP); // (II) uncomment this or (I) above to avoid the race > > if (close(connfd) != 0) > errx(1, "child: close error: %d", errno); > > usleep(USLEEP); > } > } > > return 0; > } > > -- > Mikolaj Golub > _______________________________________________ > freebsd-hackers@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-hackers > To unsubscribe, send any mail to "freebsd-hackers-unsubscribe@freebsd.org" >
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?alpine.BSF.2.00.1002181157450.59664>