From owner-freebsd-hackers@FreeBSD.ORG  Thu Feb 18 11:59:40 2010
Return-Path: <owner-freebsd-hackers@FreeBSD.ORG>
Delivered-To: freebsd-hackers@FreeBSD.ORG
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id C0926106566C
	for <freebsd-hackers@FreeBSD.ORG>; Thu, 18 Feb 2010 11:59:40 +0000 (UTC)
	(envelope-from rwatson@FreeBSD.org)
Received: from cyrus.watson.org (cyrus.watson.org [65.122.17.42])
	by mx1.freebsd.org (Postfix) with ESMTP id 7C18C8FC15
	for <freebsd-hackers@FreeBSD.ORG>; Thu, 18 Feb 2010 11:59:40 +0000 (UTC)
Received: from fledge.watson.org (fledge.watson.org [65.122.17.41])
	by cyrus.watson.org (Postfix) with ESMTPS id 19C4346B0C;
	Thu, 18 Feb 2010 06:59:40 -0500 (EST)
Date: Thu, 18 Feb 2010 11:59:40 +0000 (GMT)
From: Robert Watson <rwatson@FreeBSD.org>
X-X-Sender: robert@fledge.watson.org
To: Mikolaj Golub <to.my.trociny@gmail.com>
In-Reply-To: <86ocjn3rue.fsf@zhuzha.ua1>
Message-ID: <alpine.BSF.2.00.1002181157450.59664@fledge.watson.org>
References: <86ocjn3rue.fsf@zhuzha.ua1>
User-Agent: Alpine 2.00 (BSF 1167 2008-08-23)
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Cc: freebsd-hackers@FreeBSD.ORG
Subject: Re: unix socket: race on close?
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
	<freebsd-hackers.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
	<mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>,
	<mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 18 Feb 2010 11:59:40 -0000

On Thu, 18 Feb 2010, Mikolaj Golub wrote:

> Below is a simple test code with unix sockets: the client does 
> connect()/close() in loop and the server -- accept()/close().
>
> Sometimes close() fails with 'Socket is not connected' error:

Hi Mikolaj:

Thanks for this report, and sorry about not spotting your earlier post to 
freebsd-net.  I've been fairly preoccupied the last month and not keeping up 
with the mailing lists.  Could I ask you to file a PR on this, and forward me 
the PR number so I can claim ownership?  This should prevent it from getting 
lost while I catch up.

In short, your evaluation seems reasonable to me -- have you tried tweaking 
soclose() to ignore ENOTCONN from sodisconnect() to confirm this diagnosis 
fixes all the instances you've been seeing?

Robert N M Watson
Computer Laboratory
University of Cambridge


>
> a.out: parent: close error: 57
>
> or
>
> a.out: child: close error: 57
>
> It looks for me like some race in close(). Looking at uipc_socket.c:soclose():
>
> int
> soclose(struct socket *so)
> {
>        int error = 0;
>
>        KASSERT(!(so->so_state & SS_NOFDREF), ("soclose: SS_NOFDREF on enter"));
>
>        CURVNET_SET(so->so_vnet);
>        funsetown(&so->so_sigio);
>        if (so->so_state & SS_ISCONNECTED) {
>                if ((so->so_state & SS_ISDISCONNECTING) == 0) {
>                        error = sodisconnect(so);
>                        if (error)
>                                goto drop;
>                }
>
> Isn't the problem here? so_state is checked for SS_ISCONNECTED and
> SS_ISDISCONNECTING without locking and then sodisconnect() is called, which
> closes both sockets of the connection. So it looks for me that if the close()
> is called for both ends simultaneously it is possible that sodisconnect() will
> be called for both ends and for one ENOTCONN will be returned. Or may I have
> missed something?
>
> We have been observing periodically ENOTCONN errors on unix socket close in
> our applications, so it is not just curiosity :-) (I posted about our problem
> to freebsd-net@ some time ago but then did not attract any attention
> http://lists.freebsd.org/pipermail/freebsd-net/2009-December/024047.html).
>
> #include <sys/types.h>
> #include <sys/socket.h>
> #include <sys/un.h>
> #include <netinet/in.h>
> #include <arpa/inet.h>
> #include <errno.h>
> #include <fcntl.h>
> #include <stdio.h>
> #include <strings.h>
> #include <string.h>
> #include <unistd.h>
> #include <sys/select.h>
> #include <err.h>
>
> #define UNIXSTR_PATH "/tmp/mytest.socket"
> #define USLEEP  100
>
> int main(int argc, char **argv)
> {
> 	int			listenfd, connfd, pid;
> 	struct sockaddr_un	servaddr;
>
> 	pid = fork();
> 	if (-1 == pid)
> 		errx(1, "fork(): %d", errno);
>
> 	if (0 != pid) { /* parent */
>
> 		if ((listenfd = socket(AF_LOCAL, SOCK_STREAM, 0)) < 0)
> 			errx(1, "parent: socket error: %d", errno);
>
> 		unlink(UNIXSTR_PATH);
> 		bzero(&servaddr, sizeof(servaddr));
> 		servaddr.sun_family = AF_LOCAL;
> 		strcpy(servaddr.sun_path, UNIXSTR_PATH);
>
> 		if (bind(listenfd, (struct sockaddr *) &servaddr, sizeof(servaddr)) < 0)
> 			errx(1, "parent: bind error: %d", errno);
>
> 		if (listen(listenfd, 1024) < 0)
> 			errx(1, "parent: listen error: %d", errno);
>
> 		for ( ; ; ) {
> 			if ((connfd = accept(listenfd, (struct sockaddr *) NULL, NULL)) < 0)
> 				errx(1, "parent: accept error: %d", errno);
>
> 			//usleep(USLEEP / 2); // (I) uncomment this or (II) below to avoid the race
>
> 	        	if (close(connfd) < 0)
> 				errx(1, "parent: close error: %d", errno);
> 		}
>
> 	} else { /* child */
>
> 		sleep(1); /* give the parent some time to create the socket */
>
> 		for ( ; ; ) {
>
> 			if ((connfd = socket(AF_LOCAL, SOCK_STREAM, 0)) < 0)
> 				errx(1, "child: socket error: %d", errno);
>
> 			bzero(&servaddr, sizeof(servaddr));
> 			servaddr.sun_family = AF_LOCAL;
> 			strcpy(servaddr.sun_path, UNIXSTR_PATH);
>
> 			if (connect(connfd, (struct sockaddr *) &servaddr, sizeof(servaddr)) < 0)
> 				errx(1, "child: connect error %d", errno);
>
> 			// usleep(USLEEP); // (II) uncomment this or (I) above to avoid the race
>
> 			if (close(connfd) != 0)
> 				errx(1, "child: close error: %d", errno);
>
> 			usleep(USLEEP);
> 		}
> 	}
>
> 	return 0;
> }
>
> -- 
> Mikolaj Golub
> _______________________________________________
> freebsd-hackers@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
> To unsubscribe, send any mail to "freebsd-hackers-unsubscribe@freebsd.org"
>