From: Mikolaj Golub
To: freebsd-hackers@FreeBSD.ORG
Date: Thu, 18 Feb 2010 09:50:33 +0200
Subject: unix socket: race on close?
Organization: TOA Ukraine
Message-ID: <86ocjn3rue.fsf@zhuzha.ua1>

Hi,

Below is a simple test program using unix sockets: the client does
connect()/close() in a loop and the server does accept()/close().
Sometimes close() fails with a 'Socket is not connected' error (errno 57,
ENOTCONN):

a.out: parent: close error: 57

or

a.out: child: close error: 57

It looks to me like a race in close(). Looking at
uipc_socket.c:soclose():

int
soclose(struct socket *so)
{
	int error = 0;

	KASSERT(!(so->so_state & SS_NOFDREF), ("soclose: SS_NOFDREF on enter"));

	CURVNET_SET(so->so_vnet);
	funsetown(&so->so_sigio);
	if (so->so_state & SS_ISCONNECTED) {
		if ((so->so_state & SS_ISDISCONNECTING) == 0) {
			error = sodisconnect(so);
			if (error)
				goto drop;
		}

Isn't the problem here? so_state is checked for SS_ISCONNECTED and
SS_ISDISCONNECTING without locking, and then sodisconnect() is called,
which disconnects both sockets of the connection. So it looks to me that
if close() is called on both ends simultaneously, both callers can pass
the check and both can call sodisconnect(); the one that gets there
second then finds its socket already disconnected and gets ENOTCONN.

Or have I missed something?

We have been periodically observing ENOTCONN errors on unix socket close
in our applications, so this is not just curiosity :-) (I posted about
our problem to freebsd-net@ some time ago, but it did not attract any
attention:
http://lists.freebsd.org/pipermail/freebsd-net/2009-December/024047.html).
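
To make the suspected interleaving concrete, here is a minimal user-space
sketch of it. This is only a model under my assumptions, not kernel code:
the model_* names and struct are hypothetical, "connected" stands in for
SS_ISCONNECTED, and model_disconnect() assumes that a disconnect clears
the flag on both ends and returns ENOTCONN when the socket is already
disconnected.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a connected unix socket endpoint. */
struct model_sock {
	bool connected;			/* models SS_ISCONNECTED */
	struct model_sock *peer;	/* other end of the connection */
};

/* Connect two model endpoints to each other. */
static void
model_pair(struct model_sock *a, struct model_sock *b)
{
	assert(a != NULL && b != NULL && a != b);
	a->connected = b->connected = true;
	a->peer = b;
	b->peer = a;
}

/* Step 1 of close: the unlocked state check. */
static bool
model_check(const struct model_sock *so)
{
	return (so->connected);
}

/* Assumed disconnect behaviour: fails if already disconnected. */
static int
model_disconnect(struct model_sock *so)
{
	if (!so->connected)
		return (ENOTCONN);
	so->connected = false;
	if (so->peer != NULL)
		so->peer->connected = false;
	return (0);
}

/* Step 2 of close: act on the (possibly stale) result of step 1. */
static int
model_finish(struct model_sock *so, bool was_connected)
{
	return (was_connected ? model_disconnect(so) : 0);
}

/*
 * Close both ends. With interleaved == true, both checks happen before
 * either disconnect, which is the ordering suspected in the race.
 */
static void
model_run(bool interleaved, int *err_a, int *err_b)
{
	struct model_sock a, b;
	bool ca, cb;

	model_pair(&a, &b);
	if (interleaved) {
		ca = model_check(&a);
		cb = model_check(&b);
		*err_a = model_finish(&a, ca);
		*err_b = model_finish(&b, cb);
	} else {
		*err_a = model_finish(&a, model_check(&a));
		*err_b = model_finish(&b, model_check(&b));
	}
}
```

In this model, model_run(false, ...) gives 0 for both ends, while
model_run(true, ...) gives 0 for the first end and ENOTCONN for the
second, which matches the error seen by whichever side loses the race.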

/* Header names were lost in the archive; these cover the calls used below. */
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/un.h>

#include <err.h>
#include <errno.h>
#include <string.h>
#include <strings.h>
#include <unistd.h>

#define UNIXSTR_PATH "/tmp/mytest.socket"
#define USLEEP 100

int
main(int argc, char **argv)
{
	int listenfd, connfd, pid;
	struct sockaddr_un servaddr;

	pid = fork();
	if (-1 == pid)
		errx(1, "fork(): %d", errno);

	if (0 != pid) { /* parent: accept()/close() in a loop */

		if ((listenfd = socket(AF_LOCAL, SOCK_STREAM, 0)) < 0)
			errx(1, "parent: socket error: %d", errno);

		unlink(UNIXSTR_PATH);
		bzero(&servaddr, sizeof(servaddr));
		servaddr.sun_family = AF_LOCAL;
		strcpy(servaddr.sun_path, UNIXSTR_PATH);

		if (bind(listenfd, (struct sockaddr *) &servaddr, sizeof(servaddr)) < 0)
			errx(1, "parent: bind error: %d", errno);

		if (listen(listenfd, 1024) < 0)
			errx(1, "parent: listen error: %d", errno);

		for ( ; ; ) {
			if ((connfd = accept(listenfd, (struct sockaddr *) NULL, NULL)) < 0)
				errx(1, "parent: accept error: %d", errno);

			//usleep(USLEEP / 2); // (I) uncomment this or (II) below to avoid the race

			if (close(connfd) < 0)
				errx(1, "parent: close error: %d", errno);
		}

	} else { /* child: connect()/close() in a loop */

		sleep(1); /* give the parent some time to create the socket */

		for ( ; ; ) {

			if ((connfd = socket(AF_LOCAL, SOCK_STREAM, 0)) < 0)
				errx(1, "child: socket error: %d", errno);

			bzero(&servaddr, sizeof(servaddr));
			servaddr.sun_family = AF_LOCAL;
			strcpy(servaddr.sun_path, UNIXSTR_PATH);

			if (connect(connfd, (struct sockaddr *) &servaddr, sizeof(servaddr)) < 0)
				errx(1, "child: connect error: %d", errno);

			// usleep(USLEEP); // (II) uncomment this or (I) above to avoid the race

			if (close(connfd) != 0)
				errx(1, "child: close error: %d", errno);

			usleep(USLEEP);
		}
	}

	return 0;
}

-- 
Mikolaj Golub