Date: Tue, 04 Jul 2000 13:52:47 +0100
From: Kevin Bracey
To: freebsd-net@freebsd.org
Subject: Race condition in TCP connection drops?
Message-ID: <282ed4d849%kbracey@kbracey.cam.pace.co.uk>

I've just come across a nasty glitch in our FreeBSD-derived IP stack, and I'm
curious to know whether the problem is inherent in the BSD network code, or is
due to our implementation of its environment. I'm describing this from a
version of the source from about 2 years ago, so some of the functions (e.g.
xxx_usrreq) referred to will have changed, but as far as I can tell the recent
changes haven't affected this particular problem.

The problem occurs when a connection is dropped - tcp_drop() calls
tcp_close(), which then does:

        free(tp, M_PCB);
        inp->inp_ppcb = 0;
        soisdisconnected(so);
        in_pcbdetach(inp);
        tcpstat.tcps_closed++;
        return ((struct tcpcb *)0);

soisdisconnected() calls sowakeup(), which, because SS_ASYNC is set, calls
psignal(). Now, on our system, psignal() sends round an immediate message, on
receipt of which an application detects the failure and calls close() on the
socket. Then, soclose calls tcp_usrreq(PRU_DETACH), which aborts because the
inp_ppcb pointer is 0.

This is totally reliable on our system, because the psignal mechanism is
synchronous. Are there interlocks to prevent this happening on FreeBSD, or is
it a race condition? I'm not as familiar as I perhaps should be with the Unix
kernel environment.

Is there a reason for soisdisconnected() to be called before in_pcbdetach()?

-- 
Kevin Bracey, Principal Software Engineer
Pace Micro Technology plc                    Tel: +44 (0) 1223 518566
645 Newmarket Road                           Fax: +44 (0) 1223 518526
Cambridge, CB5 8PB, United Kingdom           WWW: http://www.acorn.co.uk/
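
To make the sequence described above concrete, here is a minimal user-space
model of it. It is only a sketch, not kernel code: every model_* function, the
reduced structs, the detach_error field and simulate_drop() are hypothetical
stand-ins invented for illustration. It assumes that signal delivery is
synchronous (as on the system described), that the application reacts by
calling close() at once, and that close does nothing when so_pcb is already
clear. The detach_first flag selects a hypothetical reordering purely for
comparison.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, heavily reduced stand-ins for the kernel structures. */
struct socket;
struct tcpcb { int t_state; };
struct inpcb { struct tcpcb *inp_ppcb; struct socket *inp_socket; };
struct socket {
    struct inpcb *so_pcb;
    int so_async;      /* models SS_ASYNC being set */
    int detach_error;  /* set when detach finds inp_ppcb already 0 */
};

/* Models tcp_usrreq(PRU_DETACH): fails if the tcpcb has already gone. */
static int model_usr_detach(struct socket *so)
{
    struct inpcb *inp = so->so_pcb;
    if (inp->inp_ppcb == NULL) {
        so->detach_error = 1;   /* the failure described in the message */
        return -1;
    }
    free(inp->inp_ppcb);        /* normal detach path, unused in this model */
    so->so_pcb = NULL;
    free(inp);
    return 0;
}

/* Models soclose(); assumption: nothing to do once so_pcb is clear. */
static int model_soclose(struct socket *so)
{
    if (so->so_pcb == NULL)
        return 0;
    return model_usr_detach(so);
}

/* Models soisdisconnected() -> sowakeup() -> psignal(), delivered
 * synchronously, with the application calling close() immediately. */
static void model_soisdisconnected(struct socket *so)
{
    if (so->so_async)
        (void)model_soclose(so);
}

/* Models in_pcbdetach(): unhook the inpcb from the socket and free it. */
static void model_in_pcbdetach(struct inpcb *inp)
{
    inp->inp_socket->so_pcb = NULL;
    free(inp);
}

/* detach_first == 0 follows the order quoted above; 1 is a hypothetical swap. */
static void model_tcp_close(struct inpcb *inp, int detach_first)
{
    struct socket *so = inp->inp_socket;
    assert(so != NULL);
    free(inp->inp_ppcb);
    inp->inp_ppcb = NULL;
    if (detach_first) {
        model_in_pcbdetach(inp);
        model_soisdisconnected(so);
    } else {
        model_soisdisconnected(so);   /* re-enters close with inp_ppcb == 0 */
        model_in_pcbdetach(inp);
    }
}

/* Runs one simulated drop; returns 1 if detach saw inp_ppcb == 0,
 * 0 if it did not, and -1 if the model could not allocate memory. */
int simulate_drop(int detach_first)
{
    struct socket so = {0};
    struct inpcb *inp = calloc(1, sizeof *inp);
    struct tcpcb *tp = calloc(1, sizeof *tp);
    if (inp == NULL || tp == NULL) {
        free(inp);
        free(tp);
        return -1;
    }
    inp->inp_ppcb = tp;
    inp->inp_socket = &so;
    so.so_pcb = inp;
    so.so_async = 1;
    model_tcp_close(inp, detach_first);
    return so.detach_error;
}
```

Called from a small main(), simulate_drop(0) reports the failed detach, while
simulate_drop(1) does not, because close finds so_pcb already cleared. This is
not a proposed patch: the model ignores everything else the real functions do
(for instance, as far as I know the real in_pcbdetach() calls sofree(), which
may release the socket), so whether the swapped order is safe in an actual
kernel is exactly the open question.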