Date: Mon, 14 Mar 2005 17:31:12 +0100 (CET)
From: Marc Olzheim <zlo@zlo.nu>, Sven Berkvens <sven@berkvens.net>
To: FreeBSD-gnats-submit@FreeBSD.org
Subject: kern/78824: race condition close()ing and read()ing the same socketpair on SMP.
Message-ID: <200503141631.j2EGVCH2035756@rave.ilse.net>
Resent-Message-ID: <200503141640.j2EGe3GQ036011@freefall.freebsd.org>

>Number:         78824
>Category:       kern
>Synopsis:       race condition close()ing and read()ing the same socketpair on SMP.
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    freebsd-bugs
>State:          open
>Quarter:
>Keywords:
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Mon Mar 14 16:40:02 GMT 2005
>Closed-Date:
>Last-Modified:
>Originator:     Marc Olzheim, Sven Berkvens
>Release:        FreeBSD 5.4-PRERELEASE i386
>Organization:
ilse media
>Environment:
System: FreeBSD rave.ilse.net 5.4-PRERELEASE FreeBSD 5.4-PRERELEASE #0: Thu Mar 10 15:43:26 CET 2005 root@rave.ilse.net:/usr/obj/usr/src/sys/SE3DEBUG i386

GENERIC + INVARIANTS + INVARIANT_SUPPORT + WITNESS + WITNESS_SKIPSPIN
>Description:
When read()ing from a socket while the other end is being close()d at
the same time, read() fails with errno == ENOTCONN, instead of doing
normal End-of-file handling.

References:
soisdisconnected() from
__FBSDID("$FreeBSD: src/sys/kern/uipc_socket2.c,v 1.137.2.5 2005/02/23 00:39:17 rwatson Exp $");
soreceive() from
__FBSDID("$FreeBSD: src/sys/kern/uipc_socket.c,v 1.208.2.17 2005/03/07 13:08:03 rwatson Exp $");
close() from
__FBSDID("$FreeBSD: src/sys/kern/kern_descrip.c,v 1.243.2.6 2005/03/03 22:27:32 jhb Exp $");

It seems as though soreceive() doesn't check for a lock on the
file descriptor, just the socket buffer, allowing close() to modify its
flags at the same time.
>How-To-Repeat:
Since this is heavily timing dependent (it is a race condition), it
might not be easily reproduced.
We can reproduce it by running our code on the following hardware, with
no other CPU-intensive processes running:

hw.machine: i386
hw.model: Intel(R) Xeon(TM) CPU 3.06GHz
hw.ncpu: 4
hw.byteorder: 1234
hw.clockrate: 3065
kern.ostype: FreeBSD
kern.osrelease: 5.4-PRERELEASE
kern.osrevision: 199506
kern.version: FreeBSD 5.4-PRERELEASE #0: Thu Mar 10 15:43:26 CET 2005
    root@rave.ilse.net:/usr/obj/usr/src/sys/SE3DEBUG
kern.clockrate: { hz = 100, tick = 10000, profhz = 1024, stathz = 128 }
kern.osreldate: 503105
kern.stackprot: 7
kern.ktrace.genio_size: 4096
kern.ktrace.request_pool: 100
kern.sched.name: 4BSD
kern.smp.maxcpus: 16
kern.smp.active: 1
kern.smp.disabled: 0
kern.smp.cpus: 4
kern.smp.forward_signal_enabled: 1
kern.smp.forward_roundrobin_enabled: 1

The code is below. When it is run under ktrace on our machine, the
problem is reproduced:

rave:/tmp>echo 'ktrace -i ./socketpair2 < /dev/null' | sh
<Socket is not connected> (3,4) (i:33)
<Socket is not connected> (3,4) (i:48)
<Socket is not connected> (3,4) (i:67)
<Socket is not connected> (3,4) (i:99)
100
<Socket is not connected> (3,4) (i:131)
<Socket is not connected> (3,4) (i:141)
<Socket is not connected> (3,4) (i:144)
<Socket is not connected> (3,4) (i:159)
<Socket is not connected> (3,4) (i:169)
<Socket is not connected> (3,4) (i:176)
<Socket is not connected> (3,4) (i:183)
200
<Socket is not connected> (3,4) (i:213)
<Socket is not connected> (3,4) (i:226)
<Socket is not connected> (3,4) (i:234)
<Socket is not connected> (3,4) (i:254)
<Socket is not connected> (3,4) (i:282)
...

socketpair2.c:

/* socketpair2.c: - Marc Olzheim <zlo at zlo.nu>,
 *                  Sven Berkvens <sven at berkvens.net>
 */

#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>	/* exit() */
#include <string.h>
#include <signal.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int
main(int argc, char *argv[])
{
	int	sock[2], i, j, wstat;
	char	buf[1024];
	ssize_t	bytes;
	pid_t	newpid;

	if (1 != argc)
	{
		fprintf(stderr, "Usage: %s\n", argv[0]);
		return 1;
	}

	for (i = 0;;++i)
	{
		if (socketpair(PF_UNIX, SOCK_STREAM, 0, sock))
			perror("socketpair()");

		newpid = fork();
		if (-1 == newpid)
			perror("fork()");

		if (0 != newpid)
		{
			/* parent */
			close(sock[1]);

			if (write(sock[0], "A", 1) != 1)
				perror("write()");

			/* Suspend until the child has read the byte. */
			kill(getpid(), SIGSTOP);

			/* We hopefully get a time slice as soon as a
			 * SIGCONT is delivered.
			 */
			close(sock[0]);
		}
		else
		{
			/* child */
			close(sock[0]);

			bytes = read(sock[1], buf, 1);
			if (bytes != 1)
				perror("first read()");

			/* Tell the parent to continue and close his side of
			 * the socket.
			 */
			kill(getppid(), SIGCONT);

			/* Since only 1 byte is sent, this should
			 * produce EOF.
			 */
			bytes = read(sock[1], buf, 1);
			if (bytes == -1)
			{
				printf("<%s> (%d,%d) (i:%d)\n",
				    strerror(errno), sock[0], sock[1], i);
				exit(1);
			}
			exit(0);
		}

		wait(&wstat);

		if (!(i % 100) && i)
			printf("%d\n", i);
	}

	return 0;
}
>Fix:
It's possible to catch the ENOTCONN and restart the read() to read
the EOF...
>Release-Note:
>Audit-Trail:
>Unformatted:
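
The workaround mentioned under >Fix: could look roughly like the sketch
below. It is only an illustration, not a tested fix for the race: the
names read_retry_enotconn(), demo_second_read() and ENOTCONN_RETRIES are
made up for this example, and the retry cap of 3 is an arbitrary
assumption. The demo function runs in a single process, so it does not
trigger the race; it only shows the behaviour we expect, namely one byte
followed by EOF (read() returning 0).

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical cap on restarts; not taken from the report. */
#define ENOTCONN_RETRIES 3

/*
 * read() wrapper (hypothetical helper): if read() fails with ENOTCONN,
 * restart it up to ENOTCONN_RETRIES times. Any other result, including
 * EOF (0) and other errors, is returned unchanged with errno intact.
 */
ssize_t
read_retry_enotconn(int fd, void *buf, size_t len)
{
	ssize_t n;
	int tries = 0;

	for (;;) {
		n = read(fd, buf, len);
		if (n != -1 || errno != ENOTCONN ||
		    tries++ >= ENOTCONN_RETRIES)
			return n;
	}
}

/*
 * Single-process illustration of the expected semantics: write one
 * byte, close the writing end, then read twice from the other end.
 * Returns the result of the second read(), or -1 on a setup failure.
 */
int
demo_second_read(void)
{
	int sock[2];
	char buf[1];
	ssize_t n;

	if (socketpair(PF_UNIX, SOCK_STREAM, 0, sock) == -1)
		return -1;

	if (write(sock[0], "A", 1) != 1) {
		close(sock[0]);
		close(sock[1]);
		return -1;
	}
	close(sock[0]);

	/* The byte written before close() must still be readable. */
	n = read_retry_enotconn(sock[1], buf, 1);
	assert(n == 1 && buf[0] == 'A');

	/* Nothing is left and the peer is gone: this should be EOF. */
	n = read_retry_enotconn(sock[1], buf, 1);
	close(sock[1]);
	return (int)n;
}
```

In socketpair2.c the child's second read() would be replaced by a call
to read_retry_enotconn(sock[1], buf, 1). The cap keeps a socket that
really is unconnected from looping forever.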