Date: Fri, 11 Nov 2005 01:39:44 +0200 (EET)
From: Dmitry Pryanishnikov
To: bug-followup@FreeBSD.org
Cc: freebsd-stable@FreeBSD.org
Subject: Re: i386/87208 : /dev/cuad[0/1] bad file descriptor error during
Message-ID: <20051111011100.X17529@atlantis.atlantis.dp.ua>

Hello!

I'm CCing this follow-up to freebsd-stable because this problem can
prevent the use of RELENG_6 machines in production (mgetty is a fairly
common example of such use). This bug is a regression vs. RELENG_5/4.

My analysis shows that it isn't only a dup() problem. File descriptor 0
somehow becomes "reserved" in RELENG_6, but only if the process has been
started by init via /etc/ttys!

Look at this simple program:

    /* Headers needed by the calls below (the header names were lost
       in the archived copy of this message). */
    #include <fcntl.h>      /* open(), O_RDONLY */
    #include <syslog.h>     /* syslog(), LOG_ERR */
    #include <unistd.h>     /* close(), dup(), sleep() */

    int
    main(void)
    {
        int res;

        /* Keep opening /dev/null until we get a descriptor >= 3. */
        while ((res = open("/dev/null", O_RDONLY)) < 3)
            if (res == -1)
                syslog(LOG_ERR, "open(): %m");
        syslog(LOG_ERR, "Started");
        sleep(10);              /* first observation point */

        if (close(0) == -1)
            syslog(LOG_ERR, "close(0): %m");
        if (close(2) == -1)
            syslog(LOG_ERR, "close(2): %m");
        if ((res = dup(1)) == -1)
            syslog(LOG_ERR, "dup(1): %m");
        syslog(LOG_ERR, "dup() gave %d\n", res);
        sleep(10);              /* second observation point */
        return 0;
    }

One can watch the file descriptor usage at the two points where the
program is sleeping: first, after the program has opened enough files to
reach descriptor #3, and second, after it has closed descriptors #0 and
#2 and duplicated descriptor #1.

So, when I start this program under 6.0-RELEASE in the usual way
(./a.out), lsof shows me the following (I'll show only plain descriptors
and omit the cwd/rtd/txt information).

At the first sleep:

    a.out  837 root  0u  VCHR  0,70        0t77713  70 /dev/ttyv1
    a.out  837 root  1u  VCHR  0,70        0t77713  70 /dev/ttyv1
    a.out  837 root  2u  VCHR  0,70        0t77713  70 /dev/ttyv1
    a.out  837 root  3r  VCHR  0,13        0t0      13 /dev/null
    a.out  837 root  4u  unix  0xc1c7b9bc  0t0         ->0xc1bf7de8

(descriptor #4 has been created by syslog()). The program logged the
following:

    a.out: dup() gave 0

At the second sleep:

    a.out  837 root  0u  VCHR  0,70        0t77713  70 /dev/ttyv1
    a.out  837 root  1u  VCHR  0,70        0t77713  70 /dev/ttyv1
    a.out  837 root  3r  VCHR  0,13        0t0      13 /dev/null
    a.out  837 root  4u  unix  0xc1c7b9bc  0t0         ->0xc1bf7de8

So all is OK in this mode: there were 3 standard files open at the
beginning (descr. 0-2), the program opened descr. 3 (and 4), closed 0
and 2 successfully, and copied 1 to 0.

Now let's start this program from /etc/ttys:

    cuad0 "/root/tmp/a.out" unknown on insecure

Now we have the following at the first sleep():

    a.out  817 root  1r  VCHR  0,13        0t0      13 /dev/null
    a.out  817 root  2r  VCHR  0,13        0t0      13 /dev/null
    a.out  817 root  3r  VCHR  0,13        0t0      13 /dev/null
    a.out  817 root  4u  unix  0xc1c7bde8  0t0         ->0xc1bf7de8

Note that open() has also skipped descr. 0!

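The reason the skipped descriptor is wrong is the POSIX rule that open()
and dup() return the lowest-numbered descriptor that is not currently
open. A minimal sketch of a check for that rule on descriptor 0 follows.
The function name check_fd0_reuse and the FD0_* result codes are made up
for illustration and are not part of the PR; note that the helper
replaces the caller's standard input with /dev/null.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical result codes, for illustration only. */
enum fd0_result {
    FD0_OK = 0,            /* open() and dup() both reused descriptor 0 */
    FD0_OPEN_SKIPPED = 1,  /* open() returned something other than 0 */
    FD0_DUP_SKIPPED = 2,   /* dup() returned something other than 0 */
    FD0_ERROR = 3          /* open() or dup() failed outright */
};

/*
 * Free descriptor 0 and check that first open() and then dup() hand it
 * back, as the lowest-free-descriptor rule requires.
 * Side effect: on FD0_OK, descriptor 0 refers to /dev/null.
 */
int
check_fd0_reuse(void)
{
    int fd, src, copy;

    (void)close(0);                 /* EBADF is fine if 0 was not open */
    fd = open("/dev/null", O_RDONLY);
    if (fd == -1)
        return FD0_ERROR;
    if (fd != 0) {
        (void)close(fd);
        return FD0_OPEN_SKIPPED;
    }

    /* Get a second handle above 0 so that 0 can be freed again. */
    src = dup(0);
    if (src == -1)
        return FD0_ERROR;
    (void)close(0);

    copy = dup(src);                /* should land on descriptor 0 */
    (void)close(src);
    if (copy == -1)
        return FD0_ERROR;
    if (copy != 0) {
        (void)close(copy);
        return FD0_DUP_SKIPPED;
    }
    return FD0_OK;
}
```

A caller could log the return value with syslog(), as the test program
above does. On a conforming system the expected result is FD0_OK; any
other value in a process started from /etc/ttys would be consistent with
the listing above, where the first open() should have returned
descriptor 0.
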
The program then tries to close descriptor 0, which gives an error:

    close(0): Bad file descriptor
    dup() gave 2

Note that descriptor 0 isn't open: close() refuses to close it. But
dup() doesn't "see" it as free and returns descr. 2 instead. At the
second sleep we have exactly the same open file table: descr. 0 is not
in use, and 1-3 point at /dev/null. So it seems to me that open()
suffers from the same problem here as dup(): descriptor 0 somehow
becomes "reserved".

Sincerely, Dmitry

-- 
Atlantis ISP, System Administrator
e-mail: dmitry@atlantis.dp.ua
nic-hdl: LYNX-RIPE