Date:      Fri, 11 Nov 2005 01:39:44 +0200 (EET)
From:      Dmitry Pryanishnikov <dmitry@atlantis.dp.ua>
To:        bug-followup@FreeBSD.org
Cc:        freebsd-stable@FreeBSD.org
Subject:   Re: i386/87208 : /dev/cuad[0/1] bad file descriptor error during
Message-ID:  <20051111011100.X17529@atlantis.atlantis.dp.ua>

Hello!

  I'm CCing this follow-up to freebsd-stable because this problem can
prevent the use of RELENG_6 machines in production (mgetty is quite a common
example of such use). This bug is a regression relative to RELENG_5/4.

  My analysis shows that this isn't only a dup() problem. File descriptor 0
somehow gets "reserved" in RELENG_6, but only if the process has been started
by init via /etc/ttys! Look at this simple program:

#include <unistd.h>
#include <syslog.h>
#include <fcntl.h>

#include <stdio.h>
#include <string.h>
#include <stdarg.h>

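int main(void)
{
     int res;

     /* Open /dev/null until descriptor 3 or higher is returned. */
     while((res=open("/dev/null",O_RDONLY)) < 3)
         if (res == -1) syslog(LOG_ERR,"open(): %m");
     syslog(LOG_ERR,"Started"); sleep(10);   /* first observation point */
     if (close(0) == -1) syslog(LOG_ERR,"close(0): %m");
     if (close(2) == -1) syslog(LOG_ERR,"close(2): %m");
     /* dup() should return the lowest free descriptor, i.e. 0. */
     if ((res=dup(1)) == -1) syslog(LOG_ERR,"dup(1): %m");
     syslog(LOG_ERR,"dup() gave %d\n",res);
     sleep(10);                              /* second observation point */
     return 0;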
}
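
Where lsof is not at hand, a test program like the one above could log its
own descriptor table before each sleep(). The following is only a sketch
under that assumption: fd_map() and log_fd_map() are hypothetical helper
names, not part of the original test, and the range 0-7 is an arbitrary
choice.

```c
#include <errno.h>
#include <fcntl.h>
#include <syslog.h>

/*
 * Hypothetical helper: fill buf with one character per descriptor
 * 0..nfds-1 ('o' = open, '-' = closed) and NUL-terminate it.
 * buf must have room for nfds + 1 bytes.
 */
void fd_map(char *buf, int nfds)
{
    int fd;

    for (fd = 0; fd < nfds; fd++) {
        /* F_GETFD fails with EBADF when fd is not an open descriptor. */
        if (fcntl(fd, F_GETFD) == -1 && errno == EBADF)
            buf[fd] = '-';
        else
            buf[fd] = 'o';
    }
    buf[nfds] = '\0';
}

/* Log the state of descriptors 0-7, e.g. just before each sleep(). */
void log_fd_map(const char *where)
{
    char buf[9];

    fd_map(buf, 8);
    syslog(LOG_ERR, "%s: fds 0-7: %s", where, buf);
}
```

Calling log_fd_map("first sleep") and log_fd_map("second sleep") right
before the two sleep() calls would put the same information into the log
that the lsof listings provide, without needing a second terminal.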

One can watch the file descriptor usage at the two points where the program is
sleeping: first after the program has opened enough files to reach descriptor
#3, and second after closing descriptors #0 and #2 and duplicating descriptor
#1. So, when I start this program under 6.0-RELEASE in the usual way (./a.out),
lsof shows me the following at the first point (I'll show only plain descriptors
and omit cwd/rtd/txt information):

At the first sleep:

a.out   837 root    0u  VCHR       0,70  0t77713     70 /dev/ttyv1
a.out   837 root    1u  VCHR       0,70  0t77713     70 /dev/ttyv1
a.out   837 root    2u  VCHR       0,70  0t77713     70 /dev/ttyv1
a.out   837 root    3r  VCHR       0,13      0t0     13 /dev/null
a.out   837 root    4u  unix 0xc1c7b9bc      0t0        ->0xc1bf7de8

(descriptor #4 has been created by syslog()). The program logged the following:

a.out: dup() gave 0

At the second sleep:

a.out   837 root    0u  VCHR       0,70  0t77713     70 /dev/ttyv1
a.out   837 root    1u  VCHR       0,70  0t77713     70 /dev/ttyv1
a.out   837 root    3r  VCHR       0,13      0t0     13 /dev/null
a.out   837 root    4u  unix 0xc1c7b9bc      0t0        ->0xc1bf7de8

So everything is OK in this mode: there were 3 standard files open at the
beginning (descr. 0-2), the program opened descr. 3 (and 4), closed 0 and 2
successfully, and duplicated 1 to 0. Now let's start this program from
/etc/ttys:

cuad0  "/root/tmp/a.out"       unknown on insecure

Now we have the following at the first sleep():

a.out   817 root    1r  VCHR       0,13      0t0     13 /dev/null
a.out   817 root    2r  VCHR       0,13      0t0     13 /dev/null
a.out   817 root    3r  VCHR       0,13      0t0     13 /dev/null
a.out   817 root    4u  unix 0xc1c7bde8      0t0        ->0xc1bf7de8

Note that open() has also skipped descr. 0! Then the program tries to close it
and gets an error:

close(0): Bad file descriptor
dup() gave 2

Note that descriptor 0 isn't open: close() refuses to close it. But dup()
doesn't "see" it as free and returns descr. 2 instead. At the second sleep, we
have exactly the same open file table: descr. 0 is not in use, 1-3 point
at /dev/null. So it seems to me that open() suffers from the same problem
here as dup(): descriptor 0 becomes "reserved" somehow.
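
The expectation being violated here is the POSIX rule that open() and dup()
return the lowest-numbered descriptor not currently open. A minimal sketch
of a self-check for that rule follows; lowest_fd_rule_holds() is a
hypothetical name, and the scan bound of 1024 is an arbitrary assumption.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical check: returns 1 if open() hands out the lowest closed
 * descriptor, 0 if it skips one, -1 if the check could not be run.
 */
int lowest_fd_rule_holds(void)
{
    int expected, got;

    /* Find the lowest descriptor that fcntl() reports as not open. */
    for (expected = 0; expected < 1024; expected++)
        if (fcntl(expected, F_GETFD) == -1 && errno == EBADF)
            break;
    if (expected == 1024)
        return -1;

    /* A conforming open() should return exactly that descriptor. */
    got = open("/dev/null", O_RDONLY);
    if (got == -1)
        return -1;
    close(got);
    return got == expected;
}
```

Started from a shell this should return 1. In the /etc/ttys case described
above it would presumably return 0, assuming fcntl() reports descriptor 0
as closed in the same way close() does.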


Sincerely, Dmitry
-- 
Atlantis ISP, System Administrator
e-mail:  dmitry@atlantis.dp.ua
nic-hdl: LYNX-RIPE


