Skip site navigation (1)Skip section navigation (2)
Date:      Thu, 6 Aug 2009 13:53:37 +0300 (EEST)
From:      Valentin Nechayev <netch@segfault.kiev.ua>
To:        FreeBSD-gnats-submit@FreeBSD.org
Subject:   kern/137482: script(1) doesn't read tail of program output, pty is suspected
Message-ID:  <200908061053.n76Arbfh097956@segfault.kiev.ua>
Resent-Message-ID: <200908061100.n76B0Eek081991@freefall.freebsd.org>

next in thread | raw e-mail | index | archive | help

>Number:         137482
>Category:       kern
>Synopsis:       script(1) doesn't read tail of program output, pty is suspected
>Confidential:   no
>Severity:       non-critical
>Priority:       medium
>Responsible:    freebsd-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Thu Aug 06 11:00:14 UTC 2009
>Closed-Date:
>Last-Modified:
>Originator:     Valentin Nechayev
>Release:        FreeBSD 7.2-RELEASE-p1 i386
>Organization:
private
>Environment:
System: FreeBSD segfault.kiev.ua 7.2-RELEASE-p1 FreeBSD 7.2-RELEASE-p1 #1: Sun Jun 14 10:32:33 EEST 2009 root@segfault.kiev.ua:/usr/BSD/obj/usr/BSD/src/sys/sf72 i386

>Description:

script(1) doesn't read tail of program output when program in child process
finishes. On very quick & short output, this leads to empty output. In
other cases, this can give partial data loss.

I treat this as kernel issue, not userland, due to reasons explained below,
that's why category is "kern".

>How-To-Repeat:

Enter a directory with very few files (less than 10 and short named).
Run a few times:

script s01 ls

In most cases, script output is empty. Ktrace shows that ls(1) prints
its output, but script(1) ignores it.

Adding explicit timeout:

script s01 sh -c 'ls; sleep 1'

causes ls(1) output to be recorded in 100% runs.

With more long and complex cases, e.g.:

script s01 ls -lR /usr/lib

output contents are unstable and change from one run to another.

All this means race condition is reached.

But, detailed ktrace analysis shows this seems not to be happen in script(1),
but in pty(4). Essential parts of kdump:

 97747 script   CALL  open(0xbfbfdea1,O_RDWR|O_NOCTTY,<unused>0x280897b8)
 97747 script   NAMI  "/dev/ptyqa"
 97747 script   RET   open 4
 97747 script   CALL  fstat(0x4,0xbfbfddec)
 97747 script   STRU  struct stat {dev=83951360, ino=186, mode=crw-rw-rw- , nlin
k=1, uid=0, gid=0, rdev=186, atime=1249554087, stime=1249511994, ctime=124951199
4, birthtime=-1, size=0, blksize=4096, blocks=0, flags=0x0 }
 97747 script   RET   fstat 0
[...]
 97749 ls       CALL  write(0x1,0x28216000,0x1d)
 97749 ls       GIO   fd 1 wrote 29 bytes
       "file1           file2           ktrace.out      s01
       "
 97749 ls       RET   write 29/0x1d
 97749 ls       CALL  sigprocmask(SIG_BLOCK,0x28074600,0xbfbfe12c)
 97749 ls       RET   sigprocmask 0
 97749 ls       CALL  sigprocmask(SIG_SETMASK,0x28074610,0)
 97749 ls       RET   sigprocmask 0
 97749 ls       CALL  sigprocmask(SIG_BLOCK,0x28074600,0xbfbfe0fc)
 97749 ls       RET   sigprocmask 0
 97749 ls       CALL  sigprocmask(SIG_SETMASK,0x28074610,0)
 97749 ls       RET   sigprocmask 0
 97749 ls       CALL  exit(0)
 97747 script   RET   select 1
 97747 script   CALL  read(0x4,0xbfbfe39c,0x400)
 97747 script   GIO   fd 4 read 0 bytes
       ""
 97747 script   RET   read 0
 97747 script   CALL  wait4(0xffffffff,0xbfbfded8,WNOHANG,0)
 97747 script   RET   wait4 97749/0x17dd5
 97747 script   CALL  wait4(0xffffffff,0xbfbfded8,WNOHANG,0)
 97747 script   RET   wait4 -1 errno 10 No child processes

So, read() from /dev/ptyqa gives 0 when data are written to slave side but
seems to be dropped because process terminated and pty is closed.
Earlier, they were kept in pty state until master side is closed.
>Fix:
>Release-Note:
>Audit-Trail:
>Unformatted:



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?200908061053.n76Arbfh097956>