Date:      Mon, 3 Dec 2007 23:37:50 +0000 (GMT)
From:      Robert Watson <rwatson@FreeBSD.org>
To:        current@FreeBSD.org
Cc:        stable@FreeBSD.org
Subject:   Attention 7.x and 8.x ptmx/pts users (read if you set kern.pts.enable=1)
Message-ID:  <20071203225800.S30376@fledge.watson.org>


(If you aren't interested in the details of our ptmx/pty/pts driver, skip to
  the paragraph that begins "Why the long-winded story?")

Dear all:

The current ptmx/pts implementation makes use of devfs(4) cloning: a user 
process wanting to allocate a pty/pts pair opens /dev/ptmx, which returns a 
reference to a new pty master.  An ioctl is then performed to query which pts 
number was returned, and the pts device is then opened.  Internally, the 
lookup of /dev/ptmx causes the driver to instantiate the pty, and then when 
the pty is opened, the pts is created.  The pty and pts nodes are both 
destroyed when the last close occurs, cleaning up the bits automatically when 
the last process attached to the pair exits.  Sounds good. :-)
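
For readers who have not driven this from userland, here is a minimal sketch of
the allocate/query/open sequence described above.  It goes through the portable
posix_openpt()/grantpt()/unlockpt()/ptsname() interface rather than opening
/dev/ptmx and issuing the ioctl by hand; how a particular libc maps those calls
onto ptmx is an assumption here, and open_pty_pair() is a made-up helper name.

```c
#define _XOPEN_SOURCE 600   /* expose posix_openpt() and friends on some libcs */
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: allocate a pty master, look up the matching pts
 * name, and open the pts.  Returns 0 on success and fills in *master,
 * *slave and name.  Returns -1 on failure, leaving both descriptors at -1.
 */
int open_pty_pair(int *master, int *slave, char *name, size_t namelen)
{
    const char *pts;
    int m, s;

    *master = *slave = -1;

    /* Step 1: get a new pty master (typically backed by the ptmx node). */
    m = posix_openpt(O_RDWR | O_NOCTTY);
    if (m < 0)
        return -1;

    /* Step 2: set up the pts side and ask which pts we were given. */
    if (grantpt(m) != 0 || unlockpt(m) != 0 ||
        (pts = ptsname(m)) == NULL || strlen(pts) >= namelen) {
        close(m);
        return -1;
    }
    strcpy(name, pts);

    /* Step 3: open the pts device by name. */
    s = open(name, O_RDWR | O_NOCTTY);
    if (s < 0) {
        close(m);
        return -1;
    }

    *master = m;
    *slave = s;
    return 0;
}
```

A caller would pass a buffer for the name, e.g. char name[128], and close both
descriptors when done; the last close is what lets the kernel tear the pair
down.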

Unfortunately, the current implementation is subject to a potential resource 
leak: the pty is created when the lookup occurs, but if the open never takes 
place, then the pty is leaked.  In principle, we have facilities to GC unused 
device nodes "eventually", but even if we implemented that, we would have no 
race-free way to determine that an open is not still pending.  This leakage 
turns out to interact particularly poorly with our resource limits on pty/pts 
pairs -- both the administrative limit imposed by sysctl and also the 
functional limit on the number of entries in /etc/ttys.  It's possible to 
imagine various, sometimes messy, techniques for performing this garbage 
collection.

Instead, what I'd like to do is modify the ptmx code to have a race-free 
protocol, in which eventual termination of processes referencing the node 
results in freeing of the nodes.  On some systems, ptmx performs a 
"bait-and-switch", in which the file descriptor of the pty node is silently 
substituted for the file descriptor of the ptmx node -- similar to our model, 
only with no window between lookup and open, but also not easily supported in 
our current VFS.  Another possibility is to introduce a new system call and 
bypass ptmx entirely -- similar to pipe(), socketpair(), etc.

The change that seemed to be the least disruptive, and which I have 
implemented, introduces ptmx as a true device node (not a devfs clone), and an 
ioctl that causes the allocation of the pty and pts pair -- however, the pair 
is also added to a garbage collection list.  If the ptmx node is closed 
*before* the pty is opened, then the nodes are garbage collected.  It turns 
out this also isn't easily implementable in our VFS, as we don't offer a 
per-file-descriptor opaque pointer for use by device drivers, nor do we pass 
the file descriptor pointer to the device driver (as in, say, Linux).  At some 
point, this functionality will turn up, as there has been consistent interest 
in it over time.  What I've done is implement an approximation of that model: 
an "open counter" for ptmx, which, when it hits zero across all references, 
triggers a garbage collection sweep.  If/when we can use per-file-descriptor 
state, it is easily modified to sweep on close of a specific descriptor.
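
To make the open-counter approximation concrete, here is a small userland
model of it.  This is a sketch only, not the kernel code: the structure and
function names (ptmx_sim, ptmx_alloc, and so on) are invented for
illustration, the table size is arbitrary, and the normal teardown of a pair
on last close of an opened pty is left out.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_PAIRS 8             /* arbitrary limit for the model */

struct pair {
    bool allocated;             /* pty/pts nodes exist */
    bool pending;               /* on the GC list: pty not opened yet */
};

struct ptmx_sim {
    int open_count;             /* descriptors currently open on ptmx */
    struct pair pairs[MAX_PAIRS];
};

/* A process opens the ptmx node. */
void ptmx_open(struct ptmx_sim *p)
{
    p->open_count++;
}

/* Models the allocation ioctl: returns a pair number, or -1 if full. */
int ptmx_alloc(struct ptmx_sim *p)
{
    for (int i = 0; i < MAX_PAIRS; i++) {
        if (!p->pairs[i].allocated) {
            p->pairs[i].allocated = true;
            p->pairs[i].pending = true;     /* goes on the GC list */
            return i;
        }
    }
    return -1;
}

/* Opening the pty takes the pair off the GC list. */
int pty_open(struct ptmx_sim *p, int n)
{
    if (n < 0 || n >= MAX_PAIRS || !p->pairs[n].allocated)
        return -1;
    p->pairs[n].pending = false;
    return 0;
}

/*
 * A process closes ptmx.  Only when the counter reaches zero do we sweep
 * the GC list.  Returns the number of pairs reclaimed by this close.
 */
int ptmx_close(struct ptmx_sim *p)
{
    int freed = 0;

    assert(p->open_count > 0);
    if (--p->open_count > 0)
        return 0;                           /* other openers remain */
    for (int i = 0; i < MAX_PAIRS; i++) {
        if (p->pairs[i].allocated && p->pairs[i].pending) {
            p->pairs[i].allocated = false;
            p->pairs[i].pending = false;
            freed++;
        }
    }
    return freed;
}

/* Number of pairs that currently exist, opened or not. */
int pairs_in_use(const struct ptmx_sim *p)
{
    int n = 0;

    for (int i = 0; i < MAX_PAIRS; i++)
        if (p->pairs[i].allocated)
            n++;
    return n;
}
```

The model also shows the cost of the approximation: a pair abandoned by one
process is not reclaimed while any other process still holds ptmx open,
which is what a per-descriptor sweep would fix.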

--> start reading here if you were bored by the above

Why the long-winded story?  Well, this turns out to change the convention by 
which libc communicates with the kernel -- instead of a simple open of ptmx 
and then ioctl to find the pts, we now open ptmx, perform an ioctl to allocate 
the pair, and then open both the pty and pts nodes explicitly.  Thus, libc 
requires modification, and libcs that know how to speak to the old ptmx don't 
know how to speak to the new one, and, in effect, vice versa.  This doesn't 
meet our ABI requirements for a stable branch, so what I plan to do is 
withdraw the ptmx/pts implementation from 7.0 before the release by disabling 
it in the kernel and libc.  This will keep us from locking in the current ABI, 
and we'll instead merge the revised protocol for 7.1.  This change will, 
however, affect users of the 8-CURRENT branch, as during an upgrade cycle, 
it's likely that libc and kernel will be out of sync, and therefore if pts 
support is enabled (via the kern.pts.enable sysctl), pty devices will not be 
available, which might crimp the style of anyone performing a remote upgrade 
via, say, ssh.

So, this is notice of two upcoming changes:

(1) kern.pts.enable will be removed in 7.x, for reintroduction in 7.1.  If
     kern.pts.enable was set, then your system will silently revert to using
     old-style ptys, and any attempt to set the sysctl will lead to an error.

(2) I will merge the revised ptmx implementation to 8.x, potentially
     disrupting use of pty/pts devices for users who have kern.pts.enable
     explicitly set to a non-zero value.

Hopefully this will resolve the known resource leaks in the ptmx code, and get 
us on track to start enabling it by default in the near future ... in 8.x, and 
at least offering it as a production feature in 7.x.

Thanks,

Robert N M Watson
Computer Laboratory
University of Cambridge


