From owner-freebsd-questions  Tue Apr 25 17:56:22 1995
Return-Path: questions-owner
Received: (from majordom@localhost)
          by freefall.cdrom.com (8.6.10/8.6.6) id RAA26693
          for questions-outgoing; Tue, 25 Apr 1995 17:56:22 -0700
Received: from cs.weber.edu (cs.weber.edu [137.190.16.16])
          by freefall.cdrom.com (8.6.10/8.6.6) with SMTP id RAA26682
          for <freebsd-questions@FreeBSD.org>; Tue, 25 Apr 1995 17:56:13 -0700
Received: by cs.weber.edu (4.1/SMI-4.1.1)
	id AA03107; Tue, 25 Apr 95 18:48:43 MDT
From: terry@cs.weber.edu (Terry Lambert)
Message-Id: <9504260048.AA03107@cs.weber.edu>
Subject: Re: Discussion about mgetty...
To: gert@greenie.muc.de (Gert Doering)
Date: Tue, 25 Apr 95 18:48:43 MDT
Cc: timb@thud.cdrom.com, freebsd-questions@FreeBSD.org, neko@greenie.muc.de,
        knarf@nasim.cube.net, mgetty@muc.de
In-Reply-To: <m0s3sQM-00003KC@greenie.muc.de> from "Gert Doering" at Apr 25, 95 11:45:37 pm
X-Mailer: ELM [version 2.4dev PL52]
Sender: questions-owner@FreeBSD.org
Precedence: bulk

> some kind soul forwarded this to me, and I feel I simply have to react,
> because some statements in this mail concerning mgetty were blatantly
> wrong, and I feel somewhat attacked by statements like "mgetty is bad", if
> in the same mail the author shows that he doesn't know much about mgetty's
> inner workings.

It wasn't intended as an attack, it's just that there seems to be a
trade between dialout capability and mgetty capabilities.

I'm more than willing to discuss any of this; the problems I see in
mgetty and which I reported in the posting are primarily the result
of mgetty trying to solve a problem for which OS support should exist
but doesn't.  The problems aren't directly attributable to mgetty itself,
although it suffers from them as a result of the OS deficiencies.


> > The modem should "reset as if powered off and turned on on an on-to-off
> > transition of DTR from the computer".
> > 
> > On modern Hayes modems, this is AT&D2.
> 
> Ummm... not really. Reset-as-if-power-cycle is usually &D3. &D2 is usually
> only "hang up and return to command mode".

I screwed this one up.  It was retracted in a followup posting; I was
looking at a multimodem 224E manual when quoting this, and it wasn't
quite Hayes compatible.

> > The practical effect of the /dev/cua0/cua1 device will be the setting
> > of the terminal modes HUPCL and -CLOCAL; you should log in through the
> > modem and type "stty -a" to make sure these settings are present.
> 
> Well, many (that is, *all I know of*) getty programs set the termio(s)
> values to something sane anyway. It's getty's purpose to set them, not to
> rely on some driver defaults.

Most drivers do templating, and the templating in BSD modem control
drivers in a traditional (ie: Sun) implementation sets these flags.  Most
of the gettytab definitions don't permit these settings to be changed.

That the current BSD drivers hybrid the templating they do is either
attributable to design issues that are still not addressed in BSD, or
design decisions on the part of the driver writer.  I don't agree with
all of them.

Traditionally, the CLOCAL and HUPCL flags are neither settable nor set
by the getty program because of adverse effects; specifically, the
setting of CLOCAL to -CLOCAL without carrier present will (is supposed
to) result in the same effect on the device as if there had been a DCD
loss on a line where DCD was present.

The setting of -HUPCL to HUPCL can result in the device acting as if it
had been closed and dropping DTR to the modem, potentially severing the
active connection.  In general, it is not permissible to change these
"meta" flags while the modem is on line.

This is acceptable for dial-in for security reasons having to do with
ensuring proper modem baud rate training, defeating iterative login
attempts by forcing the caller to reestablish the connection after a
count of failures, and preventing leaving logged in shells active for
subsequent modem callers.

This is acceptable for dialout because dialout programs are not
processes for which the dialout line is the controlling tty.  This is
because correct operation requires that a call-in caller calling out on a
separate modem from the one dialed in on *must* correctly lose his
session when the connection between his modem and the machine is lost,
and resetting the controlling tty would preclude SIGHUP delivery, since
each process may have only one controlling tty.

Similarly, it is typical in a hard-wired terminal connection to wire
terminal DTR to serial port DCD so that if the terminal is powered off
(as a typical end user might do at the end of the day), security is
maintained by destroying the user's connection to the machine by
allowing the session for a direct connected terminal to honor -CLOCAL
separately from HUPCL.
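
To make the two flags discussed above concrete, here is a minimal sketch
in Python (chosen only for illustration; nothing in this thread uses it).
The helper names dialin_cflag and apply_dialin_flags are hypothetical.
It assumes the flags are applied before a call is up, since changing them
on a live connection is exactly what the paragraphs above warn against.

```python
import termios

def dialin_cflag(cflag):
    """Return cflag with HUPCL set and CLOCAL cleared (i.e. HUPCL -CLOCAL)."""
    return (cflag | termios.HUPCL) & ~termios.CLOCAL

def apply_dialin_flags(fd):
    """Apply HUPCL -CLOCAL to an open tty descriptor.

    Assumption: this is called while no connection is active.
    """
    attrs = termios.tcgetattr(fd)      # [iflag, oflag, cflag, lflag, ispeed, ospeed, cc]
    attrs[2] = dialin_cflag(attrs[2])  # index 2 is c_cflag
    termios.tcsetattr(fd, termios.TCSADRAIN, attrs)  # wait for pending output first
```

The result can be checked the same way as suggested earlier, by running
"stty -a" on the line and looking for hupcl and -clocal.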

> > An incorrectly configured program (ie: mgetty) could result in the
> > flags being set incorrectly after login.
> 
> What do you mean by that? mgetty does definitely set HUPCL and -CLOCAL
> (I'm not *that* dumb). It did from the very first release.
> 
> Did you ever try it?

Yes, I did.  The problem I have is in the opening of the port prior to
a connection being present, thereby precluding use of the port for
bidirectional (both incoming and outbound) use.


> > The use of "mgetty" is a problem.  The "mgetty" program differs from
> > the "getty" program in that it opens the port and hangs on a read
> > waiting for the modem to announce a baud rate so that it can set
> > line speed.  
> 
> Partially true. Mgetty *can* do it (since some modems insist on switching
> baud rates), but the default is to read the CONNECT message only for
> informational and logging purposes. One of mgetty's big advantages is that
> the user can always exactly see in the log files which state the modem
> was/is in, and what did exactly happen when.

This is an advantage for non-bidirectional use.  However, in bidirectional
port implementations, when an incoming port open has succeeded, then an
outgoing open will be blocked until the connection goes away (or if it
is opened with O_NDELAY, the open will return EWOULDBLOCK).


> > The ability to open without DCD present is a real
> > problem, in that it defeats the purpose of a calling unit device
> 
> Well... we had a longish discussion about the "calling unit devices" in
> the Linux kernel mailing lists two months or so ago, and finally, nearly
> everyone (including the serial driver's author) agreed that the tty/cua
> distinction is a hack to get ill-behaving applications to work.
> 
> The "classic" approach is to use one device for one physical device... 

And then make the "uugetty" (or in this case the "mgetty") wait for
a carriage return following DCD before emitting a login, and allowing
the uugetty open to complete once DCD has been asserted, but then
checking for a lock file (which must be asserted by the dialing program
whose successful connection allowed the DCD to come true).  The uugetty
then backs off and rechecks the lock file presence at 5 minute intervals,
and if present, sends a kill 0 to the process whose PID is in the lock
file.  This is where the abomination before god of the kill 0 and the
abomination of caring about lockfile format and PID contents came from.
The kill returns 0 (if the process is there and killable), but doesn't
actually signal the process, or it returns EPERM, indicating that the
process is there, but not killable by you, or it returns ESRCH, which
indicates that the program is dead and the lock file may be ignored.

This is clearly much more of a kludge than having device synonyms with
implied (and therefore more reliable) locking in the kernel.
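
For readers who have not met the kill 0 probe, a minimal sketch of the
stale-lock check described above, in Python.  The function names are
hypothetical, and the lock file is assumed to hold the owner's PID as
ASCII decimal (HDB style); other lock file formats exist, which is part
of the complaint.

```python
import os

def lock_holder_state(pid):
    """Probe a PID with signal 0: no signal is delivered, only the checks run."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:   # ESRCH: process is gone, lock may be ignored
        return "stale"
    except PermissionError:      # EPERM: process exists but we may not signal it
        return "alive"
    return "alive"               # kill returned 0: process exists and is killable

def read_lock_pid(path):
    """Read the PID from a lock file (assumes ASCII decimal contents)."""
    with open(path) as f:
        return int(f.read().strip())
```

A getty using this would call lock_holder_state(read_lock_pid(path)) on
each recheck and reclaim the port only on "stale".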

> > and thereby thwarts the normal modem login sequence.
> 
> It doesn't. Ever heard the term "uugetty"?

Yes.  See above why it is a kludge.

> > 1)	The modem comes on; the computer is not in multiuser
> > 	mode, so DTR is not asserted.  The modem will not
> > 	answer the phone until DTR is asserted.
> 
> What do you do if your machine crashes, leaving DTR asserted? It *does*
> happen occasionally. If you have a modem that auto-answers, your modem
> will pick up the phone, and the callers will have to pay to the Telco...

You fix the hardware.  This is an unlikely case, and is no better handled
than any other catastrophic failure using any other software.

> > 11)	The modem answers the phone.  Its first act is to report
> > 	its "CONNECT <baud rate>" message to the computer.
> > 
> > 12)	Like other input prior to DCD being present, this data
> > 	is ignored.
> 
> And now, we come to a far more important problem: to have the maximum
> flexibility, you have to be able to cope with modems that raise DCD
> *before* sending the CONNECT string. There are *many* of them around,
> and with a "classic" approach, you can only handle them if you switch
> off modem responses (which is nasty for parallel dial-outs).

I disagree.  The Tandy DT400 does this, but its ^N escape character
makes it useless for binary file transmission anyway.  The AT&T 4024
series of modems also do this by default, but it is programmable and
thus you can get rid of the behaviour.

Reporting connect AFTER connect but before DCD has been raised to the
computer is mandated by the Hayes standard.  Other than the examples
above, I think you'd be hard pressed to find modems which do this that
can not be reconfigured to *NOT* do it.

> So, what do you do? You use a getty program (mgetty, that is, or getty_ps,
> or others) that does know about CONNECT strings. 

I reconfigure the modem.

> BTW: in this step, you can as well do the distinction whether the incoming
> call was a FAX or DATA call, and handle both accordingly. Major plus for
> mgetty and FlexFax.

I understand that.  Again, that's the unidirectional solution.  The
correct way to handle it would be to not allow the open to complete in
the kernel until a RI is seen, and set the modem for 2 rings to ensure
that one is sent to the computer before the modem answers.  This is a
kernel driver problem in that there is not (currently) a flag to open
to cause this behaviour.

On the other hand, if the open has completed prior to an attempt to
connect to the machine, the only alternative for an outbound connection
program is to open the port itself and *manually* kill off the mgetty
(requiring the acquisition of appropriate privileges to make this a
realistic alternative).  This is worse than just a kludge, it's a blatant
security hole.

> > 16)	If the user does not see this, they send "break" signals;
> > 	the "getty" changes the baud rate and reissues the prompt
> > 	for each break signal received.  
> 
> Now this is something I call severely dumb. We're talking about smart
> modems, remember? All of them can be set to a fixed DTE baud rate, so
> there isn't any need for baud rate switching anymore.

I was describing the traditional behaviour.  As long as the modem has
sufficient RAM for buffering of data and an out of band flow control
mechanism to allow binary data to be successfully transmitted (It's a
bitch when Xmodem stops at packet 17 or 19 because the sequence number
is a ^Q or ^S, or your SLIP or PPP just hang on you), AND you are willing
to live with the delay between the time you type ^C and the output actually
aborts caused by the baud differential filling that buffer, then locking
the computer-modem baud rate is an acceptable solution.  Similarly, if
it's an internal modem with no real computer-modem baud rate, it's
also an acceptable arrangement because there is no such thing as needing
to train the getty in that case.
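
The packet 17 and 19 remark is simple arithmetic: 17 is 0x11 (^Q, XON)
and 19 is 0x13 (^S, XOFF).  A small sketch in Python, assuming the usual
one-byte XMODEM sequence number that wraps at 256; the function name is
made up for the example.

```python
XON, XOFF = 0x11, 0x13   # ^Q and ^S, the in-band flow control bytes

def blocks_hit_by_flow_control(total_blocks):
    """List block numbers whose one-byte sequence number equals XON or XOFF."""
    hits = []
    for block in range(1, total_blocks + 1):
        seq = block % 256          # the sequence byte wraps at 256
        if seq in (XON, XOFF):
            hits.append(block)
    return hits
```

This only covers the sequence byte, which is the case that fails every
time; the same two byte values can of course also turn up in the data.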

Anyway, it's not my design, but it beats the DEC method of building serial
boards that recognize returns because you *know* the serial board you are
going to use and you have DEC drivers.


> *If* it has to be done, then, by all means, read the CONNECT string and
> handle it the way it is meant to be.

Sure, as long as that read is not pending except during the connection
attempt itself so I don't have to buy a second modem and phone line to
be able to dial out.

> Having the callers send <break> signals is something that will work for
> Unix experts, but the average Joe User doesn't want (or should) to know
> about it. He wants to see a CONNECT and then a "welcome" banner, no break
> signal fiddling.

Kermit, TERM, Procomm, Crosstalk, and most other comm software packages
support scripting capability, and come with defaults for connecting
to UNIX boxes; on the other hand, you are right that it's dumb.  I'd
much prefer that they use the RS232C standard external clock pin for
port baud rate, but unfortunately, Intel doesn't know how to build
UARTs, or at least IBM doesn't know how to pick them.

The only other alternative, as you pointed out, is locking the DTE
baud rate and accepting the problems that come with doing that instead.


> > 	particular line, as well as the default line settings for
> > 	each are stored in the /etc/gettytab file.
> 
> Well, mgetty can't read gettytab, but it can read gettydefs (SYSV-ism),
> which is - IMHO - far more powerful, though I detest it as well (personal
> dislike for cryptic formats).

We agree here.  8-).

> > [Modem connection failing immediately after CONNECT]
> > This is exactly the situation if the computer has an input present
> > before DCD is raised and echos the "RING" or "CONNECT" messages
> > back to the modem.
> 
> This is exactly the reason why it is a GoodThing to have a getty process
> that knows about RING and CONNECT and *doesn't* echo it back to the modem.
> 
> Don't we all agree here?

I agree that it should not be echoed; how you arrive at not echoing it
is another thing altogether.  Whether it's because the computer is not
in a state that it can echo things back because it is watching for wire
signals or it's because there's a read posted without the line being in
a canonical processing mode to not echo it is irrelevant.  What *is*
relevant is whether the approach you chose screws with your ability to
use the hardware the way you want to use it (ie: for dialout).

> > If the mgetty incorrectly sets CLOCAL or -HUPCL in its effort to
> > open the port without DCD being present, ...
> 
> It doesn't. Well, it does, but as soon as carrier is present, the CLOCAL
> flag is reset, and the -HUPCL flag is set. There *is* a small time window
> (about 50-100 ms or so) where the DCD line is high, but CLOCAL is still
> set, but if the connection is lost in that place, mgetty will notice as
> well (doesn't have to rely on SIGHUP here).

Agreed -- on the other hand, it relies on the driver not acting quirky
on port settings changes (many older drivers do).  If it does rely on
not having an older driver, it might as well rely on optimistic driver
behaviours as well, so there's really no difference in the approach.

> > ... this will prevent normal operation.
> 
> Normal operation isn't touched by this in any way.

Normal dialout operation is affected: when a connection is not present
but mgetty has a non-echoing read posted, that read will prevent a
dialout program from communicating with the modem.

> > If the mgetty/getty is started on a tty device instead of a cua device,
> > the default port settings will not include -CLOCAL and HUPCL, and
> > the connection will not behave as expected.
> 
> Who cares for default settings? It's getty's *JOB* to set the termios
> settings to something specific, not to rely on the defaults.
> 
> Mgetty very well knows about possibly-wrong defaults and sets the flags
> properly.

I'd argue this for the CLOCAL and HUPCL flags (but *only* for those -- for
all other flags, you are in fact correct).  I'd also put forth Sun and
SVR4 serial drivers as arguments on my behalf.

> > The "mgetty" program is bad.  
> 
> I tend to have quite a different opinion here :)

8-).

> > It should not succeed in the open
> > without DCD present; this prevents the port for being used for
> > dialout without killing the "mgetty".
> 
> This is *WRONG* (dammit). If the programs create proper UUCP lock files,
> they can dial out while mgetty is running without any difficulties.
> 
> Admitted, they must not use a different device in the tty/cua device pair,
> but as long as the same device is used, shared dial-in/dial-out is very
> easy.

How do you prevent the mgetty read from reading the responses from the
modem to the dialout program when you must post the read after the open
and before the appearance of the lockfile?  Failing out is a bad method
of handling this.

Again, I'll admit that it's the only option you have with the brain dead
serial drivers you are working with to get the functionality you want to
provide.
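
For reference, the lock file side of this looks roughly like the sketch
below (Python, hypothetical function name).  It assumes the HDB
convention of a ten-character ASCII PID followed by a newline; the lock
path itself is up to the caller.  Note that it does nothing about the
window described above, where the getty's read is already posted before
the lock file appears.

```python
import os

def acquire_lock(path):
    """Create the lock file atomically; return True if we now hold the lock."""
    try:
        # O_CREAT | O_EXCL fails if the file already exists, so two
        # programs cannot both believe they created it.
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o644)
    except FileExistsError:
        return False                      # someone else already holds the port
    with os.fdopen(fd, "w") as f:
        f.write("%10d\n" % os.getpid())   # assumed HDB-style ASCII PID
    return True
```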

> > The correct way to open the port without DCD present is to use the
> > O_NDELAY flag; this has the side effect of setting no delay on reads,
> > when you probably do not want this,...
> 
> Huh?? Now I am really puzzled. Did you *EVER* *LOOK* at mgetty? mgetty
> *does* open the port with O_NDELAY. Did from the very first release.

I was describing the partial open hack, which was first introduced in
SVR3.2 with HDB UUCP (the same time kill 0 was introduced).

> Naturally, CLOCAL has to be set as well, because many serial drivers won't
> read from the device, even if the open() has succeeded, if CLOCAL is unset
> and DCD hasn't been raised yet.

Not true.  All SVRx drivers and all Xenix drivers (except for an early
release of Xenix 2.3.0, and Intel 320 serial drivers on a rev B serial
board when accessed over Intel OpenNet) are susceptible to the partial
open hack.

You open the port without O_NDELAY.  This doesn't open the port because
DCD is not present; however, if the O_EXCL bit was used on the previous
open, this will unset it.

You alarm out of the open (or close it if it succeeds).

Then you open the port O_NDELAY.  This allows you to open the port without
DCD being present.  You save this fd.

You open the port without O_NDELAY.  The open succeeds because the port
already has an open in it.  For those interested, this was first done
in sys2.c in SVR3.2 at about line 318.  If the 12 lines dealing with
O_EXCL in this area had been moved down another 22 lines, the first
open to unset O_EXCL would not have been necessary.

You close the original descriptor and use the second descriptor to access
the port.  If you have Xenix 2.3.0, then you will have to close and reopen
the port on DCD loss, otherwise, you are fine from here on out.

This is called "the partial open hack", where the blocking open is allowed
to succeed because you already have the port "partially open".  This is all
well documented in the HDB UUCP sources (and badly documented in the UNIX
programming manuals).
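
A minimal sketch of the sequence in Python, for illustration only.  The
name partial_open is hypothetical, O_NONBLOCK stands in for O_NDELAY, and
the O_NOCTTY flag is my addition to the example.  The initial
alarm-guarded open that clears O_EXCL is left out.  Whether the second
open really returns without DCD depends on the driver behaving as
described above; not every driver does.

```python
import os

def partial_open(path):
    """Return a blocking descriptor obtained via the partial open sequence."""
    # Non-blocking open: succeeds even though DCD is not present.
    first = os.open(path, os.O_RDWR | os.O_NONBLOCK | os.O_NOCTTY)
    try:
        # Blocking open: on drivers that behave as described, this
        # completes because the port already has an open on it.
        second = os.open(path, os.O_RDWR | os.O_NOCTTY)
    finally:
        # Close the original descriptor; the second one is the keeper.
        os.close(first)
    return second
```

The descriptor that comes back has no-delay reads turned off, which is
the point of going through the exercise.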


> > ...with no way to unset them.
> 
> Nonsense. Ever read the manpage of fcntl() in recent Unixoid systems?
> You *can* reset O_NDELAY (or O_NONBLOCK, definitions differ) with
> fcntl(), and it works very well on Linux, Free/NetBSD, BSDI, SVR3/4,
> SunOS, Solaris, you-name-it-all. Even on the AT&T 3B1 UnixPC!

Early Xenix (whose I/O was based on V7), Microport, and Cubix (a modified
Microport) were not susceptible to this.  Neither was SVR3.1.  The Altos
2000 "multidrop" serial boards, which used a single event queue for reads,
writes, and ioctl()'s (requiring that you use multiple processes as reader
and writer -- startlingly similar to pre 5.2 VMS serial I/O) would block
the fcntl() indefinitely, although it worked fine on internal serial ports.
The fcntl() was totally unsupported on Intel OpenNet both on the Intel
310 and 320, and was also not supported via Touch Communications' OpenNet
implementation for PrimeOS.
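
On systems where it does work, the fcntl() reset being discussed is
roughly the following (a Python sketch; clear_nonblocking is a made-up
name, and O_NONBLOCK is used where older systems would say O_NDELAY).

```python
import fcntl
import os

def clear_nonblocking(fd):
    """Turn off O_NONBLOCK on an open descriptor, leaving other flags alone."""
    flags = fcntl.fcntl(fd, fcntl.F_GETFL)
    fcntl.fcntl(fd, fcntl.F_SETFL, flags & ~os.O_NONBLOCK)
    return fcntl.fcntl(fd, fcntl.F_GETFL)   # flags as the kernel now reports them
```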

> > The correct procedure is to: open with O_NDELAY, open a second time
> > without O_NDELAY (the second open will not block because there is
> > already an open on the port), and close the first open.  This is
> > called "the partial open hack", but in reality it is not a hack
> > (unless you consider the overloading of O_NDELAY in the first place
> > a hack).
> 
> Well.... one could do this on systems where fcntl() isn't able to reset
> O_NDELAY, but I've never seen one. 

I'll introduce you some day, if you are anywhere nearly as interested in
hardware antiquities as I am.  8-).

> Very old SCO Unix variants are rumored to have a bug in open() that will
> make this necessary as well, but I have compiled and run mgetty on a SCO
> Unix 3.2v2.0 machine (about five years old), and never seen any problem.

Yeah, that was the Xenix 2.3.0 bug.

> > The "mgetty" program should be rewritten to use an open flag that
> > causes the open to hang until a "ring indicate" signal is seen
> > instead of waiting for DCD.  The driver should be rewritten to
> > accommodate this.  
> 
> That is an interesting (and very reasonable!) approach. It has been
> suggested by other very competent persons before.

Yeah, I never claimed it was my original idea.  8-).

> Unfortunately, de-facto standards make this impossible. Many multiport
> cards don't even wire the RI line anywhere. No commercial unix vendor will
> support this. So, are you really willing to sacrifice portability here?

The portability sacrifice buys directed dialout capability -- I think it's
worth the effort.  The alternative is to forego the mgetty advantages in
fax sharing for a fax modem and a lot of other things I'd actually rather
not give up, as long as I don't have to get a second modem and phone line
to dial out reliably.

The only real difference in the BSD approach is the calling unit device,
and that's just a cosmetic nicety to avoid all of the lockfile compatibility
crap that even Pete Honeyman got wrong on one occasion...

> > Then the "mgetty" should do its reads normally;
> > an alarm should be set so that if DCD (a "CONNECT" message) does
> > not occur within a set period of time, the mgetty closes the port
> > and reissues the open.  
> 
> This approach won't work too good. It's better to monitor the port for
> "good" (CONNECT) and "bad" (NO CARRIER) strings. Reason: if you want to
> do a timeout, it has to be quite long, to accommodate the time until the
> desired number of RINGs (S0) and the maximum no-connect time (S7) has
> passed. This gives a large time frame to collide with outgoing processes.

I considered that, but the problem here is that you won't necessarily
get a "NO CARRIER" as a result of a hangup after a RI but before the
modem has actually answered the phone.  The only way to reliably make
the thing let go of the line so it can be used for an outgoing call after
it grabs it is to either have it voluntarily let go after a timeout or
you can murder the thing from user space.
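
The two approaches can be combined: watch for the result strings, but
still give up after a timeout.  A rough Python sketch follows, with
select() swapped in for the alarm mentioned above because it is easier
to show in a few lines.  The function name, the return values, and the
exact strings matched are all assumptions made for the example.

```python
import os
import select
import time

def wait_for_result(fd, timeout, good=b"CONNECT", bad=b"NO CARRIER"):
    """Read from fd until a result string shows up or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    seen = b""
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return "timeout"       # caller closes the port and reissues the open
        ready, _, _ = select.select([fd], [], [], remaining)
        if not ready:
            return "timeout"
        chunk = os.read(fd, 256)
        if not chunk:
            return "eof"           # the other end went away
        seen += chunk
        if good in seen:
            return "connect"
        if bad in seen:
            return "no carrier"
```

On "timeout" the getty lets go of the line voluntarily, which covers the
case where the caller hangs up after RI and no "NO CARRIER" ever arrives.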

A lot of serial drivers, especially SCO, even the most recent ones, have
bad code handling of the RTS/CTS flow control, such that if it is enabled
but isn't present (for instance, an older Microcomm modem, to name one of
several), it will hang the driver.  SunOS pre 4.1.3 has this problem in
a big way, and it was quite easy to get it into a weird state where you
need to reboot to fix it.

> Final exams... second half... limited time... bad mood... stop... :)

8-).


					Terry Lambert
					terry@cs.weber.edu
---
Any opinions in this posting are my own and not those of my present
or previous employers.