Date:      Thu, 21 Jul 2005 11:16:19 +0100 (BST)
From:      Robert Watson <rwatson@FreeBSD.org>
To:        Eirik Øverby <ltning@anduin.net>
Cc:        stable@FreeBSD.org, Kris Kennaway <kris@obsecurity.org>
Subject:   Re: Serious issue with serial console in 5.4
Message-ID:  <20050721110222.U97888@fledge.watson.org>
In-Reply-To: <00DD4399-4317-4579-82C4-5B64AC3F800B@anduin.net>
References:  <20050721050048.GU22430@xor.obsecurity.org> <00DD4399-4317-4579-82C4-5B64AC3F800B@anduin.net>

On Thu, 21 Jul 2005, Eirik Øverby wrote:

>>> The above panic will show up occasionally when logging out from a
>>> serial console (i.e. ctrl-D, logout, exit, whatever). This is
>>> EXTREMELY BAD, as it will crash an otherwise perfectly healthy box at
>>> random - and renders the serial console useless.
>>>
>>> Robert Watson confirmed this to be an issue on the 10th of April.
>>
>> You might have to wait until 6.0-R since fixing it seems to require
>> infrastructure changes that cannot easily be backported to 5.x.
>
> With all due respect - if this is (and I'm assuming it is, because it
> happens on all the servers I'm serial-controlling) an omnipresent
> problem on 5.x, I daresay it should warrant some more attention. Having
> unsafe serial terminal support that can bring down your system like that
> defies much of the point of having serial terminal support in the first
> place.
>
> However, since I seem to be the only one who has noticed this, perhaps
> I'm the last person on earth to routinely use serial terminal switches
> instead of KVM switches to do my admin work?

The concern about the 5.x backport is that it will break parts of the
device driver ABI, and is a significant change that involves a lot of
risk.

Regarding the general prevalence of the problem -- I've seen a small
number of people reporting it's a big problem.  Since I know of a great
many people running with serial consoles (other than a workstation, I
never run FreeBSD boxes any other way), this leads me to believe it's
something that shows up in fairly specific conditions -- perhaps relating
to precise timing of a race condition.  This means that if we introduce a
generally destabilizing change, it may impact more people than the problem
as it exists (a nasty trade-off).

I've only seen the issue when logging out of a serial console session, and
had previously hypothesized that it had to do with the simultaneous timing
of a console message from syslog and the opening/closing of the console's
tty due to logging out and getty restarting, resulting in a reference
count improperly hitting zero.

I thought Doug White had come up with a work-around patch that prevented
the reference count from being allowed to hit 0 for the console by
artificially elevating it, which would prevent the panic, so either (a)
the work-around wasn't committed, or (b) it didn't work.
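
To make the idea of artificially elevating the count concrete, here is a
minimal sketch in C.  It is a toy, single-threaded model that replays one
unlucky ordering of events by hand; it is not the FreeBSD tty code and not
Doug White's patch, and every name in it (toy_tty, toy_open, toy_close,
toy_console_write, toy_logout_then_message) is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the console tty; not a kernel structure. */
struct toy_tty {
    int  refs;       /* number of current holders */
    bool torn_down;  /* set when the last reference is dropped */
};

/* With pin == true the console starts with one extra, permanent reference. */
static void toy_init(struct toy_tty *t, bool pin)
{
    t->refs = pin ? 1 : 0;
    t->torn_down = false;
}

/* Opening takes a reference and (re)initializes the device state. */
static void toy_open(struct toy_tty *t)
{
    t->torn_down = false;
    t->refs++;
}

/* Closing drops a reference; the last close tears the device down. */
static void toy_close(struct toy_tty *t)
{
    assert(t->refs > 0);
    if (--t->refs == 0)
        t->torn_down = true;
}

/* A console message can only be delivered while the tty is set up.
 * Returning false here models the point where the real system would panic. */
static bool toy_console_write(const struct toy_tty *t)
{
    return !t->torn_down;
}

/* Replays: logout closes the tty, a syslog message arrives, getty reopens.
 * Returns whether the message found a usable tty. */
static bool toy_logout_then_message(bool pin)
{
    struct toy_tty tty;
    bool ok;

    toy_init(&tty, pin);
    toy_open(&tty);                /* login session holds the tty */
    toy_close(&tty);               /* logout: session drops its reference */
    ok = toy_console_write(&tty);  /* message lands before getty reopens */
    toy_open(&tty);                /* getty restarts and reopens the console */
    toy_close(&tty);
    return ok;
}
```

In this model, toy_logout_then_message(false) lets the count reach zero
between logout and the getty reopen, so the write fails, while
toy_logout_then_message(true) keeps the count at one or above and the write
succeeds.  The real problem is a timing-dependent race, which this
deterministic replay does not attempt to reproduce.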

I can attempt to take another look at this problem in a week or so, but
have a number of things I need to finish up for FreeBSD 6.0 before then
that will be occupying my time.

Robert N M Watson


