Date: Thu, 21 Jul 2005 11:16:19 +0100 (BST)
From: Robert Watson <rwatson@FreeBSD.org>
To: Eirik Øverby <ltning@anduin.net>
Cc: stable@FreeBSD.org, Kris Kennaway <kris@obsecurity.org>
Subject: Re: Serious issue with serial console in 5.4
Message-ID: <20050721110222.U97888@fledge.watson.org>
In-Reply-To: <00DD4399-4317-4579-82C4-5B64AC3F800B@anduin.net>
References: <20050721050048.GU22430@xor.obsecurity.org> <00DD4399-4317-4579-82C4-5B64AC3F800B@anduin.net>

On Thu, 21 Jul 2005, Eirik Øverby wrote:

>>> The above panic will show up occasionally when logging out from a
>>> serial console (i.e. ctrl-D, logout, exit, whatever). This is
>>> EXTREMELY BAD, as it will crash an otherwise perfectly healthy box at
>>> random - and renders the serial console useless.
>>>
>>> Robert Watson confirmed this to be an issue on the 10th of April.
>>
>> You might have to wait until 6.0-R since fixing it seems to require
>> infrastructure changes that cannot easily be backported to 5.x.
>
> With all due respect - if this is (and I'm assuming it is, because it
> happens on all the servers I'm serial-controlling) an omnipresent
> problem on 5.x, I daresay it should warrant some more attention. Having
> unsafe serial terminal support that can bring down your system like that
> defies much of the point of having serial terminal support in the first
> place.
>
> However, since I seem to be the only one who has noticed this, perhaps
> I'm the last person on earth to routinely use serial terminal switches
> instead of KVM switches to do my admin work?

The concern about the 5.x backport is that it will break parts of the device
driver ABI, and is a significant change that involves a lot of risk.

Regarding the general prevalence of the problem -- I've seen a small number
of people reporting it's a big problem. Since I know of a great many people
running with serial consoles (other than a workstation, I never run FreeBSD
boxes any other way), this leads me to believe it's something that shows up
in fairly specific conditions -- perhaps relating to precise timing of a race
condition. This means that if we introduce a generally destabilizing change,
it may impact more people than the problem as it exists (a nasty trade-off).

I've only seen the issue when logging out of a serial console session, and
had previously hypothesized that it had to do with the simultaneous timing of
a console message from syslog and the opening/closing of the console's tty
due to logging out and getty restarting, resulting in a reference count
improperly hitting zero. I thought Doug White had come up with a work-around
patch that prevented the reference count from being allowed to hit 0 for the
console by artificially elevating it, which would prevent the panic, so
either (a) the work around wasn't committed, or (b) it didn't work.

I can attempt to take another look at this problem in a week or so, but have
a number of things I need to finish up for FreeBSD 6.0 before then that will
be occupying my time.

Robert N M Watson
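
For readers following the reference-count hypothesis above, here is a minimal toy sketch in C of the general idea behind "artificially elevating" a count so that it cannot reach zero. It is not FreeBSD's tty code and not Doug White's patch: every name in it (toy_tty, toy_hold, toy_rele, toy_console_write, toy_logout_then_log) is hypothetical, and the real problem is described as a timing race, which this single-threaded model only approximates by ordering the events by hand.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a console tty; NOT FreeBSD's struct tty. */
struct toy_tty {
    int  refs;       /* number of current holders */
    bool destroyed;  /* set once the last reference is dropped */
};

/* Take a reference on the device. */
static void toy_hold(struct toy_tty *tp)
{
    tp->refs++;
}

/* Drop a reference; the device is torn down when the count hits zero. */
static void toy_rele(struct toy_tty *tp)
{
    assert(tp->refs > 0);
    if (--tp->refs == 0)
        tp->destroyed = true;  /* a real kernel would free state here */
}

/* A console write is only safe while the device still exists. */
static bool toy_console_write(const struct toy_tty *tp)
{
    return !tp->destroyed;
}

/*
 * Hand-ordered model of the suspected sequence: the login session closes
 * the tty at logout, and a syslog message reaches the console before
 * getty has reopened it. With pin_console set, the console keeps one
 * extra reference of its own, so the count never reaches zero.
 * Returns true if the console write was safe.
 */
static bool toy_logout_then_log(bool pin_console)
{
    struct toy_tty tty = { 0, false };

    if (pin_console)
        toy_hold(&tty);            /* the artificially elevated count */
    toy_hold(&tty);                /* login session opens the tty */
    toy_rele(&tty);                /* logout closes it */
    return toy_console_write(&tty); /* syslog message arrives */
}
```

In this model, toy_logout_then_log(false) reports an unsafe write because the count reached zero at logout, while toy_logout_then_log(true) reports a safe one. Whether the actual 5.x panic follows this sequence is exactly the open question in the message above.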
