From: Robert Watson
Date: Thu, 21 Jul 2005 11:16:19 +0100 (BST)
To: Eirik Øverby
Cc: stable@FreeBSD.org, Kris Kennaway
Subject: Re: Serious issue with serial console in 5.4
Message-ID: <20050721110222.U97888@fledge.watson.org>
In-Reply-To: <00DD4399-4317-4579-82C4-5B64AC3F800B@anduin.net>
References: <20050721050048.GU22430@xor.obsecurity.org>
 <00DD4399-4317-4579-82C4-5B64AC3F800B@anduin.net>

On Thu, 21 Jul 2005, Eirik Øverby wrote:

>>> The above panic will show up occasionally when logging out from a
>>> serial console (i.e. ctrl-D, logout, exit, whatever).
>>> This is EXTREMELY BAD, as it will crash an otherwise perfectly
>>> healthy box at random - and renders the serial console useless.
>>>
>>> Robert Watson confirmed this to be an issue on the 10th of April.
>>
>> You might have to wait until 6.0-R since fixing it seems to require
>> infrastructure changes that cannot easily be backported to 5.x.
>
> With all due respect - if this is (and I'm assuming it is, because it
> happens on all the servers I'm serial-controlling) an omnipresent
> problem on 5.x, I daresay it should warrant some more attention. Having
> unsafe serial terminal support that can bring down your system like that
> defies much of the point of having serial terminal support in the first
> place.
>
> However, since I seem to be the only one who has noticed this, perhaps
> I'm the last person on earth to routinely use serial terminal switches
> instead of KVM switches to do my admin work?

The concern about the 5.x backport is that it will break parts of the
device driver ABI, and is a significant change that involves a lot of
risk.

Regarding the general prevalence of the problem -- I've seen a small
number of people reporting it's a big problem. Since I know of a great
many people running with serial consoles (other than a workstation, I
never run FreeBSD boxes any other way), this leads me to believe it's
something that shows up in fairly specific conditions -- perhaps relating
to precise timing of a race condition. This means that if we introduce a
generally destabilizing change, it may impact more people than the problem
as it exists (a nasty trade-off).
I've only seen the issue when logging out of a serial console session, and
had previously hypothesized that it had to do with the simultaneous timing
of a console message from syslog and the opening/closing of the console's
tty due to logging out and getty restarting, resulting in a reference
count improperly hitting zero.

I thought Doug White had come up with a work-around patch that prevented
the reference count from being allowed to hit 0 for the console by
artificially elevating it, which would prevent the panic, so either (a)
the work-around wasn't committed, or (b) it didn't work.

I can attempt to take another look at this problem in a week or so, but
have a number of things I need to finish up for FreeBSD 6.0 before then
that will be occupying my time.

Robert N M Watson
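
For readers following the thread, here is a minimal sketch of the
hypothesis above: the last reference to the console's tty is dropped at
logout, and a console message arrives before getty reopens it. This is
not the FreeBSD tty code and not Doug White's patch. Every name in it
(dev_sketch, dev_ref, dev_rele, logout_window_survives) is hypothetical,
and it runs the steps in one fixed order in a single thread rather than
reproducing a real race.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a tty-like device; NOT the kernel's struct tty. */
struct dev_sketch {
    int  refs;       /* number of current holders */
    bool destroyed;  /* set once the last reference is dropped */
};

void dev_init(struct dev_sketch *d)
{
    d->refs = 0;
    d->destroyed = false;
}

/* Take a reference, e.g. on open of the device. */
void dev_ref(struct dev_sketch *d)
{
    assert(!d->destroyed);
    d->refs++;
}

/* Drop a reference; the device is torn down when the count reaches zero. */
void dev_rele(struct dev_sketch *d)
{
    assert(d->refs > 0);
    if (--d->refs == 0)
        d->destroyed = true;
}

/*
 * Walk through the logout window in a fixed order: the login session
 * closes the tty, and a console message would arrive before getty
 * reopens it. Returns true if the device is still intact at that point.
 * With console_hold set, an extra reference is taken up front and never
 * released (the "artificially elevated" count, as assumed here).
 */
bool logout_window_survives(bool console_hold)
{
    struct dev_sketch d;

    dev_init(&d);
    if (console_hold)
        dev_ref(&d);    /* assumed workaround: permanent console reference */
    dev_ref(&d);        /* login session has the tty open */
    dev_rele(&d);       /* logout closes it */
    return !d.destroyed; /* a console write here needs the device intact */
}
```

Without the extra hold the count reaches zero during the window and the
device is gone when the console message would be written; with the hold
the count never drops below one. Whether the real panic follows this
path is exactly what remains unconfirmed in this thread.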