From: Eirik Øverby
Date: Thu, 21 Jul 2005 13:02:01 +0200
To: Robert Watson
Cc: stable@FreeBSD.org, Kris Kennaway
Subject: Re: Serious issue with serial console in 5.4

On Jul 21, 2005, at 12:16 PM, Robert Watson wrote:

>
> On Thu, 21 Jul 2005, Eirik Øverby wrote:
>
>>>> The above panic will show up occasionally when logging out from a
>>>> serial console (i.e. ctrl-D, logout, exit, whatever). This is
>>>> EXTREMELY BAD, as it will crash an otherwise perfectly healthy box
>>>> at random - and renders the serial console useless.
>>>> Robert Watson confirmed this to be an issue on the 10th of April.
>>>>
>>> You might have to wait until 6.0-R since fixing it seems to require
>>> infrastructure changes that cannot easily be backported to 5.x.
>>>
>>
>> With all due respect - if this is (and I'm assuming it is, because it
>> happens on all the servers I'm serial-controlling) an omnipresent
>> problem on 5.x, I daresay it should warrant some more attention.
>> Having unsafe serial terminal support that can bring down your system
>> like that defies much of the point of having serial terminal support
>> in the first place.
>>
>> However, since I seem to be the only one who has noticed this,
>> perhaps I'm the last person on earth to routinely use serial terminal
>> switches instead of KVM switches to do my admin work?
>>
>
> The concern about the 5.x backport is that it will break parts of the
> device driver ABI, and is a significant change that involves a lot of
> risk.
>
> Regarding the general prevalence of the problem -- I've seen a small
> number of people reporting it's a big problem. Since I know of a great
> many people running with serial consoles (other than a workstation, I
> never run FreeBSD boxes any other way), this leads me to believe it's
> something that shows up in fairly specific conditions -- perhaps
> relating to precise timing of a race condition. This means that if we
> introduce a generally destabilizing change, it may impact more people
> than the problem as it exists (a nasty trade-off).
>
> I've only seen the issue when logging out of a serial console session,
> and had previously hypothesized that it had to do with the
> simultaneous timing of a console message from syslog and the
> opening/closing of the console's tty due to logging out and getty
> restarting, resulting in a reference count improperly hitting zero.

I did indeed make some changes to my syslog configuration after getting
the serials online. Your theory might not be entirely off.

Let me know if I should post my syslog.conf file or anything else here
or elsewhere...

Thanks,
/Eirik

> I thought Doug White had come up with a work-around patch that
> prevented the reference count from being allowed to hit 0 for the
> console by artificially elevating it, which would prevent the panic,
> so either (a) the work around wasn't committed, or (b) it didn't work.
>
> I can attempt to take another look at this problem in a week or so,
> but have a number of things I need to finish up for FreeBSD 6.0 before
> then that will be occupying my time.
>
> Robert N M Watson
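
The reference-count hypothesis and the work-around quoted in this message can be illustrated with a toy model. What follows is only a sketch in Python, not FreeBSD kernel code and not Doug White's patch: the class ConsoleTty, its pinned flag, and the function logout_with_racing_log are hypothetical names invented for illustration. It replays one unlucky ordering (the shell closes the tty, a log message arrives, then getty reopens it) and assumes the tty state is torn down whenever the count reaches zero.

```python
class ConsoleTty:
    """Toy model of a console tty that is torn down when its last reference is dropped."""

    def __init__(self, pinned=False):
        # Assumption for illustration: a "pinned" console holds one extra
        # reference that is never released, modelling the artificially
        # elevated count described in the thread.
        self.refs = 1 if pinned else 0
        self.freed = False

    def open(self):
        if self.freed:
            raise RuntimeError("open on torn-down tty")
        self.refs += 1

    def close(self):
        self.refs -= 1
        if self.refs == 0:
            # Last reference gone: the tty state is released.
            self.freed = True

    def write(self, msg):
        # Stands in for a console message, e.g. one forwarded by syslog.
        if self.freed:
            raise RuntimeError("write on torn-down tty")
        return len(msg)


def logout_with_racing_log(tty):
    """Replay one interleaving: logout, then a log write, then getty reopening."""
    tty.open()                          # login shell holds the console
    tty.close()                         # user logs out
    tty.write("syslog: message\n")      # message arrives in the gap
    tty.open()                          # getty restarts on the console


if __name__ == "__main__":
    try:
        logout_with_racing_log(ConsoleTty())
    except RuntimeError as err:
        print("unpinned console:", err)

    pinned = ConsoleTty(pinned=True)
    logout_with_racing_log(pinned)
    print("pinned console survived, refs =", pinned.refs)
```

In this model the unpinned console fails on the write because the count dropped to zero between logout and getty's reopen, while the pinned console never reaches zero. The real kernel race involves concurrent threads and driver state that this sequential replay does not attempt to capture.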