Date: Mon, 19 Jul 2004 16:57:42 -0500
From: Hari Bhaskaran
To: freebsd-questions@FreeBSD.org
Subject: panic on 5.2.1-RELEASE-p9 (all new procs core dump, panic during reboot attempt)
Message-ID: <20040719215742.GA78611@spider.netmails.net>
In-Reply-To: <20040709214752.GA67399@spider.netmails.net>

My previous post didn't get through, so here it is again.

On Fri, Jul 09, 2004 at 04:47:52PM -0500, Hari Bhaskaran wrote:
> Hi,
>
> Came back from lunch and the machine didn't respond. Tried to reboot and got a panicked machine.
>
> Panic message was "bad pte" with
> details of "TPTE at 0xbfc20550 is ZERO @ VA 08154000"
>
> A detailed log of events is given below. I do not see
> any file in /var/crash
>
> Here is the /var/log/messages for that time frame.
>
> Jul 9 12:13:10 anaimudi kernel: pid 31157 (python), uid 1001: exited on signal 3 (core dumped)
>
> <-- Around this time I was in a python debugging session. It is possible I quit the
> program with Ctrl+\, but I don't remember doing that.
> I was using the python debugger module (pdb), then ran xlock (ports->xlockmore) and went for lunch...
> Anyway, this was a normal user (not root).
>
> Jul 9 12:45:56 anaimudi kernel: Warning: pid 31924 used static ldt allocation.
>
> <-- This process might be XFree86. I get these messages occasionally.
> XFree86 version info is given below.
>
> Jul 9 12:45:56 anaimudi kernel: See the i386_set_ldt man page for more info
>
> <-- From 12:50 a series of core dumps happens (I am not at my desk). I have analysed
> these dumps too (details below).
>
> Jul 9 12:50:00 anaimudi kernel: pid 31927 (atrun), uid 0: exited on signal 10 (core dumped)
> Jul 9 12:50:00 anaimudi kernel: pid 31928 (mailwrapper), uid 0: exited on signal 10 (core dumped)
> Jul 9 12:55:00 anaimudi kernel: pid 31934 (atrun), uid 0: exited on signal 10 (core dumped)
> Jul 9 12:55:00 anaimudi kernel: pid 31936 (jot), uid 2: exited on signal 10
> Jul 9 12:55:00 anaimudi kernel: pid 31937 (dd), uid 2: exited on signal 10
> Jul 9 12:55:00 anaimudi kernel: pid 31938 (mailwrapper), uid 2: exited on signal 10
> Jul 9 12:55:00 anaimudi kernel: pid 31935 (mailwrapper), uid 0: exited on signal 10 (core dumped)
> Jul 9 13:00:00 anaimudi kernel: pid 31950 (jot), uid 2: exited on signal 10
> Jul 9 13:00:00 anaimudi kernel: pid 31952 (dd), uid 2: exited on signal 10
> Jul 9 13:00:00 anaimudi kernel: pid 31951 (mailwrapper), uid 2: exited on signal 10
> Jul 9 13:00:00 anaimudi kernel: pid 31947 (newsyslog), uid 0: exited on signal 10 (core dumped)
> Jul 9 13:00:00 anaimudi kernel: pid 31948 (atrun), uid 0: exited on signal 10 (core dumped)
> Jul 9 13:00:00 anaimudi kernel: pid 31954 (mailwrapper), uid 0: exited on signal 10 (core dumped)
> Jul 9 13:00:00 anaimudi kernel: pid 31953 (mailwrapper), uid 0: exited on signal 10 (core dumped)
> Jul 9 13:01:15 anaimudi kernel: pid 30542 (firefox-bin), uid 1001: exited on signal 11 (core dumped)
> Jul 9 13:05:00 anaimudi kernel: pid 31957 (atrun), uid 0: exited on signal 10 (core dumped)
> Jul 9 13:05:00 anaimudi kernel: pid 31958 (mailwrapper), uid 0: exited on signal 10 (core dumped)
> Jul 9 13:10:00 anaimudi kernel: pid 31961 (atrun), uid 0: exited on signal 10 (core dumped)
> Jul 9 13:10:00 anaimudi kernel: pid 31962 (mailwrapper), uid 0: exited on signal 10 (core dumped)
> Jul 9 13:11:00 anaimudi kernel: pid 31966 (jot), uid 2: exited on signal 10
> Jul 9 13:11:00 anaimudi kernel: pid 31968 (dd), uid 2: exited on signal 10
> Jul 9 13:11:00 anaimudi kernel: pid 31967 (mailwrapper), uid 2: exited on signal 10
> Jul 9 13:15:01 anaimudi kernel: pid 31971 (atrun), uid 0: exited on signal 10 (core dumped)
> Jul 9 13:15:01 anaimudi kernel: pid 31972 (mailwrapper), uid 0: exited on signal 10 (core dumped)
> Jul 9 13:20:01 anaimudi kernel: pid 31975 (atrun), uid 0: exited on signal 10 (core dumped)
> Jul 9 13:20:01 anaimudi kernel: pid 31976 (mailwrapper), uid 0: exited on signal 10 (core dumped)
> Jul 9 13:22:01 anaimudi kernel: pid 31980 (jot), uid 2: exited on signal 10
> Jul 9 13:22:01 anaimudi kernel: pid 31982 (dd), uid 2: exited on signal 10
> Jul 9 13:22:01 anaimudi kernel: pid 31981 (mailwrapper), uid 2: exited on signal 10
> Jul 9 13:25:01 anaimudi kernel: pid 31985 (atrun), uid 0: exited on signal 10 (core dumped)
> Jul 9 13:25:01 anaimudi kernel: pid 31986 (mailwrapper), uid 0: exited on signal 10 (core dumped)
> Jul 9 13:30:00 anaimudi kernel: pid 31991 (atrun), uid 0: exited on signal 10 (core dumped)
> Jul 9 13:30:00 anaimudi kernel: pid 31992 (mailwrapper), uid 0: exited on signal 10 (core dumped)
> Jul 9 13:33:00 anaimudi kernel: pid 31996 (jot), uid 2: exited on signal 10
> Jul 9 13:33:00 anaimudi kernel: pid 31998 (dd), uid 2: exited on signal 10
> Jul 9 13:33:00 anaimudi kernel: pid 31997 (mailwrapper), uid 2: exited on signal 10
> Jul 9 13:35:00 anaimudi kernel: pid 32001 (atrun), uid 0: exited on signal 10 (core dumped)
>
> [snip]
>
> Jul 9 13:40:48 anaimudi kernel: pid 504 (login), uid 0: exited on signal 10 (core dumped)
> Jul 9 13:40:48 anaimudi kernel: pid 32009 (getty), uid 0: exited on signal 10 (core dumped)
> Jul 9 13:40:48 anaimudi kernel: pid 32010 (getty), uid 0: exited on signal 10 (core dumped)
> Jul 9 13:40:48 anaimudi kernel: pid 32011 (getty), uid 0: exited on signal 10 (core dumped)
> Jul 9 13:40:48 anaimudi kernel: pid 32012 (getty), uid 0: exited on signal 10 (core dumped)
> Jul 9 13:40:48 anaimudi init: getty repeating too quickly on port /dev/ttyv1, sleeping 30 secs
>
> This keeps repeating. Since I couldn't get X back, I killed it from the first terminal and
> tried to log in on the console. Now login hangs (getty core dumps, of course).
> Then I tried to reboot with Ctrl-Alt-Delete. It tries to reboot and then panics.
>
> Here is the non-standard stuff installed on the machine.
>
> 1. Linux binary compatibility is enabled.
> 2. The nvidia card uses the latest FreeBSD driver (as of yesterday)
> from their page
> 3. I frequently get this message in /var/log/messages
>
> Jul 9 13:47:05 anaimudi kernel: Warning: pid 540 used static ldt allocation.
> Jul 9 13:47:05 anaimudi kernel: See the i386_set_ldt man page for more info
>
> I verified this is the XFree86 process.
>
> XFree86 Version 4.3.0
> Release Date: 27 February 2003
> X Protocol Version 11, Revision 0, Release 6.6
> Build Operating System: FreeBSD 5.2.1 i386 [ELF]
> Build Date: 13 February 2004
> Before reporting problems, check http://www.XFree86.Org/
> to make sure that you have the latest version.
> Module Loader present
>
> 4. I am running two jails with a couple of filesystems (including devfs) mounted into each jail.
> I am using jails for filesystem separation (and not really for security), and I couldn't
> see any way other than mounting devfs (read-write!) into the jails too. I hope that
> is not the reason.
>
> $ mount
> /dev/ad0s1a on / (ufs, local)
> devfs on /dev (devfs, local)
> /dev/ad0s1g on /fs1 (ufs, local, soft-updates)
> /dev/ad0s1e on /tmp (ufs, local, soft-updates)
> /dev/ad0s1f on /usr (ufs, local, soft-updates)
> /dev/ad0s1d on /var (ufs, local, soft-updates)
> /usr on /fs1/jails/172.16.32.1/data/usr (nullfs, local, read-only)
> /bin on /fs1/jails/172.16.32.1/data/bin (nullfs, local, read-only)
> /sbin on /fs1/jails/172.16.32.1/data/sbin (nullfs, local, read-only)
> devfs on /fs1/jails/172.16.32.1/data/dev (devfs, local)
> /usr/compat on /fs1/jails/172.16.32.1/data/compat (nullfs, local, read-only)
> /lib on /fs1/jails/172.16.32.1/data/lib (nullfs, local, read-only)
> /libexec on /fs1/jails/172.16.32.1/data/libexec (nullfs, local, read-only)
> /rescue on /fs1/jails/172.16.32.1/data/rescue (nullfs, local, read-only)
> /fs1/vslick on /fs1/jails/172.16.32.1/data/fs1/vslick (nullfs, local)
> /usr on /fs1/jails/172.16.32.2/data/usr (nullfs, local, read-only)
> /bin on /fs1/jails/172.16.32.2/data/bin (nullfs, local, read-only)
> /sbin on /fs1/jails/172.16.32.2/data/sbin (nullfs, local, read-only)
> devfs on /fs1/jails/172.16.32.2/data/dev (devfs, local)
> /usr/compat on /fs1/jails/172.16.32.2/data/compat (nullfs, local, read-only)
> /lib on /fs1/jails/172.16.32.2/data/lib (nullfs, local, read-only)
> /libexec on /fs1/jails/172.16.32.2/data/libexec (nullfs, local, read-only)
> /rescue on /fs1/jails/172.16.32.2/data/rescue (nullfs, local, read-only)
> /fs1/vslick on /fs1/jails/172.16.32.2/data/fs1/vslick (nullfs, local)
>
> The jails are 172.16.32.1 and 172.16.32.2
>
>
> gdb core stack traces
> =====================
> NOTE: I don't have any cron jobs configured.
>
>
> -su-2.05b# gdb /usr/libexec/atrun atrun.core
> GNU gdb 5.2.1 (FreeBSD)
> Copyright 2002 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and you are
> welcome to change it and/or distribute copies of it under certain conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB. Type "show warranty" for details.
> This GDB was configured as "i386-unknown-freebsd"...(no debugging symbols found)...
> Core was generated by `atrun'.
> Program terminated with signal 10, Bus error.
> Reading symbols from /libexec/ld-elf.so.1...(no debugging symbols found)...done.
> Loaded symbols for /libexec/ld-elf.so.1
> #0 0x2805f580 in __sflags () from /libexec/ld-elf.so.1
> (gdb) where
> #0 0x2805f580 in __sflags () from /libexec/ld-elf.so.1
> #1 0x280541a2 in lm_init () from /libexec/ld-elf.so.1
> #2 0x2804f6f8 in _rtld () from /libexec/ld-elf.so.1
> (gdb)
>
>
> -su-2.05b# gdb /usr/sbin/mailwrapper mailwrapper.core
> GNU gdb 5.2.1 (FreeBSD)
> Copyright 2002 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and you are
> welcome to change it and/or distribute copies of it under certain conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB. Type "show warranty" for details.
> This GDB was configured as "i386-unknown-freebsd"...(no debugging symbols found)...
> Core was generated by `mailwrapper'.
> Program terminated with signal 10, Bus error.
> Reading symbols from /libexec/ld-elf.so.1...(no debugging symbols found)...done.
> Loaded symbols for /libexec/ld-elf.so.1
> #0 0x2805d580 in __sflags () from /libexec/ld-elf.so.1
> (gdb) where
> #0 0x2805d580 in __sflags () from /libexec/ld-elf.so.1
> #1 0x280521a2 in lm_init () from /libexec/ld-elf.so.1
> #2 0x2804d6f8 in _rtld () from /libexec/ld-elf.so.1
> (gdb)
>
> -su-2.05b# gdb /usr/sbin/newsyslog newsyslog.core
> GNU gdb 5.2.1 (FreeBSD)
> Copyright 2002 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and you are
> welcome to change it and/or distribute copies of it under certain conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB. Type "show warranty" for details.
> This GDB was configured as "i386-unknown-freebsd"...(no debugging symbols found)...
> Core was generated by `newsyslog'.
> Program terminated with signal 10, Bus error.
> Reading symbols from /libexec/ld-elf.so.1...(no debugging symbols found)...done.
> Loaded symbols for /libexec/ld-elf.so.1
> #0 0x28063580 in __sflags () from /libexec/ld-elf.so.1
> (gdb) where
> #0 0x28063580 in __sflags () from /libexec/ld-elf.so.1
> #1 0x280581a2 in lm_init () from /libexec/ld-elf.so.1
> #2 0x280536f8 in _rtld () from /libexec/ld-elf.so.1
> (gdb)
>
> For firefox-bin (my browser), it was a little different
>
> #0 0x289f6527 in kill () from /lib/libc.so.5
> #1 0x289eb944 in raise () from /lib/libc.so.5
> #2 0x08056180 in nsProfileLock::FatalSignalHandler(int) () 
> #3 0x288e093c in _thread_sig_handler () from /usr/lib/libc_r.so.5
> #4 0x288e07bb in _thread_sig_handler () from /usr/lib/libc_r.so.5
>
>
> --
> Hari Bhaskaran
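
A minimal sketch for summarizing kernel messages like the ones quoted above, assuming the "pid N (prog), uid U: exited on signal S" line format seen in this post. It tallies the exits per program and signal number, which makes it easier to see that nearly every new process dies on signal 10 (reported by gdb as a bus error). The names `EXIT_RE`, `tally_signal_exits`, and `SAMPLE` are made up for illustration; the sample lines are copied from the log excerpt.

```python
import re
from collections import Counter

# Matches kernel lines such as:
#   "... kernel: pid 31927 (atrun), uid 0: exited on signal 10 (core dumped)"
# Assumption: the format is exactly as it appears in the quoted log excerpt.
EXIT_RE = re.compile(
    r"pid\s+(?P<pid>\d+)\s+\((?P<prog>[^)]+)\),\s+uid\s+(?P<uid>\d+):\s+"
    r"exited on signal\s+(?P<sig>\d+)"
)

def tally_signal_exits(lines):
    """Count 'exited on signal' messages per (program, signal number)."""
    counts = Counter()
    for line in lines:
        m = EXIT_RE.search(line)
        if m:  # lines that are not signal exits (e.g. ldt warnings) are skipped
            counts[(m.group("prog"), int(m.group("sig")))] += 1
    return counts

# A few lines taken from the excerpt, used only as sample input.
SAMPLE = [
    "Jul 9 12:45:56 anaimudi kernel: Warning: pid 31924 used static ldt allocation.",
    "Jul 9 12:50:00 anaimudi kernel: pid 31927 (atrun), uid 0: exited on signal 10 (core dumped)",
    "Jul 9 12:55:00 anaimudi kernel: pid 31936 (jot), uid 2: exited on signal 10",
    "Jul 9 13:01:15 anaimudi kernel: pid 30542 (firefox-bin), uid 1001: exited on signal 11 (core dumped)",
]

if __name__ == "__main__":
    for (prog, sig), n in sorted(tally_signal_exits(SAMPLE).items()):
        print(f"{prog:12s} signal {sig:2d}: {n}")
```

To run it against the real log, the lines of /var/log/messages could be passed in place of `SAMPLE`, for example `tally_signal_exits(open("/var/log/messages"))`.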