Date: Sun, 07 Sep 2003 14:21:24 -0500
To: Chuck Swiger
From: "Jack L. Stone"
cc: freebsd-questions@freebsd.org
Subject: Re: Random crash and/or reboots
Message-Id: <3.0.5.32.20030907142124.013a9ef0@sage-one.net>
In-Reply-To: <3F5B724C.8040606@mac.com>

At 02:00 PM 9.7.2003 -0400, Chuck Swiger wrote:
>Jack L. Stone wrote:
>> A while back, on a couple of occasions, I posted a query about some bad
>> behavior on my mail server. For the past several months, it has been either
>> crashing/reboot or just rebooting.
>> It's ALWAYS triggered by an SSH login,
>> but at random and ONLY at the "su" to root -- usually the most time before
>> reboot is about 2+ weeks and then contrasted by 2 in a row right after the
>> reboot -- actually no pattern. It has never happened directly at the console.
>[ ... ]
>> There are no indications of anything in the logs, and no core dumps. It
>> just stops and reboots, at any random time it picks. Only a couple of times
>> it has crashed without the remote login.
>
>These two paragraphs contradict each other, at least in part. :-)
>

Except, I doubt if those 2 nighttime reboots had the same problem... that's why
I said always triggered by login to root... forget the 2 unrelated ones.

>You're seeing frequent crashes, which seem to be strongly correlated with
>logging in as root, but you've also noticed crashes "without the remote login",
>too? You should build a debug kernel, and enable dumping the system to swap
>upon a panic ("man crash"), so that you have more information about the crash.
>
>> One tip was that I might have stale NFS mountabs -- cleared them out, but
>> problem persisted.
>>
>> The above tip was suggested when I mentioned that on a couple or more of
>> the occurrences, I managed to get to the console quickly enough to see (in
>> bright bold) "lockmgr locking against myself" -- or close to that. My
>> google of that error does mention stale mounts, but mostly about esoteric
>> code stuff. No fix found anywhere.
>
>Hmm. Are you performing local mail delivery to NFS volumes?
>

No, just running backups to the backup server over NFS... and sharing
/usr/ports, /usr/obj, and /usr/src from the "build" machines.

>Normally (or historically, anyway), NFS locking problems cause rpc.lockd to
>crash or wedge, thus resulting in NFS locking not working and possibly grim
>results to file consistency for anything being changed by two or more processes
>at the same time.
>
>However, NFS locking problems generally do not result in a system panic.
>
>[ ... ]
>> http://sageweb/tmp/1-lsof.txt
>> http://sageweb/tmp/2-lsof.txt
>
>These URLs aren't fully-qualified hostnames. Please try again. :-)
>

Yeah, drats! Already sent these:
http://www.sageweb.net/tmp/1-lsof.txt
http://www.sageweb.net/tmp/2-lsof.txt
http://www.sageweb.net/tmp/3-lsof.txt
http://www.sageweb.net/tmp/4-lsof.txt
http://www.sageweb.net/tmp/5-lsof.txt
http://www.sageweb.net/tmp/6-lsof.txt

>-Chuck
>
>

Best regards,
Jack L. Stone,
Administrator

SageOne Net
http://www.sage-one.net
jackstone@sage-one.net