From: Wilko Bulte
To: Jens Röder
cc: freebsd-alpha@freebsd.org
Date: Wed, 9 Apr 2003 20:07:53 +0200
Subject: Re: alpha/50659: reboot causes SRM console to loop endless error and needs to be restetted hard
Message-ID: <20030409180753.GB14966@freebie.xs4all.nl>
References: <20030408181856.GA10163@freebie.xs4all.nl>

On Wed, Apr 09, 2003 at 10:56:35AM +0200, Jens Röder wrote:
>
> Hello Wilko,
>
> thanks a lot for the kind reply. I will go into more details below:
>
> On Tue, 8 Apr 2003, Wilko Bulte wrote:
>
> > > It run perfectly under FreeBSD 4.7 but unfortunately the kernel was not
> > > stable with having probably problems in memory so that I tried the 5.0.
> >
> > Do you mean it reports Processor Correctable memory errors? How much memory
> > does it have?
>
> The machine has about 1 GB RAM. Honestly I am not sure what "processor

1GB... that is overkill for a gateway, but hey, it should not hurt ;)

> correctable memory errors" are, maybe it helps to show the output. That
> was from a selfcompiled kernel under 4.7 but I had the same problems when
> trying a generic.

That is a kernel panic, not a memory problem ;)

Most Alphas, and your AS500 too, have ECC (error correction) memory.
That allows single bit memory errors to be corrected. The kernel will
tell you if a correction was applied; these are the processor correctable
errors I mentioned.

>
> Mar 20 10:54:55 ptchgate /kernel:
> Mar 20 10:54:55 ptchgate /kernel: fatal kernel trap:
> Mar 20 10:54:55 ptchgate /kernel:
> Mar 20 10:54:55 ptchgate /kernel: trap entry = 0x4 (unaligned access fault)
> Mar 20 10:54:55 ptchgate /kernel: a0 = 0xfffffca900010021
> Mar 20 10:54:55 ptchgate /kernel: a1 = 0x2c
> Mar 20 10:54:55 ptchgate /kernel: a2 = 0x11
> Mar 20 10:54:55 ptchgate /kernel: pc = 0xfffffc00004f8564
> Mar 20 10:54:55 ptchgate /kernel: ra = 0xfffffc00004942b4
> Mar 20 10:54:55 ptchgate /kernel: curproc = 0
> Mar 20 10:54:55 ptchgate /kernel:

Unaligned accesses in kernel mode are Bad(TM). Please check the Handbook
on how to get more debug info out of the crash.

> At the moment I consider also defect memory and will check that as soon as
> I have a temporarily replacement for that Institute gateway and a night

Very unlikely, this looks like a problem in the kernel to me.

> Meanwhile I have compiled a kernel with sufficient debug mode with the
> hope to offer proper error messages.

Can you catch a crash dump maybe?

> I think the "unaligned access error" when listing the firewall rules
> showed only up in the 5.0 version. I will probably downgrade to 4.7 or 4.8
> (what is better to use?) again and recompile with ipfw2 then, and let you
> know then.
> Before I will try to produce proper error messages with the
> debug kernel of 5.0.

I'd go for 4.8. Do you need any ipfw2 functionality?

> Maybe you can try out the SRM console problem without upgrading to 5.0 as
> I remember I first noticed it, when I booted from floppy or CD and called
> the machine to abort. I thought first of the errors reason to be my fault
> because of the abortion. Again 4.7 did not have that problem.

I have a fresh 4.8 on my AS500 and that does not show me the problem.
What kind of PCI cards are in the machine? Can you post a SHOW CONF from
the SRM?

Wilko
-- 
|   / o / /_  _   		wilko@FreeBSD.org
|/|/ / / /(  (_)  Bulte