From owner-freebsd-sparc64@FreeBSD.ORG Sun Jun 26 22:32:29 2005
Date: Sun, 26 Jun 2005 15:32:29 -0700 (PDT)
From: Doug White
To: sparc64@freebsd.org
Message-ID: <20050626152732.X66393@carver.gumbysoft.com>
Subject: bug in trap handler?

Hey folks,

I fished a 440MHz USIIi on a Panther board out of a boneyard the other day
and it doesn't survive a kernel build, dying with this:

RED State Exception

TL=0000.0000.0000.0005 TT=0000.0000.0000.0010
TPC=0000.0000.c003.c200 TnPC=0000.0000.c003.c204 TSTATE=0000.0044.5800.1503
TL=0000.0000.0000.0004 TT=0000.0000.0000.0010
TPC=0000.0000.c003.c200 TnPC=0000.0000.c003.c204 TSTATE=0000.0044.5800.1503
TL=0000.0000.0000.0003 TT=0000.0000.0000.0010
TPC=0000.0000.c003.c200 TnPC=0000.0000.c003.c204 TSTATE=0000.0044.5800.1503
TL=0000.0000.0000.0002 TT=0000.0000.0000.0010
TPC=0000.0000.c004.0f80 TnPC=0000.0000.c004.0f84 TSTATE=0000.0044.5800.1403
TL=0000.0000.0000.0001 TT=0000.0000.0000.0063
TPC=0000.0000.0012.6b80 TnPC=0000.0000.0012.6b84 TSTATE=0000.0044.0000.1202

I found the .traps OBP command, and trap type 0x63 is an ECC error. PC
0xc0040f80 in the kernel is in trap(). Checking the code, we should be
panicking, since this is an undefined (implementation-specific) trap number
as far as FreeBSD is concerned. But we shouldn't be spiralling into endless
invalid-instruction traps (type 0x10).

I booted Solaris 9 on the machine and it properly reports memory ECC errors
on the system, so this shouldn't be fatal.

So it appears there are two bugs:

1. We should define trap type 0x63 like the existing ecc-error trap number
   (0x41?) and just notify the user.
2. We should fix the trap handler to not blow up on undefined traps.

I'll try to dig into this, but someone more familiar with SPARC assembly
would find it much faster :)

Thanks for any input!

-- 
Doug White                    |  FreeBSD: The Power to Serve
dwhite@gumbysoft.com          |  www.FreeBSD.org
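
For anyone who wants to read the dump above in the order the traps were taken, here is a rough Python sketch that splits it into one record per trap level and sorts by TL. It is only an illustration: the function names (parse_red_state, describe) and the TT_NAMES table are made up for this note, and the two trap-type labels are simply the ones mentioned in the message (0x63 as reported by the OBP .traps command, 0x10 as the invalid-instruction trap). Other trap types are deliberately left unnamed.

```python
import re

# Trap-type labels taken only from the message above; anything else stays unnamed.
TT_NAMES = {
    0x10: "invalid instruction",
    0x63: "ECC error (per OBP .traps)",
}

# Each field looks like NAME=hhhh.hhhh.hhhh.hhhh (hex digits grouped by dots).
FIELD = re.compile(r"(TL|TT|TPC|TnPC|TSTATE)=([0-9a-fA-F.]+)")


def parse_red_state(text):
    """Return one dict per trap level, in the order they appear in the dump."""
    levels = []
    for name, raw in FIELD.findall(text):
        value = int(raw.replace(".", ""), 16)
        if name == "TL":
            levels.append({})  # a TL field starts a new record
        if levels:
            levels[-1][name] = value
    return levels


def describe(levels):
    """Format the records in ascending TL, i.e. oldest trap first."""
    lines = []
    for lv in sorted(levels, key=lambda d: d["TL"]):
        tt = lv["TT"]
        label = TT_NAMES.get(tt, "unnamed here")
        lines.append(f"TL={lv['TL']} TT={tt:#04x} ({label}) TPC={lv['TPC']:#x}")
    return lines


DUMP = """
TL=0000.0000.0000.0005 TT=0000.0000.0000.0010
TPC=0000.0000.c003.c200 TnPC=0000.0000.c003.c204 TSTATE=0000.0044.5800.1503
TL=0000.0000.0000.0004 TT=0000.0000.0000.0010
TPC=0000.0000.c003.c200 TnPC=0000.0000.c003.c204 TSTATE=0000.0044.5800.1503
TL=0000.0000.0000.0003 TT=0000.0000.0000.0010
TPC=0000.0000.c003.c200 TnPC=0000.0000.c003.c204 TSTATE=0000.0044.5800.1503
TL=0000.0000.0000.0002 TT=0000.0000.0000.0010
TPC=0000.0000.c004.0f80 TnPC=0000.0000.c004.0f84 TSTATE=0000.0044.5800.1403
TL=0000.0000.0000.0001 TT=0000.0000.0000.0063
TPC=0000.0000.0012.6b80 TnPC=0000.0000.0012.6b84 TSTATE=0000.0044.0000.1202
"""

if __name__ == "__main__":
    for line in describe(parse_red_state(DUMP)):
        print(line)
```

Read oldest-first, the records show the 0x63 trap at TL=1, followed by 0x10 traps at TL=2 through TL=5, with the TL=2 record carrying the 0xc0040f80 PC that the message places in trap().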