From owner-freebsd-sparc64@FreeBSD.ORG Sun Aug 31 00:58:50 2008 Return-Path: Delivered-To: sparc64@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 60DED1065691 for ; Sun, 31 Aug 2008 00:58:50 +0000 (UTC) (envelope-from jb@what-creek.com) Received: from what-creek.com (what-creek.com [66.111.37.70]) by mx1.freebsd.org (Postfix) with ESMTP id 3B6AB8FC08 for ; Sun, 31 Aug 2008 00:58:50 +0000 (UTC) (envelope-from jb@what-creek.com) Received: by what-creek.com (Postfix, from userid 102) id 8B1D7742D0; Sun, 31 Aug 2008 00:42:40 +0000 (GMT) Date: Sun, 31 Aug 2008 00:42:40 +0000 From: John Birrell To: fbsd-sparc64@bzerk.org Message-ID: <20080831004240.GA48524@what-creek.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.4.2.3i Cc: sparc64@freebsd.org Subject: Re: DTrace on sparc64 X-BeenThere: freebsd-sparc64@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Porting FreeBSD to the Sparc List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 31 Aug 2008 00:58:50 -0000 Hi Ruben, There are currently no patches to support running DTrace on FreeBSD/sparc64 because I only have access to a T2000 (that Sun sent to me) and a T1000 that the FreeBSD Foundation contributed to the FreeBSD cluster. Unfortunately, the sun4v is pretty much dead in the water. I've tried to figure out where the bug is, but it requires a smart person who can read the code and check if it complies with the T1 spec. It's hard to debug because I can't trap the bug even using the hypervisor traps. For sparc processors prior to the T1 which I assume you are asking about, I'd need access to a machine to make any headway. Obviously the code exists in OpenSolaris to support DTrace on sparc, so there is a reference to work to. 
The implementation gets a bit complicated because we can't add any CDDL code to the GENERIC kernel, so we need to implement hooks that are BSD licensed that allow the CDDL code to register itself. I'd really like to see other people working to develop this code. One person working on such a major feature is a huge bottleneck. I know that pjd@ has the same problem with ZFS. He'd love to have other people contributing too. To be honest, although I respect Sun Microsystems' contribution to the free software community, Sparc support isn't high on my agenda because I think that a lot of people who want to use Sun hardware will use OpenSolaris. After all, it's free too. MIPS has recently sent me their reference board, the MALTA, and MIPS support in FreeBSD is actively being developed, so my highest priority on the DTrace front is to do the MIPS port. -- John Birrell From owner-freebsd-sparc64@FreeBSD.ORG Sun Aug 31 05:42:56 2008 Return-Path: Delivered-To: freebsd-sparc64@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 34E721065670 for ; Sun, 31 Aug 2008 05:42:56 +0000 (UTC) (envelope-from carton@Ivy.NET) Received: from sakima.Ivy.NET (sakima.Ivy.NET [IPv6:2610:1f8:dc:41:220:edff:fe27:e764]) by mx1.freebsd.org (Postfix) with ESMTP id 9D65E8FC0C for ; Sun, 31 Aug 2008 05:42:55 +0000 (UTC) (envelope-from carton@Ivy.NET) Received: from castrovalva.Ivy.NET (castrovalva.Ivy.NET [IPv6:2610:1f8:dc:c0::3]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by sakima.Ivy.NET (Postfix) with ESMTP id C3040A8093 for ; Sun, 31 Aug 2008 01:38:03 -0400 (EDT) Received: by castrovalva.Ivy.NET (Postfix, from userid 405) id 2D1AC12FD0D; Sun, 31 Aug 2008 01:42:52 -0400 (EDT) To: freebsd-sparc64@freebsd.org References: <20080831004240.GA48524@what-creek.com> From: Miles Nordin MIME-Version: 1.0 (generated by SEMI 1.14.6 - "Maruoka") Content-Type: multipart/signed; 
boundary="pgp-sign-Multipart_Sun_Aug_31_01:42:51_2008-1"; micalg=pgp-sha1; protocol="application/pgp-signature" Content-Transfer-Encoding: 7bit Date: Sun, 31 Aug 2008 01:42:52 -0400 In-Reply-To: <20080831004240.GA48524@what-creek.com> (John Birrell's message of "Sun, 31 Aug 2008 00:42:40 +0000") Message-ID: User-Agent: T-gnus/6.17.2 (based on No Gnus v0.2) SEMI/1.14.6 (Maruoka) FLIM/1.14.7 (=?ISO-8859-4?Q?Sanj=F2?=) APEL/10.6 Emacs/21.4 (alpha--netbsd) MULE/5.0 (SAKAKI) Subject: Re: DTrace on sparc64 X-BeenThere: freebsd-sparc64@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Porting FreeBSD to the Sparc List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 31 Aug 2008 05:42:56 -0000 --pgp-sign-Multipart_Sun_Aug_31_01:42:51_2008-1 Content-Type: text/plain; charset=US-ASCII >>>>> "jb" == John Birrell writes: jb> After all, it's free too. huge chunks of it, maybe even more than half, are not free as in freedom. but...yeah...I'm running it. 
--pgp-sign-Multipart_Sun_Aug_31_01:42:51_2008-1 Content-Type: application/pgp-signature Content-Transfer-Encoding: 7bit -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.2 (NetBSD) iQCVAwUASLovW4nCBbTaW/4dAQJwnAP+ICrvEH1/xtiCDa1mGPWw+BdgrNL06hVx KNjayR7+fp+QwlAQLhugv8YYgGqPOOYz6ubjvw3sA8I4KbpQAbON/aC5xUyZazNO 7lopD3I8x47uJLxPsM7MoY+nsJ3ymXzWqLO7immDgtDbGmkZB1PDbfCzySv43KdK lnogpcufoLw= =u1re -----END PGP SIGNATURE----- --pgp-sign-Multipart_Sun_Aug_31_01:42:51_2008-1-- From owner-freebsd-sparc64@FreeBSD.ORG Mon Sep 1 11:07:03 2008 Return-Path: Delivered-To: freebsd-sparc64@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 6BE0F1065672 for ; Mon, 1 Sep 2008 11:07:02 +0000 (UTC) (envelope-from owner-bugmaster@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id DFD458FC1D for ; Mon, 1 Sep 2008 11:07:02 +0000 (UTC) (envelope-from owner-bugmaster@FreeBSD.org) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.2/8.14.2) with ESMTP id m81B72g9068576 for ; Mon, 1 Sep 2008 11:07:02 GMT (envelope-from owner-bugmaster@FreeBSD.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.2/8.14.1/Submit) id m81B72Qa068572 for freebsd-sparc64@FreeBSD.org; Mon, 1 Sep 2008 11:07:02 GMT (envelope-from owner-bugmaster@FreeBSD.org) Date: Mon, 1 Sep 2008 11:07:02 GMT Message-Id: <200809011107.m81B72Qa068572@freefall.freebsd.org> X-Authentication-Warning: freefall.freebsd.org: gnats set sender to owner-bugmaster@FreeBSD.org using -f From: FreeBSD bugmaster To: freebsd-sparc64@FreeBSD.org Cc: Subject: Current problem reports assigned to freebsd-sparc64@FreeBSD.org X-BeenThere: freebsd-sparc64@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Porting FreeBSD to the Sparc List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: 
Mon, 01 Sep 2008 11:07:03 -0000

Current FreeBSD problem reports

Critical problems

Serious problems

S Tracker      Resp.      Description
--------------------------------------------------------------------------------
o sparc/71729  sparc64    printf in kernel thread causes panic on SPARC
o sparc/80410  sparc64    [netgraph] netgraph is causing crash with mpd on sparc
o sparc/80890  sparc64    [panic] kmem_malloc(73728): kmem_map too small running
o sparc/95297  sparc64    vt100 term does not work in install
o sparc/104428 sparc64    [nullfs] nullfs panics on E4500 (but not E420)
o sparc/105048 sparc64    [trm] trm(4) panics on sparc64
f sparc/105607 sparc64    [modules] modules on sparc64 don't work with >= 4GB
f sparc/106251 sparc64    [libmalloc] malloc fails > for large allocations
s sparc/107087 sparc64    system is hinged during boot from CD
o sparc/109908 sparc64    apache22 mod_perl issue on sparc64
o sparc/113556 sparc64    panic: trap: memory address not aligned; Rebooting...
o sparc/118932 sparc64    7.0-BETA4/sparc-64 kernel panic in rip_output
o sparc/119017 sparc64    7.0 Beta won't install on U60
s sparc/119239 sparc64    gdb coredumps on sparc64
o sparc/119244 sparc64    X11Forwarding to X11 server on sparc crashes Xorg

15 problems total.

Non-critical problems

S Tracker      Resp.      Description
--------------------------------------------------------------------------------
f sparc/105157 sparc64    No reply to ping on Sparc64
f sparc/108732 sparc64    ping(8) reports 14 digit time on sparc64
o sparc/119240 sparc64    top has WCPU over 100% on UP system

3 problems total.

From owner-freebsd-sparc64@FreeBSD.ORG Mon Sep 1 14:39:46 2008 Return-Path: Delivered-To: freebsd-sparc64@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 6C5BE1065692 for ; Mon, 1 Sep 2008 14:39:46 +0000 (UTC) (envelope-from gavin@FreeBSD.org) Received: from buffy.york.ac.uk (buffy.york.ac.uk [144.32.226.160]) by mx1.freebsd.org (Postfix) with ESMTP id 1B9BC8FC16 for ; Mon, 1 Sep 2008 14:39:45 +0000 (UTC) (envelope-from gavin@FreeBSD.org) Received: from buffy.york.ac.uk (localhost [127.0.0.1]) by buffy.york.ac.uk (8.14.2/8.14.2) with ESMTP id m81EKSHB072396; Mon, 1 Sep 2008 15:20:28 +0100 (BST) (envelope-from gavin@FreeBSD.org) Received: (from ga9@localhost) by buffy.york.ac.uk (8.14.2/8.14.2/Submit) id m81EKSWN072395; Mon, 1 Sep 2008 15:20:28 +0100 (BST) (envelope-from gavin@FreeBSD.org) X-Authentication-Warning: buffy.york.ac.uk: ga9 set sender to gavin@FreeBSD.org using -f From: Gavin Atkinson To: freebsd-sparc64@FreeBSD.org Content-Type: multipart/mixed; boundary="=-h9I5aZ0SRzwwwXEO2JZI" Date: Mon, 01 Sep 2008 15:20:27 +0100 Message-Id: <1220278827.70590.35.camel@buffy.york.ac.uk> Mime-Version: 1.0 X-Mailer: Evolution 2.22.2 FreeBSD GNOME Team Port Cc: marius@FreeBSD.org Subject: HEAD panic with ofw_pcibus.c 1.21 on Blade 100 X-BeenThere: freebsd-sparc64@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Porting FreeBSD to the Sparc List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 01 Sep 2008 14:39:46 -0000 --=-h9I5aZ0SRzwwwXEO2JZI Content-Type: text/plain Content-Transfer-Encoding: 7bit Hi all, My Blade 100 now panics on boot with HEAD, and I've tracked it down to sys/sparc64/pci/ofw_pcibus.c 1.21 (SVN r182108) by marius@. 
Specifically, this version now configures bridges differently, and not
setting "Master Abort Mode" prevents the panic:

Index: src/sys/sparc64/pci/ofw_pcibus.c
===================================================================
RCS file: /home/ncvs/src/sys/sparc64/pci/ofw_pcibus.c,v
retrieving revision 1.21
diff -u -r1.21 ofw_pcibus.c
--- src/sys/sparc64/pci/ofw_pcibus.c 24 Aug 2008 15:05:46 -0000 1.21
+++ src/sys/sparc64/pci/ofw_pcibus.c 1 Sep 2008 14:09:27 -0000
@@ -140,7 +140,7 @@
 			    PCIM_HDRTYPE) == PCIM_HDRTYPE_BRIDGE) {
 				reg = PCIB_READ_CONFIG(bridge, busno, slot, func,
 				    PCIR_BRIDGECTL_1, 1);
-				reg |= PCIB_BCR_MASTER_ABORT_MODE | PCIB_BCR_SERR_ENABLE |
+				reg |= /* PCIB_BCR_MASTER_ABORT_MODE | */ PCIB_BCR_SERR_ENABLE |
 				    PCIB_BCR_PERR_ENABLE;
 #ifdef OFW_PCI_DEBUG
 				device_printf(bridge,

My Blade 100 (dmesg and panic backtrace attached) has three extra ATI
graphics cards installed (Official Sun ones, PN 370-4362), it doesn't
panic with these removed.  Removing them and throwing a generic fxp(4)
card into one of the slots also gives the panic, so I suspect having
anything in at least one of the slots will cause a panic for me.

I'm pretty sure the panic is not hardware related, as the machine will
happily run Solaris 10.

Any suggestions?  Are we missing some code necessary to support master
mode aborts?  I'm happy to test anything necessary.  This code was also
MFC'd, so I'm concerned about seeing 7.1 also have this issue.

Thanks, Gavin --=-h9I5aZ0SRzwwwXEO2JZI Content-Disposition: attachment; filename=violet-panic.txt Content-Type: text/plain; name=violet-panic.txt; charset=ASCII Content-Transfer-Encoding: base64 anVtcGluZyB0byBrZXJuZWwgZW50cnkgYXQgMHhjMDA3ODAwMC4NCkdEQjogbm8gZGVidWcgcG9y dHMgcHJlc2VudA0KS0RCOiBkZWJ1Z2dlciBiYWNrZW5kczogZGRiDQpLREI6IGN1cnJlbnQgYmFj a2VuZDogZGRiDQpDb3B5cmlnaHQgKGMpIDE5OTItMjAwOCBUaGUgRnJlZUJTRCBQcm9qZWN0Lg0K Q29weXJpZ2h0IChjKSAxOTc5LCAxOTgwLCAxOTgzLCAxOTg2LCAxOTg4LCAxOTg5LCAxOTkxLCAx OTkyLCAxOTkzLCAxOTk0DQogICAgICAgIFRoZSBSZWdlbnRzIG9mIHRoZSBVbml2ZXJzaXR5IG9m IENhbGlmb3JuaWEuIEFsbCByaWdodHMgcmVzZXJ2ZWQuDQpGcmVlQlNEIGlzIGEgcmVnaXN0ZXJl ZCB0cmFkZW1hcmsgb2YgVGhlIEZyZWVCU0QgRm91bmRhdGlvbi4NCkZyZWVCU0QgOC4wLUNVUlJF TlQgIzEwOiBNb24gU2VwICAxIDEyOjI1OjI3IEJTVCAyMDA4DQogICAgcm9vdEB2aW9sZXQueW9y ay5hYy51azovdXNyL29iai91c3Ivc3JjL3N5cy9HRU5FUklDDQpXQVJOSU5HOiBXSVRORVNTIG9w dGlvbiBlbmFibGVkLCBleHBlY3QgcmVkdWNlZCBwZXJmb3JtYW5jZS4NClRpbWVjb3VudGVyICJ0 aWNrIiBmcmVxdWVuY3kgNTAyMDAwMDAwIEh6IHF1YWxpdHkgMTAwMA0KcmVhbCBtZW1vcnkgID0g NTM2ODcwOTEyICg1MTIgTUIpDQphdmFpbCBtZW1vcnkgPSA1MDY0NTQwMTYgKDQ4MiBNQikNCmNw dTA6IFN1biBNaWNyb3N5c3RlbXMgVWx0cmFTcGFyYy1JSWUgUHJvY2Vzc29yICg1MDIuMDAgTUh6 IENQVSkNCnJlZ2lzdGVyZWQgZmlybXdhcmUgc2V0IDxpc3BfMTAwMD4NCnJlZ2lzdGVyZWQgZmly bXdhcmUgc2V0IDxpc3BfMTA0MD4NCnJlZ2lzdGVyZWQgZmlybXdhcmUgc2V0IDxpc3BfMTA0MF9p dD4NCnJlZ2lzdGVyZWQgZmlybXdhcmUgc2V0IDxpc3BfMTA4MD4NCnJlZ2lzdGVyZWQgZmlybXdh cmUgc2V0IDxpc3BfMTA4MF9pdD4NCnJlZ2lzdGVyZWQgZmlybXdhcmUgc2V0IDxpc3BfMTIxNjA+ DQpyZWdpc3RlcmVkIGZpcm13YXJlIHNldCA8aXNwXzEyMTYwX2l0Pg0KcmVnaXN0ZXJlZCBmaXJt d2FyZSBzZXQgPGlzcF8yMTAwPg0KcmVnaXN0ZXJlZCBmaXJtd2FyZSBzZXQgPGlzcF8yMjAwPg0K cmVnaXN0ZXJlZCBmaXJtd2FyZSBzZXQgPGlzcF8yMzAwPg0KcmVnaXN0ZXJlZCBmaXJtd2FyZSBz ZXQgPGlzcF8yMzIyPg0KcmVnaXN0ZXJlZCBmaXJtd2FyZSBzZXQgPGlzcF8yNDAwPg0Ka2JkMCBh dCBrYmRtdXgwDQphdGhfaGFsOiAwLjkuMjAuMyAoQVI1MjEwLCBBUjUyMTEsIEFSNTIxMiwgUkY1 MTExLCBSRjUxMTIsIFJGMjQxMywgUkY1NDEzLCBSRUdPUFNfRlVOQykNCm5leHVzMDogPE9wZW4g 
RmlybXdhcmUgTmV4dXMgZGV2aWNlPg0KcGNpYjA6IDxVMlAgVVBBLVBDSSBicmlkZ2U+IG1lbSAw eDFmZTAwMDAwMDAwLTB4MWZlMDAwMGZmZmYsMHgxZmUwMTAwMDAwMC0weDFmZTAxMDAwMGZmIGly cSAyMDMyLDIwMzAsMjAzMSwyMDIxIG9uIG5leHVzMA0KcGNpYjA6IEh1bW1pbmdiaXJkIGNvbXBh dGlibGUsIGltcGwgMCwgdmVyc2lvbiAwLCBJR04gMHgxZiwgYnVzIEENCnBjaWIwOiBbRklMVEVS XQ0KcGNpYjA6IFtGSUxURVJdDQpwY2liMDogW0dJQU5ULUxPQ0tFRF0NCnBjaWIwOiBbSVRIUkVB RF0NCnBjaWIwOiBEVk1BIG1hcDogMHhjMDAwMDAwMCB0byAweGMzZmZmZmZmDQpwY2liMDogW0ZJ TFRFUl0NCnBjaTA6IDxPRlcgUENJIGJ1cz4gb24gcGNpYjANCnBjaWIwOiBkZXZpY2UgMC8xMi8w OiBsYXRlbmN5IHRpbWVyIDY0IC0+IDgwDQpwY2liMDogZGV2aWNlIDAvNy8wOiBsYXRlbmN5IHRp bWVyIDAgLT4gNjQNCnBjaWIwOiBkZXZpY2UgMC8xMi8xOiBsYXRlbmN5IHRpbWVyIDY0IC0+IDgw DQpwY2liMDogZGV2aWNlIDAvMTIvMjogbGF0ZW5jeSB0aW1lciA2NCAtPiA4MA0KcGNpYjA6IGRl dmljZSAwLzEyLzM6IGxhdGVuY3kgdGltZXIgNjQgLT4gODANCnBjaWIwOiBkZXZpY2UgMC8zLzA6 IGxhdGVuY3kgdGltZXIgMCAtPiA2NA0KcGNpYjA6IGRldmljZSAwLzgvMDogbGF0ZW5jeSB0aW1l ciA2NCAtPiAxNg0KcGNpYjA6IGRldmljZSAwLzEzLzA6IGxhdGVuY3kgdGltZXIgNjQgLT4gMTYN CnBjaWIwOiBkZXZpY2UgMC8xOS8wOiBsYXRlbmN5IHRpbWVyIDY0IC0+IDY0DQpwY2liMDogYnJp ZGdlIDAvNS8wOiBjb250cm9sIDB4MCAtPiAweDIzDQpwY2liMDogYnJpZGdlIDAvNS8wOiBsYXRl bmN5IHRpbWVyIDAgLT4gNjQNCnBjaWIwOiBkZXZpY2UgMC81LzA6IGxhdGVuY3kgdGltZXIgNjQg LT4gNjQNCmVidXMwOiA8UENJLUVCdXMzIGJyaWRnZT4gbWVtIDB4ZjAwMDAwMDAtMHhmMGZmZmZm ZiwweGYxMDAwMDAwLTB4ZjE3ZmZmZmYgYXQgZGV2aWNlIDEyLjAgb24gcGNpMA0KZWJ1czA6IDxp ZHByb20+OiBpbmNvbXBsZXRlDQplYnVzMDogPGZsYXNocHJvbT4gYWRkciAwLTB4ZmZmZmYgKG5v IGRyaXZlciBhdHRhY2hlZCkNCmVlcHJvbTA6IDxFRVBST00vY2xvY2s+IGFkZHIgMHgxMDAwMDAw MDAtMHgxMDAwMDFmZmYgb24gZWJ1czANCmVlcHJvbTA6IG1vZGVsIG1rNDh0NTkNCmlzYWIwOiA8 UENJLUlTQSBicmlkZ2U+IGF0IGRldmljZSA3LjAgb24gcGNpMA0KaXNhMDogPElTQSBidXM+IG9u IGlzYWIwDQpnZW0wOiA8U3VuIEVSSSAxMC8xMDAgRXRoZXJuZXQ+IG1lbSAweDQwMDAwMC0weDQx ZmZmZiBhdCBkZXZpY2UgMTIuMSBvbiBwY2kwDQptaWlidXMwOiA8TUlJIGJ1cz4gb24gZ2VtMA0K dWtwaHkwOiA8R2VuZXJpYyBJRUVFIDgwMi4zdSBtZWRpYSBpbnRlcmZhY2U+IFBIWSAxIG9uIG1p 
aWJ1czANCnVrcGh5MDogIDEwYmFzZVQsIDEwYmFzZVQtRkRYLCAxMDBiYXNlVFgsIDEwMGJhc2VU WC1GRFgsIGF1dG8NCmdlbTA6IDJrQiBSWCBGSUZPLCAya0IgVFggRklGTw0KZ2VtMDogRXRoZXJu ZXQgYWRkcmVzczogMDA6MDM6YmE6MWQ6OGQ6N2YNCmdlbTA6IFtJVEhSRUFEXQ0KZndvaGNpMDog PFN1biBQQ0lPLTI+IG1lbSAweDQyMDAwMC0weDQyMDdmZiwweDQyMjAwMC0weDQyMjdmZiBhdCBk ZXZpY2UgMTIuMiBvbiBwY2kwDQpmd29oY2kwOiBbRklMVEVSXQ0KZndvaGNpMDogT0hDSSB2ZXJz aW9uIDEuMCAoUk9NPTApDQpmd29oY2kwOiBOby4gb2YgSXNvY2hyb25vdXMgY2hhbm5lbHMgaXMg NC4NCmZ3b2hjaTA6IEVVSTY0IDAwOjAzOmJhOmZmOmZlOjFkOjhkOjdmDQpmd29oY2kwOiBQaHkg MTM5NGEgYXZhaWxhYmxlIFM0MDAsIDIgcG9ydHMuDQpmd29oY2kwOiBMaW5rIFM0MDAsIG1heF9y ZWMgMjA0OCBieXRlcy4NCmZpcmV3aXJlMDogPElFRUUxMzk0KEZpcmVXaXJlKSBidXM+IG9uIGZ3 b2hjaTANCmRjb25zX2Nyb20wOiA8ZGNvbnMgY29uZmlndXJhdGlvbiBST00+IG9uIGZpcmV3aXJl MA0KZGNvbnNfY3JvbTA6IGJ1c19hZGRyIDB4YzExMjgwMDANCmZ3aXAwOiA8SVAgb3ZlciBGaXJl V2lyZT4gb24gZmlyZXdpcmUwDQpmd2lwMDogRmlyZXdpcmUgYWRkcmVzczogMDA6MDM6YmE6ZmY6 ZmU6MWQ6OGQ6N2YgQCAweGZmZmUwMDAwMDAwMCwgUzQwMCwgbWF4cmVjIDIwNDgNCnNicDA6IDxT QlAtMi9TQ1NJIG92ZXIgRmlyZVdpcmU+IG9uIGZpcmV3aXJlMA0KZndlMDogPEV0aGVybmV0IG92 ZXIgRmlyZVdpcmU+IG9uIGZpcmV3aXJlMA0KaWZfZndlMDogRmFrZSBFdGhlcm5ldCBhZGRyZXNz OiAwMjowMzpiYToxZDo4ZDo3Zg0KZndlMDogRXRoZXJuZXQgYWRkcmVzczogMDI6MDM6YmE6MWQ6 OGQ6N2YNCmZ3b2hjaTA6IEluaXRpYXRlIGJ1cyByZXNldA0KZndvaGNpMDogQlVTIHJlc2V0DQpm d29oY2kwOiBub2RlX2lkPTB4YzgwMGZmYzAsIGdlbj0xLCBDWUNMRU1BU1RFUiBtb2RlDQpvaGNp MDogPFN1biBQQ0lPLTIgVVNCIGNvbnRyb2xsZXI+IG1lbSAweDIwMDAwMDAtMHgyMDA3ZmZmIGF0 IGRldmljZSAxMi4zIG9uIHBjaTANCm9oY2kwOiBbR0lBTlQtTE9DS0VEXQ0Kb2hjaTA6IFtJVEhS RUFEXQ0KdXNiMDogT0hDSSB2ZXJzaW9uIDEuMCwgbGVnYWN5IHN1cHBvcnQNCnVzYjA6IDxTdW4g UENJTy0yIFVTQiBjb250cm9sbGVyPiBvbiBvaGNpMA0KdXNiMDogVVNCIHJldmlzaW9uIDEuMA0K dWh1YjA6IDwoMHgxMDhlKSBPSENJIHJvb3QgaHViLCBjbGFzcyA5LzAsIHJldiAxLjAwLzEuMDAs IGFkZHIgMT4gb24gdXNiMA0KdWh1YjA6IDQgcG9ydHMgd2l0aCA0IHJlbW92YWJsZSwgc2VsZiBw b3dlcmVkDQpwY2kwOiA8b2xkLCBub24tVkdBIGRpc3BsYXkgZGV2aWNlPiBhdCBkZXZpY2UgMy4w 
IChubyBkcml2ZXIgYXR0YWNoZWQpDQpwY2kwOiA8bXVsdGltZWRpYSwgYXVkaW8+IGF0IGRldmlj ZSA4LjAgKG5vIGRyaXZlciBhdHRhY2hlZCkNCmF0YXBjaTA6IDxBY2VyTGFicyBNNTIyOSBVRE1B NjYgY29udHJvbGxlcj4gcG9ydCAweGEwMC0weGEwNywweGExOC0weGExYiwweGExMC0weGExNyww eGEwOC0weGEwYiwweGEyMC0weGEyZiBhdCBkZXZpY2UgMTMuMCBvbiBwY2kwDQphdGFwY2kwOiBb SVRIUkVBRF0NCmF0YXBjaTA6IHVzaW5nIFBJTyB0cmFuc2ZlcnMgYWJvdmUgMTM3R0IgYXMgd29y a2Fyb3VuZCBmb3IgNDhiaXQgRE1BIGFjY2VzcyBidWcsIGV4cGVjdCByZWR1Y2VkIHBlcmZvcm1h bmNlDQphdGEyOiA8QVRBIGNoYW5uZWwgMD4gb24gYXRhcGNpMA0KYXRhMjogW0lUSFJFQURdDQph dGEzOiA8QVRBIGNoYW5uZWwgMT4gb24gYXRhcGNpMA0KYXRhMzogW0lUSFJFQURdDQptYWNoZmIw OiA8QVRJIFJhZ2UgWEw+IHBvcnQgMHhiMDAtMHhiZmYgbWVtIDB4MzAwMDAwMC0weDNmZmZmZmYs MHg0MjYwMDAtMHg0MjZmZmYgYXQgZGV2aWNlIDE5LjAgb24gcGNpMA0KbWFjaGZiMDogMTYgTUIg YXBlcnR1cmUgYXQgMHhkNTkwNjAwMCwgMSBLQiByZWdpc3RlcnMgYXQgMHgwMzdmZmMwMA0KbWFj aGZiMDogODE4OCBLQiBTRFJBTSAxMTQuOTkyIE1IeiwgbWF4aW11bSBSQU1EQUMgY2xvY2sgMjMw IE1IeiwgRFNQDQptYWNoZmIwOiByZXNvbHV0aW9uIDExNTJ4OTAwIGF0IDggYnBwDQpwY2liMTog PE9GVyBQQ0ktUENJIGJyaWRnZT4gYXQgZGV2aWNlIDUuMCBvbiBwY2kwDQpwY2kxOiA8T0ZXIFBD SSBidXM+IG9uIHBjaWIxDQpwY2liMTogZGV2aWNlIDEvMC8wOiBsYXRlbmN5IHRpbWVyIDY0IC0+ IDY0DQpwY2liMTogZGV2aWNlIDEvMS8wOiBsYXRlbmN5IHRpbWVyIDY0IC0+IDY0DQpwY2liMTog ZGV2aWNlIDEvMi8wOiBsYXRlbmN5IHRpbWVyIDY0IC0+IDY0DQptYWNoZmIxOiA8QVRJIFJhZ2Ug WEw+IHBvcnQgMHgxMDAwLTB4MTBmZiBtZW0gMHg0MDAwMDAwLTB4NGZmZmZmZiwweDUwMDAwMDAt MHg1MDAwZmZmIGF0IGRldmljZSAwLjAgb24gcGNpMQ0KbWFjaGZiMTogMTYgTUIgYXBlcnR1cmUg YXQgMHhkNjkwODAwMCwgMSBLQiByZWdpc3RlcnMgYXQgMHgwNDdmZmMwMA0KbWFjaGZiMTogODE4 OCBLQiBTR1JBTSAxMTQuOTkyIE1IeiwgbWF4aW11bSBSQU1EQUMgY2xvY2sgMjMwIE1IeiwgRFNQ DQptYWNoZmIxOiByZXNvbHV0aW9uIDExNTJ4OTAwIGF0IDggYnBwDQptYWNoZmIyOiA8QVRJIFJh Z2UgWEw+IHBvcnQgMHgxMTAwLTB4MTFmZiBtZW0gMHg2MDAwMDAwLTB4NmZmZmZmZiwweDUwMDIw MDAtMHg1MDAyZmZmIGF0IGRldmljZSAxLjAgb24gcGNpMQ0KbWFjaGZiMjogMTYgTUIgYXBlcnR1 cmUgYXQgMHhkNzkwYTAwMCwgMSBLQiByZWdpc3RlcnMgYXQgMHgwNjdmZmMwMA0KbWFjaGZiMjog 
ODE4OCBLQiBTR1JBTSAxMTQuOTkyIE1IeiwgbWF4aW11bSBSQU1EQUMgY2xvY2sgMjMwIE1Ieiwg RFNQDQptYWNoZmIyOiByZXNvbHV0aW9uIDExNTJ4OTAwIGF0IDggYnBwDQptYWNoZmIzOiA8QVRJ IFJhZ2UgWEw+IHBvcnQgMHgxMjAwLTB4MTJmZiBtZW0gMHg3MDAwMDAwLTB4N2ZmZmZmZiwweDUw MDQwMDAtMHg1MDA0ZmZmIGF0IGRldmljZSAyLjAgb24gcGNpMQ0KbWFjaGZiMzogMTYgTUIgYXBl cnR1cmUgYXQgMHhkODkwYzAwMCwgMSBLQiByZWdpc3RlcnMgYXQgMHgwNzdmZmMwMA0KbWFjaGZi MzogODE4OCBLQiBTR1JBTSAxMTQuOTkyIE1IeiwgbWF4aW11bSBSQU1EQUMgY2xvY2sgMjMwIE1I eiwgRFNQDQptYWNoZmIzOiByZXNvbHV0aW9uIDExNTJ4OTAwIGF0IDggYnBwDQpzeXNjb25zMDog PFN5c3RlbSBjb25zb2xlPiBvbiBuZXh1czANCnN5c2NvbnMwOiBVbmtub3duIDwxNiB2aXJ0dWFs IGNvbnNvbGVzLCBmbGFncz0weDEwMD4NCnBhbmljOiBwY2liOiBQQ0kgYnVzIEEgZXJyb3IgQUZB UiAweDFmZTAyMDAxYzgwIEFGU1IgMHg0MDAwMDAwMTAwMDAwMDAwDQpjcHVpZCA9IDANCktEQjog ZW50ZXI6IHBhbmljDQpbdGhyZWFkIHBpZCAwIHRpZCAxMDAwMDAgXQ0KU3RvcHBlZCBhdCAgICAg IGtkYl9lbnRlcisweDgwOiB0YSAgICAgICAgICAgICAgJXhjYywgMQ0KZGI+IHRyDQpUcmFjaW5n IHBpZCAwIHRpZCAxMDAwMDAgdGQgMHhjMDdkMmU3MA0KcGFuaWMoKSBhdCBwYW5pYysweDIwOA0K cHN5Y2hvX3BjaV9idXMoKSBhdCBwc3ljaG9fcGNpX2J1cysweDg4DQppbnRyX2V2ZW50X2hhbmRs ZSgpIGF0IGludHJfZXZlbnRfaGFuZGxlKzB4NWMNCmludHJfZXhlY3V0ZV9oYW5kbGVycygpIGF0 IGludHJfZXhlY3V0ZV9oYW5kbGVycysweDE0DQppbnRyX2Zhc3QoKSBhdCBpbnRyX2Zhc3QrMHg2 OA0KLS0gaW50ZXJydXB0IGxldmVsPTB4ZCBwaWw9MCAlbzc9MHhjMDJlYTU1YyAtLQ0KLS0gZGF0 YSBhY2Nlc3MgZXJyb3IgJW83PTB4YzBjMTc1N2MgLS0NCmFoY19pc2FfZmluZF9kZXZpY2UoKSBh dCBhaGNfaXNhX2ZpbmRfZGV2aWNlKzB4NTANCmFoY19pc2FfaWRlbnRpZnkoKSBhdCBhaGNfaXNh X2lkZW50aWZ5KzB4ZDgNCmJ1c19nZW5lcmljX3Byb2JlKCkgYXQgYnVzX2dlbmVyaWNfcHJvYmUr MHg2NA0KaXNhX3Byb2JlX2NoaWxkcmVuKCkgYXQgaXNhX3Byb2JlX2NoaWxkcmVuKzB4NA0KY29u ZmlndXJlKCkgYXQgY29uZmlndXJlKzB4MmMNCm1pX3N0YXJ0dXAoKSBhdCBtaV9zdGFydHVwKzB4 MThjDQpidGV4dCgpIGF0IGJ0ZXh0KzB4MzQNCmRiPg0KDQoNCg0KDQpOZXh0IGxpbmVzIHRvIGJl IHByaW50ZWQgaWYgdGhlIHBhbmljIGRpZG4ndCBvY2N1cjoNCg0KdWFydDA6IDwxNjU1MCBvciBj b21wYXRpYmxlPiBhdCBwb3J0IDB4M2Y4LTB4M2ZmIGlycSA0MyBvbiBpc2EwDQp1YXJ0MDogW0ZJ 
TFRFUl0NCnVhcnQwOiBjb25zb2xlICg5NjAwLG4sOCwxKQ0KdWFydDE6IDwxNjU1MCBvciBjb21w YXRpYmxlPiBhdCBwb3J0IDB4MmU4LTB4MmVmIGlycSA0MyBvbiBpc2EwDQp1YXJ0MTogW0ZJTFRF Ul0NClRpbWVjb3VudGVycyB0aWNrIGV2ZXJ5IDEuMDAwIG1zZWMNCmZpcmV3aXJlMDogMSBub2Rl cywgbWF4aG9wIDw9IDAsIGNhYmxlIElSTSA9IDAgKG1lKQ0KZmlyZXdpcmUwOiBidXMgbWFuYWdl ciAwIChtZSkNCmFkMDogMTkwOTJNQiA8U2VhZ2F0ZSBTVDMyMDAxMUEgMy4xOT4gYXQgYXRhMi1t YXN0ZXIgVURNQTY2DQphY2QwOiBDRFJXIDxMVE40ODZTL1kzUzI+IGF0IGF0YTItc2xhdmUgVURN QTMzDQpXQVJOSU5HOiBXSVRORVNTIG9wdGlvbiBlbmFibGVkLCBleHBlY3QgcmVkdWNlZCBwZXJm b3JtYW5jZS4NCg== --=-h9I5aZ0SRzwwwXEO2JZI-- From owner-freebsd-sparc64@FreeBSD.ORG Mon Sep 1 16:20:46 2008 Return-Path: Delivered-To: freebsd-sparc64@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 4DC911065671; Mon, 1 Sep 2008 16:20:46 +0000 (UTC) (envelope-from gavin@FreeBSD.org) Received: from buffy.york.ac.uk (buffy.york.ac.uk [144.32.226.160]) by mx1.freebsd.org (Postfix) with ESMTP id DAE4C8FC1D; Mon, 1 Sep 2008 16:20:45 +0000 (UTC) (envelope-from gavin@FreeBSD.org) Received: from buffy.york.ac.uk (localhost [127.0.0.1]) by buffy.york.ac.uk (8.14.2/8.14.2) with ESMTP id m81GKioW080507; Mon, 1 Sep 2008 17:20:44 +0100 (BST) (envelope-from gavin@FreeBSD.org) Received: (from ga9@localhost) by buffy.york.ac.uk (8.14.2/8.14.2/Submit) id m81GKirf080506; Mon, 1 Sep 2008 17:20:44 +0100 (BST) (envelope-from gavin@FreeBSD.org) X-Authentication-Warning: buffy.york.ac.uk: ga9 set sender to gavin@FreeBSD.org using -f From: Gavin Atkinson To: freebsd-sparc64@FreeBSD.org In-Reply-To: <1220278827.70590.35.camel@buffy.york.ac.uk> References: <1220278827.70590.35.camel@buffy.york.ac.uk> Content-Type: text/plain Content-Transfer-Encoding: 7bit Date: Mon, 01 Sep 2008 17:20:44 +0100 Message-Id: <1220286044.70590.43.camel@buffy.york.ac.uk> Mime-Version: 1.0 X-Mailer: Evolution 2.22.2 FreeBSD GNOME Team Port Cc: marius@FreeBSD.org Subject: Re: HEAD panic with ofw_pcibus.c 
1.21 on Blade 100
X-BeenThere: freebsd-sparc64@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Porting FreeBSD to the Sparc
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Mon, 01 Sep 2008 16:20:46 -0000

On Mon, 2008-09-01 at 15:20 +0100, Gavin Atkinson wrote:
> Hi all,
>
> My Blade 100 now panics on boot with HEAD, and I've tracked it down to
> sys/sparc64/pci/ofw_pcibus.c 1.21 (SVN r182108) by marius@.
> Specifically, this version now configures bridges differently, and not
> setting "Master Abort Mode" prevents the panic:
>
> Index: src/sys/sparc64/pci/ofw_pcibus.c
> ===================================================================
> RCS file: /home/ncvs/src/sys/sparc64/pci/ofw_pcibus.c,v
> retrieving revision 1.21
> diff -u -r1.21 ofw_pcibus.c
> --- src/sys/sparc64/pci/ofw_pcibus.c 24 Aug 2008 15:05:46 -0000 1.21
> +++ src/sys/sparc64/pci/ofw_pcibus.c 1 Sep 2008 14:09:27 -0000
> @@ -140,7 +140,7 @@
>  			    PCIM_HDRTYPE) == PCIM_HDRTYPE_BRIDGE) {
>  				reg = PCIB_READ_CONFIG(bridge, busno, slot, func,
>  				    PCIR_BRIDGECTL_1, 1);
> -				reg |= PCIB_BCR_MASTER_ABORT_MODE | PCIB_BCR_SERR_ENABLE |
> +				reg |= /* PCIB_BCR_MASTER_ABORT_MODE | */ PCIB_BCR_SERR_ENABLE |
>  				    PCIB_BCR_PERR_ENABLE;
>  #ifdef OFW_PCI_DEBUG
>  				device_printf(bridge,

[snip]

> Any suggestions?  Are we missing some code necessary to support master
> mode aborts?

After further research (mainly involving eyeballing
pci_pbm_err_handler() in OpenSolaris), it looks like we are indeed
missing code to handle them.  Therefore, until this code is written, I
suspect the patch above is actually correct.

Gavin From owner-freebsd-sparc64@FreeBSD.ORG Mon Sep 1 16:42:10 2008 Return-Path: Delivered-To: freebsd-sparc64@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id EB1BC106567C; Mon, 1 Sep 2008 16:42:10 +0000 (UTC) (envelope-from gavin@FreeBSD.org) Received: from buffy.york.ac.uk (buffy.york.ac.uk [144.32.226.160]) by mx1.freebsd.org (Postfix) with ESMTP id 594BC8FC14; Mon, 1 Sep 2008 16:42:10 +0000 (UTC) (envelope-from gavin@FreeBSD.org) Received: from buffy.york.ac.uk (localhost [127.0.0.1]) by buffy.york.ac.uk (8.14.2/8.14.2) with ESMTP id m81Gg8CM080623; Mon, 1 Sep 2008 17:42:08 +0100 (BST) (envelope-from gavin@FreeBSD.org) Received: (from ga9@localhost) by buffy.york.ac.uk (8.14.2/8.14.2/Submit) id m81Gg8xS080622; Mon, 1 Sep 2008 17:42:08 +0100 (BST) (envelope-from gavin@FreeBSD.org) X-Authentication-Warning: buffy.york.ac.uk: ga9 set sender to gavin@FreeBSD.org using -f From: Gavin Atkinson To: Marius Strobl In-Reply-To: <20080901161850.GE80839@alchemy.franken.de> References: <1220278827.70590.35.camel@buffy.york.ac.uk> <20080901161850.GE80839@alchemy.franken.de> Content-Type: multipart/mixed; boundary="=-I5ae6o9bB7NBGQbcf4cR" Date: Mon, 01 Sep 2008 17:42:08 +0100 Message-Id: <1220287328.70590.46.camel@buffy.york.ac.uk> Mime-Version: 1.0 X-Mailer: Evolution 2.22.2 FreeBSD GNOME Team Port Cc: gibbs@FreeBSD.org, freebsd-sparc64@FreeBSD.org Subject: Re: HEAD panic with ofw_pcibus.c 1.21 on Blade 100 X-BeenThere: freebsd-sparc64@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Porting FreeBSD to the Sparc List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 01 Sep 2008 16:42:11 -0000 --=-I5ae6o9bB7NBGQbcf4cR Content-Type: text/plain Content-Transfer-Encoding: 7bit On Mon, 2008-09-01 at 18:18 +0200, Marius Strobl wrote: > On Mon, Sep 01, 2008 at 03:20:27PM +0100, Gavin Atkinson wrote: > > Hi all, > > > > My Blade 100 now panics 
on boot with HEAD, and I've tracked it down to
> > sys/sparc64/pci/ofw_pcibus.c 1.21 (SVN r182108) by marius@.
> The most likely reason for this is a buggy driver. In this
> case the culprit appears to be the ISA front-end of ahc(4),
> which assumes that it can do bus space reads and writes at
> addresses that may in fact be assigned to a non-ahc(4)-
> compatible device or none at all. While writing something
> at an address that may not belong to the expected device
> probably is a bad idea in general, reading from and writing
> to unassigned addresses may also trigger exceptions on
> sparc64. I'm unsure how to really fix ahc(4) regarding this,
> I think it should be okay though to only do it on i386 where
> the address range in question probably is reserved for such
> purposes (and which also is the only architecture FreeBSD
> currently runs on where a machine might have an ISA-slot
> and thus can use that front-end at all).
> Justin, do you approve the below patch?
>
> Marius
>
> Index: ahc_isa.c
> ===================================================================
> --- ahc_isa.c (revision 182474)
> +++ ahc_isa.c (working copy)
> @@ -82,6 +82,12 @@ ahc_isa_identify(driver_t *driver, device_t parent
>  	int slot;
>  	int max_slot;
>
> +#if !defined(__i386__)
> +	/*
> +	 * Don't assume we can get away with the blind bus space
> +	 * reads and writes which ahc_isa_find_device() does.
> +	 */
> +#endif
>  	max_slot = 14;
>  	for (slot = 0; slot <= max_slot; slot++) {
>  		struct aic7770_identity *entry;

This patch (with the addition of a "return;" inside the #ifdef which I'm
assuming was forgotten!) gets me booting again with stock ofw_pcibus.c.

Thanks!

Gavin --=-I5ae6o9bB7NBGQbcf4cR Content-Disposition: attachment; filename=aic-noisa.diff Content-Transfer-Encoding: base64 Content-Type: text/x-patch; name=aic-noisa.diff; charset=ASCII SW5kZXg6IHNyYy9zeXMvZGV2L2FpYzd4eHgvYWhjX2lzYS5jDQo9PT09PT09PT09PT09PT09PT09 PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09DQpSQ1MgZmls ZTogL2hvbWUvbmN2cy9zcmMvc3lzL2Rldi9haWM3eHh4L2FoY19pc2EuYyx2DQpyZXRyaWV2aW5n IHJldmlzaW9uIDEuNw0KZGlmZiAtdSAtcjEuNyBhaGNfaXNhLmMNCi0tLSBzcmMvc3lzL2Rldi9h aWM3eHh4L2FoY19pc2EuYwkzIFNlcCAyMDA2IDAwOjI3OjQwIC0wMDAwCTEuNw0KKysrIHNyYy9z eXMvZGV2L2FpYzd4eHgvYWhjX2lzYS5jCTEgU2VwIDIwMDggMTY6Mjc6MTcgLTAwMDANCkBAIC04 Miw2ICs4MiwxMyBAQA0KIAlpbnQgc2xvdDsNCiAJaW50IG1heF9zbG90Ow0KIA0KKyNpZiAhZGVm aW5lZChfX2kzODZfXykNCisJLyoNCisJICogRG9uJ3QgYXNzdW1lIHdlIGNhbiBnZXQgYXdheSB3 aXRoIHRoZSBibGluZCBidXMgc3BhY2UNCisJICogcmVhZHMgYW5kIHdyaXRlcyB3aGljaCBhaGNf aXNhX2ZpbmRfZGV2aWNlKCkgZG9lcy4NCisJICovDQorCXJldHVybjsNCisjZW5kaWYNCiAJbWF4 X3Nsb3QgPSAxNDsNCiAJZm9yIChzbG90ID0gMDsgc2xvdCA8PSBtYXhfc2xvdDsgc2xvdCsrKSB7 DQogCQlzdHJ1Y3QgYWljNzc3MF9pZGVudGl0eSAqZW50cnk7DQo= --=-I5ae6o9bB7NBGQbcf4cR-- From owner-freebsd-sparc64@FreeBSD.ORG Mon Sep 1 16:56:24 2008 Return-Path: Delivered-To: freebsd-sparc64@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 2C8C61065670 for ; Mon, 1 Sep 2008 16:56:24 +0000 (UTC) (envelope-from marius@alchemy.franken.de) Received: from alchemy.franken.de (alchemy.franken.de [194.94.249.214]) by mx1.freebsd.org (Postfix) with ESMTP id C37CC8FC15 for ; Mon, 1 Sep 2008 16:56:23 +0000 (UTC) (envelope-from marius@alchemy.franken.de) Received: from alchemy.franken.de (localhost [127.0.0.1]) by alchemy.franken.de (8.14.3/8.14.3/ALCHEMY.FRANKEN.DE) with ESMTP id m81GIowg085787; Mon, 1 Sep 2008 18:18:50 +0200 (CEST) (envelope-from marius@alchemy.franken.de) Received: (from marius@localhost) by alchemy.franken.de (8.14.3/8.14.3/Submit) id m81GIoEA085786; Mon, 1 Sep 2008 
18:18:50 +0200 (CEST) (envelope-from marius) Date: Mon, 1 Sep 2008 18:18:50 +0200 From: Marius Strobl To: Gavin Atkinson , gibbs@FreeBSD.org Message-ID: <20080901161850.GE80839@alchemy.franken.de> References: <1220278827.70590.35.camel@buffy.york.ac.uk> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1220278827.70590.35.camel@buffy.york.ac.uk> User-Agent: Mutt/1.4.2.3i Cc: freebsd-sparc64@FreeBSD.org Subject: Re: HEAD panic with ofw_pcibus.c 1.21 on Blade 100 X-BeenThere: freebsd-sparc64@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Porting FreeBSD to the Sparc List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 01 Sep 2008 16:56:24 -0000 On Mon, Sep 01, 2008 at 03:20:27PM +0100, Gavin Atkinson wrote: > Hi all, > > My Blade 100 now panics on boot with HEAD, and I've tracked it down to > sys/sparc64/pci/ofw_pcibus.c 1.21 (SVN r182108) by marius@. > Specifically, this version now configures bridges differently, and not > setting "Master Abort Mode" prevents the panic: > > Index: src/sys/sparc64/pci/ofw_pcibus.c > =================================================================== > RCS file: /home/ncvs/src/sys/sparc64/pci/ofw_pcibus.c,v > retrieving revision 1.21 > diff -u -r1.21 ofw_pcibus.c > --- src/sys/sparc64/pci/ofw_pcibus.c 24 Aug 2008 15:05:46 -0000 1.21 > +++ src/sys/sparc64/pci/ofw_pcibus.c 1 Sep 2008 14:09:27 -0000 > @@ -140,7 +140,7 @@ > PCIM_HDRTYPE) == PCIM_HDRTYPE_BRIDGE) { > reg = PCIB_READ_CONFIG(bridge, busno, slot, func, > PCIR_BRIDGECTL_1, 1); > - reg |= PCIB_BCR_MASTER_ABORT_MODE | PCIB_BCR_SERR_ENABLE | > + reg |= /* PCIB_BCR_MASTER_ABORT_MODE | */ PCIB_BCR_SERR_ENABLE | > PCIB_BCR_PERR_ENABLE; > #ifdef OFW_PCI_DEBUG > device_printf(bridge, > > > > My Blade 100 (dmesg and panic backtrace attached) has three extra ATI > graphics cards installed (Official Sun ones, PN 370-4362), it doesn't > panic with these removed. 
Removing them and throwing a generic fxp(4) > card into one of the slots also gives the panic, so I suspect having > anything in at least one of the slots will cause a panic for me. > > I'm pretty sure the panic is not hardware related, as the machine will > happily run Solaris 10. > > Any suggestions? Are we missing some code necessary to support master > mode aborts? I'm happy to test anything necessary. This code was also > MFC'd, so I'm concerned about seeing 7.1 also have this issue. > <...> > machfb0: port 0xb00-0xbff mem 0x3000000-0x3ffffff,0x426000-0x426fff at device 19.0 on pci0 > machfb0: 16 MB aperture at 0xd5906000, 1 KB registers at 0x037ffc00 > machfb0: 8188 KB SDRAM 114.992 MHz, maximum RAMDAC clock 230 MHz, DSP > machfb0: resolution 1152x900 at 8 bpp > pcib1: at device 5.0 on pci0 > pci1: on pcib1 > pcib1: device 1/0/0: latency timer 64 -> 64 > pcib1: device 1/1/0: latency timer 64 -> 64 > pcib1: device 1/2/0: latency timer 64 -> 64 > machfb1: port 0x1000-0x10ff mem 0x4000000-0x4ffffff,0x5000000-0x5000fff at device 0.0 on pci1 > machfb1: 16 MB aperture at 0xd6908000, 1 KB registers at 0x047ffc00 > machfb1: 8188 KB SGRAM 114.992 MHz, maximum RAMDAC clock 230 MHz, DSP > machfb1: resolution 1152x900 at 8 bpp > machfb2: port 0x1100-0x11ff mem 0x6000000-0x6ffffff,0x5002000-0x5002fff at device 1.0 on pci1 > machfb2: 16 MB aperture at 0xd790a000, 1 KB registers at 0x067ffc00 > machfb2: 8188 KB SGRAM 114.992 MHz, maximum RAMDAC clock 230 MHz, DSP > machfb2: resolution 1152x900 at 8 bpp > machfb3: port 0x1200-0x12ff mem 0x7000000-0x7ffffff,0x5004000-0x5004fff at device 2.0 on pci1 > machfb3: 16 MB aperture at 0xd890c000, 1 KB registers at 0x077ffc00 > machfb3: 8188 KB SGRAM 114.992 MHz, maximum RAMDAC clock 230 MHz, DSP > machfb3: resolution 1152x900 at 8 bpp > syscons0: on nexus0 > syscons0: Unknown <16 virtual consoles, flags=0x100> > panic: pcib: PCI bus A error AFAR 0x1fe02001c80 AFSR 0x4000000100000000 > cpuid = 0 > KDB: enter: panic > [thread pid 0 tid 
100000 ]
> Stopped at kdb_enter+0x80: ta %xcc, 1
> db> tr
> Tracing pid 0 tid 100000 td 0xc07d2e70
> panic() at panic+0x208
> psycho_pci_bus() at psycho_pci_bus+0x88
> intr_event_handle() at intr_event_handle+0x5c
> intr_execute_handlers() at intr_execute_handlers+0x14
> intr_fast() at intr_fast+0x68
> -- interrupt level=0xd pil=0 %o7=0xc02ea55c --
> -- data access error %o7=0xc0c1757c --
> ahc_isa_find_device() at ahc_isa_find_device+0x50
> ahc_isa_identify() at ahc_isa_identify+0xd8
> bus_generic_probe() at bus_generic_probe+0x64
> isa_probe_children() at isa_probe_children+0x4
> configure() at configure+0x2c
> mi_startup() at mi_startup+0x18c
> btext() at btext+0x34
> db>
>

The most likely reason for this is a buggy driver. In this
case the culprit appears to be the ISA front-end of ahc(4),
which assumes that it can do bus space reads and writes at
addresses that may in fact be assigned to a
non-ahc(4)-compatible device or none at all. While writing
something at an address that may not belong to the expected
device is probably a bad idea in general, reading from and
writing to unassigned addresses may also trigger exceptions on
sparc64. I'm unsure how to really fix ahc(4) regarding this;
I think it should be okay, though, to only do it on i386, where
the address range in question probably is reserved for such
purposes (and which also is the only architecture FreeBSD
currently runs on where a machine might have an ISA slot
and thus can use that front-end at all).
Justin, do you approve the below patch?

Marius

Index: ahc_isa.c
===================================================================
--- ahc_isa.c (revision 182474)
+++ ahc_isa.c (working copy)
@@ -82,6 +82,12 @@ ahc_isa_identify(driver_t *driver, device_t parent
 	int slot;
 	int max_slot;
 
+#if !defined(__i386__)
+	/*
+	 * Don't assume we can get away with the blind bus space
+	 * reads and writes which ahc_isa_find_device() does.
+ */ +#endif max_slot = 14; for (slot = 0; slot <= max_slot; slot++) { struct aic7770_identity *entry; From owner-freebsd-sparc64@FreeBSD.ORG Mon Sep 1 17:51:04 2008 Return-Path: Delivered-To: freebsd-sparc64@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 87FE51065748 for ; Mon, 1 Sep 2008 17:51:04 +0000 (UTC) (envelope-from nwhitehorn@freebsd.org) Received: from adsum.doit.wisc.edu (adsum.doit.wisc.edu [144.92.197.210]) by mx1.freebsd.org (Postfix) with ESMTP id 587A78FC2A for ; Mon, 1 Sep 2008 17:51:04 +0000 (UTC) (envelope-from nwhitehorn@freebsd.org) MIME-version: 1.0 Content-transfer-encoding: 7BIT Content-type: text/plain; charset=ISO-8859-1; format=flowed Received: from avs-daemon.smtpauth1.wiscmail.wisc.edu by smtpauth1.wiscmail.wisc.edu (Sun Java(tm) System Messaging Server 6.3-6.03 (built Mar 14 2008; 32bit)) id <0K6J00900054YR00@smtpauth1.wiscmail.wisc.edu> for freebsd-sparc64@freebsd.org; Mon, 01 Sep 2008 11:51:04 -0500 (CDT) Received: from trantor.tachypleus.net (ppp-70-226-169-118.dsl.mdsnwi.ameritech.net [70.226.169.118]) by smtpauth1.wiscmail.wisc.edu (Sun Java(tm) System Messaging Server 6.3-6.03 (built Mar 14 2008; 32bit)) with ESMTPSA id <0K6J006JL052O610@smtpauth1.wiscmail.wisc.edu> for freebsd-sparc64@freebsd.org; Mon, 01 Sep 2008 11:51:03 -0500 (CDT) Date: Mon, 01 Sep 2008 11:54:42 -0500 From: Nathan Whitehorn In-reply-to: <1220287328.70590.46.camel@buffy.york.ac.uk> To: freebsd-sparc64@freebsd.org Message-id: <48BC1E52.7060200@freebsd.org> X-Spam-Report: AuthenticatedSender=yes, SenderIP=70.226.169.118 X-Spam-PmxInfo: Server=avs-9, Version=5.4.1.325704, Antispam-Engine: 2.6.0.325393, Antispam-Data: 2008.9.1.163128, SenderIP=70.226.169.118 References: <1220278827.70590.35.camel@buffy.york.ac.uk> <20080901161850.GE80839@alchemy.franken.de> <1220287328.70590.46.camel@buffy.york.ac.uk> User-Agent: Thunderbird 2.0.0.16 (X11/20080814) Subject: Re: HEAD panic with 
ofw_pcibus.c 1.21 on Blade 100 X-BeenThere: freebsd-sparc64@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Porting FreeBSD to the Sparc List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 01 Sep 2008 17:51:04 -0000 Gavin Atkinson wrote: > On Mon, 2008-09-01 at 18:18 +0200, Marius Strobl wrote: >> On Mon, Sep 01, 2008 at 03:20:27PM +0100, Gavin Atkinson wrote: >>> Hi all, >>> >>> My Blade 100 now panics on boot with HEAD, and I've tracked it down to >>> sys/sparc64/pci/ofw_pcibus.c 1.21 (SVN r182108) by marius@. > >> The most likely reason for this is a buggy driver. In this >> case the culprit appears to be the ISA front-end of ahc(4), >> which assumes that it can do bus space reads and writes at >> addresses that may in fact be assigned to a non-ahc(4)- >> compatible device or none at all. While writing something >> at an address that may no belong to the expected device >> probably is a bad idea in generally, reading to and writing >> from unassigned addresses may also trigger exceptions on >> sparc64. I'm unsure how to really fix ahc(4) regarding this, >> I think it should be okay though to only do it on i386 where >> the address range in question probably is reserved for such >> purposes (and which also is the only architecture FreeBSD >> currently runs on where a machine might have an ISA-slot >> and thus can use that front-end at all). >> Justin, do you approve the below patch? Speaking of ahc(4), I have one in my Ultra 5 which will not work unless I have options AHC_ALLOW_MEMIO in my kernel config. I think this option should always be valid for sparc64 systems. Can it be in the default kernel? 
-Nathan From owner-freebsd-sparc64@FreeBSD.ORG Mon Sep 1 19:46:34 2008 Return-Path: Delivered-To: freebsd-sparc64@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 59C6D1065672; Mon, 1 Sep 2008 19:46:34 +0000 (UTC) (envelope-from marius@alchemy.franken.de) Received: from alchemy.franken.de (alchemy.franken.de [194.94.249.214]) by mx1.freebsd.org (Postfix) with ESMTP id D56BA8FC25; Mon, 1 Sep 2008 19:46:33 +0000 (UTC) (envelope-from marius@alchemy.franken.de) Received: from alchemy.franken.de (localhost [127.0.0.1]) by alchemy.franken.de (8.14.3/8.14.3/ALCHEMY.FRANKEN.DE) with ESMTP id m81JkWCJ008988; Mon, 1 Sep 2008 21:46:32 +0200 (CEST) (envelope-from marius@alchemy.franken.de) Received: (from marius@localhost) by alchemy.franken.de (8.14.3/8.14.3/Submit) id m81JkW8Z008987; Mon, 1 Sep 2008 21:46:32 +0200 (CEST) (envelope-from marius) Date: Mon, 1 Sep 2008 21:46:32 +0200 From: Marius Strobl To: Gavin Atkinson Message-ID: <20080901194632.GF80839@alchemy.franken.de> References: <1220278827.70590.35.camel@buffy.york.ac.uk> <1220286044.70590.43.camel@buffy.york.ac.uk> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1220286044.70590.43.camel@buffy.york.ac.uk> User-Agent: Mutt/1.4.2.3i Cc: freebsd-sparc64@FreeBSD.org Subject: Re: HEAD panic with ofw_pcibus.c 1.21 on Blade 100 X-BeenThere: freebsd-sparc64@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Porting FreeBSD to the Sparc List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 01 Sep 2008 19:46:34 -0000 On Mon, Sep 01, 2008 at 05:20:44PM +0100, Gavin Atkinson wrote: > On Mon, 2008-09-01 at 15:20 +0100, Gavin Atkinson wrote: > > Hi all, > > > > My Blade 100 now panics on boot with HEAD, and I've tracked it down to > > sys/sparc64/pci/ofw_pcibus.c 1.21 (SVN r182108) by marius@. 
> > Specifically, this version now configures bridges differently, and not
> > setting "Master Abort Mode" prevents the panic:
> >
> > Index: src/sys/sparc64/pci/ofw_pcibus.c
> > ===================================================================
> > RCS file: /home/ncvs/src/sys/sparc64/pci/ofw_pcibus.c,v
> > retrieving revision 1.21
> > diff -u -r1.21 ofw_pcibus.c
> > --- src/sys/sparc64/pci/ofw_pcibus.c 24 Aug 2008 15:05:46 -0000 1.21
> > +++ src/sys/sparc64/pci/ofw_pcibus.c 1 Sep 2008 14:09:27 -0000
> > @@ -140,7 +140,7 @@
> >  	    PCIM_HDRTYPE) == PCIM_HDRTYPE_BRIDGE) {
> >  		reg = PCIB_READ_CONFIG(bridge, busno, slot, func,
> >  		    PCIR_BRIDGECTL_1, 1);
> > -		reg |= PCIB_BCR_MASTER_ABORT_MODE | PCIB_BCR_SERR_ENABLE |
> > +		reg |= /* PCIB_BCR_MASTER_ABORT_MODE | */ PCIB_BCR_SERR_ENABLE |
> >  		    PCIB_BCR_PERR_ENABLE;
> >  #ifdef OFW_PCI_DEBUG
> >  		device_printf(bridge,
>
> [snip]
>
> > Any suggestions? Are we missing some code necessary to support master
> > mode aborts?
>
> After further research (mainly involving eyeballing
> pci_pbm_err_handler() in OpenSolaris), it looks like we are indeed
> missing code to handle them. Therefore, until this code is written, I
> suspect the patch above is actually correct.
>

While not setting master abort mode on PCI-PCI-bridges might
hide your problem, the right place for ignoring master and
(in this case) target aborts, both of which are fatal in
general though, would be the host-PCI-bridge. Similarly,
support for peeking and poking of I/O and memory space like
OpenSolaris apparently has (the associated recovery handlers
probably are the code you're referring to) should be
implemented PCI-bus wide and not just grounded at
PCI-PCI-bridges. I don't think there's a real need to jump
through the hoops to support these in FreeBSD though.
The blind bus access the ahc(4) ISA front-end does is also
what hangs the B100 during boot even with master abort mode
in the PCI-PCI-bridges I think.
We're looking at several problems here though, and IMO the
first one is that ahc(4) shouldn't try to identify cards on
LPC(-like) busses, and the respective code should also only be
compiled in on architectures where machines actually can have
ISA slots (which currently is only i386 AFAICT).

Marius

From owner-freebsd-sparc64@FreeBSD.ORG Mon Sep 1 19:47:28 2008 Return-Path: Delivered-To: freebsd-sparc64@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 3D06D106564A; Mon, 1 Sep 2008 19:47:28 +0000 (UTC) (envelope-from marius@alchemy.franken.de) Received: from alchemy.franken.de (alchemy.franken.de [194.94.249.214]) by mx1.freebsd.org (Postfix) with ESMTP id B4F5D8FC1E; Mon, 1 Sep 2008 19:47:27 +0000 (UTC) (envelope-from marius@alchemy.franken.de) Received: from alchemy.franken.de (localhost [127.0.0.1]) by alchemy.franken.de (8.14.3/8.14.3/ALCHEMY.FRANKEN.DE) with ESMTP id m81JlQZP009012; Mon, 1 Sep 2008 21:47:26 +0200 (CEST) (envelope-from marius@alchemy.franken.de) Received: (from marius@localhost) by alchemy.franken.de (8.14.3/8.14.3/Submit) id m81JlQ4l009011; Mon, 1 Sep 2008 21:47:26 +0200 (CEST) (envelope-from marius) Date: Mon, 1 Sep 2008 21:47:26 +0200 From: Marius Strobl To: Gavin Atkinson Message-ID: <20080901194726.GG80839@alchemy.franken.de> References: <1220278827.70590.35.camel@buffy.york.ac.uk> <20080901161850.GE80839@alchemy.franken.de> <1220287328.70590.46.camel@buffy.york.ac.uk> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1220287328.70590.46.camel@buffy.york.ac.uk> User-Agent: Mutt/1.4.2.3i Cc: gibbs@FreeBSD.org, freebsd-sparc64@FreeBSD.org Subject: Re: HEAD panic with ofw_pcibus.c 1.21 on Blade 100 X-BeenThere: freebsd-sparc64@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Porting FreeBSD to the Sparc List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon,
01 Sep 2008 19:47:28 -0000 On Mon, Sep 01, 2008 at 05:42:08PM +0100, Gavin Atkinson wrote: > On Mon, 2008-09-01 at 18:18 +0200, Marius Strobl wrote: > > On Mon, Sep 01, 2008 at 03:20:27PM +0100, Gavin Atkinson wrote: > > > Hi all, > > > > > > My Blade 100 now panics on boot with HEAD, and I've tracked it down to > > > sys/sparc64/pci/ofw_pcibus.c 1.21 (SVN r182108) by marius@. > > > The most likely reason for this is a buggy driver. In this > > case the culprit appears to be the ISA front-end of ahc(4), > > which assumes that it can do bus space reads and writes at > > addresses that may in fact be assigned to a non-ahc(4)- > > compatible device or none at all. While writing something > > at an address that may no belong to the expected device > > probably is a bad idea in generally, reading to and writing > > from unassigned addresses may also trigger exceptions on > > sparc64. I'm unsure how to really fix ahc(4) regarding this, > > I think it should be okay though to only do it on i386 where > > the address range in question probably is reserved for such > > purposes (and which also is the only architecture FreeBSD > > currently runs on where a machine might have an ISA-slot > > and thus can use that front-end at all). > > Justin, do you approve the below patch? > > > > Marius > > > > Index: ahc_isa.c > > =================================================================== > > --- ahc_isa.c (revision 182474) > > +++ ahc_isa.c (working copy) > > @@ -82,6 +82,12 @@ ahc_isa_identify(driver_t *driver, device_t parent > > int slot; > > int max_slot; > > > > +#if !defined(__i386__) > > + /* > > + * Don't assume we can get away with the blind bus space > > + * reads and writes which ahc_isa_find_device() does. > > + */ > > +#endif > > max_slot = 14; > > for (slot = 0; slot <= max_slot; slot++) { > > struct aic7770_identity *entry; > > This patch (with the addition of a "return;" inside the #ifdef which I'm > assuming was forgotten!) 
gets me booting again with stock ofw_pcibus.c. Oops, the "return;" was missing of course. Marius From owner-freebsd-sparc64@FreeBSD.ORG Mon Sep 1 21:38:06 2008 Return-Path: Delivered-To: freebsd-sparc64@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 90B491065671; Mon, 1 Sep 2008 21:38:06 +0000 (UTC) (envelope-from gibbs@scsiguy.com) Received: from aslan.scsiguy.com (www.scsiguy.com [70.89.174.89]) by mx1.freebsd.org (Postfix) with ESMTP id 268CD8FC16; Mon, 1 Sep 2008 21:38:06 +0000 (UTC) (envelope-from gibbs@scsiguy.com) Received: from [192.168.0.6] (tumnus.scsiguy.org [192.168.0.6]) (authenticated bits=0) by aslan.scsiguy.com (8.14.2/8.14.2) with ESMTP id m81LDZBe003506 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Mon, 1 Sep 2008 15:13:38 -0600 (MDT) (envelope-from gibbs@scsiguy.com) Message-ID: <48BC5AF8.50600@scsiguy.com> Date: Mon, 01 Sep 2008 15:13:28 -0600 From: "Justin T. Gibbs" User-Agent: Thunderbird 2.0.0.16 (Windows/20080708) MIME-Version: 1.0 To: Marius Strobl References: <1220278827.70590.35.camel@buffy.york.ac.uk> <20080901161850.GE80839@alchemy.franken.de> <1220287328.70590.46.camel@buffy.york.ac.uk> <20080901194726.GG80839@alchemy.franken.de> In-Reply-To: <20080901194726.GG80839@alchemy.franken.de> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: gibbs@FreeBSD.org, Gavin Atkinson , freebsd-sparc64@FreeBSD.org Subject: Re: HEAD panic with ofw_pcibus.c 1.21 on Blade 100 X-BeenThere: freebsd-sparc64@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Porting FreeBSD to the Sparc List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 01 Sep 2008 21:38:06 -0000 The driver isn't buggy. This particular hardware can only be identified via an invasive probe. Just returning early is a hack. 
How does Sparc64 exclude other, non-PNP devices from its probe sequence? They all have #ifdefs in them for Sparc64 and every other platform that gives a bus fault for touching a location that is unmapped? There's no generic method for trapping bus faults during invasive probes so that panics are avoided? There's no generic method for flagging probes as invasive so that they simply are never called (or compiled in) on platforms that cannot tolerate them? If you absolutely have to remove the probe just for sparc, it would be better to figure out how to just avoid compiling in that probe (config spec change "optional isa_nonpnp", or similar?). -- Justin Marius Strobl wrote: > On Mon, Sep 01, 2008 at 05:42:08PM +0100, Gavin Atkinson wrote: >> On Mon, 2008-09-01 at 18:18 +0200, Marius Strobl wrote: >>> On Mon, Sep 01, 2008 at 03:20:27PM +0100, Gavin Atkinson wrote: >>>> Hi all, >>>> >>>> My Blade 100 now panics on boot with HEAD, and I've tracked it down to >>>> sys/sparc64/pci/ofw_pcibus.c 1.21 (SVN r182108) by marius@. >>> The most likely reason for this is a buggy driver. In this >>> case the culprit appears to be the ISA front-end of ahc(4), >>> which assumes that it can do bus space reads and writes at >>> addresses that may in fact be assigned to a non-ahc(4)- >>> compatible device or none at all. While writing something >>> at an address that may no belong to the expected device >>> probably is a bad idea in generally, reading to and writing >>> from unassigned addresses may also trigger exceptions on >>> sparc64. I'm unsure how to really fix ahc(4) regarding this, >>> I think it should be okay though to only do it on i386 where >>> the address range in question probably is reserved for such >>> purposes (and which also is the only architecture FreeBSD >>> currently runs on where a machine might have an ISA-slot >>> and thus can use that front-end at all). >>> Justin, do you approve the below patch? 
>>> >>> Marius >>> >>> Index: ahc_isa.c >>> =================================================================== >>> --- ahc_isa.c (revision 182474) >>> +++ ahc_isa.c (working copy) >>> @@ -82,6 +82,12 @@ ahc_isa_identify(driver_t *driver, device_t parent >>> int slot; >>> int max_slot; >>> >>> +#if !defined(__i386__) >>> + /* >>> + * Don't assume we can get away with the blind bus space >>> + * reads and writes which ahc_isa_find_device() does. >>> + */ >>> +#endif >>> max_slot = 14; >>> for (slot = 0; slot <= max_slot; slot++) { >>> struct aic7770_identity *entry; >> This patch (with the addition of a "return;" inside the #ifdef which I'm >> assuming was forgotten!) gets me booting again with stock ofw_pcibus.c. > > Oops, the "return;" was missing of course. > > Marius > > From owner-freebsd-sparc64@FreeBSD.ORG Mon Sep 1 23:16:07 2008 Return-Path: Delivered-To: freebsd-sparc64@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 14EE21065680; Mon, 1 Sep 2008 23:16:07 +0000 (UTC) (envelope-from marius@alchemy.franken.de) Received: from alchemy.franken.de (alchemy.franken.de [194.94.249.214]) by mx1.freebsd.org (Postfix) with ESMTP id 78BA68FC0C; Mon, 1 Sep 2008 23:16:06 +0000 (UTC) (envelope-from marius@alchemy.franken.de) Received: from alchemy.franken.de (localhost [127.0.0.1]) by alchemy.franken.de (8.14.3/8.14.3/ALCHEMY.FRANKEN.DE) with ESMTP id m81NG5X6019875; Tue, 2 Sep 2008 01:16:05 +0200 (CEST) (envelope-from marius@alchemy.franken.de) Received: (from marius@localhost) by alchemy.franken.de (8.14.3/8.14.3/Submit) id m81NG58n019874; Tue, 2 Sep 2008 01:16:05 +0200 (CEST) (envelope-from marius) Date: Tue, 2 Sep 2008 01:16:04 +0200 From: Marius Strobl To: "Justin T. 
Gibbs" Message-ID: <20080901231604.GH80839@alchemy.franken.de> References: <1220278827.70590.35.camel@buffy.york.ac.uk> <20080901161850.GE80839@alchemy.franken.de> <1220287328.70590.46.camel@buffy.york.ac.uk> <20080901194726.GG80839@alchemy.franken.de> <48BC5AF8.50600@scsiguy.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <48BC5AF8.50600@scsiguy.com> User-Agent: Mutt/1.4.2.3i Cc: gibbs@FreeBSD.org, Gavin Atkinson , freebsd-sparc64@FreeBSD.org Subject: Re: HEAD panic with ofw_pcibus.c 1.21 on Blade 100 X-BeenThere: freebsd-sparc64@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Porting FreeBSD to the Sparc List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 01 Sep 2008 23:16:07 -0000

On Mon, Sep 01, 2008 at 03:13:28PM -0600, Justin T. Gibbs wrote:
> The driver isn't buggy. This particular hardware can only be identified
> via an invasive probe.

I meant misbehaving from the sparc64 point of view, not
buggy in general.

>
> Just returning early is a hack. How does Sparc64 exclude other, non-PNP
> devices from its probe sequence?

It doesn't, and invasive probes involving I/O or memory
space accesses aren't supported. There are no ISA slots
in sparc64 machines, and in general one can only regard
the devices in the device tree provided by the firmware
(which one can consider a PNP mechanism) as existent
and functional, so supporting invasive probes or non-PNP
devices doesn't make much sense from a hardware point
of view.

> They all have #ifdefs in them for
> Sparc64 and every other platform that gives a bus fault for touching
> a location that is unmapped?

The other ISA drivers doing invasive probes aren't relevant
for sparc64 as they simply can't show up in a sparc64
machine. Some drivers with multiple bus front-ends also
aren't in GENERIC as their core e.g.
doesn't use bus_dma(9) or isn't endian-clean and therefore
doesn't work on sparc64 so far anyway. It's just ahc(4)
which is in GENERIC, as the PCI variant works, but it brings
in an invasive probe.

> There's no generic method for trapping
> bus faults during invasive probes so that panics are avoided?

There's a procedure for configuration space accesses, but
for I/O and memory space one can really just ignore bus
faults if there's a way to tell the host-to-foo driver
that they are expected, e.g. due to invasive probing.

> There's
> no generic method for flagging probes as invasive so that they
> simply are never called (or compiled in) on platforms that cannot
> tolerate them?

Not as far as I can tell.

>
> If you absolutely have to remove the probe just for sparc, it would
> be better to figure out how to just avoid compiling in that probe
> (config spec change "optional isa_nonpnp", or similar?).

What I think would be the right thing to do in this regard
is splitting the ISA drivers and bus front-ends into bus
front-ends for LPC or LPC-like busses (i.e. on-board
PNP-only/firmware-enumerated) and real ISA busses (non-PNP,
cards in real slots). Though as far as I know there's more
to LPC in terms of ACPI-probing, which I currently don't
understand, and I admit that I'm reluctant to do that
much work just to keep a single bus front-end from probing...
Marius From owner-freebsd-sparc64@FreeBSD.ORG Tue Sep 2 21:00:07 2008 Return-Path: Delivered-To: freebsd-sparc64@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id BA1A1106568B for ; Tue, 2 Sep 2008 21:00:07 +0000 (UTC) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id C7EF08FC27 for ; Tue, 2 Sep 2008 21:00:06 +0000 (UTC) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (gnats@localhost [127.0.0.1]) by freefall.freebsd.org (8.14.2/8.14.2) with ESMTP id m82L06Kj089111 for ; Tue, 2 Sep 2008 21:00:06 GMT (envelope-from gnats@freefall.freebsd.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.2/8.14.1/Submit) id m82L06HT089110; Tue, 2 Sep 2008 21:00:06 GMT (envelope-from gnats) Resent-Date: Tue, 2 Sep 2008 21:00:06 GMT Resent-Message-Id: <200809022100.m82L06HT089110@freefall.freebsd.org> Resent-From: FreeBSD-gnats-submit@FreeBSD.org (GNATS Filer) Resent-To: freebsd-sparc64@FreeBSD.org Resent-Reply-To: FreeBSD-gnats-submit@FreeBSD.org, Paulo Afonso Graner Fessel Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 7A6581065671 for ; Tue, 2 Sep 2008 20:58:57 +0000 (UTC) (envelope-from nobody@FreeBSD.org) Received: from www.freebsd.org (www.freebsd.org [IPv6:2001:4f8:fff6::21]) by mx1.freebsd.org (Postfix) with ESMTP id 67C6A8FC26 for ; Tue, 2 Sep 2008 20:58:57 +0000 (UTC) (envelope-from nobody@FreeBSD.org) Received: from www.freebsd.org (localhost [127.0.0.1]) by www.freebsd.org (8.14.2/8.14.2) with ESMTP id m82KwuL4012460 for ; Tue, 2 Sep 2008 20:58:56 GMT (envelope-from nobody@www.freebsd.org) Received: (from nobody@localhost) by www.freebsd.org (8.14.2/8.14.1/Submit) id m82KwuWm012459; Tue, 2 Sep 2008 20:58:56 GMT (envelope-from nobody) Message-Id: 
<200809022058.m82KwuWm012459@www.freebsd.org> Date: Tue, 2 Sep 2008 20:58:56 GMT From: Paulo Afonso Graner Fessel To: freebsd-gnats-submit@FreeBSD.org X-Send-Pr-Version: www-3.1 Cc: Subject: sparc64/127051: hme interfaces "pause" with the message "device timeout" on FreeBSD 7.0/sparc64 on an Enterprise 220R X-BeenThere: freebsd-sparc64@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Porting FreeBSD to the Sparc List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 02 Sep 2008 21:00:07 -0000

>Number: 127051
>Category: sparc64
>Synopsis: hme interfaces "pause" with the message "device timeout" on FreeBSD 7.0/sparc64 on an Enterprise 220R
>Confidential: no
>Severity: serious
>Priority: high
>Responsible: freebsd-sparc64
>State: open
>Quarter:
>Keywords:
>Date-Required:
>Class: sw-bug
>Submitter-Id: current-users
>Arrival-Date: Tue Sep 02 21:00:05 UTC 2008
>Closed-Date:
>Last-Modified:
>Originator: Paulo Afonso Graner Fessel
>Release: 7.0
>Organization: Virtus TI
>Environment:
FreeBSD vtsfprfw01.virtus-ti.com.br 7.0-RELEASE FreeBSD 7.0-RELEASE #7: Fri Jun 20 19:29:52 BRT 2008 root@vtsuprfw01.dedalusprime.com.br:/usr/obj/usr/src/sys/VIRTUSFW sparc64
>Description:
We have a pair of UltraSparc II servers configured as an HA firewall with carp and pfsync. I noticed that even with an advskew of zero on the primary firewall (vtsfprfw01) the carp interfaces end up migrating to the backup firewall, which has an advskew of 200.
Here's ifconfig for the primary firewall (master):

hme0: flags=8843 metric 0 mtu 1500
	options=b
	ether 08:00:20:d0:c3:dd
	inet 192.168.0.1 netmask 0xffffff00 broadcast 192.168.0.255
	media: Ethernet 100baseTX
	status: active
hme1: flags=8b43 metric 0 mtu 1500
	options=b
	ether 08:00:20:bc:a6:b4
	inet 200.215.183.101 netmask 0xfffffff0 broadcast 200.215.183.111
	media: Ethernet 100baseTX
	status: active
hme2: flags=8b43 metric 0 mtu 1500
	options=b
	ether 08:00:20:bc:a6:b5
	inet 200.143.2.2 netmask 0xffffff00 broadcast 200.143.2.255
	media: Ethernet 100baseTX
	status: active
hme3: flags=8802 metric 0 mtu 1500
	options=b
	ether 08:00:20:bc:a6:b6
	media: Ethernet autoselect
hme4: flags=8802 metric 0 mtu 1500
	options=b
	ether 08:00:20:bc:a6:b7
	media: Ethernet autoselect
pflog0: flags=141 metric 0 mtu 33160
lo0: flags=8049 metric 0 mtu 16384
	inet 127.0.0.1 netmask 0xff000000
pfsync0: flags=41 metric 0 mtu 1460
	pfsync: syncdev: hme0 syncpeer: 192.168.0.2 maxupd: 128
carp0: flags=49 metric 0 mtu 1500
	inet 200.215.183.100 netmask 0xfffffff0
	carp: BACKUP vhid 1 advbase 1 advskew 0
carp1: flags=49 metric 0 mtu 1500
	inet 200.143.2.1 netmask 0xffffff00
	carp: BACKUP vhid 2 advbase 1 advskew 0

And the same, for the second firewall (backup):

vtsfprfw02# ifconfig -a
hme0: flags=8843 metric 0 mtu 1500
	options=b
	ether 08:00:20:e7:39:31
	inet 192.168.0.2 netmask 0xffffff00 broadcast 192.168.0.255
	media: Ethernet 100baseTX
	status: active
hme1: flags=8b43 metric 0 mtu 1500
	options=b
	ether 08:00:20:bc:a3:a0
	inet 200.215.183.102 netmask 0xfffffff0 broadcast 200.215.183.111
	media: Ethernet 100baseTX
	status: active
hme2: flags=8b43 metric 0 mtu 1500
	options=b
	ether 08:00:20:bc:a3:a1
	inet 200.143.2.3 netmask 0xffffff00 broadcast 200.143.2.255
	media: Ethernet 100baseTX
	status: active
hme3: flags=8802 metric 0 mtu 1500
	options=b
	ether 08:00:20:bc:a3:a2
	media: Ethernet autoselect
hme4: flags=8802 metric 0 mtu 1500
	options=b
	ether 08:00:20:bc:a3:a3
	media: Ethernet autoselect
pflog0: flags=141 metric 0 mtu 33160
lo0: flags=8049 metric 0 mtu 16384
	inet 127.0.0.1 netmask 0xff000000
pfsync0: flags=41 metric 0 mtu 1460
	pfsync: syncdev: hme0 syncpeer: 192.168.0.1 maxupd: 128
carp0: flags=49 metric 0 mtu 1500
	inet 200.215.183.100 netmask 0xfffffff0
	carp: MASTER vhid 1 advbase 1 advskew 200
carp1: flags=49 metric 0 mtu 1500
	inet 200.143.2.1 netmask 0xffffff00
	carp: MASTER vhid 2 advbase 1 advskew 200

After noticing this, I also saw that "local-mac-address?" on the first firewall was set to "false", which caused all the interface ports to show the same MAC address. I fixed this and rebooted the server to see whether it had something to do with the issue. Everything was all right for approximately 30 minutes, after which the firewall role moved to the secondary machine. Here's an excerpt from /var/log/messages from the primary firewall:

Sep 2 10:18:51 vtsfprfw01 ftp-proxy[929]: listening on 127.0.0.1 port 8021
Sep 2 10:18:51 vtsfprfw01 getty[943]: open /dev/ttyv2: No such file or directory
Sep 2 10:18:51 vtsfprfw01 getty[945]: open /dev/ttyv4: No such file or directory
Sep 2 10:18:51 vtsfprfw01 getty[941]: open /dev/ttyv0: No such file or directory
Sep 2 10:18:51 vtsfprfw01 getty[948]: open /dev/ttyv7: No such file or directory
Sep 2 10:18:51 vtsfprfw01 getty[946]: open /dev/ttyv5: No such file or directory
Sep 2 10:18:51 vtsfprfw01 getty[944]: open /dev/ttyv3: No such file or directory
Sep 2 10:18:51 vtsfprfw01 getty[942]: open /dev/ttyv1: No such file or directory
Sep 2 10:18:51 vtsfprfw01 getty[947]: open /dev/ttyv6: No such file or directory
Sep 2 10:18:58 vtsfprfw01 login: ROOT LOGIN (root) ON ttyu0
Sep 2 10:44:43 vtsfprfw01 kernel: carp0: MASTER -> BACKUP (more frequent advertisement received)
Sep 2 10:44:43 vtsfprfw01 kernel: hme2: device timeout
Sep 2 10:44:48 vtsfprfw01 kernel: carp0: BACKUP -> MASTER (preempting a slower master)
Sep 2 10:44:48 vtsfprfw01 kernel: arp_rtrequest: bad gateway 200.215.183.100 (!AF_LINK)
Sep 2 11:11:48 vtsfprfw01 kernel: hme2: device timeout
Sep 2 11:35:19 vtsfprfw01 kernel: carp0: MASTER -> BACKUP (more frequent advertisement received)
Sep 2 11:35:19 vtsfprfw01 kernel: hme2: device timeout
Sep 2 11:35:24 vtsfprfw01 kernel: carp0: BACKUP -> MASTER (preempting a slower master)
Sep 2 11:35:24 vtsfprfw01 kernel: arp_rtrequest: bad gateway 200.215.183.100 (!AF_LINK)
Sep 2 13:37:16 vtsfprfw01 kernel: carp1: MASTER -> BACKUP (more frequent advertisement received)
Sep 2 13:37:16 vtsfprfw01 kernel: hme1: device timeout
Sep 2 13:37:17 vtsfprfw01 kernel: carp0: MASTER -> BACKUP (more frequent advertisement received)
Sep 2 16:11:06 vtsfprfw01 kernel: arp_rtrequest: bad gateway 200.143.2.1 (!AF_LINK)
Sep 2 16:11:14 vtsfprfw01 kernel: carp1: MASTER -> BACKUP (more frequent advertisement received)

As can be seen from the logs, there are a number of messages like "hmeX: device timeout". When this happens, carp0 and carp1 see this as a disconnection and are forced to the secondary firewall. The most interesting part is that I don't lose communication with either machine: I'm able to ping the first firewall normally after the event, and I don't get messages either in the OS or on the switch pointing to link loss.
The secondary server also shows the same problem (hmeX device timeout):

Sep 1 15:58:00 vtsfprfw02 kernel: hme0: device timeout
Sep 1 16:35:00 vtsfprfw02 kernel: hme0: device timeout
Sep 1 16:35:40 vtsfprfw02 last message repeated 2 times
Sep 1 16:45:33 vtsfprfw02 kernel: carp1: MASTER -> BACKUP (more frequent advertisement received)
Sep 1 16:45:34 vtsfprfw02 kernel: hme1: device timeout
Sep 1 16:45:35 vtsfprfw02 kernel: carp0: MASTER -> BACKUP (more frequent advertisement received)
Sep 1 16:45:47 vtsfprfw02 kernel: arp_rtrequest: bad gateway 200.215.183.100 (!AF_LINK)
Sep 1 16:45:47 vtsfprfw02 kernel: carp1: BACKUP -> MASTER (preempting a slower master)
Sep 1 16:45:47 vtsfprfw02 kernel: arp_rtrequest: bad gateway 200.143.2.1 (!AF_LINK)
Sep 1 16:45:48 vtsfprfw02 kernel: carp0: MASTER -> BACKUP (more frequent advertisement received)
Sep 1 16:45:48 vtsfprfw02 kernel: carp0: BACKUP -> MASTER (preempting a slower master)
Sep 1 16:45:48 vtsfprfw02 kernel: arp_rtrequest: bad gateway 200.215.183.100 (!AF_LINK)
Sep 1 17:12:16 vtsfprfw02 kernel: carp1: MASTER -> BACKUP (more frequent advertisement received)
Sep 1 17:12:16 vtsfprfw02 kernel: hme1: device timeout
Sep 1 17:12:17 vtsfprfw02 kernel: carp0: MASTER -> BACKUP (more frequent advertisement received)
Sep 1 17:44:33 vtsfprfw02 kernel: arp: 200.215.183.110 is on hme1 but got reply from 00:14:d1:38:92:ba on hme2
Sep 1 17:45:05 vtsfprfw02 last message repeated 21 times
Sep 1 17:45:27 vtsfprfw02 last message repeated 15 times
Sep 1 18:55:48 vtsfprfw02 kernel: arp_rtrequest: bad gateway 200.215.183.100 (!AF_LINK)
Sep 1 18:55:48 vtsfprfw02 kernel: carp1: BACKUP -> MASTER (preempting a slower master)
Sep 1 18:55:48 vtsfprfw02 kernel: arp_rtrequest: bad gateway 200.143.2.1 (!AF_LINK)
Sep 1 18:55:48 vtsfprfw02 kernel: carp0: MASTER -> BACKUP (more frequent advertisement received)
Sep 1 18:55:49 vtsfprfw02 kernel: carp0: BACKUP -> MASTER (preempting a slower master)
Sep 1 18:55:49 vtsfprfw02 kernel: arp_rtrequest: bad gateway 200.215.183.100 (!AF_LINK)
Sep 1 19:24:43 vtsfprfw02 sshd[34638]: error: PAM: authentication error for pfessel from 201.20.234.104
Sep 1 19:24:46 vtsfprfw02 sshd[34641]: error: ssh_msg_send: write
Sep 2 10:19:31 vtsfprfw02 kernel: carp0: MASTER -> BACKUP (more frequent advertisement received)
Sep 2 10:19:50 vtsfprfw02 kernel: arp: 200.143.2.2 moved from 08:00:20:d0:c3:dd to 08:00:20:bc:a6:b5 on hme2
Sep 2 10:19:51 vtsfprfw02 kernel: carp1: MASTER -> BACKUP (more frequent advertisement received)
Sep 2 10:45:15 vtsfprfw02 kernel: arp_rtrequest: bad gateway 200.143.2.1 (!AF_LINK)
Sep 2 10:45:15 vtsfprfw02 kernel: carp0: BACKUP -> MASTER (preempting a slower master)
Sep 2 10:45:15 vtsfprfw02 kernel: arp_rtrequest: bad gateway 200.215.183.100 (!AF_LINK)
Sep 2 10:45:20 vtsfprfw02 kernel: carp0: MASTER -> BACKUP (more frequent advertisement received)
Sep 2 10:45:26 vtsfprfw02 kernel: carp1: MASTER -> BACKUP (more frequent advertisement received)
Sep 2 11:12:19 vtsfprfw02 kernel: arp_rtrequest: bad gateway 200.143.2.1 (!AF_LINK)
Sep 2 11:12:32 vtsfprfw02 kernel: carp1: MASTER -> BACKUP (more frequent advertisement received)
Sep 2 11:35:51 vtsfprfw02 kernel: arp_rtrequest: bad gateway 200.143.2.1 (!AF_LINK)
Sep 2 11:35:51 vtsfprfw02 kernel: carp0: BACKUP -> MASTER (preempting a slower master)
Sep 2 11:35:51 vtsfprfw02 kernel: arp_rtrequest: bad gateway 200.215.183.100 (!AF_LINK)
Sep 2 11:35:56 vtsfprfw02 kernel: carp0: MASTER -> BACKUP (more frequent advertisement received)
Sep 2 11:36:06 vtsfprfw02 kernel: carp1: MASTER -> BACKUP (more frequent advertisement received)
Sep 2 13:37:47 vtsfprfw02 kernel: arp_rtrequest: bad gateway 200.215.183.100 (!AF_LINK)
Sep 2 13:37:48 vtsfprfw02 kernel: carp1: BACKUP -> MASTER (preempting a slower master)
Sep 2 13:37:48 vtsfprfw02 kernel: arp_rtrequest: bad gateway 200.143.2.1 (!AF_LINK)
Sep 2 13:37:48 vtsfprfw02 kernel: carp0: MASTER -> BACKUP (more frequent advertisement received)
Sep 2 13:37:49 vtsfprfw02 kernel: carp0: BACKUP -> MASTER (preempting a slower master)
Sep 2 13:37:49 vtsfprfw02 kernel: arp_rtrequest: bad gateway 200.215.183.100 (!AF_LINK)
Sep 2 14:01:06 vtsfprfw02 sshd[37221]: error: PAM: authentication error for root from 200.143.2.125
Sep 2 14:01:08 vtsfprfw02 sshd[37221]: error: PAM: authentication error for root from 200.143.2.125
Sep 2 16:07:37 vtsfprfw02 sshd[37559]: error: PAM: authentication error for root from 200.143.2.125
Sep 2 16:08:17 vtsfprfw02 last message repeated 3 times
Sep 2 16:08:35 vtsfprfw02 sshd[37559]: error: PAM: authentication error for root from 200.143.2.125
Sep 2 16:08:38 vtsfprfw02 sshd[37566]: error: ssh_msg_send: write
Sep 2 16:08:50 vtsfprfw02 sshd[37567]: error: PAM: authentication error for root from 200.143.2.125
Sep 2 16:09:00 vtsfprfw02 last message repeated 5 times
Sep 2 16:11:40 vtsfprfw02 kernel: hme2: device timeout

This log is even more interesting. It shows the first transition of the carp interfaces from MASTER to BACKUP, from when I rebooted the master firewall to fix the eeprom issue; but there are also other transitions from BACKUP to MASTER to BACKUP to MASTER again, which were not logged on the master firewall!

Now, some background information about the network topology.

* carp0 is connected to hme1 on both servers (master and backup), and points to our default internet gateway;
* carp1 is connected to hme2 on both servers (master and backup), and points to our data center's front-end network (200.143.2.0/24);
* pfsync0 is associated with hme0 on both servers. We don't use a back-to-back connection via crossover cable; instead we use a dedicated VLAN on one of our switches. Because of this I've chosen to specify syncpeers on both servers, with the master firewall pointing to the backup firewall and vice versa, always using hme0 as the physical device for pfsync0. I've checked that the VHIDs of other firewalls which use this VLAN don't coincide with the ones we use in this particular configuration.
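The VHID check mentioned in the last item can also be done from saved `ifconfig` output rather than by eye. The following is a rough sketch in Python under stated assumptions: the helper names `vhids` and `vhid_conflicts` are invented for illustration, and the hosts `otherfw01` and `otherfw02` with their VHIDs are made-up examples, not machines from this report. Only the `carp:` lines for carp0 and carp1 come from the output quoted earlier.

```python
import re

# Matches the "carp:" status line printed by ifconfig for a carp interface.
VHID_RE = re.compile(r"carp:\s+(?:MASTER|BACKUP|INIT)\s+vhid\s+(\d+)")


def vhids(ifconfig_text):
    """Return the set of CARP VHIDs found in a block of ifconfig output."""
    return {int(v) for v in VHID_RE.findall(ifconfig_text)}


def vhid_conflicts(ours, others):
    """Map each other host to the VHIDs it shares with our firewall pair."""
    mine = vhids(ours)
    conflicts = {}
    for host, text in others.items():
        shared = sorted(mine & vhids(text))
        if shared:
            conflicts[host] = shared
    return conflicts


# carp lines as reported for the master firewall in this PR.
OURS = """\
carp0: flags=49 metric 0 mtu 1500
        inet 200.215.183.100 netmask 0xfffffff0
        carp: MASTER vhid 1 advbase 1 advskew 200
carp1: flags=49 metric 0 mtu 1500
        inet 200.143.2.1 netmask 0xffffff00
        carp: MASTER vhid 2 advbase 1 advskew 200
"""

# Hypothetical neighbours on the same VLAN; the second one collides on purpose.
OTHERS = {
    "otherfw01": "carp0: flags=49 metric 0 mtu 1500\n        carp: MASTER vhid 10 advbase 1 advskew 0\n",
    "otherfw02": "carp0: flags=49 metric 0 mtu 1500\n        carp: BACKUP vhid 2 advbase 1 advskew 100\n",
}

print(vhid_conflicts(OURS, OTHERS))
```

The master and backup of one pair share their VHIDs by design, so only the output of firewalls belonging to other redundancy groups on the same broadcast domain should be passed in as `others`.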
Finally, the hardware configuration follows:

* Master firewall: E220R with 2 USII 450 MHz processors, 2 SUN18G SCSI hard disks, 2 GB RAM, one Sun QFE PCI interface (hme1-hme4 ports) plus the onboard HME interface (hme0).
* Backup firewall: E420R with 2 USII 450 MHz processors, 2 SUN18G SCSI hard disks, 2 GB RAM, one Sun QFE PCI interface (hme1-hme4 ports) plus the onboard HME interface (hme0).

As you can see, both machines are nearly identical. My feeling is that this behaviour is specific to the sparc64 architecture, as I have two other Sun QFE PCI boards running on Intel architecture (Dell PowerEdge 1650 and 1750 respectively) with no such issues whatsoever. These i386 servers already have an uptime of 11 days and there is no "hmeX: device timeout" on either server.

>How-To-Repeat:
Just configure the networking on the servers as shown; then it is just a question of time until the problem appears. It can take days, hours, or minutes, but it will show up after some time.
>Fix: >Release-Note: >Audit-Trail: >Unformatted: From owner-freebsd-sparc64@FreeBSD.ORG Tue Sep 2 21:10:46 2008 Return-Path: Delivered-To: freebsd-sparc64@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 2445D1065679; Tue, 2 Sep 2008 21:10:46 +0000 (UTC) (envelope-from marius@alchemy.franken.de) Received: from alchemy.franken.de (alchemy.franken.de [194.94.249.214]) by mx1.freebsd.org (Postfix) with ESMTP id A2B838FC13; Tue, 2 Sep 2008 21:10:45 +0000 (UTC) (envelope-from marius@alchemy.franken.de) Received: from alchemy.franken.de (localhost [127.0.0.1]) by alchemy.franken.de (8.14.3/8.14.3/ALCHEMY.FRANKEN.DE) with ESMTP id m82LAivV021945; Tue, 2 Sep 2008 23:10:44 +0200 (CEST) (envelope-from marius@alchemy.franken.de) Received: (from marius@localhost) by alchemy.franken.de (8.14.3/8.14.3/Submit) id m82LAhkq021944; Tue, 2 Sep 2008 23:10:43 +0200 (CEST) (envelope-from marius) Date: Tue, 2 Sep 2008 23:10:43 +0200 From: Marius Strobl To: "Justin T. 
Gibbs" Message-ID: <20080902211043.GA21904@alchemy.franken.de> References: <1220278827.70590.35.camel@buffy.york.ac.uk> <20080901161850.GE80839@alchemy.franken.de> <1220287328.70590.46.camel@buffy.york.ac.uk> <20080901194726.GG80839@alchemy.franken.de> <48BC5AF8.50600@scsiguy.com> <20080901231604.GH80839@alchemy.franken.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20080901231604.GH80839@alchemy.franken.de> User-Agent: Mutt/1.4.2.3i Cc: gibbs@FreeBSD.org, Gavin Atkinson , freebsd-sparc64@FreeBSD.org Subject: Re: HEAD panic with ofw_pcibus.c 1.21 on Blade 100 X-BeenThere: freebsd-sparc64@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Porting FreeBSD to the Sparc List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 02 Sep 2008 21:10:46 -0000 On Tue, Sep 02, 2008 at 01:16:04AM +0200, Marius Strobl wrote: > On Mon, Sep 01, 2008 at 03:13:28PM -0600, Justin T. Gibbs wrote: > > > > If you absolutely have to remove the probe just for sparc, it would > > be better to figure out how to just avoid compiling in that probe > > (config spec change "optional isa_nonpnp", or similar?). > > What I think would be the right thing to do in this regard > is splitting the ISA drivers and bus front-ends into bus > front-ends for LPC or LPC-like busses (i.e. on-board PNP- > only/firmware enumerated) and real ISA busses (non-PNP, > cards in real slots). Though as far as I know there's more > to LPC in terms of ACPI-probing which I currently don't > understand and I admit that I'm reluctant to doing that > much work just to keep a single bus front-end from probing... > Thinking some more about it I decided to work around the lack of distinction between LPC and real ISA at a different level. 
Marius From owner-freebsd-sparc64@FreeBSD.ORG Tue Sep 2 22:09:54 2008 Return-Path: Delivered-To: freebsd-sparc64@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 7BDFF106564A; Tue, 2 Sep 2008 22:09:54 +0000 (UTC) (envelope-from obrien@NUXI.org) Received: from dragon.nuxi.org (trang.nuxi.org [74.95.12.85]) by mx1.freebsd.org (Postfix) with ESMTP id 4776D8FC0A; Tue, 2 Sep 2008 22:09:54 +0000 (UTC) (envelope-from obrien@NUXI.org) Received: from dragon.nuxi.org (obrien@localhost [127.0.0.1]) by dragon.nuxi.org (8.14.2/8.14.2) with ESMTP id m82LkYdt086505; Tue, 2 Sep 2008 14:46:35 -0700 (PDT) (envelope-from obrien@dragon.nuxi.org) Received: (from obrien@localhost) by dragon.nuxi.org (8.14.2/8.14.2/Submit) id m82LkYa5086504; Tue, 2 Sep 2008 14:46:34 -0700 (PDT) (envelope-from obrien) Date: Tue, 2 Sep 2008 14:46:34 -0700 From: "David O'Brien" To: Nathan Whitehorn Message-ID: <20080902214634.GA86270@dragon.NUXI.org> References: <1220278827.70590.35.camel@buffy.york.ac.uk> <20080901161850.GE80839@alchemy.franken.de> <1220287328.70590.46.camel@buffy.york.ac.uk> <48BC1E52.7060200@freebsd.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <48BC1E52.7060200@freebsd.org> X-Operating-System: FreeBSD 8.0-CURRENT User-Agent: Mutt/1.5.16 (2007-06-09) Cc: freebsd-sparc64@freebsd.org Subject: Re: HEAD panic with ofw_pcibus.c 1.21 on Blade 100 X-BeenThere: freebsd-sparc64@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: obrien@freebsd.org List-Id: Porting FreeBSD to the Sparc List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 02 Sep 2008 22:09:54 -0000 On Mon, Sep 01, 2008 at 11:54:42AM -0500, Nathan Whitehorn wrote: > Speaking of ahc(4), I have one in my Ultra 5 which will not work unless I > have options AHC_ALLOW_MEMIO in my kernel config. 
I think this option > should always be valid for sparc64 systems. Can it be in the default Simple enough request - done for 8-CURRENT. From owner-freebsd-sparc64@FreeBSD.ORG Thu Sep 4 22:00:04 2008 Return-Path: Delivered-To: freebsd-sparc64@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id C3CC2106567B for ; Thu, 4 Sep 2008 22:00:04 +0000 (UTC) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id B15CD8FC12 for ; Thu, 4 Sep 2008 22:00:04 +0000 (UTC) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (gnats@localhost [127.0.0.1]) by freefall.freebsd.org (8.14.2/8.14.2) with ESMTP id m84M041s058383 for ; Thu, 4 Sep 2008 22:00:04 GMT (envelope-from gnats@freefall.freebsd.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.2/8.14.1/Submit) id m84M04RA058382; Thu, 4 Sep 2008 22:00:04 GMT (envelope-from gnats) Date: Thu, 4 Sep 2008 22:00:04 GMT Message-Id: <200809042200.m84M04RA058382@freefall.freebsd.org> To: freebsd-sparc64@FreeBSD.org From: Marius Strobl Cc: Subject: Re: sparc64/127051: hme interfaces "pause" with the message "device timeout" on FreeBSD 7.0/sparc64 on an Enterprise 220R X-BeenThere: freebsd-sparc64@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: Marius Strobl List-Id: Porting FreeBSD to the Sparc List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 04 Sep 2008 22:00:04 -0000 The following reply was made to PR sparc64/127051; it has been noted by GNATS. 
From: Marius Strobl
To: Paulo Afonso Graner Fessel , bug-followup@FreeBSD.org
Cc:
Subject: Re: sparc64/127051: hme interfaces "pause" with the message "device timeout" on FreeBSD 7.0/sparc64 on an Enterprise 220R
Date: Thu, 4 Sep 2008 23:25:15 +0200

On Tue, Sep 02, 2008 at 08:58:56PM +0000, Paulo Afonso Graner Fessel wrote:
>
> >Environment:
> FreeBSD vtsfprfw01.virtus-ti.com.br 7.0-RELEASE FreeBSD 7.0-RELEASE #7: Fri Jun 20 19:29:52 BRT 2008 root@vtsuprfw01.dedalusprime.com.br:/usr/obj/usr/src/sys/VIRTUSFW sparc64

Please give 7.0-STABLE/7.1-PRERELEASE a try; there were several bugs in hme(4) fixed since 7.0-RELEASE.

Marius