From: peter h
To: freebsd-stable@freebsd.org
Date: Wed, 18 Jan 2012 20:25:25 +0100
Subject: Re: about thumper aka sun fire x4500
Message-Id: <201201182025.26736.peter@hk.ipsec.se>
In-Reply-To: <4F16FE42.3090300@egr.msu.edu>
References: <201201171859.10812.peter@hk.ipsec.se> <20120117220912.GA32330@icarus.home.lan> <4F16FE42.3090300@egr.msu.edu>

On Wednesday 18 January 2012 18.15, Adam McDougall wrote:
> On 01/17/12 17:09, Jeremy Chadwick wrote:
> > On Tue, Jan 17, 2012 at 06:59:08PM +0100, peter h wrote:
> >> I have been beating on one of these for a few days; I have used FreeBSD 9.0 and 8.2.
> >> Both fail when I engage more than 10 disks: the system crashes, and the message
> >> "Hyper transport sync flood" ends up in the BIOS error log (but nothing
> >> reaches syslog, since the reboot is immediate).
> >>
> >> Using a ZFS raidz of 25 disks and typing "zpool scrub" will bring the system down in seconds.
> >>
> >> Can anyone using an x4500 confirm that it works? Or is this box broken?
> >
>
> I've seen what is probably the same base issue, but on multiple x4100m2
> systems running FreeBSD 7 or 8 a few years ago. For me the instant
> reboot and HT sync flood error happened when I fetched a ~200 MB file via
> HTTP using an onboard Intel NIC and wrote it out to a simple ZFS mirror
> on 2 disks. I may have tried the nvidia ethernet ports as an
> alternative, but that driver had its own issues at the time. This was
> never a problem with FFS instead of ZFS. I could repeat it fairly
> easily by running fetch in a loop (can't remember if writing the output
> to disk was necessary to trigger it). The workaround I found that
> worked for me was to buy a cheap Intel PCIE NIC and use that instead of
> the onboard ports. If a zpool scrub triggers it for you, I doubt my
> workaround will help, but I wanted to relate my experience.

The problem I had was most likely the disk I/O itself. It was always there
whenever a larger number of disks was in motion. It never showed up under
violent networking (I even used myri2000 to increase traffic, never a problem).
A scrub on the 20-or-so disk zpool was all that was needed, and when rebooting
the scrub continued and whoops - a new reboot.

Sometimes the BIOS reported not even 16G of memory but 10.5 (which FreeBSD also noticed).

Right now I am torturing the box with the same load (minus myri2000) and sunk-os;
I'll report if it shows similar problems.

>
> > Given this above diagram, I'm sure you can figure out how "flooding"
> > might occur. :-) I'm not sure what "sync flood" means (vs. I/O
> > flooding).
>
> As I understand it, a sync flood is a purposeful reaction to an error
> condition as somewhat of a last ditch effort to regain control over the
> system (which ends up rebooting). I'm pulling this out of my memory
> from a few years ago.
>

-- 
Peter Håkanson

There's never money to do it right, but always money to do it
again ... and again ... and again ... and again.
( It is cheaper to do it right. It is expensive to fix mistakes. )