From: Adam McDougall
Date: Wed, 18 Jan 2012 12:15:46 -0500
To: freebsd-stable@freebsd.org
Subject: Re: about thumper aka sun fire x4500
Message-ID: <4F16FE42.3090300@egr.msu.edu>
In-Reply-To: <20120117220912.GA32330@icarus.home.lan>

On 01/17/12 17:09, Jeremy Chadwick wrote:
> On Tue, Jan 17, 2012 at 06:59:08PM +0100, peter h wrote:
>> I have been beating on one of these for a few days; I have used FreeBSD 9.0 and 8.2.
>> Both fail when I engage > 10 disks: the system crashes and the message
>> "Hyper transport sync flood" gets into the BIOS error log (but nothing
>> reaches syslog, since the reboot is immediate).
>>
>> Using a ZFS raidz of 25 disks and typing "zpool scrub" will bring the system down in seconds.
>>
>> Anyone using an x4500 who can confirm that it works? Or is this box broken?
>

I've seen what is probably the same underlying issue, but on multiple x4100m2 systems running FreeBSD 7 or 8 a few years ago. For me, the instant reboot and HT sync flood error happened when I fetched a ~200 MB file via HTTP using an onboard Intel NIC and wrote it out to a simple ZFS mirror on 2 disks. I may have tried the NVIDIA Ethernet ports as an alternative, but that driver had its own issues at the time. This was never a problem with FFS instead of ZFS. I could reproduce it fairly easily by running fetch in a loop (I can't remember whether writing the output to disk was necessary to trigger it).

The workaround that worked for me was to buy a cheap Intel PCIe NIC and use that instead of the onboard ports. If a zpool scrub triggers it for you, I doubt my workaround will help, but I wanted to relate my experience.

> Given this above diagram, I'm sure you can figure out how "flooding"
> might occur. :-) I'm not sure what "sync flood" means (vs. I/O
> flooding).

As I understand it, a sync flood is a deliberate reaction to an error condition, a last-ditch effort to regain control of the system (which ends up rebooting). I'm recalling this from memory from a few years ago.
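
For anyone who wants to try the fetch-in-a-loop reproduction described earlier in this message, here is a rough sketch. It is written in Python rather than as the original fetch(1) invocation, which is not preserved here. The function name fetch_loop, the example URL, and the destination path are all placeholders (assumptions, not taken from the original setup), and there is no guarantee this triggers the same failure.

```python
import urllib.request


def fetch_loop(url, dest=None, iterations=10, chunk_size=1 << 20):
    """Download `url` repeatedly and return the total number of bytes read.

    If `dest` is given, each download overwrites that file, so the data is
    also pushed through the filesystem under test. If `dest` is None, the
    data is read and discarded, which exercises only the network path.
    """
    total = 0
    for _ in range(iterations):
        with urllib.request.urlopen(url) as resp:
            out = open(dest, "wb") if dest else None
            try:
                while True:
                    chunk = resp.read(chunk_size)
                    if not chunk:
                        break
                    total += len(chunk)
                    if out:
                        out.write(chunk)
            finally:
                if out:
                    out.close()
    return total


if __name__ == "__main__":
    # Hypothetical values: substitute a real ~200 MB file and a path
    # on the ZFS pool being tested.
    print(fetch_loop("http://example.com/bigfile.bin", "/tank/test.bin"))
```

Running it once with dest set and once with dest=None would help answer the open question above, namely whether writing the output to disk is needed to trigger the reboot.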