From: Markus Gebert
Date: Thu, 19 Jan 2012 01:23:21 +0100
To: peter h
Cc: freebsd-stable@freebsd.org
Subject: Re: about thumper aka sun fire x4500

Hi Peter

On 18.01.2012, at 20:25, peter h wrote:

> On Wednesday 18 January 2012 18.15, Adam McDougall wrote:
>> On 01/17/12 17:09, Jeremy Chadwick wrote:
>>> On Tue, Jan 17, 2012 at 06:59:08PM +0100, peter h wrote:
>>>> I have been beating on of these a few days, i have udes freebsd 9.0 and 8.2
>>>> Both fails when i engage> 10 disks, the system craches and messages :
>>>> "Hyper transport sync flood" will get into the BIOS errorlog ( but nothing will
>>>> come to syslog since reboot is immediate)
>>>>
>>>> Using a zfs radz of 25 disks and typing "zpool scrub" will bring the system down in seconds.
>>>>
>>>> Anyone using a x4500 that can comfirm that it works ? Or is this box broken ?
>>>
>>
>> I've seen what is probably the same base issue but on multiple x4100m2
>> systems running FreeBSD 7 or 8 a few years ago. For me the instant
>> reboot and HT sync flood error happened when I fetched a ~200mb file via
>> HTTP using an onboard intel nic and wrote it out to a simple zfs mirror
>> on 2 disks. I may have tried the nvidia ethernet ports as an
>> alternative but that driver had its own issues at the time. This was
>> never a problem with FFS instead of ZFS. I could repeat it fairly
>> easily by running fetch in a loop (can't remember if writing the output
>> to disk was necessary to trigger it). The workaround I found that
>> worked for me was to buy a cheap intel PCIE nic and use that instead of
>> the onboard ports. If a zpool scrub triggers it for you, I doubt my
>> workaround will help but I wanted to relate my experience.
>
> The problem i had was most likley the disc-io itself. It was always there
> whenever a larger number of discs was in motion.It was never there as
> violent networking ( i even used myri2000 to increase traffic, never a problem)
>
> A scrub on the 20-or-so zpool was all that was needed, andn when rebooting
> the scrub continued and whoops - a new reboot.
>
> Sometimes the bios reported not even 16G mem but 10.5 ( which also freebsd noticed)
>
> Right now i am torturing the box with same load ( minus myri2000) and sunk-os,
> i'll report if it does show simular problems.
>
>
>>
>>> Given this above diagram, I'm sure you can figure out how "flooding"
>>> might occur.
>>> :-) I'm not sure what "sync flood" means (vs. I/O
>>> flooding).
>>
>> As I understand it, a sync flood is a purposeful reaction to an error
>> condition as somewhat of a last ditch effort to regain control over the
>> system (which ends up rebooting). I'm pulling this out of my memory
>> from a few years ago.

As Adam has pointed out, a sync flood is a way to signal an error condition on the HyperTransport bus. As I understand it, it's used as a last resort when less fatal means of error communication are no longer possible because of a problem on the transport or a device attached to it. The transport will not recover from this state until it's reset. On Sun AMD systems a reboot is triggered immediately when a sync flood is detected. The fact that it happened is mentioned during POST, but it should also appear in the machine's error logs (IPMI/ILOM), so if you haven't done this already, it might be worth checking them. Maybe you'll find additional information there.

You should be able to disable the automatic reset on sync flood in your BIOS settings. We did this on our Sun X4200M2 machines when we experienced sync flood errors. It allowed the kernel to catch an MCE, panic and print out information about the MCE. This might help you get more information about the cause.

Our problems with the X4200M2 have some similarities with your case, though in our case high IO (i.e. zpool scrub) alone did not reliably (read: within minutes or hours) trigger the MCE/sync flood. If we put load on the zpool _and_ the network (em), we could trigger it easily. Another similarity: another OS (in our case Linux) did not show the symptoms. Even other FreeBSD branches did not trigger the sync flood. You'll find the thread here:

http://lists.freebsd.org/pipermail/freebsd-stable/2010-July/057670.html

It's a rather long thread.
Short version: If the RAID controller (mpt) interrupts were routed to the first CPU (cpu0), everything worked; if not, a sync flood (or MCE) happened under heavy IO. It turned out that Linux and even older and newer FreeBSD versions (7.x, 9.x) assigned different interrupt routes for mpt0 compared to the FreeBSD 8.1 we were testing on. So what seemed like a bug in a specific FreeBSD version, because it didn't happen with other FreeBSD versions or Linux, turned out to be a hardware problem after all. IIRC a change in some hardware clock code caused an additional IRQ to be registered on boot (or one less), which reshuffled interrupt assignments compared to the older FreeBSD versions we had used successfully on those machines. So we fixed it by setting a tunable which restored the old clock behavior and thus the old interrupt assignments.

It is impossible to tell whether you have the same problem. But if you don't see any problems with other operating systems, it may be worth playing around with interrupt assignments. Luckily, the routings are tunable at runtime through cpuset(1). For example:

    # cpuset -c -l 0 -x 58

IRQ58 was used by mpt0. Rerouting it to cpu0 made all problems go away.

Hope this helps you in some way.

Good luck,

--
Markus
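
PS: A minimal sketch of how the IRQ lookup for the rerouting above might be scripted. It is an illustration under assumptions, not something taken from the thread: it assumes the interrupt list comes from `vmstat -i` in its usual "irqN: device total rate" layout, and the helper names irq_for_device and pin_command are hypothetical. It only prints a cpuset command in the same form as the example above; it does not run it.

```shell
#!/bin/sh
# Hypothetical helper: read `vmstat -i` style lines on stdin (assumed
# layout: "irq58: mpt0   123456   100") and print the IRQ number of the
# first line that lists the given device name.
irq_for_device() {
    awk -v dev="$1" '
        $1 ~ /^irq[0-9]+:$/ {
            for (i = 2; i <= NF; i++) {
                if ($i == dev) {
                    irq = $1
                    sub(/^irq/, "", irq)   # drop the "irq" prefix
                    sub(/:$/, "", irq)     # drop the trailing colon
                    print irq
                    exit
                }
            }
        }'
}

# Hypothetical helper: print (do not execute) a cpuset command, in the
# form used in the mail above, that would route the device's IRQ to a CPU.
pin_command() {
    dev=$1
    cpu=$2
    irq=$(irq_for_device "$dev")
    if [ -z "$irq" ]; then
        echo "no IRQ found for $dev" >&2
        return 1
    fi
    echo "cpuset -c -l $cpu -x $irq"
}
```

On a live system this could be used as `vmstat -i | pin_command mpt0 0`, reviewing the printed command before running it as root. Devices sharing an IRQ, or names that vmstat truncates, are not handled.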