From owner-freebsd-stable@FreeBSD.ORG  Thu Jan 19 00:23:25 2012
Return-Path: <owner-freebsd-stable@FreeBSD.ORG>
Delivered-To: freebsd-stable@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id C566E106566C
	for <freebsd-stable@freebsd.org>; Thu, 19 Jan 2012 00:23:25 +0000 (UTC)
	(envelope-from markus.gebert@hostpoint.ch)
Received: from mail.adm.hostpoint.ch (mail.adm.hostpoint.ch
	[IPv6:2a00:d70:0:a::e0])
	by mx1.freebsd.org (Postfix) with ESMTP id 1D72F8FC19
	for <freebsd-stable@freebsd.org>; Thu, 19 Jan 2012 00:23:25 +0000 (UTC)
Received: from 46-127-111-189.dynamic.hispeed.ch ([46.127.111.189]:50373
	helo=[172.16.1.20])
	by mail.adm.hostpoint.ch with esmtpsa (TLSv1:AES128-SHA:128)
	(Exim 4.69 (FreeBSD)) (envelope-from <markus.gebert@hostpoint.ch>)
	id 1Rnfml-000Azs-Mz; Thu, 19 Jan 2012 01:23:23 +0100
Mime-Version: 1.0 (Apple Message framework v1251.1)
From: Markus Gebert <markus.gebert@hostpoint.ch>
In-Reply-To: <201201182025.26736.peter@hk.ipsec.se>
Date: Thu, 19 Jan 2012 01:23:21 +0100
Message-Id: <F32BCF4A-9CD8-4B7F-A1DD-1B53120C577A@hostpoint.ch>
References: <201201171859.10812.peter@hk.ipsec.se>
	<20120117220912.GA32330@icarus.home.lan>
	<4F16FE42.3090300@egr.msu.edu>
	<201201182025.26736.peter@hk.ipsec.se>
To: peter h <peter@hk.ipsec.se>
X-Mailer: Apple Mail (2.1251.1)
Content-Type: text/plain;
	charset=iso-8859-1
Content-Transfer-Encoding: quoted-printable
X-Content-Filtered-By: Mailman/MimeDel 2.1.5
Cc: freebsd-stable@freebsd.org
Subject: Re: about thumper aka sun fire x4500
X-BeenThere: freebsd-stable@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Production branch of FreeBSD source code <freebsd-stable.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-stable>, 
	<mailto:freebsd-stable-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-stable>
List-Post: <mailto:freebsd-stable@freebsd.org>
List-Help: <mailto:freebsd-stable-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-stable>,
	<mailto:freebsd-stable-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 19 Jan 2012 00:23:25 -0000

Hi Peter

On 18.01.2012, at 20:25, peter h wrote:

> On Wednesday 18 January 2012 18.15, Adam McDougall wrote:
>> On 01/17/12 17:09, Jeremy Chadwick wrote:
>>> On Tue, Jan 17, 2012 at 06:59:08PM +0100, peter h wrote:
>>>> I have been beating on one of these a few days, i have used freebsd
>>>> 9.0 and 8.2
>>>> Both fail when i engage > 10 disks, the system crashes and messages:
>>>> "Hyper transport sync flood" will get into the BIOS errorlog ( but
>>>> nothing will come to syslog since reboot is immediate)
>>>>
>>>> Using a zfs raidz of 25 disks and typing "zpool scrub" will bring the
>>>> system down in seconds.
>>>>
>>>> Anyone using a x4500 that can confirm that it works ? Or is this box
>>>> broken ?
>>>
>>
>> I've seen what is probably the same base issue but on multiple x4100m2
>> systems running FreeBSD 7 or 8 a few years ago.  For me the instant
>> reboot and HT sync flood error happened when I fetched a ~200mb file
>> via HTTP using an onboard intel nic and wrote it out to a simple zfs
>> mirror on 2 disks.  I may have tried the nvidia ethernet ports as an
>> alternative but that driver had its own issues at the time.  This was
>> never a problem with FFS instead of ZFS.  I could repeat it fairly
>> easily by running fetch in a loop (can't remember if writing the
>> output to disk was necessary to trigger it).  The workaround I found
>> that worked for me was to buy a cheap intel PCIE nic and use that
>> instead of the onboard ports.  If a zpool scrub triggers it for you, I
>> doubt my workaround will help but I wanted to relate my experience.
>
> The problem i had was most likely the disc-io itself. It was always
> there whenever a larger number of discs was in motion. It was never
> there as violent networking ( i even used myri2000 to increase traffic,
> never a problem)
>
> A scrub on the 20-or-so zpool was all that was needed, and when
> rebooting the scrub continued and whoops - a new reboot.
>
> Sometimes the bios reported not even 16G mem but 10.5 ( which also
> freebsd noticed)
>
> Right now i am torturing the box with same load ( minus myri2000) and
> sunk-os, i'll report if it does show similar problems.
>
>
>>
>>> Given this above diagram, I'm sure you can figure out how "flooding"
>>> might occur.  :-)  I'm not sure what "sync flood" means (vs. I/O
>>> flooding).
>>
>> As I understand it, a sync flood is a purposeful reaction to an error
>> condition as somewhat of a last ditch effort to regain control over
>> the system (which ends up rebooting).  I'm pulling this out of my
>> memory from a few years ago.

As Adam has pointed out, a sync flood is a way to signal an error
condition on the HyperTransport interconnect. As I understand it, it's
used as a last resort when less fatal means of error communication are
no longer possible because of a problem on the transport or a device
attached to it. The transport will not recover from this state until
it's reset. On Sun AMD systems a reboot is triggered immediately when a
sync flood is detected. The fact that it happened is mentioned during
POST, but it should also appear in the machine's error logs (IPMI/ILOM),
so if you haven't done this already, it might be worth checking them.
Maybe you'll find additional information there.
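
If the service processor is reachable over IPMI from the running
system, its event log can be read without going through the web
interface. This is only a sketch, assuming ipmitool is installed (for
example from ports) and the ipmi(4) driver is loaded. The helper names
dump_sel and sel_filter are made up for illustration, and the search
pattern is left to you, because the exact wording the service
processor uses for a sync flood is not something I can vouch for.

```shell
#!/bin/sh
# Sketch: read the BMC's System Event Log (SEL) from the running host.
# Assumes ipmitool is installed and ipmi(4) is loaded (kldload ipmi).

# Print the SEL in ipmitool's extended, human-readable form.
dump_sel() {
    ipmitool sel elist
}

# Keep only the lines on stdin that match the case-insensitive
# pattern given as $1. The pattern is the caller's guess, not a
# known log string.
sel_filter() {
    grep -i -e "$1"
}

# Example (not run automatically):
#   dump_sel | sel_filter 'flood'
```

Running "dump_sel" without a filter first is probably the better start,
since it shows how the entries around the reboot are actually worded.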

You should be able to disable the automatic reset on sync flood in your
BIOS settings. We did this on our Sun X4200M2 machines when we
experienced sync flood errors. It allowed the kernel to catch a machine
check exception (MCE), panic and print out information about the MCE.
This might help you get more information about the cause.

Our problems with the X4200M2 have some similarities with your case,
though in our case high IO alone (e.g. zpool scrub) did not reliably
(read: within minutes or hours) trigger the MCE/sync flood. If we put
load on the zpool _and_ the network (em) we could trigger it easily.
Another similarity: another OS (in our case Linux) did not show the
symptoms. Even other FreeBSD branches did not trigger the sync flood.
You'll find the thread here:

http://lists.freebsd.org/pipermail/freebsd-stable/2010-July/057670.html

It's a rather long thread. Short version: if the RAID controller (mpt)
interrupts were routed to the first CPU (cpu0), everything worked; if
not, a sync flood (or MCE) happened under heavy IO. It so happens that
Linux and even older and newer FreeBSD versions (7.x, 9.x) assigned
different interrupt routes for mpt0 compared to the FreeBSD 8.1 we were
testing on. So what seemed like a bug in a specific FreeBSD version,
because it didn't happen with other FreeBSD versions or Linux, turned
out to be a hardware problem after all. IIRC a change in some hardware
clock code caused an additional IRQ to be registered on boot (or one
fewer), which reshuffled interrupt assignments compared to older FreeBSD
versions we had used successfully on those machines. So we fixed it by
setting a tunable which restored the old clock behavior and thus the old
interrupt assignments.

It's impossible to tell whether you have the same problem. But if you
don't see any problems with other operating systems, it may be worth
playing around with interrupt assignments. Luckily, the routings are
tunable at runtime through cpuset(1). For example:

# cpuset -c -l 0 -x 58

On our machines, IRQ 58 was used by mpt0. Rerouting it to cpu0 made all
problems go away. Hope this helps you in some way.
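
The IRQ number will differ from machine to machine, so it can be
looked up instead of hard-coded. The following is a minimal sketch, not
something taken from the 2010 thread. It assumes "vmstat -i" prints
lines of the form "irq58: mpt0 <total> <rate>" with the device name as
the second field (shared interrupts may be listed differently), and it
uses the "cpuset -l <cpu> -x <irq>" form. The function names
irq_for_device and pin_device_to_cpu are made up for illustration.

```shell
#!/bin/sh
# Sketch: find the IRQ(s) of a device and bind them to one CPU.

# Read "vmstat -i" style text on stdin and print the IRQ number of
# every line whose second field equals the device name given as $1.
irq_for_device() {
    awk -v dev="$1" '
        $1 ~ /^irq[0-9]+:$/ && $2 == dev {
            irq = $1
            sub(/^irq/, "", irq)   # strip the leading "irq"
            sub(/:$/, "", irq)     # strip the trailing colon
            print irq
        }'
}

# Bind every IRQ of device $1 to CPU $2 (needs root).
pin_device_to_cpu() {
    dev="$1"
    cpu="$2"
    for irq in $(vmstat -i | irq_for_device "$dev"); do
        cpuset -l "$cpu" -x "$irq"
    done
}

# Example (not run automatically):
#   pin_device_to_cpu mpt0 0
#   cpuset -g -x 58     # show the resulting CPU mask for that IRQ
```

A binding made this way does not survive a reboot, so if it helps, the
call would have to be repeated at boot, for instance from
/etc/rc.local.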


Good luck,

-- 
Markus