From owner-freebsd-hackers@FreeBSD.ORG  Fri Jan 30 15:59:05 2015
Date: Fri, 30 Jan 2015 15:59:03 +0000 (UTC)
From: rondzierwa@comcast.net
To: freebsd-hackers@freebsd.org
Message-ID: <772071445.14131656.1422633543143.JavaMail.zimbra@comcast.net>
In-Reply-To: <1337518696.14094651.1422631127707.JavaMail.zimbra@comcast.net>
Subject: sync flood
MIME-Version: 1.0

I am running FreeBSD 10.1-RELEASE on a Sun Fire X4500 (Thumper) and have run into a couple of odd things that I was hoping someone could shed some light on.

I am using the system to share storage over NFS and Samba. The system has slots for up to 48 drives, of which 15 are currently populated (a boot disk and 14 array disks). I created a raidz ZFS pool and began creating shares.

When I began actually using the system to serve NFS shares over one of the on-board em Ethernet devices, the system crashed within a couple of minutes. When it rebooted, the BIOS screen stopped with a message indicating that a sync flood condition had caused the reboot. This was easily and quickly repeatable, and it made the server useless.

I have found several threads on the mailing lists where people have encountered this before, and I tried a few things, but what made it stop was disabling all but the boot processor using loader.conf. Once I was running on only one processor, the sync floods stopped, and the system has now been running under load for a day. The servers ran reliably under Solaris, but since they had reached end-of-life I was able to re-purpose them economically.
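
For anyone wanting to try the same single-processor test, here is a minimal sketch of a /boot/loader.conf fragment. It assumes the tunables described in smp(4); it is not a copy of the file on this machine, and the APIC ID in the commented line is only a placeholder.

```
# /boot/loader.conf (sketch, assuming the tunables documented in smp(4))

# Disable SMP entirely so that only the boot processor is used.
kern.smp.disabled=1

# Alternative: disable individual CPUs by local APIC ID instead.
# The ID 1 below is a placeholder; the real IDs appear in the boot messages.
#hint.lapic.1.disabled=1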

What led me to try running on only one processor was part of a thread that talked about changing the way interrupts are routed to the various processors:
http://lists.freebsd.org/pipermail/freebsd-stable/2010-July/057670.html 
That thread concerned a Sun X4100, so what they were doing did not seem to apply directly. Still, eliminating all but the boot processor would certainly sidestep any interrupt routing issues, it was easy enough to try, and it seems to have masked the problem.

For the long term, however, this is not a workable solution. This server will be given more and more things to do, and the other processors will become more of a necessity.

It seems like something in the default assignment of hardware resources has trouble with a system like this that has so much on the bus (6 Marvell SATA controllers, 4 Intel PRO/1000 controllers, 4 USB controllers). There is also an issue where FreeBSD can only allocate bus resources for 2 of the 4 Ethernet devices:
em2: <Intel(R) PRO/1000 Legacy Network Connection 1.0.6> mem 0xfdbe0000-0xfdbfffff irq 61 at device 1.0 on pci8 
em2: 0x40 bytes of rid 0x20 res 4 failed (0, 0xffffffffffffffff). 
em2: Unable to allocate bus resource: ioport 
em2: Allocation of PCI resources failed 
device_attach: em2 attach returned 6 
em2: <Intel(R) PRO/1000 Legacy Network Connection 1.0.6> mem 0xfdbc0000-0xfdbdffff irq 62 at device 1.1 on pci8 
em2: 0x40 bytes of rid 0x20 res 4 failed (0, 0xffffffffffffffff). 
em2: Unable to allocate bus resource: ioport 
em2: Allocation of PCI resources failed 
device_attach: em2 attach returned 6 
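
To make failures like these easier to spot in a long boot log, here is a small sketch that scans dmesg text for failed attaches and the resource type each device could not allocate. The function name and the script name used below are hypothetical, and the patterns assume the message wording shown above.

```python
import re
import sys

# Matches kernel lines such as "device_attach: em2 attach returned 6".
ATTACH_FAIL = re.compile(r"^device_attach: (\S+) attach returned (\d+)\s*$")
# Matches lines such as "em2: Unable to allocate bus resource: ioport".
ALLOC_FAIL = re.compile(r"^(\S+): Unable to allocate bus resource: (\S+)\s*$")


def summarize_attach_failures(dmesg_text):
    """Return a list of (device, resource, error) tuples, one per failed attach.

    `resource` is the resource type the device most recently failed to
    allocate before its attach failed, or None if no such line was seen.
    """
    pending = {}   # device name -> resource type from the latest allocation failure
    failures = []
    for line in dmesg_text.splitlines():
        m = ALLOC_FAIL.match(line)
        if m:
            pending[m.group(1)] = m.group(2)
            continue
        m = ATTACH_FAIL.match(line)
        if m:
            dev = m.group(1)
            failures.append((dev, pending.pop(dev, None), int(m.group(2))))
    return failures


if __name__ == "__main__":
    # Read the boot messages from standard input and print one line per failure.
    for dev, res, err in summarize_attach_failures(sys.stdin.read()):
        print(f"{dev}: attach returned {err}, failed resource: {res}")
```

Saved under a hypothetical name such as attach_failures.py, it could be run as `dmesg | python3 attach_failures.py`.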


There are plans to complicate things further by adding two InfiniBand interfaces.

Can anyone offer any ideas on how to chase this problem down?


thanks, 
ron.