From owner-freebsd-stable@FreeBSD.ORG Mon Jul 12 16:08:56 2010
From: Markus Gebert
To: Adam McDougall
Cc: freebsd-stable
Subject: Re: 8.1-RC2 - PCI fatal error or MCE triggered by USB/ehci on Sun X4100M2?
Date: Mon, 12 Jul 2010 18:08:54 +0200
Message-Id: <2CA1E184-D9FF-4093-A6D9-18AF6DDC7407@hostpoint.ch>
In-Reply-To: <20100712154316.GD15784@egr.msu.edu>
References: <6B57591F-9FA2-45EB-825F-1DB025C0635D@hostpoint.ch> <20100712154316.GD15784@egr.msu.edu>
List-Id: Production branch of FreeBSD source code
Mime-Version: 1.0 (Apple Message framework v1078)
Content-Type: text/plain; charset=us-ascii
X-Mailer: Apple Mail (2.1078)

On 12.07.2010, at 17:43, Adam McDougall wrote:

> I also get MCE on x4100m2 when causing significant disk activity in mpt
> while also downloading through em0 or em1.

Could you reproduce this on 6.x or 7.x? Whatever we try here, we simply
haven't been able to so far. A short test with Ubuntu also didn't show any
sign of problems.

> I was not able to trigger it
> while using nfe, however nfe locked up on me during normal DNS server
> traffic so that was a wash.

We had issues with nfe pre-8.x, which is why we have been using the em
NICs, which now seem to be part of the problem in 8.x.

> What seemed to work for me was to add an
> Intel PCIE nic to the server and use it instead of the onboard NICS.

Thanks for the hint.

> For whatever reason I never experienced this problem until using ZFS.

We were able to reproduce it with UFS on 8.x with just one disk (no
gmirror), but I guess it's easier to trigger with ZFS, especially in a
mirror setup.

> I triggered it by downloading a 200m tgz file via http repeatedly
> over gigabit and it would reliably crash within a minute or two.

Our test case is basically:

1. fetch a large file using wget over em0 (a 100mbit link seems to be enough)
2. cp a large file locally to stress mpt
3. wait for the MCE

> I ordered a dozen nics for probably around $20 each and was satisfied
> with this workaround given the age of the servers. I'm pinched on time
> for work so I often don't get around to reporting issues where I've
> found a workaround, I'm glad you can get that started.

"Glad" we're not the only ones :-)


Markus
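
For anyone who wants to try the three-step test case above, it can be sketched roughly as the shell script below. This is only an illustration under assumptions: the URL and file paths are placeholders that do not come from the original report, the `run`, `net_load`, `disk_load` and `repro` helpers are made up for this sketch, and the script cannot itself ensure that the download is routed over em0 or that the copy hits a disk behind mpt. By default it only prints the commands.

```shell
#!/bin/sh
# Sketch of the three-step test case. URL, SRC and DST are placeholders,
# not values from the original report.
URL="${URL:-http://example.invalid/large.bin}"
SRC="${SRC:-/var/tmp/large.src}"
DST="${DST:-/var/tmp/large.dst}"
DRY_RUN="${DRY_RUN:-1}"   # default: only print the commands

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Step 1: network load (the route to URL must go out via em0).
net_load()  { run wget -q -O /dev/null "$URL"; }

# Step 2: disk load (SRC and DST must live on a disk attached to mpt).
disk_load() { run cp "$SRC" "$DST"; }

repro() {
    net_load &    # download in the background
    disk_load     # copy in the foreground, concurrently with the download
    wait          # wait for the download to finish; step 3 is to watch
                  # the console for the MCE while both are running
}

# Opt-in entry point, so sourcing the file has no side effects.
if [ "${RUN_REPRO:-0}" = "1" ]; then repro; fi
```

To actually generate load, something like `DRY_RUN=0 RUN_REPRO=1 URL=... SRC=... DST=... sh repro.sh` would be needed, with a large enough file on both sides; a single pass may have to be repeated before anything shows up.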