From: Antony Mawer
To: John Baldwin
Cc: Ireneusz Pluta, freebsd-hardware@freebsd.org
Date: Tue, 29 Jun 2010 09:19:46 +1000
Subject: Re: System hangs during heavy sequential write to mfi device
List-Id: General discussion of FreeBSD hardware

On Tue, Jun 29, 2010 at 4:09 AM, John Baldwin wrote:
> On Monday 28 June 2010 1:57:27 pm Ireneusz Pluta wrote:
>> John Baldwin wrote:
>> > On Monday 28 June 2010 12:00:06 pm Ireneusz Pluta wrote:
>> >> John Baldwin wrote:
>> >>> On Friday 25 June 2010 4:59:57 pm Ireneusz Pluta wrote:
>> >>>> John Baldwin wrote:
>> >>>>> Hmmm.  You might have a hardware issue.  OTOH, you can try
>> >>>>> seeing if you have a BIOS option to disable PCIE error logging.
>> >>>>
>> >>>> is it one of them?:
>> >>>>
>> >>>> Assert NMI on SERR
>> >>>> Assert NMI on PERR
>> >>>>
>> >>>> (pdf page 109 of:
>> >>>> http://download.intel.com/support/motherboards/server/s5520hc/sb/e39529013_s5520hc_s5500hcv_s5520hct_tps_r1_9.pdf)
>> >>>
>> >>> Well, that will turn off the NMIs.  Not sure if it will affect the
>> >>> event logging, but it is worth a shot.
>> >>
>> >> Per BIOS setup documentation:
>> >>
>> >> On SERR, generate an NMI and log an error.
>> >> Note: [Enabled] must be selected for the Assert NMI
>> >> on PERR setup option to be visible.
>> >>
>> >> and:
>> >>
>> >> On PERR, generate an NMI and log an error.
>> >> Note: This option is only active if the Assert NMI on
>> >> SERR option is [Enabled] selected.
>> >>
>> >> However, disabling them did not change anything.
>> >
>> > Is it still logging errors and sending NMIs with them disabled?
>>
>> with the options I mentioned disabled. They do not have to be the only
>> sources of NMIs, do they?
>
> Well, they should be the sources of the log messages you found in your
> system event log.  There is a good chance that you have some broken
> hardware somewhere, I'm not sure how easy it is for you to debug that via
> swapping out components, but the RAID controller is the first thing I
> would try.

You might want to try a BIOS update to see if it resolves these
problems ... I seem to remember some mention of these sorts of errors
in the change log of one of the recent Intel server board BIOSes.

--
Antony