From: Csillag Tamas
To: freebsd-scsi@freebsd.org
Date: Fri, 24 Feb 2012 00:59:33 +0100
Message-ID: <20120223235932.GB19927@rivendell>
Subject: mfi timeout issues and patch that seems to work (PERC H800)

Hi,

I had the same issue with the PERC H800 controller as described here:
http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/140416

(For the benefit of Google searches, the error is:
mfi0: COMMAND 0xffffff80009c4b90 TIMEOUT AFTER 41 SECONDS)

mfsbsd# mfiutil show adapter
mfi0 Adapter:
    Product Name: PERC H800 Adapter
   Serial Number: 1A8006L
        Firmware: 12.10.2-0004
     RAID Levels: JBOD, RAID0, RAID1, RAID5, RAID6, RAID10, RAID50
  Battery Backup: present
           NVRAM: 32K
  Onboard Memory: 1024M
  Minimum Stripe: 8k
  Maximum Stripe: 1M

mfsbsd# mfiutil show firmware
mfi0 Firmware Package Version: 12.10.2-0004
mfi0 Firmware Images:
Name  Version                         Date         Time      Status
BIOS  3.18.00_4.09.05.00_0x0416A000   00_0x0416A000          active
APP   2.100.03-1405                   Sep 19 2011  17:58:36  active
PCLI  04.04-010:#%00008               May 31 2010  20:21:52  active
CTLR  2.02-0025.1                     Aug 22 2011  11:37:38  active
NVDT  2.07.03-0003                    Jul 14 2010  15:53:29  active
BTBL  2.02.00.00-0000                 Sep 16 2009  21:37:06  active
BOOT  01.250.04.219                   4/28/2009    12:51:38  active

However, getting and compiling the newest kernel did NOT fix it for me.
Issuing commands with mfiutil still cleared the hang, and everything
returned to normal.

It seems that intensive reading triggers the issue, but if writes are
happening concurrently you are (mostly) fine. Restarting rsync is an
ideal way to trigger this buggy condition.
I poked around in the source code and ended up tweaking the patch that
was posted here earlier
(http://lists.freebsd.org/pipermail/freebsd-scsi/2011-March/004839.html),
replacing

    (void)sc->mfi_read_fw_status(sc);

with

    mfi_get_controller_info(sc);

around line 933. After testing it for a day, it seems to be solid.
(The original patch did not help me.)

Can someone who is an expert in this area confirm whether any of this
is correct?

Thanks in advance!

Regards,
 cstamas

-- 
CSILLAG Tamas (cstamas) - http://digitus.itk.ppke.hu/~cstamas

Arguing with an engineer is like wrestling with a pig in mud. After a
while, you realise the pig is enjoying it. -- Jamie Lawrence.
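
To make the one-line change above easier to picture, here is a minimal userspace mock in C. It is not the driver source: `struct mock_softc`, the `mock_*` functions and both `timeout_*` functions are hypothetical stand-ins. It also assumes, without verification here, that `mfi_get_controller_info()` sends a real command to the firmware while `mfi_read_fw_status` only reads a status register. That assumption would fit the observation that issuing commands with mfiutil clears the hang.

```c
#include <stdio.h>

/* Hypothetical stand-in for the driver's softc; NOT the real struct mfi_softc. */
struct mock_softc {
    int status_reads;   /* how often the firmware status register was read */
    int info_commands;  /* how often a full command was sent to the firmware */
    int (*mfi_read_fw_status)(struct mock_softc *);
};

/* Mock of the status read: assumed to touch no command queue. */
static int mock_read_fw_status(struct mock_softc *sc)
{
    sc->status_reads++;
    return 0;
}

/* Mock of mfi_get_controller_info(): assumed to be a full command round trip. */
static int mock_get_controller_info(struct mock_softc *sc)
{
    sc->info_commands++;
    return 0;
}

/* Timeout path with the original patch: only reads the status register. */
static void timeout_original_patch(struct mock_softc *sc)
{
    (void)sc->mfi_read_fw_status(sc);
}

/* Timeout path with the tweak: sends a command to the controller instead. */
static void timeout_tweaked(struct mock_softc *sc)
{
    (void)mock_get_controller_info(sc);
}

/* Print the counters so the two paths can be compared. */
static void mock_report(const struct mock_softc *sc)
{
    printf("status reads: %d, info commands: %d\n",
        sc->status_reads, sc->info_commands);
}
```

Calling `timeout_original_patch()` and then `timeout_tweaked()` on a zeroed `struct mock_softc` (with `mfi_read_fw_status` pointing at `mock_read_fw_status`) and printing with `mock_report()` shows the only difference the mock models: the tweaked path counts as a command, the original one does not.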