Date:      Fri, 24 Feb 2012 00:59:33 +0100
From:      Csillag Tamas <cstamas@digitus.itk.ppke.hu>
To:        freebsd-scsi@freebsd.org
Subject:   mfi timeout issues and patch that seems to work (PERC H800)
Message-ID:  <20120223235932.GB19927@rivendell>

Hi,

I hit the same issue with the PERC H800 controller as the one described
here:
http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/140416
(For the benefit of search engines, here is the error message:
mfi0: COMMAND 0xffffff80009c4b90 TIMEOUT AFTER 41 SECONDS)

mfsbsd# mfiutil show adapter
mfi0 Adapter:
    Product Name: PERC H800 Adapter
   Serial Number: 1A8006L
        Firmware: 12.10.2-0004
     RAID Levels: JBOD, RAID0, RAID1, RAID5, RAID6, RAID10, RAID50
  Battery Backup: present
           NVRAM: 32K
  Onboard Memory: 1024M
  Minimum Stripe: 8k
  Maximum Stripe: 1M

mfsbsd# mfiutil show firmware
mfi0 Firmware Package Version: 12.10.2-0004
mfi0 Firmware Images:
Name  Version                        Date           Time      Status
BIOS  3.18.00_4.09.05.00_0x0416A000  00_0x0416A000            active
APP   2.100.03-1405                  Sep 19 2011    17:58:36  active
PCLI  04.04-010:#%00008              May 31 2010    20:21:52  active
CTLR  2.02-0025.1                    Aug 22 2011    11:37:38  active
NVDT  2.07.03-0003                   Jul 14 2010    15:53:29  active
BTBL  2.02.00.00-0000                Sep 16 2009    21:37:06  active
BOOT  01.250.04.219                  4/28/2009      12:51:38  active

However, fetching and compiling the newest kernel did NOT fix it for me.
Issuing commands with mfiutil still cleared the hang, and everything
returned to normal.
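
Until a proper fix is available, the mfiutil workaround above could be
automated roughly like this. It is only a sketch under assumptions: the
helper names has_mfi_timeout and poke_controller and the MFI_WATCH
variable are made up for illustration, the 30-second interval is
arbitrary, and I am assuming that any harmless mfiutil query (here
"show adapter") is enough to un-wedge the controller, as it was when I
ran it by hand.

```shell
#!/bin/sh
# Sketch of a workaround watchdog; helper names are hypothetical.

# Return 0 if the given text contains an mfi command-timeout line such as
# "mfi0: COMMAND 0xffffff80009c4b90 TIMEOUT AFTER 41 SECONDS".
has_mfi_timeout() {
    printf '%s\n' "$1" |
        grep -Eq 'mfi[0-9]+: COMMAND 0x[0-9a-f]+ TIMEOUT AFTER [0-9]+ SECONDS'
}

# Issue a harmless query to the controller (assumption: this is enough
# to clear the hang, as observed when running mfiutil manually).
poke_controller() {
    mfiutil show adapter > /dev/null 2>&1
}

# The loop only runs when MFI_WATCH=1 is set, so the file can also be
# sourced just for the helper functions.
if [ "${MFI_WATCH:-0}" = "1" ]; then
    while :; do
        # Note: this keeps poking for as long as the timeout message
        # stays within the last 20 lines of dmesg output.
        if has_mfi_timeout "$(dmesg | tail -n 20)"; then
            poke_controller
        fi
        sleep 30
    done
fi
```

Run it as root with MFI_WATCH=1 set in the environment. This only papers
over the hang; it does not address whatever causes the timeout.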

It seems that intensive reading triggers the issue, but if writes are
happening concurrently you are (mostly) fine. Restarting rsync is an
ideal way to trigger the bug.

I poked around in the source code and in the end tweaked the patch that
was posted here earlier
(http://lists.freebsd.org/pipermail/freebsd-scsi/2011-March/004839.html),

replacing
    (void)sc->mfi_read_fw_status(sc);
with
    mfi_get_controller_info(sc);
around line 933

and after testing it for a day it seems to be solid.
(The original patch did not help me.)

Can an expert on the topic confirm whether any of this is correct?

Thanks in advance!

Regards,
  cstamas
-- 
CSILLAG Tamas (cstamas) - http://digitus.itk.ppke.hu/~cstamas

Arguing with an engineer is like wrestling with a pig in mud. After a while,
you realise the pig is enjoying it.                  -- Jamie Lawrence. 



