Date:      Fri, 25 Sep 2015 21:35:22 +0000 (UTC)
From:      Pallav Bose <pallav_bose@yahoo.com>
To:        "freebsd-questions@freebsd.org" <freebsd-questions@freebsd.org>
Subject:   Interrupt storm and poor disk performance | mfi(4) driver | FreeBSD 8 | Dell PERC H730
Message-ID:  <472489221.917644.1443216922050.JavaMail.yahoo@mail.yahoo.com>

Hello,
I have a Dell PowerEdge R430 server with a PERC H730 RAID controller.
I'm trying to get FreeBSD 8 to install and run on this server. At this
time, I have a patched version of the mfi(4) driver which attaches to
the controller. I'm aware of mrsas(4), but since I have scripts that
use mfiutil(8), I'd like to continue using the mfi(4) driver.

A simple dd test shows SSD performance to be very poor:

# dd if=/dev/mfid0 of=/dev/null bs=1m count=1024
1024+0 records in
1024+0 records out
1073741824 bytes transferred in 27.978784 secs (38377001 bytes/sec)
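
For reference, the dd figure above can be converted to more familiar
units. The short Python sketch below is an illustration only (Python is
not part of the original report, and the function name is made up): it
recomputes the rate from the byte count and elapsed time that dd
printed.

```python
# Figures copied from the dd output above.
bytes_transferred = 1073741824
elapsed_secs = 27.978784


def throughput(nbytes, secs):
    """Return (bytes/sec, MiB/sec) for a transfer of nbytes in secs seconds."""
    bps = nbytes / secs
    return bps, bps / (1024 * 1024)


bps, mib_per_sec = throughput(bytes_transferred, elapsed_secs)
print(f"{bps:.0f} bytes/sec = {mib_per_sec:.1f} MiB/sec")
```

That works out to roughly 36.6 MiB/sec for a sequential read.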

top -PHS shows a lot of CPU time being used by the swi6 s/w interrupt
handler:

last pid: 81270;  load averages:  0.01,  0.05,  0.05    up 0+05:34:20  15:45:51
302 processes: 7 running, 278 sleeping, 17 waiting
CPU 0:  0.0% user,  0.0% nice,  0.0% system, 52.6% interrupt, 47.4% idle
CPU 1:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 2:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 3:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 4:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 5:  0.0% user,  0.0% nice,  0.0% system,  0.7% interrupt, 99.3% idle
Mem: 48M Active, 4044K Inact, 997M Wired, 7144K Cache, 1248K Buf, 30G Free
Swap:

  PID USERNAME    PRI NICE   SIZE    RES STATE   C   TIME    WCPU COMMAND
   10 root        171 ki31     0K   192K CPU5    5 319:51 100.00% {idle: cpu5}
   10 root        171 ki31     0K   192K CPU2    2 293:32  94.58% {idle: cpu2}
   10 root        171 ki31     0K   192K CPU3    3 298:46  93.65% {idle: cpu3}
   10 root        171 ki31     0K   192K CPU4    4 278:55  92.58% {idle: cpu4}
   10 root        171 ki31     0K   192K CPU1    1 289:36  92.19% {idle: cpu1}
   10 root        171 ki31     0K   192K RUN     0 293:17  85.99% {idle: cpu0}
   11 root        -24    -     0K   544K WAIT    2 173:40  47.27% {swi6: task queue}
   11 root        -64    -     0K   544K WAIT    5  11:50   0.00% {irq256: mfi0}
   11 root        -32    -     0K   544K WAIT    1   6:26   0.00% {swi4: clock}

The interrupt rate for irq256: mfi0 is very high, even though there is
no disk activity.

# vmstat -i
interrupt                          total       rate
irq4: uart0                          257          0
irq9: acpi0                            1          0
irq18: ehci0 ehci1                 71739          3
cpu0: timer                     40226355       1998
irq256: mfi0                     3642472        180
irq257: bge0                       34922          1
cpu3: timer                     40229128       1998
cpu5: timer                     40228959       1998
cpu4: timer                     40229014       1998
cpu1: timer                     40228629       1998
cpu2: timer                     40223967       1998
Total                          245115443      12175
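
The rate column printed by vmstat -i appears to be an average since
boot (total divided by uptime), so on its own it cannot show whether
the interrupts are still arriving right now. A minimal sketch, assuming
Python is available and that two vmstat -i snapshots are captured a
known number of seconds apart, computes the current per-second rate
from the difference in totals. The function names and the second sample
below are hypothetical; only the first sample's numbers come from the
output above.

```python
import re


def parse_vmstat_i(text):
    """Map interrupt source name -> cumulative total from `vmstat -i` text."""
    totals = {}
    for line in text.splitlines():
        # Data rows look like "<name>  <total>  <rate>"; names may contain
        # spaces. The header row has no numeric columns, so it is skipped.
        m = re.match(r"^(.*\S)\s+(\d+)\s+(\d+)\s*$", line)
        if m:
            totals[m.group(1)] = int(m.group(2))
    return totals


def delta_rates(before, after, interval_secs):
    """Interrupts per second for each source between two snapshots."""
    a, b = parse_vmstat_i(before), parse_vmstat_i(after)
    return {name: (b[name] - a[name]) / interval_secs
            for name in b if name in a}


# First snapshot: rows taken from the vmstat -i output above.
before = """\
interrupt                          total       rate
irq256: mfi0                     3642472        180
irq257: bge0                       34922          1
"""

# HYPOTHETICAL second snapshot, 10 seconds later (not from the report).
after = """\
interrupt                          total       rate
irq256: mfi0                     3644272        180
irq257: bge0                       34932          1
"""

rates = delta_rates(before, after, 10.0)
for name, rate in sorted(rates.items(), key=lambda kv: -kv[1]):
    print(f"{name:<24} {rate:8.1f}/s")
```

On a real system the two strings would come from running vmstat -i
twice; a delta rate for mfi0 that stays high while the disks are idle
would confirm that the storm is ongoing rather than historical.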

Procstat output:

# procstat -kk 11       # PID 11 taken from output of top
  PID    TID COMM             TDNAME           KSTACK
   11 100008 intr             swi3: vm
   11 100009 intr             swi1: netisr 0   mi_switch+0x205 ithread_loop+0x1bf fork_exit+0x112 fork_trampoline+0xe
   11 100010 intr             swi4: clock      mi_switch+0x205 ithread_loop+0x1bf fork_exit+0x112 fork_trampoline+0xe
   11 100011 intr             swi4: clock      mi_switch+0x205 ithread_loop+0x1bf fork_exit+0x112 fork_trampoline+0xe
   11 100012 intr             swi4: clock      mi_switch+0x205 ithread_loop+0x1bf fork_exit+0x112 fork_trampoline+0xe
   11 100013 intr             swi4: clock      mi_switch+0x205 ithread_loop+0x1bf fork_exit+0x112 fork_trampoline+0xe
   11 100014 intr             swi4: clock      mi_switch+0x205 ithread_loop+0x1bf fork_exit+0x112 fork_trampoline+0xe
   11 100015 intr             swi4: clock      mi_switch+0x205 ithread_loop+0x1bf fork_exit+0x112 fork_trampoline+0xe
   11 100021 intr             swi5: +
   11 100023 intr             swi6: Giant task mi_switch+0x205 ithread_loop+0x1bf fork_exit+0x112 fork_trampoline+0xe
   11 100024 intr             swi6: task queue mi_switch+0x205 ithread_loop+0x1bf fork_exit+0x112 fork_trampoline+0xe
   11 100027 intr             swi2: cambio     mi_switch+0x205 ithread_loop+0x1bf fork_exit+0x112 fork_trampoline+0xe
   11 100032 intr             irq9: acpi0      mi_switch+0x205 ithread_loop+0x1bf fork_exit+0x112 fork_trampoline+0xe
   11 100033 intr             irq256: mfi0     mi_switch+0x205 ithread_loop+0x1bf fork_exit+0x112 fork_trampoline+0xe
   11 100034 intr             irq18: ehci0 ehc mi_switch+0x205 ithread_loop+0x1bf fork_exit+0x112 fork_trampoline+0xe
   11 100039 intr             swi0: uart uart  mi_switch+0x205 ithread_loop+0x1bf fork_exit+0x112 fork_trampoline+0xe
   11 100040 intr             irq1: atkbd0

# kldload dtraceall
# dtrace -n 'profile:::profile-276hz { @pc[stack()]=count(); }'
dtrace: description 'profile:::profile-276hz ' matched 1 probe

The above dtrace script samples the kernel stack at 276 Hz and is
supposed to print every stack trace seen during the sampling period,
along with a count for each.

The following stack trace occurs a large number of times:

              kernel`DELAY+0x64
              kernel`bus_dmamap_load+0x3a9
              kernel`mfi_mapcmd+0x4f
              kernel`mfi_startio+0x65
              kernel`mfi_wait_command+0x9c
              kernel`mfi_tbolt_sync_map_info+0xb4
              kernel`mfi_handle_map_sync+0x39
              kernel`taskqueue_run+0x91
              kernel`intr_event_execute_handlers+0x66
              kernel`ithread_loop+0x8e
              kernel`fork_exit+0x112
              kernel`0xffffffff8050624e

Can someone help me debug this problem? It's likely that the mfi(4)
driver I currently have access to doesn't have all the necessary
patches.

Thank you.

Regards,
Pallav

Date:      Fri, 25 Sep 2015 18:32:20 -0400
From:      Quartz <quartz@sneakertech.com>
To:        freebsd-questions@freebsd.org
Subject:   Re: ZFS ready drives WAS: zfs performance degradation
Message-ID:  <5605CB74.2020908@sneakertech.com>

> once you take
> the 5 year warranty in account.

This assumes that the company in question will actually honor the
warranty. It's been our experience that they usually don't. I can't
count the number of times a drive manufacturer has pulled a fast one
on a warranty replacement. WD is especially bad about this: they send
back a cheaper drive than the original, or a bottom-bin refurbished
drive with twice the runtime/wear of the original that dies after a
month (and mysteriously doesn't qualify for a warranty itself), or
they randomly insist we do a pay-and-reimburse replacement, then lose
the records and never reimburse us.

I understand that this is all anecdotal, but personally I don't find
warranties worth the paper they're written on, and I no longer assume
I'll get a functional replacement. Long term it's cheaper to just buy
a new drive outright than to waste employee time arguing over the
phone for days. I buy exclusively based on ratings and reliability
reports now.



