Date: Fri, 25 Sep 2015 23:10:58 +0000 (UTC)
From: Pallav Bose <pallav_bose@yahoo.com>
To: "freebsd-questions@freebsd.org" <freebsd-questions@freebsd.org>
Subject: Re: Interrupt storm and poor disk performance | mfi(4) driver | FreeBSD 8 | Dell PERC H730
Message-ID: <488290978.924887.1443222658131.JavaMail.yahoo@mail.yahoo.com>
In-Reply-To: <472489221.917644.1443216922050.JavaMail.yahoo@mail.yahoo.com>
References: <472489221.917644.1443216922050.JavaMail.yahoo@mail.yahoo.com>

To add to my earlier email, the mfi(4) driver in my source tree was last synced with the FreeBSD 8 stable tree up to
https://svnweb.freebsd.org/base?view=revision&revision=250497

I integrated these three patches to get the mfi(4) driver to attach to the H730 controller:
https://svnweb.freebsd.org/base?view=revision&revision=252471
https://svnweb.freebsd.org/base?view=revision&revision=256924
https://svnweb.freebsd.org/base?view=revision&revision=261535

I can undo my patch work, but then how should I proceed from here to resolve the interrupt storm problem?

Thanks,
Pallav

On Friday, September 25, 2015 2:35 PM, Pallav Bose <pallav_bose@yahoo.com> wrote:

Hello,

I have a Dell PowerEdge R430 server with a PERC H730 RAID controller. I'm trying to get FreeBSD 8 to install and run on this server. At this time, I have a patched version of the mfi(4) driver which attaches to the controller. I'm aware of mrsas(4), but since I have scripts that use mfiutil(8), I'd like to continue using the mfi(4) driver.

A simple dd test shows SSD performance to be very poor:

# dd if=/dev/mfid0 of=/dev/null bs=1m count=1024
1024+0 records in
1024+0 records out
1073741824 bytes transferred in 27.978784 secs (38377001 bytes/sec)

top -PHS shows a lot of CPU time being used by the swi6 s/w interrupt handler:

last pid: 81270;  load averages:  0.01,  0.05,  0.05    up 0+05:34:20  15:45:51
302 processes: 7 running, 278 sleeping, 17 waiting
CPU 0:  0.0% user,  0.0% nice,  0.0% system, 52.6% interrupt, 47.4% idle
CPU 1:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 2:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 3:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 4:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 5:  0.0% user,  0.0% nice,  0.0% system,  0.7% interrupt, 99.3% idle
Mem: 48M Active, 4044K Inact, 997M Wired, 7144K Cache, 1248K Buf, 30G Free
Swap:

  PID USERNAME PRI NICE   SIZE    RES STATE   C   TIME    WCPU COMMAND
   10 root     171 ki31     0K   192K CPU5    5 319:51 100.00% {idle: cpu5}
   10 root     171 ki31     0K   192K CPU2    2 293:32  94.58% {idle: cpu2}
   10 root     171 ki31     0K   192K CPU3    3 298:46  93.65% {idle: cpu3}
   10 root     171 ki31     0K   192K CPU4    4 278:55  92.58% {idle: cpu4}
   10 root     171 ki31     0K   192K CPU1    1 289:36  92.19% {idle: cpu1}
   10 root     171 ki31     0K   192K RUN     0 293:17  85.99% {idle: cpu0}
   11 root     -24    -     0K   544K WAIT    2 173:40  47.27% {swi6: task queue}
   11 root     -64    -     0K   544K WAIT    5  11:50   0.00% {irq256: mfi0}
   11 root     -32    -     0K   544K WAIT    1   6:26   0.00% {swi4: clock}

The interrupt rate in case of irq256: mfi0 is very high, in spite of there being no disk activity.

# vmstat -i
interrupt                          total       rate
irq4: uart0                          257          0
irq9: acpi0                            1          0
irq18: ehci0 ehci1                 71739          3
cpu0: timer                     40226355       1998
irq256: mfi0                     3642472        180
irq257: bge0                       34922          1
cpu3: timer                     40229128       1998
cpu5: timer                     40228959       1998
cpu4: timer                     40229014       1998
cpu1: timer                     40228629       1998
cpu2: timer                     40223967       1998
Total                          245115443      12175

Procstat output:

# procstat -kk 11 # PID 11 taken from output of top
  PID    TID COMM             TDNAME           KSTACK
   11 100008 intr             swi3: vm
   11 100009 intr             swi1: netisr 0   mi_switch+0x205 ithread_loop+0x1bf fork_exit+0x112 fork_trampoline+0xe
   11 100010 intr             swi4: clock      mi_switch+0x205 ithread_loop+0x1bf fork_exit+0x112 fork_trampoline+0xe
   11 100011 intr             swi4: clock      mi_switch+0x205 ithread_loop+0x1bf fork_exit+0x112 fork_trampoline+0xe
   11 100012 intr             swi4: clock      mi_switch+0x205 ithread_loop+0x1bf fork_exit+0x112 fork_trampoline+0xe
   11 100013 intr             swi4: clock      mi_switch+0x205 ithread_loop+0x1bf fork_exit+0x112 fork_trampoline+0xe
   11 100014 intr             swi4: clock      mi_switch+0x205 ithread_loop+0x1bf fork_exit+0x112 fork_trampoline+0xe
   11 100015 intr             swi4: clock      mi_switch+0x205 ithread_loop+0x1bf fork_exit+0x112 fork_trampoline+0xe
   11 100021 intr             swi5: +
   11 100023 intr             swi6: Giant task mi_switch+0x205 ithread_loop+0x1bf fork_exit+0x112 fork_trampoline+0xe
   11 100024 intr             swi6: task queue mi_switch+0x205 ithread_loop+0x1bf fork_exit+0x112 fork_trampoline+0xe
   11 100027 intr             swi2: cambio     mi_switch+0x205 ithread_loop+0x1bf fork_exit+0x112 fork_trampoline+0xe
   11 100032 intr             irq9: acpi0      mi_switch+0x205 ithread_loop+0x1bf fork_exit+0x112 fork_trampoline+0xe
   11 100033 intr             irq256: mfi0     mi_switch+0x205 ithread_loop+0x1bf fork_exit+0x112 fork_trampoline+0xe
   11 100034 intr             irq18: ehci0 ehc mi_switch+0x205 ithread_loop+0x1bf fork_exit+0x112 fork_trampoline+0xe
   11 100039 intr             swi0: uart uart  mi_switch+0x205 ithread_loop+0x1bf fork_exit+0x112 fork_trampoline+0xe
   11 100040 intr             irq1: atkbd0

# kldload dtraceall
# dtrace -n 'profile:::profile-276hz { @pc[stack()]=count(); }'
dtrace: description 'profile:::profile-276hz ' matched 1 probe

The above dtrace script is supposed to print all the stack traces seen during the sampling period. The following stack trace occurs a large number of times:

kernel`DELAY+0x64
kernel`bus_dmamap_load+0x3a9
kernel`mfi_mapcmd+0x4f
kernel`mfi_startio+0x65
kernel`mfi_wait_command+0x9c
kernel`mfi_tbolt_sync_map_info+0xb4
kernel`mfi_handle_map_sync+0x39
kernel`taskqueue_run+0x91
kernel`intr_event_execute_handlers+0x66
kernel`ithread_loop+0x8e
kernel`fork_exit+0x112
kernel`0xffffffff8050624e

Can someone help me debug this problem? It's likely that the mfi(4) driver I currently have access to doesn't have all the necessary patches.

Thank you.

Regards,
Pallav

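As a rough cross-check of the figures quoted in the message above, here is a minimal sh/awk sketch. It assumes the dd(1) summary and the `vmstat -i` layout shown in the message (the last two columns being total and rate). The helper names `dd_mib_per_sec` and `top_irqs` are hypothetical and are not part of any FreeBSD tool.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of any FreeBSD tool.

# Convert a dd(1) summary (bytes transferred, elapsed seconds) to MiB/s.
dd_mib_per_sec() {
    awk -v bytes="$1" -v secs="$2" \
        'BEGIN { printf "%.1f\n", bytes / secs / 1048576 }'
}

# Read `vmstat -i` text on stdin and print "rate name" for each device
# interrupt (lines starting with "irq"), highest rate first.
# Assumes the last two fields of every line are the total and the rate.
top_irqs() {
    awk '/^irq/ && NF >= 3 {
             name = $0
             # Drop the trailing total and rate columns, keeping the name.
             sub(/[ \t]+[0-9]+[ \t]+[0-9]+[ \t]*$/, "", name)
             printf "%s %s\n", $NF, name
         }' | sort -k1,1nr
}

# Figures copied from the message above.
dd_mib_per_sec 1073741824 27.978784

top_irqs <<'EOF'
interrupt                          total       rate
irq4: uart0                          257          0
irq9: acpi0                            1          0
irq18: ehci0 ehci1                 71739          3
cpu0: timer                     40226355       1998
irq256: mfi0                     3642472        180
irq257: bge0                       34922          1
Total                          245115443      12175
EOF
```

With the numbers from the message, the dd run works out to roughly 36.6 MiB/s, and irq256: mfi0 ranks first among the device interrupts at 180 per second. The per-CPU timer lines are left out of the ranking on purpose, since they are not device interrupts.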
From: PK1048 <info@pk1048.com>
Date: Fri, 25 Sep 2015 23:50:44 -0400
To: Quartz <quartz@sneakertech.com>
Cc: "freebsd-questions@freebsd.org" <freebsd-questions@freebsd.org>
Subject: Re: ZFS ready drives WAS: zfs performance degradation
Message-Id: <CAA8EA43-00E4-4948-969B-432BE51A73DE@pk1048.com>
In-Reply-To: <5605CB74.2020908@sneakertech.com>

Are you buying OEM drives? I have had no issues with either Seagate or HGST in terms of warranty work, but I _never_ buy an OEM drive; I know those include no warranty.

I have yet to need to warranty a WD drive, so I don't know what they're like.

Sent from my portable device

On Sep 25, 2015, at 18:32, Quartz <quartz@sneakertech.com> wrote:

>> once you take
>> the 5 year warranty in account.
>
> This assumes that the company in question will honorably honor the warranty. It's been our experience that they usually don't. I can't count the number of times a drive manufacturer has pulled a fast one on a warranty replacement. WD is especially bad about this: they send back a cheaper drive than the original, or a bottom-bin refurbished drive with twice the runtime/wear of the original that dies after a month (which mysteriously doesn't qualify for a warranty itself), or they randomly insist we have to do a pay-and-reimburse replacement, then lose the records and never reimburse us.
>
> I understand that this is all anecdotal, but personally I don't find warranties worth the paper they're written on, and I never assume I'll get a functional replacement anymore. Long term it's cheaper to just buy a new drive outright than to waste employee time arguing over the phone for days. I buy exclusively based on ratings and reliability reports now.
