Date:      Fri, 22 May 2015 16:05:03 -0500
From:      Thomas Johnson <tommyj27@gmail.com>
To:        Karli Sjöberg <karli.sjoberg@slu.se>
Cc:        "freebsd-fs@freebsd.org" <freebsd-fs@freebsd.org>
Subject:   Re: zpool on Dell MD3000 causes frequent hangs
Message-ID:  <CAMwYC7aeyXo%2BkONzuCe7n7xjADg2tkFb=M7oWYirWUxEJo=u7A@mail.gmail.com>
In-Reply-To: <be6b7186-0709-4190-bcda-66c7b4c68872@email.android.com>
References:  <be6b7186-0709-4190-bcda-66c7b4c68872@email.android.com>

That looks like a match. I'll turn the "abuse" knob up to 11 and see if I can
break it.

Thanks!

On Fri, May 22, 2015 at 3:27 PM, Karli Sjöberg <karli.sjoberg@slu.se> wrote:

>
> On 22 May 2015 at 9:10 PM, Thomas Johnson <tommyj27@gmail.com> wrote:
> >
> > Hello,
> >
> > I am trying to track down an ongoing issue that I've been having, and
> > I'm looking for suggestions on a possible cause or on how I might
> > troubleshoot further.
> >
> > The issue seems to be related to a Dell MD3000 storage array, which
> > contains a zpool. It seems that the host attached to the array will
> > occasionally hang, usually during periods of high disk activity
> > (annoyingly, usually about 0300).
> >
> > When the system hangs, I can ping the host, and switch between virtual
> > consoles (but not interact with them). The system is otherwise
> > unresponsive, with no errors reported on the console or in the logs. The only
> > remedy I have found is to hard-reset the host.
> >
> > I believe this issue is tied to the MD3000. I have tried swapping out SAS
> > cables, HBAs, the controller on the MD3000, and the host itself. I have
> > updated all the firmware I can find. Before I upgraded the host OS to
> > FreeBSD 10.1 (from 10.0) last month, I experienced hangs about once a
> > month. Since the upgrade, I have seen several events per week.
>
> My bet is on this:
> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=197164
>
> /K
>
> >
> > In addition to the MD3000, I have a set of USB drives that are used in a
> > rotation as offsite backups for the zpool. I have seen a number of hang
> > events during zfs send/receive transfers to the USB disk.
> >
> > After the most recent hang, I removed two [consumer] SSDs from the pool
> > that were being used as cache devices. It is too early to tell if this
> > change had any impact.
> >
> > Here is some of the pertinent output from the host. I can provide any other
> > information that would be helpful.
> >
> > root@leopard:/home/tom-> uname -a
> > FreeBSD leopard 10.1-RELEASE-p9 FreeBSD 10.1-RELEASE-p9 #0 r281232: Tue Apr  7 17:38:04 CDT 2015 root@cheshire-b:/pkg/base/obj_10.1-RELEASE-p9/pkg/base/src_10.1-RELEASE-p9/sys/GENERIC amd64
> > root@leopard:/home/tom-> zpool list
> > NAME          SIZE  ALLOC   FREE   FRAG  EXPANDSZ    CAP  DEDUP  HEALTH  ALTROOT
> > backup       5.31T  3.61T  1.70T    22%         -    68%  1.00x  ONLINE  -
> > jumpdrive_f  2.72T  2.04T   693G    30%         -    75%  1.00x  ONLINE  -
> > root@leopard:/home/tom-> zpool status backup
> >   pool: backup
> >  state: ONLINE
> >   scan: scrub repaired 0 in 13h15m with 0 errors on Wed May 13 16:17:29 2015
> > config:
> >
> >     NAME        STATE     READ WRITE CKSUM
> >     backup      ONLINE       0     0     0
> >       da0       ONLINE       0     0     0
> >
> > errors: No known data errors
> > root@leopard:/home/tom-> zpool get all backup
> > NAME    PROPERTY                       VALUE                          SOURCE
> > backup  size                           5.31T                          -
> > backup  capacity                       68%                            -
> > backup  altroot                        -                              default
> > backup  health                         ONLINE                         -
> > backup  guid                           12638712474922952450           default
> > backup  version                        -                              default
> > backup  bootfs                         -                              default
> > backup  delegation                     on                             default
> > backup  autoreplace                    off                            default
> > backup  cachefile                      -                              default
> > backup  failmode                       wait                           default
> > backup  listsnapshots                  off                            default
> > backup  autoexpand                     off                            default
> > backup  dedupditto                     0                              default
> > backup  dedupratio                     1.00x                          -
> > backup  free                           1.70T                          -
> > backup  allocated                      3.61T                          -
> > backup  readonly                       off                            -
> > backup  comment                        -                              default
> > backup  expandsize                     0                              -
> > backup  freeing                        0                              default
> > backup  fragmentation                  22%                            -
> > backup  leaked                         0                              default
> > backup  feature@async_destroy          enabled                        local
> > backup  feature@empty_bpobj            active                         local
> > backup  feature@lz4_compress           active                         local
> > backup  feature@multi_vdev_crash_dump  enabled                        local
> > backup  feature@spacemap_histogram     active                         local
> > backup  feature@enabled_txg            active                         local
> > backup  feature@hole_birth             active                         local
> > backup  feature@extensible_dataset     enabled                        local
> > backup  feature@embedded_data          active                         local
> > backup  feature@bookmarks              enabled                        local
> > backup  feature@filesystem_limits      enabled                        local
> > _______________________________________________
> > freebsd-fs@freebsd.org mailing list
> > http://lists.freebsd.org/mailman/listinfo/freebsd-fs
> > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"
>
>


