Date:      Tue, 20 Apr 2010 14:29:13 +0200
From:      Attilio Rao <attilio@freebsd.org>
To:        David Ehrmann <ehrmann@gmail.com>
Cc:        freebsd-current@freebsd.org
Subject:   Re: Strange disk problem
Message-ID:  <t2o3bbf2fe11004200529h45cf209et731a194794aae7e1@mail.gmail.com>
In-Reply-To: <4BCD5049.8030408@gmail.com>
References:  <4BCD5049.8030408@gmail.com>

2010/4/20 David Ehrmann <ehrmann@gmail.com>:
> Initially, I noticed a problem where reading a file on this machine seemed
> to stop--something like a video would just stop playing.  At first, I
> thought it was the machine, but a new motherboard, CPU, and RAM later, the
> problem persists.  The network card uses a different chipset, too.
>
> The files are on zfs, but scrubs are fine, and zpool status lists no errors
> of any kind.  Trying to reproduce the problem, I set up a script that
> reads a random 1M block every 60 seconds off the drive backing zfs.
> That's when I noticed something: one disk seems to be causing the problems.
> I logged the dd times, and some of them were huge--more than a minute.  The
> times on the other disk in the mirrored vdev were low.
>
> I've only seen the problem when I have a vm's disk image hosted on the
> machine.  That said, the network interface is configured at 100 Mbps, so
> there's no reason for that to saturate the disk's throughput.  Top reports
> that almost 20% of the CPU is going towards interrupts.  I can read a file
> off the zfs pool at over 50MB/s, so that shouldn't be a problem.  One thing
> I'm wondering is why the disk read doesn't time out quickly?  At least that
> way zfs could try to use the other drive in the mirrored vdev.
>
> Any ideas?  One thing I should try is switching the drives to see if the
> problem follows the disk or stays with the lowest /dev/adX device.  I'm
> using geli, but the read problems happen with both /dev/adX AND
> /dev/adX.eli, so I don't think that's it.  I've seen the problem with
> Samba, NFS, and dd.
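
For anyone wanting to reproduce the measurement described in the quoted message (one random 1M read per minute, with the elapsed time logged), a rough sketch follows. It is not David's actual script, which used dd and was not posted. The function names, the one-second "slow" threshold, and the log format are assumptions made for illustration, and the device size has to be supplied by the caller.

```python
import os
import random
import time

BLOCK_SIZE = 1024 * 1024  # 1 MiB, matching the "1M block" in the report


def timed_random_read(path, size, block_size=BLOCK_SIZE, rng=random):
    """Read one block at a random block-aligned offset.

    Returns (offset, seconds). `size` is the device or file size in bytes
    and must be supplied by the caller (hypothetical helper, not from the
    original thread).
    """
    blocks = max(size // block_size, 1)
    offset = rng.randrange(blocks) * block_size
    fd = os.open(path, os.O_RDONLY)
    try:
        start = time.monotonic()
        os.lseek(fd, offset, os.SEEK_SET)
        os.read(fd, block_size)
        elapsed = time.monotonic() - start
    finally:
        os.close(fd)
    return offset, elapsed


def probe(path, size, count, interval=60.0, slow=1.0):
    """Take `count` samples, `interval` seconds apart, logging each one.

    Reads that take `slow` seconds or longer are flagged; the threshold
    is an arbitrary assumption.
    """
    samples = []
    for i in range(count):
        offset, elapsed = timed_random_read(path, size)
        flag = " SLOW" if elapsed >= slow else ""
        stamp = time.strftime("%Y-%m-%d %H:%M:%S")
        print(f"{stamp} {path} offset={offset} {elapsed:.3f}s{flag}")
        samples.append((offset, elapsed))
        if i + 1 < count:
            time.sleep(interval)
    return samples
```

Running probe() against each disk of the mirror in turn, with the same interval, gives two logs whose timings can be compared side by side, which is how the report singles out one disk.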

David,
would you be willing to re-create the problem and run a PMC
analysis on it?
(If you need any guidance, let me know; I will be happy to give it.)

Attilio


-- 
Peace can only be achieved by understanding - A. Einstein


