Date: Tue, 14 Feb 2012 11:52:55 -0800
From: Jeremy Chadwick <freebsd@jdc.parodius.com>
To: Oscar Prieto <oscarmpp@googlemail.com>
Message-ID: <20120214195255.GA5064@icarus.home.lan>
References: <20120214091909.GP2010@equilibrium.bsdes.net>
	<20120214100513.GA94501@icarus.home.lan>
	<20120214135435.GQ2010@equilibrium.bsdes.net>
	<20120214141601.GA98986@icarus.home.lan>
	<4F3A83DE.3000200@ambtec.de>
	<20120214165029.GA1852@icarus.home.lan>
	<4F3A971F.9040407@omnilan.de>
	<20120214192319.44ff7aff@zelda.sugioarto.com>
	<CAK9wqRpjRXtkBqL+gX5gY3foqz-O5mT-qg7Z=_t2m=Q3rZizJg@mail.gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <CAK9wqRpjRXtkBqL+gX5gY3foqz-O5mT-qg7Z=_t2m=Q3rZizJg@mail.gmail.com>
User-Agent: Mutt/1.5.21 (2010-09-15)
Cc: Harald Schmalzbauer <h.schmalzbauer@omnilan.de>, freebsd-stable@freebsd.org,
	Martin Sugioarto <martin@sugioarto.com>,
	Claudius Herder <claudius@ambtec.de>
Subject: Re: problems with AHCI on FreeBSD 8.2

On Tue, Feb 14, 2012 at 08:31:23PM +0100, Oscar Prieto wrote:
> I used to have tons of ahci errors in my 4 disk raidz1 worth of
> HD154UIs when the rig was built a year ago or so (with 8.0 Release),
> but they disappeared after tuning ZFS.
> 
> Sadly I also got a new timeout days ago, followed by smartctl errors I
> still haven't checked, but I guess they could be legit; I still have to
> test/swap cables and give it a try.

About your ada3 disk:

The below SMART errors indicate your disk does in fact have physical
media problems -- 1 confirmed bad sector, and 5 which are "suspect".
"Suspect" LBAs are unreadable until writes are issued to them.  A write
will induce the drive to re-analyse the sector at that LBA and determine
whether it's truly bad.  A single LBA can actually take quite a long
time to analyse (it depends on what the problem is), and may result in
30+ seconds of delay.  You can either let the drive figure it out over
normal usage patterns, or do it manually, time permitting.  The SMART
self-test log on the drive showing read failures gives you the LBA
numbers; try reading from those LBAs first.  I can explain this
procedure in another thread/offline/whatever.  (Does anyone read what I
write, re: don't hijack the thread?  :-) )
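
For the "try reading from those LBAs first" step, a minimal sketch using
dd(1) might look like the following.  It assumes 512-byte logical
sectors, and read_lba is a hypothetical helper name of mine, not an
existing tool.  It only reads; it does not issue the write that forces
the drive to re-analyse a sector.

```shell
#!/bin/sh
# Read one 512-byte sector at the given LBA and discard the data.
# A non-zero exit status means dd could not complete the read.
# ASSUMPTIONS: 512-byte logical sectors; "read_lba" is a made-up name.
read_lba() {
    dev="$1"   # raw disk device, e.g. /dev/ada3
    lba="$2"   # LBA taken from the SMART self-test log
    dd if="$dev" of=/dev/null bs=512 skip="$lba" count=1 2>/dev/null
}

# Example with a placeholder LBA (substitute one from your own log):
#   read_lba /dev/ada3 123456 && echo "readable" || echo "read error"
```

Expect a read of a suspect LBA to stall for a while before it returns,
per the delays described above.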

About all of your disks:

All of your disks are undergoing regular/periodic SMART short and long
tests.  Please stop this; it really, truly does no good.  You will
experience performance hits during these tests.

About timeouts:

Timeouts seen on the controller and driver level can happen in this
situation; this is universal.  This is usually what features like
Western Digital's TLER and Hitachi + Samsung's CCTL can help alleviate,
but not fully solve.  I think the ada(4) default timeout of 30 seconds
is a decent value, to be quite honest, but I'm not sure what the AHCI
driver timeout is.  mav@ would need to clue me in, or I'd need to go
look at the source.  (Right now in my life is not a good time for me to
be reviewing source code or looking at commits, sadly.  Too much on my
mind recently.)

I can discuss the TLER/CCTL stuff more at length if needed, but to be
blatantly honest, I would rather not and here's why: people begin to
rely on these features to try and circumvent actual problems with their
drives.  Phrased differently: people on the Internet become incredibly
focused on all of these timeout durations (TLER/CCTL vs. controller vs.
driver vs. storage subsystem timeouts) and try to find some bizarre
"perfect harmony" between them all.  Instead, just leave them all alone
and watch your drives for problems.

Further details which pertain to Samsung drives:

In your case, you run smartd(8), which periodically hits the drive with
SMART requests, pulling attribute data down and parsing it.  I believe
your model is fine for this, but for similar Samsung models, I must
strongly advise against this.  There are well-documented problems with
Samsung firmwares and SMART behaviour which can result in data loss (yes,
you read that right).  Please see smartmontools' Wiki page on the matter
for full details.  Just make sure you're running a fixed firmware:

http://sourceforge.net/apps/trac/smartmontools/wiki/SamsungF4EGBadBlocks

Regarding throughput of the drives being slow (30-40MBytes/sec across a
gigE link):

This sounds more like a Samba tuning problem, though ZFS raidz isn't known
for "amazing speed" per se.  Please see a post of mine from a while back
on how to tune Samba; many people followed up on it with appreciation,
stating that their throughput increased dramatically:

http://lists.freebsd.org/pipermail/freebsd-stable/2011-February/061642.html

I should follow up on that post with the following entry, because I've
since updated my own smb.conf to tune things a bit better, and included
comments justifying each choice:

#
# The below options increase throughput substantially.  Be aware
# that AIO support requires the aio.ko kernel module loaded,
# and Samba to be built with AIO enabled.  Important notes:
#
# 1) We explicitly disable sendfile(2) because it has known
# problems on ZFS, including resulting in 2x the amount of memory
# used on the machine (VM cache + ZFS cache).  For further details,
# see freebsd-fs or freebsd-stable thread, subject "8.1-STABLE:
# zfs and sendfile: problem still exists".
#
# 2) (2011/10/03) socket options SO_SNDBUF and SO_RCVBUF do not
# appear to matter on FreeBSD, or our sysctls somehow take care of
# this (or maybe AIO?).  The performance is the same with or without
# these two socket options on 8.2-STABLE.
#
# 3) (2011/10/03) My previously-mentioned "aio write behind" option
# is incorrect; see the official smb.conf(5) man page for the syntax.
# It's not a yes/no toggle, so setting it that way serves no purpose.
#
socket options = TCP_NODELAY
use sendfile = no
min receivefile size = 16384
aio read size = 16384
aio write size = 16384
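
The aio.ko requirement noted in the comments above can be handled along
these lines on 8.x.  This is a sketch only: enable_aio_at_boot is a
hypothetical helper name, and the default path assumes the stock
/boot/loader.conf location.

```shell
#!/bin/sh
# Append aio_load="YES" to a loader.conf file unless the exact line is
# already present, so repeated runs do not add duplicates.
# ASSUMPTION: "enable_aio_at_boot" is a made-up helper name.
enable_aio_at_boot() {
    conf="${1:-/boot/loader.conf}"
    grep -qxF 'aio_load="YES"' "$conf" 2>/dev/null ||
        printf '%s\n' 'aio_load="YES"' >> "$conf"
}

# As root: load the module now, then persist it across reboots.
#   kldload aio
#   enable_aio_at_boot
```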

The rest is in the thread I linked.

Hope this helps.

-- 
| Jeremy Chadwick                                 jdc@parodius.com |
| Parodius Networking                     http://www.parodius.com/ |
| UNIX Systems Administrator                 Mountain View, CA, US |
| Making life hard for others since 1977.             PGP 4BD6C0CB |