From: Matthias Andree
Date: Thu, 04 Apr 2013 10:00:18 +0200
To: Jeremy Chadwick
Cc: Alexander Motin, freebsd-current@freebsd.org, freebsd-stable@freebsd.org
Subject: Re: Any objections/comments on axing out old ATA stack?
Message-ID: <515D3312.3010109@FreeBSD.org>
In-Reply-To: <20130404010526.GA66858@icarus.home.lan>

On 04.04.2013 03:05, Jeremy Chadwick wrote:
>>> Please provide "gpart show -p ada1" output, both here and in the PR,
>>> if you could.
>>
>> =>        63  1953525105    ada1  MBR  (931G)
>>           63   209714337  ada1s1  freebsd  [active]  (100G)
>>    209714400         800          - free -  (400k)
>>    209715200    71680000  ada1s2  ntfs  (34G)
>>    281395200       15405          - free -  (7.5M)
>>    281410605   488263545  ada1s3  linux-data  (232G)
>>    769674150  1183851018          - free -  (564G)

Thanks for all the useful information provided so far (including further
down). I know some of that already, but am not going to complain because
it is very useful in the logs.

> The problem here is that I cannot guarantee you that alignment is
> the problem. The performance impact of writes to partitions which are
> non-aligned is quite high, and NCQ just exacerbates this problem. I
> would love to tell you "switch to GPT and follow Warren Block's
> document***" but if your NTFS partition is Windows and is a Windows version
> older than Windows 7 GPT is not supported.

I am happy to make that realign-and-use-GPT experiment. My Windows is
"7 Professional 64-bit". It will take me a few days because this is
spare-time stuff.

> One piece of evidence that refutes my theory is that if Windows and/or
> Linux partition are something you boot into and use often, I would
> imagine NCQ would be used in both of those environments and would suffer
> from the same issue. Although Windows tends to hide all sorts of
> transient errors from the user (sigh), Linux tends to be like FreeBSD
> with regards to such issues (on the console anyway; you wouldn't see
> such messages normally inside of X).

Now, the FreeBSD slice is the only partition on that disk that would
likely see concurrent write accesses (think "make -j8" on a quadcore
computer) which is more prone to ferret out such alignment contention.
The NTFS partition is aligned on a multi-MB boundary, so wouldn't hit
the problem anyways. The Linux partition is in ext4 format for mostly
sequential access to files usually in excess of 10 MB each.
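
The alignment argument can be checked with a little arithmetic on the
start sectors from the gpart listing above. The following is only a
sketch: it assumes 512-byte logical sectors and tests against a 4096-byte
boundary, and neither value has been confirmed for this particular drive.
The function name check_alignment is made up for this illustration.

```shell
#!/bin/sh
# Assumptions (not confirmed for this drive): 512-byte logical sectors,
# and a 4096-byte boundary as the alignment that matters.
SECTOR_BYTES=512
ALIGN_BYTES=4096

# Hypothetical helper: print "aligned" or "unaligned" for a partition
# start given in sectors.
check_alignment() {
    start=$1
    if [ $(( (start * SECTOR_BYTES) % ALIGN_BYTES )) -eq 0 ]; then
        echo aligned
    else
        echo unaligned
    fi
}

# Start sectors copied from the gpart listing in this message.
for entry in ada1s1:63 ada1s2:209715200 ada1s3:281410605; do
    name=${entry%%:*}
    start=${entry##*:}
    printf '%s (start %s): %s\n' "$name" "$start" "$(check_alignment "$start")"
done
```

Under these assumptions only ada1s2 (NTFS) starts on the boundary; ada1s1
(FreeBSD) and ada1s3 (Linux) do not. Whether that matters in practice
depends on the drive's real physical sector size.
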
Linux's ext4 jumps through several hoops to end up with bulk writes,
like extents, delayed allocations (to avoid fragmentation), reordering
of data and metadata writes, serialized log writes and all that stuff,
and it would appear I am permitting it to cache writes -- Linux uses
write barriers to enforce proper ordering of journal/meta-data writes.
It would be rather hard to hit ATA taskfile timeouts there; the expected
rate at which the drive needs to do a partial write is orders of
magnitude lower.

Can you propose any good "concurrent write" exercise tools for Unix that
I could run on the Linux ext4 partition?

> If you have the time and want to put forth the effort, I would recommend
> backing up all your data on ada1, zero the first and last 1MByte of the
> drive, and then try following Warren Block's guide. I'd just recommend
> doing this:
>
> gpart create -s gpt ada1
> gpart add -t freebsd-ufs -b 2m ada1
> newfs -U -j /dev/ada1p1
> (or remove -j if you don't want to use SUJ)

Will do.

>> - I am running with kern.cam.ada.default_timeout=5 which makes the
>> computer recover faster
>
> I can definitely imagine cases where a drive using NCQ but doing writes
> to a non-aligned partition could take longer than 5 seconds to respond
> to an ATA CDB (this is different than a SATA or AHCI layer timeout). I am
> not telling you "change this back to 30", but it might not be helping
> your situation at all given my above theory.

My feeling is that the stalls are mostly from the error handler and that
the overall time the drive is "frozen" gets shorter. If it had not
_felt_ faster, I'd not have left that in sysctl.conf in the first place.

> Finally: could you please provide output from "smartctl -x /dev/ada1"?
> I would like to rule out any possibility of your drive having some other
> kind of issue that might cause it to go catatonic. Thanks.

I have fetched the data with Linux this time (should not make a
difference as it's all drive internal data, not host OS stuff). Looks
sane to me. I'll be happy to refetch this data with a more current
smartctl version under FreeBSD if required.

>
>
> ** -- http://www.seagate.com/files/www-content/support-content/documentation/samsung/tech-specs/eco_greenf2.pdf
>
> *** -- http://www.wonkity.com/~wblock/docs/html/ssd.html
>