From owner-freebsd-fs@FreeBSD.ORG Wed Jan 23 03:51:25 2013
From: Michael DeMan <freebsd@deman.com>
To: Jason Keltz <jas@cse.yorku.ca>
Cc: FreeBSD Filesystems <freebsd-fs@freebsd.org>
Date: Tue, 22 Jan 2013 19:51:22 -0800
Subject: Re: RFC: Suggesting ZFS "best practices" in FreeBSD
Message-Id: <8DC70418-7CF9-4839-BDC6-1A1AF5354307@deman.com>
In-Reply-To: <50FF4D87.5080404@cse.yorku.ca>
Inline below...

On Jan 22, 2013, at 6:40 PM, Jason Keltz <jas@cse.yorku.ca> wrote:

<SNIP>
>>> #1. Map the physical drive slots to how they show up in FBSD, so that if a disk is removed and the machine is rebooted, all the disks after the removed one do not have an 'off by one' error. i.e. if you have ada0-ada14 and remove ada8, then reboot - normally FBSD skips the missing ada8 drive, and the next drive (which used to be ada9) is now called ada8, and so on...
>>
>> How do you do that? If I'm in that situation, I think I could find the bad drive, or at least the good ones, with diskinfo and the drive serial number. One suggestion I saw somewhere was to use disk serial numbers for label values.
> I think that was using /boot/device.hints. Unfortunately it only works for some systems, not all... and someone shared an experience with me where a kernel update caused the card probe order to change, the devices to change, and then it all broke... It worked for one card, not for the other... I gave up because I wanted consistency across different systems..

I am not sure, but I may have hit that same PCI-probing issue with our ZFS test machine - basically I vaguely recall asking to have the SATA controllers' slots swapped, without completely knowing why it needed to be done other than that it did need to be done.
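The serial-number labelling suggested above can be scripted with diskinfo(8) and glabel(8). A rough sketch, assuming whole disks ada0-ada14 that are not yet part of a pool (the "serial-" label prefix is my own invention, not a convention):

```shell
#!/bin/sh
# Label every disk with its serial number so that kernel renumbering
# after a pulled drive no longer matters. glabel stores the label in
# the provider's last sector; build the pool from /dev/label/* names.
for disk in ada0 ada1 ada2 ada3 ada4 ada5 ada6 ada7 \
            ada8 ada9 ada10 ada11 ada12 ada13 ada14; do
    [ -c "/dev/${disk}" ] || continue
    serial=$(diskinfo -s "/dev/${disk}")   # -s prints the disk ident (serial)
    glabel label "serial-${serial}" "/dev/${disk}"
done
# Afterwards the drives show up as /dev/label/serial-<SERIAL>, and those
# names follow the physical disk no matter which adaN it probes as.
```

The same idea works with GPT labels (`gpart add ... -l`) if the disks are being partitioned anyway; either way, the pool should be created against the label device, not the raw adaN node.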
It could have been from an upgrade from FBSD 7.x -> 8.x -> 9.x, or it could just have been because it's a test box - other things were going on with it for a while, and the cards had been put back in out of order after doing some other work.

This is actually kind of an interesting problem overall - logical vs. physical, and how to keep things mapped in a way that makes sense. The Linux community has run into this too, and substantially changed (from a basic end-user perspective) the way they deal with hardware MAC addresses and Ethernet cards between RHEL5 and RHEL6. Ultimately neither of their techniques works very well. The FreeBSD community should probably pick one strategy, standardize on it warts and all, and document it.

> In my own opinion, the whole process of partitioning drives, labelling them, all the tricks for dealing with 4k drives, manually configuring /boot/device.hints, etc. is something that we have to do, but honestly, I really believe there *has* to be a better way....

I agree. At this point the only solution I can think of for using ZFS on FreeBSD in production is to write scripts that do all of this - the goofy gpart + gnop + everything else. How is anybody supposed to replace a disk in an emergency when they have to run a bunch of cryptic command-line steps on it before they can even confidently put it in as a replacement for the original? And almost by definition, a pile of manual command-line steps means you can never be reliably confident.

> Years back when I was using a 3ware/AMCC RAID card (actually, I AM still using a few), none of this was an issue... every disk just appeared in order.. I didn't have to configure anything specially.. ordering never changed when I removed a drive, I didn't need to partition or do anything with the disks - just give it the raw disks, and it knew what to do...
> If anything, I took my labeller and labelled the disk bays with a numeric label, so when I got an error I knew which disk to pull - the order never changed, and I always pulled the right drive... Now, I look at my pricey "new" system, see disks ordered by default in what seems like an almost "random" order... I dd'ed each drive to figure out the exact ordering and labelled the disks, but it just gets really annoying....

A lot of these things - like sparing a little extra space on each drive when an array is first built, so that a replacement drive with slightly smaller capacity still fits - the RAID vendors have hidden away from the end user, and in many cases only in the last 10 years or so. A few weeks ago I also stumbled across a Sun ZFS user who had received Sun-certified disks with exactly that issue - a few sectors too small...

Overall you are describing exactly the kind of behavior I want, and I think everybody needs, from a FreeBSD+ZFS system:

- Alarm sent out - drive #52 failed - wake up and deal with it.
- Go to the server (or call the data center) and groggily look at the labels on the front of the disk caddies.
- Physically pull drive #52.
- Insert a new, similarly sized drive from inventory as the new #52.
- Verify the resilver is in progress.
- Confidently go back to bed knowing all is okay.

The above scenario is just unworkable right now for most people (even tech-savvy people) because of the lack of documentation - hence I am glad to see some kind of 'best practices' document put together.

<SNIP>

- Mike
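P.S. For anyone curious what the "goofy gpart + gnop" dance actually looks like, here is a rough sketch for a 4k-sector drive on 9.x. The pool name, GPT label, and partition size are made up for illustration - the point is the shape of the procedure, not the exact numbers:

```shell
#!/bin/sh
# Illustrative only: prepare a 3TB drive (ada0) for a 4k-aware pool.

# Partition 4k-aligned and deliberately smaller than the raw disk, so a
# future replacement that is a few sectors smaller still fits.
gpart create -s gpt ada0
gpart add -t freebsd-zfs -a 4k -s 2900G -l disk00 ada0

# gnop exposes a copy of the provider that reports 4096-byte sectors,
# so zpool create chooses ashift=12 even if the drive claims 512b.
gnop create -S 4096 /dev/gpt/disk00

# Create the pool on the .nop shim, then drop the shim; the pool
# re-imports via the plain GPT label and keeps ashift=12.
zpool create tank /dev/gpt/disk00.nop
zpool export tank
gnop destroy /dev/gpt/disk00.nop
zpool import -d /dev/gpt tank

# Replacing a failed member later is then just:
#   zpool replace tank gpt/diskNN gpt/diskMM
zpool status tank
```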