From: Jeremy Chadwick
To: Tom Evans
Cc: freebsd-stable@freebsd.org
Date: Mon, 25 Feb 2013 10:47:45 -0800
Subject: Re: RFC: Suggesting ZFS "best practices" in FreeBSD
Message-ID: <20130225184745.GA36717@icarus.home.lan>
References: <20130124174039.GA35811@icarus.home.lan>

On Mon, Feb 25, 2013 at 02:31:26PM +0000, Tom Evans wrote:
> On Thu, Jan 24, 2013 at 5:40 PM, Jeremy Chadwick wrote:
> >>> #1. Map the physical drive slots to how they show up in FBSD so if a
> >>> disk is removed and the machine is rebooted all the disks after that
> >>> removed one do not have an 'off by one error'. i.e. if you have
> >>> ada0-ada14 and remove ada8 then reboot - normally FBSD skips that
> >>> missing ada8 drive and the next drive (that used to be ada9) is now
> >>> called ada8 and so on...
> >>
> >>How do you do that? If I'm in that situation, I think I could find the
> >>bad drive, or at least the good ones, with diskinfo and the drive serial
> >>number. One suggestion I saw somewhere was to use disk serial numbers
> >>for label values.
> >
> > The term FreeBSD uses for this is called "wiring down" or "wired down",
> > and is documented in CAM(4). It's come up repeatedly over the years but
> > for whatever reason people overlook it or can't find it. Here's how you
> > do it for Intel AHCI (and probably others like AMD), taken directly from
> > my /boot/loader.conf --
> >
> > # "Wire down" device names (ada[0-5]) to each individual port
> > # on the SATA/AHCI controller. This ensures that if we reboot
> > # with a disk missing, the device names stay the same, and stay
> > # attached to the same SATA/AHCI controller.
> > # http://lists.freebsd.org/pipermail/freebsd-fs/2011-March/011036.html
> > #
> > hint.scbus.0.at="ahcich0"
> > hint.scbus.1.at="ahcich1"
> > hint.scbus.2.at="ahcich2"
> > hint.scbus.3.at="ahcich3"
> > hint.scbus.4.at="ahcich4"
> > hint.scbus.5.at="ahcich5"
> > hint.ada.0.at="scbus0"
> > hint.ada.1.at="scbus1"
> > hint.ada.2.at="scbus2"
> > hint.ada.3.at="scbus3"
> > hint.ada.4.at="scbus4"
> > hint.ada.5.at="scbus5"
> >
> > IMPORTANT: The device names/busses/etc. are going to vary depending on
> > each person's setup, controller, etc.. Proof of this is in a post from
> > Randy Bush, where I helped him off-list with this task and he figured
> > out the remaining bits by himself for his hptrr(4) controller:
> >
> > http://lists.freebsd.org/pipermail/freebsd-fs/2012-June/014522.html
>
> This wiring down is not sufficient to address all the problems with
> drive renumbering. For instance, add an additional ahci controller,
> and potentially all your drive names change again. Or not, depending
> on how the devices are enumerated.

Go read CAM(4). You can solve this dilemma using the wire-down method as
well. I don't know how many times I have to say this. You really can
solve the problem with the adaX, daX, or scbusX numbers changing if you
add a 2nd controller. Really.

There was a situation with AHCI ports on some systems where an
unpopulated port wasn't even advertised as an available channel per
AHCI. mav@ believes (as do I) this generally violates AHCI spec, but
this is the sort of thing vendors do. Entire post contains the details
(including quoted parts):

http://lists.freebsd.org/pipermail/freebsd-current/2011-November/029374.html

Fixes for that were committed some time ago: head in November 2011
(r227635) and subsequently MFC'd in January 2012 (r229289).

> This is not the only problem. Take all the disks out, put them all
> back in, do they all have the same names? Unlikely, since their name
> is now derived from the controller and port they are plugged in to.
> Any changes, and the device name changes.

By "name" I assume you mean "device name", e.g. ada0. If so, then the
answer is a huge big gigantic fat YES. That is the *entire point* of
the wire-down method, and what people have historically complained about
(re: solving the issue of adaX or daX devices, or scbusX buses, being
reordered/renumbered).

If by "take all your disks out and put them all back in" you mean put
them all back in to *different physical ports* (i.e. they are now
attached to different physical SATA ports) than they were before, then
yes, the device name will change (e.g. from ada0 to ada4, assuming you
have a 1:1 "wire down" mapping like the above).

My advice here is the same as another user's: label what physical SATA
port on your system/chassis correlates with what. For systems with
hot-swap bays/enclosures, most come with stickers that label each
enclosure (starting at either 0 or 1 -- intelligent vendors give you
stickers starting at 0).

I have yet to encounter a system administrator who thinks a hard disk
port should just be treated in an arbitrary "who cares, it's just a
port" way. There's always a first. :-)

> Using GPT labels is easy to do, and provides a cast iron guarantee
> that your disk will not EVER be mistaken for a different drive.

Using GPT labels is fine by me, as it's the only labelling method that
doesn't have atrocious side effects. But let me introduce you to a
fellow who can't use GPT at all, due to his vendor making incorrect
assumptions about GPT being UEFI-only, so he's stuck with MBR:

http://lists.freebsd.org/pipermail/freebsd-stable/2013-February/072188.html
http://lists.freebsd.org/pipermail/freebsd-stable/2013-February/072206.html

I feel sorry for this guy if he ever has to deal with a 4096-byte sector
disk ("Yeah, so, you get to create your first slice manually, with an
offset of a wild/crazy value, because PC architecture is awesome" --
thankfully Warren Block wrote a wonderful page explaining how to do that
too).
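
The "wild/crazy" offset for a 4096-byte sector disk comes down to
rounding the slice's starting sector up to a 4 KiB boundary. A minimal
sketch of that arithmetic, assuming the disk reports 512-byte logical
sectors; align_up is a hypothetical helper written for illustration and
is not part of gpart(8) or any other FreeBSD tool:

```shell
#!/bin/sh
# Round a starting sector up to the next multiple of an alignment.
# Assumption: both values are counted in 512-byte logical sectors,
# so 8 sectors = 4096 bytes and 2048 sectors = 1 MiB.
align_up() {
    start=$1
    align=$2
    echo $(( (start + align - 1) / align * align ))
}

align_up 63 8       # traditional MBR start of 63 -> prints 64
align_up 63 2048    # 1 MiB alignment             -> prints 2048
```

Either result could then be used as the starting sector of the first
slice, for example via the -b option of "gpart add". 1 MiB is a common
choice because it is also a multiple of any larger physical block size
likely to be encountered.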

Now back to labels, which I've lectured about in past posts:

The labelling situation on FreeBSD is a nightmare given documentation
that is unclear or confusing. Examples include gpart(8)'s "add -l"
flag, which mentions supporting "labels attached a partition" except
never tells you which partition types support labels, and glabel(8),
which still to this day cannot figure out if it is just an abstraction
layer for things like "tunefs -L" (for UFS/UFS2), or if it quite
literally has its own "GEOM label" -- the man page implies both.

This drives a lot of people away from the whole mechanism. While I
appreciate all the very, very hard work pjd@ has put into GEOM and the
underlying bits, this situation has existed for a long time. (Yes, go
right ahead and tell me "you have the source to the man pages, fix it
yourself + file a PR").

And don't get me started on the GPT vs. gmirror/gstripe/graid3 and
last-LBA chicken-and-egg problem.

> I put a GPT label on the drive, and then write it in permanent marker
> on the top of the drive and on a sticky label that is stuck on the
> front of the chassis. The disk label never changes in its lifetime, so
> you only get issues if you insert a drive without labelling it first.

And with CAM(4) wire-down, you could do the exact same except instead of
sticking it on the drive, you stick it next to the bay or port, or even
on the SATA cable, to say "ada0", "ada1", etc.

Both the wire-down method and your GPT label preference still require
one to manually keep track, somehow, of what something physical (whether
it be a disk or a port) maps to digitally.

Which circles back to what the OP said at the bottom of his thread:

"So what gives? Why can't we have something like /dev/hdsn/ (hdsn ==
Hard Drive Serial Number) where a set of device numbers would
automagically ......."

...which I explain the caveats of quite clearly in one of the URLs which
you snipped from my mail, specifically the discussion with Warren Block.
Those posts:

http://lists.freebsd.org/pipermail/freebsd-stable/2013-January/071900.html
http://lists.freebsd.org/pipermail/freebsd-stable/2013-January/071901.html
http://lists.freebsd.org/pipermail/freebsd-stable/2013-January/071904.html

The proposed serial/WWN idea sounds great on paper, but in reality
creates annoying edge cases that can't be solved.

-- 
| Jeremy Chadwick                                     jdc@koitsu.org |
| UNIX Systems Administrator                  http://jdc.koitsu.org/ |
| Mountain View, CA, US                                              |
| Making life hard for others since 1977.               PGP 4BD6C0CB |