Date: Wed, 14 Dec 2011 23:24:18 -0600
From: Rob <lists@midsummerdream.org>
To: freebsd-questions@freebsd.org
Subject: Re: AHCI driver and static device names
Message-ID: <4EE98482.1030509@midsummerdream.org>
In-Reply-To: <4EE955E7.4010708@cyberleo.net>
References: <4ED98E9F.9010401@midsummerdream.org> <CAN3mi_2u%2BHwFf3m%2BxvsNncfNpj_rFp94xjAv%2Bf0eFT7c4a%2B8Tg@mail.gmail.com> <4EDA489B.9060503@midsummerdream.org> <4EDA56A3.6090108@cyberleo.net> <4EE912BC.502@midsummerdream.org> <4EE955E7.4010708@cyberleo.net>
On 12/14/11 8:05 PM, CyberLeo Kitsana wrote:
>>>> The other option seems to be to use tunefs or a partitioning tool to
>>>> label each partition, which is even more ugly imo.
>>>
>>> Ugly how? Labels appear a lot more semantically elegant than the opaque
>>> 'ada4s1a' moniker.
>>
>> Ugly in that the driver has created a situation where we need
>> workarounds to perform the tasks we need. *nix systems have always
>> relied upon static device nodes, and using dynamic names without
>> updating the relating tools/methods is ugly. The workarounds also could
>> fail if someone forgets to perform them (specifically labels), since
>> it's not necessary on just about any other *nix system. It's perfectly
>> within reason to assume people will forget to apply a label when
>> replacing a disk.
>
> Anything fails if you forget to do it. Administrative failure should not
> be confused with technical failure.

When you're changing a paradigm that has been known to administrators for decades, it's unreasonable not to expect a decent degree of failure. Especially when the reason for the technical change isn't clear and the new method isn't at all like the old (i.e. no disk is guaranteed to get the same id).

> Static device nodes are appropriate when the topology is fixed and can
> be reasonably anticipated. With variable topologies, such as USB, iSCSI,
> multipath, and PCI hotswap, the disk controllers may not even exist at
> boot, or may be reordered based on probe order, or the order in which
> the remote units respond; and that's before the kernel even gets around
> to setting up the devices attached to those controllers. You cannot
> reasonably expect the system to statically allocate device nodes for
> every possible configuration that may exist for all technologies that
> might be added to a machine, so why offer the expectation when the
> system cannot possibly hope to fulfill it for even a fraction of the
> common cases?

I grant you variable topologies make things incredibly hairy, but there's no need to take that mess and inject it into how the fixed topology (the physical hw in the box) is handled. Trying to handle all topology types in a single space can be messy. This problem wouldn't exist if a fixed topology used the old naming (adXX) and the variable topologies used the new naming (adaXX). Even this is less than ideal because your variable topologies provide no guarantee of anything being the same, thus your system could boot one day and fail the next because someone added a new piece of hardware to the network. That's probably more the name of the game in variable topologies (adminA changes the configuration on $ImportantBootDevice and stuff breaks), but I certainly don't want that uncertainty with the hardware in a machine.

I stated that updating the device naming w/o updating the methodologies that rely upon that device naming is asking for trouble. I can't say I know a solution nor that I'm an expert, but this seems like it will cause many more problems than it will solve.

>> Case in point. I have a system with 15 drives in it. I decided I
>> wanted to install on the 2nd device instead of the 1st, but I
>> partitioned all the other 14 drives. I completed installation and went
>> to boot the system and it failed. Stupid me, the GPT boot loader found
>> disk1 with a partitioning scheme but no fs. So, I popped out disk 1 and
>> went to boot again. Hey, now it starts to boot only to fail to find the
>> root fs because it's looking on ada1 and the fs is on ada0. That is a
>> mess.
>
> Sounds like a bug in the BIOS or boot loader. The boot loader should be
> able to ask the BIOS for the device from which it read the boot code,
> and use that instead of just naively using the first available
> device in the system; the only instances where I've seen this fail have
> been on machines that should've been put down years ago. Which isn't to
> say it doesn't still happen.

No bug in the BIOS at all. It's simply a case of device boot order, and since I installed on disk 2 but put a bootloader on disk 1 with no OS, the result was expected.

>> This is not necessarily common, but also not uncommon. More likely is
>> the case where you add a drive to the system and the above scenario
>> plays out because the device names get re-ordered. I'm not sure the
>> problem the dynamic device nodes intends to solve, but it's certainly
>> caused all sorts of pain and the need for the 2 (that I know of)
>> workarounds.
>
> How about when you add a PATA drive to a machine, but the cable is
> blocking the last available bay; so you have to move an existing drive
> to a different position on the cable to make room for the one you're
> installing? Static device numbering won't save you now.

This is not the same thing at all. If I move a physical cable, or a drive on a cable, then yes, I should expect things to change. I have made a physical change to the disk's connections, and I should expect something to come out of it. In my case, I have not moved the cabling of a disk at all and thus expect the device name to stay the same. All I have done is add a new disk to the controller. I have a reasonable expectation that that action should not re-order the device nodes and screw up god knows what (ALL mounts could break, and the system could even fail to boot). This is how things used to work, and in fact still do work in Linux and other *nix.

> Or how about those silly BIOSes that assume that you must really want to
> boot to the new disk you just attached to the machine, so helpfully
> rearrange your boot order for you so now you're booting to a strange
> disk with who knows what on it?
>
> Honestly, there's so much that can go wrong. That's what sysadmins are for.

None of those are related to my point. If something breaks before I boot the system, that's a whole other issue.
I am talking about breaking filesystem mounting by changing an age-old methodology.

>> I dislike the idea of having to use labels to get static functionality
>> (increases the likelihood of something going wrong for a disk replace
>> operation if I forget to label), but I'll give gpt labels a try.
>
> I find that labels solve more problems than they introduce, when applied
> properly. The semantic meaning given to the devices often mean I can
> discover what's on a particular disk in my pile'o'drives just by
> plugging it in and looking at the kernel log; no mounting necessary.
> Likewise, when juggling disks or controllers around, I don't have to
> worry about remembering to update the fstab, since the labels follow the
> data.

If you want to use labels then by all means use them. I can see advantages to using them. What I'm saying is that it is broken to have to use them in order to fix issues with the ahci driver using dynamic device names. The fact that you have to use them to ensure your system doesn't break horribly when you do something simple like add a disk is a clear indication of a broken design in the ahci driver imho.

Rob
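
For readers following the thread, here is a toy model of the renumbering scenario described in this message. It is only a sketch: it assumes names are handed out sequentially to whichever disks are present, which is a simplification and not a description of how the FreeBSD kernel actually probes devices. The helpers `enumerate_disks` and `find_by_label` and the labels `scratch` and `rootfs` are hypothetical names made up for illustration.

```python
def enumerate_disks(ports):
    """Toy model: give adaN names, in order, to occupied ports only."""
    present = [disk for disk in ports if disk is not None]
    return {f"ada{i}": disk for i, disk in enumerate(present)}


def find_by_label(devices, label):
    """Return the device name that currently carries the label, or None."""
    for name, disk in devices.items():
        if disk["label"] == label:
            return name
    return None


# Two disks installed; the root filesystem lives on the second one.
ports = [{"label": "scratch"}, {"label": "rootfs"}, None]
before = enumerate_disks(ports)
root_before = find_by_label(before, "rootfs")

# Pull the first disk, as in the boot scenario quoted above.
ports[0] = None
after = enumerate_disks(ports)
root_after = find_by_label(after, "rootfs")

# The name tied to position changes; the label lookup still finds the disk.
print(root_before, root_after)  # prints "ada1 ada0"
```

In this model, anything that hard-codes `ada1` as the root device stops working once the first disk is removed, while a lookup by label keeps resolving to the right disk. That is the trade-off both sides of the thread are arguing about.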