Date: Mon, 27 Jan 2003 12:37:01 +0100
From: phk@freebsd.org
To: Marcel Moolenaar <marcel@xcllnt.net>
Cc: Nate Lawson <nate@root.org>, cvs-committers@freebsd.org, cvs-all@freebsd.org
Subject: Re: cvs commit: src/sbin/disklabel disklabel.c
Message-ID: <21754.1043667421@critter.freebsd.dk>
In-Reply-To: Your message of "Mon, 27 Jan 2003 02:45:55 PST." <20030127104555.GA3050@dhcp01.pn.xcllnt.net>

In message <20030127104555.GA3050@dhcp01.pn.xcllnt.net>, Marcel Moolenaar writes:

>Clearly this isn't getting anywhere and you also don't give any
>reason whatsoever why fixing geom is not an option. So, at this
>time I can only conclude that there are non-technical forces
>at work... this is where I step out of the discussion and into
>bed...

First of all: The ratio of normal I/O requests to operations which
examine or modify partitioning meta-data is so staggeringly high that
it should be obvious that there is no sane reason to take even the
smallest performance hit in the normal I/O path to cater for
meta-data updates.

And that's where it stops. You can call this "non-technical forces"
if you like, but that is not what I would choose to call it.

Second, if you add a sysctl that allows you to write to /dev/ad0
while it is being actively used, then you can update the on-disk GPT
tables, but you still need to notify the GPT module that a change
occurred, and in doing so, you may present the GPT module with a new
on-disk GPT table which would blow any number of already open and
mounted partitions out of the water.

So: How do you intend to make that notification happen, and how do
you intend to handle the case where the new on-disk GPT table would
violate open partitions?

The answer is "the notification mechanism would be trivial" and
"you can't" respectively.

That's where it stops for that "solution".

We have always, and for obviously good reasons, refused to shrink or
move open partitions, and I have not seen anybody suggest that this
restriction be lifted, and I have implemented GEOM so that nobody
short-cuts across it as a "quick hack".

(GEOM does in fact provide for much finer control over this aspect
than our previous code, but because the vnode layer does not carry
the relevant information we cannot fully exploit this yet.)

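To make the "refuse to shrink or move open partitions" rule concrete, here is a minimal sketch of the kind of check an active partitioning class would have to run before accepting a new table. It is not GEOM code: the `Partition` record, the `open_count` field, and matching entries by name are all assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Partition:
    # Hypothetical record; real GPT entries carry GUIDs, attributes, etc.
    name: str
    start: int           # first sector
    size: int            # length in sectors
    open_count: int = 0  # assumed: number of current opens/mounts


def violations(current, proposed):
    """Return the reasons the proposed table conflicts with open partitions.

    A partition that is open in `current` must still exist in `proposed`,
    must start at the same sector, and must not be smaller.
    """
    new_by_name = {p.name: p for p in proposed}
    problems = []
    for old in current:
        if old.open_count == 0:
            continue  # closed partitions may be changed freely
        new = new_by_name.get(old.name)
        if new is None:
            problems.append(f"{old.name}: open partition removed")
        elif new.start != old.start:
            problems.append(f"{old.name}: open partition moved")
        elif new.size < old.size:
            problems.append(f"{old.name}: open partition shrunk")
    return problems
```

A table written straight to the parent disk never passes through a check like this, which is why the update has to go through the active class instance instead.
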
So, yes, it does look a lot like disk-partitioning tools will need to
be able to operate in two distinct modes:

   1) Directly to the parent as a disk, to write the initial meta-data.

   2) Using whatever ioctl/special devices/whatever interface this
      particular geom-class implements for modifying active instances.

Now, having been extensively through two disk-partitioning tools which
both have this ability, and which both are bogged down with extensive
historical luggage, and one of them even with totally braindamaged
design decisions, I can tell you that this is not an impossible task.

It is not even a big task.

It's barely noticeable in fact.

Yes, it's marginally harder to write disk-management tools, but hey,
that's life. In my experience the code that makes the bits end up in
the right place is totally dwarfed by the user interface that allows
the administrator to twiddle those bits anyway.

I don't care if you implement your "in-line" path using /dev/ad0.gpt
and ioctls, or reads/writes or even sysctls, but you will have to go
through the active GPT class if there is one, and you obviously cannot
go through it if there isn't one.

Now, if you want to cooperate on trying to make a generically usable
interface for such operations, I'm all open for cooperation.

Poul-Henning

PS: And trust me: I have only inflicted GEOM on you and the rest of
the FreeBSD crew in order to annoy you and to restrict your
possibilities and restrain the performance. Really: That's why.

-- 
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk@FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.