Date: Thu, 14 Oct 1999 12:43:40 +0200
From: Poul-Henning Kamp <phk@critter.freebsd.dk>
To: freebsd-arch@freebsd.org
Subject: Re: The eventual fate of BLOCK devices.
Message-ID: <447.939897820@critter.freebsd.dk>
In-Reply-To: Your message of "Thu, 14 Oct 1999 13:15:25 +1000." <Pine.BSF.4.10.9910141222290.32868-100000@alphplex.bde.org>

[I know it was Julian who threw this ball in the air, but I take the
liberty of doing the final round: I have been the primus motor on this
issue from the beginning and it is part of the dev_t cleanup project.]

SUMMARY:

So far we have identified the following three classes of software which
access disk-like devices through cdev and bdev:

    1) Database software
    2) Filesystem maintenance tools
    3) savecore(8)

Database software prefers cdev semantics if at all possible; when running
on anything but a cdev, it calls fsync(2) a lot to make sure the writes
have hit the media.

Terry argues for retaining the bdev semantics rather than the cdev
semantics, but I think we can dismiss that idea based on the above
observation: it would penalize software which knows better. Retaining
the bdev would in essence be emulating the mistake Linux made, and which
they are now unmaking.

The filesystem maintenance applications mentioned so far which rely on
bdev semantics, the EXT2FS tools, can be trivially converted to operate
on cdev semantics. The majority of such tools already operate correctly
on cdevs.

Savecore(8) has already been converted to operate on cdevs.

Using mmap(2) to provide a new type of buffered semantics for disk-like
devices is interesting, but its applicability will be limited by the
virtual address space of a process: you can't map a 20GB database into
a 32-bit address space, so a lot of mmap(2) calls will be needed for
seriously sized data. The need for, and actual use of, such a facility
seems uncertain.

There is general disagreement about how much code we save, but nobody
disputes that we will be able to remove some amount of complexity from
the kernel. Most people seem to overlook the needlessly replicated code
in a number of xxx(8) tools to DTRT with /dev/foo vs /dev/rfoo.

Implementing an ioctl(2) to switch a disk-like device into bdev mode is
relatively trivial, but there currently seems to be no point in doing so.
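
To illustrate the address-space point about mmap(2) above, here is a minimal sketch of what "a lot of mmap(2) calls" looks like in practice. The helper names windows_needed() and count_nonzero() are hypothetical and exist only for this example, and the sketch assumes a file descriptor that supports mmap(2) at all (a regular file is the safe case; whether a given disk-like device can be mapped is a separate question). It is not a proposal for the buffered facility itself.

```c
#include <stdint.h>
#include <sys/types.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: how many mmap(2) windows of `window` bytes are
 * needed to cover `total` bytes.  Assumes window > 0.
 */
static uint64_t
windows_needed(uint64_t total, uint64_t window)
{
	return ((total + window - 1) / window);
}

/*
 * Hypothetical helper: walk `total` bytes of `fd` one window at a time,
 * mapping each window read-only and counting the non-zero bytes in it.
 * `window` must be a non-zero multiple of the page size so that every
 * offset passed to mmap(2) is page aligned.  Returns -1 if a mapping fails.
 */
static int64_t
count_nonzero(int fd, off_t total, size_t window)
{
	int64_t n = 0;
	off_t off;

	for (off = 0; off < total; off += (off_t)window) {
		/* The last window may be shorter than the others. */
		size_t len = (total - off < (off_t)window) ?
		    (size_t)(total - off) : window;
		unsigned char *p;
		size_t i;

		p = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, off);
		if (p == MAP_FAILED)
			return (-1);
		for (i = 0; i < len; i++)
			n += (p[i] != 0);
		/* Unmap before moving on, so only one window is live. */
		munmap(p, len);
	}
	return (n);
}
```

With a 1GB window, windows_needed(20ULL << 30, 1ULL << 30) comes to 20 separate map/unmap cycles for the 20GB case, and the application has to do its own bookkeeping about which window holds which record.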

There is a significant majority supporting the removal of bdev semantics.

CONCLUSION:
-----------

Unless we have significant new information to the contrary, I will
commence the bdev removal after November 1st 1999.

In order to try to trigger any such information, I will change the
default value of the vfs.bdev_buffered sysctl to zero this weekend;
this will make bdevs behave like cdevs.

An ioctl(2) based mode-switch will only be implemented if a very good
reason for doing so materializes.

Thanks for participating.

Poul-Henning

--
Poul-Henning Kamp             FreeBSD coreteam member
phk@FreeBSD.ORG               "Real hackers run -current on their laptop."
FreeBSD -- It will take a long time before progress goes too far!

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-arch" in the body of the message