Date: Wed, 3 Aug 2005 18:38:36 +0200 (CEST)
From: Ivan Voras <ivoras@fer.hr>
To: hackers@freebsd.org
Subject: gjournal public alpha release
Message-ID: <20050803183010.X32344@geri.cc.fer.hr>

Hi!

I'm announcing the first public version of the gjournal GEOM class :)

The code is here: http://ivoras.sharanet.org/gjournal.tgz, together with
a README file (reproduced below). I'd like to hear as many testing and
bug reports as possible :)

----

The README file:

What is it
----------

It's a journaling layer in the GEOM subsystem. The intention is to
provide devices (on which filesystems may be hosted) with data
journaling capabilities.

This is my first GEOM class, and also my first significant piece of
kernel programming, so there are bound to be errors from my
inexperience. The code is tested though, and it shouldn't crumble too
often. :)

More information is available at:
http://wikitest.freebsd.org/moin.cgi/gjournal

What does it do
---------------

gjournal connects to ("consumes") two devices: one is the "data device"
that is the target for journaling, and the other is the "journal device"
on which data is journaled. For every write request, its data is written
to the journal device, and after some time transferred to the data
device.

Why use it
----------

The principal benefit is that writes to the journal device are done
sequentially and are much faster than direct, scattered writes to the
data device.

Another benefit, not implemented yet, is that it can be used in a
"delayed-commit" mode, aka "copy-on-write", where the data is stored in
the journal but not automatically committed to the data device. This
allows for dangerous experimenting on the data (maybe a filesystem, with
fsck), and then deciding later whether to commit the changes to the data
device or discard them.

What works in this version
--------------------------

This is alpha version software. Don't use it in a production setup.

* journaling with automatic commit
* automatic recovery of the journal on crash

I've tested it by hosting a filesystem on the journaled device and
copying various files to and from it, so it should work well at least
for light loads without panicking the kernel when you look at it :)

Notes
~~~~~

* Since each and every write request is recorded verbatim in the
  journal, together with some system data (overhead), the journal
  device should be big. Based on current preliminary testing, I'd
  recommend something like 500MB to 1GB.
* Making the journal device an md(4) device backed by a file should
  work, but it's not tested and there could be problems with crash
  recovery.
* Data and journal devices can be on different physical devices, for
  added speed.
* There are some design oddities, like blocking all I/O on the device
  while the entire journal is committed, that won't go away soon, but
  probably will in a later version.

How to use it
-------------

Here's an example, step-by-step:

* Unpack the archive, chdir to the resulting directory
* `make`
* Symlink the resulting .ko file into /boot/kernel/
* `make so`
* Symlink the resulting .so file into /lib/geom/
* `./gjournal load`
* `./gjournal label mydevice /dev/datadevice /dev/journaldevice`
* Use the resulting /dev/journaled/mydevice for testing
* `./gjournal unload`

Notes
~~~~~

* It's developed for 5.4-RELEASE but it should work on later versions.
* You need full system sources present in the usual location (/usr/src)
  to build it.
* You need a kernel with INVARIANTS and INVARIANTS_SUPPORT to run it (or
  you can modify the Makefile not to define those)
* This is alpha quality software. Do not use on production machines
  and/or data.
* I'd like to hear as many reports of testing as possible. If it
  crashes, I'd appreciate receiving the following data:
  - what you wanted to do
  - what you did (e.g. commands you executed)
  - the configuration of your devices (data and journal devices)
  - is it repeatable?
  - if it's repeatable, set the kern.geom.journal.debug sysctl to 20,
    and send me as much of the last part of the kernel log as you can

Benchmarks
----------

I did some quick preliminary (and thus non-scientific and
non-conclusive) benchmarks, and the results are good:

"tar x"  = untarring of a (previously cached) tar archive containing
           the /usr/src/sys tree
"rm -rf" = doing rm -rf on the untarred tree
"raw"    = partition without gjournal
"gj"     = the same partition gjournal-ed on another partition on the
           same drive
"SU"     = softupdates
"normal" and "sync" are mount methods for UFS

(numbers are seconds)

Type        | tar x | rm -rf
----------------------------------
raw, sync   | 25.0  | 8.5
raw, normal | 18.9  | 9.6
raw, SU     | 17.9  | 0.6
gj, sync    | 23.0  | 7.9
gj, normal  | 11.9  | 8.2

These are results in the best case for gjournal, where there's no
journal commit phase in the middle of benchmarking. Commit delay can be
configured with the kern.geom.journal.commit_delay sysctl (in seconds).

Unfortunately, the best result (gj, normal) is not crash-resistant.
Though the journaling appears to be sound, it seems that FFS/UFS in the
"normal" mode (metadata synchronous, data asynchronous) doesn't keep the
filesystem always consistent with the writes, so a crash does require
fsck on the filesystem. This may also be true for "sync" mode, only
harder to provoke. I'd appreciate any help to explain or solve this, but
at the current time it means that this setup (gjournal + UFS) is NOT
viable as a replacement for a journaling filesystem.

Acknowledgments
---------------

This work is sponsored by Google via the Summer of Code project. Mentors
are Poul-Henning Kamp <phk@FreeBSD.org> and Pawel Jakub Dawidek
<pjd@FreeBSD.org>.

-- 
Every sufficiently advanced magic is indistinguishable from technology
 - Arthur C Anticlarke
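
To illustrate the journal-then-commit idea from "What does it do", here
is a minimal userspace sketch in Python. It is not the gjournal code and
does not reflect its on-disk format; the class name, the dict standing
in for the data device and the list standing in for the journal device
are all assumptions made for the example.

```python
class JournaledDevice:
    """Toy model: writes go to an append-only journal, then to the data device."""

    def __init__(self):
        self.data = {}      # hypothetical data device: block number -> bytes
        self.journal = []   # hypothetical journal device: ordered (block, bytes) records

    def write(self, block, payload):
        # Each write request is recorded verbatim, in arrival order (sequential append).
        self.journal.append((block, bytes(payload)))

    def commit(self):
        # Transfer journaled writes to the data device in order, then empty the journal.
        for block, payload in self.journal:
            self.data[block] = payload
        committed = len(self.journal)
        self.journal.clear()
        return committed


dev = JournaledDevice()
dev.write(7, b"old")
dev.write(3, b"abc")
dev.write(7, b"new")      # a later write to the same block wins at commit time
pending = len(dev.journal)
dev.commit()
print(pending, dev.data)
```

In this sketch the commit is called explicitly; in gjournal it happens
after the delay set by kern.geom.journal.commit_delay. The scattered
block numbers (7, 3, 7) reach the journal as one sequential stream,
which is where the speed benefit is expected to come from.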