From owner-freebsd-questions@FreeBSD.ORG Tue Nov 4 17:55:57 2008
Date: Tue, 4 Nov 2008 12:55:56 -0500
From: "Gabriel Lavoie"
To: "Volodymyr Kostyrko"
Cc: freebsd-questions@freebsd.org
Subject: Re: gjournal: journaled slices vs. journaled partitions
References: <48FD6665.5000102@telus.net> <48FD6803.7080802@shopzeus.com> <48FE6C64.7060606@telus.net>

Hello,

I built a similar setup last weekend on a new home server with two
500GB drives. I didn't want to use gmirror alone and have the full
drives rebuild after every power failure or reset. I was told that
putting bsdlabels on a gjournal provider wasn't a good idea, but I have
yet to get an answer about why... I went with this setup anyway and ran
some reset tests to see what happens on reboot, and everything always
went fine.

While building this setup I hit one big problem. If the root filesystem
(/) was on a gjournal provider, an unclean shutdown while data was being
written to the disk rendered the system completely unbootable. I got
this message:

GEOM_MIRROR: Device mirror/gm launched (2/2)
GEOM_JOURNAL: Journal 3672855181: mirror/gma contains data.
GEOM_JOURNAL: Journal 3672855181: mirror/gma contains journal.
GEOM_JOURNAL: Journal 3868799910: mirror/gmd contains data.
GEOM_JOURNAL: Journal 3868799910: mirror/gmd contains journal.
GEOM_JOURNAL: Journal mirror/gmd consistent.
Trying to mount root from ufs:/dev/mirror/gm.journal

Manual root filesystem specification:
  <fstype>:<device>  Mount <device> using filesystem <fstype>
                       eg. ufs:da0s1a
  ?                  List valid disk boot devices
  <empty line>       Abort manual input

mountroot> ?

List of GEOM managed disk devices:
  mirror/gmd.journal mirror/gmd mirror/gmc mirror/gma mirror/gm ad10s1c ad10s1b ad8s1c ad8s1b ad10s2 ad10s1 ad8s1 ad10 ad8 acd0

As you can see, "mirror/gm.journala" is absent from the proposed list of
disk devices to boot from.
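The same check can be done mechanically on a copy of the device list.
This is only a minimal sketch: the has_device helper and the variable
names are made up for illustration, and the list is pasted by hand from
the mountroot> output above.

```sh
#!/bin/sh
# Device names copied from the "List of GEOM managed disk devices" output.
devices="mirror/gmd.journal mirror/gmd mirror/gmc mirror/gma mirror/gm ad10s1c ad10s1b ad8s1c ad8s1b ad10s2 ad10s1 ad8s1 ad10 ad8 acd0"

# has_device NAME (hypothetical helper): succeed if NAME is an exact
# entry in $devices.
has_device() {
    for d in $devices; do
        [ "$d" = "$1" ] && return 0
    done
    return 1
}

# Check both the device the kernel tried to mount and the one named above.
for wanted in mirror/gm.journal mirror/gm.journala; do
    if has_device "$wanted"; then
        echo "$wanted: present"
    else
        echo "$wanted: missing"
    fi
done
```

Both names come out as missing, while the underlying mirror/gm provider
is in the list.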

As Ivan Voras, whom I contacted about this problem, and I found, the
GEOM_JOURNAL thread that is supposed to mark the journal consistent
takes too long to do so for the root filesystem's provider, and the
kernel tries to mount a device that doesn't exist yet. A bug report has
been opened about this problem.

For my final setup I decided to put the root filesystem on a separate
mirrored slice of 1GB. Since this slice isn't written to often, few
rebuilds should occur after a power failure. I ran my "power failure"
test by hitting the reset button while writing data to this filesystem,
and the rebuild of 1GB doesn't take too much time (at most 20-30
seconds).

Now I have a question: why wasn't the "load" algorithm recommended? Is
it fixed in 7.0-RELEASE-p5?

Here is my complete setup, which booted correctly every time I ran my
reset tests while writing data to each filesystem. The 2GB gjournal
provider sits directly on the mirror provider for all mirrored
filesystems except the root one, and I put my bsdlabels on the gjournal
provider instead of creating a journal for every filesystem.
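
The 2GB figure can be cross-checked against the "gjournal list" output
further down in this message. A minimal sketch, assuming Jstart and Jend
are byte offsets of the journal area on mirror/data; the variable names
are mine:

```sh
#!/bin/sh
# Values copied from the "gjournal list" output in this message.
jstart=495810966528   # Jstart
jend=497958450176     # Jend

# Journal size in bytes and in MiB (1 MiB = 1048576 bytes).
journal_bytes=$((jend - jstart))
journal_mib=$((journal_bytes / 1048576))

echo "journal: $journal_bytes bytes ($journal_mib MiB)"
# prints: journal: 2147483648 bytes (2048 MiB)
```

Jstart also equals the Mediasize reported for mirror/data.journal, which
fits the journal starting right where the usable data area ends.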

[root@headless ~]# cat /etc/fstab
# Device                Mountpoint      FStype  Options         Dump    Pass#
/dev/ad10s1b            none            swap    sw              0       0
/dev/ad8s1b             none            swap    sw              0       0
/dev/mirror/root        /               ufs     rw              1       1
/dev/ufs/usr            /usr            ufs     rw,async        2       2
/dev/ufs/var            /var            ufs     rw,async        2       2
/dev/ufs/tmp            /tmp            ufs     rw,async        2       2
/dev/ufs/home           /home           ufs     rw,async        2       2
/dev/ufs/data           /mnt/data       ufs     rw,async        2       2
/dev/acd0               /cdrom          cd9660  ro,noauto       0       0

[root@headless ~]# mount
/dev/mirror/root on / (ufs, local, soft-updates)
devfs on /dev (devfs, local)
/dev/ufs/usr on /usr (ufs, asynchronous, local, gjournal)
/dev/ufs/var on /var (ufs, asynchronous, local, gjournal)
/dev/ufs/tmp on /tmp (ufs, asynchronous, local, gjournal)
/dev/ufs/home on /home (ufs, asynchronous, local, acls, gjournal)
/dev/ufs/data on /mnt/data (ufs, asynchronous, local, acls, gjournal)

[root@headless ~]# glabel status
    Name  Status  Components
 ufs/usr     N/A  mirror/data.journald
 ufs/var     N/A  mirror/data.journale
 ufs/tmp     N/A  mirror/data.journalf
ufs/home     N/A  mirror/data.journalg
ufs/data     N/A  mirror/data.journalh

[root@headless ~]# gjournal list
Geom name: gjournal 372943514
ID: 372943514
Providers:
1. Name: mirror/data.journal
   Mediasize: 495810966528 (462G)
   Sectorsize: 512
   Mode: r5w5e11
Consumers:
1. Name: mirror/data
   Mediasize: 497958450688 (464G)
   Sectorsize: 512
   Mode: r1w1e1
   Jend: 497958450176
   Jstart: 495810966528
   Role: Data,Journal

[root@headless ~]# gmirror list
Geom name: data
State: COMPLETE
Components: 2
Balance: split
Slice: 4096
Flags: NOFAILSYNC
GenID: 0
SyncID: 1
ID: 990032118
Providers:
1. Name: mirror/data
   Mediasize: 497958450688 (464G)
   Sectorsize: 512
   Mode: r1w1e1
Consumers:
1. Name: ad8s2
   Mediasize: 497958451200 (464G)
   Sectorsize: 512
   Mode: r1w1e1
   State: ACTIVE
   Priority: 0
   Flags: HARDCODED
   GenID: 0
   SyncID: 1
   ID: 235591066
2. Name: ad10s2
   Mediasize: 497958451200 (464G)
   Sectorsize: 512
   Mode: r1w1e1
   State: ACTIVE
   Priority: 0
   Flags: HARDCODED
   GenID: 0
   SyncID: 1
   ID: 2007880058

Geom name: root
State: COMPLETE
Components: 2
Balance: split
Slice: 4096
Flags: NONE
GenID: 0
SyncID: 1
ID: 4098555256
Providers:
1. Name: mirror/root
   Mediasize: 1073022976 (1.0G)
   Sectorsize: 512
   Mode: r1w1e1
Consumers:
1. Name: ad8s1a
   Mediasize: 1073023488 (1.0G)
   Sectorsize: 512
   Mode: r1w1e1
   State: ACTIVE
   Priority: 0
   Flags: HARDCODED
   GenID: 0
   SyncID: 1
   ID: 3394521634
2. Name: ad10s1a
   Mediasize: 1073023488 (1.0G)
   Sectorsize: 512
   Mode: r1w1e1
   State: ACTIVE
   Priority: 0
   Flags: HARDCODED
   GenID: 0
   SyncID: 1
   ID: 3774466459

Gabriel

2008/11/4 Volodymyr Kostyrko

> Carl wrote:
>
>> Volodymyr Kostyrko wrote:
>>
>>> I have some setups where gjournal was put on the device rather than
>>> on a partition, i.e.:
>>>
>>> [umgah] ~> gmirror status
>>>          Name    Status  Components
>>> mirror/umgah0  COMPLETE  ad0
>>>                          ad1
>>> [umgah] ~> gjournal status
>>>                  Name  Status  Components
>>> mirror/umgah0.journal     N/A  mirror/umgah0
>>> [umgah] ~> glabel status
>>>             Name  Status  Components
>>>   ufs/umgah0root     N/A  mirror/umgah0.journala
>>> label/umgah0swap     N/A  mirror/umgah0.journalb
>>>    ufs/umgah0usr     N/A  mirror/umgah0.journald
>>>    ufs/umgah0var     N/A  mirror/umgah0.journale
>>
>> Does the above suggest that you've ended up with individual journal
>> providers for each partition anyway? If so, where are they and have
>> you really achieved anything functionally different? Are they at the
>> end of their individually associated partitions or all together
>> somewhere else? Has the ill-advised journaled small partition issue
>> been successfully overcome through what you've done?
>
> First, there is only one journal - for /dev/mirror/umgah0, and it is
> named /dev/mirror/umgah0.journal. Everything else is just bsdlabel
> partitions; there are four of 'em.
>
>>> [umgah] ~> mount
>>> /dev/ufs/umgah0root on / (ufs, asynchronous, local, noatime, gjournal)
>>> devfs on /dev (devfs, local)
>>> /dev/md0 on /tmp (ufs, asynchronous, local)
>>> /dev/ufs/umgah0var on /var (ufs, asynchronous, local, noatime, gjournal)
>>> /dev/ufs/umgah0usr on /usr (ufs, asynchronous, local, noatime, gjournal)
>>> devfs on /var/named/dev (devfs, local)
>>>
>>> And yes, mirror autosynchronization is turned off; gjournal takes
>>> care of that too.
>>>
>>> It's not stated in the manual, but gjournal is typically transparent
>>> for any type of access; only in the case of UFS is the file system
>>> marked as journaled, so that metadata writes can be distinguished
>>> from data writes. Without that gjournal does literally nothing.
>>
>> And what does this mean for your swap partition?
>
> Just nothing, it's just swap. It can't be journaled.
>
>> Laszlo Nagy wrote earlier:
>>
>>> Another tricky question: why would you journal a SWAP partition?
>>
>> Volodymyr, does your assertion that gjournal does nothing when a file
>> system is not UFS mean that there is no penalty with regard to your
>> swap partition despite the existence of "mirror/umgah0.journalb"?
>
> I haven't seen any performance decrease in this configuration. And
> according to the manual and articles about gjournal it should work
> this way.
>
>> Any chance you'd like to share your command sequence for constructing
>> your gmirror'd and gjournal'd filesystem, Volodymyr? :-)
>
> If we have two disks (ad0, ad1) it should look like this:
>
> > gmirror label -b load -n umgah0 ad1
>
> We get the whole drive gmirrored without synchronization (we don't
> need it - the journal takes care of any discrepancies) and with load
> balancing (load was fixed not so long ago in stable and should be fine
> to go with).
>
> > gjournal label mirror/umgah0
>
> We create a journal on top of our gmirror. It eats 1G from the end of
> the disks and gives us the rest to use.
>
> > bsdlabel -wB mirror/umgah0.journal
>
> We write the standard bsdlabel to the disk and make it bootable.
> After that we get one partition 'a'.
>
> Yes, no fdisk. I don't think this old piece of rough junk is ever
> needed on a machine running FreeBSD solely. It just takes space, it
> requires compatibility with forgotten-and-abandoned standards and
> gives nothing more. You have your server dual-booting Windows or
> Linux? This is the only case you need fdisk for.
>
> > bsdlabel -e mirror/umgah0.journal
>
> Now we split our journal provider into some partitions. I did it this
> way:
>
> # /dev/mirror/umgah0.journal:
> 8 partitions:
> #        size   offset    fstype   [fsize bsize bps/cpg]
>   a:    524288       16    4.2BSD
>   b:  16777216        *      swap
>   c: 779325614        0    unused        0     0   # "raw" part, don't edit
>   d:  33554432        *    4.2BSD
>   e:         *        *    4.2BSD
>
> After that we can format these filesystems:
>
> > newfs -J -L umgah0root /dev/mirror/umgah0.journala
> > newfs -J -L umgah0var /dev/mirror/umgah0.journald
> > newfs -J -L umgah0usr /dev/mirror/umgah0.journale
>
> And label the swap:
>
> > glabel label umgah0swap /dev/mirror/umgah0.journalb
>
> You can skip all this glabel stuff; I just prefer to have a slim
> fstab, as slim as possible.
>
> /dev/label/umgah0swap  none  swap  sw                     0 0
> md                     /tmp  mfs   rw,-s1024m,-S,-oasync  0 0
> /dev/ufs/umgah0root    /     ufs   rw,async,noatime       0 1
> /dev/ufs/umgah0var     /var  ufs   rw,async,noatime       0 2
> /dev/ufs/umgah0usr     /usr  ufs   rw,async,noatime       0 2
>
> There's a lot more to describe here, from moving the system to the
> newly created partitions to inserting and rebuilding our first disk
> into the gmirror. All these issues are described in the handbook or in
> other articles found on the net.
>
> --
> Sphinx of black quartz judge my vow.
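
For reference, the sizes in the quoted bsdlabel table are sector counts.
A minimal sketch of the conversion, assuming 512-byte sectors (the
quoted message does not state the sector size) and assuming 'c' spans
the whole provider; the to_mib helper and variable names are mine:

```sh
#!/bin/sh
# Sizes copied from the quoted bsdlabel table, in sectors.
# Assumption: 512-byte sectors.
sector=512

# to_mib SECTORS (hypothetical helper): print the size in MiB,
# using integer division.
to_mib() {
    echo $(( $1 * sector / 1048576 ))
}

a_mib=$(to_mib 524288)      # partition a
b_mib=$(to_mib 16777216)    # partition b (swap)
d_mib=$(to_mib 33554432)    # partition d

# 'e' has size '*', so it gets what is left of 'c' after the
# 16-sector offset and partitions a, b and d.
e_sectors=$(( 779325614 - 16 - 524288 - 16777216 - 33554432 ))

echo "a: ${a_mib} MiB, b: ${b_mib} MiB, d: ${d_mib} MiB, e: ${e_sectors} sectors"
# prints: a: 256 MiB, b: 8192 MiB, d: 16384 MiB, e: 728469662 sectors
```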

-- 
Gabriel Lavoie
glavoie@gmail.com