From: John Nielsen <lists@jnielsen.net>
To: freebsd-questions@freebsd.org
Cc: Steven Schlansker
Date: Tue, 22 Jul 2008 01:03:27 -0400
Subject: Re: Using ccd with zfs

On Tuesday 22 July 2008 12:18:31 am Steven Schlansker wrote:
> Hello -questions,
> I have a FreeBSD ZFS storage system working wonderfully with 7.0.
> It's set up as three 3-disk RAIDZs - triplets of 500, 400, and 300GB
> drives.
>
> I recently purchased three 750GB drives and would like to convert to
> using a RAIDZ2. As ZFS has no restriping capabilities yet, I will have
> to nuke the zpool from orbit and make a new one. I would like to
> verify my methodology against your experience to see if what I wish to
> do is reasonable:
>
> I plan to first take 2 of the 750GB drives and make an unreplicated
> 1.5TB zpool as temporary storage. Since ZFS doesn't seem to have the
> ability to create zpools in degraded mode (with missing drives) I plan
> to use iSCSI to create two additional drives (backed by /dev/zero) to
> fake having two extra drives, relying on ZFS's RAIDZ2 protection to
> keep everything running despite the fact that two of the drives are
> horribly broken ;)
>
> To make these 500, 400, and 300GB drives useful, I would like to
> stitch them together using ccd. I would use them as 500+300 = 800GB
> and 400+400 = 800GB.
>
> That way, in the end I would have
> 750 x 3
> 500 + 300 x 3
> 400 + 400 x 1
> 400 + 200 + 200 x 1
> as the members in my RAIDZ2 group. I understand that this is slightly
> less reliable than having "real" drives for all the members, but I am
> not interested in purchasing 5 more 750GB drives. I'll replace the
> drives as they fail.
>
> I am wondering if there are any logistical problems. The three parts
> I am worried about are:
>
> 1) Are there any problems with using an iSCSI /dev/zero drive to fake
> drives for creation of a new zpool, with the intent to replace them
> later with proper drives?

I don't know about the iSCSI approach, but I have successfully created a
degraded zpool using md and a sparse file in place of the missing disk.
Worked like a charm, and I was able to transfer everything to the zpool
before nuking the real device (which I had been using for temporary
storage) and replacing the md file with it.

You can create a sparse file using dd:

  dd if=/dev/zero of=sparsefile bs=512 seek=(size of the fake device in 512-byte blocks) count=0

Turn it into a device node using mdconfig:

  mdconfig -a -t vnode -f sparsefile

Then create your zpool using the /dev/md0 device (unless the mdconfig
operation returns a different node number). The size of the sparse file
should not be bigger than the size of the real device you plan to replace
it with. If you are using GEOM (which I think you should; see below),
remember to subtract 512 bytes for each level of each provider - GEOM
modules store their metadata in the last sector of each provider, so that
space is unavailable for use. To be on the safe side you can whack a few
KB off.

You can't remove the fake device from a running zpool, but the first time
you reboot it will be absent and the zpool will come up degraded.
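To put that together, a simplified sketch of the whole sequence might
look something like the following. The device names (ad4, ad6, ad8), the
pool name "tank", and the sector count are only placeholders - a 750GB
disk has roughly 1.46 billion 512-byte sectors, so stay a little under
that (and under any GEOM metadata overhead) for the fake members:

  # Two sparse files, each slightly smaller than a real 750GB drive
  # (the seek value below is a placeholder sector count):
  dd if=/dev/zero of=/root/fake0 bs=512 seek=1464000000 count=0
  dd if=/dev/zero of=/root/fake1 bs=512 seek=1464000000 count=0

  # Attach them as md devices (mdconfig prints the node it allocates):
  mdconfig -a -t vnode -f /root/fake0    # e.g. md0
  mdconfig -a -t vnode -f /root/fake1    # e.g. md1

  # Create the RAIDZ2 pool from the real disks plus the fake members:
  zpool create tank raidz2 ad4 ad6 md0 md1

  # Later, when a real drive is available, swap it in and let ZFS
  # resilver onto it:
  zpool replace tank md0 ad8

Depending on the ZFS version you may also be able to use "zpool offline"
on the fake members right away so the sparse files don't grow while the
data is being copied in.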
> 2) Are there any problems with using CCD under zpool? Should I stripe
> or concatenate? Will the startup scripts (either by design or less
> likely intelligently) decide to start CCD before zfs? The zpool should
> start without me interfering, correct?

I would suggest using gconcat rather than CCD. Since it's a GEOM module
(and you will have remembered to load it via /boot/loader.conf) it will
initialize its devices before ZFS starts. It's also much easier to set up
than CCD. If you are concatenating two devices of the same size you could
consider using gstripe instead, but think about the topology of your
drives and controllers and the likely usage patterns of your final setup
before deciding whether that's a good idea. There is a rough gconcat
sketch at the end of this message.

> 3) I hear a lot about how you should use whole disks so ZFS can enable
> write caching for improved performance. Do I need to do anything
> special to let the system know that it's OK to enable the write cache?
> And persist across reboots?

Not that I know of. As I understand it, ZFS _assumes_ it's working with
whole disks; since it uses its own I/O scheduler, performance can be
degraded for anything sharing a physical device with a ZFS slice.

> Any other potential pitfalls? Also, I'd like to confirm that there's
> no way to do this pure ZFS-like - I read the documentation but it
> doesn't seem to have support for nesting vdevs (which would let me do
> this without ccd)

You're right, you can't do this with ZFS alone. Good thing FreeBSD is so
versatile. :)

JN

> Thanks for any information that you might be able to provide,
> Steven Schlansker
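P.S. For reference, a gconcat setup along the lines suggested above might
look something like the sketch below. The disk names (ad8, ad10, ad12,
ad14) and the labels big0/big1 are only placeholders; use whatever
matches the real hardware.

  # Load the module now and at every boot, so the concat devices exist
  # before ZFS tries to import the pool:
  kldload geom_concat
  echo 'geom_concat_load="YES"' >> /boot/loader.conf

  # Concatenate a 500GB and a 300GB disk into one ~800GB provider:
  gconcat label big0 ad8 ad10

  # And a 400GB + 400GB pair into another:
  gconcat label big1 ad12 ad14

  # The new devices appear under /dev/concat/ and can be handed to
  # zpool like any other disk:
  ls /dev/concat/

gconcat stores its metadata in the last sector of each underlying disk,
which is the 512 bytes per provider mentioned earlier.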