Date: Mon, 23 Jul 2007 12:36:51 +0200
From: Pawel Jakub Dawidek <pjd@FreeBSD.org>
To: Alexander Leidinger <Alexander@Leidinger.net>
Cc: Ulf Lilleengen <lulf@FreeBSD.org>, perforce@freebsd.org, Eric Anderson <anderson@freebsd.org>
Subject: Re: PERFORCE change 123662 for review
Message-ID: <20070723103651.GD5456@garage.freebsd.pl>
In-Reply-To: <20070720150716.77d2636a@deskjail>
References: <200707172109.l6HL9PMJ078780@repoman.freebsd.org> <46A03390.3030602@freebsd.org> <20070720123524.GA71360@twoflower.idi.ntnu.no> <20070720150716.77d2636a@deskjail>

On Fri, Jul 20, 2007 at 03:07:16PM +0200, Alexander Leidinger wrote:
> Quoting Ulf Lilleengen <lulf@FreeBSD.org> (Fri, 20 Jul 2007 14:35:24 +0200):
>
> [growing RAID-5]
> > Well, what I do is to attach/create the new subdisk as usual, but since it's a
> > RAID-5 array that I know is operational, I give the subdisk a flag, and set the
> > plex in a resize state. Then, in the raid-5 code, I modify gv_raid5_offset
> > (which basically computes offsets within a subdisk based on the number of
> > subdisks and stripesize). However, what I do is that instead of taking all
> > subdisks in the calculation, I only take those that do not have the GROW flag
> > (when reading), and I take all subdisks into calculation when it's a write.
> >
> > This means that I create a gv_grow_plex function that reads (stripesize x
> > sdcount) bytes (from the subdisks that do not have the GROW flag), and writes
> > that data to the plex (including all subdisks). This way, I sort of overwrite
> > the old data, but the data is spread out over the new subdisks. I'm sorry if
> > this might seem a bit complex, but just ask more questions if you didn't
> > understand.
>
> Do you use the additional drive(s) only to write checksums to them, or
> do you write real data to it? If the latter, how do you make sure you
> read the right data in case you read data again, which was just written
> there a moment before (how do you know to read from all subdisks and
> not only from a subset in this case)?

You only need to move an offset while you synchronize the new disk.

When you start you have:

	Disk0	Disk1	Disk2	NewDisk

	D0	D1	P0	U
	D2	P1	D3	U
	P2	D4	D5	U
	D6	D7	P3	U
	D8	P4	D9	U
	P5	D10	D11	U

After some time you have:

	Disk0	Disk1	Disk2	NewDisk

	D0	D1	D2	NP0
	D3	D4	NP1	D5
	U	U	U	U
-->	D6	D7	P3	U
	D8	P4	D9	U
	P5	D10	D11	U

And at the end you have:

	Disk0	Disk1	Disk2	NewDisk

	D0	D1	D2	NP0
	D3	D4	NP1	D5
	D6	NP2	D7	D8
	NP3	D9	D10	D11
	U	U	U	U
	U	U	U	U

Where:

	D<x>  - data block
	P<x>  - parity block
	NP<x> - new parity block
	U     - unused
	-->   - if offset in I/O request is below that point, you use four
	        disks, if it is above that point you use only three disks

BTW. Such functionality is really cool.

-- 
Pawel Jakub Dawidek                       http://www.wheel.pl
pjd@FreeBSD.org                           http://www.FreeBSD.org
FreeBSD committer                         Am I Evil? Yes, I Am!
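
The moving-offset idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not gvinum code: the function names (parity_disk, locate, locate_during_grow, render) and the "migrated" counter are hypothetical, the parity rotation is simply read off the three tables in this message, and the sketch tracks block placement only (no parity computation, no crash safety). The boundary is expressed as the number of logical data blocks already rewritten in the four-disk layout, assumed to sit on a row boundary as in the tables.

```python
def parity_disk(row, ndisks):
    """Disk index holding parity for a row; parity rotates right to left."""
    return (ndisks - 1) - (row % ndisks)


def locate(block, ndisks):
    """Map a logical data block number to (row, disk) for an ndisks-wide array."""
    row, idx = divmod(block, ndisks - 1)    # ndisks - 1 data blocks per row
    pdisk = parity_disk(row, ndisks)
    disk = idx if idx < pdisk else idx + 1  # skip over the parity slot
    return row, disk


def locate_during_grow(block, migrated, old_ndisks, new_ndisks):
    """Pick the layout by comparing the block number with the boundary.

    Blocks below `migrated` already live in the wide layout; the rest are
    still where the narrow layout put them.
    """
    ndisks = new_ndisks if block < migrated else old_ndisks
    return locate(block, ndisks)


def render(total_blocks, migrated, old_ndisks, new_ndisks):
    """Rebuild the table for a given boundary (hypothetical helper)."""
    rows = -(-total_blocks // (old_ndisks - 1))  # ceiling division
    grid = [["U"] * new_ndisks for _ in range(rows)]
    for block in range(total_blocks):
        moved = block < migrated
        ndisks = new_ndisks if moved else old_ndisks
        row, disk = locate(block, ndisks)
        grid[row][disk] = f"D{block}"
        grid[row][parity_disk(row, ndisks)] = ("NP" if moved else "P") + str(row)
    return grid


if __name__ == "__main__":
    # Start, middle and end of the grow, for 12 data blocks going from 3 to 4 disks.
    for migrated in (0, 6, 12):
        print(f"migrated = {migrated}")
        for row in render(12, migrated, 3, 4):
            print("\t".join(row))
```

With migrated = 6, as in the middle table, a request for D5 is resolved with four disks and a request for D6 with three. Because each rewritten block lands at the same or a lower row than it came from, the copy can proceed from low offsets upward, advancing the boundary as it goes.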