From owner-freebsd-geom@FreeBSD.ORG Sun Nov 28 12:14:16 2004
Delivered-To: freebsd-geom@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 9E05F16A4CE
	for ; Sun, 28 Nov 2004 12:14:16 +0000 (GMT)
Received: from smtp.dkm.cz (smtp.dkm.cz [62.24.64.34])
	by mx1.FreeBSD.org (Postfix) with SMTP id 9D54B43D2D
	for ; Sun, 28 Nov 2004 12:14:15 +0000 (GMT)
	(envelope-from tomas@zvala.cz)
Received: (qmail 90906 invoked by uid 0); 28 Nov 2004 12:14:14 -0000
Received: from b13.dkm.cz (HELO ?192.168.0.2?) (62.24.66.13)
	by smtp.dkm.cz with SMTP; 28 Nov 2004 12:14:14 -0000
Message-ID: <41A9C110.9050205@zvala.cz>
Date: Sun, 28 Nov 2004 13:14:08 +0100
From: Tomas Zvala
User-Agent: Mozilla Thunderbird 0.9 (Windows/20041103)
X-Accept-Language: en-us, en
MIME-Version: 1.0
To: freebsd-geom@freebsd.org
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Subject: geom_mirror performance issues
X-BeenThere: freebsd-geom@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: GEOM-specific discussions and implementations
X-List-Received-Date: Sun, 28 Nov 2004 12:14:16 -0000

Hello,

I've been playing with geom_mirror for a while now and a few issues came to mind.

a) I have two identical drives (Seagate 120GB SATA, 8MB cache, 7.2krpm) that can each read sequentially at 58MB/s at the same time (about 115MB/s combined). But when I have them in geom_mirror I get 30MB/s per drive at best. That's about 60MB/s for the mirror (about half the potential). The throughput is almost the same for both the 'split' and 'load' balancing algorithms, although with the 'load' algorithm it seems that all the reading is being done from just one drive.

b) Pretty often I can see in gstat that both drives are doing the same things (the same number of transactions and the same throughput) but one of them has a significantly higher load (i.e. one 50% and the other one 95%). How is disk load calculated and why does this happen?

c) When I use the 'split' load balancing algorithm, 128kB requests are split into two 64kB requests, making twice as many transactions on the disks. Is it possible to lure FreeBSD into allowing 256kB requests that will get split into two 128kB requests?

d) When I use the round-robin algorithm the performance halves (I get about 20MB/s raw throughput). Why is this? I would expect the round-robin algorithm to be the most effective one for reading, as every drive gets exactly half the load.

e) My last question again concerns 'load' balancing. How often is the switch between drives done? When I set my load balancing to 'load' I get 100% load on one drive and 0% or at most 5% on the other one. Is this intentional? It seems like a bug to me.

The last thing isn't exactly about geom_mirror. I was wondering whether it would be possible to implement some kind of read/write buffering at the GEOM level, working much as read/write buffering works on HW RAID cards. Would it have any effect on performance, or is it just a step in the wrong direction?

Oh, not to forget: I was using a dumb dd if= of=/dev/null bs=1048576 count=10240 to do 'benchmarks' and to study the behaviour of load balancing. Right now I'm trying to get some results from bonnie++ to compare results based on something other than sequential reads.
Thank you for your time,
Tomas Zvala

From owner-freebsd-geom@FreeBSD.ORG Sun Nov 28 16:29:21 2004
Message-ID: <41A9FCDF.3070709@gromit.dlib.vt.edu>
Date: Sun, 28 Nov 2004 11:29:19 -0500
From: Paul Mather
To: Tomas Zvala
cc: freebsd-geom@freebsd.org
References: <41A9C110.9050205@zvala.cz>
In-Reply-To: <41A9C110.9050205@zvala.cz>
Subject: Re: geom_mirror performance issues

Tomas Zvala wrote:

> Hello,
> I've been playing with geom_mirror for a while now and few issues
> came up to my mind.
> a) I have two identical drives (Seagate 120GB SATA 8MB Cache
> 7.2krpm) that are able to sequentially read at 58MB/s at the same time
> (about 115MB/s throughput). But when I have them in geom_mirror I get
> 30MB/s at best. That's about 60MB/s for the mirror (about half the
> potential).
> The throughput is almost the same for both 'split' and
> 'load' balancing algorithms, although with load algorithm it seems that
> all the reading is being done from just one drive.

When I did similar geom_mirror tests (dd'ing large files), I noticed that with the 'load' balance algorithm both drives of my mirror were in fact used, but not simultaneously. One drive would be used exclusively for a few seconds, then it would switch over to the other drive for a few seconds, and then back again. I attributed this behaviour to the 'load' statistics in GEOM perhaps being updated only relatively infrequently (to minimise performance impact), although I don't know if this is the case. The odd thing is that the overall throughput of 'load' was the second-highest of all the balancing algorithms I tried, coming just a little behind the 'split' balance algorithm. (Mind you, this was on a 300 MHz Pentium II system, where the CPU overhead of a transaction is more significant than on a GHz-level system.)

> d) When I use round-robin algorithm the performance halves (I get
> about 20MB/s raw throughput). Why is this? I would expect round-robin
> algorithm to be the most effective one for reading as every drive gets
> exactly half the load.

My speculation is that relative drive head positions might deviate more under 'round-robin', leading to more time lost seeking (which is the dominant cost in disk I/O). I also found 'round-robin' to be relatively slow compared to 'load' and 'split'.

Cheers,

Paul.
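
The 'round-robin' and 'split' policies discussed in this thread can be pictured with a small model. The sketch below is only an illustration under stated assumptions: it is not geom_mirror's code, the function names (round_robin, split, skipped_bytes) are made up for this example, and it assumes a two-disk mirror receiving a purely sequential stream of 128kB requests. It lists which offsets each disk would be asked for and how many bytes lie between one read and the next on the same disk.

```python
# Simplified model (an assumption, not geom_mirror source) of how two
# read-balancing policies spread a sequential read over mirror components.

REQUEST = 128 * 1024  # request size mentioned in the thread


def round_robin(num_requests, disks=2, size=REQUEST):
    """Whole requests alternate between disks.

    Returns one list of (offset, length) tuples per disk.
    """
    plan = [[] for _ in range(disks)]
    for i in range(num_requests):
        plan[i % disks].append((i * size, size))
    return plan


def split(num_requests, disks=2, size=REQUEST):
    """Each request is cut into equal slices, one slice per disk."""
    plan = [[] for _ in range(disks)]
    piece = size // disks
    for i in range(num_requests):
        for d in range(disks):
            plan[d].append((i * size + d * piece, piece))
    return plan


def skipped_bytes(reads):
    """Bytes lying between the end of one read and the start of the next."""
    return sum(b[0] - (a[0] + a[1]) for a, b in zip(reads, reads[1:]))


if __name__ == "__main__":
    for name, policy in (("round-robin", round_robin), ("split", split)):
        for disk, reads in enumerate(policy(4)):
            offsets_kb = [off // 1024 for off, _ in reads]
            print(name, "disk%d" % disk, offsets_kb,
                  "skips", skipped_bytes(reads) // 1024, "kB")
```

In this model neither disk ever sees a contiguous stream: under round-robin each disk reads every other 128kB request, and under split each disk handles twice as many transactions of half the size. Whether the gaps actually cost a seek on real hardware depends on the drive, so the model says nothing about measured throughput.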

From owner-freebsd-geom@FreeBSD.ORG Sun Nov 28 20:44:59 2004
Date: Sun, 28 Nov 2004 21:44:57 +0100
From: Pawel Jakub Dawidek
To: Tomas Zvala
cc: freebsd-geom@freebsd.org
Message-ID: <20041128204457.GG7232@darkness.comp.waw.pl>
References: <41A9C110.9050205@zvala.cz>
In-Reply-To: <41A9C110.9050205@zvala.cz>
Subject: Re: geom_mirror performance issues

On Sun, Nov 28, 2004 at 01:14:08PM +0100, Tomas Zvala wrote:
+> Hello,
+> I've been playing with geom_mirror for a while now and few issues
+> came up to my mind.
+> a) I have two identical drives (Seagate 120GB SATA 8MB Cache
+> 7.2krpm) that are able to sequentially read at 58MB/s at the same time
+> (about 115MB/s throughput). But when I have them in geom_mirror I get
+> 30MB/s at best.
+> That's about 60MB/s for the mirror (about half the
+> potential). The throughput is almost the same for both 'split' and 'load'
+> balancing algorithms although with load algorithm it seems that all the
+> reading is being done from just one drive.

Think about how a mirror works. When you do a sequential read you get something like this:

	# dd if=/dev/mirror/foo of=/dev/null bs=128k

	disk0		disk1
	(offset)	(offset)
	0		128k
	256k		384k

Now, try to write a program which reads every second sector from the disk. You will not get more than your 30MB/s. This is not a stripe. The time spent moving the head from offset 128k (after reading the first 128kB) to 256k costs the same as reading that data.

You should try /usr/src/tools/tools/raidtest/, which does random I/Os.

+> b) Pretty often I can see in gstat that both drives are doing the
+> same things (the same number of transactions and same throughput) but one
+> of them has significantly higher load (i.e. one 50% and the other one 95%).
+> How is disk load calculated and why does this happen?

Are you using the round-robin algorithm? I can't reproduce it here. I see ~50% busy on both components.

+> c) When I use 'split' load balancing algorithm, 128kB requests are
+> split into two 64kB requests making twice as many transactions on the
+> disks. Is it possible to lure fbsd into allowing 256kB requests that
+> will get split into two 128kB requests?

You can try changing MAXPHYS in param.h and recompiling your kernel, but I've no idea if this will "just work".

+> d) When I use round-robin algorithm the performance halves (I get
+> about 20MB/s raw throughput). Why is this? I would expect round-robin
+> algorithm to be the most effective one for reading as every drive gets
+> exactly half the load.

Repeat your tests with random I/Os.

+> e) My last question again goes with the 'load' balancing. How often
+> is switch between drives done?
+> When I set my load balancing to 'load' I get
+> 100% load on one drive and 0% or at most 5% on the other one. Is this
+> intentional? Seems like a bug to me.

Again, try with random reading/writing.

--
Pawel Jakub Dawidek                       http://www.FreeBSD.org
pjd@FreeBSD.org                           http://garage.freebsd.pl
FreeBSD committer                         Am I Evil? Yes, I Am!

From owner-freebsd-geom@FreeBSD.ORG Mon Nov 29 11:46:30 2004
Date: Mon, 29 Nov 2004 12:47:40 +0100
From: Terje Elde
To: freebsd-geom@freebsd.org
Message-ID: <20041129114740.GD90910@calleigh.elde.net>
Subject: mirror handling of broken harddrives

Hi,

So far I'm really happy with geom, but there is an issue I'm curious about.

Given faulty hardware, resulting in:

ad0: TIMEOUT - READ_DMA retrying (2 retries left) LBA=35170616
ad0: TIMEOUT - READ_DMA retrying (1 retry left) LBA=35170616
ad0: FAILURE - READ_DMA timed out

How would this be handled by geom_mirror if the drive was part of a two-plex volume? Would geom_mirror automatically recover by reading from the other plex? Would it mark the broken plex as down, and not use/trust it?

Also, when using this drive, all disk I/O (against the same disk?) freezes while it's trying to complete these reads. Would geom_mirror be able to recover more quickly from this, by satisfying the request from the other plex?

Thanks for any input,
Terje Elde

From owner-freebsd-geom@FreeBSD.ORG Mon Nov 29 11:57:22 2004
Date: Mon, 29 Nov 2004 12:57:17 +0100
From: "Simon L. Nielsen"
To: Terje Elde
cc: freebsd-geom@freebsd.org
Message-ID: <20041129115717.GA753@zaphod.nitro.dk>
References: <20041129114740.GD90910@calleigh.elde.net>
In-Reply-To: <20041129114740.GD90910@calleigh.elde.net>
Subject: Re: mirror handling of broken harddrives

On 2004.11.29 12:47:40 +0100, Terje Elde wrote:

> Given faulty hardware, resulting in:
>
> ad0: TIMEOUT - READ_DMA retrying (2 retries left) LBA=35170616
> ad0: TIMEOUT - READ_DMA retrying (1 retry left) LBA=35170616
> ad0: FAILURE - READ_DMA timed out
>
> How would this be handled by geom_mirror if it was a part of a two-plex
> volume?

I can show you from my log how it handles it :-) :

ad0: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=152911
ad0: FAILURE - WRITE_DMA timed out
GEOM_MIRROR: Request failed (error=5). ad0[WRITE(offset=78290432, length=16384)]
GEOM_MIRROR: Device boot: provider ad0 disconnected.
GEOM_MIRROR: Device boot: provider ad0 detected.
GEOM_MIRROR: Device boot: rebuilding provider ad0.
GEOM_MIRROR: Device boot: rebuilding provider ad0 finished.
GEOM_MIRROR: Device boot: provider ad0 activated.

So, it just works :-). Actually, the mirror in question has rebuilt itself like that many times, and I only noticed after quite a while :-).

I had more or less the same thing (the ata errors) happen on another system, which uses ata(4) for RAID1, and there I had to manually rebuild the mirror...

geom_mirror has just worked great for me, thanks Pawel! :-)

--
Simon L. Nielsen

From owner-freebsd-geom@FreeBSD.ORG Mon Nov 29 13:34:11 2004
Date: Mon, 29 Nov 2004 14:35:21 +0100
From: Terje Elde
To: "Simon L. Nielsen"
cc: freebsd-geom@freebsd.org
Message-ID: <20041129133521.GE90910@calleigh.elde.net>
References: <20041129114740.GD90910@calleigh.elde.net> <20041129115717.GA753@zaphod.nitro.dk>
In-Reply-To: <20041129115717.GA753@zaphod.nitro.dk>
Subject: Re: mirror handling of broken harddrives

On Mon, Nov 29, 2004 at 12:57:17PM +0100, Simon L. Nielsen wrote:
> > Given faulty hardware, resulting in:
> >
> > ad0: TIMEOUT - READ_DMA retrying (2 retries left) LBA=35170616
> > ad0: TIMEOUT - READ_DMA retrying (1 retry left) LBA=35170616
> > ad0: FAILURE - READ_DMA timed out
> >
> > How would this be handled by geom_mirror if it was a part of a two-plex
> > volume?
>
> I can show you from my log how it handles it :-) :
>
> ad0: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=152911
> ad0: FAILURE - WRITE_DMA timed out
> GEOM_MIRROR: Request failed (error=5). ad0[WRITE(offset=78290432, length=16384)]
> GEOM_MIRROR: Device boot: provider ad0 disconnected.
> GEOM_MIRROR: Device boot: provider ad0 detected.
> GEOM_MIRROR: Device boot: rebuilding provider ad0.
> GEOM_MIRROR: Device boot: rebuilding provider ad0 finished.
> GEOM_MIRROR: Device boot: provider ad0 activated.
>
> So, it just works :-).

Hmm, yes. But does it work the way it should?

I mean, if the reason for the problem is that the ad0 hardware is defective, then you'll try to rebuild it again and again. Every time you try to write a defective sector, the disk will block.

Since this is a DMA timeout, I imagine the blocking can quite easily lead to other problems as well, for things sharing the same DMA channel.

> geom_mirror has just worked great for me, thanks Pawel! :-)

Works great for me too. I'm nothing but happy.

Terje

From owner-freebsd-geom@FreeBSD.ORG Mon Nov 29 20:53:50 2004
Date: Mon, 29 Nov 2004 21:53:49 +0100
From: "Simon L. Nielsen"
To: Terje Elde
cc: freebsd-geom@freebsd.org
Message-ID: <20041129205349.GC753@zaphod.nitro.dk>
References: <20041129114740.GD90910@calleigh.elde.net> <20041129115717.GA753@zaphod.nitro.dk> <20041129133521.GE90910@calleigh.elde.net>
In-Reply-To: <20041129133521.GE90910@calleigh.elde.net>
Subject: Re: mirror handling of broken harddrives

On 2004.11.29 14:35:21 +0100, Terje Elde wrote:
> On Mon, Nov 29, 2004 at 12:57:17PM +0100, Simon L. Nielsen wrote:
> > ad0: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=152911
> > ad0: FAILURE - WRITE_DMA timed out
> > GEOM_MIRROR: Request failed (error=5). ad0[WRITE(offset=78290432, length=16384)]
> > GEOM_MIRROR: Device boot: provider ad0 disconnected.
> > GEOM_MIRROR: Device boot: provider ad0 detected.
> > GEOM_MIRROR: Device boot: rebuilding provider ad0.
> > GEOM_MIRROR: Device boot: rebuilding provider ad0 finished.
> > GEOM_MIRROR: Device boot: provider ad0 activated.
> >
> > So, it just works :-).
>
> Hmm, yes. But does it work the way it should?
>
> I mean, if the reason for the problem is that the hardware ad0 is defect, then
> you'll try to rebuild it again and again. Every time you're trying to write a
> defect sector, the disk will block.

That has not been the case for me. The system has run without problems for a couple of months with those rebuilds. It's two old 10GB IBM disks which are only used for the root file system (since my SATA and RAID controllers conflict, so I can't boot from them).

Since newer ATA drivers have remapping of bad sectors, if there really is a problem with a bad sector that causes an error, rebuilding the array should cause the bad sector to be remapped.

It should also be noted that two SATA disks in the same system, running another mirror, have not had any of those self-rebuilds.

> Since this is a DMA timeout, I imagine the blocking can quite easily lead to
> other problems as well, for things sharing the same DMA channel.

I don't really think the timeout is related to DMA, just the disk not responding for whatever reason (it could be doing thermal recalibration). The disks are too old to have proper SMART support, so I can't see if there are actual errors.

--
Simon L. Nielsen

From owner-freebsd-geom@FreeBSD.ORG Wed Dec 1 19:28:45 2004
Message-ID: <41AE1B68.5040003@fer.hr>
Date: Wed, 01 Dec 2004 20:28:40 +0100
From: Ivan Voras
To: freebsd-geom@freebsd.org
Subject: More geom classes?

I'm very happy with the existing GEOM RAID classes, everything from gconcat to graid3. So happy, in fact, that I'd very gladly see more of them, for example RAID5, or even something like a variation of the current graid3 with huge stripe sizes instead of 512-byte blocks :)

Are there any plans for such things?

From owner-freebsd-geom@FreeBSD.ORG Thu Dec 2 14:04:07 2004
Date: Thu, 2 Dec 2004 15:05:27 +0100
From: Terje Elde
To: freebsd-geom@freebsd.org
Message-ID: <20041202140527.GT72822@calleigh.elde.net>
References: <41AE1B68.5040003@fer.hr>
In-Reply-To: <41AE1B68.5040003@fer.hr>
Subject: Re: More geom classes?

On Wed, Dec 01, 2004 at 08:28:40PM +0100, Ivan Voras wrote:
> I'm very happy with existing GEOM RAID classes, everything from gconcat
> to graid3. So happy infact, that I'd very gladly see more of them, for
> example RAID5, or even something like a variation of current graid3,
> with huge stripe sizes instead of 512byte blocks :)

One thing that would be fun is GEOM XOR, with the possibility of setting up two (or more) volumes such that you write random bytes to one drive, and write the data XORed with the random bytes to the other.

Introduce RAID3 or 5 and add an extra disk, and you've got N-1 of M redundancy in your heavily encrypted volume. ;)

Terje

From owner-freebsd-geom@FreeBSD.ORG Thu Dec 2 15:59:44 2004
Message-ID: <41AF3BE9.8050108@fer.hr>
Date: Thu, 02 Dec 2004 16:59:37 +0100
From: Ivan Voras
To: freebsd-geom@freebsd.org
References: <41AE1B68.5040003@fer.hr> <20041202140527.GT72822@calleigh.elde.net>
In-Reply-To: <20041202140527.GT72822@calleigh.elde.net>
Subject: Re: More geom classes?

Terje Elde wrote:

> One thing that would be funny is GEOM XOR, with the possability of setting up
> two (or more) volumes, such that you write random bytes to one drive, and
> write the data xored with the random to the other.

I could easily do that with ggate for fun, if somebody will use it :) (Of course, performance will probably suck, it being in userland...)

> Introduce RAID3 or 5 and add an extra disk, and you've for N-1 of M redundancy
> in your heavily encrypted volume. ;)

Oh yes :)

A password will still be required, for generating the random sequence...

From owner-freebsd-geom@FreeBSD.ORG Thu Dec 2 16:06:32 2004
Date: Thu, 2 Dec 2004 17:07:52 +0100
From: Terje Elde
To: Ivan Voras
cc: freebsd-geom@freebsd.org
Message-ID: <20041202160752.GV72822@calleigh.elde.net>
References: <41AE1B68.5040003@fer.hr> <20041202140527.GT72822@calleigh.elde.net> <41AF3BE9.8050108@fer.hr>
In-Reply-To: <41AF3BE9.8050108@fer.hr>
Subject: Re: More geom classes?

On Thu, Dec 02, 2004 at 04:59:37PM +0100, Ivan Voras wrote:
> >One thing that would be funny is GEOM XOR, with the possability of setting
> >up two (or more) volumes, such that you write random bytes to one drive,
> >and write the data xored with the random to the other.
>
> I could easily do that with ggate for fun, if somebody'll use it :) (Of
> course, performance will probably suck, it being in userland...)

ggate would be one option, but it'd be much nicer to have it as a 'real' geom module.

> >Introduce RAID3 or 5 and add an extra disk, and you've for N-1 of M
> >redundancy in your heavily encrypted volume. ;)
>
> Oh yes :)
>
> A password will still be required, for generating the random sequence...

*cringe*

The only point of using such an XOR is to end up with an effective OTP (One Time Pad). If you use a password as the seed for a simple PRNG, then you're throwing away all the gain, and would be better off with GEOM BDE instead.

FreeBSD 5 has a seemingly very good yarrow-based entropy source. Why not use that?

The only known perfect encryption algorithm is OTP, assuming your input is perfectly random. If you use a seeded PRNG, then you'd end up reducing the security to that of a regular stream cipher.

Terje

From owner-freebsd-geom@FreeBSD.ORG Thu Dec 2 16:16:20 2004
Message-ID: <41AF3FCE.1030405@fer.hr>
Date: Thu, 02 Dec 2004 17:16:14 +0100
From: Ivan Voras
To: freebsd-geom@freebsd.org
Subject: Re: More geom classes?

Terje Elde wrote:

> ggate would be one option, but it'd be much nicer to have it as a
> 'real' geom module.
It would, but I don't know enough to make a kernel module. >> A password will still be required, for generating the random sequence... > > > > *cringe* > > The only point of using such a XOR is to end up with an effective OTP (One > Time Pad). If you use a password as seed for a simple PRNG, then you're > throwing away all the gain, and would be better off with GEOM BDE instead. > > FreeBSD 5 has a seemingly very good yarrow-based entropy source. Why not use > that? > I think I misunderstood something. Do you propose this (for 2 disks): for each block to be written: a) generate a block of random data b) write random data to first disk c) write random data xor user data to second disk So, as long as any person has both disks, the data can be recovered. Where's the security in that? From owner-freebsd-geom@FreeBSD.ORG Thu Dec 2 17:24:12 2004 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 9857116A4CE for ; Thu, 2 Dec 2004 17:24:12 +0000 (GMT) Received: from foo.nemo-project.org (foo.nemo-project.org [194.54.103.89]) by mx1.FreeBSD.org (Postfix) with ESMTP id 4ABD543D45 for ; Thu, 2 Dec 2004 17:24:12 +0000 (GMT) (envelope-from terje+geom@elde.net) Received: by foo.nemo-project.org (Postfix, from userid 1001) id AE8C6D9091; Thu, 2 Dec 2004 18:25:34 +0100 (CET) Date: Thu, 2 Dec 2004 18:25:34 +0100 From: Terje Elde To: Ivan Voras Message-ID: <20041202172534.GW72822@calleigh.elde.net> References: <41AF3FCE.1030405@fer.hr> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <41AF3FCE.1030405@fer.hr> User-Agent: Mutt/1.5.4i cc: freebsd-geom@freebsd.org Subject: Re: More geom classes? 
X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 02 Dec 2004 17:24:12 -0000 On Thu, Dec 02, 2004 at 05:16:14PM +0100, Ivan Voras wrote: > I think I misunderstood something. Do you propose this (for 2 disks): > > for each block to be written: > a) generate a block of random data > b) write random data to first disk > c) write random data xor user data to second disk > > So, as long as any person has both disks, the data can be recovered. > Where's the security in that? That you have a filesystem that's not readable unless you have both disks. Typical usage would naturally be for two people to hold one disk each, bringing the disks together only when the filesystem should be accessible. A simple use-case could be using the filesystem to store CA root keys on. The filesystem would thus only be available when both (or all, or N of M) trusted people cooperate in making it available. Pendrives and similar storage could be useful.
Terje From owner-freebsd-geom@FreeBSD.ORG Thu Dec 2 18:36:01 2004 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 5B7EF16A4CE for ; Thu, 2 Dec 2004 18:36:01 +0000 (GMT) Received: from mail3.speakeasy.net (mail3.speakeasy.net [216.254.0.203]) by mx1.FreeBSD.org (Postfix) with ESMTP id CD1CC43D46 for ; Thu, 2 Dec 2004 18:36:00 +0000 (GMT) (envelope-from jmg@hydrogen.funkthat.com) Received: (qmail 1420 invoked from network); 2 Dec 2004 18:36:00 -0000 Received: from gate.funkthat.com (HELO hydrogen.funkthat.com) ([69.17.45.168]) (envelope-sender ) by mail3.speakeasy.net (qmail-ldap-1.03) with SMTP for ; 2 Dec 2004 18:36:00 -0000 Received: from hydrogen.funkthat.com (jsbapq@localhost.funkthat.com [127.0.0.1])iB2IZxGH096732; Thu, 2 Dec 2004 10:35:59 -0800 (PST) (envelope-from jmg@hydrogen.funkthat.com) Received: (from jmg@localhost) by hydrogen.funkthat.com (8.12.10/8.12.10/Submit) id iB2IZxu6096731; Thu, 2 Dec 2004 10:35:59 -0800 (PST) Date: Thu, 2 Dec 2004 10:35:59 -0800 From: John-Mark Gurney To: Ivan Voras Message-ID: <20041202183559.GH19624@funkthat.com> References: <41AF3FCE.1030405@fer.hr> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <41AF3FCE.1030405@fer.hr> User-Agent: Mutt/1.4.1i X-Operating-System: FreeBSD 4.2-RELEASE i386 X-PGP-Fingerprint: B7 EC EF F8 AE ED A7 31 96 7A 22 B3 D8 56 36 F4 X-Files: The truth is out there X-URL: http://resnet.uoregon.edu/~gurney_j/ X-Resume: http://resnet.uoregon.edu/~gurney_j/resume.html cc: freebsd-geom@freebsd.org Subject: Re: More geom classes? 
X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list Reply-To: John-Mark Gurney List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 02 Dec 2004 18:36:01 -0000 Ivan Voras wrote this message on Thu, Dec 02, 2004 at 17:16 +0100: > I think I misunderstood something. Do you propose this (for 2 disks): > > for each block to be written: > a) generate a block of random data > b) write random data to first disk > c) write random data xor user data to second disk > > So, as long as any person has both disks, the data can be recovered. > Where's the security in that? No, the point is to take say, a CDROM which you have preloaded with pure random data, i.e. burncd /dev/random, then you create a proper sized partition, then using gxor you meld the two... Then for any read/write requests, you take the data, read from the OTP, xor the data, and pass it on... Then when you go away, you take the cdrom, w/o it, there is no data... I like the idea, and it would be a perfect project from someone who is learning geom... -- John-Mark Gurney Voice: +1 415 225 5579 "All that I will do, has been done, All that I have, has not." 
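A minimal sketch of the XOR scheme discussed in this thread, for readers who want to see the arithmetic. It is written in Python purely for illustration and is not the gxor class or any GEOM API. The names split_block and join_block are hypothetical, and os.urandom stands in for the kernel entropy source mentioned earlier. It shows the per-write variant (a fresh pad for every block); the preloaded-pad variant differs only in reading the pad instead of generating it.

```python
import os

def split_block(data):
    """Return (pad, masked): a fresh random pad and the data XORed with it.

    Hypothetical helper: each half would go to a different disk.
    """
    pad = os.urandom(len(data))  # new random bytes for every write
    masked = bytes(p ^ d for p, d in zip(pad, data))
    return pad, masked

def join_block(pad, masked):
    """Recover the original data; both halves are required."""
    if len(pad) != len(masked):
        raise ValueError("pad and masked block must be the same length")
    return bytes(p ^ m for p, m in zip(pad, masked))

# One 512-byte block, padded with zero bytes for the example.
block = b"CA root key material".ljust(512, b"\0")
disk_a, disk_b = split_block(block)
recovered = join_block(disk_a, disk_b)
print(recovered == block)
```

Either half alone is just random-looking bytes, which is the property the two-person use case relies on. The sketch ignores the power-failure case where only one half reaches its disk.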
From owner-freebsd-geom@FreeBSD.ORG Thu Dec 2 19:19:59 2004 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id A357616A4CE for ; Thu, 2 Dec 2004 19:19:59 +0000 (GMT) Received: from darkness.comp.waw.pl (darkness.comp.waw.pl [195.117.238.136]) by mx1.FreeBSD.org (Postfix) with ESMTP id CBD4443D31 for ; Thu, 2 Dec 2004 19:19:58 +0000 (GMT) (envelope-from pjd@darkness.comp.waw.pl) Received: by darkness.comp.waw.pl (Postfix, from userid 1009) id 842A0ACAF8; Thu, 2 Dec 2004 20:19:54 +0100 (CET) Date: Thu, 2 Dec 2004 20:19:54 +0100 From: Pawel Jakub Dawidek To: John-Mark Gurney Message-ID: <20041202191954.GE813@darkness.comp.waw.pl> References: <41AF3FCE.1030405@fer.hr> <20041202183559.GH19624@funkthat.com> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="MIdTMoZhcV1D07fI" Content-Disposition: inline In-Reply-To: <20041202183559.GH19624@funkthat.com> User-Agent: Mutt/1.4.2i X-PGP-Key-URL: http://people.freebsd.org/~pjd/pjd.asc X-OS: FreeBSD 5.2.1-RC2 i386 cc: Ivan Voras cc: freebsd-geom@freebsd.org Subject: Re: More geom classes? X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 02 Dec 2004 19:19:59 -0000 --MIdTMoZhcV1D07fI Content-Type: text/plain; charset=iso-8859-2 Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Thu, Dec 02, 2004 at 10:35:59AM -0800, John-Mark Gurney wrote: +> Ivan Voras wrote this message on Thu, Dec 02, 2004 at 17:16 +0100: +> > I think I misunderstood something. 
Do you propose this (for 2 disks): +> > +> > for each block to be written: +> > a) generate a block of random data +> > b) write random data to first disk +> > c) write random data xor user data to second disk +> > +> > So, as long as any person has both disks, the data can be recovered. +> > Where's the security in that? +> +> No, the point is to take say, a CDROM which you have preloaded with pure +> random data, i.e. burncd /dev/random, then you create a proper sized +> partition, then using gxor you meld the two... +> +> Then for any read/write requests, you take the data, read from the OTP, +> xor the data, and pass it on... Then when you go away, you take the +> cdrom, w/o it, there is no data... +> +> I like the idea, and it would be a perfect project from someone who is +> learning geom... I was thinking about a similar thing, as we use similar mechanisms at work to share a secret between a few smart cards. I'm also not sure if a CD-ROM with static random data will be safe enough. I want to generate random data before every write, XOR the data with the generated random data, and write both. It should also be faster, as I don't need to read random data first. It could be less safe from a data integrity point of view in case of a power failure, when a write request reaches only one component. We can also implement both :) I think I can do it quite fast. -- Pawel Jakub Dawidek http://www.wheel.pl pjd@FreeBSD.org http://www.FreeBSD.org FreeBSD committer Am I Evil? Yes, I Am!
--MIdTMoZhcV1D07fI Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.4 (FreeBSD) iD8DBQFBr2raForvXbEpPzQRArfiAJwK2ZX4qKhXzTmL8IaUCJmihVwMCACglm3d Iw2c3KV6qbBMeBrrLCJzWFc= =wcth -----END PGP SIGNATURE----- --MIdTMoZhcV1D07fI-- From owner-freebsd-geom@FreeBSD.ORG Fri Dec 3 00:30:41 2004 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 1E4E416A4CE; Fri, 3 Dec 2004 00:30:41 +0000 (GMT) Received: from foo.nemo-project.org (foo.nemo-project.org [194.54.103.89]) by mx1.FreeBSD.org (Postfix) with ESMTP id 60A7F43D39; Fri, 3 Dec 2004 00:30:40 +0000 (GMT) (envelope-from terje+geom@elde.net) Received: by foo.nemo-project.org (Postfix, from userid 1001) id A7873D9091; Fri, 3 Dec 2004 01:32:03 +0100 (CET) Date: Fri, 3 Dec 2004 01:32:03 +0100 From: Terje Elde To: Pawel Jakub Dawidek Message-ID: <20041203003203.GX72822@calleigh.elde.net> References: <41AF3FCE.1030405@fer.hr> <20041202183559.GH19624@funkthat.com> <20041202191954.GE813@darkness.comp.waw.pl> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20041202191954.GE813@darkness.comp.waw.pl> User-Agent: Mutt/1.5.4i cc: John-Mark Gurney cc: Ivan Voras cc: freebsd-geom@freebsd.org Subject: Re: More geom classes? X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 03 Dec 2004 00:30:41 -0000 On Thu, Dec 02, 2004 at 08:19:54PM +0100, Pawel Jakub Dawidek wrote: > +> I like the idea, and it would be a perfect project from someone who is > +> learning geom... I was thinking of doing this as an exercise for myself, though I don't think I'd be able to find time for it any time soon.
> I'm also not sure if CD-ROM with static random data will be safe enough. > I want to generate random data before every write, xor data with generated > random data and write both. It should also be faster, as I don't need to > read random data first. It could be less safe from data integrity point > of view in case of a power failure, when write request reach only one > component. Part of the point of a one time pad is to use it one time. ;) Granted, the chance of someone sniffing the traffic is less relevant for this case than it would be in say, network traffic, but still. > We can also implement both:) > > I think, I can do it quite fast. :) Terje From owner-freebsd-geom@FreeBSD.ORG Fri Dec 3 21:17:24 2004 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id A6A0816A4CE for ; Fri, 3 Dec 2004 21:17:24 +0000 (GMT) Received: from darkness.comp.waw.pl (darkness.comp.waw.pl [195.117.238.136]) by mx1.FreeBSD.org (Postfix) with ESMTP id 4AA8A43D45 for ; Fri, 3 Dec 2004 21:17:24 +0000 (GMT) (envelope-from pjd@darkness.comp.waw.pl) Received: by darkness.comp.waw.pl (Postfix, from userid 1009) id 66873ACAF8; Fri, 3 Dec 2004 22:17:21 +0100 (CET) Date: Fri, 3 Dec 2004 22:17:21 +0100 From: Pawel Jakub Dawidek To: John-Mark Gurney Message-ID: <20041203211721.GG813@darkness.comp.waw.pl> References: <41AF3FCE.1030405@fer.hr> <20041202183559.GH19624@funkthat.com> <20041202191954.GE813@darkness.comp.waw.pl> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="PpAOPzA3dXsRhoo+" Content-Disposition: inline In-Reply-To: <20041202191954.GE813@darkness.comp.waw.pl> User-Agent: Mutt/1.4.2i X-PGP-Key-URL: http://people.freebsd.org/~pjd/pjd.asc X-OS: FreeBSD 5.2.1-RC2 i386 cc: Ivan Voras cc: freebsd-geom@freebsd.org Subject: Re: More geom classes? 
X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 03 Dec 2004 21:17:24 -0000 --PpAOPzA3dXsRhoo+ Content-Type: text/plain; charset=iso-8859-2 Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Thu, Dec 02, 2004 at 08:19:54PM +0100, Pawel Jakub Dawidek wrote: +> I was thinking about a similar thing, as we use similar mechanisms at work +> to share a secret between a few smart cards. +> +> I'm also not sure if CD-ROM with static random data will be safe enough. +> I want to generate random data before every write, xor data with generated +> random data and write both. It should also be faster, as I don't need to +> read random data first. It could be less safe from data integrity point +> of view in case of a power failure, when write request reach only one +> component. +> +> We can also implement both:) +> +> I think, I can do it quite fast. I already did it, btw. It is in Perforce: pjd_geom_classes. -- Pawel Jakub Dawidek http://www.wheel.pl pjd@FreeBSD.org http://www.FreeBSD.org FreeBSD committer Am I Evil? Yes, I Am!
--PpAOPzA3dXsRhoo+ Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.4 (FreeBSD) iD8DBQFBsNfhForvXbEpPzQRAqDUAKCO/T0nvaoHQL5cBvdvRGFw48x+/gCdFyk2 0XKTDxinpKK9yKmBr0KDNbM= =xpoD -----END PGP SIGNATURE----- --PpAOPzA3dXsRhoo+-- From owner-freebsd-geom@FreeBSD.ORG Sat Dec 4 16:07:27 2004 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 5BF6016A4CE; Sat, 4 Dec 2004 16:07:27 +0000 (GMT) Received: from gromit.dlib.vt.edu (gromit.dlib.vt.edu [128.173.49.29]) by mx1.FreeBSD.org (Postfix) with ESMTP id CBECA43D53; Sat, 4 Dec 2004 16:07:26 +0000 (GMT) (envelope-from paul@gromit.dlib.vt.edu) Received: from zappa.Chelsea-Ct.Org (pool-151-199-90-129.roa.east.verizon.net [151.199.90.129]) by gromit.dlib.vt.edu (8.13.1/8.13.1) with ESMTP id iB4G7M1r066706 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK); Sat, 4 Dec 2004 11:07:23 -0500 (EST) (envelope-from paul@gromit.dlib.vt.edu) Received: from zappa.Chelsea-Ct.Org (localhost.Chelsea-Ct.Org [127.0.0.1]) by zappa.Chelsea-Ct.Org (8.13.1/8.13.1) with ESMTP id iB4G7GPX028769 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK); Sat, 4 Dec 2004 11:07:16 -0500 (EST) (envelope-from paul@gromit.dlib.vt.edu) Received: (from paul@localhost) by zappa.Chelsea-Ct.Org (8.13.1/8.13.1/Submit) id iB4G7Fdh028752; Sat, 4 Dec 2004 11:07:15 -0500 (EST) (envelope-from paul@gromit.dlib.vt.edu) X-Authentication-Warning: zappa.Chelsea-Ct.Org: paul set sender to paul@gromit.dlib.vt.edu using -f From: Paul Mather To: freebsd-geom@freebsd.org Content-Type: text/plain Content-Transfer-Encoding: 7bit Date: Sat, 04 Dec 2004 11:07:13 -0500 Message-Id: <1102176433.85167.15.camel@zappa.Chelsea-Ct.Org> Mime-Version: 1.0 X-Mailer: Evolution 2.0.2 FreeBSD GNOME Team Port cc: le@freebsd.org Subject: Any chance of "gvinum setstate" for RELENG_5? 
X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 04 Dec 2004 16:07:27 -0000 This morning, whilst checking my e-mail, I noticed that logcheck had apprised me of the following: ===== Unusual System Events =-=-=-=-=-=-=-=-=-=-= Dec 4 08:01:28 handle kernel: ad1: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=1581375 Dec 4 08:01:29 handle kernel: ad1: FAILURE - WRITE_DMA timed out Dec 4 08:01:29 handle kernel: GEOM_VINUM: subdisk swap.p1.s0 is down Dec 4 08:01:29 handle kernel: GEOM_VINUM: plex swap.p1 is down Dec 4 08:01:29 handle kernel: GEOM_VINUM: subdisk root.p1.s0 is down Dec 4 08:01:29 handle kernel: GEOM_VINUM: plex root.p1 is down Dec 4 08:01:29 handle kernel: GEOM_VINUM: subdisk var.p1.s0 is down Dec 4 08:01:29 handle kernel: GEOM_VINUM: plex var.p1 is down Dec 4 08:01:29 handle kernel: GEOM_VINUM: subdisk usr.p1.s0 is down Dec 4 08:01:29 handle kernel: GEOM_VINUM: plex usr.p1 is down ===== This has happened before. Unfortunately, ad1 is not actually "down" as "gvinum list" reports---it's just "down" as far as gvinum is concerned. Sadly, without "setstate," attempts to get the drive recognised as "up" by gvinum seem to result in a panic/reboot/fsck (or at least appeared to the last time this happened). In fact, just rebooting the system with a drive marked "down" appeared to precipitate a panic the last time I had to do it. (I don't know for certain, as the machine is far away in a machine room, but some kind of abnormal shutdown occurred that necessitated a fsck of all filesystems on reboot.) So, is there any chance of getting "setstate" supported under RELENG_5 (which this machine runs), so that I can "gvinum setstate up " to get geom_vinum to believe the drive is alive and hence not panic when I then do a "gvinum start "? 
(The minor irritation of the "FAILURE - WRITE_DMA timed out" becomes an annoyance when it means I have to reboot to get the "failed" drive recognised again. Those errors in the ATA system never happened under 5.1; they seemed to creep in with 5.2 and have remained ever since. Unfortunately, the way this machine has been assembled, I can't "atacontrol detach" individual drives to try and get geom_vinum [hopefully] to see them reappear that way.) In other words, "setstate" support---even just limited to drives---would be a big help! Cheers, Paul. -- e-mail: paul@gromit.dlib.vt.edu "Without music to decorate it, time is just a bunch of boring production deadlines or dates by which bills must be paid." --- Frank Vincent Zappa From owner-freebsd-geom@FreeBSD.ORG Sat Dec 4 19:01:36 2004 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 9F3F616A4CE; Sat, 4 Dec 2004 19:01:36 +0000 (GMT) Received: from gromit.dlib.vt.edu (gromit.dlib.vt.edu [128.173.49.29]) by mx1.FreeBSD.org (Postfix) with ESMTP id 5025043D58; Sat, 4 Dec 2004 19:01:36 +0000 (GMT) (envelope-from paul@gromit.dlib.vt.edu) Received: from zappa.Chelsea-Ct.Org (pool-151-199-90-129.roa.east.verizon.net [151.199.90.129]) by gromit.dlib.vt.edu (8.13.1/8.13.1) with ESMTP id iB4J1YWD067118 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK); Sat, 4 Dec 2004 14:01:35 -0500 (EST) (envelope-from paul@gromit.dlib.vt.edu) Received: from zappa.Chelsea-Ct.Org (localhost.Chelsea-Ct.Org [127.0.0.1]) by zappa.Chelsea-Ct.Org (8.13.1/8.13.1) with ESMTP id iB4J1SDH020653 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK); Sat, 4 Dec 2004 14:01:29 -0500 (EST) (envelope-from paul@gromit.dlib.vt.edu) Received: (from paul@localhost) by zappa.Chelsea-Ct.Org (8.13.1/8.13.1/Submit) id iB4J1RSS020652; Sat, 4 Dec 2004 14:01:27 -0500 (EST) (envelope-from paul@gromit.dlib.vt.edu) 
X-Authentication-Warning: zappa.Chelsea-Ct.Org: paul set sender to paul@gromit.dlib.vt.edu using -f From: Paul Mather To: freebsd-geom@freebsd.org Content-Type: text/plain Content-Transfer-Encoding: 7bit Date: Sat, 04 Dec 2004 14:01:26 -0500 Message-Id: <1102186886.20507.31.camel@zappa.Chelsea-Ct.Org> Mime-Version: 1.0 X-Mailer: Evolution 2.0.2 FreeBSD GNOME Team Port cc: le@freebsd.org Subject: Is there any way to throttle back plex reconstruction with geom_vinum? X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 04 Dec 2004 19:01:36 -0000 In other words, is there anything analogous to the geom_mirror sysctls "kern.geom.mirror.syncs_per_sec" and "kern.geom.mirror.reqs_per_sync" for geom_vinum? The reason I ask is because when I just tried rebuilding my "usr" plex (upon which /usr resides) my system became completely unresponsive over the network. The ssh session I was using to kick off the "gvinum start usr.p1" stalled dead. Attempts to slogin from another machine hung. Attempts to use other network daemons remotely failed. (The system is a nameserver, and DNS lookups against it failed---not good!) The system still responded to pings, indicating it was alive and not completely hung. Eventually, it returned to the network. 
In my logs, I noticed the following: Dec 4 11:43:17 handle kernel: GEOM_VINUM: plex sync usr.p0 -> usr.p1 started Dec 4 11:43:17 handle kernel: GEOM_VINUM: sd usr.p1.s0 is initializing Dec 4 11:43:17 handle kernel: GEOM_VINUM: plex usr.p1 is degraded Dec 4 12:35:07 handle kernel: OM_VINUM: plex request failed for gvinum/plex/usr.p1[READ(offset=1162870784, length=16384)] Dec 4 12:35:07 handle kernel: GEOM_VINUM: plex request failed for gvinum/plex/usr.p1[READ(offset=1162870784, length=16384)] Dec 4 12:35:07 handle last message repeated 346 times Dec 4 12:35:07 handle kernel: GEOM_VINUM: plex usr.p1 is up Dec 4 12:35:08 handle kernel: GEOM_VINUM: plex sync usr.p0 -> usr.p1 finished and also this: Dec 4 12:35:08 handle sm-mta[428]: rejecting connections on daemon IPv4: load average: 20 Dec 4 12:35:08 handle sm-mta[428]: rejecting connections on daemon IPv6: load average: 20 Dec 4 12:35:08 handle sm-mta[428]: rejecting connections on daemon MSA: load average: 20 Dec 4 12:35:23 handle sm-mta[428]: rejecting connections on daemon IPv4: load average: 16 Dec 4 12:35:23 handle sm-mta[428]: rejecting connections on daemon IPv6: load average: 16 Dec 4 12:35:23 handle sm-mta[428]: rejecting connections on daemon MSA: load average: 16 Dec 4 12:35:38 handle sm-mta[428]: accepting connections again for daemon IPv4 Dec 4 12:35:38 handle sm-mta[428]: accepting connections again for daemon IPv6 Dec 4 12:35:38 handle sm-mta[428]: accepting connections again for daemon MSA (Note the load average.) I'm not sure what the "plex request failed" thing is all about (usr.p1 was the plex being reconstructed). What units are being used in the "READ(offset=1162870784, length=16384)" error? (And why read from the plex being reconstructed? Is this to verify the data are written correctly?)
The "gvinum printconfig" output of the subdisk pertaining to that plex is as follows: sd name usr.p1.s0 drive hardy len 47360177s driveoffset 2621424s plex usr.p1 plexoffset 0s If the units are bytes, that would place the error somewhere near the start of the plex. So, it worries me a little that the plex is marked up at the same time the READ error is reported and reconstruction is declared complete a second later. How can I be sure reconstruction was completed successfully? Are the timestamps just a side-effect of how syslogd does its logging? I.e., are the timestamps generated when syslogd actually gets to write to the log file (possibly with delayed access due to disk starvation caused by the flat-out geom_vinum reconstruction underway, hence lots of things logged with near-identical times when the drive is no longer monopolised), or when the logged message is received by syslogd? Cheers, Paul. PS: smartctl reports the health of the drive as okay. -- e-mail: paul@gromit.dlib.vt.edu "Without music to decorate it, time is just a bunch of boring production deadlines or dates by which bills must be paid." --- Frank Vincent Zappa
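The "near the start of the plex" estimate above can be checked with a few lines of arithmetic. This is only a sketch under two assumptions that the message itself leaves open: that the offset and length in the READ error are in bytes, and that the "s" suffix in the printconfig output means 512-byte sectors. The variable names are illustrative.

```python
SECTOR_SIZE = 512  # assumption: "s" in printconfig means 512-byte sectors

offset_bytes = 1162870784    # from READ(offset=..., length=...), assumed bytes
length_bytes = 16384
plex_len_sectors = 47360177  # "len 47360177s" of subdisk usr.p1.s0

# Convert the failing request to sectors and locate it within the plex.
offset_sectors, remainder = divmod(offset_bytes, SECTOR_SIZE)
length_sectors = length_bytes // SECTOR_SIZE
fraction = offset_sectors / plex_len_sectors

print("offset: sector %d (remainder %d bytes)" % (offset_sectors, remainder))
print("length: %d sectors" % length_sectors)
print("position: %.1f%% into the plex" % (fraction * 100))
```

If those assumptions hold, the failing request starts on a sector boundary (sector 2271232, 32 sectors long) a little under 5% of the way into the plex, which agrees with the estimate in the message. It says nothing about whether the reconstruction actually completed.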