From owner-freebsd-geom@FreeBSD.ORG Sun Mar 22 01:16:45 2009 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 22F991065676 for ; Sun, 22 Mar 2009 01:16:45 +0000 (UTC) (envelope-from rizzo.unipi@gmail.com) Received: from mail-ew0-f171.google.com (mail-ew0-f171.google.com [209.85.219.171]) by mx1.freebsd.org (Postfix) with ESMTP id 2F5908FC26 for ; Sun, 22 Mar 2009 01:16:43 +0000 (UTC) (envelope-from rizzo.unipi@gmail.com) Received: by ewy19 with SMTP id 19so1017671ewy.43 for ; Sat, 21 Mar 2009 18:16:43 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:sender:received:in-reply-to :references:date:x-google-sender-auth:message-id:subject:from:to:cc :content-type:content-transfer-encoding; bh=rW4K9XL/+gOKcYQTmqG+qLWLjyqViIcFS5zbTqwgauE=; b=ViqYzDmKMuGsa/4gNylPLyNoPsUy2Lf06e+7/2qvVth+9vu5TCo1LmOXNZQo8BAgOP qZy0HhFezc9Xs91FiW4DIf00KhjyB+SSH9x8q6n/xWl4pIzj4qPTaQYWgBiBV5UQn+rW STYNx9s7z6bdnVAddVtqSwZ+7N4jh4ThoE0/E= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type :content-transfer-encoding; b=EdM3GV2Lo3PfgYMD17CqxYLyxvwmDh30x9nU/nyG7ShIBePK6YvdWnhVTI2GbJNXUx ROo88PsweO/w+OkpuBpv8HQ+R2QS0j4YP7PdVbdIViQY4UQRVDO+DweRfAXtuSzjvybl 79jtQlrrmG7s7W7DCABNdyr5QwAkK1VgsTDIw= MIME-Version: 1.0 Sender: rizzo.unipi@gmail.com Received: by 10.210.57.12 with SMTP id f12mr4176008eba.22.1237684246139; Sat, 21 Mar 2009 18:10:46 -0700 (PDT) In-Reply-To: <42965.1237667050@critter.freebsd.dk> References: <20090321200334.GB3102@garage.freebsd.pl> <42965.1237667050@critter.freebsd.dk> Date: Sun, 22 Mar 2009 02:10:46 +0100 X-Google-Sender-Auth: 51416657876adaf2 Message-ID: From: Luigi Rizzo To: Poul-Henning Kamp Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 
quoted-printable Cc: luigi@freebsd.org, Pawel Jakub Dawidek , Ivan Voras , freebsd-geom@freebsd.org Subject: Re: RFC: adding 'proxy' nodes to provider ports (with patch) X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 22 Mar 2009 01:16:45 -0000

On Sat, Mar 21, 2009 at 9:24 PM, Poul-Henning Kamp wrote:
> In message <20090321200334.GB3102@garage.freebsd.pl>, Pawel Jakub Dawidek writes:
>
>>       Special GEOM classes.
>>       ---------------------
>>
>>       - There are no special GEOM classes.
>>
>>I wonder if phk changed his opinion over time. :)
>
> He didn't.
>
>>Maybe instead of adding special providers and GEOM classes, the
>>infrastructure should be extended in some way, so that we won't use
>>provider term to describe something that isn't really a regular GEOM
>>provider.
>
> I have not had time to read this entire thread, being somewhat
> snowed under with work elsewhere.
...
> With that said: I always envisioned the ability to insert and
> delete transparent nodes, with the poster boy example being:
>
>        insert a mirror geom
>        add a mirror on some other provider
>        sync them.
>        delete the old mirror copy
>        pull the mirror geom out again
>
> and (tada!) you have migrated a live partition from one disk to
> another.
>
> For that to work, the new class has to end up between the
> consumer(s) and the geom-class, and I generally planned to
> stick a {geom-consumer-provider} combination in between
> the provider and its class, rather than a {provider-geom-consumer}
> between the consumer and its provider.
>
> The reason for this, is that it can be done without stalling
> the I/O stream since bios all have built in return tickets.
>
> So I think, my opinion on this proposal is:
> ...
>
> 2. There still are not, and should not be created any special GEOM
>    classes.  GEOM derives much of its strength from the fact that
>    there are no special cases to handle, that shouldn't be sold
>    too cheaply.
>
> 3. Do it properly instead: Implement the general insert/remove
>    properly, so that we can do things like the "move" example above.
>
> Poul-Henning
>
> --
> Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
> phk@FreeBSD.ORG         | TCP/IP since RFC 956
> FreeBSD committer       | BSD since 4.3-tahoe
> Never attribute to malice what can adequately be explained by incompetence.
>

With the scheduling issues hopefully addressed in the other email/thread:
the only thing we asked in this thread is whether a transparent
insert/remove in GEOM is already possible, or whether it must be implemented.

It looks like we are in the latter case, so one option we suggested
(and implemented) was to stick "something" between the provider and
its class, with this "something" being a regular geom class.

http://info.iet.unipi.it/~luigi/FreeBSD/20090319-geom-proxy.patch

This seems to be almost (see [1]) perfectly in line with your suggestion
above, does not cause deviations from the model, and does not introduce
special classes (see [2]). The only thing we need is adding two pointers
to decouple the provider from its geom.

I'd love to know if a better way exists, maybe the behaviour described
in note [1] below is what you had in mind ?

[1]: The way I read your sentence

        ... stick a {geom-consumer-provider} combination in between
        the provider and its class,

is the following: take the existing provider "pp" attached to geom "gp"
and make it the provider of the new geom "new_gp". Then create a new
provider, "new_pp", link it to "gp", and link the consumer of "new_gp"
to "new_pp". So we have the following (each node is in square brackets):

  BEFORE ---> [ pp --> gp ... ]
  AFTER  ---> [ pp --> new_gp --> new_cp ] ---> [ new_pp --> gp ... ]

On removal, relink "pp" to "gp" and destroy all the new_* stuff.
This should save the extra pointers in the struct g_provider, and is
perhaps not much harder to implement than what we did ?

[2] the GEOM_PROXY flag that we suggested is just an optimization to
avoid calling taste() on a provider that nobody should be interested
in attaching to. I think its presence does not change the model,
but nothing bad happens if we don't use this flag.

How does it sound now ?

cheers
luigi

From owner-freebsd-geom@FreeBSD.ORG Sun Mar 22 01:29:16 2009 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 13606106564A for ; Sun, 22 Mar 2009 01:29:16 +0000 (UTC) (envelope-from rizzo.unipi@gmail.com) Received: from mail-ew0-f171.google.com (mail-ew0-f171.google.com [209.85.219.171]) by mx1.freebsd.org (Postfix) with ESMTP id 71AF48FC12 for ; Sun, 22 Mar 2009 01:29:15 +0000 (UTC) (envelope-from rizzo.unipi@gmail.com) Received: by ewy19 with SMTP id 19so1018959ewy.43 for ; Sat, 21 Mar 2009 18:29:14 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:sender:received:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type :content-transfer-encoding; bh=zHCVz9V/JxjU7UsKUPU6dq7qa4iK5/ESOPql9qGbiNU=; b=dk/xIqM9ll/5Acg9MxoOJepEpBhC+qLRtY6qP9dMfl7B7gfA/XaNNZPxzsZG0S1NA0 IYtsL5JU45rR3iI59F38speYKxY6GMKlLRAgC9DvR3vO4m2KJ47StcyPgeRhP4kCAWk8 k7/iFtCve9goTZoIvhGxBzcZ5o4usR2uSMhZ8= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:sender:date:x-google-sender-auth:message-id:subject :from:to:cc:content-type:content-transfer-encoding; b=m1EbUKJlWK38+tk0MZ1eyE5jI64XdjEphBsW/8Iq4fiXATchtmfvFBZ2U+/ItGjOLn 73pBQfA9HYamRd1TkmTa9lz9inUcJmIlwNPiOf18BWSQ66vBmWKt04dujcdiJfU4Vcia DV2+I6csEh6Ryuq8Ict/Ls6dynLYORaASACYw=
MIME-Version: 1.0 Sender: rizzo.unipi@gmail.com Received: by 10.210.53.5 with SMTP id b5mr1023483eba.90.1237683660209; Sat, 21 Mar 2009 18:01:00 -0700 (PDT) Date: Sun, 22 Mar 2009 02:00:59 +0100 X-Google-Sender-Auth: 1d44b7d43bbdda99 Message-ID: From: Luigi Rizzo To: Poul-Henning Kamp Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Cc: luigi@freebsd.org, Pawel Jakub Dawidek , Ivan Voras , freebsd-geom@freebsd.org Subject: disk scheduling (was: Re: RFC: adding 'proxy' nodes to provider ports (with patch)) X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 22 Mar 2009 01:29:16 -0000

On Sat, Mar 21, 2009 at 9:24 PM, Poul-Henning Kamp wrote:
> In message <20090321200334.GB3102@garage.freebsd.pl>, Pawel Jakub Dawidek writes:
>
>>       Special GEOM classes.
>>       ---------------------
>>
>>       - There are no special GEOM classes.
>>
>>I wonder if phk changed his opinion over time. :)
>
> He didn't.
>
>>Maybe instead of adding special providers and GEOM classes, the
>>infrastructure should be extended in some way, so that we won't use
>>provider term to describe something that isn't really a regular GEOM
>>provider.
>
> I have not had time to read this entire thread, being somewhat
> snowed under with work elsewhere.
>
> First up, I am not sure I understand why the proxy nodes would
> be the (or even 'a') right solution for I/O scheduling.
>
> In fact, it is not very clear to me at all that scheduling should
> happen inside geom at all.
>
> I would tend to think that it belongs in the device driver, where
> intelligent information about things like tagged queuing abilities
> can be taken into account.
>
> For any kind of scheduling to do anything non-trivial, requests
> need to be piled up so they can be reordered, doing that in
> places where bios don't naturally pile up would require a damn
> good argument and strong numbers to convince me.
>
> Where they already do pile up, the existing disksort mechanism
> and API can be used.  (If you want to mess with the disksort
> *algorithm*, by all means do so, but that should not require
> you to hack up any APIs, apart from the one to select algorithm).

The thread was meant to be on inserting transparent nodes in GEOM.

Scheduling was just an example on where the problem came out, but
since you ask let's take a short diversion (and let me relabel this
thread so we can discuss things separately).

+ nobody objects that the ideal place for scheduling is where
  requests naturally "pile up". Too bad that this ideal
  place is sometimes one we cannot access, i.e. the firmware
  of the disk drive.
+ some scheduling algorithms are "non work conserving", and they
  work by delaying some requests in the hope of saving some seeks.
  They can be very effective (we sent numbers in our previous posting
  in January, but you can look at the literature on anticipatory
  scheduling for more). Because of the way they work, these algorithms
  artificially cause queues to build up.
  As such you can implement them effectively even above the device driver.
+ changing disksort can do some things but not all one would want.
  E.g. if you need to delay requests (as you do in several disk
  schedulers) then you need to interact heavily with the driver, e.g.
  to make sure it does not assume that the scheduler is work-conserving
  (some do, as we found out in the GSoC 2005 work on disk schedulers),
  and to find out which kind of locking to use when it is time to
  reinject delayed requests. So, implementing certain scheduling
  algorithms in the device driver requires specific code in each and
  every driver.
+ of course adding or not a disk scheduler in one's system is completely
  optional, and there is no intention to change any current default.

If you want a quick example of how you can fix some severe problems
with the current disk scheduler even when doing scheduling above the
device driver, try the same experiments we did, first without a
scheduler, then with the geom_sched module that we posted:

1. run a few 'dd' in parallel on top of an ATA or SATA disk,
   and look at the overall throughput with and without scheduler;
2. run a cvs update (or other seeky application) in parallel
   with a sequential dd reader, and look at how slowly 'dd'
   runs without scheduler;
3. run a cvs update (or other seeky application) in parallel
   with a sequential dd writer, and look at how slowly cvs
   goes without scheduler. This is mostly an effect of

Examples #1 and #2 are a direct result of the request patterns issued
by readers, and cannot be fixed with work-conserving changes to disksort.
Readers only have one pending request each, so the disk is doing a seek
on each request, and the throughput degrades heavily. With anticipation,
after one request you give the process a little bit of time to present
another one, so you can serve a short burst of requests from each reader,
boosting both individual and overall throughput.

Example #3 is a result of the "capture effect" of our disksort: writers
have many pending requests and if they are for contiguous blocks, once
one of them is served the disk keeps serving the same process, starving
the others. Here you can do a lot of useful stuff even above the device
driver, e.g. do not serve more than so many contiguous requests in a row.
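The burst argument above can be made concrete with a toy model. This is only a sketch in Python, not kernel code and not the geom_sched implementation: it assumes two sequential readers on distant parts of the disk, ignores timing entirely, and counts a seek whenever the next block served is not the one right after the previous block. The names `count_seeks`, `serve` and `burst` are made up for the illustration. With `burst=1` the disk alternates between readers on every request; a larger burst stands in for the short run of requests that anticipation lets each reader present, and the same bound also caps how long one process can capture the disk.

```python
def count_seeks(order):
    """Count a seek whenever the next block is not prev + 1."""
    seeks = 0
    prev = None
    for block in order:
        if prev is None or block != prev + 1:
            seeks += 1
        prev = block
    return seeks


def serve(streams, burst):
    """Serve up to `burst` consecutive requests from each stream in turn."""
    queues = [list(s) for s in streams]
    order = []
    while any(queues):
        for q in queues:
            order.extend(q[:burst])   # take a bounded run from this stream
            del q[:burst]
    return order


# Two hypothetical sequential readers, far apart on the disk.
reader_a = range(0, 8)
reader_b = range(1000, 1008)

no_anticipation = serve([reader_a, reader_b], burst=1)
with_anticipation = serve([reader_a, reader_b], burst=4)

print(count_seeks(no_anticipation))    # 16: every request needs a seek
print(count_seeks(with_anticipation))  # 4: one seek per burst
```

The model says nothing about how long to wait for the next request, which is where a real anticipatory scheduler spends most of its effort; it only shows why serving bounded bursts changes the seek count.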
cheers
luigi

From owner-freebsd-geom@FreeBSD.ORG Sun Mar 22 08:13:04 2009 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 944C0106566C; Sun, 22 Mar 2009 08:13:04 +0000 (UTC) (envelope-from phk@critter.freebsd.dk) Received: from phk.freebsd.dk (phk.freebsd.dk [130.225.244.222]) by mx1.freebsd.org (Postfix) with ESMTP id 53D8A8FC13; Sun, 22 Mar 2009 08:13:04 +0000 (UTC) (envelope-from phk@critter.freebsd.dk) Received: from critter.freebsd.dk (critter.freebsd.dk [192.168.61.3]) by phk.freebsd.dk (Postfix) with ESMTP id 3B2C43F129; Sun, 22 Mar 2009 08:13:03 +0000 (UTC) Received: from critter.freebsd.dk (localhost [127.0.0.1]) by critter.freebsd.dk (8.14.3/8.14.3) with ESMTP id n2M8D2xO045711; Sun, 22 Mar 2009 08:13:02 GMT (envelope-from phk@critter.freebsd.dk) To: Luigi Rizzo From: "Poul-Henning Kamp" In-Reply-To: Your message of "Sun, 22 Mar 2009 02:00:59 +0100." Date: Sun, 22 Mar 2009 08:13:02 +0000 Message-ID: <45710.1237709582@critter.freebsd.dk> Sender: phk@critter.freebsd.dk Cc: luigi@freebsd.org, Pawel Jakub Dawidek , Ivan Voras , freebsd-geom@freebsd.org Subject: Re: disk scheduling (was: Re: RFC: adding 'proxy' nodes to provider ports (with patch)) X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 22 Mar 2009 08:13:04 -0000

In message , Luigi Rizzo writes:

>The thread was meant to be on inserting transparent nodes in GEOM.
>
>Scheduling was just an example on where the problem came out,

Scheduling is the *only* application I have seen mentioned for
this special case geom construct ?

>+ nobody objects that the ideal place for scheduling is where
>  requests naturally "pile up". Too bad that this ideal
>  place is sometimes one we cannot access, i.e. the firmware
>  of the disk drive.

Do you seriously propose that we could compete in scheduling
quality, with the disk drive's firmware on drives that can have
multiple outstanding requests ?

>+ [anticipatory scheduling]
>  As such you can implement them effectively even above the device driver.

I have yet to see any study propose that they could do any good
inside the geom mesh, as opposed to right in front of the device
driver ?

>+ changing disksort can do some things but not all one would want.
>  [...]

Then the correct answer is to insert a perfectly normal geom
class above the disk drive to implement that. I totally fail
to see what special kind of classes would buy you ?

>if you want a quick example [...]

I know what anticipatory disk-scheduling is, what it does,
and what the downsides of it are. (I also know that with
SSD's it becomes all but pointless).

The question is not if we should improve disksorting, the question
is if we need to hack up GEOM for it.

The answer is "no".

Poul-Henning

--
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk@FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.
From owner-freebsd-geom@FreeBSD.ORG Sun Mar 22 08:22:03 2009 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 941531065672; Sun, 22 Mar 2009 08:22:03 +0000 (UTC) (envelope-from phk@critter.freebsd.dk) Received: from phk.freebsd.dk (phk.freebsd.dk [130.225.244.222]) by mx1.freebsd.org (Postfix) with ESMTP id 554AB8FC1A; Sun, 22 Mar 2009 08:22:03 +0000 (UTC) (envelope-from phk@critter.freebsd.dk) Received: from critter.freebsd.dk (critter.freebsd.dk [192.168.61.3]) by phk.freebsd.dk (Postfix) with ESMTP id 1F37F3F129; Sun, 22 Mar 2009 08:22:02 +0000 (UTC) Received: from critter.freebsd.dk (localhost [127.0.0.1]) by critter.freebsd.dk (8.14.3/8.14.3) with ESMTP id n2M8M1d0045753; Sun, 22 Mar 2009 08:22:01 GMT (envelope-from phk@critter.freebsd.dk) To: Luigi Rizzo From: "Poul-Henning Kamp" In-Reply-To: Your message of "Sun, 22 Mar 2009 02:10:46 +0100." Date: Sun, 22 Mar 2009 08:22:01 +0000 Message-ID: <45752.1237710121@critter.freebsd.dk> Sender: phk@critter.freebsd.dk Cc: luigi@freebsd.org, Pawel Jakub Dawidek , Ivan Voras , freebsd-geom@freebsd.org Subject: Re: RFC: adding 'proxy' nodes to provider ports (with patch) X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 22 Mar 2009 08:22:03 -0000

In message , Luigi Rizzo writes:

> BEFORE ---> [ pp --> gp ... ]
> AFTER  ---> [ pp --> new_gp --> new_cp ] ---> [ new_pp --> gp ... ]

Correct.

There are many reasons for doing it this way, but the two major ones
are:

Providers see essentially one-way traffic (going down), because the
bio's have their return-path recorded (admittedly: for this very
reason), whereas consumers see two way traffic.
If you wanted to substitute another provider, you would have to stall
I/O activity on the consumers in order to get all the pointers set
up right to not derail any bios while doing so.

If instead you insert under the provider, you can hold topology,
fiddle the pointers in the right order, and release topology
all while bios zip up and down over the construction site.

>[2] the GEOM_PROXY flag that we suggested is just an optimization to
> avoid calling taste() on a provider that nobody should be interested
> in attaching to. I think its presence does not change the model,
> but nothing bad happens if we don't use this flag.

You would not call taste() anyway, because all the new stuff is
already open and active.

But you need to add a new g_ctl verb to instantiate a transparent
instance of the class, and this is where you can tell if inserting
a given class is even possible: classes that cannot will error out.

Similarly, you need a verb to remove a transparent geom, which
will fail if the class doesn't understand this, or does not consider
that geom to be transparent.

--
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk@FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.
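The insert-under-the-provider scheme and the role of the recorded return path can be illustrated with a small toy model. This is a sketch in Python under stated assumptions, not GEOM code: `Geom`, `Provider`, `Bio`, `insert_transparent` and `remove_transparent` are hypothetical names, the consumer (`new_cp`) is collapsed into a single `below` link, locking is omitted, and the name `ad0.sched` is invented. The one design point it tries to show is that requests already in flight complete along the path they recorded on the way down, so relinking `pp` does not derail them, and that `pp --> new_gp` is set last so that anyone following `pp` always finds a complete path.

```python
class Geom:
    def __init__(self, name, below=None, transparent_ok=False):
        self.name = name
        self.below = below                  # provider our consumer is attached to
        self.transparent_ok = transparent_ok


class Provider:
    def __init__(self, name, geom):
        self.name = name
        self.geom = geom                    # geom that services this provider


class Bio:
    def __init__(self, geom, parent=None):
        self.geom = geom
        self.parent = parent                # recorded "return ticket"


def send_down(provider):
    """Follow the live topology downwards, recording the way back."""
    bio = None
    while provider is not None:
        bio = Bio(provider.geom, parent=bio)
        provider = provider.geom.below
    return bio


def complete(bio):
    """Travel back up using only the recorded tickets."""
    path = []
    while bio is not None:
        path.append(bio.geom.name)
        bio = bio.parent
    return path


def insert_transparent(pp, new_gp, new_pp_name):
    """Hypothetical 'insert' verb: errors out for classes that cannot do it."""
    if not new_gp.transparent_ok:
        raise ValueError(new_gp.name + " cannot be inserted transparently")
    new_pp = Provider(new_pp_name, pp.geom)  # new_pp --> gp
    new_gp.below = new_pp                    # new_gp --> new_cp --> new_pp
    pp.geom = new_gp                         # pp --> new_gp, done last
    return new_pp


def remove_transparent(pp):
    """Hypothetical 'remove' verb: relink pp to gp and drop the new_* parts."""
    new_gp = pp.geom
    if not new_gp.transparent_ok or new_gp.below is None:
        raise ValueError(new_gp.name + " is not a transparent geom")
    pp.geom = new_gp.below.geom
    new_gp.below = None


disk = Geom("disk")
pp = Provider("ad0", disk)
in_flight = send_down(pp)                    # issued before the insert

sched = Geom("sched", transparent_ok=True)
insert_transparent(pp, sched, "ad0.sched")

print(complete(in_flight))                   # ['disk']: old request unaffected
print(complete(send_down(pp)))               # ['disk', 'sched']: new path in use
remove_transparent(pp)
print(complete(send_down(pp)))               # ['disk']: back to the original
```

Consumers attached to `pp` are never touched in this model, which is the property that avoids stalling I/O above the insertion point.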
From owner-freebsd-geom@FreeBSD.ORG Sun Mar 22 09:51:31 2009 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 3AD54106566B; Sun, 22 Mar 2009 09:51:31 +0000 (UTC) (envelope-from rizzo.unipi@gmail.com) Received: from mail-ew0-f171.google.com (mail-ew0-f171.google.com [209.85.219.171]) by mx1.freebsd.org (Postfix) with ESMTP id 1E4D08FC0A; Sun, 22 Mar 2009 09:51:29 +0000 (UTC) (envelope-from rizzo.unipi@gmail.com) Received: by ewy19 with SMTP id 19so1071394ewy.43 for ; Sun, 22 Mar 2009 02:51:29 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:sender:received:in-reply-to :references:date:x-google-sender-auth:message-id:subject:from:to:cc :content-type:content-transfer-encoding; bh=MBbSsF+IHsEMTzm6OrjBDEK7jtmvEjq3pc8IYVzYBo4=; b=uzE2dn+Bl0ytez+6j1/1IeqE6DNntjlF1GCZZk/LAbJbeGx4vk9fAM3piESK1A20Iq gmNoirKk3QEAUHj1B0cQy27wFTKeakZccYekTizRWSON9vlLOGZ1VNojHu5bRfa3L4gX chwpEYJiRQ7pZJ/jtml9dW59ZF6IVzXVLkVUY= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type :content-transfer-encoding; b=XSZLRaivg+yZIJkw/Eq8lA6eDX/YqeF0tPb+HNm3zT/gbuwPtrdWkIH1v5J4IWvmtc 61x41Z81stkstPlFU9J0bP9zUJTNorf0vhr19iEYZyH6jMtxfYEf7Nb0DZ0oQoXeIfmx 8Q3BjQ1J/onmKzS3tUvOExIRkPRq8AQCXyVHg= MIME-Version: 1.0 Sender: rizzo.unipi@gmail.com Received: by 10.210.19.7 with SMTP id 7mr4434008ebs.15.1237715489217; Sun, 22 Mar 2009 02:51:29 -0700 (PDT) In-Reply-To: <45710.1237709582@critter.freebsd.dk> References: <45710.1237709582@critter.freebsd.dk> Date: Sun, 22 Mar 2009 10:51:29 +0100 X-Google-Sender-Auth: fd3d421c0efbeac8 Message-ID: From: Luigi Rizzo To: Poul-Henning Kamp Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: luigi@freebsd.org, Pawel Jakub Dawidek , Ivan 
Voras , freebsd-geom@freebsd.org Subject: Re: disk scheduling (was: Re: RFC: adding 'proxy' nodes to provider ports (with patch)) X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 22 Mar 2009 09:51:32 -0000

On Sun, Mar 22, 2009 at 9:13 AM, Poul-Henning Kamp wrote:
> In message , Luigi Rizzo writes:
>
>>The thread was meant to be on inserting transparent nodes in GEOM.
>>
>>Scheduling was just an example on where the problem came out,
>
> Scheduling is the *only* application I have seen mentioned for
> this special case geom construct ?

man 4 geom has a section which explicitly mentions this construct,
with the same example that you posted in the thread:

    ...
    SPECIAL TOPOLOGICAL MANEUVERS

    INSERT/DELETE are very special operations which allow a new geom to be
    instantiated between a consumer and a provider attached to each other and
    to remove it again.

    To understand the utility of this, imagine a provider being mounted as a
    file system. Between the DEVFS geom's consumer and its provider we insert
    a mirror module which configures itself with one mirror copy and
    consequently is transparent to the I/O requests on the path. We can now
    configure yet a mirror copy on the mirror geom, request a synchronization,
    and finally drop the first mirror copy. We have now, in essence, moved a
    mounted file system from one disk to another while it was being used. At
    this point the mirror geom can be deleted from the path again; it has
    served its purpose.

>>+ changing disksort can do some things but not all one would want.
>>  [...]
>
> Then the correct answer is to insert a perfectly normal geom
> class above the disk drive to implement that. I totally fail
> to see what special kind of classes would buy you ?
>
>>if you want a quick example [...]
>
> I know what anticipatory disk-scheduling is, what it does,
> and what the downsides of it are. (I also know that with
> SSD's it becomes all but pointless).
>
> The question is not if we should improve disksorting, the question
> is if we need to hack up GEOM for it.
>
> The answer is "no".

Ok good, we are back on track on the geom architecture:
then the question was just whether the INSERT/DELETE mentioned
in the manpage was already supported or not, and how to
implement it in a clean way.

Hopefully the discussion in the main thread now contains
enough detail to do it the right way.

cheers
luigi

From owner-freebsd-geom@FreeBSD.ORG Sun Mar 22 13:02:37 2009 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 7676A1065674; Sun, 22 Mar 2009 13:02:37 +0000 (UTC) (envelope-from ivoras@gmail.com) Received: from mail-fx0-f167.google.com (mail-fx0-f167.google.com [209.85.220.167]) by mx1.freebsd.org (Postfix) with ESMTP id B50428FC14; Sun, 22 Mar 2009 13:02:36 +0000 (UTC) (envelope-from ivoras@gmail.com) Received: by fxm11 with SMTP id 11so1350592fxm.43 for ; Sun, 22 Mar 2009 06:02:35 -0700 (PDT) MIME-Version: 1.0 In-Reply-To: <45710.1237709582@critter.freebsd.dk> References: <45710.1237709582@critter.freebsd.dk> Date: Sun, 22 Mar 2009 14:02:20 +0100 Received: by 10.204.31.101 with SMTP id x37mr2097984bkc.4.1237726955458; Sun, 22 Mar 2009 06:02:35 -0700 (PDT) Message-ID: <9bbcef730903220602q736b96dflab447e2d6d996754@mail.gmail.com> From: Ivan Voras To: Poul-Henning Kamp Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Cc: Pawel Jakub Dawidek , luigi@freebsd.org, Luigi Rizzo , freebsd-geom@freebsd.org Subject: Re: disk scheduling (was: Re: RFC: adding 'proxy' nodes to provider ports (with patch)) X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and
implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 22 Mar 2009 13:02:37 -0000

2009/3/22 Poul-Henning Kamp :
> In message , Luigi Rizzo writes:
>
>>The thread was meant to be on inserting transparent nodes in GEOM.
>>
>>Scheduling was just an example on where the problem came out,
>
> Scheduling is the *only* application I have seen mentioned for
> this special case geom construct ?

I've joined this thread because once upon a time I was working on what
has grown into gjournal, and one aspect of the original project was a
logging "safety net" mode. The idea was to insert this class (or
whatever) just before a file system consumer, then do risky things with
the file system metadata (like fsck-ing a badly damaged file system),
with the option of committing it or rolling it back. It has even grown
into another SoC project.

I see now it doesn't comply with my idea of a "lightweight" proxy (the
first item, about 1:1 mappings) - so proxies look more and more like
they should be classes.

Also, gcache looks like a candidate.
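The "safety net" idea described above (buffer risky writes, then either commit them or roll them back) can be sketched as a tiny overlay. This is a toy Python model and an assumption about the general shape of such a layer, not how gjournal or the SoC project actually work: `SafetyNet` and its methods are invented names, and the backing store is just a list of blocks.

```python
class SafetyNet:
    """Toy overlay: writes are held aside until committed or rolled back."""

    def __init__(self, backing):
        self.backing = backing      # list standing in for the real provider
        self.pending = {}           # block number -> not yet committed data

    def read(self, block):
        # Reads see pending writes first, otherwise the untouched backing data.
        return self.pending.get(block, self.backing[block])

    def write(self, block, data):
        self.pending[block] = data  # the backing store is not modified here

    def commit(self):
        for block, data in self.pending.items():
            self.backing[block] = data
        self.pending.clear()

    def rollback(self):
        self.pending.clear()        # discard every risky change


disk = ["super", "inode", "data"]
net = SafetyNet(disk)

net.write(1, "inode-fixed")         # e.g. a repair attempted by fsck
print(net.read(1), disk[1])         # inode-fixed inode
net.rollback()
print(net.read(1))                  # inode

net.write(2, "data-v2")
net.commit()
print(disk)                         # ['super', 'inode', 'data-v2']
```

Because reads must consult the pending writes, such a layer is not a plain 1:1 pass-through, which matches the observation that it looks more like a regular class than a lightweight proxy.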
From owner-freebsd-geom@FreeBSD.ORG Mon Mar 23 06:02:56 2009 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 57D1C106564A; Mon, 23 Mar 2009 06:02:56 +0000 (UTC) (envelope-from pjd@garage.freebsd.pl) Received: from mail.garage.freebsd.pl (chello087206045082.chello.pl [87.206.45.82]) by mx1.freebsd.org (Postfix) with ESMTP id 80F558FC0A; Mon, 23 Mar 2009 06:02:55 +0000 (UTC) (envelope-from pjd@garage.freebsd.pl) Received: by mail.garage.freebsd.pl (Postfix, from userid 65534) id DD5A645685; Mon, 23 Mar 2009 07:02:53 +0100 (CET) Received: from localhost (chello087206045082.chello.pl [87.206.45.82]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mail.garage.freebsd.pl (Postfix) with ESMTP id E0CFA45683; Mon, 23 Mar 2009 07:02:47 +0100 (CET) Date: Mon, 23 Mar 2009 07:03:25 +0100 From: Pawel Jakub Dawidek To: Poul-Henning Kamp Message-ID: <20090323060325.GN3102@garage.freebsd.pl> References: <45752.1237710121@critter.freebsd.dk> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="2feizKym29CxAecD" Content-Disposition: inline In-Reply-To: <45752.1237710121@critter.freebsd.dk> User-Agent: Mutt/1.4.2.3i X-PGP-Key-URL: http://people.freebsd.org/~pjd/pjd.asc X-OS: FreeBSD 8.0-CURRENT i386 X-Spam-Checker-Version: SpamAssassin 3.0.4 (2005-06-05) on mail.garage.freebsd.pl X-Spam-Level: X-Spam-Status: No, score=-2.6 required=3.0 tests=BAYES_00 autolearn=ham version=3.0.4 Cc: luigi@freebsd.org, Luigi Rizzo , Ivan Voras , freebsd-geom@freebsd.org Subject: Re: RFC: adding 'proxy' nodes to provider ports (with patch) X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 23 
Mar 2009 06:02:56 -0000

--2feizKym29CxAecD
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Sun, Mar 22, 2009 at 08:22:01AM +0000, Poul-Henning Kamp wrote:
> In message , Luigi Rizzo writes:
>
> > BEFORE ---> [ pp --> gp ... ]
> > AFTER  ---> [ pp --> new_gp --> new_cp ] ---> [ new_pp --> gp ... ]
>
> Correct.
>
> There are many reasons for doing it this way, but the two major ones
> are:
>
> Providers see essentially one-way traffic (going down), because the
> bio's have their return-path recorded (admittedly: for this very
> reason), whereas consumers see two way traffic.
>
> If you wanted to substitute another provider, you would have to stall
> I/O activity on the consumers in order to get all the pointers set
> up right to not derail any bios while doing so.
>
> If instead you insert under the provider, you can hold topology,
> fiddle the pointers in the right order, and release topology
> all while bios zip up and down over the construction site.

There is still a naming problem. pp and new_pp will end up with the same
name. I'd suggest instructing GEOM to expose only the parent in /dev/.

> >[2] the GEOM_PROXY flag that we suggested is just an optimization to
> > avoid calling taste() on a provider that nobody should be interested
> > in attaching to. I think its presence does not change the model,
> > but nothing bad happens if we don't use this flag.
>
> You would not call taste() anyway, because all the new stuff is
> already open and active.

The taste is still going to be sent on new class arrival and on the last
pp write close.

> But you need to add a new g_ctl verb to instantiate a transparent
> instance of the class, and this is where you can tell if inserting
> a given class is even possible: classes that cannot will error out.
>
> Similarly, you need a verb to remove a transparent geom, which
> will fail if the class doesn't understand this, or does not consider
> that geom to be transparent.

--
Pawel Jakub Dawidek                       http://www.wheel.pl
pjd@FreeBSD.org                           http://www.FreeBSD.org
FreeBSD committer                         Am I Evil? Yes, I Am!

--2feizKym29CxAecD
Content-Type: application/pgp-signature
Content-Disposition: inline

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.4 (FreeBSD)

iD8DBQFJxyYtForvXbEpPzQRAkWPAKCP61CZ5//Il3peY/pD9Om4aD8Y/wCfYCKS
zy6rU8Ev+ifBrLcPwgE5EPA=
=jen3
-----END PGP SIGNATURE-----

--2feizKym29CxAecD--

From owner-freebsd-geom@FreeBSD.ORG Mon Mar 23 06:58:18 2009 Return-Path: Delivered-To: freebsd-geom@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id D0D6B106566B; Mon, 23 Mar 2009 06:58:18 +0000 (UTC) (envelope-from phk@critter.freebsd.dk) Received: from phk.freebsd.dk (phk.freebsd.dk [130.225.244.222]) by mx1.freebsd.org (Postfix) with ESMTP id 9248B8FC22; Mon, 23 Mar 2009 06:58:18 +0000 (UTC) (envelope-from phk@critter.freebsd.dk) Received: from critter.freebsd.dk (critter.freebsd.dk [192.168.61.3]) by phk.freebsd.dk (Postfix) with ESMTP id DB3963F129; Mon, 23 Mar 2009 06:58:16 +0000 (UTC) Received: from critter.freebsd.dk (localhost [127.0.0.1]) by critter.freebsd.dk (8.14.3/8.14.3) with ESMTP id n2N6wG5p042619; Mon, 23 Mar 2009 06:58:16 GMT (envelope-from phk@critter.freebsd.dk) To: Pawel Jakub Dawidek From: "Poul-Henning Kamp" In-Reply-To: Your message of "Mon, 23 Mar 2009 07:03:25 +0100." <20090323060325.GN3102@garage.freebsd.pl> Date: Mon, 23 Mar 2009 06:58:16 +0000 Message-ID: <42618.1237791496@critter.freebsd.dk> Sender: phk@critter.freebsd.dk Cc: luigi@FreeBSD.org, Luigi Rizzo , Ivan Voras , freebsd-geom@FreeBSD.org Subject: Re: RFC: adding 'proxy' nodes to provider ports (with patch) X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 23 Mar 2009 06:58:19 -0000

In message <20090323060325.GN3102@garage.freebsd.pl>, Pawel Jakub Dawidek writes:

>There is still a naming problem. pp and new_pp will end up with the same
>name. I'd suggest instructing GEOM to expose only the parent in /dev/.

Who said the new provider had to have the same name ?

>The taste is still going to be sent on new class arrival and on the last
>pp write close.

We decide that. Since we are inserting in an already open path, I think
it makes very good sense to suppress tasting, at least until close.

--
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk@FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.
From owner-freebsd-geom@FreeBSD.ORG Mon Mar 23 09:42:21 2009 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id D8418106564A; Mon, 23 Mar 2009 09:42:21 +0000 (UTC) (envelope-from phk@critter.freebsd.dk) Received: from phk.freebsd.dk (phk.freebsd.dk [130.225.244.222]) by mx1.freebsd.org (Postfix) with ESMTP id 99F998FC23; Mon, 23 Mar 2009 09:42:21 +0000 (UTC) (envelope-from phk@critter.freebsd.dk) Received: from critter.freebsd.dk (critter.freebsd.dk [192.168.61.3]) by phk.freebsd.dk (Postfix) with ESMTP id 68FA03F129; Mon, 23 Mar 2009 09:42:20 +0000 (UTC) Received: from critter.freebsd.dk (localhost [127.0.0.1]) by critter.freebsd.dk (8.14.3/8.14.3) with ESMTP id n2N9gJOC043416; Mon, 23 Mar 2009 09:42:20 GMT (envelope-from phk@critter.freebsd.dk) To: Luigi Rizzo From: "Poul-Henning Kamp" In-Reply-To: Your message of "Sun, 22 Mar 2009 10:51:29 +0100." Date: Mon, 23 Mar 2009 09:42:19 +0000 Message-ID: <43415.1237801339@critter.freebsd.dk> Sender: phk@critter.freebsd.dk Cc: luigi@freebsd.org, Pawel Jakub Dawidek , Ivan Voras , freebsd-geom@freebsd.org Subject: Re: disk scheduling (was: Re: RFC: adding 'proxy' nodes to provider ports (with patch)) X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 23 Mar 2009 09:42:22 -0000 In message , Luigi Rizzo writes: >>>Scheduling was just an example on where the problem came out, >> >> Scheduling is the *only* application I have seen mentioned for >> this special case geom construct ? > >man 4 geom has a section which explicitly mentions this construct, >with the same example that you posted in the thread: You will notice that there is no mention of "special classes" or "proxy nodes with special properties". 
If you want to do it, do it right. -- Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 phk@FreeBSD.ORG | TCP/IP since RFC 956 FreeBSD committer | BSD since 4.3-tahoe Never attribute to malice what can adequately be explained by incompetence. From owner-freebsd-geom@FreeBSD.ORG Mon Mar 23 11:06:56 2009 Return-Path: Delivered-To: freebsd-geom@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 452E9106566C for ; Mon, 23 Mar 2009 11:06:56 +0000 (UTC) (envelope-from owner-bugmaster@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 31E258FC28 for ; Mon, 23 Mar 2009 11:06:56 +0000 (UTC) (envelope-from owner-bugmaster@FreeBSD.org) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.3/8.14.3) with ESMTP id n2NB6u2d003996 for ; Mon, 23 Mar 2009 11:06:56 GMT (envelope-from owner-bugmaster@FreeBSD.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.3/8.14.3/Submit) id n2NB6tb6003992 for freebsd-geom@FreeBSD.org; Mon, 23 Mar 2009 11:06:55 GMT (envelope-from owner-bugmaster@FreeBSD.org) Date: Mon, 23 Mar 2009 11:06:55 GMT Message-Id: <200903231106.n2NB6tb6003992@freefall.freebsd.org> X-Authentication-Warning: freefall.freebsd.org: gnats set sender to owner-bugmaster@FreeBSD.org using -f From: FreeBSD bugmaster To: freebsd-geom@FreeBSD.org Cc: Subject: Current problem reports assigned to freebsd-geom@FreeBSD.org X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 23 Mar 2009 11:06:57 -0000 Note: to view an individual PR, use: http://www.freebsd.org/cgi/query-pr.cgi?pr=(number). The following is a listing of current problems submitted by FreeBSD users. 
These represent problem reports covering all versions including experimental development code and obsolete releases. S Tracker Resp. Description -------------------------------------------------------------------------------- o bin/132845 geom [geom] [patch] ggated(8) does not close files opened a o kern/132273 geom glabel(8): [patch] failing on journaled partition o kern/132242 geom [gmirror] gmirror.ko fails to fully initialize o kern/131353 geom [geom] gjournal(8) kernel lock o kern/131037 geom [geli] Unable to create disklabel on .eli-Device o kern/130528 geom gjournal fsck during boot o kern/129674 geom [geom] gjournal root did not mount on boot o kern/129645 geom gjournal(8): GEOM_JOURNAL causes system to fail to boo o kern/129245 geom [geom] gcache is more suitable for suffix based provid o bin/128398 geom [patch] glabel(8): teach geom_label to recognise gpt l f kern/128276 geom [gmirror] machine lock up when gmirror module is used o kern/126902 geom [geom] geom_label: kernel panic during install boot o kern/124973 geom [gjournal] [patch] boot order affects geom_journal con o kern/124969 geom gvinum(8): gvinum raid5 plex does not detect missing s o kern/124294 geom [geom] gmirror(8) have inappropriate logic when workin o kern/124130 geom [gmirror] [usb] gmirror fails to start usb devices tha o kern/123962 geom [panic] [gjournal] gjournal (455Gb data, 8Gb journal), o kern/123630 geom [patch] [gmirror] gmirror doesnt allow the original dr o kern/123122 geom [geom] GEOM / gjournal kernel lock f kern/122415 geom [geom] UFS labels are being constantly created and rem o kern/122067 geom [geom] [panic] Geom crashed during boot o kern/121559 geom [patch] [geom] geom label class allows to create inacc o kern/121364 geom [gmirror] Removing all providers create a "zombie" mir o kern/120231 geom [geom] GEOM_CONCAT error adding second drive o kern/120044 geom [msdosfs] [geom] incorrect MSDOSFS label fries adminis o kern/120021 geom [geom] [panic] net-p2p/qbittorrent 
crashes system when o kern/119743 geom [geom] geom label for cds is keeped after dismount and f kern/115547 geom [geom] [patch] [request] let GEOM Eli get password fro o kern/114532 geom [geom] GEOM_MIRROR shows up in kldstat even if compile o kern/113957 geom [gmirror] gmirror is intermittently reporting a degrad o kern/113837 geom [geom] unable to access 1024 sector size storage o kern/113419 geom [geom] geom fox multipathing not failing back p bin/110705 geom gmirror(8) control utility does not exit with correct o kern/107707 geom [geom] [patch] [request] add new class geom_xbox360 to o kern/104389 geom [geom] [patch] sys/geom/geom_dump.c doesn't encode XML o kern/98034 geom [geom] dereference of NULL pointer in acd_geom_detach o kern/94632 geom [geom] Kernel output resets input while GELI asks for o kern/90582 geom [geom] [panic] Restore cause panic string (ffs_blkfree o bin/90093 geom fdisk(8) incapable of altering in-core geometry a kern/89660 geom [vinum] [patch] [panic] due to g_malloc returning null o kern/89546 geom [geom] GEOM error s kern/89102 geom [geom] [panic] panic when forced unmount FS from unplu o kern/87544 geom [gbde] mmaping large files on a gbde filesystem deadlo o kern/84556 geom [geom] [panic] GBDE-encrypted swap causes panic at shu o kern/79251 geom [2TB] newfs fails on 2.6TB gbde device o kern/79035 geom [vinum] gvinum unable to create a striped set of mirro o bin/78131 geom gbde(8) "destroy" not working. s kern/73177 geom kldload geom_* causes panic due to memory exhaustion 48 problems total. 
From owner-freebsd-geom@FreeBSD.ORG Mon Mar 23 20:02:20 2009 Return-Path: Delivered-To: freebsd-geom@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id D318E1065780; Mon, 23 Mar 2009 20:02:20 +0000 (UTC) (envelope-from luigi@onelab2.iet.unipi.it) Received: from onelab2.iet.unipi.it (onelab2.iet.unipi.it [131.114.9.129]) by mx1.freebsd.org (Postfix) with ESMTP id 5D7EA8FC14; Mon, 23 Mar 2009 20:02:20 +0000 (UTC) (envelope-from luigi@onelab2.iet.unipi.it) Received: by onelab2.iet.unipi.it (Postfix, from userid 275) id 90F76730A1; Mon, 23 Mar 2009 21:07:12 +0100 (CET) Date: Mon, 23 Mar 2009 21:07:12 +0100 From: Luigi Rizzo To: Poul-Henning Kamp Message-ID: <20090323200712.GA28660@onelab2.iet.unipi.it> References: <20090323060325.GN3102@garage.freebsd.pl> <42618.1237791496@critter.freebsd.dk> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <42618.1237791496@critter.freebsd.dk> User-Agent: Mutt/1.4.2.3i Cc: luigi@FreeBSD.org, freebsd-geom@FreeBSD.org, Pawel Jakub Dawidek , fabio@gandalf.sssup.it, Ivan Voras Subject: Re: RFC: adding 'proxy' nodes to provider ports (with patch) X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 23 Mar 2009 20:02:22 -0000 On Mon, Mar 23, 2009 at 06:58:16AM +0000, Poul-Henning Kamp wrote: > In message <20090323060325.GN3102@garage.freebsd.pl>, Pawel Jakub Dawidek write > s: > > >There is still a naming problem. pp and new_pp will end up with the same > >name. I'd suggest instructing GEOM to expose only parent in /dev/. > > who said the new provider had to have same name ? > > >The taste is still going to be send on new class arrival and on the last > >pp write close. > > We decide that. 
>
> Since we are inserting in an already open path, I think it makes very
> good sense to suppress tasting, at least until close.

To summarize, here is how I have implemented a node that
supports both regular "create" and the transparent "insert"
we are discussing.
Say we want to attach to an existing provider "pp" whose name is "ad0"

        BEFORE ---> [ pp --> old_gp ...]

Then we can do either "geom xx create ad0" which results in

        AFTER create ---> [ newpp --> gp --> cp ] ---> [ pp --> old_gp ... ]

or "geom xx insert ad0", which results in

        AFTER insert ---> [ pp --> gp --> cp ] ---> [ newpp --> old_gp ... ]

The names of the various objects are the same in both cases so

        old_gp->name = "ad0"
        pp->name = "ad0"
        gp->name = "ad0.xx."
        newpp->name = "ad0.xx."

This lets new clients connect to provider "ad0" without having to
know about any insertion.
Also, to remove the newly inserted pieces, in both cases you can
run the same command "geom xx destroy ad0.xx." (remembering that
in this case you are naming the geom, not the provider).

In terms of code, no changes to the infrastructure, and the
create/insert and destroy functions are the following (error checking
removed for clarity)

g_xx_create(struct g_provider *pp, struct g_class *mp, int insert ...)
{
        snprintf(name, sizeof(name), "%s%s", pp->name, MY_SUFFIX);
        gp = g_new_geomf(mp, name);
        ... allocate and fill softc and geom...
        newpp = g_new_providerf(insert ? pp->geom : gp, gp->name);
        ... initialize mediasize and sectorsize
        cp = g_new_consumer(gp);
        g_attach(cp, insert ? newpp : pp);
        if (insert) {
                g_cancel_event(newpp);  /* no taste() on this */
                /* move pp from old_gp to gp */
                LIST_REMOVE(pp, provider);
                pp->geom = gp;
                LIST_INSERT_HEAD(&gp->provider, pp, provider);
                g_access(cp, 1, 1, 1);  /* we can move data */
                sc->sc_insert = 1;      /* remember for the destroy */
        }
        g_error_provider(newpp, 0);
}

Here it is a bit inefficient to have to call g_cancel_event()
but short of changing g_new_providerf() there is no way to
avoid the g_new_provider event.
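
The pointer shuffle in the insert branch can be tried outside the kernel.
What follows is only a userland sketch under stated assumptions: the mock_*
structures and functions are hypothetical stand-ins for struct g_geom,
g_provider and g_consumer, they carry just the fields the relinking touches,
and a geom holds a single consumer pointer instead of a list. The topology
lock, the g_attach()/g_access() bookkeeping and all event handling are left
out. mock_remove() puts pp back on the geom that owns newpp, i.e. it reverses
the same steps.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/queue.h>

/* Hypothetical mock types; NOT the kernel's struct g_geom and friends. */
struct mock_geom;

struct mock_provider {
	const char *name;
	struct mock_geom *geom;			/* geom that owns this provider */
	LIST_ENTRY(mock_provider) provider;	/* link in geom->provider */
};

struct mock_consumer {
	struct mock_geom *geom;			/* geom that owns this consumer */
	struct mock_provider *provider;		/* provider we are attached to */
};

struct mock_geom {
	const char *name;
	LIST_HEAD(, mock_provider) provider;	/* providers of this geom */
	struct mock_consumer *consumer;		/* one consumer is enough here */
};

static void
mock_geom_init(struct mock_geom *gp, const char *name)
{
	gp->name = name;
	LIST_INIT(&gp->provider);
	gp->consumer = NULL;
}

static void
mock_add_provider(struct mock_geom *gp, struct mock_provider *pp,
    const char *name)
{
	pp->name = name;
	pp->geom = gp;
	LIST_INSERT_HEAD(&gp->provider, pp, provider);
}

/*
 * BEFORE        ---> [ pp --> old_gp ... ]
 * AFTER insert  ---> [ pp --> gp --> cp ] ---> [ newpp --> old_gp ... ]
 */
static void
mock_insert(struct mock_geom *gp, struct mock_consumer *cp,
    struct mock_provider *pp, struct mock_provider *newpp)
{
	struct mock_geom *old_gp = pp->geom;

	assert(old_gp != NULL && old_gp != gp);
	/* newpp is created on the old geom and named after the new one. */
	mock_add_provider(old_gp, newpp, gp->name);
	/* The new geom's consumer attaches to newpp. */
	cp->geom = gp;
	cp->provider = newpp;
	gp->consumer = cp;
	/* Move pp from old_gp to gp; pp keeps its name. */
	LIST_REMOVE(pp, provider);
	pp->geom = gp;
	LIST_INSERT_HEAD(&gp->provider, pp, provider);
}

/* Undo mock_insert(): hand pp back to the geom that owns newpp. */
static void
mock_remove(struct mock_geom *gp)
{
	struct mock_provider *pp = LIST_FIRST(&gp->provider);
	struct mock_consumer *cp = gp->consumer;
	struct mock_provider *newpp;

	assert(pp != NULL && cp != NULL && cp->provider != NULL);
	newpp = cp->provider;
	LIST_REMOVE(pp, provider);
	pp->geom = newpp->geom;
	LIST_INSERT_HEAD(&pp->geom->provider, pp, provider);
	/* Detach the consumer and drop the helper provider. */
	cp->provider = NULL;
	gp->consumer = NULL;
	LIST_REMOVE(newpp, provider);
	newpp->geom = NULL;
}
```

To exercise it, initialise two geoms with mock_geom_init(), hang pp on the
old one with mock_add_provider(), then call mock_insert() and mock_remove()
and compare pp->geom and the two provider lists before and after.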
g_xx_destroy(struct g_geom *gp)
{
        ...
        if (sc->sc_insert) {
                pp = LIST_FIRST(&gp->provider);
                cp = LIST_FIRST(&gp->consumer);
                newpp = cp->provider;
                /* Link provider to the original geom. */
                LIST_REMOVE(pp, provider);
                pp->geom = newpp->geom;
                LIST_INSERT_HEAD(&pp->geom->provider, pp, provider);
                g_access(cp, -1, -1, -1);
                /* I am not sure if we need the following 3 */
                g_detach(cp);
                LIST_REMOVE(newpp, provider);
                g_destroy_provider(newpp);
        }
        ...
        /* regular destroy path */
}

Above, I am not totally sure if we need to explicitly call g_detach()
and destroy the provider, or if it will come for free as a result of
the regular destroy code.

The block "if (sc->sc_insert) {..}" is reasonably generic
(and large, when you put in the error checking) to possibly deserve
a function in geom_subr.c -- but as long as there are no other clients,
it makes no sense.

As usual, feedback welcome.

        cheers
        luigi

From owner-freebsd-geom@FreeBSD.ORG Mon Mar 23 20:19:20 2009
Return-Path:
Delivered-To: freebsd-geom@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id C39AA106566B; Mon, 23 Mar 2009 20:19:20 +0000 (UTC) (envelope-from pjd@garage.freebsd.pl)
Received: from mail.garage.freebsd.pl (chello087206045082.chello.pl [87.206.45.82]) by mx1.freebsd.org (Postfix) with ESMTP id 1F9868FC21; Mon, 23 Mar 2009 20:19:19 +0000 (UTC) (envelope-from pjd@garage.freebsd.pl)
Received: by mail.garage.freebsd.pl (Postfix, from userid 65534) id 37D0A45C9B; Mon, 23 Mar 2009 21:19:18 +0100 (CET)
Received: from localhost (chello087206045082.chello.pl [87.206.45.82]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mail.garage.freebsd.pl (Postfix) with ESMTP id 7CFE245683; Mon, 23 Mar 2009 21:19:10 +0100 (CET)
Date: Mon, 23 Mar 2009 21:19:48 +0100
From: Pawel Jakub Dawidek
To: Luigi Rizzo
Message-ID: <20090323201948.GA1723@garage.freebsd.pl>
References: <20090323060325.GN3102@garage.freebsd.pl>
<42618.1237791496@critter.freebsd.dk> <20090323200712.GA28660@onelab2.iet.unipi.it> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="LZvS9be/3tNcYl/X" Content-Disposition: inline In-Reply-To: <20090323200712.GA28660@onelab2.iet.unipi.it> User-Agent: Mutt/1.4.2.3i X-PGP-Key-URL: http://people.freebsd.org/~pjd/pjd.asc X-OS: FreeBSD 8.0-CURRENT i386 X-Spam-Checker-Version: SpamAssassin 3.0.4 (2005-06-05) on mail.garage.freebsd.pl X-Spam-Level: X-Spam-Status: No, score=-2.6 required=3.0 tests=BAYES_00 autolearn=ham version=3.0.4 Cc: Ivan Voras , Poul-Henning Kamp , freebsd-geom@FreeBSD.org, luigi@FreeBSD.org, fabio@gandalf.sssup.it Subject: Re: RFC: adding 'proxy' nodes to provider ports (with patch) X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 23 Mar 2009 20:19:21 -0000 --LZvS9be/3tNcYl/X Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Mon, Mar 23, 2009 at 09:07:12PM +0100, Luigi Rizzo wrote: > On Mon, Mar 23, 2009 at 06:58:16AM +0000, Poul-Henning Kamp wrote: > > In message <20090323060325.GN3102@garage.freebsd.pl>, Pawel Jakub Dawid= ek write > > s: > >=20 > > >There is still a naming problem. pp and new_pp will end up with the sa= me > > >name. I'd suggest instructing GEOM to expose only parent in /dev/. > >=20 > > who said the new provider had to have same name ? > >=20 > > >The taste is still going to be send on new class arrival and on the la= st > > >pp write close. > >=20 > > We decide that. > >=20 > > Since we are inserting in an already open path, I think it makes very > > good sense to supress tasting, at least until close. 
>
> To summarize, here is how I have implemented a node that
> supports both regular "create" and the transparent "insert"
> we are discussing.
> Say we want to attach to an existing provider "pp" whose name is "ad0"
>
>         BEFORE ---> [ pp --> old_gp ...]
>
> Then we can do either "geom xx create ad0" which results in
>
>         AFTER create ---> [ newpp --> gp --> cp ] ---> [ pp --> old_gp ... ]
>
> or "geom xx insert ad0", which results in
>
>         AFTER insert ---> [ pp --> gp --> cp ] ---> [ newpp --> old_gp ... ]
>
> The names of the various objects are the same in both cases so
>
>         old_gp->name = "ad0"
>         pp->name = "ad0"
>         gp->name = "ad0.xx."
>         newpp->name = "ad0.xx."
>
> This lets new clients connect to provider "ad0" without having to
> know about any insertion.
> Also, to remove the newly inserted pieces, in both cases you can
> run the same command "geom xx destroy ad0.xx." (remembering that
> in this case you are naming the geom, not the provider).
>
> In terms of code, no changes to the infrastructure, and the
> create/insert and destroy functions are the following (error checking
> removed for clarity)
>
> g_xx_create(struct g_provider *pp, struct g_class *mp, int insert ...)
> {
>         snprintf(name, sizeof(name), "%s%s", pp->name, MY_SUFFIX);
>         gp = g_new_geomf(mp, name);
>         ... allocate and fill softc and geom...
>         newpp = g_new_providerf(insert ? pp->geom : gp, gp->name);
>         ... initialize mediasize and sectorsize
>         cp = g_new_consumer(gp);
>         g_attach(cp, insert ?
newpp : pp);
>         if (insert) {
>                 g_cancel_event(newpp);  /* no taste() on this */
>                 /* move pp from old_gp to gp */
>                 LIST_REMOVE(pp, provider);
>                 pp->geom = gp;

                newpp->private = pp->private;
                pp->private = NULL;
                newpp->index = pp->index;
                pp->index = 0;

>                 LIST_INSERT_HEAD(&gp->provider, pp, provider);
>                 g_access(cp, 1, 1, 1);  /* we can move data */
>                 sc->sc_insert = 1;      /* remember for the destroy */
>         }
>         g_error_provider(newpp, 0);
> }
>
> Here it is a bit inefficient to have to call g_cancel_event()
> but short of changing g_new_providerf() there is no way to
> avoid the g_new_provider event.
>
> g_xx_destroy(struct g_geom *gp)
> {
>         ...
>         if (sc->sc_insert) {
>                 pp = LIST_FIRST(&gp->provider);
>                 cp = LIST_FIRST(&gp->consumer);
>                 newpp = cp->provider;
>                 /* Link provider to the original geom. */
>                 LIST_REMOVE(pp, provider);
>                 pp->geom = newpp->geom;

                pp->private = newpp->private;
                newpp->private = NULL;
                pp->index = newpp->index;
                newpp->index = 0;

>                 LIST_INSERT_HEAD(&pp->geom->provider, pp, provider);
>                 g_access(cp, -1, -1, -1);
>                 /* I am not sure if we need the following 3 */
>                 g_detach(cp);
>                 LIST_REMOVE(newpp, provider);
>                 g_destroy_provider(newpp);
>         }
>         ...
>         /* regular destroy path */
> }
>
> Above, I am not totally sure if we need to explicitly call g_detach()
> and destroy the provider, or if it will come for free as a result of
> the regular destroy code.
>
> The block "if (sc->sc_insert) {..}" is reasonably generic
> (and large, when you put in the error checking) to possibly deserve
> a function in geom_subr.c -- but as long as there are no other clients,
> it makes no sense.
>
> As usual, feedback welcome.

I don't think it is a good idea to try to squeeze creation and insertion
into one function.
IMHO it would be better to have generic functions for
insert/remove functionality:

        int g_insert(struct g_class *class, struct g_provider *oldpp);
        int g_remove(struct g_provider *oldpp);

(In g_insert() the class name can be appended to the new provider's name,
for example.)

-- 
Pawel Jakub Dawidek http://www.wheel.pl
pjd@FreeBSD.org http://www.FreeBSD.org
FreeBSD committer Am I Evil? Yes, I Am!

--LZvS9be/3tNcYl/X
Content-Type: application/pgp-signature
Content-Disposition: inline

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.4 (FreeBSD)

iD8DBQFJx+7kForvXbEpPzQRAluwAJ9K7kpyfB58to9cTtBgLXPXUEkDQwCdFNmR
vKbosqvo+QbJ+mcddARSZRg=
=uK+L
-----END PGP SIGNATURE-----

--LZvS9be/3tNcYl/X--

From owner-freebsd-geom@FreeBSD.ORG Wed Mar 25 22:15:28 2009
Return-Path:
Delivered-To: freebsd-geom@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id B1AFF1065691 for ; Wed, 25 Mar 2009 22:15:28 +0000 (UTC) (envelope-from bzeeb-lists@lists.zabbadoz.net)
Received: from mail.cksoft.de (mail.cksoft.de [195.88.108.3]) by mx1.freebsd.org (Postfix) with ESMTP id 43D248FC27 for ; Wed, 25 Mar 2009 22:15:28 +0000 (UTC) (envelope-from bzeeb-lists@lists.zabbadoz.net)
Received: from localhost (amavis.fra.cksoft.de [192.168.74.71]) by mail.cksoft.de (Postfix) with ESMTP id 54AEC41C6EA; Wed, 25 Mar 2009 23:00:06 +0100 (CET)
X-Virus-Scanned: amavisd-new at cksoft.de
Received: from mail.cksoft.de ([195.88.108.3]) by localhost (amavis.fra.cksoft.de [192.168.74.71]) (amavisd-new, port 10024) with ESMTP id hknk9wliJ1V5; Wed, 25 Mar 2009 23:00:05 +0100 (CET)
Received: by mail.cksoft.de (Postfix, from userid 66) id DDF2141C690; Wed, 25 Mar 2009 23:00:05 +0100 (CET)
Received: from maildrop.int.zabbadoz.net (maildrop.int.zabbadoz.net [10.111.66.10]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mail.int.zabbadoz.net (Postfix) with ESMTP id 7797F4448E6; Wed, 25 Mar 2009 21:57:27 +0000
(UTC)
Date: Wed, 25 Mar 2009 21:57:27 +0000 (UTC)
From: "Bjoern A. Zeeb"
X-X-Sender: bz@maildrop.int.zabbadoz.net
To: Marcel Moolenaar
Message-ID: <20090325214318.Q67075@maildrop.int.zabbadoz.net>
X-OpenPGP-Key: 0x14003F198FEFA3E77207EE8D2B58B8F83CCF1842
MIME-Version: 1.0
Content-Type: MULTIPART/MIXED; BOUNDARY="0-1519328060-1238018247=:67075"
Cc: freebsd-geom@freebsd.org
Subject: gpart on top of eli inside a slice is not working
X-BeenThere: freebsd-geom@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: GEOM-specific discussions and implementations
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Wed, 25 Mar 2009 22:15:29 -0000

This message is in MIME format. The first part should be readable text,
while the remaining parts are likely unreadable without MIME-aware tools.

--0-1519328060-1238018247=:67075
Content-Type: TEXT/PLAIN; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: QUOTED-PRINTABLE

Hi,

assume you get a laptop with the usual pre-install that you must not
and cannot change but still want to add a FreeBSD to the wasteful
emptiness at the end of the large disk.

So you have 3 classic slices:
1 compaq recovery
2 ntfs
3 dos (free)

So I tried to play a bit with that and tried to install FreeBSD on
slice 3 inside eli. To try gjournal as well I thought I would go with
gpart directly, as it will be the tool in the future instead of bsdlabel,
and created 3 partitions: 1 for the journal, 2 for swap and 3 for the data.
All was fine. I rebooted, and there was garbage.

Here's a script to reproduce this on head. It will create a swap backed
memory disk and a $0.key file. If unsure, run the steps by hand to avoid
the script accidentally going wild. It's quickly hacked together so it's
not nice, but it did the job on an 8-current here.

It does not yet create the journal or newfs anything as I tried the
minimum to reproduce this.

Leaving out the fdisk and the s=${md}s3 it will work fine.
So I guess it is a problem of stacking things.

And before you are going to ask - changing slice 3 to 165 (freebsd)
does not change anything.

Any ideas?

----- 8< 8< 8<----------------------------------------------------------
#!/bin/sh

case `id -u` in
0)      ;;
*)      echo "Run as super user" >&2
        exit 1
        ;;
esac

md=`mdconfig -a -t swap -s 32901120 -x 63 -y 64`
case "${md}" in
md*)    echo "Created swapped backed memory disk ${md}" ;;
*)      echo "ERROR creating memory disk">&2
        exit 1
        ;;
esac

echo "creating initial set"
echo " ..fdisk"
# cannot get this to work
fdisk -q -i -f - /dev/${md} <
Delivered-To: freebsd-geom@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 8F6091065674 for ; Thu, 26 Mar 2009 05:09:35 +0000 (UTC) (envelope-from xcllnt@mac.com)
Received: from asmtpout011.mac.com (asmtpout011.mac.com [17.148.16.86]) by mx1.freebsd.org (Postfix) with ESMTP id 7B7B28FC18 for ; Thu, 26 Mar 2009 05:09:35 +0000 (UTC) (envelope-from xcllnt@mac.com)
MIME-version: 1.0
Content-type: text/plain; charset=ISO-8859-1; format=flowed
Received: from [192.168.4.253] (mail.xcllnt.net [75.101.29.67]) by asmtp011.mac.com (Sun Java(tm) System Messaging Server 6.3-8.01 (built Dec 16 2008; 32bit)) with ESMTPSA id <0KH3003PSKZYYT80@asmtp011.mac.com> for freebsd-geom@freebsd.org; Wed, 25 Mar 2009 22:09:35 -0700 (PDT)
Message-id:
From: Marcel Moolenaar
To: "Bjoern A.
Zeeb"
In-reply-to: <20090325214318.Q67075@maildrop.int.zabbadoz.net>
Content-transfer-encoding: quoted-printable
Date: Wed, 25 Mar 2009 22:09:34 -0700
References: <20090325214318.Q67075@maildrop.int.zabbadoz.net>
X-Mailer: Apple Mail (2.930.3)
Cc: freebsd-geom@freebsd.org
Subject: Re: gpart on top of eli inside a slice is not working
X-BeenThere: freebsd-geom@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: GEOM-specific discussions and implementations
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Thu, 26 Mar 2009 05:09:35 -0000

On Mar 25, 2009, at 2:57 PM, Bjoern A. Zeeb wrote:

> Here's a script to reproduce this on head.

First of all: exemplary problem reporting!

> Any ideas?

The probe method of the GPT scheme explicitly disallows nesting.
This is inconsistent with the create method, which happily allows
creating a GPT underneath a MBR.

The bug is in the create method: GPT cannot be created inside a
MBR slice (or any other partitioning for that matter). I'll fix
that shortly.
FYI, --=20 Marcel Moolenaar xcllnt@mac.com From owner-freebsd-geom@FreeBSD.ORG Thu Mar 26 06:30:07 2009 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 7E00D106566B for ; Thu, 26 Mar 2009 06:30:07 +0000 (UTC) (envelope-from bzeeb-lists@lists.zabbadoz.net) Received: from mail.cksoft.de (mail.cksoft.de [195.88.108.3]) by mx1.freebsd.org (Postfix) with ESMTP id 36CDD8FC1A for ; Thu, 26 Mar 2009 06:30:07 +0000 (UTC) (envelope-from bzeeb-lists@lists.zabbadoz.net) Received: from localhost (amavis.fra.cksoft.de [192.168.74.71]) by mail.cksoft.de (Postfix) with ESMTP id D370041C647; Thu, 26 Mar 2009 07:30:05 +0100 (CET) X-Virus-Scanned: amavisd-new at cksoft.de Received: from mail.cksoft.de ([195.88.108.3]) by localhost (amavis.fra.cksoft.de [192.168.74.71]) (amavisd-new, port 10024) with ESMTP id 3g6yyl4tztqy; Thu, 26 Mar 2009 07:30:05 +0100 (CET) Received: by mail.cksoft.de (Postfix, from userid 66) id 6B3D741C670; Thu, 26 Mar 2009 07:30:05 +0100 (CET) Received: from maildrop.int.zabbadoz.net (maildrop.int.zabbadoz.net [10.111.66.10]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mail.int.zabbadoz.net (Postfix) with ESMTP id D07E54448E6; Thu, 26 Mar 2009 06:29:48 +0000 (UTC) Date: Thu, 26 Mar 2009 06:29:48 +0000 (UTC) From: "Bjoern A. 
Zeeb"
X-X-Sender: bz@maildrop.int.zabbadoz.net
To: Marcel Moolenaar
In-Reply-To:
Message-ID: <20090326062604.X67075@maildrop.int.zabbadoz.net>
References: <20090325214318.Q67075@maildrop.int.zabbadoz.net>
X-OpenPGP-Key: 0x14003F198FEFA3E77207EE8D2B58B8F83CCF1842
MIME-Version: 1.0
Content-Type: MULTIPART/MIXED; BOUNDARY="0-1880230994-1238048988=:67075"
Cc: freebsd-geom@freebsd.org
Subject: Re: gpart on top of eli inside a slice is not working
X-BeenThere: freebsd-geom@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: GEOM-specific discussions and implementations
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Thu, 26 Mar 2009 06:30:07 -0000

This message is in MIME format. The first part should be readable text,
while the remaining parts are likely unreadable without MIME-aware tools.

--0-1880230994-1238048988=:67075
Content-Type: TEXT/PLAIN; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: QUOTED-PRINTABLE

On Wed, 25 Mar 2009, Marcel Moolenaar wrote:

>
> On Mar 25, 2009, at 2:57 PM, Bjoern A. Zeeb wrote:
>
>> Here's a script to reproduce this on head.
>
> First of all: exemplary problem reporting!
>
>> Any ideas?
>
> The probe method of the GPT scheme explicitly disallows nesting.
> This is inconsistent with the create method, which happily allows
> creating a GPT underneath a MBR.
>
> The bug is in the create method: GPT cannot be created inside a
> MBR slice (or any other partitioning for that matter). I'll fix
> that shortly.

Well technically it is created inside some random garbage from eli and
not directly inside the MBR slice.

So the only possible solutions for those would be:
1) Somehow convert the entire disk to part and then expose the 3
   freebsd-* partitions and have a dedicated eli inside each.
2) Try (and stick with) bsdlabel on top of the eli inside the MBR
   slice?
So can you explain why there is the restriction that part cannot be used inside a MBR slice or rather somewhere on top of such? /bz --=20 Bjoern A. Zeeb The greatest risk is not taking one. --0-1880230994-1238048988=:67075-- From owner-freebsd-geom@FreeBSD.ORG Thu Mar 26 10:55:51 2009 Return-Path: Delivered-To: geom@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id B6FA61065672 for ; Thu, 26 Mar 2009 10:55:51 +0000 (UTC) (envelope-from luigi@onelab2.iet.unipi.it) Received: from onelab2.iet.unipi.it (onelab2.iet.unipi.it [131.114.9.129]) by mx1.freebsd.org (Postfix) with ESMTP id 80F5D8FC1A for ; Thu, 26 Mar 2009 10:55:51 +0000 (UTC) (envelope-from luigi@onelab2.iet.unipi.it) Received: by onelab2.iet.unipi.it (Postfix, from userid 275) id 4432B73098; Thu, 26 Mar 2009 12:00:48 +0100 (CET) Date: Thu, 26 Mar 2009 12:00:48 +0100 From: Luigi Rizzo To: geom@freebsd.org Message-ID: <20090326110048.GA48516@onelab2.iet.unipi.it> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.4.2.3i Cc: Subject: geom debugging tools ? X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 26 Mar 2009 10:55:52 -0000 do we have a tool that can list all active geoms, providers and consumers ? "geom list" does part of the job, but I don't know how to get the list of available classes. The following trick ls /lib/geom | sed 's/geom_//;s/\.so//' | xargs -n 1 -J % geom % list can only give a partial list of names. 
From owner-freebsd-geom@FreeBSD.ORG Thu Mar 26 11:30:02 2009 Return-Path: Delivered-To: geom@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 5E4F11065673 for ; Thu, 26 Mar 2009 11:30:02 +0000 (UTC) (envelope-from luigi@onelab2.iet.unipi.it) Received: from onelab2.iet.unipi.it (onelab2.iet.unipi.it [131.114.9.129]) by mx1.freebsd.org (Postfix) with ESMTP id 24F0C8FC24 for ; Thu, 26 Mar 2009 11:30:01 +0000 (UTC) (envelope-from luigi@onelab2.iet.unipi.it) Received: by onelab2.iet.unipi.it (Postfix, from userid 275) id 554A7730A1; Thu, 26 Mar 2009 12:34:59 +0100 (CET) Date: Thu, 26 Mar 2009 12:34:59 +0100 From: Luigi Rizzo To: Poul-Henning Kamp Message-ID: <20090326113459.GA50106@onelab2.iet.unipi.it> References: <20090326110048.GA48516@onelab2.iet.unipi.it> <8425.1238066250@critter.freebsd.dk> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <8425.1238066250@critter.freebsd.dk> User-Agent: Mutt/1.4.2.3i Cc: geom@freebsd.org Subject: Re: geom debugging tools ? X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 26 Mar 2009 11:30:03 -0000 On Thu, Mar 26, 2009 at 11:17:30AM +0000, Poul-Henning Kamp wrote: > In message <20090326110048.GA48516@onelab2.iet.unipi.it>, Luigi Rizzo writes: > >do we have a tool that can list all active geoms, providers > >and consumers ? > > > >"geom list" does part of the job, but I don't > >know how to get the list of available classes. 
> > sysctl -b kern.geom.confxml wonderful, thanks From owner-freebsd-geom@FreeBSD.ORG Thu Mar 26 11:34:28 2009 Return-Path: Delivered-To: geom@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 08535106566B for ; Thu, 26 Mar 2009 11:34:28 +0000 (UTC) (envelope-from phk@critter.freebsd.dk) Received: from phk.freebsd.dk (phk.freebsd.dk [130.225.244.222]) by mx1.freebsd.org (Postfix) with ESMTP id C25378FC1B for ; Thu, 26 Mar 2009 11:34:27 +0000 (UTC) (envelope-from phk@critter.freebsd.dk) Received: from critter.freebsd.dk (critter.freebsd.dk [192.168.61.3]) by phk.freebsd.dk (Postfix) with ESMTP id A0D1978CE2; Thu, 26 Mar 2009 11:17:30 +0000 (UTC) Received: from critter.freebsd.dk (localhost [127.0.0.1]) by critter.freebsd.dk (8.14.3/8.14.3) with ESMTP id n2QBHUHY008426; Thu, 26 Mar 2009 11:17:30 GMT (envelope-from phk@critter.freebsd.dk) To: Luigi Rizzo From: "Poul-Henning Kamp" In-Reply-To: Your message of "Thu, 26 Mar 2009 12:00:48 +0100." <20090326110048.GA48516@onelab2.iet.unipi.it> Date: Thu, 26 Mar 2009 11:17:30 +0000 Message-ID: <8425.1238066250@critter.freebsd.dk> Sender: phk@critter.freebsd.dk Cc: geom@freebsd.org Subject: Re: geom debugging tools ? X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 26 Mar 2009 11:34:28 -0000 In message <20090326110048.GA48516@onelab2.iet.unipi.it>, Luigi Rizzo writes: >do we have a tool that can list all active geoms, providers >and consumers ? > >"geom list" does part of the job, but I don't >know how to get the list of available classes. 
sysctl -b kern.geom.confxml -- Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 phk@FreeBSD.ORG | TCP/IP since RFC 956 FreeBSD committer | BSD since 4.3-tahoe Never attribute to malice what can adequately be explained by incompetence. From owner-freebsd-geom@FreeBSD.ORG Thu Mar 26 12:48:54 2009 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 0E7DC106566B for ; Thu, 26 Mar 2009 12:48:54 +0000 (UTC) (envelope-from wilkinsa@stlux550.dsto.defence.gov.au) Received: from digger1.defence.gov.au (digger1.defence.gov.au [203.5.217.4]) by mx1.freebsd.org (Postfix) with ESMTP id 7E2C58FC08 for ; Thu, 26 Mar 2009 12:48:53 +0000 (UTC) (envelope-from wilkinsa@stlux550.dsto.defence.gov.au) Received: from ednmsw510.dsto.defence.gov.au (ednmsw510.dsto.defence.gov.au [131.185.68.11]) by digger1.defence.gov.au (DSTO/DSTO) with ESMTP id n2QCKibc025145 for ; Thu, 26 Mar 2009 22:50:51 +1030 (CST) Received: from ednex510.dsto.defence.gov.au (ednex510.dsto.defence.gov.au) by ednmsw510.dsto.defence.gov.au (Clearswift SMTPRS 5.2.9) with ESMTP id for ; Thu, 26 Mar 2009 22:55:20 +1030 Received: from stlex511.dsto.defence.gov.au ([203.6.60.49]) by ednex510.dsto.defence.gov.au with Microsoft SMTPSVC(6.0.3790.3959); Thu, 26 Mar 2009 22:55:20 +1030 Received: from stlux550.dsto.defence.gov.au ([203.6.60.61]) by stlex511.dsto.defence.gov.au with Microsoft SMTPSVC(6.0.3790.3959); Thu, 26 Mar 2009 21:25:19 +0900 Received: from stlux550.dsto.defence.gov.au (localhost [127.0.0.1]) by stlux550.dsto.defence.gov.au (8.14.3/8.14.3) with ESMTP id n2QBMMKO016408 for ; Thu, 26 Mar 2009 20:22:22 +0900 (WST) (envelope-from wilkinsa@stlux550.dsto.defence.gov.au) Received: (from wilkinsa@localhost) by stlux550.dsto.defence.gov.au (8.14.3/8.14.3/Submit) id n2QBMMte016407 for freebsd-geom@freebsd.org; Thu, 26 Mar 2009 20:22:22 +0900 (WST) (envelope-from wilkinsa) Date: Thu, 26 Mar 2009 20:22:22 
+0900 From: "Wilkinson, Alex" To: freebsd-geom@freebsd.org Message-ID: <20090326112222.GE10080@stlux503.dsto.defence.gov.au> Mail-Followup-To: freebsd-geom@freebsd.org References: <20090326110048.GA48516@onelab2.iet.unipi.it> <8425.1238066250@critter.freebsd.dk> MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Disposition: inline In-Reply-To: <8425.1238066250@critter.freebsd.dk> Organisation: Defence Science Technology Organisation User-Agent: Mutt/1.5.18 (2008-05-17) X-OriginalArrivalTime: 26 Mar 2009 12:25:19.0833 (UTC) FILETIME=[ED733C90:01C9AE0D] X-TM-AS-Product-Ver: SMEX-7.0.0.1584-5.6.1016-16542.003 X-TM-AS-Result: No-1.092100-0.000000-31 Content-Transfer-Encoding: 7bit Subject: Re: geom debugging tools ? X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 26 Mar 2009 12:48:54 -0000 On Thu, Mar 26, 2009 at 11:17:30AM +0000, Poul-Henning Kamp wrote: >sysctl -b kern.geom.confxml Curious, why doesn't 'sysctl -a' display "kern.geom.confxml" ? e.g. #sysctl -a | grep -i kern.geom kern.geom.collectstats: 1 kern.geom.debugflags: 0 kern.geom.label.debug: 0 # -aW IMPORTANT: This email remains the property of the Australian Defence Organisation and is subject to the jurisdiction of section 70 of the CRIMES ACT 1914. If you have received this email in error, you are requested to contact the sender and delete the email.
From owner-freebsd-geom@FreeBSD.ORG Thu Mar 26 13:22:41 2009 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id F3967106574A for ; Thu, 26 Mar 2009 13:22:39 +0000 (UTC) (envelope-from marius@nuenneri.ch) Received: from mail-bw0-f164.google.com (mail-bw0-f164.google.com [209.85.218.164]) by mx1.freebsd.org (Postfix) with ESMTP id A1CFC8FC34 for ; Thu, 26 Mar 2009 13:22:38 +0000 (UTC) (envelope-from marius@nuenneri.ch) Received: by bwz8 with SMTP id 8so515054bwz.43 for ; Thu, 26 Mar 2009 06:22:37 -0700 (PDT) MIME-Version: 1.0 Received: by 10.204.55.200 with SMTP id v8mr295871bkg.54.1238073757477; Thu, 26 Mar 2009 06:22:37 -0700 (PDT) In-Reply-To: <20090326112222.GE10080@stlux503.dsto.defence.gov.au> References: <20090326110048.GA48516@onelab2.iet.unipi.it> <8425.1238066250@critter.freebsd.dk> <20090326112222.GE10080@stlux503.dsto.defence.gov.au> Date: Thu, 26 Mar 2009 14:22:37 +0100 Message-ID: From: =?ISO-8859-1?Q?Marius_N=FCnnerich?= To: freebsd-geom@freebsd.org Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Subject: Re: geom debugging tools ? X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 26 Mar 2009 13:22:42 -0000 On Thu, Mar 26, 2009 at 12:22, Wilkinson, Alex wrote: > > On Thu, Mar 26, 2009 at 11:17:30AM +0000, Poul-Henning Kamp wrote: > > >sysctl -b kern.geom.confxml > > Curious, why doesn't 'sysctl -a' display "kern.geom.confxml" ? e.g. > > #sysctl -a | grep -i kern.geom > kern.geom.collectstats: 1 > kern.geom.debugflags: 0 > kern.geom.label.debug: 0 > # > Because sysctls with binary data are not shown with the -a parameter.
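A minimal sketch of how the confxml output mentioned in this thread could be turned into the list that was asked for (classes, their geoms, providers and consumers). Everything here is illustrative: SAMPLE is a made-up, heavily trimmed document rather than real output, summarize() is a hypothetical helper, and the element names (mesh, class, geom, provider, consumer, name) are my assumption about the confxml layout, to be checked against what `sysctl -b kern.geom.confxml` prints on your system. The trailing NUL is stripped because the kern.geom.conf* variables are reported in this thread to include one.

```python
import xml.etree.ElementTree as ET

# Illustrative sample only (assumed layout); real data would come from
# `sysctl -b kern.geom.confxml`, which ends with a NUL byte.
SAMPLE = b"""<mesh>
  <class id="0x1">
    <name>DISK</name>
    <geom id="0x10">
      <name>ad0</name>
      <provider id="0x11"><name>ad0</name></provider>
    </geom>
  </class>
  <class id="0x2">
    <name>MBR</name>
    <geom id="0x20">
      <name>ad0</name>
      <consumer id="0x21"><provider ref="0x11"/></consumer>
      <provider id="0x22"><name>ad0s1</name></provider>
    </geom>
  </class>
</mesh>\0"""


def summarize(raw):
    """Map class name -> list of (geom name, provider names, consumer count)."""
    # Drop the trailing NUL before handing the text to the XML parser.
    text = raw.rstrip(b"\0").decode("utf-8", "replace")
    mesh = ET.fromstring(text)
    result = {}
    for cls in mesh.findall("class"):
        geoms = []
        for geom in cls.findall("geom"):
            providers = [p.findtext("name") for p in geom.findall("provider")]
            consumers = len(geom.findall("consumer"))
            geoms.append((geom.findtext("name"), providers, consumers))
        result[cls.findtext("name")] = geoms
    return result


if __name__ == "__main__":
    for class_name, geoms in summarize(SAMPLE).items():
        print(class_name)
        for geom_name, providers, consumers in geoms:
            print("  geom", geom_name, "providers", providers,
                  "consumers", consumers)
```

On a FreeBSD box the same function could presumably be fed the bytes returned by subprocess.check_output(["sysctl", "-b", "kern.geom.confxml"]) instead of SAMPLE.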
From owner-freebsd-geom@FreeBSD.ORG Thu Mar 26 16:06:05 2009 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 181FB106564A for ; Thu, 26 Mar 2009 16:06:05 +0000 (UTC) (envelope-from xcllnt@mac.com) Received: from asmtpout018.mac.com (asmtpout018.mac.com [17.148.16.93]) by mx1.freebsd.org (Postfix) with ESMTP id 023908FC08 for ; Thu, 26 Mar 2009 16:06:04 +0000 (UTC) (envelope-from xcllnt@mac.com) MIME-version: 1.0 Content-type: text/plain; charset=ISO-8859-1; format=flowed Received: from agonzales-t60.jnpr.net ([66.129.224.36]) by asmtp018.mac.com (Sun Java(tm) System Messaging Server 6.3-8.01 (built Dec 16 2008; 32bit)) with ESMTPSA id <0KH400A9XFE4PH90@asmtp018.mac.com> for freebsd-geom@freebsd.org; Thu, 26 Mar 2009 09:06:04 -0700 (PDT) Message-id: <1FA0EF30-7FCC-4384-8151-36843EFBE01D@mac.com> From: Marcel Moolenaar To: "Bjoern A. Zeeb" In-reply-to: <20090326062604.X67075@maildrop.int.zabbadoz.net> Content-transfer-encoding: quoted-printable Date: Thu, 26 Mar 2009 09:04:42 -0700 References: <20090325214318.Q67075@maildrop.int.zabbadoz.net> <20090326062604.X67075@maildrop.int.zabbadoz.net> X-Mailer: Apple Mail (2.930.3) Cc: freebsd-geom@freebsd.org Subject: Re: gpart on top of eli inside a slice is not working X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 26 Mar 2009 16:06:05 -0000 On Mar 25, 2009, at 11:29 PM, Bjoern A. Zeeb wrote: > On Wed, 25 Mar 2009, Marcel Moolenaar wrote: > >> >> On Mar 25, 2009, at 2:57 PM, Bjoern A. Zeeb wrote: >> >>> Here's a script to reproduce this on head. >> >> First of all: exemplary problem reporting! >> >>> Any ideas? >> >> The probe method of the GPT scheme explicitly disallows nesting.
>> This is inconsistent with the create method, which happily allows >> creating a GPT underneath a MBR. >> >> The bug is in the create method: GPT cannot be created inside a >> MBR slice (or any other partitioning for that matter). I'll fix >> that shortly. > > Well technically it is created inside some random garbage from eli and > not directly inside the MBR slice. When I refer to nesting, I mean the on-disk layout. It's almost meaningless to talk in terms of GEOM nesting, because you can't assume anything. Thus: the fact that geli is in between the two gpart instances is irrelevant. > So the only possible solutions for those would be: > 1) Somehow convert the entire disk to part and then exposing the 3 > freebsd-* partitions and have a dedicated eli inside each. I don't understand what you're trying to say. Can you elaborate? > 2) try (and stick with) bsdlabel on top of the eli inside the mbr > slice? A different scheme, one that is allowed to be nested (again, from an on-disk layout PoV), is the right thing to do. > So can you explain why there is the restriction that part cannot be > used inside a MBR slice or rather somewhere on top of such? There's no such restriction in gpart. If there was, then gpart would not be able to implement the BSD scheme or the EBR scheme.
FYI, -- Marcel Moolenaar xcllnt@mac.com From owner-freebsd-geom@FreeBSD.ORG Thu Mar 26 22:36:42 2009 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id D0A38106564A for ; Thu, 26 Mar 2009 22:36:42 +0000 (UTC) (envelope-from phk@critter.freebsd.dk) Received: from phk.freebsd.dk (phk.freebsd.dk [130.225.244.222]) by mx1.freebsd.org (Postfix) with ESMTP id 948DB8FC23 for ; Thu, 26 Mar 2009 22:36:42 +0000 (UTC) (envelope-from phk@critter.freebsd.dk) Received: from critter.freebsd.dk (critter.freebsd.dk [192.168.61.3]) by phk.freebsd.dk (Postfix) with ESMTP id 699C578D23; Thu, 26 Mar 2009 22:36:41 +0000 (UTC) Received: from critter.freebsd.dk (localhost [127.0.0.1]) by critter.freebsd.dk (8.14.3/8.14.3) with ESMTP id n2QLGX0L001095; Thu, 26 Mar 2009 21:16:33 GMT (envelope-from phk@critter.freebsd.dk) To: "Wilkinson, Alex" From: "Poul-Henning Kamp" In-Reply-To: Your message of "Thu, 26 Mar 2009 20:22:22 +0900." <20090326112222.GE10080@stlux503.dsto.defence.gov.au> Date: Thu, 26 Mar 2009 21:16:33 +0000 Message-ID: <1094.1238102193@critter.freebsd.dk> Sender: phk@critter.freebsd.dk Cc: freebsd-geom@freebsd.org Subject: Re: geom debugging tools ? X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 26 Mar 2009 22:36:43 -0000 In message <20090326112222.GE10080@stlux503.dsto.defence.gov.au>, "Wilkinson, Alex" writes: > > On Thu, Mar 26, 2009 at 11:17:30AM +0000, Poul-Henning Kamp wrote: > > >sysctl -b kern.geom.confxml > >Curious, why doesn't 'sysctl -a' display "kern.geom.confxml" ? e.g. Because we don't want to spam the sysctl -a output with so much output. There are many other sysctls which also are not shown during sysctl -a because they return binary structures.
The '-b' flag means "I know it may be binary" and we (ab)use that to suppress the XML output from sysctl -a. -- Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 phk@FreeBSD.ORG | TCP/IP since RFC 956 FreeBSD committer | BSD since 4.3-tahoe Never attribute to malice what can adequately be explained by incompetence. From owner-freebsd-geom@FreeBSD.ORG Fri Mar 27 12:30:08 2009 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 4827E106566B for ; Fri, 27 Mar 2009 12:30:08 +0000 (UTC) (envelope-from bzeeb-lists@lists.zabbadoz.net) Received: from mail.cksoft.de (mail.cksoft.de [195.88.108.3]) by mx1.freebsd.org (Postfix) with ESMTP id CB6608FC1E for ; Fri, 27 Mar 2009 12:30:07 +0000 (UTC) (envelope-from bzeeb-lists@lists.zabbadoz.net) Received: from localhost (amavis.fra.cksoft.de [192.168.74.71]) by mail.cksoft.de (Postfix) with ESMTP id 4FF8541C70C; Fri, 27 Mar 2009 13:30:06 +0100 (CET) X-Virus-Scanned: amavisd-new at cksoft.de Received: from mail.cksoft.de ([195.88.108.3]) by localhost (amavis.fra.cksoft.de [192.168.74.71]) (amavisd-new, port 10024) with ESMTP id iJLCQOHL2owc; Fri, 27 Mar 2009 13:30:05 +0100 (CET) Received: by mail.cksoft.de (Postfix, from userid 66) id C6E9841C707; Fri, 27 Mar 2009 13:30:05 +0100 (CET) Received: from maildrop.int.zabbadoz.net (maildrop.int.zabbadoz.net [10.111.66.10]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mail.int.zabbadoz.net (Postfix) with ESMTP id EBD754448E6; Fri, 27 Mar 2009 12:26:11 +0000 (UTC) Date: Fri, 27 Mar 2009 12:26:11 +0000 (UTC) From: "Bjoern A.
Zeeb" X-X-Sender: bz@maildrop.int.zabbadoz.net To: Marcel Moolenaar In-Reply-To: <1FA0EF30-7FCC-4384-8151-36843EFBE01D@mac.com> Message-ID: <20090327092226.K67075@maildrop.int.zabbadoz.net> References: <20090325214318.Q67075@maildrop.int.zabbadoz.net> <20090326062604.X67075@maildrop.int.zabbadoz.net> <1FA0EF30-7FCC-4384-8151-36843EFBE01D@mac.com> X-OpenPGP-Key: 0x14003F198FEFA3E77207EE8D2B58B8F83CCF1842 MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: freebsd-geom@freebsd.org Subject: Re: gpart on top of eli inside a slice is not working X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 27 Mar 2009 12:30:08 -0000 On Thu, 26 Mar 2009, Marcel Moolenaar wrote: Hi, >>> The bug is in the create method: GPT cannot be created inside a >>> MBR slice (or any other partitioning for that matter). I'll fix >>> that shortly. >> >> Well technically it is created inside some random garbage from eli and >> not directly inside the MBR slice. > > When I refer to nesting, I mean the on-disk layout. It's almost > meaningless to talk in terms of GEOM nesting, because you can't > assume anything. > > Thus: the fact that geli is in between the two gpart instances is > irrelevant. ok. >> So the only possible solutions for those would be: >> 1) Somehow convert the entire disk to part and then exposing the 3 >> freebsd-* partitions and have a dedicated eli inside each. > > I don't understand what you're trying to say. Can you > elaborate?
Well if you convert the entire thing to GPT (not part; still wrongly
using it as a synonym 'cause of the old gpt(8) name and being
confused ;-) ) you'd have

md0p1   Compaq Recovery (however this would work)
md0p2   NTFS (however this would work)
md0p3   freebsd-ufs
md0p4   freebsd-swap
md0p5   freebsd-ufs

and then you would do

md0p3.eli
md0p4.eli
md0p5.eli

In this case the "freebsd-*" is publicly exposed in GPT.

But in contrast, with the fdisk version, where slice 3 is "DOS"

md0s1       "Compaq Recovery"
md0s2       "NTFS"
md0s3       "DOS"
md0s3.eli   random garbage
md0s3.elia  equivalent to md0p3
md0s3.elib  equivalent to md0p4
md0s3.elid  equivalent to md0p5

the freebsd parts are not publicly visible.

>> 2) try (and stick with) bsdlabel on top of the eli inside the mbr
>> slice?
>
> A different scheme, one that is allowed to be nested (again, from
> an on-disk layout PoV), is the right thing to do.

I went with gpart + BSD for now.

>> So can you explain why there is the restriction that part cannot be
>> used inside a MBR slice or rather somewhere on top of such?
>
> There's no such restriction in gpart. If there was, then gpart
> would not be able to implement the BSD scheme or the EBR scheme.

So again here, s,part,GPT, ;-)

What is the EBR scheme? Are the schemes somewhere documented in more
detail? Well they are someway but perhaps gpart(8)[?] could talk a bit
more what a "scheme" is and what the affiliation to the geom classes
and options.

So I have to say I much more liked gpart to create the traditional BSD
disklabels than the old bsdlabel. Things can be scripted more easily
etc.

Two things would significantly improve usability though are
1 ability to give -s size in human readable way instead of having to
  do all the math.
2 be able to give -b start to just say "start at next free offset" w/o
  looking it up or doing the math.

Example:
gpart add -b next -s 64G -t freebsd-ufs da3

/bz

-- 
Bjoern A. Zeeb
The greatest risk is not taking one.
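The two usability wishes above come down to a little arithmetic that a wrapper script can do today. A minimal sketch, under stated assumptions: size_to_sectors() and next_free_start() are hypothetical helpers and not part of gpart; a 512-byte sector is assumed; a bare number is assumed to already be a sector count; and "next free offset" is taken to mean "right after the highest-ending existing partition", which ignores gaps and any alignment policy.

```python
# Binary multipliers for the usual single-letter size suffixes.
SUFFIXES = {"K": 1 << 10, "M": 1 << 20, "G": 1 << 30, "T": 1 << 40}


def size_to_sectors(text, sector_size=512):
    """Convert a size such as '64G' to a sector count (hypothetical helper).

    A value without a suffix is assumed to be a sector count already.
    """
    text = text.strip().upper()
    if text and text[-1] in SUFFIXES:
        nbytes = int(text[:-1]) * SUFFIXES[text[-1]]
        if nbytes % sector_size:
            raise ValueError("size is not a multiple of the sector size")
        return nbytes // sector_size
    return int(text)


def next_free_start(partitions, first_usable=0):
    """Return the first sector after the highest-ending partition.

    partitions is an iterable of (start_sector, sector_count) pairs.
    """
    end = first_usable
    for start, count in partitions:
        end = max(end, start + count)
    return end


if __name__ == "__main__":
    existing = [(63, 1000), (1063, 2000)]  # made-up layout for illustration
    print("-b", next_free_start(existing), "-s", size_to_sectors("64G"))
```

The two numbers printed could then be passed to gpart add as the -b and -s values in place of the wished-for "-b next -s 64G".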
From owner-freebsd-geom@FreeBSD.ORG Fri Mar 27 12:56:10 2009 Return-Path: Delivered-To: geom@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 63CAD1065702; Fri, 27 Mar 2009 12:56:10 +0000 (UTC) (envelope-from luigi@onelab2.iet.unipi.it) Received: from onelab2.iet.unipi.it (onelab2.iet.unipi.it [131.114.9.129]) by mx1.freebsd.org (Postfix) with ESMTP id 28DC98FC18; Fri, 27 Mar 2009 12:56:09 +0000 (UTC) (envelope-from luigi@onelab2.iet.unipi.it) Received: by onelab2.iet.unipi.it (Postfix, from userid 275) id 571DA73098; Fri, 27 Mar 2009 14:01:08 +0100 (CET) Date: Fri, 27 Mar 2009 14:01:08 +0100 From: Luigi Rizzo To: phk@freebsd.org, geom@freebsd.org Message-ID: <20090327130108.GA96723@onelab2.iet.unipi.it> References: <20090326110048.GA48516@onelab2.iet.unipi.it> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20090326110048.GA48516@onelab2.iet.unipi.it> User-Agent: Mutt/1.4.2.3i Cc: Subject: usage and format of kern.geom.conf* sysctl variables X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 27 Mar 2009 12:56:11 -0000 I have a few questions on the following sysctl variables, implemented in sys/geom/geom_kern.c kern.geom.conftxt kern.geom.confdot kern.geom.confxml QUESTION #1 All the variables return a trailing NUL when printed, because their handler is error = SYSCTL_OUT(req, sbuf_data(sb), sbuf_len(sb) + 1); I wonder if the trailing NUL is intentional in all cases, and if so, did you add it to hide the output in the default output, or because the string may contain non-low-ascii character ? (I am asking because on the console the trailing NUL is printed as an extra space; xterm output is fine). 
QUESTION #2
  Is it reasonable to put in the variables also information
  accumulated at runtime (eg. usage stats and so on), or
  we should stick to pure "configuration" information ?

QUESTION #3 (content)
  Should we limit the content of these variables to
  'configuration' info (e.g. name, topology, media sizes) or is
  it reasonable to have fields for stats and other info accumulated
  at runtime ?

QUESTION #4 (conftxt record separator)
  It seems that the format is one line per provider, so e.g.
  if a provider has to print a lot of info (e.g. an array of
  numbers) it still has to put everything on one line, right ?

QUESTION #4 (conftxt format)
  Any reason why conftxt is limited to DISK and MD classes ?
  Also, for each provider, conftxt does not print the name of the
  geom the provider is attached to, but only its class; this
  doesn't let me figure out the linkage, e.g. in the case
  below how do I know that ntfs/WINXP is on ad0s1 and not on, say,
  another disk with the same mediasize ?

  ...
  0 DISK ad0 160041885696 512 hd 16 sc 63
  1 SCHED ad0.sched. 160041885696 512
  2 MBR ad0.sched.s3 113993842688 512 i 2 o 46046117888 ty 15
  3 MBREXT ad0.sched.s5 113992794112 512 i 0 o 1048576 ty 7
  2 MBR ad0.sched.s2 3093299200 512 i 1 o 42952818688 ty 27
  2 MBR ad0.sched.s1 42952379904 512 i 0 o 32256 ty 7
  1 MBR ad0s3 113993842688 512 i 2 o 46046117888 ty 15
  2 MBREXT ad0s5 113992794112 512 i 0 o 1048576 ty 7
  3 LABEL ntfs/DATA 113992794112 512 i 0 o 0
  1 MBR ad0s2 3093299200 512 i 1 o 42952818688 ty 27
  2 LABEL ntfs/RECOVERY 3093299200 512 i 0 o 0
  1 MBR ad0s1 42952379904 512 i 0 o 32256 ty 7
  2 LABEL ntfs/WINXP 42952379904 512 i 0 o 0
  ...
cheers luigi From owner-freebsd-geom@FreeBSD.ORG Fri Mar 27 13:31:53 2009 Return-Path: Delivered-To: geom@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 8CCBE1065674 for ; Fri, 27 Mar 2009 13:31:53 +0000 (UTC) (envelope-from phk@critter.freebsd.dk) Received: from phk.freebsd.dk (phk.freebsd.dk [130.225.244.222]) by mx1.freebsd.org (Postfix) with ESMTP id 527ED8FC20 for ; Fri, 27 Mar 2009 13:31:53 +0000 (UTC) (envelope-from phk@critter.freebsd.dk) Received: from critter.freebsd.dk (critter.freebsd.dk [192.168.61.3]) by phk.freebsd.dk (Postfix) with ESMTP id E176578CCD; Fri, 27 Mar 2009 13:31:51 +0000 (UTC) Received: from critter.freebsd.dk (localhost [127.0.0.1]) by critter.freebsd.dk (8.14.3/8.14.3) with ESMTP id n2RDVpp4004845; Fri, 27 Mar 2009 13:31:51 GMT (envelope-from phk@critter.freebsd.dk) To: Luigi Rizzo From: "Poul-Henning Kamp" In-Reply-To: Your message of "Fri, 27 Mar 2009 14:01:08 +0100." <20090327130108.GA96723@onelab2.iet.unipi.it> Date: Fri, 27 Mar 2009 13:31:51 +0000 Message-ID: <4844.1238160711@critter.freebsd.dk> Sender: phk@critter.freebsd.dk Cc: geom@freebsd.org Subject: Re: usage and format of kern.geom.conf* sysctl variables X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 27 Mar 2009 13:31:54 -0000 In message <20090327130108.GA96723@onelab2.iet.unipi.it>, Luigi Rizzo writes: >QUESTION #1 > All the variables return a trailing NUL when printed, > because their handler is > > error = SYSCTL_OUT(req, sbuf_data(sb), sbuf_len(sb) + 1); > > I wonder if the trailing NUL is intentional in all cases. Yes. >QUESTION #2 > Is it reasonable to put in the variables also information > accumulated at runtime (eg. usage stats and so on), or > we should stick to pure "configuration" information ? 
No. Statistics are collected via the shared-memory interface which gstat(8) uses. This is much more efficient since there is no syscall overhead to update the values. >QUESTION #3 (content) > Should we limit the content of these variables to > 'configuration' info (e.g. name, topology, media sizes) or is > it reasonable to have fields for stats and other info accumulated > at runtime ? The intention is that confxml is definitive with respect to relevant configuration information. That is not the same as to say that _everything_ should be included in it. >QUESTION #4 (conftxt record separator) > It seems that the format is one line per provider, so e.g. > if a provider has to print a lot of info (e.g. an array of > numbers) it still has to put everything on one line, right ? conftxt is specifically and *only* for the use of sysinstall. This use should be discontinued as soon as possible. >QUESTION #4 (conftxt format) > Any reason why conftxt is limited to DISK and MD classes ? See above. I wrote a couple of "blue print" articles about this stuff for daemonnews many years ago, I hope they still exist somewhere on the net, because they are still relevant. -- Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 phk@FreeBSD.ORG | TCP/IP since RFC 956 FreeBSD committer | BSD since 4.3-tahoe Never attribute to malice what can adequately be explained by incompetence.
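On the linkage question about conftxt that was left open: one reading of the listing quoted in this thread is that the leading number on each line is a nesting depth and that the records are emitted depth-first, so the parent of a provider is the nearest preceding record that is one level shallower. That reading is an assumption on my part, not something confirmed in the thread, although the quoted sample is consistent with it. A minimal sketch, with parents() as a hypothetical helper and SAMPLE a subset of the quoted lines:

```python
# Subset of the conftxt lines quoted earlier in the thread.
SAMPLE = """\
0 DISK ad0 160041885696 512 hd 16 sc 63
1 MBR ad0s3 113993842688 512 i 2 o 46046117888 ty 15
2 MBREXT ad0s5 113992794112 512 i 0 o 1048576 ty 7
3 LABEL ntfs/DATA 113992794112 512 i 0 o 0
1 MBR ad0s2 3093299200 512 i 1 o 42952818688 ty 27
2 LABEL ntfs/RECOVERY 3093299200 512 i 0 o 0
1 MBR ad0s1 42952379904 512 i 0 o 32256 ty 7
2 LABEL ntfs/WINXP 42952379904 512 i 0 o 0
"""


def parents(conftxt):
    """Map provider name -> parent provider name (None for depth 0).

    Assumes fields are: depth, class, provider name, mediasize,
    sectorsize, followed by class-specific extras.
    """
    stack = []   # stack[d] holds the latest provider seen at depth d
    result = {}
    for line in conftxt.rstrip("\0").splitlines():
        fields = line.split()
        if len(fields) < 5:
            continue  # skip blank or malformed records
        depth, name = int(fields[0]), fields[2]
        del stack[depth:]          # unwind to the parent's level
        result[name] = stack[-1] if stack else None
        stack.append(name)
    return result


if __name__ == "__main__":
    for child, parent in parents(SAMPLE).items():
        print(child, "<-", parent)
```

Under that assumption ntfs/WINXP resolves to ad0s1 purely from line order, without comparing media sizes.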
From owner-freebsd-geom@FreeBSD.ORG Fri Mar 27 16:44:47 2009 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id E196D106571C for ; Fri, 27 Mar 2009 16:44:46 +0000 (UTC) (envelope-from xcllnt@mac.com) Received: from asmtpout019.mac.com (asmtpout019.mac.com [17.148.16.94]) by mx1.freebsd.org (Postfix) with ESMTP id B07EE8FC1B for ; Fri, 27 Mar 2009 16:44:46 +0000 (UTC) (envelope-from xcllnt@mac.com) MIME-version: 1.0 Content-transfer-encoding: 7BIT Content-type: text/plain; charset=US-ASCII; format=flowed Received: from [192.168.4.253] (mail.xcllnt.net [75.101.29.67]) by asmtp019.mac.com (Sun Java(tm) System Messaging Server 6.3-8.01 (built Dec 16 2008; 32bit)) with ESMTPSA id <0KH600HHGBUH4S40@asmtp019.mac.com> for freebsd-geom@freebsd.org; Fri, 27 Mar 2009 09:44:42 -0700 (PDT) Message-id: <5BDA79FB-5678-4FF2-9BD1-D5915DDFC3C3@mac.com> From: Marcel Moolenaar To: "Bjoern A. Zeeb" In-reply-to: <20090327092226.K67075@maildrop.int.zabbadoz.net> Date: Fri, 27 Mar 2009 09:44:41 -0700 References: <20090325214318.Q67075@maildrop.int.zabbadoz.net> <20090326062604.X67075@maildrop.int.zabbadoz.net> <1FA0EF30-7FCC-4384-8151-36843EFBE01D@mac.com> <20090327092226.K67075@maildrop.int.zabbadoz.net> X-Mailer: Apple Mail (2.930.3) Cc: freebsd-geom@freebsd.org Subject: Re: gpart on top of eli inside a slice is not working X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 27 Mar 2009 16:44:49 -0000 On Mar 27, 2009, at 5:26 AM, Bjoern A. Zeeb wrote: >>> So the only possible solutions for those would be: >>> 1) Somehow convert the entire disk to part and then exposing the 3 >>> freebsd-* partitions and have a dedicated eli inside each. >> >> I don't understand what you're trying to say. 
Can you
>> elaborate?
>
> Well if you convert the entire thing to GPT (not part; still wrongly
> using it as a synonym 'cause of the old gpt(8) name and being
> confused ;-) ) you'd have
>
> md0p1   Compaq Recovery (however this would work)
> md0p2   NTFS (however this would work)
> md0p3   freebsd-ufs
> md0p4   freebsd-swap
> md0p5   freebsd-ufs
>
> and then you would do
> md0p3.eli
> md0p4.eli
> md0p5.eli
>
> In this case the "freebsd-*" is publicly exposed in GPT.

I see. Yes, of course. This shows the downside of having a flat
partitioning. While it does the job, it may not be the most
aesthetically pleasing in some cases...

>
>>> 2) try (and stick with) bsdlabel on top of the eli inside the mbr
>>> slice?
>>
>> A different scheme, one that is allowed to be nested (again, from
>> an on-disk layout PoV), is the right thing to do.
>
> I went with gpart + BSD for now.

Sounds good. With gpart you can create BSD disklabels with up to 20
partitions, so it can still be used when you want to carve up in more
than 7 (usable) partitions.

>>> So can you explain why there is the restriction that part cannot be
>>> used inside a MBR slice or rather somewhere on top of such?
>>
>> There's no such restriction in gpart. If there was, then gpart
>> would not be able to implement the BSD scheme or the EBR scheme.
>
> So again here, s,part,GPT, ;-)

Ah :-) With at least 128 partition entries the need for nesting was
eliminated. It's explicitly allowed to sub-partition a GPT partition
with a MBR, but this was not so much done for the sake of nesting I
think, but rather virtualization. Also: GPT was designed as part of
EFI. You don't want to add unnecessary complications to firmware and
nested GPTs surely do that.

> What is the EBR scheme?

The EBR scheme is used to create logical partitions:
http://en.wikipedia.org/wiki/Extended_boot_record

> Are the
> schemes somewhere documented in more detail?

They're as well documented as they were before :-) In other words no.
> Well they are someway but perhaps gpart(8)[?] could talk a bit more > what a "scheme" is and what the affiliation to the geom classes > and options. I think the visibility has increased from before. Previously partitioning was thought of in terms of utilities. There was a 1-to-1 mapping between scheme and tool. Now there's a single tool and users need to select a scheme when they create a partitioning. It definitely makes sense to elaborate more in the manpage for gpart(8), or even create gpart(9) pages. I'll keep it in mind. > Two things would significantly improve usability though are > 1 ability to give -s size in human readable way instead of having to > do all the math. > 2 be able to give -b start to just say "start at next free offset" w/o > looking it up or doing the math. Yes on both accounts. We'll get that fleshed out and implemented in time. It's just "syntactic sugaring"; UI padding... Thanks for the feedback. -- Marcel Moolenaar xcllnt@mac.com