From owner-freebsd-arch@FreeBSD.ORG Thu Mar 19 08:14:52 2009 Return-Path: Delivered-To: arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 6C192106566C for ; Thu, 19 Mar 2009 08:14:52 +0000 (UTC) (envelope-from luigi@onelab2.iet.unipi.it) Received: from onelab2.iet.unipi.it (onelab2.iet.unipi.it [131.114.9.129]) by mx1.freebsd.org (Postfix) with ESMTP id D424F8FC21 for ; Thu, 19 Mar 2009 08:14:51 +0000 (UTC) (envelope-from luigi@onelab2.iet.unipi.it) Received: by onelab2.iet.unipi.it (Postfix, from userid 275) id AFE3873098; Thu, 19 Mar 2009 09:19:36 +0100 (CET) Date: Thu, 19 Mar 2009 09:19:36 +0100 From: Luigi Rizzo To: geom@freebsd.org, phk@freebsd.org, ivan@freebsd.org, fabio@gandalf.sssup.it Message-ID: <20090319081936.GA32750@onelab2.iet.unipi.it> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="J/dobhs11T7y2rNN" Content-Disposition: inline User-Agent: Mutt/1.4.2.3i Cc: arch@freebsd.org Subject: RFC: adding 'proxy' nodes to provider ports (with patch) X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 19 Mar 2009 08:14:52 -0000 --J/dobhs11T7y2rNN Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Hi, Fabio Checconi and I have been thinking on how to implement "proxy" geom nodes, i.e. nodes that have exactly 1 provider and 1 consumer port, do not do any data transformation, and can be transparently inserted or removed on top of a provider port while the tree is actively moving data. Our immediate need was to add/remove a scheduler while a disk is mounted, but there are possibly other uses e.g. if one wants to "sniff" the traffic through a disk, or do other ops that are transparent for the data stream. We would like opinion on the following solution, which seems extremely simple in terms of implementation. The idea is to intercept requests coming on a provider port, pp, and redirect them to a geom node acting as a proxy if the port is configured in this way: +=====...===...======+ H H H H H H +=====...====== cp ==+ | +---------------+ V | V +=====.....==== pp ==+ | +======= proxy_pp ==+ H 'ad0s1' H | H H H ------->--+ H H H gp -------<--+ H proxy_node H H H | H H +=======....===...===+ | +======= proxy_cp ==+ | V +---------------+ Normally the proxy does not exist, and the geom tree works as it does now. When we create a 'proxy' node, with something like geom my_proxy_class proxy ad0s1 we do something very similar to a 'create', but: - the proxy node is marked specially in gp->flags, so the core will not generate a g_new_provider_event when the provider port is created (this means there is no taste() call and nobody should be able to attach to the port). - the provider port we attach to is linked, with two pointers, to the provider and consumer ports of the proxy_node. In this situation, g_io_request() finds that port pp has a proxy attached to it, and immediately redirects the requests to the proxy, which does everything a geom node does (cloning requests, etc). When the proxy wants to pass the request down, it sends it again to pp, but now there is no redirection because the source can be identified as the proxy. The pointers in the bio insure a correct flow of the requests on the reverse path. Disconnecting a proxy is almost trivial: apart from handling possible races on the data path, we just need to clear pp->proxy_pp and pp->proxy_cp. After that, we can just send the regular destroy events to the proxy node, who will have to take care of flushing any pending bio's (e.g. see our geom_sched node that already does this). Overall the change is very small (see attached patch): a couple of lines in g_io_request, two extra fields in the g_provider, and the addition of a flag to gp->flags to control the generation of g_new_provider_event. There is basically no overhead on regular operation, and only a couple of extra pointers in struct g_provider (we use a spare bit in gp->flags to mark G_GEOM_PROXY nodes). The only things missing in the patch should be: - a check to avoid races on creation&destruction of a proxy. I am not so sure on how to achieve this, but creation and destruction are rare and can normally wait, so we could just piggyback the small critical section (manipulating pp->proxy_cp and pp->proxy_cp) into some other piece of code that is guaranteed to be race-free. - a check to prevent attaching to a provider port of a proxy (not a problem, i believe); - a check to prevent attaching a proxy to a provider port that already has one. Of course you can attach a proxy to another proxy, and if you want to change the order it is as simple as removing the existing proxy and reattaching it after the new one. Feedback welcome. cheers luigi and fabio --J/dobhs11T7y2rNN Content-Type: text/x-diff; charset=us-ascii Content-Disposition: attachment; filename="geom-proxy.patch" Index: geom/geom_io.c =================================================================== --- geom/geom_io.c (revision 189933) +++ geom/geom_io.c (working copy) @@ -273,6 +273,10 @@ cp = bp->bio_from; pp = bp->bio_to; + /* Proxies are transparent, errors are caught by regular checks. */ + if (cp->geom->flags & G_GEOM_PROXY) + return (0); + /* Fail if access counters dont allow the operation */ switch(bp->bio_cmd) { case BIO_READ: @@ -343,6 +347,9 @@ bp->_bio_cflags = bp->bio_cflags; #endif + if (pp->proxy_provider != NULL && cp != pp->proxy_consumer) + pp = pp->proxy_provider; + if (bp->bio_cmd & (BIO_READ|BIO_WRITE|BIO_GETATTR)) { KASSERT(bp->bio_data != NULL, ("NULL bp->data in g_io_request(cmd=%hhu)", bp->bio_cmd)); Index: geom/geom_subr.c =================================================================== --- geom/geom_subr.c (revision 189933) +++ geom/geom_subr.c (working copy) @@ -365,6 +365,8 @@ g_cancel_event(gp); LIST_REMOVE(gp, geom); TAILQ_REMOVE(&geoms, gp, geoms); + if (gp->flags & G_GEOM_PROXY) + printf("g_destroy_geom(): destroying proxy %s", gp->name); g_free(gp->name); g_free(gp); } @@ -380,6 +382,11 @@ g_topology_assert(); G_VALID_GEOM(gp); g_trace(G_T_TOPOLOGY, "g_wither_geom(%p(%s))", gp, gp->name); + if (gp->flags & G_GEOM_PROXY) { + pp = LIST_FIRST(&gp->consumer)->provider; + pp->proxy_provider = NULL; + pp->proxy_consumer = NULL; + } if (!(gp->flags & G_GEOM_WITHER)) { gp->flags |= G_GEOM_WITHER; LIST_FOREACH(pp, &gp->provider, provider) @@ -581,7 +588,8 @@ pp->stat = devstat_new_entry(pp, -1, 0, DEVSTAT_ALL_SUPPORTED, DEVSTAT_TYPE_DIRECT, DEVSTAT_PRIORITY_MAX); LIST_INSERT_HEAD(&gp->provider, pp, provider); - g_post_event(g_new_provider_event, pp, M_WAITOK, pp, gp, NULL); + if (!(gp->flags & G_GEOM_PROXY)) + g_post_event(g_new_provider_event, pp, M_WAITOK, pp, gp, NULL); return (pp); } @@ -721,6 +729,20 @@ return (error); } +int +g_attach_proxy(struct g_consumer *cp, struct g_provider *pp, + struct g_provider *newpp) +{ + int error; + + error = g_attach(cp, pp); + if (!error) { + pp->proxy_provider = newpp; + pp->proxy_consumer = cp; + } + return (error); +} + void g_detach(struct g_consumer *cp) { Index: geom/geom.h =================================================================== --- geom/geom.h (revision 189933) +++ geom/geom.h (working copy) @@ -134,6 +134,7 @@ void *softc; unsigned flags; #define G_GEOM_WITHER 1 +#define G_GEOM_PROXY 2 }; /* @@ -190,6 +191,9 @@ #define G_PF_WITHER 0x2 #define G_PF_ORPHAN 0x4 + struct g_provider *proxy_provider; + struct g_consumer *proxy_consumer; + /* Two fields for the implementing class to use */ void *private; u_int index; @@ -219,6 +223,8 @@ /* geom_subr.c */ int g_access(struct g_consumer *cp, int nread, int nwrite, int nexcl); int g_attach(struct g_consumer *cp, struct g_provider *pp); +int g_attach_proxy(struct g_consumer *cp, struct g_provider *pp, + struct g_provider *newpp); void g_destroy_consumer(struct g_consumer *cp); void g_destroy_geom(struct g_geom *pp); void g_destroy_provider(struct g_provider *pp); --J/dobhs11T7y2rNN--