From owner-freebsd-geom@FreeBSD.ORG Tue Feb 24 15:21:11 2004 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 75CDE16A4CE for ; Tue, 24 Feb 2004 15:21:11 -0800 (PST) Received: from critter.freebsd.dk (critter.freebsd.dk [212.242.86.163]) by mx1.FreeBSD.org (Postfix) with ESMTP id 7C25543D1F for ; Tue, 24 Feb 2004 15:21:10 -0800 (PST) (envelope-from phk@phk.freebsd.dk) Received: from critter.freebsd.dk (localhost [127.0.0.1]) by critter.freebsd.dk (8.12.11/8.12.11) with ESMTP id i1ONL812013282 for ; Wed, 25 Feb 2004 00:21:08 +0100 (CET) (envelope-from phk@phk.freebsd.dk) To: geom@freebsd.org From: Poul-Henning Kamp Date: Wed, 25 Feb 2004 00:21:08 +0100 Message-ID: <13281.1077664868@critter.freebsd.dk> Subject: Geom/DIAGNOSTIC patch X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 24 Feb 2004 23:21:11 -0000 This is a patch I have in the pipeline. It adds some very strong checks on pointer arguments if DIAGNOSTIC is used. If you use these functions for debugging in class implementations you should be aware of KLD DIAGNOSTIC/no DIAGNOSTIC issues. I'm not quite sure if this should be DIAGNOSTIC of if we want to make a new GEOM_DIAGNOSTIC option, input on this is welcome. I'm pretty sure that I do not want to see class code committed which calls these functions, so I may move it to geom_int.h Test & Review welome... Poul-Henning Index: geom/geom.h =================================================================== RCS file: /home/ncvs/src/sys/geom/geom.h,v retrieving revision 1.80 diff -u -r1.80 geom.h --- geom/geom.h 12 Feb 2004 22:42:11 -0000 1.80 +++ geom/geom.h 24 Feb 2004 23:13:14 -0000 @@ -221,13 +221,29 @@ struct g_consumer * g_new_consumer(struct g_geom *gp); struct g_geom * g_new_geomf(struct g_class *mp, const char *fmt, ...); struct g_provider * g_new_providerf(struct g_geom *gp, const char *fmt, ...); -void g_sanity(void const *ptr); void g_spoil(struct g_provider *pp, struct g_consumer *cp); int g_std_access(struct g_provider *pp, int dr, int dw, int de); void g_std_done(struct bio *bp); void g_std_spoiled(struct g_consumer *cp); void g_wither_geom(struct g_geom *gp, int error); +#ifdef DIAGNOSTIC +int g_valid_obj(void const *ptr); +#define G_VALID_CLASS(foo) \ + KASSERT(g_valid_obj(foo) == 1, ("%p is not a g_class", foo)) +#define G_VALID_GEOM(foo) \ + KASSERT(g_valid_obj(foo) == 2, ("%p is not a g_geom", foo)) +#define G_VALID_CONSUMER(foo) \ + KASSERT(g_valid_obj(foo) == 3, ("%p is not a g_consumer", foo)) +#define G_VALID_PROVIDER(foo) \ + KASSERT(g_valid_obj(foo) == 4, ("%p is not a g_provider", foo)) +#else +#define G_VALID_CLASS(foo) do { } while (0) +#define G_VALID_GEOM(foo) do { } while (0) +#define G_VALID_CONSUMER(foo) do { } while (0) +#define G_VALID_PROVIDER(foo) do { } while (0) +#endif + int g_modevent(module_t, int, void *); /* geom_io.c */ @@ -258,16 +274,17 @@ void *p; p = malloc(size, M_GEOM, flags); - g_sanity(p); - /* printf("malloc(%d, %x) -> %p\n", size, flags, p); */ return (p); } static __inline void g_free(void *ptr) { - g_sanity(ptr); - /* printf("free(%p)\n", ptr); */ + +#ifdef DIAGNOSTIC + KASSERT(g_valid_obj(ptr) == 0, + ("g_free(%p) of live object, type %d", ptr, g_valid_obj(ptr))); +#endif free(ptr, M_GEOM); } @@ -281,19 +298,16 @@ #define g_topology_unlock() \ do { \ - g_sanity(NULL); \ sx_xunlock(&topology_lock); \ } while (0) #define g_topology_assert() \ do { \ - g_sanity(NULL); \ sx_assert(&topology_lock, SX_XLOCKED); \ } while (0) #define g_topology_assert_not() \ do { \ - g_sanity(NULL); \ sx_assert(&topology_lock, SX_UNLOCKED); \ } while (0) Index: geom/geom_dump.c =================================================================== RCS file: /home/ncvs/src/sys/geom/geom_dump.c,v retrieving revision 1.30 diff -u -r1.30 geom_dump.c --- geom/geom_dump.c 7 Dec 2003 05:04:49 -0000 1.30 +++ geom/geom_dump.c 24 Feb 2004 23:13:15 -0000 @@ -273,7 +273,6 @@ { va_list ap; - g_sanity(NULL); if (!(g_debugflags & level)) return; va_start(ap, fmt); Index: geom/geom_event.c =================================================================== RCS file: /home/ncvs/src/sys/geom/geom_event.c,v retrieving revision 1.48 diff -u -r1.48 geom_event.c --- geom/geom_event.c 10 Feb 2004 15:55:17 -0000 1.48 +++ geom/geom_event.c 24 Feb 2004 23:13:15 -0000 @@ -59,7 +59,6 @@ static u_int g_pending_events; static TAILQ_HEAD(,g_provider) g_doorstep = TAILQ_HEAD_INITIALIZER(g_doorstep); static struct mtx g_eventlock; -static struct sx g_eventstall; #define G_N_EVENTREFS 20 @@ -84,23 +83,10 @@ } void -g_stall_events(void) -{ - - sx_xlock(&g_eventstall); -} - -void -g_release_events(void) -{ - - sx_xunlock(&g_eventstall); -} - -void g_orphan_provider(struct g_provider *pp, int error) { + /* G_VALID_PROVIDER(pp) We likely lack topology lock */ g_trace(G_T_TOPOLOGY, "g_orphan_provider(%p(%s), %d)", pp, pp->name, error); KASSERT(error != 0, @@ -128,8 +114,9 @@ struct g_consumer *cp, *cp2; int wf; - g_trace(G_T_TOPOLOGY, "g_orphan_register(%s)", pp->name); g_topology_assert(); + G_VALID_PROVIDER(pp); + g_trace(G_T_TOPOLOGY, "g_orphan_register(%s)", pp->name); wf = pp->flags & G_PF_WITHER; pp->flags &= ~G_PF_WITHER; @@ -166,13 +153,14 @@ struct g_event *ep; struct g_provider *pp; - sx_xlock(&g_eventstall); g_topology_lock(); for (;;) { mtx_lock(&g_eventlock); pp = TAILQ_FIRST(&g_doorstep); - if (pp != NULL) + if (pp != NULL) { + G_VALID_PROVIDER(pp); TAILQ_REMOVE(&g_doorstep, pp, orphan); + } mtx_unlock(&g_eventlock); if (pp == NULL) break; @@ -183,7 +171,6 @@ if (ep == NULL) { mtx_unlock(&g_eventlock); g_topology_unlock(); - sx_xunlock(&g_eventstall); return (0); } TAILQ_REMOVE(&g_events, ep, events); @@ -201,7 +188,6 @@ if (g_pending_events == 0) wakeup(&g_pending_events); g_topology_unlock(); - sx_xunlock(&g_eventstall); return (1); } @@ -337,5 +323,4 @@ { mtx_init(&g_eventlock, "GEOM orphanage", NULL, MTX_DEF); - sx_init(&g_eventstall, "GEOM event stalling"); } Index: geom/geom_int.h =================================================================== RCS file: /home/ncvs/src/sys/geom/geom_int.h,v retrieving revision 1.26 diff -u -r1.26 geom_int.h --- geom/geom_int.h 23 Apr 2003 20:54:42 -0000 1.26 +++ geom/geom_int.h 24 Feb 2004 23:13:15 -0000 @@ -68,8 +68,6 @@ /* geom_event.c */ void g_event_init(void); void g_run_events(void); -void g_stall_events(void); -void g_release_events(void); /* geom_subr.c */ extern struct class_list_head g_classes; Index: geom/geom_subr.c =================================================================== RCS file: /home/ncvs/src/sys/geom/geom_subr.c,v retrieving revision 1.71 diff -u -r1.71 geom_subr.c --- geom/geom_subr.c 14 Feb 2004 17:58:57 -0000 1.71 +++ geom/geom_subr.c 24 Feb 2004 23:13:15 -0000 @@ -57,8 +57,6 @@ static struct g_tailq_head geoms = TAILQ_HEAD_INITIALIZER(geoms); char *g_wait_event, *g_wait_up, *g_wait_down, *g_wait_sim; -static int g_valid_obj(void const *ptr); - struct g_hh00 { struct g_class *mp; int error; @@ -125,6 +123,7 @@ g_topology_assert(); hh = arg; mp = hh->mp; + G_VALID_CLASS(mp); g_trace(G_T_TOPOLOGY, "g_unload_class(%s)", mp->name); /* @@ -216,7 +215,7 @@ struct sbuf *sb; g_topology_assert(); - KASSERT(g_valid_obj(mp), ("g_new_geom_f() on alien class %p", mp)); + G_VALID_CLASS(mp); sb = sbuf_new(NULL, NULL, 0, SBUF_AUTOEXTEND); va_start(ap, fmt); sbuf_vprintf(sb, fmt, ap); @@ -239,8 +238,9 @@ g_destroy_geom(struct g_geom *gp) { - g_trace(G_T_TOPOLOGY, "g_destroy_geom(%p(%s))", gp, gp->name); g_topology_assert(); + G_VALID_GEOM(gp); + g_trace(G_T_TOPOLOGY, "g_destroy_geom(%p(%s))", gp, gp->name); KASSERT(LIST_EMPTY(&gp->consumer), ("g_destroy_geom(%s) with consumer(s) [%p]", gp->name, LIST_FIRST(&gp->consumer))); @@ -255,7 +255,7 @@ } /* - * This function is called (repeatedly) until has withered away. + * This function is called (repeatedly) until the has withered away. */ void g_wither_geom(struct g_geom *gp, int error) @@ -264,11 +264,12 @@ struct g_consumer *cp, *cp2; static int once_is_enough; + g_topology_assert(); + G_VALID_GEOM(gp); if (once_is_enough) return; once_is_enough = 1; g_trace(G_T_TOPOLOGY, "g_wither_geom(%p(%s))", gp, gp->name); - g_topology_assert(); if (!(gp->flags & G_GEOM_WITHER)) { gp->flags |= G_GEOM_WITHER; LIST_FOREACH(pp, &gp->provider, provider) @@ -298,6 +299,7 @@ struct g_consumer *cp; g_topology_assert(); + G_VALID_GEOM(gp); KASSERT(!(gp->flags & G_GEOM_WITHER), ("g_new_consumer on WITHERing geom(%s) (class %s)", gp->name, gp->class->name)); @@ -318,8 +320,9 @@ { struct g_geom *gp; - g_trace(G_T_TOPOLOGY, "g_destroy_consumer(%p)", cp); g_topology_assert(); + g_trace(G_T_TOPOLOGY, "g_destroy_consumer(%p)", cp); + G_VALID_CONSUMER(cp); KASSERT (cp->provider == NULL, ("g_destroy_consumer but attached")); KASSERT (cp->acr == 0, ("g_destroy_consumer with acr")); KASSERT (cp->acw == 0, ("g_destroy_consumer with acw")); @@ -347,6 +350,7 @@ if (g_shutdown) return; pp = arg; + G_VALID_PROVIDER(pp); LIST_FOREACH(mp, &g_classes, class) { if (mp->taste == NULL) continue; @@ -358,14 +362,6 @@ continue; mp->taste(mp, pp, 0); g_topology_assert(); - /* - * XXX: Bandaid for 5.2-RELEASE - * XXX: DO NOT REPLICATE THIS CODE! - */ - if (!g_valid_obj(pp)) { - printf("g_provider %p disappeared while tasting\n", pp); - return; - } } } @@ -378,6 +374,7 @@ va_list ap; g_topology_assert(); + G_VALID_GEOM(gp); KASSERT(gp->start != NULL, ("new provider on geom(%s) without ->start (class %s)", gp->name, gp->class->name)); @@ -407,6 +404,7 @@ g_error_provider(struct g_provider *pp, int error) { + /* G_VALID_PROVIDER(pp); We may not have g_topology */ pp->error = error; } @@ -434,6 +432,7 @@ struct g_geom *gp; g_topology_assert(); + G_VALID_PROVIDER(pp); KASSERT(LIST_EMPTY(&pp->consumers), ("g_destroy_provider but attached")); KASSERT (pp->acr == 0, ("g_destroy_provider with acr")); @@ -473,6 +472,7 @@ int n, m; g_topology_assert(); + G_VALID_GEOM(gp); /* Invalidate this geoms rank and move it to the tail */ gp1 = TAILQ_NEXT(gp, geoms); @@ -522,6 +522,8 @@ int error; g_topology_assert(); + G_VALID_CONSUMER(cp); + G_VALID_PROVIDER(pp); KASSERT(cp->provider == NULL, ("attach but attached")); cp->provider = pp; LIST_INSERT_HEAD(&pp->consumers, cp, consumers); @@ -539,9 +541,9 @@ { struct g_provider *pp; - g_trace(G_T_TOPOLOGY, "g_detach(%p)", cp); - KASSERT(cp != (void*)0xd0d0d0d0, ("ARGH!")); g_topology_assert(); + G_VALID_CONSUMER(cp); + g_trace(G_T_TOPOLOGY, "g_detach(%p)", cp); KASSERT(cp->provider != NULL, ("detach but not attached")); KASSERT(cp->acr == 0, ("detach but nonzero acr")); KASSERT(cp->acw == 0, ("detach but nonzero acw")); @@ -572,12 +574,14 @@ int pr,pw,pe; int error; + g_topology_assert(); + G_VALID_CONSUMER(cp); pp = cp->provider; + G_VALID_PROVIDER(pp); g_trace(G_T_ACCESS, "g_access(%p(%s), %d, %d, %d)", cp, pp->name, dcr, dcw, dce); - g_topology_assert(); KASSERT(cp->provider != NULL, ("access but not attached")); KASSERT(cp->acr + dcr >= 0, ("access resulting in negative acr")); KASSERT(cp->acw + dcw >= 0, ("access resulting in negative acw")); @@ -687,10 +691,12 @@ } int -g_std_access(struct g_provider *pp __unused, +g_std_access(struct g_provider *pp, int dr __unused, int dw __unused, int de __unused) { + g_topology_assert(); + G_VALID_PROVIDER(pp); return (0); } @@ -717,8 +723,9 @@ struct g_geom *gp; struct g_provider *pp; - g_trace(G_T_TOPOLOGY, "g_std_spoiled(%p)", cp); g_topology_assert(); + G_VALID_CONSUMER(cp); + g_trace(G_T_TOPOLOGY, "g_std_spoiled(%p)", cp); g_detach(cp); gp = cp->geom; LIST_FOREACH(pp, &gp->provider, provider) @@ -751,6 +758,7 @@ if (flag == EV_CANCEL) return; pp = arg; + G_VALID_PROVIDER(pp); for (cp = LIST_FIRST(&pp->consumers); cp != NULL; cp = cp2) { cp2 = LIST_NEXT(cp, consumers); if (!cp->spoiled) @@ -769,6 +777,8 @@ struct g_consumer *cp2; g_topology_assert(); + G_VALID_PROVIDER(pp); + G_VALID_CONSUMER(cp); LIST_FOREACH(cp2, &pp->consumers, consumers) { if (cp2 == cp) @@ -797,11 +807,16 @@ return (0); } +#ifdef DIAGNOSTIC /* - * XXX: Bandaid for 5.2. - * XXX: DO NOT EVEN THINK ABOUT CALLING THIS FUNCTION! + * This function walks (topologically unsafely) the mesh and return a + * non-zero integer if it finds the pointer is an object. The return + * value indicates which type of object it is belived to be. + * If topology is not locked, this function is potentially dangerous, + * but since it is for debugging purposes and can be useful for instance + * from DDB, we do not assert topology held. */ -static int +int g_valid_obj(void const *ptr) { struct g_class *mp; @@ -809,50 +824,20 @@ struct g_consumer *cp; struct g_provider *pp; - g_topology_assert(); LIST_FOREACH(mp, &g_classes, class) { if (ptr == mp) return (1); LIST_FOREACH(gp, &mp->geom, geom) { if (ptr == gp) - return (1); + return (2); LIST_FOREACH(cp, &gp->consumer, consumer) if (ptr == cp) - return (1); + return (3); LIST_FOREACH(pp, &gp->provider, provider) if (ptr == pp) - return (1); + return (4); } } return(0); } - -/* - * Check if the given pointer is a live object - */ - -void -g_sanity(void const *ptr) -{ - struct g_class *mp; - struct g_geom *gp; - struct g_consumer *cp; - struct g_provider *pp; - - if (!(g_debugflags & 0x8)) - return; - LIST_FOREACH(mp, &g_classes, class) { - KASSERT(mp != ptr, ("Ptr is live class")); - LIST_FOREACH(gp, &mp->geom, geom) { - KASSERT(gp != ptr, ("Ptr is live geom")); - KASSERT(gp->name != ptr, ("Ptr is live geom's name")); - LIST_FOREACH(cp, &gp->consumer, consumer) { - KASSERT(cp != ptr, ("Ptr is live consumer")); - } - LIST_FOREACH(pp, &gp->provider, provider) { - KASSERT(pp != ptr, ("Ptr is live provider")); - } - } - } -} - +#endif -- Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 phk@FreeBSD.ORG | TCP/IP since RFC 956 FreeBSD committer | BSD since 4.3-tahoe Never attribute to malice what can adequately be explained by incompetence. From owner-freebsd-geom@FreeBSD.ORG Wed Feb 25 02:31:01 2004 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id B372916A4CE for ; Wed, 25 Feb 2004 02:31:01 -0800 (PST) Received: from darkness.comp.waw.pl (unknown [195.117.238.236]) by mx1.FreeBSD.org (Postfix) with ESMTP id 12F1D43D1F for ; Wed, 25 Feb 2004 02:31:00 -0800 (PST) (envelope-from pjd@darkness.comp.waw.pl) Received: by darkness.comp.waw.pl (Postfix, from userid 1009) id EC1E2AEA50; Wed, 25 Feb 2004 11:30:57 +0100 (CET) Date: Wed, 25 Feb 2004 11:30:57 +0100 From: Pawel Jakub Dawidek To: freebsd-geom@freebsd.org Message-ID: <20040225103057.GB5506@darkness.comp.waw.pl> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="tjCHc7DPkfUGtrlw" Content-Disposition: inline User-Agent: Mutt/1.4.2i X-PGP-Key-URL: http://people.freebsd.org/~pjd/pjd.asc X-OS: FreeBSD 5.2.1-RC2 i386 Subject: GEOM stripe library. X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 25 Feb 2004 10:31:02 -0000 --tjCHc7DPkfUGtrlw Content-Type: text/plain; charset=iso-8859-2 Content-Disposition: inline Content-Transfer-Encoding: quoted-printable Hi. I finished some time ago GEOM stripe library. It should simplify stripe implementation. phk@ already looked at it. I need feedback from responsible parties if this is complete and usable. So, please look at the code below (from g_stripe.h) and try to fit it to vinum, ata-raid and simplar stuff, that use striping. Full code is here: http://people.freebsd.org/~pjd/geom_stripe_lib.tbz and an example geom_raid0 implementation is here: http://people.freebsd.org/~pjd/geom_raid0.tbz /* * Structure g_stripe_slice describe single slice on single disk. * Disk can contains many slices. * * DATA OFFSET * +-----------+ 0 * | meta data | <- should not be touched * +-----------+ 1024 * | | * | slice 0 | * | | * +-----------+ 3072 * | meta data | <- should not be touched * +-----------+ 4096 * | | * | | * | slice 1 | * | | * | | * +-----------+ 7168 * * This disk have two slice: * * d_slices[0].s_offset =3D 1024 * d_slices[0].s_length =3D 2048 * d_slices[0].s_virt_offset =3D 0 * d_slices[0].s_virt_end =3D 2048 (s_virt_offset + s_length) * * d_slices[1].s_offset =3D 4096 * d_slices[1].s_length =3D 3072 * d_slices[1].s_virt_offset =3D 2048 (d_slices[0].s_virt_end) * d_slices[1].s_virt_end =3D 5120 (s_virt_offset + s_length) */ struct g_stripe_slice { off_t s_offset; /* Offset in source disk. */ off_t s_length; /* Slice length. */ off_t s_virt_offset; /* Offset in destination disk. */ off_t s_virt_end; /* End in destination disk. */ }; /* * Structure g_stripe_group describes group of disks that are used * on given offset. This allows to create stripe on disks with different * sizes. * * disk 0 disk 1 disk 2 START_DISK START_OFFSET * +-------+ +-------+ +-------+ 0 0 * |0000000| |0000000| |0000000| * |0000000| |0000000| |0000000| * + - - - + +-------+ + - - - + 1024 0 + 1024 * 3 =3D 3072 * |1111111| |1111111| * |1111111| |1111111| * |1111111| |1111111| * +-------+ + - - - + 2560 3072 + 1536 * 2 =3D 6144 * |2222222| * +-------+ 3072 6144 + 512 * 1 =3D 6656 * * sc_groups[0].g_ndisks =3D 3 * sc_groups[0].g_disks =3D { 0, 1, 2 } * sc_groups[0].g_start_offset =3D 0 * sc_groups[0].g_end_offset =3D 3072 * sc_groups[0].g_start_disk =3D 0 * * sc_groups[1].g_ndisks =3D 2 * sc_groups[1].g_disks =3D { 0, 2 } * sc_groups[1].g_start_offset =3D 3072 * sc_groups[1].g_end_offset =3D 6144 * sc_groups[1].g_start_disk =3D 2560 * * sc_groups[2].g_ndisks =3D 1 * sc_groups[2].g_disks =3D { 2 } * sc_groups[2].g_start_offset =3D 6144 * sc_groups[2].g_end_offset =3D 6656 * sc_groups[2].g_start_disk =3D 3072 */ struct g_stripe_group { uint16_t g_ndisks; struct g_stripe_disk **g_disks; off_t g_start_offset; off_t g_end_offset; off_t g_start_disk; }; [...] /* Exported functions: */ struct g_geom *g_stripe_create(struct g_class *mp, const char *name, uint16_t ndisks, uint32_t stripesize, uint32_t sectorsize, void *priv, void (*orphan)(struct g_provider *)); void g_stripe_add_slice(struct g_geom *gp, struct g_provider *pp, off_t off= set, off_t length); int g_stripe_attach(struct g_geom *gp, struct g_provider *pp, uint16_t no, uint16_t nslices); int g_stripe_detach(struct g_geom *gp, struct g_provider *pp); int g_stripe_destroy(struct g_geom *gp, int force); --=20 Pawel Jakub Dawidek http://www.FreeBSD.org pjd@FreeBSD.org http://garage.freebsd.pl FreeBSD committer Am I Evil? Yes, I Am! --tjCHc7DPkfUGtrlw Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.4 (FreeBSD) iD8DBQFAPHlhForvXbEpPzQRAtfnAJ0RsHIql1chTQKBovg1LkjkGj3GOACg9FSE 5TN7Qvuh8ZEBLzF6YuxmNaQ= =Q54X -----END PGP SIGNATURE----- --tjCHc7DPkfUGtrlw-- From owner-freebsd-geom@FreeBSD.ORG Thu Feb 26 11:29:12 2004 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 089A516A4CE; Thu, 26 Feb 2004 11:29:12 -0800 (PST) Received: from critter.freebsd.dk (critter.freebsd.dk [212.242.86.163]) by mx1.FreeBSD.org (Postfix) with ESMTP id 15A8A43D2F; Thu, 26 Feb 2004 11:29:11 -0800 (PST) (envelope-from phk@phk.freebsd.dk) Received: from critter.freebsd.dk (localhost [127.0.0.1]) by critter.freebsd.dk (8.12.11/8.12.11) with ESMTP id i1Q9lWnQ000887; Thu, 26 Feb 2004 10:47:32 +0100 (CET) (envelope-from phk@phk.freebsd.dk) To: Pawel Jakub Dawidek From: "Poul-Henning Kamp" In-Reply-To: Your message of "Wed, 25 Feb 2004 11:30:57 +0100." <20040225103057.GB5506@darkness.comp.waw.pl> Date: Thu, 26 Feb 2004 10:47:32 +0100 Message-ID: <886.1077788852@critter.freebsd.dk> cc: freebsd-geom@freebsd.org Subject: Re: GEOM stripe library. X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 26 Feb 2004 19:29:12 -0000 In message <20040225103057.GB5506@darkness.comp.waw.pl>, Pawel Jakub Dawidek wr ites: > * This disk have two slice: > * > * d_slices[0].s_offset =3D 1024 > * d_slices[0].s_length =3D 2048 > * d_slices[0].s_virt_offset =3D 0 > * d_slices[0].s_virt_end =3D 2048 (s_virt_offset + s_length) I can't really see what we need s_virt_end for. If you need it for implementation, then please calculate it as part of setup so that the users of this library will not even see it. >/* > * Structure g_stripe_group describes group of disks that are used > * on given offset. This allows to create stripe on disks with different > * sizes. > * > * disk 0 disk 1 disk 2 START_DISK START_OFFSET > * +-------+ +-------+ +-------+ 0 0 > * |0000000| |0000000| |0000000| > * |0000000| |0000000| |0000000| > * + - - - + +-------+ + - - - + 1024 0 + 1024 * 3 =3D 3072 > * |1111111| |1111111| > * |1111111| |1111111| > * |1111111| |1111111| > * +-------+ + - - - + 2560 3072 + 1536 * 2 =3D 6144 > * |2222222| > * +-------+ 3072 6144 + 512 * 1 =3D 6656 > * > * sc_groups[0].g_ndisks =3D 3 > * sc_groups[0].g_disks =3D { 0, 1, 2 } > * sc_groups[0].g_start_offset =3D 0 > * sc_groups[0].g_end_offset =3D 3072 > * sc_groups[0].g_start_disk =3D 0 Where is the actual stripe-width ? Shouldn't that be a parameter here ? > * sc_groups[1].g_ndisks =3D 2 > * sc_groups[1].g_disks =3D { 0, 2 } > * sc_groups[1].g_start_offset =3D 3072 > * sc_groups[1].g_end_offset =3D 6144 > * sc_groups[1].g_start_disk =3D 2560 s/2560/1024/ ? > * sc_groups[2].g_ndisks =3D 1 > * sc_groups[2].g_disks =3D { 2 } > * sc_groups[2].g_start_offset =3D 6144 > * sc_groups[2].g_end_offset =3D 6656 > * sc_groups[2].g_start_disk =3D 3072 s/3072/2560/ ? I have been thinking a bit more about this, and I can see a number of in-use striping geometries you cannot handle with this, in particular non-uniform striping (64k from disk0, 128k from disk1, 64k from disk0, 128k from disk1, ...) There are also some underspecified parameters and some overspecified ones. I think it would be much more compact in the client code and just as efficient for the library code to describe just the actual layout: struct g_slicecomponent { off_t offset; off_t bite; /* private members for the library */ ... }; struct g_slicepart { off_t stripes; struct g_slicecomponent *components; /* private members for the library */ ... }; struct stripe { int nconsumer; struct g_consumer *consumer; int nparts; struct g_slicepart *slicepart; /* private members for the library */ ... }; Simply striping two 1 megabyte disks with 8k stripes, preserving the first 63 sectors for metadata would look like this in not-quite-C: struct stripe { .nconsumer = 2; .consumer = [ ptr1; ptr2; ]; .nparts = 1; .slicepart = [ { .stripes = (1M - 63*512) / 8192; .components = [ { .offset = 63; .bite = 8k / 2; }; { .offset = 63; .bite = 8k / 2; }; ] }; ] }; And setting this up in class code could look like this: struct stripe *sp; sp = g_stripe_new(/* consumers */ 2, /* parts */ 1); sp->consumer[0] = c0; sp->consumer[1] = c1; sp->slicepart[0].stripes = (size - magic) / stripewidth; sp->slicepart[0].components[0].offset = magic; sp->slicepart[0].components[0].bite = stripewidth / 2; sp->slicepart[0].components[1].offset = magic; sp->slicepart[0].components[1].bite = stripewidth / 2; error = g_stripe_calculate(sp); A geometry somewhat like yours and with all parameters filled in would then look like this in not-quite-C: struct stripe { .nconsumer = 3; .consumer = [ ptr1; ptr2; ptr3; ]; .nparts = 3; .slicepart = [ { .stripes = 1024; .components = [ { .offset = 1024; .bite = 1024; }; { .offset = 1024; .bite = 1024; }; { .offset = 1024; .bite = 1024; }; ] }; { .stripes = 3072; .components = [ { .offset = 1024 + 1024 * 1024; .bite = 512; }; { .offset = 0; .bite = 0; }; { .offset = 1024 + 1024 * 1024; .bite = 512; }; ] }; { .stripes = 256; .components = [ { .offset = 0; .bite = 0; }; { .offset = 0; .bite = 0; }; { .consumer = 2; .offset = 1024 + 1024 * 1024 + 3072 * 512; .bite = 2048; }; ] }; ] }; And would look like this in C code: struct stripe *sp; sp = g_stripe_new(/* consumers */ 3, /* parts */ 3); sp->consumer[0] = c0; sp->consumer[1] = c1; sp->consumer[2] = c2; sp->slicepart[0].stripes = 1024; for (i = 0; i < 3; i++) { sp->slicepart[0].components[i].offset = 1024; sp->slicepart[0].components[i].bite = 1024; } sp->slicepart[1].stripes = 3072; sp->slicepart[1].components[0].offset = -1; /* autocalculate */ sp->slicepart[1].components[0].bite = 512; sp->slicepart[1].components[2].offset = -1; /* autocalculate */ sp->slicepart[1].components[2].bite = 512; sp->slicepart[1].stripes = 256; sp->slicepart[1].components[2].offset = -1; /* autocalculate */ sp->slicepart[1].components[2].bite = 2048; error = g_stripe_calculate(sp); Setting offset to -1 means "continue wherever we got to". -- Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 phk@FreeBSD.ORG | TCP/IP since RFC 956 FreeBSD committer | BSD since 4.3-tahoe Never attribute to malice what can adequately be explained by incompetence. From owner-freebsd-geom@FreeBSD.ORG Thu Feb 26 12:38:10 2004 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id DE16816A4CE for ; Thu, 26 Feb 2004 12:38:10 -0800 (PST) Received: from darkness.comp.waw.pl (unknown [195.117.238.236]) by mx1.FreeBSD.org (Postfix) with ESMTP id 770DD43D41 for ; Thu, 26 Feb 2004 12:38:09 -0800 (PST) (envelope-from pjd@darkness.comp.waw.pl) Received: by darkness.comp.waw.pl (Postfix, from userid 1009) id 49A3BACE06; Thu, 26 Feb 2004 21:38:07 +0100 (CET) Date: Thu, 26 Feb 2004 21:38:07 +0100 From: Pawel Jakub Dawidek To: Poul-Henning Kamp Message-ID: <20040226203807.GD5720@darkness.comp.waw.pl> References: <20040225103057.GB5506@darkness.comp.waw.pl> <886.1077788852@critter.freebsd.dk> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="gneEPciiIl/aKvOT" Content-Disposition: inline In-Reply-To: <886.1077788852@critter.freebsd.dk> User-Agent: Mutt/1.4.2i X-PGP-Key-URL: http://people.freebsd.org/~pjd/pjd.asc X-OS: FreeBSD 5.2.1-RC2 i386 cc: freebsd-geom@freebsd.org Subject: Re: GEOM stripe library. X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 26 Feb 2004 20:38:11 -0000 --gneEPciiIl/aKvOT Content-Type: text/plain; charset=iso-8859-2 Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Thu, Feb 26, 2004 at 10:47:32AM +0100, Poul-Henning Kamp wrote: +> > * This disk have two slice: +> > * +> > * d_slices[0].s_offset =3D3D 1024 +> > * d_slices[0].s_length =3D3D 2048 +> > * d_slices[0].s_virt_offset =3D3D 0 +> > * d_slices[0].s_virt_end =3D3D 2048 (s_virt_offset + s_length) +>=20 +> I can't really see what we need s_virt_end for. If you need it for +> implementation, then please calculate it as part of setup so that +> the users of this library will not even see it. All structures are private for library. Library's user is not aware of structure g_stripe_slice nor g_stripe_group, he operates on well-known GEOM structures: g_provider, g_geom. For slices definitions he use g_stripe_add_slice() function and he don't touch geom_stripe_lib's structures at all. Defining slices for disks is one-way operation, so this design should be enough and it is also simple. Look at geom_raid0 implementaion. It only operates on generic GEOM structures. +> >/* +> > * Structure g_stripe_group describes group of disks that are used +> > * on given offset. This allows to create stripe on disks with different +> > * sizes. +> > * +> > * disk 0 disk 1 disk 2 START_DISK START_OFFSET +> > * +-------+ +-------+ +-------+ 0 0 +> > * |0000000| |0000000| |0000000| +> > * |0000000| |0000000| |0000000| +> > * + - - - + +-------+ + - - - + 1024 0 + 1024 * 3 =3D3D 30= 72 +> > * |1111111| |1111111| +> > * |1111111| |1111111| +> > * |1111111| |1111111| +> > * +-------+ + - - - + 2560 3072 + 1536 * 2 =3D3D= 6144 +> > * |2222222| +> > * +-------+ 3072 6144 + 512 * 1 =3D3D = 6656 +> > * +> > * sc_groups[0].g_ndisks =3D3D 3 +> > * sc_groups[0].g_disks =3D3D { 0, 1, 2 } +> > * sc_groups[0].g_start_offset =3D3D 0 +> > * sc_groups[0].g_end_offset =3D3D 3072 +> > * sc_groups[0].g_start_disk =3D3D 0 +>=20 +> Where is the actual stripe-width ? Shouldn't that be a parameter here ? This structure is also private. Stripe size is defined while one is calling g_stripe_create() function. "Groups" are calculated automatically, when all disks are ready. +> > * sc_groups[1].g_start_disk =3D3D 2560 +>=20 +> s/2560/1024/ ? [...] +> > * sc_groups[2].g_start_disk =3D3D 3072 +>=20 +> s/3072/2560/ ? Right, sorry for mistake. +> I have been thinking a bit more about this, and I can see a number of +> in-use striping geometries you cannot handle with this, in particular +> non-uniform striping (64k from disk0, 128k from disk1, 64k from disk0, +> 128k from disk1, ...) There are also some underspecified parameters +> and some overspecified ones. Yes, this is not possible, but I'm also not aware of software that is doing this. I don't think even veritas gives such possibility. I think this will only complicate calculations. +> And would look like this in C code: +>=20 +> struct stripe *sp; +>=20 +> sp =3D g_stripe_new(/* consumers */ 3, /* parts */ 3); +>=20 +> sp->consumer[0] =3D c0; +> sp->consumer[1] =3D c1; +> sp->consumer[2] =3D c2; +> sp->slicepart[0].stripes =3D 1024; +> for (i =3D 0; i < 3; i++) { +> sp->slicepart[0].components[i].offset =3D 1024; +> sp->slicepart[0].components[i].bite =3D 1024; +> } +> sp->slicepart[1].stripes =3D 3072; +> sp->slicepart[1].components[0].offset =3D -1; /* autocalculate */ +> sp->slicepart[1].components[0].bite =3D 512; +> sp->slicepart[1].components[2].offset =3D -1; /* autocalculate */ +> sp->slicepart[1].components[2].bite =3D 512; +>=20 +> sp->slicepart[1].stripes =3D 256; +> sp->slicepart[1].components[2].offset =3D -1; /* autocalculate */ +> sp->slicepart[1].components[2].bite =3D 2048; +>=20 +> error =3D g_stripe_calculate(sp); +>=20 +> Setting offset to -1 means "continue wherever we got to". I'm not sure how to comments this, because I think you misunderstand my model. Here is an example (without errors checking) how to create stripe using 3 disks: da0, da1, da2. Stripe size is 64kB, metadata are stored on last disk's sector. gp =3D g_stripe_create(mp, "test.stripe", /* ndisks */ 3, /* stripesize */ 65536, /* sectorsize */ 512, NULL, orphan_func); pp =3D g_provider_by_name("da0"); error =3D g_stripe_attach(gp, pp, /* disk no */ 0, /* number of slices */ 1= ); g_stripe_add_slice(gp, pp, /* offset */ 0, /* size */ pp->mediasize - pp->sectorsize); pp =3D g_provider_by_name("da1"); error =3D g_stripe_attach(gp, pp, /* disk no */ 1, /* number of slices */ 1= ); g_stripe_add_slice(gp, pp, /* offset */ 0, /* size */ pp->mediasize - pp->sectorsize); pp =3D g_provider_by_name("da2"); error =3D g_stripe_attach(gp, pp, /* disk no */ 2, /* number of slices */ 1= ); g_stripe_add_slice(gp, pp, /* offset */ 0, /* size */ pp->mediasize - pp->sectorsize); That's all, the whole rest will be calculated and configured automatically. --=20 Pawel Jakub Dawidek http://www.FreeBSD.org pjd@FreeBSD.org http://garage.freebsd.pl FreeBSD committer Am I Evil? Yes, I Am! --gneEPciiIl/aKvOT Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.4 (FreeBSD) iD8DBQFAPlkvForvXbEpPzQRAo10AKDjmlq589InyUrLHBVtLO9A3r+b5ACg7aVM doAGTyFia1wt+/9Lt8ntAXs= =GNeo -----END PGP SIGNATURE----- --gneEPciiIl/aKvOT-- From owner-freebsd-geom@FreeBSD.ORG Thu Feb 26 12:43:49 2004 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id CA93C16A4CF; Thu, 26 Feb 2004 12:43:49 -0800 (PST) Received: from critter.freebsd.dk (critter.freebsd.dk [212.242.86.163]) by mx1.FreeBSD.org (Postfix) with ESMTP id 3579043D1F; Thu, 26 Feb 2004 12:43:49 -0800 (PST) (envelope-from phk@phk.freebsd.dk) Received: from critter.freebsd.dk (localhost [127.0.0.1]) by critter.freebsd.dk (8.12.11/8.12.11) with ESMTP id i1QKhmNm003154; Thu, 26 Feb 2004 21:43:48 +0100 (CET) (envelope-from phk@phk.freebsd.dk) To: Pawel Jakub Dawidek From: "Poul-Henning Kamp" In-Reply-To: Your message of "Thu, 26 Feb 2004 21:38:07 +0100." <20040226203807.GD5720@darkness.comp.waw.pl> Date: Thu, 26 Feb 2004 21:43:48 +0100 Message-ID: <3153.1077828228@critter.freebsd.dk> cc: freebsd-geom@FreeBSD.org Subject: Re: GEOM stripe library. X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 26 Feb 2004 20:43:50 -0000 In message <20040226203807.GD5720@darkness.comp.waw.pl>, Pawel Jakub Dawidek wr ites: >+> And would look like this in C code: >+>=20 >+> struct stripe *sp; >+>=20 >+> sp =3D g_stripe_new(/* consumers */ 3, /* parts */ 3); >+>=20 >+> sp->consumer[0] =3D c0; >+> sp->consumer[1] =3D c1; >+> sp->consumer[2] =3D c2; >+> sp->slicepart[0].stripes =3D 1024; >+> for (i =3D 0; i < 3; i++) { >+> sp->slicepart[0].components[i].offset =3D 1024; >+> sp->slicepart[0].components[i].bite =3D 1024; >+> } >+> sp->slicepart[1].stripes =3D 3072; >+> sp->slicepart[1].components[0].offset =3D -1; /* autocalculate */ >+> sp->slicepart[1].components[0].bite =3D 512; >+> sp->slicepart[1].components[2].offset =3D -1; /* autocalculate */ >+> sp->slicepart[1].components[2].bite =3D 512; >+>=20 >+> sp->slicepart[1].stripes =3D 256; >+> sp->slicepart[1].components[2].offset =3D -1; /* autocalculate */ >+> sp->slicepart[1].components[2].bite =3D 2048; >+>=20 >+> error =3D g_stripe_calculate(sp); >+>=20 >+> Setting offset to -1 means "continue wherever we got to". > >I'm not sure how to comments this, because I think you misunderstand >my model. The beauty of my model is that there is only two function calls doing one memory allocation involved in this, rather than a ton of function calls where you have to check the error code after each and every one. (See my daemon-news "blueprint" articles for more on this). I also think that you should be prepared for non-uniform striping, even if you think it complicates things, because it exits in real life, even if you have not seen it. If you library is insufficiently general, then it will not gain us enough code reuse to be worth the trouble. -- Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 phk@FreeBSD.ORG | TCP/IP since RFC 956 FreeBSD committer | BSD since 4.3-tahoe Never attribute to malice what can adequately be explained by incompetence. From owner-freebsd-geom@FreeBSD.ORG Thu Feb 26 13:19:20 2004 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id CEDA716A4CE for ; Thu, 26 Feb 2004 13:19:20 -0800 (PST) Received: from darkness.comp.waw.pl (unknown [195.117.238.236]) by mx1.FreeBSD.org (Postfix) with ESMTP id 54E8843D1D for ; Thu, 26 Feb 2004 13:19:20 -0800 (PST) (envelope-from pjd@darkness.comp.waw.pl) Received: by darkness.comp.waw.pl (Postfix, from userid 1009) id A04F0AE693; Thu, 26 Feb 2004 22:19:18 +0100 (CET) Date: Thu, 26 Feb 2004 22:19:18 +0100 From: Pawel Jakub Dawidek To: Poul-Henning Kamp Message-ID: <20040226211918.GE5720@darkness.comp.waw.pl> References: <20040226203807.GD5720@darkness.comp.waw.pl> <3153.1077828228@critter.freebsd.dk> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="APlYHCtpeOhspHkB" Content-Disposition: inline In-Reply-To: <3153.1077828228@critter.freebsd.dk> User-Agent: Mutt/1.4.2i X-PGP-Key-URL: http://people.freebsd.org/~pjd/pjd.asc X-OS: FreeBSD 5.2.1-RC2 i386 cc: freebsd-geom@FreeBSD.org Subject: Re: GEOM stripe library. X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 26 Feb 2004 21:19:20 -0000 --APlYHCtpeOhspHkB Content-Type: text/plain; charset=iso-8859-2 Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Thu, Feb 26, 2004 at 09:43:48PM +0100, Poul-Henning Kamp wrote: +> The beauty of my model is that there is only two function calls doing +> one memory allocation involved in this, rather than a ton of function +> calls where you have to check the error code after each and every +> one. (See my daemon-news "blueprint" articles for more on this). Hmm, strange. I was thinking that in GEOM world adding disk by disk is much more correct and natural, because of tasting events than building disks table first and looking at orphans in meanwhile and at the end call one function. +> I also think that you should be prepared for non-uniform striping, +> even if you think it complicates things, because it exits in real +> life, even if you have not seen it. I'll think about this, thanks for pointing this out. --=20 Pawel Jakub Dawidek http://www.FreeBSD.org pjd@FreeBSD.org http://garage.freebsd.pl FreeBSD committer Am I Evil? Yes, I Am! --APlYHCtpeOhspHkB Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.4 (FreeBSD) iD8DBQFAPmLWForvXbEpPzQRAsCqAKDFhlHPghBD8BCFCoJPnghYYuAOsACfdSh4 cx9HZtTslbtbKsHsnlqV3RY= =RkFu -----END PGP SIGNATURE----- --APlYHCtpeOhspHkB--