From owner-freebsd-geom@FreeBSD.ORG Sun Sep 23 01:04:51 2012
Message-ID: <80F518854AE34A759D9441AE1A60D2DC@multiplay.co.uk>
From: "Steven Hartland"
To: "Andriy Gapon" ,
References: <505DF1A3.1020809@FreeBSD.org>
Date: Sun, 23 Sep 2012 02:04:39 +0100
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="----=_NextPart_000_02EE_01CD992F.CA115AA0"
Cc: freebsd-geom@FreeBSD.org
Subject: Re: zfs zvol: set geom mediasize right at creation time
X-BeenThere: freebsd-geom@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: GEOM-specific discussions and implementations

This is a multi-part message in MIME format.

------=_NextPart_000_02EE_01CD992F.CA115AA0
Content-Type: text/plain; format=flowed; charset="Windows-1252"; reply-type=original
Content-Transfer-Encoding: 7bit

----- Original Message -----
From: "Andriy Gapon"

> Please review the following patch.
>
> In addition to what the description says I almost by accident sneaked another
> change into the patch.  It's setting of stripesize to volblocksize.  I think
> that the change should make sense, but it is really a different change.
>
>
> A side note: setting sectorsize to volblocksize seemed like an overkill and it
> would certainly mess the existing zvols in use.  Maybe there should be another
> property like reportedblocksize or something.
>
> commit 1585e6cfb602c2a2647b9f802445bb174bc430a4
> Author: Andriy Gapon
> Date:   Wed Sep 19 20:49:28 2012 +0300
>
>     zvol: set mediasize in geom provider right upon its creation
>
>     ... instead of deferring the action until first open.
>     Unlike upstream this has no benefit on FreeBSD.
>     We know that as soon as the provider is created it is going to be tasted
>     and thus opened.  Initial mediasize of zero causes tasting failure
>     and subsequent retasting because of the size change.
>
> diff --git a/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zvol.c b/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zvol.c
> index d47d270..6e9e7a3 100644
> --- a/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zvol.c
> +++ b/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zvol.c
> @@ -475,6 +475,7 @@ zvol_create_minor(const char *name)
>  	zvol_state_t *zv;
>  	objset_t *os;
>  	dmu_object_info_t doi;
> +	uint64_t volblocksize, volsize;
>  	int error;
>
>  	ZFS_LOG(1, "Creating ZVOL %s...", name);
> @@ -535,9 +536,20 @@ zvol_create_minor(const char *name)
>  	zv = zs->zss_data = kmem_zalloc(sizeof (zvol_state_t), KM_SLEEP);
>  #else /* !sun */
>
> +	error = zap_lookup(os, ZVOL_ZAP_OBJ, "size", 8, 1, &volsize);
> +	if (error) {
> +		ASSERT(error == 0);
> +		dmu_objset_disown(os, zvol_tag);
> +		mutex_exit(&spa_namespace_lock);
> +		return (error);
> +	}
> +
>  	DROP_GIANT();
>  	g_topology_lock();
>  	zv = zvol_geom_create(name);
> +	zv->zv_volsize = volsize;
> +	zv->zv_provider->mediasize = zv->zv_volsize;
> +
>  #endif /* !sun */
>
>  	(void) strlcpy(zv->zv_name, name, MAXPATHLEN);
> @@ -554,6 +566,7 @@ zvol_create_minor(const char *name)
>  	error = dmu_object_info(os, ZVOL_OBJ, &doi);
>  	ASSERT(error == 0);
>  	zv->zv_volblocksize = doi.doi_data_block_size;
> +	zv->zv_provider->stripesize = zv->zv_volblocksize;
>
>  	if (spa_writeable(dmu_objset_spa(os))) {
>  		if (zil_replay_disable)

Do you know what effect the volblocksize change will have on a
volume whose disk block size changes, e.g. via a quirk for a 4k disk
being added?

I ask as we're testing a patch here which changes ashift to be based on
stripesize instead of sectorsize, but in its current form it has some
odd side effects on pools which are boot pools.

Said patch is attached for reference.

    Regards
    Steve

================================================
This e.mail is private and confidential between Multiplay (UK) Ltd. and the person or entity to whom it is addressed.
In the event of misdirection, the recipient is prohibited from using, copying, printing or otherwise disseminating it or any information contained in it.
In the event of misdirection, illegible or incomplete transmission please telephone +44 845 868 1337 or return the E.mail to postmaster@multiplay.co.uk.

------=_NextPart_000_02EE_01CD992F.CA115AA0
Content-Type: text/plain; format=flowed; name="zfs-ashift-fix.txt"; reply-type=original
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename="zfs-ashift-fix.txt"

Changes zfs zpool initial / desired ashift to be based off stripesize
instead of sectorsize making it compatible with drives marked with
the 4k sector size quirk.

Without the correct min block size BIO_DELETE requests passed to
a large number of current SSDs via TRIM don't actually perform
any LBA TRIM so it's vital for the correct operation of TRIM to get
the correct min block size.

To do this we added the additional dashift (desired ashift) to
vdev_open_func_t calls. This was needed as just updating ashift to
be based off stripesize would mean that a device's reported minimum
transfer size (ashift) could increase and that in turn would cause
member devices to be unusable and hence break pools with error
ZFS-8000-5E.
--- sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/vdev_impl.h.orig	2012-07-03 11:48:22.353483879 +0000
+++ sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/vdev_impl.h	2012-07-03 11:59:17.195442033 +0000
@@ -55,7 +55,7 @@
 /*
  * Virtual device operations
  */
-typedef int	vdev_open_func_t(vdev_t *vd, uint64_t *size, uint64_t *ashift);
+typedef int	vdev_open_func_t(vdev_t *vd, uint64_t *size, uint64_t *ashift, uint64_t *dashift);
 typedef void	vdev_close_func_t(vdev_t *vd);
 typedef uint64_t	vdev_asize_func_t(vdev_t *vd, uint64_t psize);
 typedef int	vdev_io_start_func_t(zio_t *zio);
--- sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c.orig	2012-07-03 12:53:37.716867380 +0000
+++ sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c	2012-07-03 12:01:58.522455031 +0000
@@ -1122,6 +1122,7 @@
 	uint64_t osize = 0;
 	uint64_t asize, psize;
 	uint64_t ashift = 0;
+	uint64_t dashift = 0;

 	ASSERT(vd->vdev_open_thread == curthread ||
 	    spa_config_held(spa, SCL_STATE_ALL, RW_WRITER) == SCL_STATE_ALL);
@@ -1151,7 +1152,7 @@
 		return (ENXIO);
 	}

-	error = vd->vdev_ops->vdev_op_open(vd, &osize, &ashift);
+	error = vd->vdev_ops->vdev_op_open(vd, &osize, &ashift, &dashift);

 	/*
 	 * Reset the vdev_reopening flag so that we actually close
@@ -1251,7 +1252,7 @@
 		 * For testing purposes, a higher ashift can be requested.
 		 */
 		vd->vdev_asize = asize;
-		vd->vdev_ashift = MAX(ashift, vd->vdev_ashift);
+		vd->vdev_ashift = MAX(MAX(ashift, dashift), vd->vdev_ashift);
 	} else {
 		/*
 		 * Make sure the alignment requirement hasn't increased.
--- sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_disk.c.orig	2012-07-03 11:49:34.103219588 +0000
+++ sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_disk.c	2012-07-03 12:51:44.521525471 +0000
@@ -103,7 +103,7 @@
 }

 static int
-vdev_disk_open(vdev_t *vd, uint64_t *psize, uint64_t *ashift)
+vdev_disk_open(vdev_t *vd, uint64_t *psize, uint64_t *ashift, uint64_t *dashift)
 {
 	spa_t *spa = vd->vdev_spa;
 	vdev_disk_t *dvd;
@@ -284,7 +284,7 @@
 	}

 	/*
-	 * Determine the device's minimum transfer size.
+	 * Determine the device's minimum and desired transfer size.
 	 * If the ioctl isn't supported, assume DEV_BSIZE.
 	 */
 	if (ldi_ioctl(dvd->vd_lh, DKIOCGMEDIAINFOEXT, (intptr_t)&dkmext,
@@ -292,6 +292,7 @@
 		dkmext.dki_pbsize = DEV_BSIZE;

 	*ashift = highbit(MAX(dkmext.dki_pbsize, SPA_MINBLOCKSIZE)) - 1;
+	*dashift = *ashift;

 	/*
 	 * Clear the nowritecache bit, so that on a vdev_reopen() we will
--- sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_file.c.orig	2012-07-03 11:48:42.314740333 +0000
+++ sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_file.c	2012-07-03 11:57:22.579381320 +0000
@@ -47,7 +47,7 @@
 }

 static int
-vdev_file_open(vdev_t *vd, uint64_t *psize, uint64_t *ashift)
+vdev_file_open(vdev_t *vd, uint64_t *psize, uint64_t *ashift, uint64_t *dashift)
 {
 	vdev_file_t *vf;
 	vnode_t *vp;
@@ -127,6 +127,7 @@

 	*psize = vattr.va_size;
 	*ashift = SPA_MINBLOCKSHIFT;
+	*dashift = SPA_MINBLOCKSHIFT;

 	return (0);
 }
--- sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_geom.c.orig	2012-07-03 12:50:50.158161825 +0000
+++ sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_geom.c	2012-07-03 12:52:01.408085155 +0000
@@ -416,7 +416,7 @@
 }

 static int
-vdev_geom_open(vdev_t *vd, uint64_t *psize, uint64_t *ashift)
+vdev_geom_open(vdev_t *vd, uint64_t *psize, uint64_t *ashift, uint64_t *dashift)
 {
 	struct g_provider *pp;
 	struct g_consumer *cp;
@@ -502,9 +502,10 @@
 	*psize = pp->mediasize;

 	/*
-	 * Determine the device's minimum transfer size.
+	 * Determine the device's minimum and desired transfer size.
 	 */
 	*ashift = highbit(MAX(pp->sectorsize, SPA_MINBLOCKSIZE)) - 1;
+	*dashift = highbit(MAX(pp->stripesize, SPA_MINBLOCKSIZE)) - 1;

 	/*
 	 * Clear the nowritecache settings, so that on a vdev_reopen()
--- sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_mirror.c.orig	2012-07-03 11:49:22.342245151 +0000
+++ sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_mirror.c	2012-07-03 11:58:02.161948585 +0000
@@ -127,7 +127,7 @@
 }

 static int
-vdev_mirror_open(vdev_t *vd, uint64_t *asize, uint64_t *ashift)
+vdev_mirror_open(vdev_t *vd, uint64_t *asize, uint64_t *ashift, uint64_t *dashift)
 {
 	int numerrors = 0;
 	int lasterror = 0;
@@ -150,6 +150,7 @@

 		*asize = MIN(*asize - 1, cvd->vdev_asize - 1) + 1;
 		*ashift = MAX(*ashift, cvd->vdev_ashift);
+		*dashift = *ashift;
 	}

 	if (numerrors == vd->vdev_children) {
--- sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_missing.c.orig	2012-07-03 11:49:10.545275865 +0000
+++ sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_missing.c	2012-07-03 11:58:07.670470640 +0000
@@ -40,7 +40,7 @@

 /* ARGSUSED */
 static int
-vdev_missing_open(vdev_t *vd, uint64_t *psize, uint64_t *ashift)
+vdev_missing_open(vdev_t *vd, uint64_t *psize, uint64_t *ashift, uint64_t *dashift)
 {
 	/*
 	 * Really this should just fail.  But then the root vdev will be in the
@@ -50,6 +50,7 @@
 	 */
 	*psize = 0;
 	*ashift = 0;
+	*dashift = 0;
 	return (0);
 }

--- sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_raidz.c.orig	2012-07-03 11:49:03.675875505 +0000
+++ sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_raidz.c	2012-07-03 11:58:15.334806334 +0000
@@ -1447,7 +1447,7 @@
 }

 static int
-vdev_raidz_open(vdev_t *vd, uint64_t *asize, uint64_t *ashift)
+vdev_raidz_open(vdev_t *vd, uint64_t *asize, uint64_t *ashift, uint64_t *dashift)
 {
 	vdev_t *cvd;
 	uint64_t nparity = vd->vdev_nparity;
@@ -1476,6 +1476,7 @@

 		*asize = MIN(*asize - 1, cvd->vdev_asize - 1) + 1;
 		*ashift = MAX(*ashift, cvd->vdev_ashift);
+		*dashift = *ashift;
 	}

 	*asize *= vd->vdev_children;
--- sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_root.c.orig	2012-07-03 11:49:27.901760380 +0000
+++ sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_root.c	2012-07-03 11:58:19.704427068 +0000
@@ -50,7 +50,7 @@
 }

 static int
-vdev_root_open(vdev_t *vd, uint64_t *asize, uint64_t *ashift)
+vdev_root_open(vdev_t *vd, uint64_t *asize, uint64_t *ashift, uint64_t *dashift)
 {
 	int lasterror = 0;
 	int numerrors = 0;
@@ -78,6 +78,7 @@

 	*asize = 0;
 	*ashift = 0;
+	*dashift = 0;

 	return (0);
 }

------=_NextPart_000_02EE_01CD992F.CA115AA0--

From owner-freebsd-geom@FreeBSD.ORG Sun Sep 23 04:48:35 2012
Date: Sat, 22 Sep 2012 21:48:28 -0700
From: John-Mark Gurney
To: Pawel Jakub Dawidek
Message-ID: <20120923044828.GI19036@funkthat.com>
References: <20120919040430.GF19036@funkthat.com> <20120922162025.GE1454@garage.freebsd.pl>
In-Reply-To: <20120922162025.GE1454@garage.freebsd.pl>
Cc: freebsd-geom@freebsd.org
Subject: Re: geli and BIO_FLUSH and/or BIO_ORDERED issue?

Pawel Jakub Dawidek wrote this message on Sat, Sep 22, 2012 at 18:20 +0200:
> On Tue, Sep 18, 2012 at 09:04:30PM -0700, John-Mark Gurney wrote:
> > I was looking at geli and I'm not sure if it's implementing BIO_FLUSH
> > and/or BIO_ORDERED properly...
> >
> > From my understanding, BIO_ORDERED is supposed to wait for the
> > previous _WRITES to complete before returning so that you can ensure
> > that data is on disk, i.e. _ORDERED is set on a BIO_FLUSH...
> >
> > BIO_ORDERED is handled by diskq_* code such that when you add an _ORDERED
> > command, all commands are put after it, but there doesn't appear to
> > be any code to ensure that an _ORDERED command waits for previous
> > pending commands to complete..
> >
> > This is extra obvious in eli in that a _FLUSH is immediately dispatched,
> > even when there may be _WRITEs that haven't been finished encrypting and
> > sent down to the disk to get _FLUSHed...
> >
> > Any comments about this?
>
> Hmm, BIO_ORDERED was introduced pretty recently and GEOM classes were
> not updated to honour it, but it also seems to be too complex to handle
> in GEOM classes. I wonder if we could hold off new writes and wait for
> the in-progress writes in GEOM if we spot BIO_ORDERED request without
> the need to implement this logic in GEOM classes.

Yeh.  When I was looking at it, it definitely seems like it should be
something that we provide a generic method of handling (as part of
bioq_*), since all the geom classes need to handle it...

It'll be a bit difficult since we'd need to introduce some
synchronization between the up/down threads to start the new writes when
the previous writes finish...  And with a class like geli, you can get
better latency if we move handling into the class though...

-- 
  John-Mark Gurney				Voice: +1 415 225 5579

     "All that I will do, has been done, All that I have, has not."
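
For illustration only, here is a toy userland model of the "hold the
ordered request until the earlier writes have completed" idea discussed
above.  It is a sketch under stated assumptions, not GEOM or bioq_* code:
struct toyq and every toyq_* function are hypothetical names made up for
this example, the model is single-threaded, and it tracks at most one
pending ordered request at a time.

```c
#include <stdio.h>
#include <string.h>

/* Toy model, NOT GEOM code: all names below are hypothetical. */
#define TOYQ_MAXHELD 16

struct toyq {
	int  inflight;            /* writes dispatched but not yet completed */
	int  barrier_pending;     /* an ordered request is waiting to go down */
	int  held[TOYQ_MAXHELD];  /* write ids that arrived behind the barrier */
	int  nheld;
	char log[128];            /* dispatch order, e.g. "W1 W2 F W3 " */
};

static void
toyq_log(struct toyq *q, const char *s)
{
	strncat(q->log, s, sizeof(q->log) - strlen(q->log) - 1);
}

static void
toyq_dispatch_write(struct toyq *q, int id)
{
	char buf[16];

	snprintf(buf, sizeof(buf), "W%d ", id);
	toyq_log(q, buf);
	q->inflight++;
}

/* Submit a write; it is held back while an ordered request is pending. */
static int
toyq_write(struct toyq *q, int id)
{
	if (!q->barrier_pending) {
		toyq_dispatch_write(q, id);
		return (0);
	}
	if (q->nheld == TOYQ_MAXHELD)
		return (-1);          /* no room to hold the write */
	q->held[q->nheld++] = id;
	return (0);
}

/* Send the ordered request down, then the writes queued behind it. */
static void
toyq_release(struct toyq *q)
{
	toyq_log(q, "F ");
	q->barrier_pending = 0;
	for (int i = 0; i < q->nheld; i++)
		toyq_dispatch_write(q, q->held[i]);
	q->nheld = 0;
}

/* Submit an ordered request (think of a flush with the ordered flag). */
static void
toyq_ordered(struct toyq *q)
{
	q->barrier_pending = 1;
	if (q->inflight == 0)
		toyq_release(q);
}

/* One previously dispatched write has completed. */
static void
toyq_write_done(struct toyq *q)
{
	if (q->inflight > 0)
		q->inflight--;
	if (q->inflight == 0 && q->barrier_pending)
		toyq_release(q);
}

/* Two writes, an ordered request, a late write, then two completions. */
static const char *
toyq_demo(struct toyq *q)
{
	memset(q, 0, sizeof(*q));
	toyq_write(q, 1);
	toyq_write(q, 2);
	toyq_ordered(q);      /* must wait: W1 and W2 are still in flight */
	toyq_write(q, 3);     /* arrives behind the ordered request */
	toyq_write_done(q);
	toyq_write_done(q);   /* last earlier write done: F goes, then W3 */
	return (q->log);
}
```

In this model the ordered request is only logged as dispatched once the
in-flight count drops to zero, which is the latency cost mentioned above:
the flush cannot overlap with the writes that precede it.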

From owner-freebsd-geom@FreeBSD.ORG Sun Sep 23 06:42:50 2012
Date: Sun, 23 Sep 2012 08:43:07 +0200
From: Pawel Jakub Dawidek
To: freebsd-geom@freebsd.org
Message-ID: <20120923064307.GK1454@garage.freebsd.pl>
References: <20120919040430.GF19036@funkthat.com> <20120922162025.GE1454@garage.freebsd.pl> <20120923044828.GI19036@funkthat.com>
In-Reply-To: <20120923044828.GI19036@funkthat.com>
Subject: Re: geli and BIO_FLUSH and/or BIO_ORDERED issue?

On Sat, Sep 22, 2012 at 09:48:28PM -0700, John-Mark Gurney wrote:
> Pawel Jakub Dawidek wrote this message on Sat, Sep 22, 2012 at 18:20 +0200:
> > On Tue, Sep 18, 2012 at 09:04:30PM -0700, John-Mark Gurney wrote:
> > > I was looking at geli and I'm not sure if it's implementing BIO_FLUSH
> > > and/or BIO_ORDERED properly...
> > >
> > > From my understanding, BIO_ORDERED is supposed to wait for the
> > > previous _WRITES to complete before returning so that you can ensure
> > > that data is on disk, i.e. _ORDERED is set on a BIO_FLUSH...
> > >
> > > BIO_ORDERED is handled by diskq_* code such that when you add an _ORDERED
> > > command, all commands are put after it, but there doesn't appear to
> > > be any code to ensure that an _ORDERED command waits for previous
> > > pending commands to complete..
> > >
> > > This is extra obvious in eli in that a _FLUSH is immediately dispatched,
> > > even when there may be _WRITEs that haven't been finished encrypting and
> > > sent down to the disk to get _FLUSHed...
> > >
> > > Any comments about this?
> >
> > Hmm, BIO_ORDERED was introduced pretty recently and GEOM classes were
> > not updated to honour it, but it also seems to be too complex to handle
> > in GEOM classes. I wonder if we could hold off new writes and wait for
> > the in-progress writes in GEOM if we spot BIO_ORDERED request without
> > the need to implement this logic in GEOM classes.
>
> Yeh.  When I was looking at it, it definitely seems like it should be
> something that we provide a generic method of handling (as part of
> bioq_*), since all the geom classes need to handle it...

No, in most cases this is not a problem, because most GEOM classes
just pass all I/O requests without any reordering, so it is enough if
the very last layer (e.g. the disk driver) handles BIO_ORDERED properly.

I thought what you meant with GELI was that it can reorder writes, which
it needs more time to process, relative to BIO_FLUSH requests, which it
handles immediately.

-- 
Pawel Jakub Dawidek                       http://www.wheelsystems.com
FreeBSD committer                         http://www.FreeBSD.org
Am I Evil? Yes, I Am!                     http://tupytaj.pl

From owner-freebsd-geom@FreeBSD.ORG Sun Sep 23 07:10:17 2012
Date: Sun, 23 Sep 2012 00:10:15 -0700
From: John-Mark Gurney
To: Pawel Jakub Dawidek
Message-ID: <20120923070842.GJ19036@funkthat.com>
References: <20120919040430.GF19036@funkthat.com> <20120922162025.GE1454@garage.freebsd.pl> <20120923044828.GI19036@funkthat.com> <20120923064307.GK1454@garage.freebsd.pl>
In-Reply-To: <20120923064307.GK1454@garage.freebsd.pl>
Cc: freebsd-geom@freebsd.org
Subject: Re: geli and BIO_FLUSH and/or BIO_ORDERED issue?

Pawel Jakub Dawidek wrote this message on Sun, Sep 23, 2012 at 08:43 +0200:
> On Sat, Sep 22, 2012 at 09:48:28PM -0700, John-Mark Gurney wrote:
> > Pawel Jakub Dawidek wrote this message on Sat, Sep 22, 2012 at 18:20 +0200:
> > > On Tue, Sep 18, 2012 at 09:04:30PM -0700, John-Mark Gurney wrote:
> > > > I was looking at geli and I'm not sure if it's implementing BIO_FLUSH
> > > > and/or BIO_ORDERED properly...
> > > >
> > > > From my understanding, BIO_ORDERED is supposed to wait for the
> > > > previous _WRITES to complete before returning so that you can ensure
> > > > that data is on disk, i.e. _ORDERED is set on a BIO_FLUSH...
> > > >
> > > > BIO_ORDERED is handled by diskq_* code such that when you add an _ORDERED
> > > > command, all commands are put after it, but there doesn't appear to
> > > > be any code to ensure that an _ORDERED command waits for previous
> > > > pending commands to complete..
> > > >
> > > > This is extra obvious in eli in that a _FLUSH is immediately dispatched,
> > > > even when there may be _WRITEs that haven't been finished encrypting and
> > > > sent down to the disk to get _FLUSHed...
> > > >
> > > > Any comments about this?
> > >
> > > Hmm, BIO_ORDERED was introduced pretty recently and GEOM classes were
> > > not updated to honour it, but it also seems to be too complex to handle
> > > in GEOM classes. I wonder if we could hold off new writes and wait for
> > > the in-progress writes in GEOM if we spot BIO_ORDERED request without
> > > the need to implement this logic in GEOM classes.
> >
> > Yeh.  When I was looking at it, it definitely seems like it should be
> > something that we provide a generic method of handling (as part of
> > bioq_*), since all the geom classes need to handle it...
>
> No, in most cases this is not a problem, because most of GEOM classes
> just pass all I/O requests without any reordering, so it is enough if
> the very last layer (eg. disk driver) handles BIO_ORDERED properly.
>
> I thought what you meant with GELI was that it can reorder writes, for
> which it needs more time with BIO_FLUSH requests that it handles
> immediately.

I did mean that GELI can reorder writes, since when it schedules the
writes on a queue, there is nothing that ensures that each thread on an
SMP system (I have 6 on mine) will complete the requests in the order
they were queued.

So, even if GELI simply added the _FLUSH command to the queue, we'd
still need to have a method for GELI to serialize the writes, either
going into the queue (by holding off the _ORDERED command till all
outstanding _WRITES are back, but this will increase latency) or out
of the queue (by giving each bio an ordered id, recording which ids
have the _ORDERED flag set, and only submitting the _ORDERED command
once all the previous _WRITES have been submitted)...

-- 
  John-Mark Gurney				Voice: +1 415 225 5579

     "All that I will do, has been done, All that I have, has not."

From owner-freebsd-geom@FreeBSD.ORG Mon Sep 24 04:27:48 2012
Message-ID: <505FE141.5070803@yandex.ru>
Date: Mon, 24 Sep 2012 08:27:45 +0400
From: "Andrey V. Elsukov"
To: Andriy Gapon
References: <505DF409.9070908@FreeBSD.org>
In-Reply-To: <505DF409.9070908@FreeBSD.org>
Cc: freebsd-geom@FreeBSD.org
Subject: Re: re-tasting of providers held with withering consumers

On 22.09.2012 21:23, Andriy Gapon wrote:
>
> Because removal of withered geoms is done asynchronously, there is a window when
> some provider may require re-tasting (because of media change or size change),
> but it would still be in use by the withering geom.  That prevents re-tasting a
> class of that withering geom (for obvious reasons).
>
> The following patch tries to trigger owed re-tasting after the withering
> provider is gone for good:
> http://people.freebsd.org/~avg/geom-withered-retaste.diff

Hi, Andriy,

it seems you forgot to include the g_renew_provider() implementation in
the patch.

-- 
WBR, Andrey V. Elsukov

From owner-freebsd-geom@FreeBSD.ORG Mon Sep 24 06:20:33 2012
Date: Mon, 24 Sep 2012 11:50:32 +0530
From: Jack
To: freebsd-geom@freebsd.org
Subject: Re: GEOM and CAM pass behaviour

Hi again,

On Sat, Sep 22, 2012 at 9:50 AM, Jack wrote:
> Hi all,
>
> I would like to know 2 things regarding geom and cam pass driver
> behaviour.
> It would be helpful if someone could educate me regarding
> this.
>
> Consider the case where a scsi/ata tape/hard drive is attached to the
> system after the FreeBSD is booted(from different drive) completely.
>
>
> 1. Is it true that GEOM layer will always taste the (hard drive/tape
> drive)device whether we access device via cam pass driver or do block
> access. In other words, say if we never do a block access on the
> selected device, and we only access this device via cam pass driver,
> will then Geom will ever have to taste(e.g. for partition table, etc.
> or for other purposes) this
> device.?
>
> or is it something like, as soon as the device is attached, geom layer
> will always taste it - it doesn't matter how we access the device.
>
> 2. Also, is it possible that while a process is accessing the selected
> device via cam pass driver, the block access is also possible at that
> same time, (by same or different process) to that same device.
> I know that when a process is accessing the device via block access,
> it is possible for another(or even same) process to access that same
> device via cam pass driver, but I would like to know whether the other
> way is true also.
>
>
> Thanks.
> --
> Jack

It seems that two or more processes can access the same device via the
CAM pass driver, and that block access to that device is also possible
at the same time (by the same or a different process). Queueing is done
at the HBA driver layer. But I have not been able to confirm the GEOM
tasting behaviour.

Is there a way to prevent a da/ada/sa device from being created for a
particular device, leaving only the pass device, so that the device is
accessible only via the CAM pass driver and not via the GEOM/block layer?

I'm developing a userland utility, and my intent is that a particular
ATA/SCSI device attached to the system (after FreeBSD has booted
completely) is accessible only via the CAM pass driver (i.e. via the
pass device), and not via the GEOM/block layer (i.e. via the da/ada/sa
driver).
The device is selected by the user of utility. Any suggestions, or approaches would be helpful. Regards. -- Jack From owner-freebsd-geom@FreeBSD.ORG Mon Sep 24 06:37:48 2012 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id BD0F3106564A for ; Mon, 24 Sep 2012 06:37:48 +0000 (UTC) (envelope-from jacks.1785@gmail.com) Received: from mail-ie0-f182.google.com (mail-ie0-f182.google.com [209.85.223.182]) by mx1.freebsd.org (Postfix) with ESMTP id 8941C8FC0A for ; Mon, 24 Sep 2012 06:37:48 +0000 (UTC) Received: by ieak10 with SMTP id k10so9829669iea.13 for ; Sun, 23 Sep 2012 23:37:47 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:date:message-id:subject:from:to:content-type; bh=VEARH/ug3fYEWhJsg1NLNh/7w+yx8nBIf+MHJL86UiA=; b=E1Qyi5KftaS3mKb2lFWP7olGp2grO3FlyepPljBWAXnlpvHpj0ydEHUxQBf+qukFmf BAfHFFKgITB1xWkWq/3cizhDmO+d2hf+veycYBn7pdg4VmU79Rqd09Rals77GoQNQHy7 1FFIy485V0mGZK1DZp77Lb6hzXOLgLDNZBh40j0vrH2Ii4V9/5K2AIOeiAiHlGajGtbU 0PG20eioo534XdrCnQ+b2BIX++eI/QOgv6TjKwmrlS6Sn2ye2pXzgndGdZTqlrmvZhys qqoKDmtVGZZ+sBXIntGayIGCX5uYoH9Sp7Lu/AuYjL3JGWTf9K/Avf1Gx8tRwcGO/oGl iJ4w== MIME-Version: 1.0 Received: by 10.50.180.169 with SMTP id dp9mr4342876igc.8.1348468667766; Sun, 23 Sep 2012 23:37:47 -0700 (PDT) Received: by 10.64.44.105 with HTTP; Sun, 23 Sep 2012 23:37:47 -0700 (PDT) Date: Mon, 24 Sep 2012 12:07:47 +0530 Message-ID: From: Jack To: freebsd-geom Content-Type: text/plain; charset=ISO-8859-1 Subject: GEOM tasting behaviour X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 24 Sep 2012 06:37:48 -0000 Hi all, I'm little bit unclear about the GEOM tasting behaviour. It would be helpful if someone could clarify it. 
Is it true that the GEOM layer will always taste a device (hard drive or tape drive) regardless of whether we access it via the cam pass driver or via block access? In other words, if we never do block access (i.e. via the da/ada/sa devices) on a particular device, and only access it via the cam pass driver, will GEOM ever taste this device (e.g. for a partition table or for other purposes)?

Or is it that, as soon as the device is attached, the GEOM layer will always taste it, no matter how we access the device?

Regards.
--
Jack

From owner-freebsd-geom@FreeBSD.ORG Mon Sep 24 11:07:22 2012 Return-Path: Delivered-To: freebsd-geom@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 137F61065674 for ; Mon, 24 Sep 2012 11:07:22 +0000 (UTC) (envelope-from owner-bugmaster@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id E99068FC17 for ; Mon, 24 Sep 2012 11:07:21 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.5/8.14.5) with ESMTP id q8OB7LPX085942 for ; Mon, 24 Sep 2012 11:07:21 GMT (envelope-from owner-bugmaster@FreeBSD.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.5/8.14.5/Submit) id q8OB7LYx085940 for freebsd-geom@FreeBSD.org; Mon, 24 Sep 2012 11:07:21 GMT (envelope-from owner-bugmaster@FreeBSD.org) Date: Mon, 24 Sep 2012 11:07:21 GMT Message-Id: <201209241107.q8OB7LYx085940@freefall.freebsd.org> X-Authentication-Warning: freefall.freebsd.org: gnats set sender to owner-bugmaster@FreeBSD.org using -f From: FreeBSD bugmaster To: freebsd-geom@FreeBSD.org Cc: Subject: Current problem reports assigned to freebsd-geom@FreeBSD.org X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help:
List-Subscribe: , X-List-Received-Date: Mon, 24 Sep 2012 11:07:22 -0000

Note: to view an individual PR, use:
  http://www.freebsd.org/cgi/query-pr.cgi?pr=(number).

The following is a listing of current problems submitted by FreeBSD users.
These represent problem reports covering all versions including
experimental development code and obsolete releases.

S Tracker      Resp.      Description
--------------------------------------------------------------------------------
o kern/170038  geom       [geom] geom_mirror always starts degraded after reboot
o kern/169539  geom       [geom] [patch] fix ability to run gmirror on MSI MegaR
a bin/169077   geom       bsdinstall(8) does not use partition labels in /etc/fs
f kern/165745  geom       [geom] geom_multipath page fault on removed drive
o kern/165428  geom       [glabel][patch] Add xfs support to glabel
o kern/164254  geom       [geom] gjournal not stopping on GPT partitions
o kern/164252  geom       [geom] gjournal overflow
o kern/164143  geom       [geom] Partition table not recognized after upgrade R8
a kern/163020  geom       [geli] [patch] enable the Camellia-XTS on GEOM ELI
o kern/162690  geom       [geom] gpart label changes only take effect after a re
o kern/162010  geom       [geli] panic: Provider's error should be set (error=0)
o kern/161979  geom       [geom] glabel doesn't update after newfs, and glabel s
o kern/161752  geom       [geom] glabel(8) doesn't get gpt label change
o bin/161677   geom       gpart(8) Probably bug in gptboot
o kern/160562  geom       [geom][patch] Allow to insert new component to geom_ra
o kern/160409  geom       [geli] failed to attach provider
f kern/159595  geom       [geom] [panic] panic on gmirror unload in vbox [regres
f kern/159414  geom       [isp] isp(4)+gmultipath(8) : removing active fiber pat
p kern/158398  geom       [headers] [patch] includes
o kern/158197  geom       [geom] geom_cache with size>1000 leads to panics
o kern/157879  geom       [libgeom] [regression] ABI change without version bump
o kern/157863  geom       [geli] kbdmux prevents geli passwords from being enter
o kern/157739  geom       [geom] GPT labels with geom_multipath
o kern/157724  geom       [geom] gpart(8) 'add' command must preserve gap for sc
o kern/157723  geom       [geom] GEOM should not process 'c' (raw) partitions fo
o kern/157108  geom       [gjournal] dumpon(8) fails on gjournal providers
o kern/155994  geom       [geom] Long "Suspend time" when reading large files fr
o kern/154226  geom       [geom] GEOM label does not change when you modify them
o kern/150858  geom       [geom] [geom_label] [patch] glabel(8) is not compatibl
o kern/150626  geom       [geom] [gjournal] gjournal(8) destroys label
o kern/150555  geom       [geom] gjournal unusable on GPT partitions
o kern/150334  geom       [geom] [udf] [patch] geom label does not support UDF
o kern/149762  geom       volume labels with rogue characters
o bin/149215   geom       [panic] [geom_part] gpart(8): Delete linux's slice via
o kern/147667  geom       [gmirror] Booting with one component of a gmirror, the
o kern/145818  geom       [geom] geom_stat_open showing cached information for n
o kern/145042  geom       [geom] System stops booting after printing message "GE
o kern/143455  geom       gstripe(8) in RELENG_8 (31st Jan 2010) broken
o kern/142563  geom       [geom] [hang] ioctl freeze in zpool
o kern/141740  geom       [geom] gjournal(8): g_journal_destroy concurrent error
o kern/140352  geom       [geom] gjournal + glabel not working
o kern/135898  geom       [geom] Severe filesystem corruption - large files or l
o kern/134113  geom       [geli] Problem setting secondary GELI key
o kern/133931  geom       [geli] [request] intentionally wrong password to destr
o bin/132845   geom       [geom] [patch] ggated(8) does not close files opened a
o bin/131415   geom       [geli] keystrokes are unregulary sent to Geli when typ
o kern/131353  geom       [geom] gjournal(8) kernel lock
o kern/129674  geom       [geom] gjournal root did not mount on boot
o kern/129645  geom       gjournal(8): GEOM_JOURNAL causes system to fail to boo
o kern/129245  geom       [geom] gcache is more suitable for suffix based provid
o kern/127420  geom       [geom] [gjournal] [panic] Journal overflow on gmirrore
o kern/124973  geom       [gjournal] [patch] boot order affects geom_journal con
o kern/124969  geom       gvinum(8): gvinum raid5 plex does not detect missing s
o kern/123962  geom       [panic] [gjournal] gjournal (455Gb data, 8Gb journal),
o kern/123122  geom       [geom] GEOM / gjournal kernel lock
o kern/122738  geom       [geom] gmirror list "losts consumers" after gmirror de
o kern/122067  geom       [geom] [panic] Geom crashed during boot
o kern/121364  geom       [gmirror] Removing all providers create a "zombie" mir
o kern/120091  geom       [geom] [geli] [gjournal] geli does not prompt for pass
o kern/115856  geom       [geli] ZFS thought it was degraded when it should have
o kern/115547  geom       [geom] [patch] [request] let GEOM Eli get password fro
f kern/113957  geom       [gmirror] gmirror is intermittently reporting a degrad
o kern/113837  geom       [geom] unable to access 1024 sector size storage
o kern/113419  geom       [geom] geom fox multipathing not failing back
o kern/107707  geom       [geom] [patch] [request] add new class geom_xbox360 to
o kern/94632   geom       [geom] Kernel output resets input while GELI asks for
o kern/90582   geom       [geom] [panic] Restore cause panic string (ffs_blkfree
o bin/90093    geom       fdisk(8) incapable of altering in-core geometry
o kern/87544   geom       [gbde] mmaping large files on a gbde filesystem deadlo
o bin/86388    geom       [geom] [geom_part] periodic(8) daily should backup gpa
o kern/84556   geom       [geom] [panic] GBDE-encrypted swap causes panic at shu
o kern/79251   geom       [2TB] newfs fails on 2.6TB gbde device
o kern/79035   geom       [vinum] gvinum unable to create a striped set of mirro
o bin/78131    geom       gbde(8) "destroy" not working.

74 problems total.
From owner-freebsd-geom@FreeBSD.ORG Tue Sep 25 19:37:27 2012 Return-Path: Delivered-To: geom@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 626F5106564A; Tue, 25 Sep 2012 19:37:27 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from bigwig.baldwin.cx (bigknife-pt.tunnel.tserv9.chi1.ipv6.he.net [IPv6:2001:470:1f10:75::2]) by mx1.freebsd.org (Postfix) with ESMTP id 360728FC18; Tue, 25 Sep 2012 19:37:27 +0000 (UTC) Received: from jhbbsd.localnet (unknown [209.249.190.124]) by bigwig.baldwin.cx (Postfix) with ESMTPSA id 39C9AB944; Tue, 25 Sep 2012 15:37:26 -0400 (EDT) From: John Baldwin To: freebsd-ia64@freebsd.org, Paul Procacci Date: Tue, 25 Sep 2012 14:37:30 -0400 User-Agent: KMail/1.13.5 (FreeBSD/8.2-CBSD-20110714-p17; KDE/4.5.5; amd64; ; ) References: <201209251720.q8PHKE7T072562@freefall.freebsd.org> In-Reply-To: <201209251720.q8PHKE7T072562@freefall.freebsd.org> MIME-Version: 1.0 Content-Type: Text/Plain; charset="iso-8859-15" Content-Transfer-Encoding: 7bit Message-Id: <201209251437.30766.jhb@freebsd.org> X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.7 (bigwig.baldwin.cx); Tue, 25 Sep 2012 15:37:26 -0400 (EDT) Cc: geom@freebsd.org Subject: Re: ia64/171814: [panic] bioq_init or bioq_remove (unsure which) X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 25 Sep 2012 19:37:27 -0000 On Tuesday, September 25, 2012 1:20:14 pm Paul Procacci wrote: > The following reply was made to PR ia64/171814; it has been noted by GNATS. 
>
> From: Paul Procacci
> To: John Baldwin
> Cc: freebsd-ia64@freebsd.org, freebsd-gnats-submit@freebsd.org
> Subject: Re: ia64/171814: [panic] bioq_init or bioq_remove (unsure which)
> Date: Tue, 25 Sep 2012 12:11:17 -0500
>
> --047d7b66f839532c0a04ca89cbf7
> Content-Type: text/plain; charset=ISO-8859-1
>
> Thanks John for your response.
>
> Here is the output provided what you had explained to do:
>
>
> 0xffffffff80865023 is in devstat_remove_entry
> (/usr/src/sys/kern/subr_devstat.c:193).
> 188
> 189         /* Remove this entry from the devstat queue */
> 190         atomic_add_acq_int(&ds->sequence1, 1);
> 191         if (ds->id == NULL) {
> 192                 devstat_num_devs--;
> 193                 STAILQ_REMOVE(devstat_head, ds, devstat, dev_links);
> 194         }
> 195         devstat_free(ds);
> 196         devstat_generation++;
> 197         mtx_unlock(&devstat_mutex);

I think the devstat entry must have been destroyed twice somehow. Earlier in geom_subr.c the devstat entry is created with a unit of -1:

    struct g_consumer *
    g_new_consumer(struct g_geom *gp)
    {
            ...
            cp->stat = devstat_new_entry(cp, -1, 0, DEVSTAT_ALL_SUPPORTED,
                DEVSTAT_TYPE_DIRECT, DEVSTAT_PRIORITY_MAX);
    }

That should result in devstat_new_entry() setting ds->id to 'cp' (which is clearly not NULL), so it shouldn't even attempt the STAILQ_REMOVE(), but that is where it appears to have faulted.
-- John Baldwin

From owner-freebsd-geom@FreeBSD.ORG Tue Sep 25 21:19:41 2012 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id DAF8E10657F4 for ; Tue, 25 Sep 2012 21:19:41 +0000 (UTC) (envelope-from root@free.fr) Received: from smtp5-g21.free.fr (smtp5-g21.free.fr [IPv6:2a01:e0c:1:1599::14]) by mx1.freebsd.org (Postfix) with ESMTP id 96E9B8FC08 for ; Tue, 25 Sep 2012 21:19:39 +0000 (UTC) Received: from free.fr (unknown [82.235.65.2]) by smtp5-g21.free.fr (Postfix) with ESMTP id 99766D480F7 for ; Tue, 25 Sep 2012 23:19:35 +0200 (CEST) From: Raoul MEGELAS To: freebsd-geom@freebsd.org Date: Tue, 25 Sep 2012 23:19:34 +0200 Sender: root@free.fr Message-Id: <20120925211935.99766D480F7@smtp5-g21.free.fr> Subject: gpart on macbook air X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 25 Sep 2012 21:19:42 -0000

Hi all,

I am trying to install FreeBSD-CURRENT on a MacBook Air (internal SSD, as you know), resizing the OS X partitions to leave some room for FreeBSD.

I noticed the following:

1. On FreeBSD, deleting a partition with gpart, say:
   gpart delete -i 4 ada0
   damages the OS X boot. Of course, booting from a backup disk and repairing the disk does the trick.

2. rEFIt does not recognize the GPT FreeBSD partitions, but it does recognize and boot non-GPT FreeBSD partitions.

Googling a bit, I found this: http://randomcomputerbits.blogspot.com/ on 2007: freebsd-on-macbook.html

Is this still true? If not, how can this problem be solved?

Thanks a lot.
Raoul From owner-freebsd-geom@FreeBSD.ORG Wed Sep 26 18:58:38 2012 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 13C8A106564A for ; Wed, 26 Sep 2012 18:58:38 +0000 (UTC) (envelope-from sullivanms@gmail.com) Received: from mail-bk0-f54.google.com (mail-bk0-f54.google.com [209.85.214.54]) by mx1.freebsd.org (Postfix) with ESMTP id 987618FC16 for ; Wed, 26 Sep 2012 18:58:37 +0000 (UTC) Received: by bkcje9 with SMTP id je9so547973bkc.13 for ; Wed, 26 Sep 2012 11:58:36 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:from:date:message-id:subject:to:content-type; bh=PGTud3CUC+XjFgf3NUPL2yEOeLyFv5fQ4UH8DKAGgu8=; b=zcL03fgzqo+1yVT9ccErW5vcOSATYJYMacKO9iyABrbJ9qVbwwdd/zY8MIb9gWeLhY 6cPnZuM1hV/ht3sEmz1ADiwgxhvwMdmYBvnUXneWemMpEBSpT5csITtAR4xYG606GlmQ q7odI/h5hL3cHh3dzI8yS7OATyIKeTcn03szDVvlwPK8BUyFCtTwYXy/adVqVTgGp/dO ggP0LEC4jc6UX8kYBfR69WViKTklOGF5EEt+9oBk9RFp+Un1V3Ai6HVMDUbxhqtG/jPd GCS3Jh+xgQubuBMWDh3mfDvJoTiJLSXihHiAz0hiViHCnH41yGgcv15hqhE78P2IynNo EaYg== Received: by 10.204.4.149 with SMTP id 21mr1040865bkr.122.1348685916133; Wed, 26 Sep 2012 11:58:36 -0700 (PDT) MIME-Version: 1.0 Received: by 10.204.48.136 with HTTP; Wed, 26 Sep 2012 11:58:15 -0700 (PDT) From: Michael Sullivan Date: Wed, 26 Sep 2012 14:58:15 -0400 Message-ID: To: freebsd-geom@freebsd.org Content-Type: text/plain; charset=ISO-8859-1 Subject: GELI tastes partitions before labels, prompts for passphrase for both X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Sep 2012 18:58:38 -0000 Hello, I'm running 9.1-RC1. I configured an encrypted root disk with GELI using the GPT label ("gpt/zsystem0") rather than the partition name ("ada0p3"). 
Everything works fine, but as it boots, I'm prompted for the passphrase for the partition and have to make that fail before I get prompted for the passphrase for the label. It's a minor annoyance but might be worse on a server with many disks. I've seen a few other people mention this behavior but haven't seen anything to indicate that anybody is working on it. Is there a solution out there that I'm not aware of? My understanding of GEOM is rudimentary at this point, but poking around in the code the only ideas I have are to create a blacklist of providers (through a tunable string?) and check against it during tasting; or something like adding a flag to the ELI metadata and, if it's set, checking the provider's class and giving up if it's not a label. Do either of those approaches sound reasonable? Thanks Michael From owner-freebsd-geom@FreeBSD.ORG Fri Sep 28 22:22:03 2012 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id B00E9106564A for ; Fri, 28 Sep 2012 22:22:03 +0000 (UTC) (envelope-from wblock@wonkity.com) Received: from wonkity.com (wonkity.com [67.158.26.137]) by mx1.freebsd.org (Postfix) with ESMTP id 6D3578FC0A for ; Fri, 28 Sep 2012 22:22:03 +0000 (UTC) Received: from wonkity.com (localhost [127.0.0.1]) by wonkity.com (8.14.5/8.14.5) with ESMTP id q8SMLvUu020507 for ; Fri, 28 Sep 2012 16:21:57 -0600 (MDT) (envelope-from wblock@wonkity.com) Received: from localhost (wblock@localhost) by wonkity.com (8.14.5/8.14.5/Submit) with ESMTP id q8SMLutE020504 for ; Fri, 28 Sep 2012 16:21:57 -0600 (MDT) (envelope-from wblock@wonkity.com) Date: Fri, 28 Sep 2012 16:21:56 -0600 (MDT) From: Warren Block To: freebsd-geom@freebsd.org Message-ID: User-Agent: Alpine 2.00 (BSF 1167 2008-08-23) MIME-Version: 1.0 Content-Type: TEXT/PLAIN; format=flowed; charset=US-ASCII X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.2.7 (wonkity.com 
[127.0.0.1]); Fri, 28 Sep 2012 16:21:57 -0600 (MDT) Subject: Simple way to clear arbitrary drive metadata? X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 28 Sep 2012 22:22:03 -0000

Last night, I found that the remnants of a GPT backup table on an MBR drive prevented it from booting. When reusing drives from old mirrors, old mirror metadata can be a problem also. And there may be old hardware RAID metadata at the end of the drive.

It would be great if dd understood negative seek values. This would get most of that old metadata:

  dd if=/dev/zero of=/dev/ada8 seek=-34

...but dd does not understand negative seek values. (Been on my list for a while to look at that.)

Which leaves things like

  diskinfo ada8 | cut -f4
  (subtract 34)
  dd if=/dev/zero of=/dev/ada8 seek=(calculated value)

That can be done in one command line with bc and backticks, but it's not clear or elegant. gpart can clear secondary GPT tables, but I'm pretty sure it won't wipe out that space unless it actually is a GPT table. Likewise with glabel and gmirror, they're safe because they only touch data they understand.

Is there something simpler and more blunt?
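
The diskinfo-then-dd steps above could be wrapped up roughly as follows. This is only a sketch under stated assumptions: the function names tail_start and wipe_tail are made up for illustration, the default of 34 sectors is the figure used above, and the diskinfo output is assumed to be tab-separated with the sector size in field 2 and the media size in sectors in field 4. The dd step is destructive, so nothing runs until wipe_tail is called explicitly.

```shell
#!/bin/sh
# Sketch: zero the last N sectors of a disk, where leftover backup GPT,
# gmirror, or hardware RAID metadata may live.
# tail_start and wipe_tail are hypothetical helper names.

# Print the first sector of the tail region.
# $1 = media size in sectors, $2 = number of trailing sectors to wipe
tail_start() {
    echo $(( $1 - $2 ))
}

# Zero the last $2 sectors (default 34) of disk $1, e.g. "wipe_tail ada8".
# Assumes FreeBSD diskinfo(8) output: field 2 is the sector size in bytes,
# field 4 is the media size in sectors.
wipe_tail() {
    disk=$1
    count=${2:-34}
    secsize=$(diskinfo "$disk" | cut -f2)
    sectors=$(diskinfo "$disk" | cut -f4)
    dd if=/dev/zero of="/dev/$disk" bs="$secsize" \
        seek="$(tail_start "$sectors" "$count")" count="$count"
}
```

Something like "wipe_tail ada8" would then stand in for the manual calculation. As with the plain dd command, this writes blindly and does not check what is stored in those sectors.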
From owner-freebsd-geom@FreeBSD.ORG Fri Sep 28 23:53:16 2012 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 5A040106566B for ; Fri, 28 Sep 2012 23:53:16 +0000 (UTC) (envelope-from delphij@delphij.net) Received: from anubis.delphij.net (anubis.delphij.net [64.62.153.212]) by mx1.freebsd.org (Postfix) with ESMTP id 3AF848FC08 for ; Fri, 28 Sep 2012 23:53:16 +0000 (UTC) Received: from epsilon.delphij.net (drawbridge.ixsystems.com [206.40.55.65]) (using TLSv1 with cipher ECDHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by anubis.delphij.net (Postfix) with ESMTPSA id 8B52D1D6D7; Fri, 28 Sep 2012 16:53:10 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=delphij.net; s=anubis; t=1348876390; bh=htV+REDFPUTUmfFpY62chYcjO76K9WcBsEs7bHKQ54I=; h=Date:From:Reply-To:To:CC:Subject:References:In-Reply-To; b=tXRZRfV7B1x9ykU4tCGJteENfOTDLE+Pttt/78MWnI1i8T8bvDG6+brehlJkt9hNC ypo3pVnT4smeEqYBCRnkz9zDSDzh0HebYoiH6GgtQ+4uzOXwwsQZeQUNEUX4t8YaWc d8niudewMoptvyJcwhqQ0Q+3NG1a5CPQNKn6HigM= Message-ID: <50663866.9070001@delphij.net> Date: Fri, 28 Sep 2012 16:53:10 -0700 From: Xin Li Organization: The freeBSD Project User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:10.0.7) Gecko/20120830 Thunderbird/10.0.7 MIME-Version: 1.0 To: Warren Block References: In-Reply-To: X-Enigmail-Version: 1.4.3 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: freebsd-geom@freebsd.org Subject: Re: Simple way to clear arbitrary drive metadata? 
X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: d@delphij.net List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 28 Sep 2012 23:53:16 -0000 -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA256 On 09/28/12 15:21, Warren Block wrote: > Last night, I found that the remnants of a GPT backup table on an > MBR drive prevented it from booting. When reusing drives from old > mirrors, old mirror metadata can be a problem also. And there may > be old hardware RAID metadata at the end of the drive. > > It would be great if dd understood negative seek values. This > would get most of that old metadata: > > dd if=/dev/zero of=/dev/ada8 seek=-34 > > ...but dd does not understand negative seek values. (Been on my > list for a while to look at that.) > > Which leaves things like > > diskinfo ada8 | cut -f4 (subtract 34) dd if=/dev/zero of=/dev/ada8 > seek=(calculated value) > > That can be done in one command line with bc and backticks, but > it's not clear or elegant. gpart can clear secondary GPT tables, > but I'm pretty sure it won't wipe out that space unless it actually > is a GPT table. Likewise with glabel and gmirror, they're safe > because they only touch data they understand. > > Is there something simpler and more blunt? I think you can do: gpart destroy -F ada8 gpart create -s gpt ada8 gpart destroy -F ada8 The second 'create' will write an empty partition table to the secondary table. Cheers, - -- Xin LI https://www.delphij.net/ FreeBSD - The Power to Serve! 
Live free or die -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.19 (FreeBSD) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iQEcBAEBCAAGBQJQZjhmAAoJEG80Jeu8UPuzp/wIAJ9TQdwRIvfMn5zP3yMqYKIV OvVNSUaecPaav9G7CEwApl1bQnmCSYepv6FASH65CoNyr14kioS0e8BET4s/GzQD LhliFucVnd6X6POdyL5VEdJ78UYuox8h9elykJBSlwdgeWGCpoRwI9sG8+oWtl+Z zpYKgUBU+eCTsXWjIBbLGphhgXgDT+j1uEks8qxbVsUNZH054tKWEQ6iK2+bKGYa 6dp3M+Lrt6qJLcKWtFvxMVP2rzCzYmRmSFkKVUiIHgSOp2yH4uFvzRo9CY74azuL QS4/+h5iuMtnMiXKr5sWoGOi4WCTLVSnmo07ac9aP4H0jlTuVmJ/Qq/hoqOuMZg= =ae1M -----END PGP SIGNATURE----- From owner-freebsd-geom@FreeBSD.ORG Sat Sep 29 17:27:31 2012 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 9C764106564A for ; Sat, 29 Sep 2012 17:27:31 +0000 (UTC) (envelope-from wblock@wonkity.com) Received: from wonkity.com (wonkity.com [67.158.26.137]) by mx1.freebsd.org (Postfix) with ESMTP id 3FD588FC08 for ; Sat, 29 Sep 2012 17:27:31 +0000 (UTC) Received: from wonkity.com (localhost [127.0.0.1]) by wonkity.com (8.14.5/8.14.5) with ESMTP id q8THROmr026771; Sat, 29 Sep 2012 11:27:24 -0600 (MDT) (envelope-from wblock@wonkity.com) Received: from localhost (wblock@localhost) by wonkity.com (8.14.5/8.14.5/Submit) with ESMTP id q8THROqL026768; Sat, 29 Sep 2012 11:27:24 -0600 (MDT) (envelope-from wblock@wonkity.com) Date: Sat, 29 Sep 2012 11:27:24 -0600 (MDT) From: Warren Block To: d@delphij.net In-Reply-To: <50663866.9070001@delphij.net> Message-ID: References: <50663866.9070001@delphij.net> User-Agent: Alpine 2.00 (BSF 1167 2008-08-23) MIME-Version: 1.0 Content-Type: TEXT/PLAIN; format=flowed; charset=US-ASCII X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.2.7 (wonkity.com [127.0.0.1]); Sat, 29 Sep 2012 11:27:24 -0600 (MDT) Cc: freebsd-geom@freebsd.org Subject: Re: Simple way to clear arbitrary drive metadata? 
X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 29 Sep 2012 17:27:31 -0000 On Fri, 28 Sep 2012, Xin Li wrote: > On 09/28/12 15:21, Warren Block wrote: >> Last night, I found that the remnants of a GPT backup table on an >> MBR drive prevented it from booting. When reusing drives from old >> mirrors, old mirror metadata can be a problem also. And there may >> be old hardware RAID metadata at the end of the drive. >> >> It would be great if dd understood negative seek values. This >> would get most of that old metadata: >> >> dd if=/dev/zero of=/dev/ada8 seek=-34 >> >> ...but dd does not understand negative seek values. (Been on my >> list for a while to look at that.) >> >> Which leaves things like >> >> diskinfo ada8 | cut -f4 (subtract 34) dd if=/dev/zero of=/dev/ada8 >> seek=(calculated value) >> >> That can be done in one command line with bc and backticks, but >> it's not clear or elegant. gpart can clear secondary GPT tables, >> but I'm pretty sure it won't wipe out that space unless it actually >> is a GPT table. Likewise with glabel and gmirror, they're safe >> because they only touch data they understand. >> >> Is there something simpler and more blunt? > > I think you can do: > > gpart destroy -F ada8 > gpart create -s gpt ada8 > gpart destroy -F ada8 > > The second 'create' will write an empty partition table to the > secondary table. Nice! It works perfectly for GPT and glabel metadata. gmirror is a problem. If the gmirror kernel module is loaded, drives with gmirror metadata create a mirror. GEOM prevents writes to the drive then. sysctl kern.geom.debugflags=16 allows writes, but the mirror is still in memory and running. 'gmirror stop' (which the system also does on shutdown) helpfully writes the whole metadata block back to the drive. 
After reboot, it's right back where it was. Seems like the only way to deal with gmirror is to have the user check for it directly, and stop the mirror if the attached drive is a member. Probably the same for graid, but I don't know if I have a system where I can test that.
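
The "check for it directly" step might look something like the sketch below. It rests on assumptions: mirrors_using is a made-up helper name, and the input is assumed to be the script-friendly output of "gmirror status -s", taken here to be one line per component in the form "mirror/gm0  COMPLETE  ada8 (ACTIVE)". If the real output differs, the awk field numbers would need adjusting.

```shell
#!/bin/sh
# Sketch: list gmirror mirrors that have a given drive as a component.
# mirrors_using is a hypothetical helper name. It reads status lines on
# stdin, assumed to look like:
#   mirror/gm0  COMPLETE  ada8 (ACTIVE)
# $1 = component name, e.g. ada8
mirrors_using() {
    awk -v comp="$1" '$3 == comp { print $1 }' | sort -u
}
```

Usage would be along the lines of "gmirror status -s | mirrors_using ada8". If that prints a mirror name, gmirror(8) documents a 'remove' subcommand (drop a component and clear its metadata) and a 'clear' subcommand (clear metadata on a provider); whether either avoids the write-back on 'gmirror stop' described above has not been tested here.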