Date: Fri, 28 Sep 2012 16:16:24 -0400 From: George Neville-Neil <gnn@neville-neil.com> To: Matt Burke <mattblists@icritical.com> Cc: freebsd-fs@FreeBSD.org Subject: Re: [patch] DTrace disk IO Message-ID: <49E027FB-2FA7-4839-84C8-3112A039DEE7@neville-neil.com> In-Reply-To: <5056D896.8060607@icritical.com> References: <5056D896.8060607@icritical.com>
next in thread | previous in thread | raw e-mail | index | archive | help
On Sep 17, 2012, at 04:00 , Matt Burke <mattblists@icritical.com> wrote: > I've recently been trying to get a grip on measuring disk IO latency = (per transaction), and have found it to be rather difficult given the = asynchronous nature of the beast, and also I can't find a way of = translating the bio start of transaction timestamps to anything I can = use in DTrace when pulling them out. >=20 Interesting. Have you looked at all at the I/O module I added in HEAD? Best, George > So I knocked up this little patch against releng/9.1 to put a couple = of DTrace probes in the right places to pick up crucial data like the = now+then timestamps while they're present. >=20 > The predicate on the probe is needed to pick up the right firing - for = reasons I've not been able to fathom because gstat et al give correct = data, devstat_end_transaction is called multiple times for a given = operation - from g_disk_done(), then g_io_deliver() - without anything = useful in the bio struct (device name, number, etc). There also seem to = be a lot of firings coming from the following path which I don't = understand, again without anything useful in the bio: > kernel`devstat_end_transaction+0x13b > kernel`g_io_deliver+0x1b0 > kernel`g_io_schedule_up+0xa6 > kernel`g_up_procbody+0x5c > kernel`fork_exit+0x11f > kernel`0xffffffff80c1c3fe >=20 > Catching flushes is also proving problematic. It would seem that = devstat_end_transaction_bio() is called, but the bio and devstat structs = are virtually empty. bp->bio_dev, bp->bio_disk, ds->device_name, = ds->device_number, ds_unit_number are all null/empty, so I know that one = disk has flushed, and I know how long it took, but I can't find out = which disk it was. >=20 >=20 > Thoughts? >=20 >=20 >=20 > Index: sys/kern/subr_devstat.c > =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D > --- sys/kern/subr_devstat.c (revision 240481) > +++ sys/kern/subr_devstat.c (working copy) > @@ -29,6 +29,7 @@ > #include <sys/cdefs.h> > __FBSDID("$FreeBSD$"); >=20 > +#include "opt_kdtrace.h" > #include <sys/param.h> > #include <sys/kernel.h> > #include <sys/systm.h> > @@ -41,9 +42,22 @@ > #include <sys/conf.h> > #include <vm/vm.h> > #include <vm/pmap.h> > +#include <sys/sdt.h> >=20 > #include <machine/atomic.h> >=20 > +SDT_PROVIDER_DEFINE(devstat); > +SDT_PROBE_DEFINE(devstat, subr_devstat, devstat_end_transaction, = stat, stat); > +SDT_PROBE_ARGTYPE(devstat, subr_devstat, devstat_end_transaction, = stat, 0, "struct devstat *"); > +SDT_PROBE_ARGTYPE(devstat, subr_devstat, devstat_end_transaction, = stat, 1, "uint32_t"); > +SDT_PROBE_ARGTYPE(devstat, subr_devstat, devstat_end_transaction, = stat, 2, "struct bintime *"); > +SDT_PROBE_ARGTYPE(devstat, subr_devstat, devstat_end_transaction, = stat, 3, "struct bintime *"); > +SDT_PROBE_DEFINE(devstat, subr_devstat, devstat_end_transaction_bio, = stat, stat); > +SDT_PROBE_ARGTYPE(devstat, subr_devstat, devstat_end_transaction_bio, = stat, 0, "struct devstat *"); > +SDT_PROBE_ARGTYPE(devstat, subr_devstat, devstat_end_transaction_bio, = stat, 1, "struct bio *"); > + > + > + > static int devstat_num_devs; > static long devstat_generation =3D 1; > static int devstat_version =3D DEVSTAT_VERSION; > @@ -312,6 +326,8 @@ >=20 > ds->end_count++; > atomic_add_rel_int(&ds->sequence0, 1); > + > + SDT_PROBE(devstat, subr_devstat, devstat_end_transaction, stat, = ds, bytes, now, then, 0); > } >=20 > void > @@ -332,6 +348,8 @@ > else=20 > flg =3D DEVSTAT_NO_DATA; >=20 > + SDT_PROBE(devstat, subr_devstat, devstat_end_transaction_bio, = stat, ds, bp, 0, 0, 0); > + > devstat_end_transaction(ds, bp->bio_bcount - bp->bio_resid, > DEVSTAT_TAG_SIMPLE, flg, NULL, = &bp->bio_t0); > } >=20 >=20 >=20 >=20 > Sample dtrace script: > =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D >=20 > BEGIN > { > bio_cmds[1] =3D "READ"; > bio_cmds[2] =3D "WRITE"; > bio_cmds[4] =3D "DELETE"; > bio_cmds[8] =3D "GETATTR"; > bio_cmds[16] =3D "FLUSH"; > } >=20 > devstat::devstat_end_transaction_bio: > { > self->bio =3D args[1]; > } >=20 > devstat::devstat_end_transaction: > /self->bio && args[0]->device_number/ > { > diff_frac =3D args[2]->frac - args[3]->frac; > diff_ufrac =3D (diff_frac < 0) ? (args[3]->frac - = args[2]->frac) : diff_frac; > diff =3D (1000000000*(diff_ufrac>>32))>>32; >=20 > printf("%d\t%s%d\t%s\t%d\t0.%09d\n", timestamp, > args[0]->device_name, args[0]->unit_number, > bio_cmds[self->bio->bio_cmd], > args[1], > diff > ); > } >=20 >=20 > --=20 > The information contained in this message is confidential and intended = for the addressee only. If you have received this message in error, or = there are any problems with its content, please contact the sender.=20 >=20 > iCritical is a trading name of Critical Software Ltd. Registered in = England: 04909220. > Registered Office: IC2, Keele Science Park, Keele, Staffordshire, ST5 = 5NH. >=20 > This message has been scanned for security threats by iCritical. = www.icritical.com >=20 > _______________________________________________ > freebsd-fs@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-fs > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?49E027FB-2FA7-4839-84C8-3112A039DEE7>