Date:      Wed, 18 Apr 2012 18:09:40 -0400
From:      Ryan Stone <rysto32@gmail.com>
To:        freebsd-hackers@freebsd.org
Cc:        George Neville-Neil <gnn@freebsd.org>
Subject:   [PATCH] Implementation of DTrace sched provider (with bonus schedgraph script)
Message-ID:  <CAFMmRNzRN1GHDCVPPvfzR101bTS8-5KagmXZLADjifnJ-YG7Ww@mail.gmail.com>

I've implemented the sched provider for FreeBSD.  This provider
provides probes that fire when various scheduling decisions are made.
This implementation is intended to be compatible with the
implementation in Solaris and its derivatives, with the following
caveats:

Several probes reference features that are not implemented in FreeBSD.
These probes are provided but will never fire.  They are:
cpucaps-sleep, cpucaps-wakeup, schedctl-nopreempt, schedctl-preempt
and schedctl-yield.

I've added some extra probes that do not exist in Solaris and its
derivatives, to make it possible to implement a schedgraph DTrace
script.  These probes are lend-pri and load-change.  Scripts intended
to be portable to other implementations should not reference these
probes.
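
As a quick way to check that the provider is loaded, a minimal sketch
like the following counts how often each program is switched onto a
CPU.  It is an illustration only, not part of the patch, and it uses
only the on-cpu probe, which also exists in the Solaris provider.  The
aggregation name oncpu is arbitrary.

```d
/* Count how many times each program is put on a CPU. */
sched:::on-cpu
{
    @oncpu[execname] = count();
}
```

Saved to a file and run with dtrace -s, this prints the aggregation
when dtrace(1) is interrupted.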

FreeBSD currently does not properly translate internal types to the
portable implementation-independent types defined in the
documentation.  This means that your scripts will see a struct thread
* where they should get a lwpsinfo_t *, for example.
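
To illustrate the difference, here is a hedged sketch of a FreeBSD-only
clause.  It assumes that args[0] of the wakeup probe is the struct
thread * mentioned above and reads the td_tid and td_name fields of
that structure directly.  A portable script would instead use members
of lwpsinfo_t such as pr_lwpid.

```d
/*
 * Assumption: args[0] is a struct thread * on FreeBSD, not the
 * lwpsinfo_t * described in the Solaris documentation.
 */
sched:::wakeup
{
    printf("%s woke tid %d (%s)\n", execname, args[0]->td_tid,
        stringof(args[0]->td_name));
}
```

A script written this way will not work on other implementations.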

The patch implementing the sched provider can be found here:

http://people.freebsd.org/~rstone/patches/sched_sdt.diff

This patch is against r234420.  It should apply cleanly to stable/9 as
well as head, but it will not compile if applied against stable/8
because of a change in the arguments accepted by the SDT_PROBE_DEFINE*
macros.

My D script that collects schedgraph data can be found here:

http://people.freebsd.org/~rstone/dtrace/schedgraph.d

I recommend collecting data with the ring bufpolicy.  This causes
DTrace to collect data in per-cpu ring buffers, which should guarantee
that no data points are dropped.  The data is written to stdout
when dtrace(1) exits.  In my example I exit after running for 5
seconds, but you could just as easily modify the script to run until a
certain probe fires and then exit.
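
The general shape of such a script can be sketched as follows.  This
is not schedgraph.d itself; the on-cpu clause and its output format are
placeholders chosen for illustration.  Only the bufpolicy option and
the 5-second exit correspond to what is described above.

```d
#pragma D option quiet
#pragma D option bufpolicy=ring

/* Placeholder clause: record one line per on-cpu event. */
sched:::on-cpu
{
    printf("%d %s tid %d on-cpu\n", timestamp, execname, tid);
}

/* Stop tracing after 5 seconds; the buffers are printed on exit. */
tick-5sec
{
    exit(0);
}
```

The same policy can also be selected on the command line with
dtrace -x bufpolicy=ring instead of the pragma.  Note that a ring
buffer overwrites its oldest records once it fills, so a long run
keeps only the most recent data.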

The output of schedgraph.d isn't quite ready for processing by
schedgraph.  Here is a very short sh script that post-processes the
data to make it parseable by schedgraph:

http://people.freebsd.org/~rstone/dtrace/make_ktr

Finally, schedgraph.d uses the cpu variable, which is currently not
available in FreeBSD.  Here is my patch (which I will commit to HEAD
soon) that implements that variable.  You will have to rebuild
dtrace.ko and libdtrace.so.

http://people.freebsd.org/~rstone/patches/dtrace_cpu.diff
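
For reference, a minimal sketch of how the cpu variable might be used
once it is available.  This assumes the patch above is applied and
that cpu behaves as it does on Solaris, identifying the CPU on which
the probe fired.  The aggregation name percpu is arbitrary.

```d
/* Count on-cpu events per CPU; requires the cpu variable. */
sched:::on-cpu
{
    @percpu[cpu] = count();
}
```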
