Date:      Sat, 21 Nov 2015 18:45:42 -0800
From:      Mark Johnston <markj@FreeBSD.org>
To:        freebsd-arch@FreeBSD.org
Subject:   zero-cost SDT probes
Message-ID:  <20151122024542.GA44664@wkstn-mjohnston.west.isilon.com>

Hi,

For the past while I've been experimenting with various ways to
implement "zero-cost" SDT DTrace probes. At the moment, an SDT probe
site expands to this:

if (func_ptr != NULL)
	func_ptr(<probe args>);

When the probe is enabled, func_ptr is set to dtrace_probe(); otherwise
it's NULL. With zero-cost probes, the SDT_PROBE macros instead expand to:

func(<probe args>);

By the time the kernel is running, each probe site (the call above) has
been overwritten with NOPs. When a probe is enabled, one of the NOPs is
overwritten with a breakpoint, and the breakpoint handler uses the PC to
figure out which probe fired. This approach incurs less overhead when
the probe is not enabled. It is more complicated to implement, though,
which is why it hasn't already been done.
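
To illustrate the "handler uses the PC" step, here is a minimal userland
sketch of one way such a lookup could work: a binary search over a table
of probe site addresses. It is not taken from the patch. struct
probe_site, probe_lookup() and the table layout are hypothetical names
made up for the example, and the table is assumed to be sorted by
address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record: one entry per patched probe site. */
struct probe_site {
	uintptr_t addr;		/* address of the breakpoint instruction */
	int probe_id;		/* identifier of the probe at that site */
};

/*
 * Look up the probe that owns the given PC.  The table is assumed to be
 * sorted by ascending addr.  Returns the probe id, or -1 if the PC does
 * not correspond to a known probe site.
 */
static int
probe_lookup(const struct probe_site *sites, size_t n, uintptr_t pc)
{
	size_t lo = 0, hi = n;

	assert(sites != NULL || n == 0);
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (sites[mid].addr == pc)
			return (sites[mid].probe_id);
		if (sites[mid].addr < pc)
			lo = mid + 1;
		else
			hi = mid;
	}
	return (-1);
}
```

On x86 the PC saved by an int3 trap points one byte past the breakpoint,
so a real handler would presumably adjust it before doing a lookup like
this one.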

I have a working implementation of this for amd64 and i386[1]. Before
adding support for the other arches, I'd like to get some idea as to
whether the approach described below is sound and acceptable.

The main difficulty is in figuring out where the probe sites actually
are once the kernel is running. In my patch, a probe site is a call to
an external function whose definition lives in an automatically
generated C file. At link time, we first perform a partial link of all
the kernel's object files. Then, a script uses the relocations against
the still-undefined probe functions to generate:
1) stub functions for the probes, so that the kernel can actually be
   linked, and
2) a linker set containing the offsets of each probe site relative to
   the beginning of the text section.
The result is linked with the partially-linked kernel to generate the
final kernel file.

During boot, we iterate over the linker set, using the offsets plus the
address of btext to overwrite probe sites with NOPs. SDT probes in kernel
modules are handled differently (and more simply): the kernel linker just
has special handling for relocations against symbols named __dtrace_sdt_*;
this is how illumos/Solaris implements all of this.
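
The boot-time pass and the module-side symbol check described above can
be sketched roughly as follows. This is a userland illustration under
stated assumptions, not the code from the patch: patch_probe_sites() and
is_sdt_probe_symbol() are hypothetical names, each probe site is assumed
to be a 5-byte x86 "call rel32" (opcode 0xE8), and each offset is
assumed to point at the first byte of that call. A relocation on x86
usually refers to the call's operand rather than its opcode, so a real
implementation would need to account for that.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define	SDT_CALL_LEN	5	/* assumed: x86 "call rel32" */
#define	SDT_CALL_OPCODE	0xE8
#define	SDT_NOP		0x90	/* single-byte x86 NOP */

/*
 * Overwrite each probe site with NOPs.  "offsets" stands in for the
 * linker set: each entry is relative to text_base (btext in the kernel).
 * Entries that fall outside the text or do not look like a call are
 * skipped.  Returns the number of sites patched.
 */
static size_t
patch_probe_sites(uint8_t *text_base, size_t text_len,
    const size_t *offsets, size_t n)
{
	size_t patched = 0;

	assert(text_base != NULL);
	for (size_t i = 0; i < n; i++) {
		size_t off = offsets[i];

		if (off > text_len || text_len - off < SDT_CALL_LEN)
			continue;
		if (text_base[off] != SDT_CALL_OPCODE)
			continue;
		memset(text_base + off, SDT_NOP, SDT_CALL_LEN);
		patched++;
	}
	return (patched);
}

/* Module case: does a relocation target an SDT probe symbol? */
static int
is_sdt_probe_symbol(const char *name)
{
	static const char prefix[] = "__dtrace_sdt_";

	return (strncmp(name, prefix, sizeof(prefix) - 1) == 0);
}
```

Checking for the call opcode before patching is a cheap guard against
stale offsets, which is exactly the failure mode to worry about if the
final link ever moves text around.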

My uncertainty revolves around the use of relocations in the
partially-linked kernel to determine the address of probe sites in the
running kernel. With the GNU ld in base, this happens to work because
the final link doesn't modify the text section. Is this something I can
rely upon? Will this assumption be false with the advent of lld and LTO?
Are there other, cleaner ways to implement what I described above?

Thanks,
-Mark

[1] https://people.freebsd.org/~markj/patches/sdt-zerocost/
