Date:      Thu, 10 Jan 2013 23:03:13 +0200
From:      Konstantin Belousov <kostikbel@gmail.com>
To:        Alfred Perlstein <bright@mu.org>
Cc:        Daniel Eischen <deischen@freebsd.org>, hackers@freebsd.org, Jason Evans <jasone@freebsd.org>
Subject:   Re: malloc+utrace, tracking memory leaks in a running program.
Message-ID:  <20130110210313.GY2561@kib.kiev.ua>
In-Reply-To: <50EF0892.104@mu.org>
References:  <50D52B10.1060205@mu.org> <A0AD197D-B72D-4FF5-B9AF-5E4F2AAAA421@freebsd.org> <50EE6281.7030602@mu.org> <50EE6630.2010902@mu.org> <20130110073854.GQ2561@kib.kiev.ua> <50EEDB5E.2010906@mu.org> <20130110180514.GS2561@kib.kiev.ua> <50EF0892.104@mu.org>

On Thu, Jan 10, 2013 at 01:29:38PM -0500, Alfred Perlstein wrote:
> On 1/10/13 1:05 PM, Konstantin Belousov wrote:
> > On Thu, Jan 10, 2013 at 10:16:46AM -0500, Alfred Perlstein wrote:
> >> On 1/10/13 2:38 AM, Konstantin Belousov wrote:
> >>> On Thu, Jan 10, 2013 at 01:56:48AM -0500, Alfred Perlstein wrote:
> >>>> Here are more convenient links that give diffs against FreeBSD and
> >>>> jemalloc for the proposed changes:
> >>>>
> >>>> FreeBSD:
> >>>> https://github.com/alfredperlstein/freebsd/compare/13e7228d5b83c8fcfc63a0803a374212018f6b68~1...utrace2
> >>>>
> >>> Why do you need to expedite the records through ktrace at all?
> >>> Wouldn't direct write(2)s to a file allow for better performance,
> >>> due to not stressing the kernel memory allocator and the single
> >>> writing thread?
> >>> Also, the malloc coupling to the single-system interface would be
> >>> prevented.
> >>>
> >>> I believe that other usermode tracers also behave in a similar way,
> >>> using writes and not a private kernel interface.
> >>>
> >>> Also, what /proc issues did you mention? There is
> >>> sysctl kern.proc.vmmap, which is much more convenient than
> >>> /proc/pid/map and does not require /proc to be mounted.
> >>>
> >>>> jemalloc:
> >>>> https://github.com/alfredperlstein/jemalloc/compare/master...utrace2
> >>>>
> >> Konstantin, you are right, it is a strange thing this utrace.  I am not
> >> sure why it was done this way.
> >>
> >> You are correct in that a much more efficient system could be made
> >> using writes gathered into a single write(2).
> > Even without write gathering, non-coalesced writes should be faster
> > than utrace.
> >
> >> Do you think there is any reason they may have re-used the kernel paths
> >> for ktrace even at the cost of efficiency?
> > I can only speculate. The utracing of the malloc calls in the context
> > of the ktrace stream is useful for the human reading the trace. Instead
> > of seeing a sequence of unexplainable calls allocating and freeing
> > memory, you would see something more penetrable. For example, you would
> > see accept/malloc/read/write/free, which could be usefully interpreted
> > as a network server serving a client.
> >
> > This context is not needed for a leak detector.
> Now I may be wrong here, but I think it's an artifact of someone
> noticing how useful it would be to fit this into the ktrace system and
> leverage existing code.
>
> Even though there are significant performance deficiencies, the actual
> utility of the existing framework may have been such a stepping stone
> towards tracing that it was just used.
>
> Right now the code already exists, however it logs just {operation,
> size, ptr}, example:
> malloc, 512, -> 0xdeadbeef
> free, 0, 0xdeadbeef
> realloc, 512, 0 -> 0xdeadc0de
> realloc, 1024, 0xdeadc0de -> 0xffff0000
> free, 0, 0xffff0000
>
> What do you think of just adding the address of the caller of
> malloc/free/realloc to these already existing tracepoints?

In most real-world applications I have seen, malloc() is not the function
called directly to do the allocation. Usually there is either an
app-specific wrapper or a language runtime that calls malloc(), e.g. the
new operator in C++ code. Then the caller address becomes constant for the
whole duration of the program run.
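
To illustrate the wrapper problem, here is a minimal sketch. It assumes a
GCC- or Clang-compatible compiler for __builtin_return_address(); the names
traced_malloc, app_alloc and last_caller are hypothetical and stand in for
a malloc() that logs its immediate caller and for an application wrapper.
It is not the code from the proposed patches.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical: the caller address most recently seen by the tracer. */
static void *last_caller;

/* Stand-in for a malloc() that records its immediate caller. */
__attribute__((noinline)) void *
traced_malloc(size_t size)
{
	last_caller = __builtin_return_address(0);
	return (malloc(size));
}

/* Hypothetical app-specific wrapper, as many real programs have. */
__attribute__((noinline)) void *
app_alloc(size_t size)
{
	void *p;

	p = traced_malloc(size);
	/* Work after the call keeps it from becoming a tail call. */
	if (p != NULL)
		memset(p, 0, size);
	return (p);
}
```

However many distinct places in the program call app_alloc(), the address
recorded by traced_malloc() is always the same return address inside
app_alloc(), so it says nothing about which part of the program leaked.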

What would be useful is the full backtrace of each allocation. Tools
like libunwind are indeed optimized for this usage pattern.
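
A rough sketch of recording a backtrace per allocation follows. To keep it
short it uses backtrace(3) from <execinfo.h> as a stand-in for libunwind
(on FreeBSD this may need linking with -lexecinfo). The alloc_record
structure, MAX_FRAMES and the function names are assumptions made for
illustration only.

```c
#include <execinfo.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

#define	MAX_FRAMES	32	/* hypothetical limit on recorded depth */

/* Hypothetical per-allocation record: pointer, size and call stack. */
struct alloc_record {
	void	*ptr;
	size_t	 size;
	int	 depth;
	void	*frames[MAX_FRAMES];
};

/* Allocate memory and capture the return addresses of the call stack. */
static void *
traced_malloc(size_t size, struct alloc_record *rec)
{
	rec->ptr = malloc(size);
	rec->size = size;
	rec->depth = backtrace(rec->frames, MAX_FRAMES);
	return (rec->ptr);
}

/* Print the record in the style of the existing {operation, size, ptr}. */
static void
print_record(const struct alloc_record *rec)
{
	int i;

	printf("malloc, %zu, -> %p\n", rec->size, rec->ptr);
	for (i = 0; i < rec->depth; i++)
		printf("    #%d %p\n", i, rec->frames[i]);
}
```

With the whole stack recorded, two allocations that go through the same
wrapper can still be told apart by the frames above the wrapper. A tracer
living inside malloc itself would also have to make sure the unwinder does
not allocate, which this sketch does not address.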

From this POV, the libc malloc(3) might do better to offer a set of
well-defined hooks for a pluggable tracer to use. I am on the fence
here: you could override malloc/free without hooks, using the ELF symbol
interposing technique, but hooks would also offer features not easily
implementable with interposing.
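
To make the hooks idea concrete, here is one possible shape for such an
interface. Everything in it (struct alloc_hooks, set_alloc_hooks,
hooked_malloc, hooked_free, and the counting tracer) is hypothetical; no
such interface exists in libc or jemalloc as far as this message is
concerned, and the sketch ignores thread safety.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical hook table that a pluggable tracer would fill in. */
struct alloc_hooks {
	void	(*on_alloc)(void *ptr, size_t size);
	void	(*on_free)(void *ptr);
};

/* Zero-initialized: no tracer installed by default. */
static struct alloc_hooks current_hooks;

/* Install a tracer, or remove it by passing NULL. */
void
set_alloc_hooks(const struct alloc_hooks *hooks)
{
	if (hooks != NULL) {
		current_hooks = *hooks;
	} else {
		current_hooks.on_alloc = NULL;
		current_hooks.on_free = NULL;
	}
}

/* Stand-in for malloc() with a well-defined hook point. */
void *
hooked_malloc(size_t size)
{
	void *p;

	p = malloc(size);
	if (current_hooks.on_alloc != NULL)
		current_hooks.on_alloc(p, size);
	return (p);
}

/* Stand-in for free() with a well-defined hook point. */
void
hooked_free(void *ptr)
{
	if (current_hooks.on_free != NULL)
		current_hooks.on_free(ptr);
	free(ptr);
}

/* Example tracer: counts allocations that have not been freed yet. */
static long live_allocations;

static void
count_alloc(void *ptr, size_t size)
{
	(void)size;
	if (ptr != NULL)
		live_allocations++;
}

static void
count_free(void *ptr)
{
	if (ptr != NULL)
		live_allocations--;
}
```

A tracer would install itself with something like
set_alloc_hooks(&(struct alloc_hooks){ count_alloc, count_free }) and could
be removed again at run time, which is awkward to do with interposing alone.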
