Skip site navigation (1)Skip section navigation (2)
Date:      Sat, 25 Jun 2011 06:41:36 +1000 (EST)
From:      Bruce Evans <brde@optusnet.com.au>
To:        Alexander Best <arundel@FreeBSD.org>
Cc:        svn-src-head@FreeBSD.org, Kevin Lo <kevlo@FreeBSD.org>, Roman Divacky <rdivacky@FreeBSD.org>, src-committers@FreeBSD.org, svn-src-all@FreeBSD.org
Subject:   Re: svn commit: r223493 - in head/usr.bin: ktrace ncplogin systat tftp vmstat
Message-ID:  <20110625055010.G943@besplex.bde.org>
In-Reply-To: <20110624080548.GA60075@freebsd.org>
References:  <201106240718.p5O7IiZ2034966@svn.freebsd.org> <20110624072113.GA52982@freebsd.org> <20110624080548.GA60075@freebsd.org>

next in thread | previous in thread | raw e-mail | index | archive | help
On Fri, 24 Jun 2011, Alexander Best wrote:

> On Fri Jun 24 11, Roman Divacky wrote:
>> On Fri, Jun 24, 2011 at 07:18:44AM +0000, Kevin Lo wrote:
>>> Author: kevlo
>>> Date: Fri Jun 24 07:18:44 2011
>>> New Revision: 223493
>>> URL: http://svn.freebsd.org/changeset/base/223493
>>>
>>> Log:
>>>   Remove duplicated header files
>>
>> You may be interested in this tool:
>>
>>         http://code.google.com/p/include-what-you-use/
>
> did they add C support since the last time i tested it? back then it only
> worked with CPP code, iirc. it was possible to thorw it at C code, but it
> returned pretty useless results.

Why not use a FreeBSD utility?  There is /usr/src/tools/tools/kerninclude/
kerninclude.sh.  This is only for the kernel.  No one uses it.  I
believe it is based on my unusedinc utility which works almost anywhere
that make(1) works and has code clean enough to compile with WARNS ~= 6
and doesn't have a convoluted source tree.

My utility just copies the source file to the current directory, removes
1 #include line at a time from it, and compiles it before and after
with many warning flags; any differences are attributed to the #include
line being used.  Unfortunately, this gives many false positives and
negatives due to namespace pollution and other poor structuring in
standard headers.  You can remove a primary header like <stdio.h> which
is obviously used, and find that everything still includes because
some other header is polluted.  The pollution may be an documented
feature (like <sys/param.h> including <sys/types.h>), in which case
it may be correct to depend on it and a style bug to include both
headers.  The pollution may be an undocumented feature, in which case
it may be correct to depend on it after fixing the documentation.
But usually, the pollution is an undocumented bug.  In all cases,
before removing a single header, you must:
- determine if pollution is involved.  I normally try removing the
   #include line manually, running "make depend", and checking if
   the file is still included
- if pollution is involved, read all the header files and man pages
   to see if it is intended.

kerninclude.sh does a more sophisticated version of this.  It does
something like copying the entire include hierarchy and replacing each
header referred to in an #include line by an empty header, to ensure
that the header isn't used by pollution.  But would make it too slow
to use in a small source tree or one file at a time even if it supported
these.  unusedinc scribbling in the source directory is sloppy (it
does this because using $(mktemp) tends to break paths.  This and other
problems with paths might be fixable by setting up a full obj tree.
I'm not sure if kerninclude.sh does this, but if it does this would be
another source of slowness.  kerninclude.sh is inconvenient to use on
the kernel too.  It takes too long, and finds more bugs than you want
to know about.

AFAIK, google doesn't yet have the AI needed to do this correctly or
efficiently.  It would have to understand FreeBSD header pollution better
than FreeBSD does.  Often the correct fix is to fix the pollution in
standard headers and not change the application #include's.  Often, the
correct quick fix is to not change the application #includes and wait for
someone to fix the pollution.

Example of unusedinc output:

% trying /dumpster/home/bde/q/vmstat/vmstat.c
% line 75 in source /dumpster/home/bde/q/vmstat/vmstat.c seems to be unnecessary:
% #include <time.h>
% line 70 in source /dumpster/home/bde/q/vmstat/vmstat.c seems to be unnecessary:
% #include <paths.h>
% line 69 in source /dumpster/home/bde/q/vmstat/vmstat.c seems to be unnecessary:
% #include <nlist.h>
% line 66 in source /dumpster/home/bde/q/vmstat/vmstat.c seems to be unnecessary:
% #include <kvm.h>
% line 60 in source /dumpster/home/bde/q/vmstat/vmstat.c seems to be unnecessary:
% #include <vm/vm_param.h>
% line 57 in source /dumpster/home/bde/q/vmstat/vmstat.c seems to be unnecessary:
% #include <sys/vmmeter.h>
% line 56 in source /dumpster/home/bde/q/vmstat/vmstat.c seems to be unnecessary:
% #include <sys/time.h>
% line 54 in source /dumpster/home/bde/q/vmstat/vmstat.c seems to be unnecessary:
% #include <sys/resource.h>
% line 51 in source /dumpster/home/bde/q/vmstat/vmstat.c seems to be unnecessary:
% #include <sys/signal.h>
% line 50 in source /dumpster/home/bde/q/vmstat/vmstat.c seems to be unnecessary:
% #include <sys/malloc.h>
% line 48 in source /dumpster/home/bde/q/vmstat/vmstat.c seems to be unnecessary:
% #include <sys/uio.h>
% line 46 in source /dumpster/home/bde/q/vmstat/vmstat.c seems to be unnecessary:
% #include <sys/time.h>
% line 45 in source /dumpster/home/bde/q/vmstat/vmstat.c seems to be unnecessary:
% #include <sys/param.h>

I only investigated 3 of the lines of output:
- the 2 #includes of <sys/time.h>.  kevlo removed 1 of these.  The wrong one,
   since <sys/time.h> is a documented prerequisite for <sys/resource.h> in at
   least getrusage(2).  <sys/time.h> was included both before and after
   <sys/resource.h>.  The former was unsorted but satisfied the prerequisite.
   Now, only the latter is included.  This is sorted but doesn't satisfy the
   prereqisite.  However, the documentation is wrong.  <sys/resource.h> was
   depolluted long ago, so doesn't have any prerequisites or any internal
   pollution.  getrusage(2) also documents the old prerequisite
   <sys/types.h>.  Many other man pages do this too.  This was required in
   POSIX.1-1988, and it was added to some FreeBSD man pages relatively
   recently (sometimes via OtherBSD).  But POSIX.1-2001 removed this
   requirement, and FreeBSD fixed many other headers to not require it,
   without updating the man pages.  This requirement belongs at most in
   a portability/history section of the man pages.
- <nlist.h> is used in vmstat, but removing the #include of it has no effect
   there since it is still #included via pollution in <kvm.h> which
   uses it bogusly (just once for an incomplete struct declaration of
   struct nlist for a prototype; <nlist.h> gives the complete struct
   declaration, but this is not used in <kvm.h>).

Bruce



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20110625055010.G943>