Skip site navigation (1)Skip section navigation (2)
Date:      Sun, 26 Apr 2009 09:02:36 +0200
From:      Christoph Mallon <christoph.mallon@gmx.de>
To:        FreeBSD Hackers <freebsd-hackers@freebsd.org>
Cc:        Maxim Sobolev <sobomax@FreeBSD.org>, Roman Divacky <rdivacky@freebsd.org>, Ed Schouten <ed@FreeBSD.org>, Warner Losh <imp@FreeBSD.org>
Subject:   C99: Suggestions for style(9)
Message-ID:  <49F4070C.2000108@gmx.de>

next in thread | raw e-mail | index | archive | help
Hi hackers@,

as some of you may have noticed, several years ago a new millenium 
started and a decade ago there was a new C standard. HEAD recently 
switched to C99 as default (actually gnu99, but that's rather close). So 
it's finally time to re-examine wether style(9) needs to be accomodated.
The diff with the proposed changes is attached. Below are the changes 
with some further explanations. They are in the order as they appear in 
style(9).
Maybe using all of these changes is a bit too big change at once, but 
I'd like some of them applied to style(9). So, please consider the 
proposed changes individually and not a as a all-or-nothing package.

> -Do not put declarations
> -inside blocks unless the routine is unusually complicated.
> +Prefer declaring loop iterators in the for-statement to limit their scope.
>  .Bd -literal
> -	for (; cnt < 15; cnt++) {
> +	for (int cnt = 0; cnt < 15; cnt++) {
[...]
> -When declaring variables in functions declare them sorted by size,
> -then in alphabetical order; multiple ones per line are okay.
> +When declaring variables in functions declare them sorted in alphabetical order;
> +prefer one declaration per line, especially when pointer declarators or 
> +initializations are involved.
>  If a line overflows reuse the type keyword.
>  .Pp
> -Be careful to not obfuscate the code by initializing variables in
> -the declarations.
> -Use this feature only thoughtfully.
> -DO NOT use function calls in initializers.
> +Prefer initializing variables right at their declaration.
> +When the value is not supposed to change in the function, make the variable
> +const.
> +This makes it easier for a reader to identify the where the value of a variable
> +originates.
> +Do not reuse the same variable in a different context, declare a new variable.
>  .Bd -literal
> -	struct foo one, *two;
> -	double three;
> -	int *four, five;
> -	char *six, seven, eight, nine, ten, eleven, twelve;
> +	struct foo *bar;
> +	struct foo baz = { 42, "qux" };
>  
> -	four = myfunction();
> +	FILE *const f = fopen("name", "r");
> +	if (f == NULL)
> +		err("Failed to open file");
> +	/* We can safely assume that f is not changed anymore, even if the
> +	 * function is a hundred lines long */

This change is to reduce the scope of local variables. For reasons why 
this does not cost anything in terms of stack space, please see the last 
change, which adds explanations about local variables.
Smaller scopes and distinct variables for distinct contexts make it 
easier for readers of the code to identify the def-use-chains of 
variables, because there are less places with assignments to a variable 
and the scope is smaller. Also, when a variable is initialised right at 
its declaration and is not supposed to change, you can emphasize this by 
making it const.
I looked at older discussions regarding this topic and identified 
several concerns, which were raised. I'll address them below:

- It does not add anything to the language.
Well, yes, just like if, do, for, goto, the comma operator, compound 
assignment (+=, ...), ++/-- and many other things. All you need to be 
Turing complete is lambda calculus - there hardly can be less syntax. 
But, like all mentioned constructs, it makes the code more concise.

- It leads to sloppy code. You could reuse another variable or should 
think again whether you really need this variable.
Reusing variables in different contexts is error prone and bad for 
maintainability. You could reuse a variable, which is not as dead as you 
thought. More local variables cost nothing, especially no stack space, 
see the last proposed changed below. It is good to use more local 
variables, because it gives the reader a name, which carries 
information. Also it makes it easier for a reader (and the compiler!) to 
identify, which expressions are the same. You could repeat 
foo->long_name->even_longer_name[2 * i + 1] five times or just declare a 
local variable and cache the value to make it easier to read.
It also enables the compiler to produce better warnings: If you declare 
a new variable, it is not initialised and if you do not initialise it on 
all paths before a use, the compiler will warn you. But if you reuse an 
older variable, it is already initialised by its former uses and so you 
get no warning, but operate on garbage values (the old value from the 
previous use).
So it does not lead to slopyy code, but better code, which is easier to 
comprehend, the compiler can better help you to prevent bugs, and as a 
side effect the compiler can produce better code (aliasing is a major 
problem for common subexpression elimination).

- You can determine the stack usage with all the variable declarations 
at the top.
This is not true. There is no relation between the declared variables 
and the stack used. Especially, more stack than suggested by the 
declarations can be used due to various optimisations - less space can 
be used, too, of course.

- It is harder to find the declaration of a variable.
Every editor, which is more advanced than ed, has facilities to jump to 
the declaration of a variable (e.g. in vim it's "gd"). And most often 
this is not necessary, because the declaration is just a few lines above 
instead of all the way up at the top of the function, so no jumping 
around is required at all.

- You could accidently shadow a variable.
All modern compilers have warnings for this. FreeBSD uses -Wshadow for 
GCC (and all relevant compilers understand this very option).


> -Values in
> -.Ic return
> -statements should be enclosed in parentheses.
> -.Pp

return with parentheses:
Removed, because it does not improve maintainability in any way. There 
is no source for confusion here, so the rule even contradicts the rule, 
which states not to use redundant parentheses. Maybe, decades ago it was 
just a workaround for a broken compiler, which does not exist anymore.


> -Old-style function declarations look like this:
> -.Bd -literal
> -static char *
> -function(a1, a2, fl, a4)
> -	int a1, a2;	/* Declare ints, too, don't default them. */
> -	float fl;	/* Beware double vs. float prototype differences. */
> -	int a4;		/* List in order declared. */
> -{
> -.Ed
> -.Pp
> -Use ANSI function declarations unless you explicitly need K&R compatibility.
> +Use ANSI function declarations.

Old-style function declarations:
Removed, because there is no need to use them anymore. The C99 standard 
considers them obsolescent[1], too. All headers use prototype 
declarations, there's no reason to handle the function definitions 
different. Also they are a source of bugs - or did you know that the 
following is wrong:
     void wrong(char /* sic! */);
     void wrong(x) char x; {}
And this is correct[2]:
     void right(int /* sic! */);
     void right(x) char x; {}
(It's a bug in GCC that it does not complain about the former.)


> -
> -static void
> -usage()
> -{
> -	/* Insert an empty line if the function has no local variables. */

Empty line when there are no local variables:
Removed, because it does not improve readability and it actually hinders 
readability for very short functions, e.g. static inline functions, 
which just consist of one or two lines of code without any local 
variables except for parameters. Further, it makes even less sense with 
the changed recommendations for declarations.


> +.Sh LOCAL VARIABLES
> +Unaliased local variables are for free on modern compilers.
> +They do not unconditionally use stack space.
> +In fact there is no relation between their number and the used stack space.
> +A local variable is guaranteed to be unaliased, if its address has not been
> +taken (e.g. &i).
> +Do not use the same local variable in different context just because it happens
> +that the same type is needed.
> +It does not improve the generated code in any way, but it makes it harder for a
> +reader to determine all definitions and uses which belong together.
> +Further, a new variable can be given a meaningful name (even if it is very
> +short), which automatically serves as documentation and so improves
> +comprehensibility.
> +Especially avoid re-using variables, whose address has been taken:
> +.Bd -literal
> +	int i;
> +	foo(&i);
> +	printf("%d\\n", i);
> +	for (i = 0; i != 10; ++i) {
> +		/* BAD: i will be needlessly moved to and from memory in
> +		 * the loop, because its address was taken before. This
> +		 * results in larger and slower code.
> +		 * Declare a new variable for the loop instead. */
> +		printf("%d\\n", i);
> +	}
> +.Ed
> +

Last, but definitely not least, I added this paragraph about the use of 
local variables. This is to clarify, how today's compilers handle 
unaliased local variables. There is absolutely no need to reuse them in 
different contexts. Trying to optimise stack usage in this way is 
misguided and is a source of bugs, when a reused variable is not as dead 
as thought. For more reasons, please read the quoted diff.
Maxim, you requested this paragraph: Does this addition suit you?


Hopefully at least some of these suggestions are considered improvements 
and are applied to style(9).

Regards
	Christoph


[1] ISO/IEC 9899:1999 (E) §6.11.6 and §6.11.7
[2] ISO/IEC 9899:1999 (E) §6.7.5.3:15



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?49F4070C.2000108>