Date: Sun, 26 Apr 2009 09:02:36 +0200 From: Christoph Mallon <christoph.mallon@gmx.de> To: FreeBSD Hackers <freebsd-hackers@freebsd.org> Cc: Maxim Sobolev <sobomax@FreeBSD.org>, Roman Divacky <rdivacky@freebsd.org>, Ed Schouten <ed@FreeBSD.org>, Warner Losh <imp@FreeBSD.org> Subject: C99: Suggestions for style(9) Message-ID: <49F4070C.2000108@gmx.de>
next in thread | raw e-mail | index | archive | help
Hi hackers@, as some of you may have noticed, several years ago a new millenium started and a decade ago there was a new C standard. HEAD recently switched to C99 as default (actually gnu99, but that's rather close). So it's finally time to re-examine wether style(9) needs to be accomodated. The diff with the proposed changes is attached. Below are the changes with some further explanations. They are in the order as they appear in style(9). Maybe using all of these changes is a bit too big change at once, but I'd like some of them applied to style(9). So, please consider the proposed changes individually and not a as a all-or-nothing package. > -Do not put declarations > -inside blocks unless the routine is unusually complicated. > +Prefer declaring loop iterators in the for-statement to limit their scope. > .Bd -literal > - for (; cnt < 15; cnt++) { > + for (int cnt = 0; cnt < 15; cnt++) { [...] > -When declaring variables in functions declare them sorted by size, > -then in alphabetical order; multiple ones per line are okay. > +When declaring variables in functions declare them sorted in alphabetical order; > +prefer one declaration per line, especially when pointer declarators or > +initializations are involved. > If a line overflows reuse the type keyword. > .Pp > -Be careful to not obfuscate the code by initializing variables in > -the declarations. > -Use this feature only thoughtfully. > -DO NOT use function calls in initializers. > +Prefer initializing variables right at their declaration. > +When the value is not supposed to change in the function, make the variable > +const. > +This makes it easier for a reader to identify the where the value of a variable > +originates. > +Do not reuse the same variable in a different context, declare a new variable. > .Bd -literal > - struct foo one, *two; > - double three; > - int *four, five; > - char *six, seven, eight, nine, ten, eleven, twelve; > + struct foo *bar; > + struct foo baz = { 42, "qux" }; > > - four = myfunction(); > + FILE *const f = fopen("name", "r"); > + if (f == NULL) > + err("Failed to open file"); > + /* We can safely assume that f is not changed anymore, even if the > + * function is a hundred lines long */ This change is to reduce the scope of local variables. For reasons why this does not cost anything in terms of stack space, please see the last change, which adds explanations about local variables. Smaller scopes and distinct variables for distinct contexts make it easier for readers of the code to identify the def-use-chains of variables, because there are less places with assignments to a variable and the scope is smaller. Also, when a variable is initialised right at its declaration and is not supposed to change, you can emphasize this by making it const. I looked at older discussions regarding this topic and identified several concerns, which were raised. I'll address them below: - It does not add anything to the language. Well, yes, just like if, do, for, goto, the comma operator, compound assignment (+=, ...), ++/-- and many other things. All you need to be Turing complete is lambda calculus - there hardly can be less syntax. But, like all mentioned constructs, it makes the code more concise. - It leads to sloppy code. You could reuse another variable or should think again whether you really need this variable. Reusing variables in different contexts is error prone and bad for maintainability. You could reuse a variable, which is not as dead as you thought. More local variables cost nothing, especially no stack space, see the last proposed changed below. It is good to use more local variables, because it gives the reader a name, which carries information. Also it makes it easier for a reader (and the compiler!) to identify, which expressions are the same. You could repeat foo->long_name->even_longer_name[2 * i + 1] five times or just declare a local variable and cache the value to make it easier to read. It also enables the compiler to produce better warnings: If you declare a new variable, it is not initialised and if you do not initialise it on all paths before a use, the compiler will warn you. But if you reuse an older variable, it is already initialised by its former uses and so you get no warning, but operate on garbage values (the old value from the previous use). So it does not lead to slopyy code, but better code, which is easier to comprehend, the compiler can better help you to prevent bugs, and as a side effect the compiler can produce better code (aliasing is a major problem for common subexpression elimination). - You can determine the stack usage with all the variable declarations at the top. This is not true. There is no relation between the declared variables and the stack used. Especially, more stack than suggested by the declarations can be used due to various optimisations - less space can be used, too, of course. - It is harder to find the declaration of a variable. Every editor, which is more advanced than ed, has facilities to jump to the declaration of a variable (e.g. in vim it's "gd"). And most often this is not necessary, because the declaration is just a few lines above instead of all the way up at the top of the function, so no jumping around is required at all. - You could accidently shadow a variable. All modern compilers have warnings for this. FreeBSD uses -Wshadow for GCC (and all relevant compilers understand this very option). > -Values in > -.Ic return > -statements should be enclosed in parentheses. > -.Pp return with parentheses: Removed, because it does not improve maintainability in any way. There is no source for confusion here, so the rule even contradicts the rule, which states not to use redundant parentheses. Maybe, decades ago it was just a workaround for a broken compiler, which does not exist anymore. > -Old-style function declarations look like this: > -.Bd -literal > -static char * > -function(a1, a2, fl, a4) > - int a1, a2; /* Declare ints, too, don't default them. */ > - float fl; /* Beware double vs. float prototype differences. */ > - int a4; /* List in order declared. */ > -{ > -.Ed > -.Pp > -Use ANSI function declarations unless you explicitly need K&R compatibility. > +Use ANSI function declarations. Old-style function declarations: Removed, because there is no need to use them anymore. The C99 standard considers them obsolescent[1], too. All headers use prototype declarations, there's no reason to handle the function definitions different. Also they are a source of bugs - or did you know that the following is wrong: void wrong(char /* sic! */); void wrong(x) char x; {} And this is correct[2]: void right(int /* sic! */); void right(x) char x; {} (It's a bug in GCC that it does not complain about the former.) > - > -static void > -usage() > -{ > - /* Insert an empty line if the function has no local variables. */ Empty line when there are no local variables: Removed, because it does not improve readability and it actually hinders readability for very short functions, e.g. static inline functions, which just consist of one or two lines of code without any local variables except for parameters. Further, it makes even less sense with the changed recommendations for declarations. > +.Sh LOCAL VARIABLES > +Unaliased local variables are for free on modern compilers. > +They do not unconditionally use stack space. > +In fact there is no relation between their number and the used stack space. > +A local variable is guaranteed to be unaliased, if its address has not been > +taken (e.g. &i). > +Do not use the same local variable in different context just because it happens > +that the same type is needed. > +It does not improve the generated code in any way, but it makes it harder for a > +reader to determine all definitions and uses which belong together. > +Further, a new variable can be given a meaningful name (even if it is very > +short), which automatically serves as documentation and so improves > +comprehensibility. > +Especially avoid re-using variables, whose address has been taken: > +.Bd -literal > + int i; > + foo(&i); > + printf("%d\\n", i); > + for (i = 0; i != 10; ++i) { > + /* BAD: i will be needlessly moved to and from memory in > + * the loop, because its address was taken before. This > + * results in larger and slower code. > + * Declare a new variable for the loop instead. */ > + printf("%d\\n", i); > + } > +.Ed > + Last, but definitely not least, I added this paragraph about the use of local variables. This is to clarify, how today's compilers handle unaliased local variables. There is absolutely no need to reuse them in different contexts. Trying to optimise stack usage in this way is misguided and is a source of bugs, when a reused variable is not as dead as thought. For more reasons, please read the quoted diff. Maxim, you requested this paragraph: Does this addition suit you? Hopefully at least some of these suggestions are considered improvements and are applied to style(9). Regards Christoph [1] ISO/IEC 9899:1999 (E) §6.11.6 and §6.11.7 [2] ISO/IEC 9899:1999 (E) §6.7.5.3:15
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?49F4070C.2000108>