From owner-svn-src-all@freebsd.org Thu Jun 25 12:07:54 2015 Return-Path: Delivered-To: svn-src-all@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id E522B98D491; Thu, 25 Jun 2015 12:07:54 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from mail108.syd.optusnet.com.au (mail108.syd.optusnet.com.au [211.29.132.59]) by mx1.freebsd.org (Postfix) with ESMTP id 928211901; Thu, 25 Jun 2015 12:07:54 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from c211-30-166-197.carlnfd1.nsw.optusnet.com.au (c211-30-166-197.carlnfd1.nsw.optusnet.com.au [211.30.166.197]) by mail108.syd.optusnet.com.au (Postfix) with ESMTPS id 6CCDD1A50AD; Thu, 25 Jun 2015 22:07:44 +1000 (AEST) Date: Thu, 25 Jun 2015 22:07:44 +1000 (EST) From: Bruce Evans X-X-Sender: bde@besplex.bde.org To: Bryan Drewery cc: Bruce Evans , src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org Subject: Re: svn commit: r284163 - head/bin/cp In-Reply-To: <558B745C.8040305@FreeBSD.org> Message-ID: <20150625203730.F937@besplex.bde.org> References: <201506081924.t58JOJQw095752@svn.freebsd.org> <20150609152946.Y935@besplex.bde.org> <558B745C.8040305@FreeBSD.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Optus-CM-Score: 0 X-Optus-CM-Analysis: v=2.1 cv=XMDNMlVE c=1 sm=1 tr=0 a=KA6XNC2GZCFrdESI5ZmdjQ==:117 a=PO7r1zJSAAAA:8 a=kj9zAlcOel0A:10 a=JzwRw_2MAAAA:8 a=PLlN6-q0pNDeQg5w_scA:9 a=5Z_-POF5Cg6TRMZp:21 a=loHo1uOnaBBYlrnk:21 a=CjuIK1q_8ugA:10 X-BeenThere: svn-src-all@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: "SVN commit messages for the entire src tree \(except for " user" and " projects" \)" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 25 Jun 2015 12:07:55 -0000 On Wed, 24 Jun 2015, Bryan Drewery wrote: > On 6/9/2015 1:28 AM, Bruce Evans wrote: >> On Mon, 8 Jun 2015, Bryan Drewery wrote: >> >>> Log: >>> Cleanup some style(9) issues. >>> >>> - Whitespace. >>> - Comments. >>> - Wrap long lines. >> >> cp's style had a remarlable amount of bitrot. >> >> This change unimproves it in some places. > > I have been traveling and packing. I'm replying now but won't have time > to address the issues until next week. I was trying to avoid doing any > of this but touched code which was horrendously misstyled and chained > into reindenting the whole file and doing it wrong :). At this point I > don't want to tweak this much more. Thanks for replying. >>> @@ -245,10 +245,10 @@ main(int argc, char *argv[]) >>> type = FILE_TO_FILE; >>> >>> if (have_trailing_slash && type == FILE_TO_FILE) { >>> - if (r == -1) >>> + if (r == -1) { >> >> This adds excessive braces. >> >>> errx(1, "directory %s does not exist", >>> - to.p_path); >>> - else >>> + to.p_path); >>> + } else > > It is a multi-line statement due to the hard 80-width wrap. I feel it is > fine in this case. OK. The mail mangled all the tabs so this is hard to see now. >>> @@ -379,7 +379,8 @@ copy(char *argv[], enum op type, int fts >>> mode = curr->fts_statp->st_mode; >>> if ((mode & (S_ISUID | S_ISGID | S_ISTXT)) || >>> ((mode | S_IRWXU) & mask) != (mode & mask)) >>> - if (chmod(to.p_path, mode & mask) != 0){ >>> + if (chmod(to.p_path, mode & mask) != >>> + 0) { >>> warn("chmod: %s", to.p_path); >>> rval = 1; >>> } >> >> This changes from a minor misformatting to avoid a long line to even uglier >> formatting with a split line. > > I agree 100%. I did it because of our hard 80-width cut-off. What would > the proper style be? My inclination would be to wrap at the first comma > but then it is even more odd. I find our 80-width cut-off to be strange > when editors/tmux/window manager/etc can resize and wrap long lines already. I don't mind the hack of omitting the space before the final parentheses. This gives a line length of exactly 80, which is a little too long. The main problem with this is that automatic formatting programs will want to "fix" it. I like to wrap at the last possible comma, except for printf()s I usually wrap after the format string. Here the last comma is also the first comma. It is a little too early. Editors/tmux/window manager/etc cannot wrap long lines already. Even indent(1) is clueless about wrapping long lines. It mostly doesn't do it. Otherwise it would want to change more than it does, and make more messes by getting the wrapping slightly wronger than the original. How could a mere editor/window manager know: - C syntax, so as to pick wrapping points like commas - the project and programmer's preference for wrapping early or late in a context-dependent way - special cases like the above? Formatting away from the project or current source style to the programmer's preferred style is relatively easy, but is very difficult to format back to the original style so as to not change everything for a 1-line real change. > Actually I don't see a width restriction in style(9) at all but surely > we have this rule documented somewhere. My guess is that it is inherited > by KNF. cp.c (like most utilities in bin) used to be in KNF. I think CSRG release engineers (mainly Bostic?) enforced them a common style (perhaps the single release engineer's style and not quite KNF). From the above hack alone in cp.c, we can infer that the line length limit is 80. The default line length limit in indent(1) is 79, but indent(1) doesn't actually understand line lengths. Fuller references for the line length limit are: expand 4.4BD-Lite2/usr/src/admin/style/style | grep ... (79+ dots) This finds 1 line of length 80, and that line is broken (it has garbage trailing tabs), and a couple of lines of length 79, and none longer. This files gives KNF rules by example. Whitespace in it was broken by converting it to a man page. similar greps in kern and pure BSD utilities >> It is necessary to make such changes if you >> use indent(1) to generate and check the changes -- otherwise, indent keeps > > Do you have an indent configuration I can use? I use this .indent.pro. %%% -TFILE -Tfd_mask -Tfd_set -Tu_char -Tu_int -Tu_long -Tu_short -ta -bad -bap -nbbb -nbc -br -nbs -c41 -cd41 -cdb -ce -ci4 -cli0 -d0 -di8 -ndj -ei -nfc1 -nfcb -i8 -ip8 -l79 -lc77 -ldi0 -nlp -npcs -psl -sc -nsob -nv %%% The T directives in this are very incomplete. The -ta directive is supposed to handle all typedefs that spelled with a _t suffix. It replaces a much longer but even more incomplete list of T directives, and mostly works (it obviously makes messes if you have some non-typedefs spelled with a _t suffix). >>> Modified: head/bin/cp/utils.c >>> ============================================================================== >>> >>> --- head/bin/cp/utils.c Mon Jun 8 19:13:04 2015 (r284162) >>> +++ head/bin/cp/utils.c Mon Jun 8 19:24:18 2015 (r284163) >>> ... >>> -/* Small (default) buffer size in bytes. It's inefficient for this to be >>> - * smaller than MAXPHYS */ >>> +/* >>> + * Small (default) buffer size in bytes. It's inefficient for this to be >>> + * smaller than MAXPHYS. >>> + */ >> >> Still has unusual sentence break of 1 space. cp uses normal sentence >> breaks > > I did a minimal effort on comments and didn't clean up grammar or > breaks. I have not adopted 2 space breaks into my style(9) conformation yet. The above .indent.pro intentionally turns off most comment formatting, using directives that I added many years ago. Otherwise, almost every large comment gets rewrapped. I miss these directives the most in gnu indent. gnu indent is also missing a couple of other critical directives, probably including the relatively new -ta. Otherwise, it is a bit smarter than FreeBSD indent. It actually understands the line length limit, but without directives to control the details it tends to make messes enforcing it. >>> @@ -345,7 +352,7 @@ setfile(struct stat *fs, int fd) >>> fdval = fd != -1; >>> islink = !fdval && S_ISLNK(fs->st_mode); >>> fs->st_mode &= S_ISUID | S_ISGID | S_ISVTX | >>> - S_IRWXU | S_IRWXG | S_IRWXO; >>> + S_IRWXU | S_IRWXG | S_IRWXO; >> >> Here the formatting was reasonable, but it was in gnu style and was hard to >> maintain since it is not supported by indent(1). It is still hard to >> maintain, >> since it has fancy splitting earlier than necessary to put the S_IS* and >> S_IR* parts of the expressions on separate lines. indent(1) cannot >> reproduce >> this splitting. Also, with the normal indentation of the condinuation >> line, >> the fancy splitting is not so readable. > > I'm do not see how this was proper before or how it is worse now. The > indentation is tabs and then 4 spaces. I don't see exceptions to this in > style(9) or in other code. I said is was improper but more readable before. It is improper because of the strict indentation rules. These often prevent lining things up. Gnu has different strict indentation rules which often give lining up. In the above, they give something like: fs->st_mode &= S_ISUID | S_ISGID | S_ISVTX | S_IRWXU | S_IRWXG | S_IRWXO; Here the first difference from the rules is that the `|' operator goes on a new line. The next difference is that the continuation indent is to line up this operator under '&='. Perhaps it is supposed to be left justified, but I right justified it to line up the terms. More usually, the assignment operator is just '=', and there is no choice for the justifcation of any 1-letter operator under it. All these examples split the line earlier than necessary, so as to get 3 terms from the RHS on each line. Except in the strict KNF version, this also gives lining up of 3 S_IS's with 3 S_IR's. I was thinking that there was a bit more lining up than that, but there isn't except possibly when the macros are expanded to octal. More lining up would occur for similar expressions with S_I{R,W,X}{USR,GRP,OTH} -- there are 9 terms, and you might want to arrange them in a 3x3 matrix. >>> @@ -543,8 +550,10 @@ usage(void) >>> { >>> >>> (void)fprintf(stderr, "%s\n%s\n", >>> -"usage: cp [-R [-H | -L | -P]] [-f | -i | -n] [-alpsvx] source_file >>> target_file", >>> -" cp [-R [-H | -L | -P]] [-f | -i | -n] [-alpsvx] source_file >>> ... " >>> -"target_directory"); >>> + "usage: cp [-R [-H | -L | -P]] [-f | -i | -n] [-alpsvx] " >>> + "source_file target_file", >>> + " cp [-R [-H | -L | -P]] [-f | -i | -n] [-alpsvx] " >>> + "source_file ... " >>> + "target_directory"); >>> exit(EX_USAGE); >>> } >> >> This breaks the careful outdentation and obfuscates the strings. The > > Again, this broke the 80-width limit. I preferred the old way but was > going on down the 80-width line on my screen fixing violations. > > I suggest we update our styles to not require this awful wrapping. It > makes `grep -r` very difficult when strings are split up. Perhaps I am > mistaken on the rule but we have a lot of code that needlessly wraps early. I like outdenting long strings in printfs to column 0. After all, they will start in column 0 in the output. Unfortunately, just the quotes around them make them start in column 1 and sometimes be longer than the source line length limit even if they are shorter than the output line length limit. This only works well for literal strong. Complicated formatting directives tend to be longer in the source than the output, and simple formatting directives tend to be longer in the output than the source (%jd" may expand to 20 decimal digits even with only 64-bit intmax_t). But strings in usage messages are mostly literal. Note that the string splitting also bogotifies the very standard style in usage messages, of starting with (void)fprintf(stderr, "%s\n%s\n", where there is 1 "%s\n" per line. With literal strings and even without C90 string concatenation, there is no need for a separate format string. This style mainly reduces the source line lengths by 2 characters for \n, and makes the line structure clearer. Then any use of string concatenation makes it unclearer again. Bruce