From owner-freebsd-hackers@FreeBSD.ORG Sat Dec 6 15:40:28 2008
From: Giorgos Keramidas
To: Konstantin Belousov
To: "Sheldon Givens"
Cc: freebsd-hackers@FreeBSD.org
Subject: Re: Small change to wc
Date: Sat, 06 Dec 2008 17:40:14 +0200
Message-ID: <878wqtia75.fsf@kobe.laptop>
In-Reply-To: <87prk5ms72.fsf@kobe.laptop> (Giorgos Keramidas's message of "Sat, 06 Dec 2008 13:57:53 +0200")
References: <871vwmtawz.fsf@kobe.laptop> <87prk5ms72.fsf@kobe.laptop>

On Sat, 06 Dec 2008 13:57:53 +0200, Giorgos Keramidas wrote:
>>> Can you post a `diff -u' or `diff -c' version of the patch?  I like the
>>> idea of the new option but it would be easier to read in -u/-c format.
>>
>> New diff -u:
>
> Excellent, thanks!  Other than a few minor style-bugs, which can be
> fixed before committing it (see inline comments for details), the patch
> looks great to me :-)

Ok, I've fixed a few minor bugs I noticed:

  * When only -L is specified it should not enable the default 'cwl'
    set of options.

  * The tlongline total length should not be overwritten by each input
    file, unless we *did* find a longer line in the particular file.

  * The (llct - 1) trick is not needed when printing llct if we only
    count != '\n' characters near line 229.

The updated patch, and a manpage change to document the new option, is
attached below.  Konstantin, if you like this version of the patch,
I'll commit it to /head and schedule an MFC after a week or so.

The longest line length that -L reports seems to work like this here:

: keramida@kobe:/hg/bsd/src/usr.bin/wc$ ./wc /etc/rc.conf
:      114     222    2632 /etc/rc.conf
: keramida@kobe:/hg/bsd/src/usr.bin/wc$ ./wc -L /etc/rc.conf
:       82 /etc/rc.conf
: keramida@kobe:/hg/bsd/src/usr.bin/wc$ ./wc -lwc -L /etc/rc.conf
:      114     222    2632      82 /etc/rc.conf
: keramida@kobe:/hg/bsd/src/usr.bin/wc$ ./wc -lwc -L /etc/rc.????
:      114     222    2632      82 /etc/rc.conf
:     1598    5648   36725      85 /etc/rc.subr
:     1712    5870   39357      85 total
: keramida@kobe:/hg/bsd/src/usr.bin/wc$

Here's the current patch version...

%%%
diff -r fb56dd4c9c47 usr.bin/wc/wc.1
--- a/usr.bin/wc/wc.1	Sat Dec 06 17:04:51 2008 +0200
+++ b/usr.bin/wc/wc.1	Sat Dec 06 17:39:17 2008 +0200
@@ -43,7 +43,7 @@
 .Nd word, line, character, and byte count
 .Sh SYNOPSIS
 .Nm
-.Op Fl clmw
+.Op Fl Lclmw
 .Op Ar
 .Sh DESCRIPTION
 The
@@ -71,6 +71,15 @@
 .Pp
 The following options are available:
 .Bl -tag -width indent
+.It Fl L
+The number of characters in the longest input line
+is written to the standard output.
+When more than one
+.Ar file
+argument is specified, the longest input line of
+.Em all
+files is reported as the value of the final
+.Dq total .
 .It Fl c
 The number of bytes in each input file
 is written to the standard output.
@@ -129,6 +138,10 @@
 as well as the totals for both:
 .Pp
 .Dl "wc -mlw report1 report2"
+.Pp
+Find the longest line in a list of files:
+.Pp
+.Dl "wc -L file1 file2 file3 | fgrep total"
 .Sh COMPATIBILITY
 Historically, the
 .Nm
@@ -154,6 +167,16 @@
 .Xr iswspace 3
 function, as required by
 .St -p1003.2 .
+.Pp
+The
+.Fl L
+option is a non-standard
+.Fx
+extension, compatible with the
+.Fl L
+option of the GNU
+.Nm
+utility.
 .Sh SEE ALSO
 .Xr iswspace 3
 .Sh STANDARDS
diff -r fb56dd4c9c47 usr.bin/wc/wc.c
--- a/usr.bin/wc/wc.c	Sat Dec 06 17:04:51 2008 +0200
+++ b/usr.bin/wc/wc.c	Sat Dec 06 17:39:17 2008 +0200
@@ -62,8 +62,8 @@
 #include
 #include
 
-uintmax_t tlinect, twordct, tcharct;
-int doline, doword, dochar, domulti;
+uintmax_t tlinect, twordct, tcharct, tlongline;
+int doline, doword, dochar, domulti, dolongline;
 
 static int	cnt(const char *);
 static void	usage(void);
@@ -75,7 +75,7 @@
 
 	(void) setlocale(LC_CTYPE, "");
 
-	while ((ch = getopt(argc, argv, "clmw")) != -1)
+	while ((ch = getopt(argc, argv, "clmwL")) != -1)
 		switch((char)ch) {
 		case 'l':
 			doline = 1;
@@ -87,6 +87,9 @@
 			dochar = 1;
 			domulti = 0;
 			break;
+		case 'L':
+			dolongline = 1;
+			break;
 		case 'm':
 			domulti = 1;
 			dochar = 0;
@@ -99,7 +102,7 @@
 	argc -= optind;
 
 	/* Wc's flags are on by default. */
-	if (doline + doword + dochar + domulti == 0)
+	if (doline + doword + dochar + domulti + dolongline == 0)
 		doline = doword = dochar = 1;
 
 	errors = 0;
@@ -125,6 +128,8 @@
 			(void)printf(" %7ju", twordct);
 		if (dochar || domulti)
 			(void)printf(" %7ju", tcharct);
+		if (dolongline)
+			(void)printf(" %7ju", tlongline);
 		(void)printf(" total\n");
 	}
 	exit(errors == 0 ? 0 : 1);
@@ -134,7 +139,7 @@
 cnt(const char *file)
 {
 	struct stat sb;
-	uintmax_t linect, wordct, charct;
+	uintmax_t linect, wordct, charct, llct, tmpll;
 	int fd, len, warned;
 	size_t clen;
 	short gotsp;
@@ -143,7 +148,7 @@
 	wchar_t wch;
 	mbstate_t mbs;
 
-	linect = wordct = charct = 0;
+	linect = wordct = charct = llct = tmpll = 0;
 	if (file == NULL) {
 		file = "stdin";
 		fd = STDIN_FILENO;
@@ -168,8 +173,13 @@
 				}
 				charct += len;
 				for (p = buf; len--; ++p)
-					if (*p == '\n')
+					if (*p == '\n') {
+						if (tmpll > llct)
+							llct = tmpll;
+						tmpll = 0;
 						++linect;
+					} else
+						tmpll++;
 			}
 			tlinect += linect;
 			(void)printf(" %7ju", linect);
@@ -177,6 +187,11 @@
 				tcharct += charct;
 				(void)printf(" %7ju", charct);
 			}
+			if (dolongline) {
+				if (llct > tlongline)
+					tlongline = llct;
+				(void)printf(" %7ju", llct);
+			}
 			(void)close(fd);
 			return (0);
 		}
@@ -229,10 +244,16 @@
 			else if (clen == 0)
 				clen = 1;
 			charct++;
+			if (wch != L'\n')
+				tmpll++;
 			len -= clen;
 			p += clen;
-			if (wch == L'\n')
+			if (wch == L'\n') {
+				if (tmpll > llct)
+					llct = tmpll;
+				tmpll = 0;
 				++linect;
+			}
 			if (iswspace(wch))
 				gotsp = 1;
 			else if (gotsp) {
@@ -256,6 +277,11 @@
 		tcharct += charct;
 		(void)printf(" %7ju", charct);
 	}
+	if (dolongline) {
+		if (llct > tlongline)
+			tlongline = llct;
+		(void)printf(" %7ju", llct);
+	}
 	(void)close(fd);
 	return (0);
 }
@@ -263,6 +289,6 @@
 static void
 usage()
 {
-	(void)fprintf(stderr, "usage: wc [-clmw] [file ...]\n");
+	(void)fprintf(stderr, "usage: wc [-Lclmw] [file ...]\n");
 	exit(1);
 }
%%%
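
For anyone who wants to try the counting logic outside of wc.c, here is
a minimal standalone sketch in C of what the byte-counting loop in the
patch appears to do.  The names ll_state, ll_scan and ll_merge are
hypothetical and are not part of the patch; `cur' plays the role of
tmpll, `max' the role of llct, and ll_merge() mirrors the way the
per-file maximum is folded into tlongline.  Like the patch, this sketch
only updates the maximum when it sees a '\n', so a final line with no
trailing newline is not taken into account.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper state for illustration; not part of the patch. */
struct ll_state {
	uintmax_t cur;	/* length of the line being scanned (tmpll) */
	uintmax_t max;	/* longest completed line seen so far (llct) */
};

/*
 * Scan one buffer.  The state survives between calls, so a line that
 * spans two read(2) buffers is still measured as a single line.
 */
void
ll_scan(struct ll_state *st, const char *buf, size_t len)
{
	const char *p;

	for (p = buf; len--; ++p) {
		if (*p == '\n') {
			if (st->cur > st->max)
				st->max = st->cur;
			st->cur = 0;
		} else
			st->cur++;
	}
}

/*
 * Fold a per-file maximum into the running total.  The total changes
 * only when this file really had a longer line.
 */
uintmax_t
ll_merge(uintmax_t total, uintmax_t file_max)
{
	return (file_max > total ? file_max : total);
}
```

A caller would zero a struct ll_state for each file, pass every buffer
returned by read(2) to ll_scan(), print st.max for the file, and keep
the result of ll_merge() for the "total" line.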