Date: Fri, 08 Dec 2017 14:34:52 +0000
From: bugzilla-noreply@freebsd.org
To: freebsd-bugs@FreeBSD.org
Subject: [Bug 224160] [patch] wc -c is slow
Message-ID: <bug-224160-8-8XUdTcqYdN@https.bugs.freebsd.org/bugzilla/>
In-Reply-To: <bug-224160-8@https.bugs.freebsd.org/bugzilla/>
References: <bug-224160-8@https.bugs.freebsd.org/bugzilla/>
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=224160

Conrad Meyer <cem@freebsd.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |patch
             Status|New                         |In Progress
            Summary|wc -c is slow               |[patch] wc -c is slow
           Assignee|freebsd-bugs@FreeBSD.org    |cem@freebsd.org

--- Comment #2 from Conrad Meyer <cem@freebsd.org> ---
wc(1) uses a stack buffer of size MAXBSIZE, or 64kB.  Increasing this may help
(move it to the heap).

Secondly, there is an optimization for counting lines, and that same
optimization counts characters, but it is not used if wc is only asked to count
characters!  Silly.  It's also not used if wc is asked to count stdin!  Stupid.

Just fixing stdin + character count optimization gives much better results,
comparable to GNU wc:

2097152000 ~/obj/usr/home/conrad/src/freebsd/amd64.amd64/usr.bin/wc/wc -c 0.01s user 0.43s system 45% cpu 0.964 total

Bumping the buffer size to 4 MB yields big improvement in system time.  (Note
that the dd size was increased 10x.)

Before:
20971520000 ~/obj/usr/home/conrad/src/freebsd/amd64.amd64/usr.bin/wc/wc -c 0.14s user 3.99s system 42% cpu 9.653 total

After:
20971520000 ~/obj/usr/home/conrad/src/freebsd/amd64.amd64/usr.bin/wc/wc -c 0.12s user 1.90s system 40% cpu 4.954 total

GNU wc is actually worse:
20971520000 gwc -c 0.21s user 2.91s system 48% cpu 6.490 total

Here is the PoC patch (whitespace changes elided (-w) for legibility).  Note
that it leaks memory.  4 MB may be totally inappropriate for small devices,
too.

--- a/usr.bin/wc/wc.c
+++ b/usr.bin/wc/wc.c
@@ -199,15 +199,17 @@ cnt(const char *file)
 	size_t clen;
 	short gotsp;
 	u_char *p;
-	u_char buf[MAXBSIZE];
+	u_char *buf;
 	wchar_t wch;
 	mbstate_t mbs;
 
+#define	MY_BUF_SIZE	(4 * 1024 * 1024)
+	buf = malloc(MY_BUF_SIZE);
+
 	linect = wordct = charct = llct = tmpll = 0;
 	if (file == NULL)
 		fd = STDIN_FILENO;
-	else {
-		if ((fd = open(file, O_RDONLY, 0)) < 0) {
+	else if ((fd = open(file, O_RDONLY, 0)) < 0) {
 			xo_warn("%s: open", file);
 			return (1);
 		}
@@ -218,8 +220,8 @@ cnt(const char *file)
 	 * lines than to get words, since the word count requires some
 	 * logic.
 	 */
-	if (doline) {
-		while ((len = read(fd, buf, MAXBSIZE))) {
+	if (doline || dochar) {
+		while ((len = read(fd, buf, MY_BUF_SIZE))) {
 			if (len == -1) {
 				xo_warn("%s: read", file);
 				(void)close(fd);
@@ -230,6 +232,7 @@ cnt(const char *file)
 				    llct);
 			}
 			charct += len;
+			if (doline) {
 			for (p = buf; len--; ++p)
 				if (*p == '\n') {
 					if (tmpll > llct)
@@ -239,7 +242,9 @@ cnt(const char *file)
 				} else
 					tmpll++;
 		}
+		}
 		reset_siginfo();
+		if (doline)
 		tlinect += linect;
 		if (dochar)
 			tcharct += charct;
@@ -270,13 +275,12 @@ cnt(const char *file)
 			return (0);
 		}
 	}
-	}
 
 	/* Do it the hard way... */
 word:	gotsp = 1;
 	warned = 0;
 	memset(&mbs, 0, sizeof(mbs));
-	while ((len = read(fd, buf, MAXBSIZE)) != 0) {
+	while ((len = read(fd, buf, MY_BUF_SIZE)) != 0) {
 		if (len == -1) {
 			xo_warn("%s: read", file != NULL ? file : "stdin");
 			(void)close(fd);

-- 
You are receiving this mail because:
You are the assignee for the bug.
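
For readers who want the idea outside of wc.c: below is a minimal standalone
sketch of the fast path described in the comment above (one read loop that
always counts bytes and scans for newlines only when lines were requested),
using a heap buffer that is freed on every exit path, which is the leak the
PoC notes. This is an illustration only, not the patch and not whatever was
eventually committed. The names count_fd and SKETCH_BUF_SIZE and the
want_lines flag are hypothetical; the EINTR retry is an added assumption that
the PoC does not have.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical name; same 4 MB value the PoC uses for MY_BUF_SIZE. */
#define SKETCH_BUF_SIZE	(4 * 1024 * 1024)

/*
 * Count the bytes read from fd and, only if want_lines is nonzero, the
 * newlines among them.  Returns 0 on success, -1 if the buffer cannot be
 * allocated or read(2) fails.  The outputs are written only on success.
 */
static int
count_fd(int fd, int want_lines, size_t bufsize, uintmax_t *chars,
    uintmax_t *lines)
{
	unsigned char *buf, *p;
	uintmax_t nchars = 0, nlines = 0;
	ssize_t len;
	int rv = 0;

	assert(bufsize > 0);
	buf = malloc(bufsize);		/* heap, not stack: may be large */
	if (buf == NULL)
		return (-1);

	while ((len = read(fd, buf, bufsize)) != 0) {
		if (len == -1) {
			if (errno == EINTR)	/* assumption: retry if interrupted */
				continue;
			rv = -1;
			break;
		}
		nchars += (uintmax_t)len;
		if (want_lines) {
			/* Skip the per-byte scan entirely for a plain byte count. */
			for (p = buf; len-- > 0; ++p)
				if (*p == '\n')
					nlines++;
		}
	}

	free(buf);			/* released on success and failure alike */
	if (rv == 0) {
		*chars = nchars;
		*lines = nlines;
	}
	return (rv);
}
```

A caller would pass STDIN_FILENO or an open(2) descriptor together with
SKETCH_BUF_SIZE. Taking the buffer size as a parameter is one way to address
the concern that 4 MB may not suit small devices, since the caller can pick a
smaller value there.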