From: bugzilla-noreply@freebsd.org
To: freebsd-bugs@FreeBSD.org
Subject: [Bug 224160] wc -c is slow
Date: Thu, 07 Dec 2017 13:07:28 +0000

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=224160

            Bug ID: 224160
           Summary: wc -c is slow
           Product: Base System
           Version: CURRENT
          Hardware: Any
                OS: Any
            Status: New
          Severity: Affects Only Me
          Priority: ---
         Component: bin
          Assignee: freebsd-bugs@FreeBSD.org
          Reporter: wosch@FreeBSD.org

The wc(1) command has several optimizations to run as fast as possible.
However, it is still slow in some use cases, much slower than the GNU wc
command.

Using the OpenStreetMap database dump planet-latest.osm.bz2 (from
https://wiki.openstreetmap.org/wiki/Planet.osm), which is a 61GB bzip'd XML
file, I checked how large the uncompressed XML is, on a 32 CPU machine:

# FreeBSD wc
$ pbzip2 -dc planet-latest.osm.bz2 | time wc -c
 908171295050
     4729.53 real      4400.69 user       199.34 sys

The wc(1) command was running at 100% CPU time, and pbzip2 was using only
500% CPU time.

I ran the test again with GNU wc. The wc command was using only 20% CPU
time, and pbzip2 around 3000%.

# GNU wc
$ pbzip2 -dc planet-latest.osm.bz2 | time gwc -c
908171295050
     2003.15 real         8.86 user       355.53 sys

The FreeBSD wc(1) command uses about 500 times more user time (4400 <-> 9)
than GNU wc, and somewhat less system time (199 <-> 355). The bottleneck
was not pbzip2, it was wc.

We should check why the optimization for wc -c is not working when reading
from stdin.
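
For comparison, here is a minimal sketch of what a byte-count-only path for a pipe could look like, assuming the goal is simply to sum the return values of read(2) without ever looking at the data. It is not taken from the FreeBSD wc(1) or GNU wc sources; the names count_bytes and COUNT_BUFSIZE and the 64 KB buffer size are hypothetical choices for illustration.

```c
#include <errno.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical buffer size, chosen for illustration only. */
#define COUNT_BUFSIZE (64 * 1024)

/*
 * Count the bytes readable from fd by summing read(2) return values.
 * The buffer contents are never inspected, so almost all of the cost
 * is system time spent copying data out of the pipe.
 * Returns the byte count, or -1 on a read error (errno is left as
 * set by read).
 */
intmax_t
count_bytes(int fd)
{
	static char buf[COUNT_BUFSIZE];
	intmax_t total = 0;
	ssize_t n;

	for (;;) {
		n = read(fd, buf, sizeof(buf));
		if (n > 0) {
			total += n;	/* count only, no per-byte work */
			continue;
		}
		if (n == 0)
			return (total);	/* end of input */
		if (errno == EINTR)
			continue;	/* interrupted by a signal, retry */
		return (-1);
	}
}
```

Calling count_bytes(STDIN_FILENO) from main() and printing the result would give the equivalent of wc -c on piped input. A loop of this shape should show user time close to zero, so a user time of 4400 seconds suggests that the stdin case is going through a code path that examines every byte.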