Message-Id: <200405201338.i4KDcE1J086633@www.freebsd.org>
Date: Thu, 20 May 2004 06:38:14 -0700 (PDT)
From: Jonathan Wakely
To: freebsd-gnats-submit@FreeBSD.org
Subject: misc/66941: Unacceptable stringstream performance

>Number:         66941
>Category:       misc
>Synopsis:       Unacceptable stringstream performance
>Confidential:   no
>Severity:       non-critical
>Priority:       low
>Responsible:    freebsd-bugs
>State:          open
>Quarter:
>Keywords:
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Thu May 20 06:40:19 PDT 2004
>Closed-Date:
>Last-Modified:
>Originator:     Jonathan Wakely
>Release:        4.9
>Organization:
Mintel International
>Environment:
FreeBSD cartman.mintel.co.uk 4.9-STABLE-20040212-SESNAP FreeBSD 4.9-STABLE-20040212-SESNAP #1: Mon Feb 16 10:48:30 GMT 2004 jason@cartman.mintel.co.uk:/usr/obj/usr/src/sys/CARTMAN i386
>Description:
The /usr/include/g++/sstream header provided with FreeBSD's GCC 2.95.4
performs very badly. Every character written to the stream causes the
buffer to allocate room for one extra character, copy the existing
buffer contents, and append the new character. Once the buffer gets
large, the overhead of reallocating and copying for every character
becomes enormous.

The C++ standard requires that appends happen in amortised constant
time. This implies that the buffer should grow exponentially, so that
the reallocate-and-copy overhead is incurred less and less often as the
buffer grows.

The current performance makes stringstream impractical for any sizable
chunk of data, forcing you to use the unsafe strstream instead.
>How-To-Repeat:
Testcase:

    #include <iostream>
    #include <iomanip>
    #include <sstream>
    #include <strstream>
    #include <ctime>

    // Time how long it takes to append `count` characters, one at a
    // time, to a stream of type SStreamT. The result is in clock() ticks.
    template <typename SStreamT>
    clock_t test(unsigned count)
    {
        SStreamT s;
        const clock_t start = ::clock();
        for (unsigned i = 0; i < count; ++i)
        {
            s << ' ';
        }
        return ::clock() - start;
    }

    int main()
    {
        using namespace std;
        const unsigned count[] = {10000, 100000, 1000000};
        cout << setw(18) << "iterations"
             << setw(18) << count[0]
             << setw(18) << count[1]
             << setw(18) << count[2] << endl
             << setw(18) << "strstream"
             << setw(18) << test<strstream>(count[0])
             << setw(18) << test<strstream>(count[1])
             << setw(18) << test<strstream>(count[2]) << endl
             << setw(18) << "stringstream"
             << setw(18) << test<stringstream>(count[0])
             << setw(18) << test<stringstream>(count[1])
             << setw(18) << test<stringstream>(count[2]) << endl;
    }

Running this on an unloaded 4-way Xeon gives:

    iterations         10000      100000     1000000
    strstream              0           1           3
    stringstream           2         503      129648

That is, it takes roughly 1000 seconds to write 1000000 characters to a
buffer!
>Fix:
I've been patching the file on all our development servers for months
without problems (except when OS upgrades overwrite the file with the
broken version again). The patched version grows the buffer
exponentially, tracking the unused capacity separately and reallocating
only when that spare capacity is exhausted.

With my patch, the above testcase produces the following on the same
system:

    iterations         10000      100000     1000000
    strstream              0           0           4
    stringstream           0           2          15

I'll attach the patch to this PR.
>Release-Note:
>Audit-Trail:
>Unformatted: