From: Erik Trulsson
To: Ed Schouten
Date: Sun, 3 Feb 2008 16:15:50 +0100
Message-ID: <20080203151550.GA67020@owl.midgard.homeip.net>
In-Reply-To: <20080203131322.GK1179@hoeg.nl>
References: <8663x6mc2o.fsf@ds4.des.no> <20080203131322.GK1179@hoeg.nl>
Cc: Dag-Erling Smørgrav, hackers@freebsd.org
Subject: Re: sort(1) memory usage
List-Id: Technical Discussions relating to FreeBSD

On Sun, Feb 03, 2008 at 02:13:22PM +0100, Ed Schouten wrote:
> * Dag-Erling Smørgrav wrote:
> > I've been trying to figure out why some periodic scripts consume so much
> > memory. I've narrowed it down to sort(1).
> >
> > At first, I thought the scripts were using it inefficiently, feeding it
> > more data than was really needed. Then I discovered this:
> >
> > des@ds4 ~% (sleep 10 | sort) & (sleep 5 ; top -o res | grep sort)
> > [1] 66024
> > 66024 des 1 -8 5 54796K 52680K piperd 1 0:00 0.88% sort
> >
> > That's right - sort(1) consumes 50+ MB of memory doing *nothing*.
> >
> > (roughly half that on a 32-bit box)
> >
> > Something is rotten in the state of GNU...
>
> On my i386 box it spends 27M, but when I replace `sort' with `sed',
> without any arguments, it's only 1.4 MB. I tried this on RELENG_6. I can
> also reproduce this on Linux.
>

Yep, it seems that GNU sort allocates a quite large buffer by default
when the size of the input is unknown (such as when it reads input from
stdin.)
A quick check in the source code indicates that it tries to size this
buffer according to how much memory the system has (and according to any
limits set on how much memory the process is allowed to use.)

The size of this buffer can be controlled with the --buffer-size option
to sort(1).

-- 
Erik Trulsson
ertr1013@student.uu.se
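
For illustration, here is a minimal sketch of the --buffer-size option
mentioned above. The 1M cap and the three-line sample input are
arbitrary choices made for this example, not values taken from the
periodic scripts, and the sketch assumes a sort(1) that accepts
--buffer-size (GNU sort does).

```shell
# Cap sort's main-memory buffer at 1 MiB rather than letting it size the
# buffer from the amount of memory in the system. The 1M value is only
# an example; -S is the short form of --buffer-size in GNU sort.
sorted=$(printf 'pear\napple\nfig\n' | sort --buffer-size=1M)
printf '%s\n' "$sorted"
```

With a small cap like this, sort should fall back to temporary files
for inputs that do not fit in the buffer, so the right value is a
trade-off between resident memory and disk I/O.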