From owner-freebsd-chat  Wed Jun 27 12: 7: 6 2001
Delivered-To: freebsd-chat@freebsd.org
Received: from guru.mired.org (okc-27-141-144.mmcable.com [24.27.141.144])
	by hub.freebsd.org (Postfix) with SMTP id EED1537B406
	for <freebsd-chat@FreeBSD.ORG>; Wed, 27 Jun 2001 12:07:03 -0700 (PDT)
	(envelope-from mwm@mired.org)
Received: (qmail 114 invoked by uid 100); 27 Jun 2001 19:06:57 -0000
From: Mike Meyer <mwm@mired.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Message-ID: <15162.11985.348196.939734@guru.mired.org>
Date: Wed, 27 Jun 2001 14:06:57 -0500
To: Anton Berezin <tobez@tobez.org>
Cc: Dag-Erling Smorgrav <des@ofug.org>,
	Ryan Thompson <ryan@sasknow.com>, freebsd-chat@FreeBSD.ORG
Subject: Re: most complex code in BSD?
In-Reply-To: <20010627205835.A72506@heechee.tobez.org>
References: <Pine.BSF.4.21.0106260042370.82644-100000@ren.sasknow.com>
	<xzpzoavadi6.fsf@flood.ping.uio.no>
	<20010627205835.A72506@heechee.tobez.org>
X-Mailer: VM 6.90 under 21.1 (patch 14) "Cuyahoga Valley" XEmacs Lucid
X-face: "5Mnwy%?j>IIV\)A=):rjWL~NB2aH[}Yq8Z=u~vJ`"(,&SiLvbbz2W`;h9L,Yg`+vb1>RG%
 *h+%X^n0EZd>TM8_IB;a8F?(Fb"lw'IgCoyM.[Lg#r\
Sender: owner-freebsd-chat@FreeBSD.ORG
Precedence: bulk
List-ID: <freebsd-chat.FreeBSD.ORG>
List-Archive: <http://docs.freebsd.org/mail/> (Web Archive)
List-Help: <mailto:majordomo?subject=help> (List Instructions)
List-Subscribe: <mailto:majordomo?subject=subscribe%20freebsd-chat>
List-Unsubscribe: <mailto:majordomo?subject=unsubscribe%20freebsd-chat>
X-Loop: FreeBSD.org

Anton Berezin <tobez@tobez.org> types:
> On Tue, Jun 26, 2001 at 09:47:45AM +0200, Dag-Erling Smorgrav wrote:
> > Ryan Thompson <ryan@sasknow.com> writes:
> > > Classic problem with this is, even if you can understand it (or at
> > > least figure it out in less than a minute or so) who the hell could
> > > hazard a guess at the efficiency of that "algorithm"? (Before going
> > > to the trouble of testing it on a few million lines). I bet it isn't
> > > O(n) ;-)
> > I strongly suspect it's O(n*log(k)), where k is the number of distinct
> > lines in the input.  In most cases k will probably be a significant
> > fraction of n, so call it O(n*log(n)).
> 
> I do think that in fact it is closer to O(n), if one replaces s,.,,sg
> with $_="", which is the obvious optimization, and which you already
> mentioned.  The hash accesses are essentially O(1) for random enough
> keys and in case there are enough buckets - and there is always enough
> buckets in Perl since it rebalances its hashes when necessary.  So it
> should be close to O(n).

If I understand the problem correctly, the deviation from O(n) will be
the cost of rebalancing the hashes. Unless that's O(n) or less, Ryan's
correct.

	<mike
--
Mike Meyer <mwm@mired.org>			http://www.mired.org/home/mwm/
Independent WWW/Perforce/FreeBSD/Unix consultant, email for more information.

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-chat" in the body of the message