From: Mel
To: freebsd-questions@freebsd.org
Date: Thu, 4 Oct 2007 22:37:54 +0200
Subject: Re: Managing very large files
Message-Id: <200710042237.57712.fbsd.questions@rachie.is-a-geek.net>
In-Reply-To: <47054A1D.2000701@ibctech.ca>
References: <4704DFF3.9040200@ibctech.ca> <20071003200013.GD45244@demeter.hydra> <47054A1D.2000701@ibctech.ca>

On Thursday 04 October 2007 22:16:29 Steve Bertrand wrote:
> >> man 1 split
> >>
> >> (esp. -l)
> >
> > That's probably the best option for a one-shot deal like this. On the
> > other hand, Perl itself provides the ability to go through a file one
> > line at a time, so you could just read a line, operate, write a line (to
> > a new file) as needed, over and over, until you get through the whole
> > file.
> >
> > The real problem would be reading the whole file into a variable (or even
> > multiple variables) at once.
>
> This is what I am afraid of. Just out of curiosity, if I did try to read
> the entire file into a Perl variable all at once, would the box panic,
> or as the saying goes 'what could possibly go wrong'?

There's probably a reason why you want to process that file - splitting it can
be a problem if you need to keep track of some states and it splits on the
wrong line. So, I'd probably open it in perl (or whatever processor) directly
and use a database for storage if I really need to keep string contexts, so
that on each line iteration my perl memory is clean.

-- 
Mel
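
A rough sketch of the "open it directly and keep state in a database" idea,
written in Python as the "whatever processor" (in Perl the same loop would be
a while (<$fh>) over the filehandle). Everything named here is hypothetical:
the function count_first_fields, the table "state", and the choice to count
the first whitespace-separated field of each line are stand-ins for whatever
context actually has to be tracked. The iso-8859-1 encoding is also just an
assumption about the data. Only one line is held in memory at a time; the
running totals live in SQLite.

```python
import sqlite3


def count_first_fields(path, db_path=":memory:"):
    """Stream `path` line by line, keeping per-key counts in SQLite.

    Hypothetical example: the "state" being tracked is how often each
    first field occurs. Pass a real filename as db_path to keep the
    state on disk instead of in memory.
    """
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS state ("
        "key TEXT PRIMARY KEY, count INTEGER NOT NULL)"
    )
    with open(path, "r", encoding="iso-8859-1") as fh:
        # Iterating the file object yields one line at a time, so the
        # whole file is never loaded into a single variable.
        for line in fh:
            fields = line.split()
            if not fields:
                continue  # skip blank lines
            key = fields[0]
            # Make sure a row exists for this key, then bump its counter.
            conn.execute(
                "INSERT OR IGNORE INTO state (key, count) VALUES (?, 0)",
                (key,),
            )
            conn.execute(
                "UPDATE state SET count = count + 1 WHERE key = ?",
                (key,),
            )
    conn.commit()
    return conn
```

Afterwards the state can be read back with an ordinary query, for example
conn.execute("SELECT key, count FROM state ORDER BY count DESC"). Since the
file is never split, there is no chunk boundary to land on the wrong line.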