From: Tim Robbins
To: Scot Hetzel
cc: freebsd-current@freebsd.org
Date: Wed, 26 Jan 2005 09:10:47 +1100
Subject: Re: uniq truncates lines > 2048 bytes
Message-ID: <20050125221047.GA339@cat.robbins.dropbear.id.au>
In-Reply-To: <790a9fff05012509511b64e3ad@mail.gmail.com>

On Tue, Jan 25, 2005 at 11:51:51AM -0600, Scot Hetzel wrote:
> I noticed that if a file has lines > 2048 bytes, uniq will truncate
> the line to LINE_MAX (2048 bytes). An easy way to test this is to do
> the following:
>
> cd /usr/ports/accessibility/gnomemag
> make fetch-list > test.list
> make fetch-list >> test.list
> uniq test.list > test2.list
>
> test2.list should be half the size of test.list, but it is 2048 bytes.
>
> I have come up with a patch to uniq that fixes this problem.
>
> http://www.freebsd.org/cgi/query-pr.cgi?pr=76578

This looks good except for failure to check for realloc() returning NULL
and a few minor style problems. It may be possible to use fgetwln() to
read lines instead of getwc() + realloc() etc., but this function is new
and peculiar to FreeBSD.

I was planning on going through all text-processing utilities in the base
system some time and either fixing line length problems or documenting
them, similar to what I did with multibyte character support. I may make
a start at that today.


Tim
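
For reference, the realloc() check mentioned above could look roughly like
the sketch below. This is not the patch from the PR: read_long_line() is a
hypothetical helper name, and the initial capacity of 128 and the doubling
policy are assumptions picked for illustration. The point is to assign the
result of realloc() to a temporary, so the old buffer can still be freed if
the call fails.

```c
#include <stdio.h>
#include <stdlib.h>
#include <wchar.h>

/*
 * Read one line of arbitrary length from fp using getwc() + realloc().
 * Hypothetical helper for illustration only; not taken from the PR.
 * The trailing newline is not stored. Returns a malloc()ed,
 * NUL-terminated wide string that the caller must free(), or NULL at
 * end of file (or read error) with nothing read, or if allocation fails.
 */
wchar_t *
read_long_line(FILE *fp)
{
	size_t cap = 128, len = 0;	/* assumed starting capacity */
	wchar_t *buf, *nbuf;
	wint_t ch;

	if ((buf = malloc(cap * sizeof(*buf))) == NULL)
		return (NULL);
	while ((ch = getwc(fp)) != WEOF && ch != L'\n') {
		/* Keep room for this character plus the terminator. */
		if (len + 1 >= cap) {
			/* Use a temporary so buf is not lost on failure. */
			nbuf = realloc(buf, cap * 2 * sizeof(*buf));
			if (nbuf == NULL) {
				free(buf);
				return (NULL);
			}
			buf = nbuf;
			cap *= 2;
		}
		buf[len++] = (wchar_t)ch;
	}
	if (ch == WEOF && len == 0) {
		free(buf);
		return (NULL);
	}
	buf[len] = L'\0';
	return (buf);
}
```

A caller would loop on read_long_line() until it returns NULL, then use
ferror() or errno to tell a read or allocation failure apart from a normal
end of file.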