From: Tim Kientzle
Date: Sun, 15 Aug 2010 15:07:51 -0700
To: Dimitry Andric
Cc: Doug Barton, Justin Hibbits, delphij@freebsd.org, Gabor Kovesdan,
    Steve Kargl, Dag-Erling Smørgrav, current@freebsd.org
Subject: Re: Official request: Please make GNU grep the default
Message-Id: <9C0F9422-439B-4DB8-A1C4-9F1749407FC5@kientzle.com>
In-Reply-To: <4C6844D8.5070602@andric.com>
List-Id: Discussions about the use of FreeBSD-current

On Aug 15, 2010, at 12:49 PM, Dimitry Andric wrote:

> So my first quick fix attempt was to replace the home-grown grep_fgetln
> with fgetln(3), which is in libc. This does not support gzip and bzip2
> files, but just to prove the point, it is enough. It gave the following
> profiling result:

FYI: libarchive has some pretty heavily-optimized bulk I/O routines
and handles automatic decompression (including gzip, bzip2, lzma, xz,
lzip, compress, and soon uuencode).

There's a trick supported in libarchive now that will let you use
its automatic decompression features on non-archive files (via
"format_raw"). Unfortunately, it only hands back binary blocks of
data; there's no nice line-reader interface.

There's an effort afoot to refactor libarchive so that the stream I/O
and compression/decompression support becomes a separate library,
which should be very useful for this sort of usage. As part of that,
we plan to add some line-oriented I/O features that should be
noticeably more efficient than stdio.

Cheers,

Tim
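
To illustrate the gap mentioned above (binary blocks in, lines out), here is a minimal sketch of a line reader layered over a block-oriented read callback. It is not libarchive code and not the planned line-oriented interface; every name in it (block_read_fn, line_reader, lr_init, lr_next, lr_free, mem_source, mem_read) is hypothetical. The assumption is that the callback could be backed by something like libarchive's archive_read_data() on a raw-format stream; here an in-memory source that hands out small chunks stands in for it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>  /* ssize_t */

/* Hypothetical callback: store up to len bytes in buf.
 * Returns bytes stored, 0 at end of input, negative on error. */
typedef ssize_t (*block_read_fn)(void *ctx, void *buf, size_t len);

struct line_reader {
    block_read_fn read;
    void *ctx;
    char *buf;     /* heap buffer holding unconsumed bytes */
    size_t cap;    /* allocated size of buf */
    size_t start;  /* offset of the first unconsumed byte */
    size_t end;    /* offset one past the last valid byte */
    int eof;
};

static void lr_init(struct line_reader *lr, block_read_fn fn, void *ctx)
{
    assert(lr != NULL && fn != NULL);
    memset(lr, 0, sizeof(*lr));
    lr->read = fn;
    lr->ctx = ctx;
}

static void lr_free(struct line_reader *lr)
{
    free(lr->buf);
    lr->buf = NULL;
}

/* Returns a pointer to the next line and stores its length in *len.
 * The line is NOT NUL-terminated and includes its '\n' if one was present.
 * The pointer is only valid until the next call. Returns NULL at end of
 * input or on error. */
static const char *lr_next(struct line_reader *lr, size_t *len)
{
    size_t scanned = 0;  /* bytes past start already searched for '\n' */

    for (;;) {
        size_t avail = lr->end - lr->start;

        if (avail > scanned) {
            char *nl = memchr(lr->buf + lr->start + scanned, '\n',
                              avail - scanned);
            if (nl != NULL) {
                const char *line = lr->buf + lr->start;
                *len = (size_t)(nl - line) + 1;
                lr->start += *len;
                return line;
            }
            scanned = avail;
        }
        if (lr->eof) {
            const char *line;
            if (avail == 0)
                return NULL;
            line = lr->buf + lr->start;  /* last line, no newline */
            *len = avail;
            lr->start = lr->end;
            return line;
        }
        /* Need more data: slide the partial line to the front ... */
        if (lr->start > 0) {
            memmove(lr->buf, lr->buf + lr->start, avail);
            lr->start = 0;
            lr->end = avail;
        }
        /* ... and grow the buffer if it is full. */
        if (lr->end == lr->cap) {
            size_t ncap = lr->cap ? lr->cap * 2 : 4096;
            char *nbuf = realloc(lr->buf, ncap);
            if (nbuf == NULL)
                return NULL;
            lr->buf = nbuf;
            lr->cap = ncap;
        }
        ssize_t n = lr->read(lr->ctx, lr->buf + lr->end, lr->cap - lr->end);
        if (n < 0)
            return NULL;
        if (n == 0)
            lr->eof = 1;
        else
            lr->end += (size_t)n;
    }
}

/* Stand-in block source for demonstration: serves a memory region in
 * pieces of at most `chunk` bytes, so lines straddle block boundaries. */
struct mem_source { const char *data; size_t size; size_t pos; size_t chunk; };

static ssize_t mem_read(void *ctx, void *buf, size_t len)
{
    struct mem_source *s = ctx;
    size_t n = s->size - s->pos;
    if (n > s->chunk) n = s->chunk;
    if (n > len) n = len;
    memcpy(buf, s->data + s->pos, n);
    s->pos += n;
    return (ssize_t)n;
}
```

A caller would run lr_init() once, loop on lr_next() until it returns NULL, then call lr_free(). Returning a pointer into the reader's own buffer, much as fgetln(3) does, avoids a copy per line; the cost is that the pointer is invalidated by the next call.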