From owner-freebsd-bugs@FreeBSD.ORG Fri Jun 25 23:00:10 2010
Message-Id: <201006252256.o5PMucNC037027@server.vk2pj.dyndns.org>
Date: Sat, 26 Jun 2010 08:56:38 +1000 (EST)
From: Peter Jeremy
To: FreeBSD-gnats-submit@FreeBSD.org
Subject: bin/148150: Poor file(1) performance
Reply-To: Peter Jeremy
X-Send-Pr-Version: 3.113

>Number: 148150
>Category: bin
>Synopsis: Poor file(1) performance
>Confidential: no
>Severity: non-critical
>Priority: low
>Responsible: freebsd-bugs
>State: open
>Quarter:
>Keywords:
>Date-Required:
>Class: sw-bug
>Submitter-Id: current-users
>Arrival-Date: Fri Jun 25 23:00:09 UTC 2010
>Closed-Date:
>Last-Modified:
>Originator: Peter Jeremy
>Release: FreeBSD 8.1-PRERELEASE amd64
>Organization: n/a
>Environment:
System: FreeBSD server.vk2pj.dyndns.org 8.1-PRERELEASE FreeBSD 8.1-PRERELEASE #4: Sun Jun 13 09:18:30 EST 2010 root@server.vk2pj.dyndns.org:/var/obj/usr/src/sys/server amd64
FreeBSD aspire.vk2pj.dyndns.org 8.1-PRERELEASE FreeBSD 8.1-PRERELEASE #12: Mon Jun 14 11:34:12 EST 2010 root@builder.vk2pj.dyndns.org:/obj/usr/src/sys/aspire i386
>Description:
I recently had reason to run file(1) on a large number of files and
felt the performance wasn't up to par. When I investigated further,
I found that about 95% of the runtime was spent on the two regexes
that recognize REXX files:

# OS/2 batch files are REXX. the second regex is a bit generic, oh well
# the matched commands seem to be common in REXX and uncommon elsewhere
100 regex/c =^[\ \t]{0,10}call[\ \t]{1,10}rxfunc OS/2 REXX batch file text
100 regex/c =^[\ \t]{0,10}say\ ['"] OS/2 REXX batch file text

Since REXX files are not present in my environment, I can avoid the
issue by just commenting out the offending lines. Someone with more
expertise in magic(5) might be able to suggest a better fix.

I have tried reporting this to the upstream maintainers and received
a "not interested" response.
>How-To-Repeat:
Copy /usr/share/misc/magic to magic.old
Apply the equivalent of the below patch to create magic.new
Run file(1) under time(1) on the same set of files using magic.old and magic.new

Using my home directory on my i386 netbook, I get:
file -m magic.new * > /dev/null 1.42s user 0.13s system 98% cpu 1.576 total
file -m magic.new * > /dev/null 1.35s user 0.10s system 98% cpu 1.469 total
file -m magic.new * > /dev/null 1.35s user 0.10s system 98% cpu 1.470 total
file -m magic.old * > /dev/null 33.35s user 0.11s system 98% cpu 34.055 total
file -m magic.old * > /dev/null 33.12s user 0.14s system 98% cpu 33.714 total
file -m magic.old * > /dev/null 33.08s user 0.11s system 98% cpu 33.606 total

Using my home directory on my amd64 desktop, I get:
file -m magic.new * > /dev/null 2.18s user 0.41s system 28% cpu 9.111 total
file -m magic.new * > /dev/null 2.11s user 0.49s system 24% cpu 10.707 total
file -m magic.new * > /dev/null 2.05s user 0.56s system 23% cpu 10.989 total
file -m magic.old * > /dev/null 28.54s user 0.51s system 78% cpu 37.088 total
file -m magic.old * > /dev/null 28.54s user 0.52s system 89% cpu 32.575 total
file -m magic.old * > /dev/null 28.71s user 0.47s system 99% cpu 29.371 total

The poorer wall-clock performance on my amd64 desktop is because it is
running ZFS without adequate RAM, whereas my netbook uses UFS on an SSD.
The directory contents on the two machines are also completely different.
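
The steps above could be scripted roughly as follows. This is only a
sketch, not part of the original report. It assumes that
/usr/share/misc/magic is a plain-text magic file (as on the systems
described here), that the REXX tests can be identified as "100 regex/c"
entries whose description mentions REXX, and that file(1) is on the PATH.
The helper names (comment_out_rexx, time_file, compare) are made up for
illustration. It measures wall-clock time only, whereas the figures above
also show user and system time.

```python
#!/usr/bin/env python3
"""Sketch of the How-To-Repeat steps: build magic.new, then time file(1)."""
import re
import subprocess
import sys
import time

# Assumption: the REXX tests are the "100 regex/c" entries whose description
# mentions REXX, as in the Magdir/msdos excerpt quoted in this report.
REXX_TEST = re.compile(rb"^100\s+regex/c\s.*REXX")


def comment_out_rexx(magic):
    """Return the magic data (bytes) with the REXX regex tests commented out."""
    return b"".join(
        b"#" + line if REXX_TEST.match(line) else line
        for line in magic.splitlines(keepends=True)
    )


def time_file(magic_path, paths):
    """Run file(1) with the given magic file; return wall-clock seconds."""
    start = time.perf_counter()
    subprocess.run(
        ["file", "-m", magic_path, "--"] + list(paths),
        stdout=subprocess.DEVNULL,
        check=False,
    )
    return time.perf_counter() - start


def compare(files, magic="/usr/share/misc/magic", runs=3):
    """Write magic.old and magic.new to the current directory and time both."""
    with open(magic, "rb") as src:
        old = src.read()
    with open("magic.old", "wb") as dst:
        dst.write(old)
    with open("magic.new", "wb") as dst:
        dst.write(comment_out_rexx(old))
    for name in ("magic.new", "magic.old"):
        for _ in range(runs):
            print(f"file -m {name}: {time_file(name, files):.3f}s total")


# Only run the comparison when file names are given on the command line.
if __name__ == "__main__" and sys.argv[1:]:
    compare(sys.argv[1:])
```

Saved under a name of your choosing and invoked with a list of files as
arguments (for example, the contents of a home directory), this prints
three timings for each magic file. The magic data is handled as bytes so
that any non-ASCII content in the magic file is passed through unchanged.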
>Fix:
The following just comments out the REXX test.

Index: Magdir/msdos
===================================================================
RCS file: /usr/ncvs/src/contrib/file/Magdir/msdos,v
retrieving revision 1.3
diff -u -r1.3 msdos
--- Magdir/msdos 4 May 2009 00:37:44 -0000 1.3
+++ Magdir/msdos 19 Jun 2010 03:23:23 -0000
@@ -18,8 +18,8 @@
 
 # OS/2 batch files are REXX. the second regex is a bit generic, oh well
 # the matched commands seem to be common in REXX and uncommon elsewhere
-100 regex/c =^[\ \t]{0,10}call[\ \t]{1,10}rxfunc OS/2 REXX batch file text
-100 regex/c =^[\ \t]{0,10}say\ ['"] OS/2 REXX batch file text
+#100 regex/c =^[\ \t]{0,10}call[\ \t]{1,10}rxfunc OS/2 REXX batch file text
+#100 regex/c =^[\ \t]{0,10}say\ ['"] OS/2 REXX batch file text
 
 0 leshort 0x14c MS Windows COFF Intel 80386 object file
 #>4 ledate x stamp %s
>Release-Note:
>Audit-Trail:
>Unformatted: