From: Tim Daneliuk
Organization: TundraWare Inc.
Date: Fri, 02 Jul 2010 14:14:43 -0500
CC: freebsd-questions@freebsd.org
Subject: Re: 'file' Command Giving False Positives
Message-ID: <4C2E3AA3.7080200@tundraware.com>
In-Reply-To: <20100702204249.1a7423ac.freebsd@edvax.de>

On 7/2/2010 1:42 PM, Polytropon wrote:
> On Fri, 02 Jul 2010 14:23:24 -0400, Lowell Gilbert wrote:
>> Apparently, your memory is better than mine, because that was indeed
>> what I was thinking of.  Which leads to the question of why magic(5)
>> lists LZ as representing "MS-DOS executable (built-in)".  I'd be
>> hesitant to change that unless we knew for sure it was wrong.
>
> As it has been mentioned before, .EXE is *one* of the formats
> executable in DOS. .COM executables do not have specific headers
> (as they are loaded directly). Also, .BAT are executable, although
> they are text files, and finally .BTM are also text file executables,
> specific to NDOS. As far as I also remember, there's .EXE on OS/2,
> too. One could argue if "Windows" .PIF are also executables. Of
> course, VMS also has .COM... but I see I'm making a digression... :-)
>
>> Even if it _is_ wrong, the "problem" still remains for "MZ" at least:
>> Any file starting with those letters is going to be identified as an
>> MS-DOS executable, and there's no clear way to distinguish it from a
>> text file that happens to start with those letters.
>
> Well, there's a solution that is not *that* complicated: If the
> file contains characters that don't match isprint(), i.e. those
> outside the ASCII set used in real text files, it's likely to be
> an executable.
>
> A scriptable solution might be to diff <file> vs. `strings
> <file>`. If they differ, it's not a text, so it might be an
> executable.
>
> I'm not sure if the magic identification string starting with MZ
> could be enlarged with other specific characters immediately
> following MZ that are *only* present in executables...
>
> The problem is that "MZ" itself is completely sufficient:
>
> 	% echo "MZ"> foo
> 	% file foo
> 	foo: MS-DOS executable
>
> Of course, that's not correct.

All noted (and appreciated).
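
For reference, the isprint() idea quoted above could be sketched
roughly as follows. This is only an illustration in Python, not
anything file(1) does internally; the names looks_like_text and
classify_mz are made up here, and treating string.printable as the
"text" byte set is an assumption.

```python
import string

# Bytes treated as "text": printable ASCII plus common whitespace.
# This choice is an assumption for illustration, not file(1)'s own rule.
TEXT_BYTES = frozenset(string.printable.encode("ascii"))


def looks_like_text(data: bytes) -> bool:
    """Return True if every byte is printable ASCII or whitespace."""
    return all(b in TEXT_BYTES for b in data)


def classify_mz(data: bytes) -> str:
    """Rough guess for data that may start with the MZ magic."""
    if not data.startswith(b"MZ"):
        return "not MZ"
    if looks_like_text(data):
        return "text that happens to start with MZ"
    return "binary starting with MZ (possibly an executable)"
```

A check like this separates `echo "MZ" > foo` from a real .EXE, but it
can say nothing more than "binary" about any other non-text file that
begins with the same two letters.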

In this case, the client has a situation where none of the above will
work: They can take in encrypted files that happen to have an MZ/LZ at
the beginning and binary data thereafter, yet are NOT executables.
They want to properly flag executables but not get false positives.

At this point, I'm inclined to believe that 'file' alone is
insufficient to do this and that, at best, even with more tools, it's
going to be a probabilities game, i.e. "What percentage of false
positives is acceptable?"

-- 
------------------------------------------------------------------------
Tim Daneliuk
tundra@tundraware.com
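
One possible way to shift those probabilities is to look past the two
magic bytes and ask whether the rest of the header is plausible. The
sketch below is a hedged illustration in Python, not a vetted detector:
the function names are hypothetical, the thresholds are assumptions,
and the field offsets are the usual DOS "MZ" header layout (e_cblp at
offset 2, e_cp at 4, e_cparhdr at 8, e_lfanew at 0x3C). It does not
handle the LZ case at all.

```python
import struct


def mz_header_plausible(data: bytes) -> bool:
    """Sanity-check a few DOS header fields after the MZ magic."""
    if len(data) < 28 or data[:2] != b"MZ":
        return False
    # e_cblp: bytes used in the last 512-byte page; e_cp: page count.
    e_cblp, e_cp = struct.unpack_from("<HH", data, 2)
    # e_cparhdr: header size in 16-byte paragraphs.
    (e_cparhdr,) = struct.unpack_from("<H", data, 8)
    if e_cblp >= 512 or e_cp == 0:
        return False
    header_bytes = e_cparhdr * 16
    # The header must cover the fixed fields and fit inside the file.
    return 28 <= header_bytes <= len(data)


def has_pe_signature(data: bytes) -> bool:
    """Check whether e_lfanew points at a 'PE\\0\\0' signature."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        return False
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    # An out-of-range offset yields an empty slice, which fails the test.
    return data[e_lfanew:e_lfanew + 4] == b"PE\0\0"
```

Random or encrypted data that begins with MZ will usually fail these
checks, but not always, so this only lowers the false-positive rate
rather than eliminating it.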