Message-ID: <4E23F51E.7040907@esiee.fr>
Date: Mon, 18 Jul 2011 10:55:58 +0200
From: Frank Bonnet
To: freebsd-questions@freebsd.org
References:
 <4E23E6DD.6050901@esiee.fr> <20110718101029.531397d9.freebsd@edvax.de>
 <4E23F0FE.3070305@esiee.fr> <20110718104545.2aacbd1b.freebsd@edvax.de>
In-Reply-To: <20110718104545.2aacbd1b.freebsd@edvax.de>
Subject: Re: Tools to find "unlegal" files ( videos , music etc )

On 07/18/2011 10:45 AM, Polytropon wrote:
> On Mon, 18 Jul 2011 10:38:22 +0200, Frank Bonnet wrote:
>> On 07/18/2011 10:10 AM, Polytropon wrote:
>>> On Mon, 18 Jul 2011 09:55:09 +0200, Frank Bonnet wrote:
>>>> Hello
>>>>
>>>> Does anyone know a utility that I could pipe to the "find" command
>>>> in order to detect video, music, games ... etc files ?
>>>>
>>>> I need a tool that could "inspect" inside files because many users
>>>> rename those files to "inoffensive" names :-)
>>> One way could be to define a list of file extensions that
>>> commonly match the content you want to track. Of course,
>>> the file name does not directly correspond to the content,
>>> but it often gives a good hint to search for *.wmv, *.flv,
>>> *.avi, *.mp(e)g, *.mp3, *.wma, *.exe - and of course all
>>> the variations of the extensions with uppercase letters.
>>> Also consider *.rar and maybe *.zip for compressed content.
>>>
>>> If file extensions have been manipulated (rare case), the
>>> "file" command can still identify the correct file type.
>>>
>> yes thanks , gonna try with the file command
> You could make a simple script that lists "file" output for
> all files (just to be sure because of possible suffix renaming)
> for further inspection.
> Sometimes, you can also run "strings"
> for a given file - maybe that can be used to identify typical
> suspicious string patterns for a "strings + grep" combination
> so less manual identification has to be done.
>

yes, my main problem is the huge number of files, but anyway
I'm gonna first check files greater than 500 MB,
it could be a good start
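
A rough sketch of that plan, assuming Python is available on the server: walk a directory tree, skip files below a size threshold, and classify the rest by their leading bytes instead of by name. This is not what file(1) does internally. The signature table is a small hand-picked stand-in for its magic database and will miss many formats (for example MP3 files without an ID3 tag). The names SIGNATURES, sniff and scan are made up for this illustration.

```python
import os

# Hypothetical, deliberately small signature table: (offset, magic bytes, label).
# file(1) knows far more formats; this only covers a few common ones.
SIGNATURES = [
    (0, b"RIFF", "RIFF container (AVI or WAV)"),
    (0, b"\x1a\x45\xdf\xa3", "Matroska/WebM video"),
    (0, b"FLV", "Flash video"),
    (0, b"\x30\x26\xb2\x75\x8e\x66\xcf\x11", "ASF container (WMV/WMA)"),
    (0, b"\x00\x00\x01\xba", "MPEG program stream"),
    (4, b"ftyp", "MP4/QuickTime family"),
    (0, b"ID3", "MP3 with ID3 tag"),
    (0, b"OggS", "Ogg container"),
    (0, b"PK\x03\x04", "ZIP archive"),
    (0, b"Rar!\x1a\x07", "RAR archive"),
    (0, b"MZ", "DOS/Windows executable"),
]


def sniff(path, nbytes=16):
    """Return a label if the first bytes match a known signature, else None."""
    with open(path, "rb") as fh:
        head = fh.read(nbytes)
    for offset, magic, label in SIGNATURES:
        if head[offset:offset + len(magic)] == magic:
            return label
    return None


def scan(root, min_size=0):
    """Walk root and return (path, label) for matching files >= min_size bytes."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # Skip symlinks and anything below the size threshold.
                if os.path.islink(path) or os.path.getsize(path) < min_size:
                    continue
                label = sniff(path)
            except OSError:
                # Unreadable or vanished file: ignore and keep going.
                continue
            if label is not None:
                hits.append((path, label))
    return hits
```

Calling scan("/home", min_size=500 * 1024 * 1024) would limit the pass to files of roughly 500 MB and up, which keeps the first run cheap on a large tree; /home is only an example path.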