From: David Christensen
To: freebsd-questions@freebsd.org
Subject: Re: grep for ascii nul
Date: Sat, 2 Nov 2019 19:30:05 -0700
In-Reply-To: <20191102064505.GA98558@admin.sibptus.ru>
References: <20191101092716.GA67658@admin.sibptus.ru> <63808.1572638827@segfault.tristatelogic.com> <20191102064505.GA98558@admin.sibptus.ru>

On 11/1/19 11:45 PM, Victor Sudakov wrote:
> Ronald F. Guilmette wrote:
>> In message <20191101092716.GA67658@admin.sibptus.ru>,
>> Victor Sudakov wrote:
>>
>>> I need to find files containing ascii null inside, and print their names to
>>> stdout.
>>
>> Unfortunately, you're banging up against a long-standing and rather
>> annoying non-feature of fgrep/grep/egrep, which is that unlike the
>> tr command, the grep family of commands does not support the \DDD
>> notation for specifying arbitrary byte values.  Thus, you cannot use
>> them to search for arbitrary byte values.
>>
>> I would thus suggest that you solve your problem using a Perl or C
>> program.
>
> Perl is not in the base system, so that is not quite the answer.
> I'm a big fan of awk, awk is in the base system and should be able to do
> it, right?
>
> $ hd trees.txt
> 00000000  66 69 72 0a 6f 61 6b 0a  63 65 64 00 61 72 0a 62  |fir.oak.ced.ar.b|
> 00000010  69 72 63 68 0a 70 61 6c  6d 0a                    |irch.palm.|
> 0000001a
> $
>
> Note the ascii null embedded in the word "cedar"
>
> $ awk '/\x66\x69/{print $0}' trees.txt
> fir
>
> So far so good.
> But with the ascii nul it behaves in an unexpected way:
>
> $ awk '/\x00/{print $0}' trees.txt
> fir
> oak
> ced
> birch
> palm
> $
>

2019-11-03 02:16:02 freebsd@fbsd112 ~/sandbox/sh
$ freebsd-version ; uname -a
11.2-RELEASE
FreeBSD fbsd112 11.2-RELEASE FreeBSD 11.2-RELEASE #0 r335510: Fri Jun 22 04:32:14 UTC 2018     root@releng2.nyi.freebsd.org:/usr/obj/usr/src/sys/GENERIC  amd64


Perl is one of the first things I install on FreeBSD systems:

root@fbsd112:~ # pkg install perl5
Updating FreeBSD repository catalogue...
FreeBSD repository is up to date.
All repositories are up to date.
The following 1 package(s) will be affected (of 0 checked):

New packages to be INSTALLED:
	perl5: 5.30.0

Number of packages to be installed: 1

The process will require 58 MiB more space.
14 MiB to be downloaded.


Solving your problem then becomes a Perl one-liner:

2019-11-03 02:16:11 freebsd@fbsd112 ~/sandbox/sh
$ hd hello.txt
00000000  68 65 6c 6c 6f 2c 20 77  6f 72 6c 64 21 0a        |hello, world!.|
0000000e

2019-11-03 02:16:31 freebsd@fbsd112 ~/sandbox/sh
$ hd trees.txt
00000000  66 69 72 0a 6f 61 6b 0a  63 65 64 00 61 72 0a 62  |fir.oak.ced.ar.b|
00000010  69 72 63 68 0a 70 61 6c  6d 0a                    |irch.palm.|
0000001a

2019-11-03 02:16:35 freebsd@fbsd112 ~/sandbox/sh
$ cat find-files-with-nul.sh
#!/bin/sh
perl -e 'while (<>) {$f{$ARGV}++ if /\x00/}; print keys %f' $@

2019-11-03 02:16:39 freebsd@fbsd112 ~/sandbox/sh
$ sh find-files-with-nul.sh hello.txt trees.txt
trees.txt


David
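
For anyone who would rather stay inside the base system, here is a rough sketch that leans on the point made earlier in the thread: tr does understand the \DDD octal notation. The idea is to delete every byte except NUL and count what is left. The function name files_with_nul is made up for illustration, LC_ALL=C is only there as a precaution so tr treats the input as plain bytes, and this has not been run on the 11.2-RELEASE machine shown above, so treat it as a starting point rather than a tested answer.

```shell
#!/bin/sh
# Sketch only: print the name of each file argument that contains a NUL byte.
# files_with_nul is a hypothetical helper name, not something from the thread.
files_with_nul() {
    for f in "$@"; do
        # Skip anything that is not a regular file.
        [ -f "$f" ] || continue
        # -c complements the set and -d deletes, so only NUL bytes survive;
        # wc -c then counts them. The arithmetic expansion strips the
        # leading blanks that some wc implementations print.
        n=$(( $(LC_ALL=C tr -cd '\000' < "$f" | wc -c) ))
        if [ "$n" -gt 0 ]; then
            printf '%s\n' "$f"
        fi
    done
}

files_with_nul "$@"
```

Saved under some name such as find-nul.sh (also hypothetical) and run as "sh find-nul.sh hello.txt trees.txt", it should print only trees.txt for the two files dumped above. It prints one name per line and quotes "$@", so file names containing spaces are passed through intact.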