Date: Sun, 3 Nov 2019 10:37:11 -0800
From: David Christensen <dpchrist@holgerdanske.com>
To: freebsd-questions@freebsd.org
Subject: Re: grep for ascii nul
Message-ID: <6fbdd961-fc17-0479-d3a8-1366f0630872@holgerdanske.com>
In-Reply-To: <f566209e-1def-7993-ff7f-87e7ea67151b@holgerdanske.com>
References: <20191101092716.GA67658@admin.sibptus.ru> <63808.1572638827@segfault.tristatelogic.com> <20191102064505.GA98558@admin.sibptus.ru> <f566209e-1def-7993-ff7f-87e7ea67151b@holgerdanske.com>
On 11/2/19 7:30 PM, David Christensen wrote:
> On 11/1/19 11:45 PM, Victor Sudakov wrote:
>> Ronald F. Guilmette wrote:
>>> In message <20191101092716.GA67658@admin.sibptus.ru>,
>>> Victor Sudakov <vas@sibptus.ru> wrote:
>>>
>>>> I need to find files containing ascii null inside, and print their
>>>> names to stdout.
>>>
>>> Unfortunately, you're banging up against a long-standing and rather
>>> annoying non-feature of fgrep/grep/egrep, which is that unlike the
>>> tr command, the grep family of commands does not support the \DDD
>>> notation for specifying arbitrary byte values. Thus, you cannot use
>>> them to search for arbitrary byte values.
>>>
>>> I would thus suggest that you solve your problem using a Perl or C
>>> program.
>>
>> Perl is not in the base system, so that is not quite the answer.
>> I'm a big fan of awk, awk is in the base system and should be able to do
>> it, right?
>>
>> $ hd trees.txt
>> 00000000  66 69 72 0a 6f 61 6b 0a  63 65 64 00 61 72 0a 62  |fir.oak.ced.ar.b|
>> 00000010  69 72 63 68 0a 70 61 6c  6d 0a                    |irch.palm.|
>> 0000001a
>> $
>>
>> Note the ascii null embedded in the word "cedar"
>>
>> $ awk '/\x66\x69/{print $0}' trees.txt
>> fir
>>
>> So far so good. But with the ascii nul it behaves in an unexpected way:
>>
>> $ awk '/\x00/{print $0}' trees.txt
>> fir
>> oak
>> ced
>> birch
>> palm
>> $
>>
>
> 2019-11-03 02:16:02 freebsd@fbsd112 ~/sandbox/sh
> $ freebsd-version ; uname -a
> 11.2-RELEASE
> FreeBSD fbsd112 11.2-RELEASE FreeBSD 11.2-RELEASE #0 r335510: Fri Jun 22
> 04:32:14 UTC 2018
> root@releng2.nyi.freebsd.org:/usr/obj/usr/src/sys/GENERIC amd64
>
>
> Perl is one of the first things I install on FreeBSD systems:
>
> root@fbsd112:~ # pkg install perl5
> Updating FreeBSD repository catalogue...
> FreeBSD repository is up to date.
> All repositories are up to date.
> The following 1 package(s) will be affected (of 0 checked):
>
> New packages to be INSTALLED:
>     perl5: 5.30.0
>
> Number of packages to be installed: 1
>
> The process will require 58 MiB more space.
> 14 MiB to be downloaded.
>
> <snip>
>
>
> Solving your problem then becomes a Perl one-liner:
>
> 2019-11-03 02:16:11 freebsd@fbsd112 ~/sandbox/sh
> $ hd hello.txt
> 00000000  68 65 6c 6c 6f 2c 20 77  6f 72 6c 64 21 0a        |hello, world!.|
> 0000000e
>
> 2019-11-03 02:16:31 freebsd@fbsd112 ~/sandbox/sh
> $ hd trees.txt
> 00000000  66 69 72 0a 6f 61 6b 0a  63 65 64 00 61 72 0a 62  |fir.oak.ced.ar.b|
> 00000010  69 72 63 68 0a 70 61 6c  6d 0a                    |irch.palm.|
> 0000001a
>
> 2019-11-03 02:16:35 freebsd@fbsd112 ~/sandbox/sh
> $ cat find-files-with-nul.sh
> #!/bin/sh
> perl -e 'while (<>) {$f{$ARGV}++ if /\x00/}; print keys %f' $@
>
> 2019-11-03 02:16:39 freebsd@fbsd112 ~/sandbox/sh
> $ sh find-files-with-nul.sh hello.txt trees.txt
> trees.txt

find-files-with-nul.sh has a defect -- it does not print newlines
between filenames:

2019-11-03 18:35:01 freebsd@fbsd112 ~/sandbox/sh
$ hd trees2.txt
00000000  70 69 6e 00 65 0a                                 |pin.e.|
00000006

2019-11-03 18:35:14 freebsd@fbsd112 ~/sandbox/sh
$ sh find-files-with-nul.sh *.txt
trees.txttrees2.txt

Here is the corrected version:

2019-11-03 18:36:07 freebsd@fbsd112 ~/sandbox/sh
$ cat find-files-with-nul.sh
#!/bin/sh
perl -e 'while (<>) {$f{$ARGV}++ if /\x00/}; print map {"$_\n"} keys %f' $@

2019-11-03 18:36:15 freebsd@fbsd112 ~/sandbox/sh
$ sh find-files-with-nul.sh *.txt
trees.txt
trees2.txt

David
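
A possible variation for systems without Perl, since the original
question asked for base-system tools. This sketch is not from the
thread: it assumes POSIX tr and cmp, and the function name
files_with_nul is hypothetical. It deletes NUL bytes with tr and reports
a file when the stripped stream differs from the original. It also
passes "$@" quoted, so filenames containing whitespace stay intact,
which an unquoted $@ does not guarantee.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): print the name of each
# argument file that contains at least one NUL byte, one name per line.
files_with_nul() {
    for f in "$@"; do
        # Skip anything that is not a readable regular file.
        [ -f "$f" ] && [ -r "$f" ] || continue
        # tr deletes every NUL byte; cmp -s silently compares the
        # stripped stream (stdin, "-") with the original file.
        # A nonzero status means the two differ, i.e. a NUL was removed.
        if ! tr -d '\000' < "$f" | cmp -s - "$f"; then
            printf '%s\n' "$f"
        fi
    done
}

files_with_nul "$@"
```

Saved under a hypothetical name such as nul-files.sh, it would be run
the same way as the Perl version: sh nul-files.sh *.txt. Names are
printed in argument order. The cost is that each file is read twice,
once by tr and once by cmp.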