Date:      Sun, 3 Nov 2019 10:37:11 -0800
From:      David Christensen <dpchrist@holgerdanske.com>
To:        freebsd-questions@freebsd.org
Subject:   Re: grep for ascii nul
Message-ID:  <6fbdd961-fc17-0479-d3a8-1366f0630872@holgerdanske.com>
In-Reply-To: <f566209e-1def-7993-ff7f-87e7ea67151b@holgerdanske.com>
References:  <20191101092716.GA67658@admin.sibptus.ru> <63808.1572638827@segfault.tristatelogic.com> <20191102064505.GA98558@admin.sibptus.ru> <f566209e-1def-7993-ff7f-87e7ea67151b@holgerdanske.com>

On 11/2/19 7:30 PM, David Christensen wrote:
> On 11/1/19 11:45 PM, Victor Sudakov wrote:
>> Ronald F. Guilmette wrote:
>>> In message <20191101092716.GA67658@admin.sibptus.ru>,
>>> Victor Sudakov <vas@sibptus.ru> wrote:
>>>
>>>> I need to find files containing ascii null inside, and print their
>>>> names to stdout.
>>>
>>> Unfortunately, you're banging up against a long-standing and rather
>>> annoying non-feature of fgrep/grep/egrep, which is that unlike the
>>> tr command, the grep family of commands does not support the \DDD
>>> notation for specifying arbitrary byte values.  Thus, you cannot use
>>> them to search for arbitrary byte values.
>>>
>>> I would thus suggest that you solve your problem using a Perl or C
>>> program.
>>
>> Perl is not in the base system, so that is not quite the answer.
>> I'm a big fan of awk, awk is in the base system and should be able to do
>> it, right?
>>
>> $ hd trees.txt
>> 00000000  66 69 72 0a 6f 61 6b 0a  63 65 64 00 61 72 0a 62  |fir.oak.ced.ar.b|
>> 00000010  69 72 63 68 0a 70 61 6c  6d 0a                    |irch.palm.|
>> 0000001a
>> $
>>
>> Note the ascii null embedded in the word "cedar"
>>
>> $ awk '/\x66\x69/{print $0}' trees.txt
>> fir
>>
>> So far so good. But with the ascii nul it behaves in an unexpected way:
>>
>> $ awk '/\x00/{print $0}' trees.txt
>> fir
>> oak
>> ced
>> birch
>> palm
>> $
>>
>>
> 
> 2019-11-03 02:16:02 freebsd@fbsd112 ~/sandbox/sh
> $ freebsd-version ; uname -a
> 11.2-RELEASE
> FreeBSD fbsd112 11.2-RELEASE FreeBSD 11.2-RELEASE #0 r335510: Fri Jun 22 04:32:14 UTC 2018 root@releng2.nyi.freebsd.org:/usr/obj/usr/src/sys/GENERIC  amd64
> 
> 
> Perl is one of the first things I install on FreeBSD systems:
> 
> root@fbsd112:~ # pkg install perl5
> Updating FreeBSD repository catalogue...
> FreeBSD repository is up to date.
> All repositories are up to date.
> The following 1 package(s) will be affected (of 0 checked):
> 
> New packages to be INSTALLED:
>      perl5: 5.30.0
> 
> Number of packages to be installed: 1
> 
> The process will require 58 MiB more space.
> 14 MiB to be downloaded.
> 
> <snip>
> 
> 
> Solving your problem then becomes a Perl one-liner:
> 
> 2019-11-03 02:16:11 freebsd@fbsd112 ~/sandbox/sh
> $ hd hello.txt
> 00000000  68 65 6c 6c 6f 2c 20 77  6f 72 6c 64 21 0a        |hello, world!.|
> 0000000e
> 
> 2019-11-03 02:16:31 freebsd@fbsd112 ~/sandbox/sh
> $ hd trees.txt
> 00000000  66 69 72 0a 6f 61 6b 0a  63 65 64 00 61 72 0a 62  |fir.oak.ced.ar.b|
> 00000010  69 72 63 68 0a 70 61 6c  6d 0a                    |irch.palm.|
> 0000001a
> 
> 2019-11-03 02:16:35 freebsd@fbsd112 ~/sandbox/sh
> $ cat find-files-with-nul.sh
> #!/bin/sh
> perl -e 'while (<>) {$f{$ARGV}++ if /\x00/}; print keys %f' "$@"
> 
> 2019-11-03 02:16:39 freebsd@fbsd112 ~/sandbox/sh
> $ sh find-files-with-nul.sh hello.txt trees.txt
> trees.txt

find-files-with-nul.sh has a defect -- it does not print newlines 
between filenames:

2019-11-03 18:35:01 freebsd@fbsd112 ~/sandbox/sh
$ hd trees2.txt
00000000  70 69 6e 00 65 0a                                 |pin.e.|
00000006

2019-11-03 18:35:14 freebsd@fbsd112 ~/sandbox/sh
$ sh find-files-with-nul.sh *.txt
trees.txttrees2.txt


Here is the corrected version:

2019-11-03 18:36:07 freebsd@fbsd112 ~/sandbox/sh
$ cat find-files-with-nul.sh
#!/bin/sh
perl -e 'while (<>) {$f{$ARGV}++ if /\x00/}; print map {"$_\n"} keys %f' "$@"

2019-11-03 18:36:15 freebsd@fbsd112 ~/sandbox/sh
$ sh find-files-with-nul.sh *.txt
trees.txt
trees2.txt
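
For systems where Perl is not installed, the same check can probably be done with base-system tools only. The following is a minimal sketch, not the script used above: it relies on tr accepting the \DDD octal notation mentioned earlier in the thread, and on cmp reading standard input when given "-" as a file name. The function name files_with_nul is hypothetical, chosen here for illustration.

```shell
#!/bin/sh
# Sketch of a base-system alternative (no Perl required).
# files_with_nul is an illustrative name, not something from the thread.
files_with_nul() {
    for f in "$@"; do
        status=0
        # Delete every NUL byte with tr, then compare the result with the
        # original file. If they differ, the file contained at least one NUL.
        tr -d '\000' < "$f" | cmp -s - "$f" || status=$?
        # cmp exits 1 when the inputs differ; exit status 2 or higher means
        # an error (for example an unreadable file), which is skipped here.
        if [ "$status" -eq 1 ]; then
            printf '%s\n' "$f"
        fi
    done
}

# Check every file named on the command line.
files_with_nul "$@"
```

Saved as a script and run against hello.txt, trees.txt and trees2.txt from the transcripts above, this should list the two trees files and skip hello.txt. Names come out in argument order, whereas the order of keys %f in the Perl version is not guaranteed.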


David




