Date: Tue, 7 Nov 2017 20:36:26 +0300
From: Yuri Pankov <yuripv@gmx.com>
To: byrnejb@harte-lyne.ca, freebsd-questions@freebsd.org
Subject: Re: sed - remove nul lines from file
Message-ID: <88a59a82-2902-9f63-0a94-bd23b910e7ad@gmx.com>
In-Reply-To: <b21bf201363c34a90ab55c4a05ff8fd7.squirrel@webmail.harte-lyne.ca>
References: <b21bf201363c34a90ab55c4a05ff8fd7.squirrel@webmail.harte-lyne.ca>

On Tue, 7 Nov 2017 12:12:55 -0500, James B Byrne Via Freebsd-questions wrote:
>
> I have a data file created by an ancient proprietary scripting
> language called QTP. There is a bug in this program which, on
> occasion, manifests itself by inserting output records consisting
> entirely of nul (^@) (\x00) bytes at regular intervals. In the
> present case every 47th. record consists entirely of nuls.
>
> The purpose of this data file is to feed a psql COPY statement for
> loading into a PostgreSQL database. The presence of the NUL
> characters prevents this. I have previously used the tr utility to
> remove the NUL characters but this requires me to manually remove the
> residual empty lines.
>
> I have tried various permutations of the sed invocation reproduced
> below to remove these lines directly but without success. The
> examples that I have found on StackExchange and various other
> self-help sites do not give the results claimed, at least not for me
> on FreeBSD. So, I would appreciate if anyone here can point out what I
> am doing wrong or how the sed on FreeBSD differs in behaviour from that
> used in the examples I have found.
>
> Given a file INFILE with records containing the following:
>
> . . .
> *93566000008166*,*CCTL*,*3072 49534494 *
> *93566000008166*,*CCTL*,*3072 49534493 *
> *93566000008166*,*CCTL*,*3072 49534497 *
> *93566000015962*,*CCTL*,*8156 4171000541 *
> ^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@ . . .
> *93566000198850*,*CCTL*,*417 1003874 *
> *93566000010320*,*CCTL*,*8084 2601553853102 *
> . . .
>
> I wish to remove (all) the line(s) with the nul (^@) characters. I
> have tried this:
>
> sed '/^\x00*$/d' INFILE > INFILE.sed
>
> and this:
>
> sed -E '/^\x00*$/d' INFILE > INFILE.sed
>
> but neither these nor the many other combinations that I have tried
> remove the lines. What is the method of accomplishing this in sed or
> is it not possible?
>

Apparently, our regex engine doesn't accept the '\x' syntax. Try a
slightly more complicated, but standard, way :-)

sed '/[[.NUL.]]/d'
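
If the collating-symbol form is not available on a given system, the tr approach mentioned in the question can be chained with sed so that the leftover empty lines do not have to be removed by hand. The following is only a rough sketch: the temporary directory and the INFILE.clean name are placeholders made up for the illustration, the sample records are copied from the question, and it assumes the real data has no NUL bytes inside valid records and no empty lines that need to be kept (both would be removed as well).

```shell
# Build a small sample file: two data records with an all-NUL record
# between them. The paths are placeholders for illustration only.
tmpdir=$(mktemp -d)
printf '*93566000015962*,*CCTL*,*8156 4171000541 *\n' > "$tmpdir/INFILE"
printf '\000\000\000\000\n' >> "$tmpdir/INFILE"
printf '*93566000198850*,*CCTL*,*417 1003874 *\n' >> "$tmpdir/INFILE"

# Strip every NUL byte, then delete the lines that are left empty.
tr -d '\000' < "$tmpdir/INFILE" | sed '/^$/d' > "$tmpdir/INFILE.clean"

# Show the cleaned records.
cat "$tmpdir/INFILE.clean"
```

This avoids relying on how any particular regex engine spells the NUL character, at the cost of an extra process in the pipeline.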