From owner-freebsd-questions@freebsd.org Tue Nov 7 17:36:42 2017
Subject: Re: sed - remove nul lines from file
To: byrnejb@harte-lyne.ca, freebsd-questions@freebsd.org
From: Yuri Pankov
Message-ID: <88a59a82-2902-9f63-0a94-bd23b910e7ad@gmx.com>
Date: Tue, 7 Nov 2017 20:36:26 +0300

On Tue, 7 Nov 2017 12:12:55 -0500, James B Byrne Via Freebsd-questions wrote:
>
> I have a data file created by an ancient proprietary scripting
> language called QTP. There is a bug in this program which, on
> occasion, manifests itself by inserting output records consisting
> entirely of nul (^@) (\x00) bytes at regular intervals. In the
> present case every 47th record consists entirely of nuls.
>
> The purpose of this data file is to feed a psql COPY statement for
> loading into a PostgreSQL database. The presence of the NUL
> characters prevents this. I have previously used the tr utility to
> remove the NUL characters but this requires me to manually remove the
> residual empty lines.
>
> I have tried various permutations of the sed invocation reproduced
> below to remove these lines directly but without success. The
> examples that I have found on StackExchange and various other
> self-help sites do not give the results claimed, at least not for me
> on FreeBSD. So, I would appreciate it if anyone here can point out what
> I am doing wrong or how the sed on FreeBSD differs in behaviour from
> that used in the examples I have found.
>
> Given a file INFILE with records containing the following:
>
> . . .
> *93566000008166*,*CCTL*,*3072 49534494 *
> *93566000008166*,*CCTL*,*3072 49534493 *
> *93566000008166*,*CCTL*,*3072 49534497 *
> *93566000015962*,*CCTL*,*8156 4171000541 *
> ^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@ . . .
> *93566000198850*,*CCTL*,*417 1003874 *
> *93566000010320*,*CCTL*,*8084 2601553853102 *
> . . .
>
> I wish to remove (all) the line(s) with the nul (^@) characters. I
> have tried this:
>
> sed '/^\x00*$/d' INFILE > INFILE.sed
>
> and this:
>
> sed -E '/^\x00*$/d' INFILE > INFILE.sed
>
> but neither these nor the many other combinations that I have tried
> remove the lines. What is the method of accomplishing this in sed or
> is it not possible?
>

Apparently, our regex engine doesn't accept the '\x' syntax; try a
slightly more complicated, but standard, way :-)

sed '/[[.NUL.]]/d'
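
If a given sed does not accept the collating-symbol form, one fallback is
to combine the tr approach mentioned in the question with a second pass
that deletes the leftover empty lines. What follows is only a sketch: the
function name strip_nul_records and the sample file names are made up for
illustration, and it assumes the data contains no empty lines or embedded
NUL bytes that need to be preserved.

```shell
# Hypothetical helper: delete every NUL byte, then drop the lines left empty.
# Caveat: lines that were already empty in the input are removed as well.
strip_nul_records() {
    tr -d '\000' | sed '/^$/d'
}

# Build a small sample: two data records around one all-NUL record.
printf 'rec1\n\000\000\000\nrec2\n' > sample.dat

# Filter the sample and show the cleaned result.
strip_nul_records < sample.dat > sample.clean
cat sample.clean
```

Against the real data this would be run along the lines of
strip_nul_records < INFILE > INFILE.clean (the output name is arbitrary).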