Subject: Re: sed - remove nul lines from file
To: freebsd-questions@freebsd.org
From: Noel
Date: Tue, 7 Nov 2017 15:22:27 -0600
Message-ID: <5ca5d4fd-122b-814d-5632-553c497b25ec@gmail.com>
In-Reply-To: <20171107200908.f9358f33.freebsd@edvax.de>
References: <20171107193652.7b0aa08f.freebsd@edvax.de> <76aef2fd3792a0d9291b90cb74b6924f.squirrel@webmail.harte-lyne.ca> <20171107200908.f9358f33.freebsd@edvax.de>

On 11/7/2017 1:09 PM, Polytropon wrote:
> On Tue, 7 Nov 2017 13:54:41 -0500, James B. Byrne wrote:
>> On Tue, November 7, 2017 13:36, Polytropon wrote:
>>> On Tue, 7 Nov 2017 12:12:55 -0500, James B. Byrne via
>>> freebsd-questions wrote:
>>>> I have a data file created by an ancient proprietary scripting
>>>> language called QTP.
>>>> There is a bug in this program which, on
>>>> occasion, manifests itself by inserting output records consisting
>>>> entirely of nul (^@) (\x00) bytes at regular intervals. In the
>>>> present case every 47th record consists entirely of nuls.
>> ...
>>> In this case, awk can also help:
>>>
>>> $ awk '(length > 0)' < infile.txt > outfile.txt
>>>
>>> This will print all lines which are longer than 0 characters.
>>>
>> Thank you very much. This worked exactly as I required.
>>
>> I infer from this that awk does not consider nul a character and its
>> presence does not count towards the length of a record, which is
>> counterintuitive to me. A nul takes up the same space as any other
>> character, so why is it not counted? I would not have tried this
>> construction for that reason.
> This example was actually meant for empty lines, i.e., those
> where the NULs have already been removed (for example with the
> tr -d command), but it seems that awk actually ignores the NULs.
>
> Let's say this is the test input:
>
>     foo
>     bar
>     ^@^@^@^@^@^@^@^@
>     baz
>     meow
>
> When fed into the awk command mentioned above, the NULs are
> magically removed:
>
>     $ awk '(length > 0)' < nul.txt
>     foo
>     bar
>     baz
>     meow
>
> This is an interesting behaviour, but it fits the current problem
> quite well: it removes NULs and empty lines. :-)
>

I'd probably just use grep.

grep '^[[:print:]]' INFILE > OUTFILE

i.e., any line that starts with a printable character is copied to
OUTFILE. This will skip all-NUL and empty lines.
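
For anyone who wants to try the suggestions from this thread on a
throwaway file first, here is a minimal sketch. The file names and the
temporary directory are made up for illustration. Two things in it are
my assumptions rather than anything stated above: the -a flag (some
grep builds treat input containing NUL bytes as binary and print only a
"Binary file matches" notice without it), and running tr -d before awk
(how awk counts NULs seems to differ between implementations, so
stripping them first avoids relying on that).

```shell
#!/bin/sh
# Hypothetical scratch area and file names, for illustration only.
tmp=$(mktemp -d)

# Test input: two text lines, one all-NUL line, one empty line, two text lines.
printf 'foo\nbar\n\000\000\000\000\n\nbaz\nmeow\n' > "$tmp/nul.txt"

# Variant 1: delete NUL bytes with tr, then let awk drop the now-empty lines.
tr -d '\000' < "$tmp/nul.txt" | awk 'length > 0' > "$tmp/out_tr.txt"

# Variant 2: keep only lines whose first character is printable.
# -a (assumption, not in the original command) forces text mode;
# LC_ALL=C keeps [[:print:]] limited to printable ASCII.
LC_ALL=C grep -a '^[[:print:]]' "$tmp/nul.txt" > "$tmp/out_grep.txt"

# Both files should now hold only the four text lines.
cat "$tmp/out_tr.txt"
cat "$tmp/out_grep.txt"
```

One difference to keep in mind: the grep pattern also drops lines that
begin with a tab or another non-printable character, while the tr/awk
variant keeps them, so the two are only equivalent when every real
record starts with a printable character.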