Date: Sun, 12 Feb 2006 11:53:59 +0100
From: Kristian Vaaf
To: Parv
Cc: questions@freebsd.org
Subject: Re: Script to clean text files
Message-id: <7.0.1.0.2.20060212114457.0219ab78@broadpark.no>
In-reply-to: <20060211214549.GA1674@holestein.holy.cow>
References: <7.0.1.0.2.20060211172807.0214a4b8@broadpark.no> <20060211214549.GA1674@holestein.holy.cow>

At 22:45 11.02.2006, Parv wrote:
>in message <7.0.1.0.2.20060211172807.0214a4b8@broadpark.no>,
>wrote Kristian Vaaf thusly...
> >
> > Among other things, this script is suppose to add an empty line at
> > the bottom of a file.
> >
> > But somehow it always removes the first line in a text file,
> > how do I stop this?
>
>Can you provide a small sample file complete w/ things that you
>want to remove?
>
> > #!/usr/local/bin/bash
> > #
> > # Remove CRLF, trailing whitespace and double lines.
>
>What are "double lines"?
>
> > # $ARBA: clean.sh,v 1.0 2007/11/11 15:09:05 vaaf Exp $
> > #
> > for file in `find -s . -type f -not -name ".*"`; do
> >         if file -b "$file" | grep -q 'text'; then
> >                 echo >> "$file"
> >                 perl -i -pe 's/\015$//' "$file"
> >                 perl -i -pe 's/[^\S\n]+$//g' "$file"
>
>Why do you have two perl runs?  More importantly, you will remove
>anything which is not whitespace or not newline.  That means, in the
>end, you should have a file filled w/ whitespace only.
>
> >                 perl -pi -00 -e 1 "$file"
> >                 echo "$file: Done"
> >         fi
> > done
>
>To remove CRLF, trailing whitespace, and 2 consecutive blank lines
>...
>
>  {
>    tr -d '\r' < "$file" \
>    | sed -E -e 's/[[:space:]]+$//' \
>    | cat -s - > "${file}.tmp"
>  } && mv -f "${file}.tmp" "$file"
>
>
>  - Parv
>
>--

Hello Parv!

Yes, I meant blank lines :)

I've used the script for a long time now. The only error is that it
removes the blank line at the top of a file, if there is one, which is
a bit annoying. That's fine for scripts with shebangs, but not for
documents with a custom layout. I just wanted to know where that
error was.

I use the Perl runs because those were the only ones people gave me.
You know how it is: you enter a FreeBSD help channel and ask how to do
this or that, and the upper gentlemen always reply "Learn Perl," and
then they go on giving you Perl runs :)

Your suggestion looks very, very good. So is this alright?

#!/usr/local/bin/bash
#
# Remove CRLF, trailing whitespace and blank lines.
# $ARBA: clean.sh,v 1.0 2007/11/11 15:09:05 vaaf Exp $
#
for file in `find -s . -type f -not -name ".*"`; do
        if file -b "$file" | grep -q 'text'; then
                echo >> "$file"
                tr -d '\r' < "$file"
                sed -E -e 's/[[:space:]]+$//'
                cat -s - > "${file}.tmp" && mv -f "${file}.tmp" "$file"
                echo "$file: Done"
        fi
done

All the best man,
Vaaf
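
For comparison, here is a minimal sketch of the cleaning step with the three filters joined into one pipeline, as in Parv's quoted suggestion. The filters only act as a unit when each command's output is piped into the next. The helper names clean_stream and clean_file are hypothetical and appear nowhere in the thread; this is one possible arrangement, not a tested replacement for clean.sh.

```shell
#!/bin/sh
# Sketch only: clean_stream and clean_file are hypothetical helper names.

# Filter stdin to stdout: delete carriage returns, strip trailing
# whitespace, and squeeze runs of blank lines down to a single one.
clean_stream() {
    tr -d '\r' \
    | sed -E -e 's/[[:space:]]+$//' \
    | cat -s
}

# Clean one file in place through a temporary file. The original is
# overwritten only if the pipeline returns success.
clean_file() {
    clean_stream < "$1" > "$1.tmp" && mv -f "$1.tmp" "$1"
}
```

Inside the loop, calling clean_file "$file" would stand in for the tr, sed and cat lines. Because cat -s only squeezes repeated blank lines, a single blank line at the top of a file is left in place.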