Date:      Tue, 9 Apr 2002 21:01:41 +0300
From:      Giorgos Keramidas <keramida@ceid.upatras.gr>
To:        ann kok <annkok2001@yahoo.com>
Cc:        freebsd-questions@FreeBSD.ORG
Subject:   Re: awk question
Message-ID:  <20020409180140.GL67632@hades.hell.gr>
In-Reply-To: <20020409163454.55611.qmail@web20107.mail.yahoo.com>
References:  <20020409163454.55611.qmail@web20107.mail.yahoo.com>

On 2002-04-09 09:34, ann kok wrote:
> Hi all
>
> I have the following data in a file
>
> 2002-06-07      brian
> 2003-04-25      ann
>
> How do I compare the first colum 'date format'
>
> Didn't work the follow!
>
> awk '{if($1 > 2003-01-01) print $0}'

Because 2003-01-01 is not a number: unquoted, awk evaluates it as the
subtraction 2003 - 1 - 1.  For *this* date format, YYYY-MM-DD, where
YYYY is the year, MM the month number, and DD the day of the month,
simply removing the dash characters from the string yields a number
that you can use in comparisons.
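
A quick way to see what goes wrong with the original attempt is
sketched below.  It assumes any POSIX-style awk and feeds the two
sample lines in with printf instead of reading a file:

```shell
# Unquoted, 2003-01-01 is arithmetic: 2003 - 1 - 1, so this prints 2001.
awk 'BEGIN { print 2003-01-01 }'

# The original test therefore compares $1 against 2001.  $1 does not
# look like a number, so awk falls back to a string comparison, and
# both sample lines are printed.
printf '2002-06-07      brian\n2003-04-25      ann\n' |
	awk '{ if ($1 > 2003-01-01) print $0 }'
```

Stripping the dashes, as described above, avoids both problems.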

For instance, if you have "2003-01-01" in a string variable called
`date' you can do what is shown in the example below:

	% cat sample.awk
	{
		date = $1;
		gsub("-", "", date);
		print date,$2;
	}

	% cat datafile
	2002-06-07      brian
	2003-04-25      ann

	% awk -f sample.awk < datafile
	20020607 brian
	20030425 ann

Using gsub() you can remove the "-" characters from dates.  Then
"2003-01-01" becomes 20030101, which can be treated as a number and is
certainly larger than 20021231 because the year is listed first.  A
simple awk script that prints all the records from datafile that are
on or after 2003-04-01 but before 2003-05-01 (it should match only the
"ann" line in your sample datafile) is shown below:

	% cat sample2.awk
	BEGIN {
		mindate = "2003-04-01";
		maxdate = "2003-05-01";

		gsub("-", "", mindate);
		gsub("-", "", maxdate);
	}

	{
		date = $1;
		gsub("-", "", date);
		if (date >= mindate && date < maxdate) {
			print $0;
		}
	}

	% awk -f sample2.awk < datafile
	2003-04-25      ann
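
As an aside, and only as a sketch rather than a replacement for the
scripts above: because YYYY-MM-DD is zero-padded and fixed-width,
plain string comparison already orders such dates correctly.  Quoting
the date in the original one-liner should therefore be enough for this
particular format.  The file name `datafile' is the same sample file
used above, recreated here so the example is self-contained:

```shell
# Recreate the two-line sample file from the original question.
printf '2002-06-07      brian\n2003-04-25      ann\n' > datafile

# With the date quoted, awk compares $1 and the constant as strings.
awk '$1 > "2003-01-01"' datafile
```

This only holds while every date has the same width; for dates like
2003-4-5 the string order would no longer match the calendar order.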

Combined with awk's -v option, which passes variable values from the
command line, this should help a bit, I guess :)

For instance, after removing the explicit assignments to mindate and
maxdate from the BEGIN{} block above (keep the two gsub() calls), you
can set mindate and maxdate from the command line as shown below:

	% awk -v mindate="2002-06-01" \
	      -v maxdate="2002-07-01" \
	      -f sample2.awk < datafile

This should print only the "brian" line!
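
For completeness, here is a sketch of what the modified script could
look like.  The file name `sample3.awk' is made up for this example;
the script is sample2.awk with the two assignments dropped, so that
mindate and maxdate must be supplied with -v:

```shell
# Recreate the sample data file.
cat > datafile <<'EOF'
2002-06-07      brian
2003-04-25      ann
EOF

# sample3.awk (hypothetical name): mindate and maxdate come from -v.
cat > sample3.awk <<'EOF'
BEGIN {
	# Strip the dashes from the limits once, before any input is read.
	gsub("-", "", mindate);
	gsub("-", "", maxdate);
}

{
	date = $1;
	gsub("-", "", date);
	# Print records dated on or after mindate and before maxdate.
	if (date >= mindate && date < maxdate) {
		print $0;
	}
}
EOF

awk -v mindate="2002-06-01" \
    -v maxdate="2002-07-01" \
    -f sample3.awk < datafile
```

Values set with -v are already available in the BEGIN{} block, which
is why the gsub() calls can stay there.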

Giorgos Keramidas                       FreeBSD Documentation Project
keramida@{freebsd.org,ceid.upatras.gr}  http://www.FreeBSD.org/docproj/
