Date: Tue, 9 Apr 2002 21:01:41 +0300
From: Giorgos Keramidas <keramida@ceid.upatras.gr>
To: ann kok <annkok2001@yahoo.com>
Cc: freebsd-questions@FreeBSD.ORG
Subject: Re: awk question
Message-ID: <20020409180140.GL67632@hades.hell.gr>
In-Reply-To: <20020409163454.55611.qmail@web20107.mail.yahoo.com>
References: <20020409163454.55611.qmail@web20107.mail.yahoo.com>

On 2002-04-09 09:34, ann kok wrote:
> Hi all
>
> I have the following data in a file
>
> 2002-06-07 brian
> 2003-04-25 ann
>
> How do I compare the first column (the date)?
>
> The following didn't work:
>
> awk '{if($1 > 2003-01-01) print $0}'
Because awk does not see a date there. The unquoted 2003-01-01 is
evaluated as the arithmetic expression 2003 - 01 - 01, which is 2001.
For *this* date format, YYYY-MM-DD, where YYYY is the year, MM the
month number, and DD the day of the month, simply removing the dash
characters from the string yields a number that you can use in
comparisons.
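
To see why the original one-liner misbehaves, here is a small check that
is not part of the original message. It assumes a POSIX-style awk on the
PATH; the shell variable names `value` and `matches` are made up for this
sketch.

```shell
# Hypothetical check, not from the original message.
# An unquoted 2003-01-01 in awk program text is the expression 2003 - 01 - 01.
value=$(awk 'BEGIN { print 2003-01-01 }')
echo "awk sees: $value"

# The original test therefore compares $1 against 2001.  Count how many of
# the two sample lines pass that test.
matches=$(printf '2002-06-07 brian\n2003-04-25 ann\n' |
    awk '{ if ($1 > 2003-01-01) n++ } END { print n + 0 }')
echo "lines matched: $matches"
```

Both sample lines pass the test, so the one-liner filters nothing. With
the dashes removed, both sides of the comparison are plain eight-digit
values such as 20030101, so no subtraction is involved.
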
For instance, with the date from the first field copied into a variable
called `date', you can do what is shown in the example below:
% cat sample.awk
{
    date = $1;
    gsub("-", "", date);
    print date, $2;
}
% cat datafile
2002-06-07 brian
2003-04-25 ann
% awk -f sample.awk < datafile
20020607 brian
20030425 ann
Using gsub() you can remove the "-" characters from dates. Then
"2003-01-01" becomes 20030101, which can be treated as a number and is
certainly larger than 20021231, because the year is listed first,
followed by the month and the day. A simple awk script that prints all
the records from datafile that are on or after 2003-04-01 but before
2003-05-01 (should match only the "ann" line in your sample datafile)
is shown below:
% cat sample2.awk
BEGIN {
    mindate = "2003-04-01";
    maxdate = "2003-05-01";
    gsub("-", "", mindate);
    gsub("-", "", maxdate);
}
{
    date = $1;
    gsub("-", "", date);
    if (date >= mindate && date < maxdate) {
        print $0;
    }
}
% awk -f sample2.awk < datafile
2003-04-25 ann
This, combined with the -v option of awk, which can be used to pass
variable values from the command line, will help a bit, I guess :)
For instance, by removing the explicit assignments to mindate and
maxdate in the BEGIN{} block of sample2.awk above (keep the gsub()
calls), you can set mindate and maxdate from the command line as
shown below:
% awk -v mindate="2002-06-01" \
      -v maxdate="2002-07-01" \
      -f sample2.awk < datafile
This should print only the "brian" line!
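
For anyone who wants to try the -v variant end to end, the following is a
minimal self-contained sketch. The scratch directory and the file name
range.awk are assumptions made for this example; the awk code is
sample2.awk with the two hard-coded date assignments removed.

```shell
# Self-contained sketch; the scratch directory and the name range.awk are
# hypothetical and exist only for this example.
dir=$(mktemp -d)

cat > "$dir/datafile" <<'EOF'
2002-06-07 brian
2003-04-25 ann
EOF

# sample2.awk without the hard-coded dates.  Values given with -v are
# already set when BEGIN runs, so the gsub() calls can stay there.
cat > "$dir/range.awk" <<'EOF'
BEGIN {
    gsub("-", "", mindate);
    gsub("-", "", maxdate);
}
{
    date = $1;
    gsub("-", "", date);
    if (date >= mindate && date < maxdate) {
        print $0;
    }
}
EOF

result=$(awk -v mindate="2002-06-01" -v maxdate="2002-07-01" \
    -f "$dir/range.awk" < "$dir/datafile")
echo "$result"

rm -rf "$dir"
```

Because the range now comes from the command line, the same script can be
reused for any period without editing the file.
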
Giorgos Keramidas FreeBSD Documentation Project
keramida@{freebsd.org,ceid.upatras.gr} http://www.FreeBSD.org/docproj/
