Date: Wed, 19 Apr 2017 11:02:22 -0400
From: Ernie Luzar <luzar722@gmail.com>
To: Andreas Perstinger <andipersti@gmail.com>
Cc: freebsd-questions@freebsd.org
Subject: Re: awk help
Message-ID: <58F77BFE.50108@gmail.com>
In-Reply-To: <b7c0da1d-e659-1430-9530-37993f9182b3@gmail.com>
References: <58F25A01.1060208@gmail.com> <7951DF71-5CD3-4B53-9CB4-13CAA8945983@huiekin.org> <58F4CD14.7090008@gmail.com> <c95e03d2-986d-3c3c-198a-a28ab862dc70@gmail.com> <58F53EEA.2030206@gmail.com> <7b381f8f-e2a5-26ea-075e-96ae35efb25d@rogers.com> <58F61027.3090100@gmail.com> <aed3ad4b-7013-471f-8b11-bc717230cff0@gmail.com> <b7c0da1d-e659-1430-9530-37993f9182b3@gmail.com>
Andreas Perstinger wrote:
> (Sorry for messing up parts of the quoting in my former mail.)
>
> On 2017-04-18 19:40, Andreas Perstinger wrote:
>> I think awk is the better tool for your task but you could still
>> simplify your shell script a little bit:
>
> After hitting the send button I realized that there is a simpler
> solution using a classical Unix pipe:
>
> #!/bin/sh
>
> added_date="`date +%Y%m%d`"
>
> hits_rpt="hits_rpt"
> hits_new="hits.yes"
> hits_no="hits.no"
> truncate -s 0 $hits_rpt $hits_new $hits_no
>
> ippool -l -d -m probing_ips > $hits_rpt 2> /dev/null
>
> tail -n +4 $hits_rpt |           # start at 4th line
> paste - - |                      # join two consecutive lines
> sed -e 's:^ *::' -e 's:/32::' |  # remove spaces at the beginning
>                                  # and "/32" suffix from IP address
> cut -w -f 2,4 |                  # extract IP and Hits from combined line
>                                  # (at 2nd and 4th field)
> while read ip hits               # read IP and Hits from each line
> do                               # and do your work
>   if [ "$hits" -gt 0 ]; then
>     echo "$added_date ${ip};" >> $hits_new
>   fi
>   echo "$hits ${ip};" >> $hits_no
> done
> exit 0
>
> If the "ippool" uses tabs in the output just add "tr '\t' ' '" between
> the "paste" and "sed" step.
>
> Bye, Andreas

I really like this coding method. I had to change the tail command to
start at the 3rd line (-n +3), and the cut command was selecting the
incorrect fields. The following is the script I tested against live data.

#! /bin/sh
date
added_date="`date +%Y%m%d`"

# hits_rpt="/root/hits.rpt"
hits_rpt="/etc/ipf_pool_hits"
hits_yes="/etc/ipf_pool_hits3_yes"
hits_no="/etc/ipf_pool_hits3_no"
truncate -s 0 $hits_yes $hits_no $hits_rpt

ippool -l -d -m probing_ips > $hits_rpt 2> /dev/null

tail -n +3 $hits_rpt |            # start at 3rd line
paste - - |                       # join two consecutive lines
#sed -e 's:^ *::' -e 's:/32::' |  # remove spaces at the beginning
#                                 # and "/32" suffix from IP address
cut -w -f 3,5 |                   # extract IP and Hits from combined line
                                  # (at 3rd and 5th fields)
while read ip hits                # read IP and Hits from each line
do                                # and do your work
  if [ "$hits" -gt 0 ]; then
    ip=${ip%%/*}                  # strip off /32
    echo "$added_date ${ip};" >> $hits_yes
  else
    echo "$hits ${ip};" >> $hits_no
  fi
done
date
exit 0

I ran this script with the sed command and without it, and in both cases
it ran in one second. I think the paste command is where all the elapsed
run time savings come from. My original script, which took 5 minutes to
run, had manually coded logic to combine the 2 lines together.

From this experience I learned a creative use of the truncate and tail
commands. But the real eye opener was the paste command. I didn't even
know that command existed.

I used the date command to time the runs; the problem is that it only
resolves to a second. What is really needed here is a timer in hundredths
of a second. Is there a command that does that?

Thank you, Andreas, for taking the time to teach me these creative uses of
these standard commands.
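For anyone reading this in the archive, the paste step can be tried on its own. The following is only a minimal sketch: the function name pair_hits and the sample input lines are hypothetical and are not real ippool output. It assumes each record spans two lines, with the IP as the 2nd word of the first line and the hit count as the 2nd word of the second line.

```shell
#!/bin/sh
# Hypothetical helper: join each pair of consecutive input lines and
# print "IP hits" with any "/NN" mask suffix stripped from the IP.
pair_hits() {
    paste - - |                          # two input lines become one, tab-separated
    while read -r _label1 ip _label2 hits
    do
        echo "${ip%%/*} $hits"           # drop everything from the first "/"
    done
}

# Made-up sample data, two lines per record.
printf '%s\n' 'addr 192.0.2.1/32' 'hits 3' 'addr 192.0.2.2/32' 'hits 0' | pair_hits
# prints:
# 192.0.2.1 3
# 192.0.2.2 0
```

Because read splits on both spaces and tabs, no cut or tr step is needed in this sketch; the field positions would have to be adjusted to match the actual ippool output.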