Date: Sat, 8 Apr 2017 19:16:33 +0200
From: Polytropon
To: Ernie Luzar
Cc: RW, freebsd-questions@freebsd.org
Subject: Re: Is there a database built into the base system
Message-Id: <20170408191633.70d1f303.freebsd@edvax.de>
In-Reply-To: <58E9171F.3060405@gmail.com>

On Sat, 08 Apr 2017 13:00:15 -0400, Ernie Luzar wrote:
> Here is my first try at using awk to Read every record in the input
> file and drop duplicates records from output file.
>
>
> This what the data looks like.
> /etc >cat /ip.org.sorted
> 1.121.136.228;
> 1.186.172.200;
> 1.186.172.210;
> 1.186.172.218;
> 1.186.172.218;
> 1.186.172.218;
> 1.34.169.204;
> 101.109.155.81;
> 101.109.155.81;
> 101.109.155.81;
> 101.109.155.81;
> 104.121.89.129;

Why not simply use "sort | uniq" to eliminate duplicates?

> /etc >cat /root/bin/ipf.table.awk.dup
> #! /bin/sh
>
> file_in="/ip.org.sorted"
> file_out="/ip.no-dups"
>
> awk '{ in_ip = $1 }'
> END { (if in_ip = prev_ip)
> next
> else
> prev_ip > $file_out
> prev_ip = in_ip
> } $file_in
>
> When I run this script it just hangs there. I have to ctrl/c to break
> out of it. What is wrong with my awk command?

For each line, you store the 1st field (in this case, the entire line)
in in_ip, and you overwrite (!) that variable with each new line. At
the end of the file (!!!) you make a comparison and even request the
next data line.

Additionally, keep an eye on the quotes you use: inside '...' the shell
does not expand $file_out, so awk treats file_out as one of its own
variables, and that one is empty. Also, the '...' already closes before
END, so everything from END onward is outside of awk.

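As a minimal sketch of how the points above could be addressed,
assuming the input is already sorted so that duplicates sit on adjacent
lines: the whole awk program stays inside one pair of single quotes,
the comparison (!=, not an assignment) runs for every line instead of
in END, and the file name is handed over by the shell. The function
name dedup_adjacent and the sample file are made up for illustration;
they are not part of the original script.

```shell
#!/bin/sh
# dedup_adjacent: hypothetical helper, prints each line of the (sorted)
# file named in $1 once, dropping adjacent duplicates.
dedup_adjacent() {
    # The awk program is one single-quoted string. The first rule prints
    # a line when it differs from the previous one; the second rule
    # remembers the current line for the next comparison.
    awk 'NR == 1 || $1 != prev_ip { print $1 } { prev_ip = $1 }' "$1"
}

# Made-up sample data standing in for /ip.org.sorted.
sample=$(mktemp /tmp/ip.sample.XXXXXX)
printf '%s\n' \
    '1.121.136.228;' \
    '1.186.172.218;' \
    '1.186.172.218;' \
    '101.109.155.81;' \
    '101.109.155.81;' > "$sample"

# Output redirection is done by the shell, outside the quoted program.
dedup_adjacent "$sample"

rm -f "$sample"
```

In the real script the last call would become something like
dedup_adjacent "$file_in" > "$file_out", with both variables expanded
by the shell.
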
Remember that awk reads from standard input when no input file is named
on its command line. So you would need to pass the file as an argument
after the quoted program, redirect it with "< $file_in", or, as a
useless use of cat, write "cat $file_in | awk '...' > $file_out".

In your specific case, I'd say that awk is the wrong tool. If you
simply want to eliminate duplicates, use the classic UNIX approach
"sort | uniq". Both tools are part of the OS.

-- 
Polytropon
Magdeburg, Germany
Happy FreeBSD user since 4.0
Andra moi ennepe, Mousa, ...