Date:      Tue, 21 Feb 2023 01:24:41 +0100 (CET)
From:      Sysadmin Lists <sysadmin.lists@mailfence.com>
To:        Freebsd Questions <freebsd-questions@freebsd.org>
Subject:   BSD-awk print() Behavior
Message-ID:  <1600449078.170379.1676939080787@fidget.co-bxl>

Trying to wrap my head around what BSD awk is doing here. Although the behavior
is unwanted for this exercise, it seems like a possibly useful feature or hack
for future projects. Either way I'd like to understand what's going on.

I extracted a list of URLs from my browser's history sql file, and when
iterating over the list with awk got some strange results.

file_1 has the sql-extracted URLs, and file_2 is a copy-paste of that file's
contents using vim's yank-and-paste.

$ cat file_{1,2}
https://github.com/
https://github.com/
https://github.com/
https://github.com/

$ diff file_{1,2}  
1,2c1,2
< https://github.com/
< https://github.com/
---
> https://github.com/
> https://github.com/

$ awk '{ print $0 " abc " }' file_{1,2}  
 abc ://github.com/
 abc ://github.com/
https://github.com/ abc 
https://github.com/ abc 

The sql-extracted URLs cause awk's print to overwrite the front of the string
with the text following $0; the lines from file_2 do not. I used vim's
`:set list' option to view hidden chars, but there's no apparent difference
between the two files, although `diff' clearly sees one. Both files show this
when `list' is set:

https://github.com/$
https://github.com/$
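
One possible explanation, offered as an assumption rather than something
confirmed in this message: each line in file_1 may end with a carriage return
before the newline. A terminal renders that by moving the cursor back to
column 0, and vim can hide it when it detects the file as DOS format. Below is
a minimal sketch for checking that idea. The sample files (named crlf and lf
here) are made up for illustration and stand in for file_1 and file_2.

```shell
# Hypothetical reproduction: two sample files, one with CRLF line endings.
tmp=$(mktemp -d)
printf 'https://github.com/\r\n' > "$tmp/crlf"
printf 'https://github.com/\n'   > "$tmp/lf"

# od -c prints every byte, so a carriage return shows up as \r.
od -c "$tmp/crlf" | head -n 2

# Strip a trailing carriage return (if any) before appending text.
out=$(awk '{ sub(/\r$/, ""); print $0 " abc " }' "$tmp/crlf" "$tmp/lf")
printf '%s\n' "$out"
```

If `od -c file_1' shows \r before \n, that would account for the overwritten
output, and the sub() call above is one way to remove it before printing.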


Here's more background if needed:

I extracted the URLs using sqlite3 like so:
for f in History-16768665*
do
        sqlite3 --bail $f <<-HEREDOC
                .mode csv
                .output ${f}.csv
                select * from urls where url like '%github%';
HEREDOC
done

Then I tried to create a list of unique URLs using `sort -u', but it failed,
complaining about special chars in the extracted lines. I used awk to get a
unique list instead:

for f in *.csv; do [[ -s $f ]] && list="${list} $f"; done; echo $list
awk '{ u[$0] } END { for (e in u) print e > "file_1" }' $list

-- 
Sent with https://mailfence.com  
Secure and private email


