Date:      Tue, 21 Feb 2023 12:53:14 +0100 (CET)
From:      Sysadmin Lists <sysadmin.lists@mailfence.com>
To:        Freebsd Questions <freebsd-questions@freebsd.org>
Cc:        Andreas Kusalananda Kähäri <andreas.kahari@abc.se>
Subject:   Re: BSD-awk print() Behavior
Message-ID:  <1653727721.225143.1676980394881@ichabod.co-bxl>
In-Reply-To: <Y/SZfSO1CdhIvVUD@harpo.local>
References:  <1600449078.170379.1676939080787@fidget.co-bxl> <Y/SZfSO1CdhIvVUD@harpo.local>

> ----------------------------------------
> From: Andreas Kusalananda Kähäri <andreas.kahari@abc.se>
> Date: Feb 21, 2023, 2:14:21 AM
> To: Sysadmin Lists <sysadmin.lists@mailfence.com>
> Cc: Freebsd Questions <freebsd-questions@freebsd.org>
> Subject: Re: BSD-awk print() Behavior
>
> On Tue, Feb 21, 2023 at 01:24:41AM +0100, Sysadmin Lists wrote:
> >
> > $ cat file_{1,2}
> > https://github.com/
> > https://github.com/
> > https://github.com/
> > https://github.com/
> >
> > $ diff file_{1,2}
> > 1,2c1,2
> > < https://github.com/
> > < https://github.com/
> > ---
> > > https://github.com/
> > > https://github.com/
> >
> > $ awk '{ print $0 " abc " }' file_{1,2}
> >  abc ://github.com/
> >  abc ://github.com/
> > https://github.com/ abc
> > https://github.com/ abc
>
> file_1 is a DOS text file, while file_2 is a Unix text file.  The DOS
> text file, when interpreted by tools expecting Unix text, has an extra
> carriage-return character at the end of each line.  This carriage-return
> character will be part of $0 in the awk code and causes the cursor to be
> moved back to the start of the line when printing it, giving the effect
> that you are seeing.
>
> This has nothing to do with awk's print keyword.  You would get a
> similarly strange result if you simply pasted the data side by side:
>
> 	$ paste file_{1,2}
> 	https://https://github.com/
> 	https://https://github.com/
>
> Here, "https://github.com/" is first printed from the DOS text file,
> after which the cursor is returned to the start of the line.  Then,
> paste inserts a tab character which "steps over" the first eight
> characters that had already been output ("https://") and then outputs
> "https://github.com/" from the Unix text file.
>
> >
> > The sql-extracted URLs cause awk's print() to replace the front of the string
> > with text following $0. file_2 does not. I used vim's `:set list' option to
> > view hidden chars, but there's no apparent difference between the two --
> > although `diff' clearly thinks so. Both files show this when `list' is set:
> >
> > https://github.com/$
> > https://github.com/$
>
> Yes, because Vim automatically interprets DOS text files as ordinary
> text.  I'm assuming that while editing file_1 in Vim, you see "[dos]"
> at the bottom of the screen?
>

Good explanation. I found the hidden character before reading your email by
using `cat -e', which printed the ^M character, but I didn't know a carriage
return in awk's output could make the terminal move the cursor around like
that. Sounds like a useful (and dangerous) hack.

$ cat -e file_{1,2}
https://github.com/^M$
https://github.com/^M$
https://github.com/$
https://github.com/$
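
For the archives, here is a rough sketch of how the trailing carriage return
could be stripped inside awk before printing. It assumes each record carries
at most one CR, right at the end of the line. The file name crlf_demo.txt and
the variable name `fixed' are made up for this illustration.

```shell
# Build a two-line sample with DOS (CRLF) line endings.
# crlf_demo.txt is a hypothetical file name, not one of the files above.
printf 'https://github.com/\r\nhttps://github.com/\r\n' > crlf_demo.txt

# sub() with no target operates on $0: drop one trailing carriage return,
# then print the cleaned record followed by the extra text.
fixed=$(awk '{ sub(/\r$/, ""); print $0 " abc" }' crlf_demo.txt)

printf '%s\n' "$fixed"
```

With the CR gone, the appended text should land after the URL instead of
overwriting the start of the line.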

Vim does indeed say [dos] at the bottom for file_1. Now I know sqlite3 can
create DOS files even on Unix-like systems.
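
To check a file for this up front, something along these lines might work:
count the lines that end in a carriage return, and convert with tr if any
turn up. This is only a sketch; the helper name count_cr_lines and the
*_demo.txt file names are invented here, and tr -d removes every CR in the
file, not just those at line ends.

```shell
# Hypothetical helper: print how many lines of a file end in a carriage return.
count_cr_lines() {
    awk '/\r$/ { c++ } END { print c + 0 }' "$1"
}

# Sample files (made-up names): one with DOS endings, one with Unix endings.
printf 'https://github.com/\r\nhttps://github.com/\r\n' > dos_demo.txt
printf 'https://github.com/\nhttps://github.com/\n' > unix_demo.txt

for f in dos_demo.txt unix_demo.txt; do
    echo "$f: $(count_cr_lines "$f") line(s) ending in CR"
done

# Convert by deleting all carriage returns, writing to a new file.
tr -d '\r' < dos_demo.txt > converted_demo.txt
```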

Thank you both.

