Date:      Wed, 23 Oct 2019 19:20:58 +0000
From:      bugzilla-noreply@freebsd.org
To:        bugs@FreeBSD.org
Subject:   [Bug 241441] inconsistency between allowed empty regex for `awk -F` and split()
Message-ID:  <bug-241441-227@https.bugs.freebsd.org/bugzilla/>

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=241441

            Bug ID: 241441
           Summary: inconsistency between allowed empty regex for `awk -F`
                    and split()
           Product: Base System
           Version: 12.0-STABLE
          Hardware: Any
                OS: Any
            Status: New
          Severity: Affects Some People
          Priority: ---
         Component: bin
          Assignee: bugs@FreeBSD.org
          Reporter: freebsd@tim.thechases.com

I get an error when I try to use an empty regex for the field separator:

  $ echo hello | awk -F '' '{print $2}'
  awk: field separator FS is empty

but awk has no issues splitting things on an empty regex:

  $ awk 'BEGIN{s="hello"; split(s, a, ""); print a[1]}'
  h

Over on gawk, I get the expected behavior:

  $ echo hello | awk -F '' '{print $1}'
  h

This is somewhat similar to #226112

  https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=226112

I get that awk uses EREs and `man re_format` says that "A (modern [Extended])
RE is one or more non-empty branches, separated by '|'", but

1) that's not what split() does

2) it's not what gawk's -F parameter does

3) permitting an empty regex for splitting already seems supported in awk code
(as the split example shows) and shouldn't break any existing usage

4) as a non-workaround, `man re_format` says that the atom "()" matches the
null string, but

  $ echo hello | awk -F '()' '{print $1}'

doesn't split the row on the null regular expression (FWIW, gawk gives the same
results when using "()" as the split pattern).

In an ideal world, `awk -F ''` would match the behavior of gawk and the split()
function, splitting the record into its individual characters.
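
Until `-F ''` is accepted, the split() behavior shown earlier can stand in for
per-character field splitting. The following is only a minimal sketch, assuming
(as the split() example in this report shows) that an empty separator yields one
array element per character; the function name nth_char and the array name
chars are hypothetical, chosen for illustration:

```shell
# Hypothetical helper: print the Nth character of each input line.
# Assumes split() with an empty separator yields one element per character,
# as the split() example in this report demonstrates.
nth_char() {
  awk -v n="$1" '{ split($0, chars, ""); print chars[n] }'
}

echo hello | nth_char 2   # prints "e"
```

This only emulates the requested behavior inside the program body; $1, $2, ...
and NF are still set by the default field splitting.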

-- 
You are receiving this mail because:
You are the assignee for the bug.


