Subject: Confusion with grep & locale?
To: stable@freebsd.org
Date: Fri, 20 Aug 2021 11:03:26 +0200 (CEST)
From: freebsd@oldach.net (Helge Oldach)
List-Id: Production branch of FreeBSD source code
List-Archive: https://lists.freebsd.org/archives/freebsd-stable

Hi all,

I'm confused about FreeBSD's behaviour with respect to locales and grep. Specifically, case sensitivity does not seem to be handled consistently when grepping for character ranges. It looks to me like 11 and 13 do not behave consistently, but I'm unclear why.

# uname -a
FreeBSD 11STABLE 11.4-STABLE FreeBSD 11.4-STABLE #1059 r368289M: Thu Dec 3 01:48:30 UTC 2020 root@XXX amd64
# export LANG=en_US.ISO8859-1
# (echo bla; echo Bla) | grep '[A-Z]'
Bla
# export LANG=C
# (echo bla; echo Bla) | grep '[A-Z]'
Bla
# export LANG=en_US.UTF-8
# (echo bla; echo Bla) | grep '[A-Z]'
bla
Bla
#

# uname -a
FreeBSD 13STABLE 13.0-STABLE FreeBSD 13.0-STABLE #49 stable/13-n246779-64085efb677-dirty: Mon Aug 16 08:42:53 CEST 2021 root@XXX amd64
# export LANG=en_US.ISO8859-1
# (echo bla; echo Bla) | grep '[A-Z]'
bla
Bla
# export LANG=C
# (echo bla; echo Bla) | grep '[A-Z]'
Bla
# export LANG=en_US.UTF-8
# (echo bla; echo Bla) | grep '[A-Z]'
Bla
#

For comparison, a Linux RHEL box delivers the expected results:

# uname -a
Linux rhel.local 3.10.0-1062.9.1.el7.x86_64 #1 SMP Mon Dec 2 08:31:54 EST 2019 x86_64 x86_64 x86_64 GNU/Linux
# export LANG=en_US.ISO8859-1
# (echo bla; echo Bla) | grep '[A-Z]'
Bla
# export LANG=C
# (echo bla; echo Bla) | grep '[A-Z]'
Bla
# export LANG=en_US.UTF-8
# (echo bla; echo Bla) | grep '[A-Z]'
Bla
#

There is nothing special in the environment; in particular, neither LC_xxx nor MM_CHARSET is set in any of these cases.

Any guidance is appreciated... Thanks!

Kind regards
Helge
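
P.S. For anyone who wants to repeat the comparison on another box, here is a minimal sketch that runs the same input through grep under each of the three locales above. It assumes a POSIX sh. The function name `check` is hypothetical, and LC_ALL is used instead of LANG only so that no other locale variable can interfere. POSIX leaves the meaning of a range expression such as [A-Z] unspecified outside the POSIX locale, so the character class [[:upper:]] is run alongside it for comparison.

```shell
#!/bin/sh
# Hypothetical helper (not part of the transcripts above): feed the same
# two lines to grep under a given locale and bracket expression.
# $1 = locale name, $2 = pattern
check() {
    printf 'bla\nBla\n' | LC_ALL="$1" grep "$2"
}

# Same three locales as in the transcripts. If a locale is not installed,
# grep typically falls back to the C locale, so results may differ.
for loc in en_US.ISO8859-1 C en_US.UTF-8; do
    echo "== $loc, range expression [A-Z]"
    check "$loc" '[A-Z]'
    echo "== $loc, character class [[:upper:]]"
    check "$loc" '[[:upper:]]'
done
```

Where only ASCII upper case is wanted, running grep with LC_ALL=C or using [[:upper:]] avoids depending on how a given locale treats ranges.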