Date:      Fri, 21 Apr 2023 12:38:05 +0200
From:      Dimitry Andric <dim@FreeBSD.org>
To:        Ronald Klop <ronald-lists@klop.ws>
Cc:        Poul-Henning Kamp <phk@phk.freebsd.dk>, current@freebsd.org
Subject:   Re: find(1): I18N gone wild ?
Message-ID:  <C54AFCA2-1064-432D-9573-A231A6E4163E@FreeBSD.org>
In-Reply-To: <564252502.12.1682071276296@mailrelay>
References:  <202304172106.33HL6RUX051407@critter.freebsd.dk> <564252502.12.1682071276296@mailrelay>

On 21 Apr 2023, at 12:01, Ronald Klop <ronald-lists@klop.ws> wrote:
> From: Poul-Henning Kamp <phk@phk.freebsd.dk>
> Date: Monday, 17 April 2023 23:06
> To: current@freebsd.org
> Subject: find(1): I18N gone wild ?
> This surprised me:
>
>     # mkdir /tmp/P
>     # cd /tmp/P
>     # touch FOO
>     # touch bar
>     # env LANG=C.UTF-8 find . -name '[A-Z]*' -print
>     ./FOO
>     # env LANG=en_US.UTF-8 find . -name '[A-Z]*' -print
>     ./FOO
>     ./bar
>
> Really ?!
...
> My Mac and a Linux server only give ./FOO in both cases. Just my 2 cents.

Same here. However, I have read that with Unicode you should *never*
use ranges such as [A-Z] or [0-9], and should use character classes
instead. On macOS and Linux, [[:alpha:]] does seem to match both files:

$ LANG=en_US.UTF-8 find . -name '[[:alpha:]]*' -print
./BAR
./foo

and only the lowercase file with [[:lower:]]:

$ LANG=en_US.UTF-8 find . -name '[[:lower:]]*' -print
./foo

But on FreeBSD, these don't work at all:

$ LANG=en_US.UTF-8 find . -name '[[:alpha:]]*' -print
<nothing>

$ LANG=en_US.UTF-8 find . -name '[[:lower:]]*' -print
<nothing>

This is an interesting rabbit hole... :)
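
For anyone who wants to reproduce the comparison, here is a minimal
sketch, assuming a POSIX sh with mktemp(1) available. It pins the locale
with LC_ALL=C, which overrides LANG and all LC_* categories, so the
[A-Z] range should be evaluated by byte value. The scratch directory and
the variable names range_hits and class_hits are made up for this
example. Whether the [[:upper:]] class matches anything depends on the
platform, as shown above, so the script only prints that result.

```sh
#!/bin/sh
# Sketch only: compare a range pattern and a character class in the C locale.
set -eu

# Hypothetical scratch directory, removed again when the script exits.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"
touch FOO bar

# In the C locale, A-Z covers bytes 0x41..0x5A, so only ./FOO should match.
range_hits=$(LC_ALL=C find . -name '[A-Z]*' -print | sort)

# Character class variant; the result is printed, not assumed.
class_hits=$(LC_ALL=C find . -name '[[:upper:]]*' -print | sort)

echo "range [A-Z]:       $range_hits"
echo "class [[:upper:]]: $class_hits"
```

Replacing LC_ALL=C with LANG=en_US.UTF-8 in the two find invocations
gives the locale-dependent behaviour discussed in this thread.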

-Dimitry

