Date: Wed, 27 Jul 2016 15:33:27 +0300
From: Kimmo Paasiala <kpaasial@gmail.com>
To: Tomoaki AOKI <junchoon@dec.sakura.ne.jp>
Cc: "freebsd-stable@freebsd.org" <freebsd-stable@freebsd.org>, jjuanino@gmail.com
Subject: Re: sed command does not behave equal from 10.3 to 11.0
Message-ID: <CA+7WWSebwr7e4wZLHXN9=c_OhQo4k1=VVwdSrgUDwRiBZVgwYA@mail.gmail.com>
In-Reply-To: <20160727205539.73c22d166abf0aa474e8c8c8@dec.sakura.ne.jp>
References: <CAAVO5+LjAsN+j+9sa+6pGVjDBqqe=MR9spKrsEuHWApfm5kRNA@mail.gmail.com>
 <CALfReycb85fJ0jmLKj_JS1wQEdy6hBM-p_9P3WjEoWphE-adPA@mail.gmail.com>
 <20160727090158.GD31921@over-yonder.net>
 <CAAVO5++=eyOWPM3COo3TPsFAgjL7TZgKH_S8C+fBjUYSNXB+Kg@mail.gmail.com>
 <20160727205539.73c22d166abf0aa474e8c8c8@dec.sakura.ne.jp>

On Wed, Jul 27, 2016 at 2:55 PM, Tomoaki AOKI <junchoon@dec.sakura.ne.jp> wrote:
> Hi.
>
> There were some collation related changes (*1) between 10.3 and 11.
> So the results can be changed even with the same locale.
>
> *1: For example, r302512.
> https://lists.freebsd.org/pipermail/svn-src-head/2016-July/088919.html
>
> But I cannot understand why ASCII range of characters are affected with
> UTF-8 encoding.
>
>
> On Wed, 27 Jul 2016 11:19:06 +0200
> José García Juanino <jjuanino@gmail.com> wrote:
>
>> On 27 July 2016 at 11:01, Matthew D. Fuller <fullermd@over-yonder.net> wrote:
>> > On Wed, Jul 27, 2016 at 09:45:23AM +0100 I heard the voice of
>> > krad, and lo! it spake thus:
>> >> are you sure you aren't hitting a port or something?
>> >
>> > Locale dependant.
>> >
>> > % echo "abc_ABC.def" | env LANG=C sed -e 's/[^A-Z0-9]//g'
>> > ABC
>> >
>> > % echo "abc_ABC.def" | env LANG=en_US.UTF-8 sed -e 's/[^A-Z0-9]//g'
>> > bcABCdef
>> >
>> > (pre-branch -CURRENT)
>> >
>>
>> The issue is that, under the same locale, the output is not the same
>> in 10.3 as 11.0. It sounds to me a bug ...
>
> --
> Tomoaki AOKI junchoon@dec.sakura.ne.jp

If I change the invocation to this I get the correct output:

% echo "abc_ABC.def" | env LANG=en_US.UTF-8 sed -e 's/[^ABC]//g'

Is the real problem that the UTF-8 locale messes up character ranges
(e.g. A-Z) in sed(1)?

-Kimmo
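
For anyone who needs the 10.3 result regardless of the caller's locale, here is a minimal sketch of a workaround. It assumes the difference comes from how the range A-Z is collated in the UTF-8 locale, which this thread suspects but does not confirm. The function name keep_upper_digits is hypothetical. It forces the C locale for the single sed call only, so the range is evaluated by byte value:

```shell
# Hypothetical helper (not from the thread): strip everything except
# ASCII uppercase letters and digits, independent of the caller's locale.
keep_upper_digits() {
    # LC_ALL overrides LANG and LC_COLLATE for this one sed invocation,
    # so the range A-Z is evaluated in the C locale.
    LC_ALL=C sed -e 's/[^A-Z0-9]//g'
}

# Same input as in the thread; prints ABC, matching the LANG=C run above.
echo "abc_ABC.def" | keep_upper_digits
```

This does not fix the behaviour in the UTF-8 locale itself. It only sidesteps it for scripts that treat their input as plain ASCII.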