Date: Sat, 5 Nov 2016 20:23:25 -0500 (CDT)
From: Greg Rivers <gcr+freebsd-stable@tharned.org>
To: freebsd-stable@freebsd.org
Subject: Uppercase RE matching problems in FreeBSD 11
Message-ID: <alpine.BSF.2.20.1611051912260.2462@flake.tharned.org>

I happened to run an old script today that uses sed(1) to extract the
system boot time from the kern.boottime sysctl MIB. On 11.0 this no
longer works as expected:

$ sysctl kern.boottime
kern.boottime: { sec = 1478380714, usec = 145351 } Sat Nov 5 16:18:34 2016
$ sysctl kern.boottime | sed -e 's/.*\([A-Z].*\)$/\1/'
v 5 16:18:34 2016

sed passes over 'S' and 'N' until it hits 'v', which it considers
uppercase apparently. This is with LANG=en_US.UTF-8. If I set LANG=C,
it works as expected:

$ sysctl kern.boottime | LANG=C sed -e 's/.*\([A-Z].*\)$/\1/'
Nov 5 16:18:34 2016

Testing every lowercase character separately gives even more
inconsistent results:

$ cat <<! | LANG=en_US.UTF-8 sed -n -e '/^[A-Z]$/'p
> a
> b
> c
> d
> e
> f
> g
> h
> i
> j
> k
> l
> m
> n
> o
> p
> q
> r
> s
> t
> u
> v
> w
> x
> y
> z
> !
b
c
d
e
f
g
h
i
j
k
l
m
n
o
p
q
r
s
t
u
v
w
x
y
z

Here sed thinks every lowercase character except for 'a' is uppercase!
This differs from the first test where sed did not think 'o' is
uppercase. Again, the above behaves as expected with LANG=C.

Does anyone have any insight into this? This is likely to break a lot
of existing code.

-- 
Greg
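
For scripts that must keep working regardless of the caller's locale, here is a minimal sketch of two possible workarounds. It rests on the fact that POSIX leaves the meaning of a range expression such as [A-Z] unspecified outside the POSIX (C) locale. The helper names boot_date_c_locale and boot_date_upper_class are made up for illustration, and the sample line is copied from the transcript above rather than read from a live sysctl.

```shell
#!/bin/sh
# Sample line copied from the message above; a real script would pipe in
# the output of `sysctl kern.boottime` instead.
sample='kern.boottime: { sec = 1478380714, usec = 145351 } Sat Nov 5 16:18:34 2016'

# Hypothetical helper 1: pin the locale to C for this one sed invocation.
# LC_ALL overrides LANG and LC_COLLATE, so [A-Z] means ASCII A through Z.
boot_date_c_locale() {
    LC_ALL=C sed -e 's/.*\([A-Z].*\)$/\1/'
}

# Hypothetical helper 2: replace the range with the POSIX class [[:upper:]],
# which is defined by character classification rather than collation order.
boot_date_upper_class() {
    sed -e 's/.*\([[:upper:]].*\)$/\1/'
}

printf '%s\n' "$sample" | boot_date_c_locale
printf '%s\n' "$sample" | boot_date_upper_class
```

Both calls should print the same text as the LANG=C run shown earlier. Setting LC_ALL rather than LANG is a deliberate choice in this sketch, since a caller's LC_ALL or LC_COLLATE would otherwise take precedence over LANG.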