From: Kimmo Paasiala
Date: Wed, 27 Jul 2016 15:33:27 +0300
Subject: Re: sed command does not behave equal from 10.3 to 11.0
To: Tomoaki AOKI
Cc: "freebsd-stable@freebsd.org", jjuanino@gmail.com

On Wed, Jul 27, 2016 at 2:55 PM, Tomoaki AOKI wrote:
> Hi.
>
> There were some collation related changes (*1) between 10.3 and 11.
> So the results can be changed even with the same locale.
>
> *1: For example, r302512.
> https://lists.freebsd.org/pipermail/svn-src-head/2016-July/088919.html
>
> But I cannot understand why ASCII range of characters are affected with
> UTF-8 encoding.
>
>
> On Wed, 27 Jul 2016 11:19:06 +0200
> José García Juanino wrote:
>
>> On 27 July 2016 at 11:01, Matthew D. Fuller wrote:
>> > On Wed, Jul 27, 2016 at 09:45:23AM +0100 I heard the voice of
>> > krad, and lo! it spake thus:
>> >> are you sure you aren't hitting a port or something?
>> >
>> > Locale dependant.
>> >
>> > % echo "abc_ABC.def" | env LANG=C sed -e 's/[^A-Z0-9]//g'
>> > ABC
>> >
>> > % echo "abc_ABC.def" | env LANG=en_US.UTF-8 sed -e 's/[^A-Z0-9]//g'
>> > bcABCdef
>> >
>> > (pre-branch -CURRENT)
>> >
>>
>> The issue is that, under the same locale, the output is not the same
>> in 10.3 as 11.0. It sounds to me a bug ...
>
>
> --
> Tomoaki AOKI junchoon@dec.sakura.ne.jp

If I change the invocation to this I get the correct output:

% echo "abc_ABC.def" | env LANG=en_US.UTF-8 sed -e 's/[^ABC]//g'

Is the real problem that the UTF-8 locale messes up character ranges
(e.g. A-Z) in sed(1)?

-Kimmo
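
For scripts that have to produce the same output on both 10.3 and 11.0,
two locale-independent spellings of the same substitution may help. This
is only a sketch of a workaround, and it does not settle whether the 11.0
behaviour is a bug. It rests on the POSIX rule that a range such as A-Z
is only well defined in the POSIX (C) locale, and on the assumption that
the script really wants ASCII uppercase letters and digits. The variable
names are made up for illustration.

```shell
# Hypothetical test string, the same one used earlier in the thread.
input='abc_ABC.def'

# Option 1: force the C locale for this one sed call, so that A-Z
# means exactly the ASCII uppercase letters.
out_c=$(printf '%s\n' "$input" | LC_ALL=C sed -e 's/[^A-Z0-9]//g')

# Option 2: replace the ranges with POSIX character classes, which
# do not depend on the collation order of the current locale.
out_class=$(printf '%s\n' "$input" | sed -e 's/[^[:upper:][:digit:]]//g')

printf '%s\n%s\n' "$out_c" "$out_class"
```

Both commands should print ABC for this input. Option 1 also changes how
sed treats non-ASCII input, so option 2 is probably the safer choice when
the data can contain UTF-8 text.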