From: Slawa Olhovchenkov
To: Alan Somers
Cc: FreeBSD
Date: Thu, 20 Oct 2016 18:04:48 +0300
Subject: Re: tcsh is not handled correctly UTF-8 in arguments
Message-ID: <20161020150448.GU57714@zxy.spb.ru>
References: <20161019171028.GF57876@zxy.spb.ru>

On Thu, Oct 20, 2016 at 08:54:05AM -0600, Alan Somers wrote:

> On Wed, Oct 19, 2016 at 11:10 AM, Slawa Olhovchenkov wrote:
> > tcsh is called by sshd to invoke scp: `tcsh -c scp -f Расписание.pdf`
> > At this time no LC_* variable is set.
> > tcsh reads .cshrc and sets LC_CTYPE=ru_RU.UTF-8 LC_COLLATE=ru_RU.UTF-8.
> > After this, the invocation of scp is incorrect:
> >
> > 00007ab0 20 2d 66 20 c3 90 c2 a0 c3 90 c2 b0 c3 91 c2 81 | -f ............|
> > 00007ac0 c3 90 c2 bf c3 90 c2 b8 c3 91 c2 81 c3 90 c2 b0 |................|
> > 00007ad0 c3 90 c2 bd c3 90 c2 b8 c3 90 c2 b5 5f c3 90 c2 |............_...|
> > 00007ae0 a2 c3 90 c2 97 c3 90 c2 98 2e 70 64 66 0a |..........pdf. |
> >
> > The correct invocation must be:
> >
> > 00000000 20 2d 66 20 | -f |
> > 00000010 d0 a0 d0 b0 d1 81 d0 bf d0 b8 d1 81 d0 b0 d0 bd |................|
> > 00000020 d0 b8 d0 b5 5f d0 a2 d0 97 d0 98 2e 70 64 66 0a |...._.......pdf.|
> >
> > `d0` => `c3 90`
> > `a0` => `c2 a0`
> >
> > I.e. every byte is re-encoded to UTF-8: `d0` => `c3 90`
> >
> > As a result, it is impossible to access files with non-ASCII names.
>
> This might be related to PR213013. Could you please try on head after r306782?

I don't think it is related. PR213013 is about character classification; my report is about unnecessary re-encoding of shell arguments.
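
The byte pattern in the two dumps is what you would get if each byte of the already UTF-8 argument were treated as a separate ISO-8859-1 character and then encoded to UTF-8 a second time. A minimal Python sketch of that transformation follows. It only reproduces the bytes shown above; the assumption that this is the mechanism, and where in tcsh it would happen, is not established in this thread.

```python
# Illustration only: reproduces the byte pattern from the dumps above.
# It does not model tcsh's actual code path, which is an open question here.

# File name as decoded from the "correct" dump (without the trailing newline).
name = "Расписание_ТЗИ.pdf"

# What scp should receive: the plain UTF-8 bytes of the name.
correct = name.encode("utf-8")

# Assumed mechanism: every byte is taken as one Latin-1 (ISO-8859-1)
# character and the result is encoded to UTF-8 again.
mangled = correct.decode("latin-1").encode("utf-8")

print(correct[:4].hex(" "))   # d0 a0 d0 b0
print(mangled[:8].hex(" "))   # c3 90 c2 a0 c3 90 c2 b0

# The double encoding is lossless, so it can be undone by reversing the steps.
recovered = mangled.decode("utf-8").encode("latin-1")
```

Here `d0` becomes `c3 90` and `a0` becomes `c2 a0`, exactly as in the first dump, and `recovered` is byte-for-byte equal to `correct`.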