From owner-svn-src-head@freebsd.org Mon Jun 22 22:24:51 2020
Return-Path:
Delivered-To: svn-src-head@mailman.nyi.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1])
    by mailman.nyi.freebsd.org (Postfix) with ESMTP id A755A33E9B0
    for ; Mon, 22 Jun 2020 22:24:51 +0000 (UTC)
    (envelope-from glebius@freebsd.org)
Received: from cell.glebi.us (glebi.us [162.251.186.162])
    (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
    key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256
    client-signature RSA-PSS (2048 bits) client-digest SHA256)
    (Client CN "cell.glebi.us", Issuer "cell.glebi.us" (not verified))
    by mx1.freebsd.org (Postfix) with ESMTPS id 49rP9G2n8qz4WMg;
    Mon, 22 Jun 2020 22:24:49 +0000 (UTC)
    (envelope-from glebius@freebsd.org)
Received: from cell.glebi.us (localhost [127.0.0.1])
    by cell.glebi.us (8.15.2/8.15.2) with ESMTPS id 05MMOmRw032180
    (version=TLSv1.3 cipher=TLS_AES_256_GCM_SHA384 bits=256 verify=NO);
    Mon, 22 Jun 2020 15:24:48 -0700 (PDT)
    (envelope-from glebius@freebsd.org)
Received: (from glebius@localhost)
    by cell.glebi.us (8.15.2/8.15.2/Submit) id 05MMOmjY032179;
    Mon, 22 Jun 2020 15:24:48 -0700 (PDT)
    (envelope-from glebius@freebsd.org)
X-Authentication-Warning: cell.glebi.us: glebius set sender to glebius@freebsd.org using -f
Date: Mon, 22 Jun 2020 15:24:48 -0700
From: Gleb Smirnoff
To: Yuri Pankov
Cc: Yuri Pankov, Zhihao Yuan, svn-src-head@freebsd.org
Subject: Re: svn commit: r362148 - head/contrib/nvi/common
Message-ID: <20200622222448.GB31842@FreeBSD.org>
References: <202006131411.05DEB2mP097868@repo.freebsd.org>
    <20200622221144.GA31842@FreeBSD.org>
    <3fe4705c-e036-6999-b6b0-6e05f7cf8321@yuripv.dev>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <3fe4705c-e036-6999-b6b0-6e05f7cf8321@yuripv.dev>
X-Rspamd-Queue-Id: 49rP9G2n8qz4WMg
X-Spamd-Bar: /
Authentication-Results: mx1.freebsd.org; none
X-Spamd-Result: default: False [0.00 /
    15.00]; local_wl_from(0.00)[freebsd.org];
    ASN(0.00)[asn:27348, ipnet:162.251.186.0/24, country:US]
X-BeenThere: svn-src-head@freebsd.org
X-Mailman-Version: 2.1.33
Precedence: list
List-Id: SVN commit messages for the src tree for head/-current
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Mon, 22 Jun 2020 22:24:51 -0000

On Tue, Jun 23, 2020 at 01:20:01AM +0300, Yuri Pankov wrote:
Y> Gleb Smirnoff wrote:
Y> > Yuri, Zhihao,
Y> >
Y> > this commit totally broke Russian input for me in nvi. After
Y> > exiting edit mode, nvi immediately converts all text to ???????.
Y> >
Y> > I don't have any special settings in my environment. All I have
Y> > is "russian" class for my user which yields in these environment
Y> > variables:
Y> >
Y> > declare -x LANG="ru_RU.UTF-8"
Y> > declare -x MM_CHARSET="UTF-8"
Y> > declare -x XTERM_LOCALE="ru_RU.UTF-8"
Y> >
Y> > I'm already digging into that problem, but may be you have
Y> > a clue immediately.
Y>
Y> My bad, yes, I see the problem, looking into it.

My first attempt was this fix:

--- common/exf.c    (revision 362200)
+++ common/exf.c    (working copy)
@@ -1252,7 +1252,8 @@ file_encinit(SCR *sp)
     else if (O_ISSET(sp, O_FILEENCODING) &&
         strcasecmp(O_STR(sp, O_FILEENCODING), "utf-8") != 0)
         /* Use fileencoding as is */ ;
-    else if (strcasecmp(codeset(), "utf-8") != 0)
+    else if (strncasecmp(codeset() + strlen(codeset()) - 5, "utf-8", 5) !=
+        0)
         o_set(sp, O_FILEENCODING, OS_STRDUP, codeset(), 0);
     else
         o_set(sp, O_FILEENCODING, OS_STRDUP, "iso8859-1", 0);

But that turned out not to be the cause. To my surprise, codeset(), which
is a wrapper around nl_langinfo(), returns US-ASCII in my case.

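For reference, a minimal sketch of two checks that may help when chasing this kind of problem. Both helpers are hypothetical and are not part of nvi. The first is a bounds-checked version of the suffix comparison from the diff above, which as posted would read before the start of the string if codeset() ever returned a name shorter than five characters. The second reports what nl_langinfo(CODESET) returns for a given LC_CTYPE setting. Whether either has anything to do with the US-ASCII result seen here is an assumption; the thread does not say.

```c
#include <langinfo.h>
#include <locale.h>
#include <stddef.h>
#include <string.h>
#include <strings.h>

/*
 * Hypothetical helper (not part of nvi): report whether a codeset name
 * ends in "utf-8", ignoring case, without reading before the start of
 * the string when the name is shorter than five characters.
 */
static int
codeset_ends_with_utf8(const char *cs)
{
	size_t len = strlen(cs);

	if (len < 5)
		return (0);
	return (strncasecmp(cs + len - 5, "utf-8", 5) == 0);
}

/*
 * Hypothetical helper: the codeset reported after switching LC_CTYPE to
 * the given locale name ("" means "take it from the environment").
 * Returns NULL if the locale cannot be set.
 */
static const char *
codeset_for_locale(const char *name)
{
	if (setlocale(LC_CTYPE, name) == NULL)
		return (NULL);
	return (nl_langinfo(CODESET));
}
```

Calling codeset_for_locale("C") and then codeset_for_locale("") from a small main() and printing both results shows whether the codeset from LANG is actually being picked up by the process.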
Y> > On Sat, Jun 13, 2020 at 02:11:02PM +0000, Yuri Pankov wrote:
Y> > Y> Author: yuripv
Y> > Y> Date: Sat Jun 13 14:11:02 2020
Y> > Y> New Revision: 362148
Y> > Y> URL: https://svnweb.freebsd.org/changeset/base/362148
Y> > Y>
Y> > Y> Log:
Y> > Y>   nvi: fallback to ISO8859-1 as last resort
Y> > Y>
Y> > Y>   Current logic of using user's locale encoding that is UTF-8 doesn't make
Y> > Y>   much sense if we already failed the looks_utf8() check and skipped
Y> > Y>   encoding set using "fileencoding" as being UTF-8 as well; fallback to
Y> > Y>   ISO8859-1 in that case.
Y> > Y>
Y> > Y>   Reviewed by: Zhihao Yuan
Y> > Y>   Differential Revision: https://reviews.freebsd.org/D24919
Y> > Y>
Y> > Y> Modified:
Y> > Y>   head/contrib/nvi/common/exf.c
Y> > Y>
Y> > Y> Modified: head/contrib/nvi/common/exf.c
Y> > Y> ==============================================================================
Y> > Y> --- head/contrib/nvi/common/exf.c    Sat Jun 13 09:16:07 2020    (r362147)
Y> > Y> +++ head/contrib/nvi/common/exf.c    Sat Jun 13 14:11:02 2020    (r362148)
Y> > Y> @@ -1237,7 +1237,10 @@ file_encinit(SCR *sp)
Y> > Y>      }
Y> > Y>
Y> > Y>      /*
Y> > Y> -     * Detect UTF-8 and fallback to the locale/preset encoding.
Y> > Y> +     * 1. Check for valid UTF-8.
Y> > Y> +     * 2. Check if fallback fileencoding is set and is NOT UTF-8.
Y> > Y> +     * 3. Check if user locale's encoding is NOT UTF-8.
Y> > Y> +     * 4. Use ISO8859-1 as last resort.
Y> > Y>       *
Y> > Y>       * XXX
Y> > Y>       * A manually set O_FILEENCODING indicates the "fallback
Y> > Y> @@ -1246,9 +1249,13 @@ file_encinit(SCR *sp)
Y> > Y>       */
Y> > Y>      if (looks_utf8(buf, blen) > 1)
Y> > Y>          o_set(sp, O_FILEENCODING, OS_STRDUP, "utf-8", 0);
Y> > Y> -    else if (!O_ISSET(sp, O_FILEENCODING) ||
Y> > Y> -        !strcasecmp(O_STR(sp, O_FILEENCODING), "utf-8"))
Y> > Y> +    else if (O_ISSET(sp, O_FILEENCODING) &&
Y> > Y> +        strcasecmp(O_STR(sp, O_FILEENCODING), "utf-8") != 0)
Y> > Y> +        /* Use fileencoding as is */ ;
Y> > Y> +    else if (strcasecmp(codeset(), "utf-8") != 0)
Y> > Y>          o_set(sp, O_FILEENCODING, OS_STRDUP, codeset(), 0);
Y> > Y> +    else
Y> > Y> +        o_set(sp, O_FILEENCODING, OS_STRDUP, "iso8859-1", 0);
Y> > Y>
Y> > Y>      conv_enc(sp, O_FILEENCODING, 0);
Y> > Y>  #endif

-- 
Gleb Smirnoff
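
The four-step selection introduced by r362148 can be sketched as a standalone function, which makes it easier to see which branch is taken for a given set of inputs. This is a simplified model, not the nvi code: pick_encoding() and its parameters are hypothetical stand-ins for the looks_utf8() result, the O_FILEENCODING option (NULL when unset), and the value returned by codeset().

```c
#include <stddef.h>
#include <strings.h>

/*
 * Hypothetical model of the selection order in file_encinit() after
 * r362148.  Returns the encoding name that would be chosen.
 */
static const char *
pick_encoding(int utf8_score, const char *fileencoding,
    const char *locale_codeset)
{
	/* 1. The buffer looks like valid UTF-8. */
	if (utf8_score > 1)
		return ("utf-8");
	/* 2. A fallback fileencoding is set and is not UTF-8. */
	if (fileencoding != NULL && strcasecmp(fileencoding, "utf-8") != 0)
		return (fileencoding);
	/* 3. The locale's codeset is not UTF-8. */
	if (strcasecmp(locale_codeset, "utf-8") != 0)
		return (locale_codeset);
	/* 4. Last resort. */
	return ("iso8859-1");
}
```

In this model, a buffer that does not pass the UTF-8 check, with no fileencoding set and a codeset reported as US-ASCII, ends up in step 3 and selects US-ASCII.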