From: Zhihao Yuan
Date: Mon, 22 Jun 2020 17:39:05 -0500
Subject: Re: svn commit: r362148 - head/contrib/nvi/common
To: Gleb Smirnoff
Cc: Yuri Pankov, svn-src-head@freebsd.org
List-Id: SVN commit messages for the src tree for head/-current

On Mon, Jun 22, 2020 at 5:24 PM Gleb Smirnoff wrote:
>
> My first attempt was this fix:
>
> --- common/exf.c	(revision 362200)
> +++ common/exf.c	(working copy)
> @@ -1252,7 +1252,8 @@ file_encinit(SCR *sp)
>  	else if (O_ISSET(sp, O_FILEENCODING) &&
>  	    strcasecmp(O_STR(sp, O_FILEENCODING), "utf-8") != 0)
>  		/* Use fileencoding as is */ ;
> -	else if (strcasecmp(codeset(), "utf-8") != 0)
> +	else if (strncasecmp(codeset() + strlen(codeset()) - 5, "utf-8", 5) !=
> +	    0)
>  		o_set(sp, O_FILEENCODING, OS_STRDUP, codeset(), 0);
>  	else
>  		o_set(sp, O_FILEENCODING, OS_STRDUP, "iso8859-1", 0);
>
> But it appeared to be not the case. To my surprise, codeset(),
> which is a wrapper around nl_langinfo(), in my case returns US-ASCII.
>

That sounds strange.

1. Can you set LC_CTYPE as well and see if anything changes?
2. Can you revert to the previous version and see what nl_langinfo
   gives?

There is another issue... I'm sorry. I totally forgot what looks_utf8
actually does. Here is its behavior (encoding.c):

  Returns
    -1: invalid UTF-8
     0: uses odd control characters, so doesn't look like text
     1: 7-bit text
     2: definitely UTF-8 text (valid high-bit set bytes)

So if looks_utf8() > 1, it means the file itself is UTF-8 for sure.
If you opened a file with 7-bit text or with control characters,
:set fileencoding should set the encoding you intend to write. But
the HEAD behavior is that you can't input Unicode. I'm reverting
upstream.

-- 
Zhihao Yuan, ID lichray
The best way to predict the future is to invent it.
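
To make the looks_utf8() reasoning above concrete, here is a minimal
sketch in C. It is not nvi code: choose_encoding() and the enum names
are hypothetical, and falling back to the locale codeset when
fileencoding is unset is an assumption made for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return values of looks_utf8() as listed in the message above. */
enum utf8_verdict {
	UTF8_INVALID = -1,	/* invalid UTF-8 */
	UTF8_BINARY  = 0,	/* odd control characters, not text */
	UTF8_ASCII   = 1,	/* 7-bit text */
	UTF8_CERTAIN = 2	/* valid high-bit-set UTF-8 sequences */
};

/*
 * Hypothetical helper, not part of nvi: pick an encoding name from the
 * looks_utf8() verdict, the user's fileencoding setting (NULL or ""
 * when unset), and the locale codeset.
 */
static const char *
choose_encoding(int verdict, const char *user_enc, const char *locale_enc)
{
	assert(locale_enc != NULL);

	/* Only a verdict above 1 proves the file itself is UTF-8. */
	if (verdict > UTF8_ASCII)
		return "utf-8";

	/* Otherwise the content decides nothing; honor the user's choice. */
	if (user_enc != NULL && strlen(user_enc) > 0)
		return user_enc;

	/* Assumption: fall back to the locale codeset when nothing is set. */
	return locale_enc;
}
```

With this shape, a 7-bit file opened after ":set fileencoding=utf-8"
still resolves to "utf-8", which is the behavior described above as
what the setting should do.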