Subject: Re: (character) Conversion error (in vi) ?
To: "Ronald F. Guilmette"
Cc: freebsd-questions@freebsd.org
References: <72824.1589685787@segfault.tristatelogic.com>
From: Yuri Pankov
Message-ID: <5c384499-c87a-e121-2337-3598adf7fef0@fastmail.com>
Date: Tue, 19 May 2020 20:01:35 +0300
In-Reply-To: <72824.1589685787@segfault.tristatelogic.com>

Ronald F. Guilmette wrote:
> In message ,
> Yuri Pankov wrote:
>
>> No, it's not that bug after all. The issue is that (n)vi now (for quite
>> some time :-) defaults to UTF-8 when it can't reliably detect the file
>> encoding, so you'll just have to help it a bit by adding the following to
>> ~/.nexrc:
>>
>> set fileencoding=iso8859-1
>>
>> This way (n)vi will check if file encoding looks like UTF-8, and if not,
>> it will use ISO8859-1 as fallback.
>
> Ahhhhhh... I did what you said and yes, that fixed it!
>
> Thanks ever so much! This has been bugging me for quite a while.
>
> And my apologies for being too lazy/preoccupied to dredge deeply
> enough into the man pages to be able to find this solution on
> my own.
>
> If you were my fairy godmother, then I'd ask you to grant me
> one more wish, which would be to have (n)vi always be able to
> automagically correctly detect the content encoding in any given
> file it is asked to load. But you're not, so I won't. :-)
>
> Still, it seems like it ought to be possible to do. It appears
> that a human (you) didn't have much trouble figuring out the
> correct encoding type in this instance, so one would think
> that this one piece of software might be able to do a better
> job in this particular guessing game. (Should I bother to
> submit a PR / enhancement request for that?)

I do agree that falling back to the user locale's encoding when it is
UTF-8 doesn't make much sense, as we already know that it will fail.
I'll put up a change that makes us try ISO8859-1 (as it seems to be the
most widely used single-byte locale?) instead if we fail all of the
checks below (as added to the code):

1. Check for valid UTF-8.
2. Check if fallback fileencoding is set and is NOT UTF-8.
3. Check if user locale's encoding is NOT UTF-8.
4. Use ISO8859-1 as last resort.

As for autodetecting the single-byte encoding, I don't think it's
doable in base without adding too many dependencies -- there are tools
in ports for this, but if you really need it, can I just say the magic
word, "vim"? :-)

The review is at https://reviews.freebsd.org/D24919.
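
For illustration only, the order of the four checks listed above can be sketched roughly as follows. This is Python rather than the C that nvi is written in, it is not the code from the review, and the names `choose_encoding`, `fileencoding` and `locale_encoding` are hypothetical, picked just for this sketch:

```python
from typing import Optional


def _is_utf8(name: str) -> bool:
    """Treat spellings such as 'UTF-8', 'utf8' and 'utf_8' as UTF-8."""
    return name.lower().replace("-", "").replace("_", "") == "utf8"


def choose_encoding(data: bytes,
                    fileencoding: Optional[str],
                    locale_encoding: str) -> str:
    """Pick an encoding for `data` using the four checks, in order.

    Hypothetical helper: `fileencoding` stands in for the fallback set in
    ~/.nexrc, `locale_encoding` for the codeset of the user's locale.
    """
    # 1. Check for valid UTF-8 (pure ASCII also passes this check).
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        pass

    # 2. Use the fallback fileencoding if it is set and is not UTF-8.
    if fileencoding and not _is_utf8(fileencoding):
        return fileencoding

    # 3. Use the locale's encoding if it is not UTF-8.
    if not _is_utf8(locale_encoding):
        return locale_encoding

    # 4. ISO8859-1 as last resort: UTF-8 is already known to fail here.
    return "iso8859-1"


if __name__ == "__main__":
    latin1_bytes = b"caf\xe9"  # "café" encoded in ISO8859-1, invalid as UTF-8
    print(choose_encoding(latin1_bytes, None, "UTF-8"))
```

With a UTF-8 locale and no fallback set, the bytes above fail check 1, skip checks 2 and 3, and end up at the ISO8859-1 last resort, which is the case discussed in this thread.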