Date: Tue, 19 May 2020 20:01:35 +0300
From: Yuri Pankov <ypankov@fastmail.com>
To: "Ronald F. Guilmette" <rfg@tristatelogic.com>
Cc: freebsd-questions@freebsd.org
Subject: Re: (character) Conversion error (in vi) ?
Message-ID: <5c384499-c87a-e121-2337-3598adf7fef0@fastmail.com>
In-Reply-To: <72824.1589685787@segfault.tristatelogic.com>
References: <72824.1589685787@segfault.tristatelogic.com>

Ronald F. Guilmette wrote:
> In message <ec8735ff-fcf1-7ee4-1aed-4aa9b87c655c@fastmail.com>,
> Yuri Pankov <ypankov@fastmail.com> wrote:
>
>> No, it's not that bug after all.  The issue is that (n)vi now (for quite
>> some time :-) defaults to UTF-8 when it can't reliably detect the file
>> encoding, so you'll just have to help it a bit by adding the following to
>> ~/.nexrc:
>>
>>     set fileencoding=iso8859-1
>>
>> This way (n)vi will check if the file encoding looks like UTF-8, and if
>> not, it will use ISO8859-1 as fallback.
>
> Ahhhhhh... I did what you said and yes, that fixed it!
>
> Thanks ever so much!  This has been bugging me for quite a while.
>
> And my apologies for being too lazy/preoccupied to dredge deeply
> enough into the man pages to be able to find this solution on
> my own.
>
> If you were my fairy godmother, then I'd ask you to grant me
> one more wish, which would be to have (n)vi always be able to
> automagically correctly detect the content encoding in any given
> file it is asked to load.  But you're not, so I won't. :-)
>
> Still, it seems like it ought to be possible to do.  It appears
> that a human (you) didn't have much trouble figuring out the
> correct encoding type in this instance, so one would think
> that this one piece of software might be able to do a better
> job in this particular guessing game.  (Should I bother to
> submit a PR / enhancement request for that?)

I do agree that falling back to the user locale's encoding when that is
UTF-8 doesn't make much sense, as we already know that it will fail.
I'll put up a change that makes us try ISO8859-1 instead (as it seems to
be the most widely used single byte locale?) if we fail all of the other
checks below (as added to the code):

1. Check for valid UTF-8.
2. Check if fallback fileencoding is set and is NOT UTF-8.
3. Check if user locale's encoding is NOT UTF-8.
4. Use ISO8859-1 as last resort.
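
For illustration only, the order of checks listed above could be sketched
roughly as follows. This is not the (n)vi code from the review; it is a
small Python model of the same decision order, and the names
pick_encoding, is_utf8_name, fileencoding and locale_encoding are
hypothetical, chosen just for this sketch.

```python
import codecs


def is_utf8_name(name):
    """Return True if `name` is an alias of UTF-8 (e.g. "UTF-8", "utf8")."""
    try:
        return codecs.lookup(name).name == "utf-8"
    except LookupError:
        # Unknown to Python's codec registry; treat it as "not UTF-8".
        return False


def pick_encoding(data, fileencoding=None, locale_encoding=None):
    """Model of the four-step fallback order (hypothetical helper).

    data            -- raw bytes of the file
    fileencoding    -- the user's fallback setting, if any
    locale_encoding -- the encoding of the user's locale, if known
    """
    # 1. Check for valid UTF-8.
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        pass

    # 2. Use the fallback fileencoding if it is set and is NOT UTF-8.
    if fileencoding and not is_utf8_name(fileencoding):
        return fileencoding

    # 3. Use the user locale's encoding if it is NOT UTF-8.
    if locale_encoding and not is_utf8_name(locale_encoding):
        return locale_encoding

    # 4. Use ISO8859-1 as last resort.
    return "iso8859-1"
```

With this model, a file containing the single byte 0xE9 (not valid UTF-8)
and a UTF-8 locale ends up at step 4, whereas the same file with
fileencoding set to a non-UTF-8 value stops at step 2.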

As for autodetecting the single-byte encoding, I don't think it's doable
in base without adding too many dependencies -- there are tools in ports
for this, but if you really need it, can I just say the magic word,
"vim"? :-)

The review is at https://reviews.freebsd.org/D24919.