Date: Sun, 2 Apr 2006 14:32:04 +0200
From: "Tobias Svehagen" <tobias.svehagen@gmail.com>
To: freebsd-current@freebsd.org, tjr@freebsd.org
Subject: About gnu/93629 : GNU sort(1) tool dumps core within non-regular locale settings
Message-ID: <ca834ac00604020532j534aa7e2l5251fdff96d26526@mail.gmail.com>

I saw that this issue was on the todo list for 6.1R, so I decided to take a look at it.

http://www.freebsd.org/cgi/query-pr.cgi?pr=93629

As it says in the report, you can recreate the abort by doing the following:

    setenv LANG uk_UA.KOI8-U
    setenv LC_CTYPE ja_JP.UTF-8
    /usr/bin/sort

This is quite a weird problem. It lies in that sort tries to handle the LC_TIME values in inittables_mb(), thinking that they are encoded as UTF-8. The LC_TIME values for uk_UA.KOI8-U do not use UTF-8; they use NONE as encoding. Normally this wouldn't be a problem, since the multibyte routines handle plain ASCII values <= 7f just fine, and that's why sort works when LANG is set to C, for example (Jan-Dec contain no bytes > 7f).

The thing about uk_UA.KOI8-U (and some others) is that it uses byte values > 7f to represent the Ukrainian alphabet. For example, Jan in uk_UA.KOI8-U's LC_TIME is d3 a6 de 00. When you parse that string as UTF-8, d3 says that it starts a multibyte character of length 2, and that one works fine (it does not trigger the assertion). But then de also says that it starts a multibyte character of length 2, and that makes mbrtowc() return -2, meaning an incomplete multibyte character (see man mbrtowc). That is what makes the assertion go off and abort.

I don't know what the best way to solve this is, but I think that something should be done to make sort not abort and dump core. One solution is of course to make sort check that LC_CTYPE and LC_TIME are the same (or C), but maybe some people want to have it that way (although I don't see why).

Do you have any ideas on how this can be solved in a nice way, or do you think that the fix "set LC_CTYPE and LC_TIME to the same value" is enough?

/Tobias Svehagen
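
To illustrate the byte-level reasoning above, here is a minimal Python sketch (not the code in sort, and not FreeBSD's mbrtowc()) that walks the bytes d3 a6 de the way a UTF-8 decoder would. The helper names utf8_sequence_length and scan_as_utf8 are made up for this example, and the "incomplete" and "invalid" labels only roughly correspond to mbrtowc() returning -2 and -1.

```python
# "Jan" from the uk_UA.KOI8-U LC_TIME data as quoted in the message,
# without the terminating NUL byte.
JAN_KOI8U = bytes([0xD3, 0xA6, 0xDE])


def utf8_sequence_length(lead: int) -> int:
    """Return the sequence length a UTF-8 lead byte announces, or 0 if
    the byte cannot start a sequence (hypothetical helper)."""
    if lead < 0x80:
        return 1
    if 0xC2 <= lead <= 0xDF:
        return 2
    if 0xE0 <= lead <= 0xEF:
        return 3
    if 0xF0 <= lead <= 0xF4:
        return 4
    return 0  # continuation byte or invalid lead byte


def scan_as_utf8(data: bytes):
    """Scan data as UTF-8 and return a list of (offset, status) pairs.

    status is "ok", "incomplete" (input ends inside a sequence, roughly
    mbrtowc() == -2) or "invalid" (roughly mbrtowc() == -1).
    """
    result = []
    i = 0
    while i < len(data):
        n = utf8_sequence_length(data[i])
        tail = data[i + 1:i + n]
        if n == 0 or any(not 0x80 <= b <= 0xBF for b in tail):
            result.append((i, "invalid"))
            i += 1
            continue
        if i + n > len(data):
            # The lead byte promised more bytes than the string contains.
            result.append((i, "incomplete"))
            break
        try:
            data[i:i + n].decode("utf-8")
            result.append((i, "ok"))
            i += n
        except UnicodeDecodeError:
            # Well-formed shape but not legal UTF-8 (e.g. overlong form).
            result.append((i, "invalid"))
            i += 1
    return result


if __name__ == "__main__":
    print(scan_as_utf8(JAN_KOI8U))
    print(JAN_KOI8U.decode("koi8_u"))
```

Scanned as UTF-8, d3 a6 is accepted as one two-byte character and de is left dangling at the end of the string, which matches the incomplete-sequence case described in the message. Decoded with the encoding the data was actually written in (Python's koi8_u codec), the same three bytes are simply three Cyrillic letters.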