Date:      Wed, 01 Feb 2012 02:56:58 -0800
From:      Edward Martinez <eam1edward@gmail.com>
To:        Robert Bonomi <bonomi@mail.r-bonomi.com>
Cc:        FreeBSD Questions <freebsd-questions@freebsd.org>
Subject:   Re: [SOLVED] bash LC_COLLATE or LC_ALL set "C" not sort in dictionary order.
Message-ID:  <4F291A7A.70609@gmail.com>
In-Reply-To: <201201312022.q0VKMabu097278@mail.r-bonomi.com>
References:  <201201312022.q0VKMabu097278@mail.r-bonomi.com>

On 01/31/12 12:22, Robert Bonomi wrote:
>
> Edward wrote:
>> On 01/31/12 06:31, Robert Bonomi wrote:
>>>>        Hi,
>>>>
>>>>        I have been trying to get bash to sort character sets in dictionary order.
>>>>       I typed "locale" and it shows LC_COLLATE and LC_ALL are set to "C";
>>>> I thought that was enough for it to work.
>>>>       However, when I type wildcard metacharacters (a character set followed
>>>> by any characters), something like this:
>>>>
>>>>           ls  [a-cx-y]*
>>>>
>>>>        bash does not sort in dictionary order; the file "Binarc" is not
>>>> listed.
>>>>
>>> *OF*COURSE* it doesn't.  Unix is _case_sensitive_.  You specified a lower-
>>> case only (in the C locale) pattern.  Naturally, it doesn't match a file
>>> with an upper-case character in it.
>>>
>>> Note: in the 'C' locale, characters are sorted on the underlying byte value.
>>> Thus you will get all the upper-case matches before any lower-case match.
>>>
>>> To get upper-and-lower case files in the C locale, you will have to use:
>>>             ls [A-CX-Ya-cx-y]*
>>>
>>> IF you specify a different charset for collating, you _may_ get upper/lower
>>> case characters sorted adjacently.  See the specifications for the charset
>>> in question.
>>>
>>>
>>       Thanks for reply!
>>
>>        I meant that LC_COLLATE is set to en_US.UTF-8, not C.
> AH.  you lied (not necessarily maliciously, or intentionally) about the
> nature of the problem.  disregard my rant.
>
> The short answer to the revised situation is 'it depends on how the charset
> collating sequence is defined'.  AND _which_ release of FreeBSD you are
> using, and thus which version of bash.
>
    I have been digging around and discovered that Linux's bash does not
handle this correctly, and numerous users have filed bug reports about
it.  FreeBSD's bash is fine:

    https://bugs.archlinux.org/task/24553
    https://bugs.launchpad.net/ubuntu/+source/bash/+bug/120687
    http://teaching.idallen.com/net2003/06w/notes/character_sets.txt

    I will continue using either character classes or explicit upper/lower
case sets when defining wildcards.
    Thanks for the help.
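
    For anyone finding this in the archive, here is a rough sketch of the
two workarounds mentioned above: explicit upper/lower case ranges and
POSIX character classes. It forces LC_ALL=C so that ranges compare raw
byte values, as described earlier in the thread. The scratch directory
and the file names other than "Binarc" are made up for illustration, and
results under other locales or bash versions may differ.

```shell
#!/usr/bin/env bash
# Sketch only: file names below (except Binarc) are hypothetical.
set -eu
export LC_ALL=C            # in the C locale, ranges compare byte values

tmp=$(mktemp -d)           # scratch directory, removed on exit
trap 'rm -rf "$tmp"' EXIT
cd "$tmp"
touch Binarc apple cherry xray Yacht zebra

# Lower-case ranges only: upper-case names are not matched in the C locale.
lower_only=$(echo [a-cx-y]*)

# Explicit upper- and lower-case ranges, as suggested earlier in the thread.
both_cases=$(echo [A-CX-Ya-cx-y]*)

# A character class does not rely on range collation at all.
upper_start=$(echo [[:upper:]]*)

echo "lower_only:  $lower_only"
echo "both_cases:  $both_cases"
echo "upper_start: $upper_start"
```

    In the C locale the upper-case matches are listed before the
lower-case ones, because the glob results are sorted by byte value.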



