Skip site navigation (1)Skip section navigation (2)
Date:      Wed, 19 Apr 2023 18:51:20 +0100
From:      Jessica Clarke <jrtc27@freebsd.org>
To:        Hans Petter Selasky <hselasky@freebsd.org>
Cc:        Brooks Davis <brooks@freebsd.org>, src-committers@freebsd.org, dev-commits-src-all@freebsd.org, dev-commits-src-main@freebsd.org
Subject:   Re: git: 8dcf3a82c54c - main - libc: Implement bsort(3) a bitonic type of sorting algorithm.
Message-ID:  <3B5734C4-8630-4CD5-BA8D-DE33899161F1@freebsd.org>
In-Reply-To: <a073fd36-9aa9-e988-0cc5-86af305fb654@freebsd.org>
References:  <202304191206.33JC6Qcp062380@gitrepo.freebsd.org> <ZEAM3rO9r5e97BHE@spindle.one-eyed-alien.net> <a073fd36-9aa9-e988-0cc5-86af305fb654@freebsd.org>

next in thread | previous in thread | raw e-mail | index | archive | help
On 19 Apr 2023, at 18:24, Hans Petter Selasky <hselasky@freebsd.org> =
wrote:
>=20
> On 4/19/23 17:46, Brooks Davis wrote:
>> On Wed, Apr 19, 2023 at 12:06:26PM +0000, Hans Petter Selasky wrote:
>>> The branch main has been updated by hselasky:
>>>=20
>>> URL: =
https://cgit.FreeBSD.org/src/commit/?id=3D8dcf3a82c54cb216df3213a013047907=
636a01da
>>>=20
>>> commit 8dcf3a82c54cb216df3213a013047907636a01da
>>> Author:     Hans Petter Selasky <hselasky@FreeBSD.org>
>>> AuthorDate: 2022-09-08 10:16:43 +0000
>>> Commit:     Hans Petter Selasky <hselasky@FreeBSD.org>
>>> CommitDate: 2023-04-19 12:04:22 +0000
>>>=20
>>>     libc: Implement bsort(3) a bitonic type of sorting algorithm.
>>>          The bsort(3) algorithm works by swapping objects, similarly =
to qsort(3),
>>>     and does not require any significant amount of additional =
memory.
>>>          The bsort(3) algorithm doesn't suffer from the processing =
time issues
>>>     known the plague the qsort(3) family of algorithms, and is =
bounded by
>>>     a complexity of O(log2(N) * log2(N) * N), where N is the number =
of
>>>     elements in the sorting array. The additional complexity =
compared to
>>>     mergesort(3) is a fair tradeoff in situations where no memory =
may
>>>     be allocated.
>>>          The bsort(3) APIs are identical to those of qsort(3), =
allowing for
>>>     easy drop-in and testing.
>>>          The design of the bsort(3) algorithm allows for future =
parallell CPU
>>>     execution when sorting arrays. The current version of the =
bsort(3)
>>>     algorithm is single threaded. This is possible because fixed =
areas
>>>     of the sorting data is compared at a time, and can easily be =
divided
>>>     among different CPU's to sort large arrays faster.
>>>          Reviewed by:    gbe@, delphij@, pauamma_gundo.com =
(manpages)
>>>     Sponsored by:   NVIDIA Networking
>>>     Differential Revision:  https://reviews.freebsd.org/D36493
>> Why was this needed?  I'm not really a fan to adding new, =
non-standard
>> sorts without a clear use case.  I understand it has some specific
>> advantages vs qsort, but are we going to use it?  Doing so will make =
our
>> code less portable so we almost certainly can't s/qsort/bsort/.
>=20
> Hi Brooks,
>=20
> My long term plan is to get bsort() to replace qsort(), but because =
the two algorithms have different characteristics, then some people may =
complain it is good for me, but not for you. I want there to be an =
option besides from qsort(), which does not use malloc() as an integral =
part of sorting. And is faster than O(N*N) sorting (still the worst case =
for qsort in FreeBSD).

Why not do an adaptive qsort instead like other standard libraries?

Also, nothing actually says qsort doesn=E2=80=99t allocate memory; in =
fact,
glibc=E2=80=99s own documentation states that one should not assume it =
is
in-place and doesn=E2=80=99t allocate.

> qsort() is frequently used to do all kinds of sorting, and some =
pointed out that qsort() can technically be any kind of sorting =
algorithm, but looking around it is not.

Because there are variants that are guaranteed n log n? Which is better
than your n log^2 n.

> When I build applications of my own, I always use mergesort(), which =
depend on malloc(). Having a dependency on a certain memory allocator to =
get performance is not good.
>=20
> I want to distinguish from qsort() by calling it bsort(). If people =
use bsort() they know what they get cross-platform.

No they don=E2=80=99t, bsort can mean multiple things. If you want a =
specific
algorithm, put it in your program, but please don=E2=80=99t, just use a
standardised function like qsort.

> If people use qsort() the output is random basically. It helps very =
little my application works on FreeBSD, but not Linux, for example.

No, the output is sorted, which is far from random. And if you need a
stable sort you should ask for one.

> In FreeBSD qsort() is typically used for sorting files up to 16K =
entries per directory. Even if qsort() can explode in time, probably =
it's not that important. But when using qsort() for sorting millions of =
mathematical expressions, for example, doing number analysis, this is =
unacceptable.
>=20
> I think "C.A.R. Hoare's quicksort" technique is designed for single =
CPU systemsf only. Even if the best-case average is "N*log2(N)", that =
amount of processing cannot be split by multiple CPUs. The algorithm is =
serial as such.
>=20
> The bsort() algorithm is much more NCPU friendly, because it can split =
the work into fixed size smaller and independent work loads. Otherwise =
the work load doubles every time you merge two sorted lists. And when =
the number of lists to merge is fewer than the number of CPUs available, =
your system runs out of guts basically.

I highly doubt you want a libc sort routine to start spawning threads.
That also seems extremely contradictory to your claim about wanting one
that doesn=E2=80=99t allocate memory, but creating threads is going to =
do much
more than just that, and would scream POLA violation to me.

>> I also note that the swap code is pointlessly slow for size > 256 and
>> should almost certainly use aliasing with matching words like memcpy
>> implementations do.  Doing so would make it easier to port this code =
to
>> CHERI where that is required.
>=20
> I totally agree about the swap code being pointless. And I tried to =
look where is memswap(), but it was not there. This kind of swapping is =
done many places, and we could benefit from having a compiler supported =
memswap() function. What do you think?

We agree: https://github.com/CTSRD-CHERI/cheribsd/issues/207. But the
world doesn=E2=80=99t have that in C.

We don=E2=80=99t need new non-standard sorting routines in libc.

Jess




Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?3B5734C4-8630-4CD5-BA8D-DE33899161F1>