From nobody Wed Apr 19 17:51:20 2023 X-Original-To: dev-commits-src-all@mlmmj.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mlmmj.nyi.freebsd.org (Postfix) with ESMTP id 4Q1pHw6Hkwz46G6M for ; Wed, 19 Apr 2023 17:51:24 +0000 (UTC) (envelope-from jrtc27@jrtc27.com) Received: from mail-wm1-f49.google.com (mail-wm1-f49.google.com [209.85.128.49]) (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256 client-signature RSA-PSS (2048 bits) client-digest SHA256) (Client CN "smtp.gmail.com", Issuer "GTS CA 1D4" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 4Q1pHv5c4Sz457M for ; Wed, 19 Apr 2023 17:51:23 +0000 (UTC) (envelope-from jrtc27@jrtc27.com) Authentication-Results: mx1.freebsd.org; none Received: by mail-wm1-f49.google.com with SMTP id v3so113675wml.0 for ; Wed, 19 Apr 2023 10:51:23 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1681926682; x=1684518682; h=to:references:message-id:content-transfer-encoding:cc:date :in-reply-to:from:subject:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=SzR9EmTQGfbgH01iYfbqvQ7WCW5z524LRIG4/JBheV4=; b=iM0bI1p13OPLFH+9LS0o26LnDkk/V6aSFP1SQMby3oplPA8HYHFfBg4ih8OHCjyyCu 6Mh9wbXY679Qf4HoOIY/seonqkY3iLqi8tW+QWU/3MkD33sjHJAfi6G62KuN75hJJ9uE zTVWNlSWr+/9KQv30M8loAkKu5H7NzVHoG8PU51wGpuSnaTbk8EUPebM/fsi1qYbRFQ4 YwD5pWwZVZwbteW/k6/2h0KRTC+efpECmL5FJQhqVX/cDtOrfgrTEqY4Cf0Gme/HWI57 TIZTu+PJN3cdtwOvsosjFCye1TpLI3eSbC85n8G6JwvRXUyF5bGr5Zfa/HtREzq/b/08 6JLg== X-Gm-Message-State: AAQBX9ciSyPYBYT/UgWaGZyH8JWeUAvm7KrCYpTqxYd6pcv580UWClUg 1SIICaZ7VtX6TcZCu16kdJ+U2A== X-Google-Smtp-Source: AKy350ZqTDb6KZvQZeJ4BYv0vDGkfjkkhZDdhj938XwoniXXEdvMi3m1MbWi6LnTTxOt7VtFsVbQEQ== X-Received: by 2002:a7b:cb44:0:b0:3f1:6458:99ba with SMTP id v4-20020a7bcb44000000b003f1645899bamr13080249wmj.29.1681926681997; Wed, 19 Apr 2023 10:51:21 -0700 (PDT) Received: from smtpclient.apple ([131.111.5.246]) by smtp.gmail.com with ESMTPSA id h16-20020a05600c315000b003f173a2b2f6sm2919077wmo.12.2023.04.19.10.51.21 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Wed, 19 Apr 2023 10:51:21 -0700 (PDT) Content-Type: text/plain; charset=utf-8 List-Id: Commit messages for all branches of the src repository List-Archive: https://lists.freebsd.org/archives/dev-commits-src-all List-Help: List-Post: List-Subscribe: List-Unsubscribe: Sender: owner-dev-commits-src-all@freebsd.org X-BeenThere: dev-commits-src-all@freebsd.org Mime-Version: 1.0 (Mac OS X Mail 16.0 \(3696.120.41.1.1\)) Subject: Re: git: 8dcf3a82c54c - main - libc: Implement bsort(3) a bitonic type of sorting algorithm. From: Jessica Clarke In-Reply-To: Date: Wed, 19 Apr 2023 18:51:20 +0100 Cc: Brooks Davis , src-committers@freebsd.org, dev-commits-src-all@freebsd.org, dev-commits-src-main@freebsd.org Content-Transfer-Encoding: quoted-printable Message-Id: <3B5734C4-8630-4CD5-BA8D-DE33899161F1@freebsd.org> References: <202304191206.33JC6Qcp062380@gitrepo.freebsd.org> To: Hans Petter Selasky X-Mailer: Apple Mail (2.3696.120.41.1.1) X-Rspamd-Queue-Id: 4Q1pHv5c4Sz457M X-Spamd-Bar: ---- X-Spamd-Result: default: False [-4.00 / 15.00]; REPLY(-4.00)[]; ASN(0.00)[asn:15169, ipnet:209.85.128.0/17, country:US] X-Rspamd-Pre-Result: action=no action; module=replies; Message is reply to one we originated X-ThisMailContainsUnwantedMimeParts: N On 19 Apr 2023, at 18:24, Hans Petter Selasky = wrote: >=20 > On 4/19/23 17:46, Brooks Davis wrote: >> On Wed, Apr 19, 2023 at 12:06:26PM +0000, Hans Petter Selasky wrote: >>> The branch main has been updated by hselasky: >>>=20 >>> URL: = https://cgit.FreeBSD.org/src/commit/?id=3D8dcf3a82c54cb216df3213a013047907= 636a01da >>>=20 >>> commit 8dcf3a82c54cb216df3213a013047907636a01da >>> Author: Hans Petter Selasky >>> AuthorDate: 2022-09-08 10:16:43 +0000 >>> Commit: Hans Petter Selasky >>> CommitDate: 2023-04-19 12:04:22 +0000 >>>=20 >>> libc: Implement bsort(3) a bitonic type of sorting algorithm. >>> The bsort(3) algorithm works by swapping objects, similarly = to qsort(3), >>> and does not require any significant amount of additional = memory. >>> The bsort(3) algorithm doesn't suffer from the processing = time issues >>> known the plague the qsort(3) family of algorithms, and is = bounded by >>> a complexity of O(log2(N) * log2(N) * N), where N is the number = of >>> elements in the sorting array. The additional complexity = compared to >>> mergesort(3) is a fair tradeoff in situations where no memory = may >>> be allocated. >>> The bsort(3) APIs are identical to those of qsort(3), = allowing for >>> easy drop-in and testing. >>> The design of the bsort(3) algorithm allows for future = parallell CPU >>> execution when sorting arrays. The current version of the = bsort(3) >>> algorithm is single threaded. This is possible because fixed = areas >>> of the sorting data is compared at a time, and can easily be = divided >>> among different CPU's to sort large arrays faster. >>> Reviewed by: gbe@, delphij@, pauamma_gundo.com = (manpages) >>> Sponsored by: NVIDIA Networking >>> Differential Revision: https://reviews.freebsd.org/D36493 >> Why was this needed? I'm not really a fan to adding new, = non-standard >> sorts without a clear use case. I understand it has some specific >> advantages vs qsort, but are we going to use it? Doing so will make = our >> code less portable so we almost certainly can't s/qsort/bsort/. >=20 > Hi Brooks, >=20 > My long term plan is to get bsort() to replace qsort(), but because = the two algorithms have different characteristics, then some people may = complain it is good for me, but not for you. I want there to be an = option besides from qsort(), which does not use malloc() as an integral = part of sorting. And is faster than O(N*N) sorting (still the worst case = for qsort in FreeBSD). Why not do an adaptive qsort instead like other standard libraries? Also, nothing actually says qsort doesn=E2=80=99t allocate memory; in = fact, glibc=E2=80=99s own documentation states that one should not assume it = is in-place and doesn=E2=80=99t allocate. > qsort() is frequently used to do all kinds of sorting, and some = pointed out that qsort() can technically be any kind of sorting = algorithm, but looking around it is not. Because there are variants that are guaranteed n log n? Which is better than your n log^2 n. > When I build applications of my own, I always use mergesort(), which = depend on malloc(). Having a dependency on a certain memory allocator to = get performance is not good. >=20 > I want to distinguish from qsort() by calling it bsort(). If people = use bsort() they know what they get cross-platform. No they don=E2=80=99t, bsort can mean multiple things. If you want a = specific algorithm, put it in your program, but please don=E2=80=99t, just use a standardised function like qsort. > If people use qsort() the output is random basically. It helps very = little my application works on FreeBSD, but not Linux, for example. No, the output is sorted, which is far from random. And if you need a stable sort you should ask for one. > In FreeBSD qsort() is typically used for sorting files up to 16K = entries per directory. Even if qsort() can explode in time, probably = it's not that important. But when using qsort() for sorting millions of = mathematical expressions, for example, doing number analysis, this is = unacceptable. >=20 > I think "C.A.R. Hoare's quicksort" technique is designed for single = CPU systemsf only. Even if the best-case average is "N*log2(N)", that = amount of processing cannot be split by multiple CPUs. The algorithm is = serial as such. >=20 > The bsort() algorithm is much more NCPU friendly, because it can split = the work into fixed size smaller and independent work loads. Otherwise = the work load doubles every time you merge two sorted lists. And when = the number of lists to merge is fewer than the number of CPUs available, = your system runs out of guts basically. I highly doubt you want a libc sort routine to start spawning threads. That also seems extremely contradictory to your claim about wanting one that doesn=E2=80=99t allocate memory, but creating threads is going to = do much more than just that, and would scream POLA violation to me. >> I also note that the swap code is pointlessly slow for size > 256 and >> should almost certainly use aliasing with matching words like memcpy >> implementations do. Doing so would make it easier to port this code = to >> CHERI where that is required. >=20 > I totally agree about the swap code being pointless. And I tried to = look where is memswap(), but it was not there. This kind of swapping is = done many places, and we could benefit from having a compiler supported = memswap() function. What do you think? We agree: https://github.com/CTSRD-CHERI/cheribsd/issues/207. But the world doesn=E2=80=99t have that in C. We don=E2=80=99t need new non-standard sorting routines in libc. Jess