From: Cyril Zhang
Date: Tue, 18 May 2021 11:27:55 -0700
Subject: Re: patch: Change default sorting algorithm in sort(1)
To: Hans Petter Selasky
Cc: freebsd-hackers@freebsd.org
In-Reply-To: <63875688-33d3-d5a9-008f-4b8f53542434@selasky.org>

On Tue, May 18, 2021 at 2:24 AM Hans Petter Selasky wrote:
> Will this affect small-memory systems ability to sort data?

It shouldn't. The sort program already allocates a large amount of memory during the preprocessing stage, far more than the number of bytes in the input file (which is a wholly separate issue from the one at hand).
The extra memory required for mergesort is much smaller than that, even in the worst case where each line of input contains only a single character.

The sort program also has a mechanism to use temporary files in order to handle cases where the system runs out of memory. This is documented in the -S flag:

     -S size, --buffer-size=size
             Use size for the maximum size of the memory buffer. Size
             modifiers %,b,K,M,G,T,P,E,Z,Y can be used. If a memory limit
             is not explicitly specified, sort takes up to about 90% of
             available memory. If the file size is too big to fit into
             the memory buffer, the temporary disk files are used to
             perform the sorting.

So a small-memory system could set the -S flag to allow for some extra memory for use with mergesort.

Finally, the --heapsort and --quicksort flags will still be available, if memory conservation is truly needed.
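
For readers unfamiliar with the temporary-file approach, here is a minimal Python sketch of the general idea: sort bounded chunks in memory, spill each sorted run to a temporary file, then merge the runs. This is not the code sort(1) uses. The function name external_sort and the max_lines_in_memory parameter are hypothetical, and a line-count limit stands in for the byte budget that -S sets.

```python
import heapq
import os
import tempfile


def external_sort(lines, max_lines_in_memory):
    """Yield newline-terminated strings in sorted order using bounded memory.

    Illustrative only (an assumption, not sort(1)'s implementation): at most
    max_lines_in_memory lines are buffered at once; each full buffer is
    sorted and written to a temporary file, and the files are merged lazily.
    """
    run_paths = []
    buffer = []

    def spill():
        # Sort the in-memory buffer and write it out as one sorted run.
        buffer.sort()
        fd, path = tempfile.mkstemp(prefix="sortrun-", text=True)
        with os.fdopen(fd, "w") as run:
            run.writelines(buffer)
        run_paths.append(path)
        buffer.clear()

    try:
        for line in lines:
            buffer.append(line)
            if len(buffer) >= max_lines_in_memory:
                spill()
        if buffer:
            spill()

        run_files = [open(path) for path in run_paths]
        try:
            # heapq.merge keeps only one pending line per run in memory.
            yield from heapq.merge(*run_files)
        finally:
            for run in run_files:
                run.close()
    finally:
        # Remove the temporary run files once merging is done or abandoned.
        for path in run_paths:
            os.remove(path)
```

Lowering max_lines_in_memory trades more temporary files for a smaller in-memory buffer, which is roughly the trade a small-memory system makes by passing a smaller size to -S.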