From: Mark Johnston
Date: Thu, 14 Feb 2019 11:11:53 -0500
To: Bruce Evans
Cc: Justin Hibbits, Gleb Smirnoff, src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org
Subject: Re: svn commit: r343030 - in head/sys: cam conf dev/md dev/nvme fs/fuse fs/nfsclient fs/smbfs kern sys ufs/ffs vm
Message-ID: <20190214161153.GA50900@raichu>
References: <201901150102.x0F12Hlt025856@repo.freebsd.org> <20190213192450.32343d6a@ralga.knownspace> <20190214153345.C1404@besplex.bde.org>
In-Reply-To: <20190214153345.C1404@besplex.bde.org>

On Thu, Feb 14, 2019 at 06:56:42PM +1100, Bruce Evans wrote:
> On Wed, 13 Feb
> 2019, Justin Hibbits wrote:
>
> > On Tue, 15 Jan 2019 01:02:17 +0000 (UTC)
> > Gleb Smirnoff wrote:
> >
> >> Author: glebius
> >> Date: Tue Jan 15 01:02:16 2019
> >> New Revision: 343030
> >> URL: https://svnweb.freebsd.org/changeset/base/343030
> >>
> >> Log:
> >>   Allocate pager bufs from UMA instead of 80-ish mutex protected
> >>   linked list.
> > ...
> >
> > This seems to break 32-bit platforms, or at least 32-bit book-e
> > powerpc, which has a limited KVA space (~500MB).  It preallocates I've
> > seen over 2500 pbufs, at 128kB each, eating up over 300MB KVA,
> > leaving very little left for the rest of runtime.
>
> Hrmph.  I complained other things in this commit this when it was
> committed, but not this largest bug since preallocation was broken then
> so I thought that it wasn't done, so that problems are smaller unless the
> excessive limits are actually reached.
>
> Now i386 does it:
>
> XX ITEM         SIZE  LIMIT   USED   FREE    REQ   FAIL  SLEEP
> XX
> XX swrbuf:      336,   128,     0,     0,     0,     0,     0
> XX swwbuf:      336,    64,     0,     0,     0,     0,     0
> XX nfspbuf:     336,   128,     0,     0,     0,     0,     0
> XX mdpbuf:      336,    25,     0,     0,     0,     0,     0
> XX clpbuf:      336,   128,     0,     5,     4,     0,     0
> XX vnpbuf:      336,  2048,     0,     0,     0,     0,     0
> XX pbuf:        336,    16,     0,  2535,     0,     0,     0
>
> but i386 now has 4GB of KVA, with almost 3GB to waste, so the bug is not
> noticed there.
>
> The preallocation wasn't there in my last mail to the author about nearby
> bugs, on 24 Jan 2019:
>
> YY vnpbuf:      568,  2048,     0,     0,     0,     0,     0
> YY clpbuf:      568,   128,     0,   128,  8750,     0,     1
> YY pbuf:        568,    16,     0,     4,     0,     0,     0
>
> This output is on amd64 where the SIZE is larger and everything else was
> the same as on i386.  Now amd64 shows the large preallocation too.
>
> There seems to be another bug for the especially small LIMIT of 16 to
> turn into a preallocation of 2535 and not cause immediate reduction to
> the limit.
>
> I happen to have kernels from 24 and 25 Jan handy.  The first one is
> amd64 r343346M built on Jan 23, and it doesn't do the large
> preallocation.
> The second one is i386 r343388:343418M built on Jan
> 25, and it does the large preallocation.  Both call uma_prealloc() to
> ask for nswbuf_max = 0x9e9 buffers, but the old version only allocates
> 4 buffers while later version allocate 0x9e9 buffers.
>
> The only relevant commit between the good and bad versions seems to be
> r343453.  This fixes uma_prealloc() to actually work.  But it is a feature
> for it to not work when its caller asks for too much.

I guess you meant r343353.  In any case, the pbuf keg is _NOFREE, so
even without preallocation the large pbuf zone limits may become
problematic if there are bursts of allocation requests.

> 0x9e9 is the sum of the LIMITs of all pbuf pools.  The main bug in
> r343030 is that it expands nswbuf, which is supposed to give the
> combined limit, from its normal value of 256 to 0x9e9.  (r343030
> actually used nswbuf before it was properly initialized, so used its
> maximum value of 256 even on small systems with nswbuf = 16.  Only
> this has been fixed.)
>
> On i386, nbuf is excessively limited so as to give a maxbufspace of
> about 100MB so as to fit in 1GB of kva even with infinite RAM and
> -current's actual 4GB of kva.  nbuf is correctly limited to give a
> much smaller maxbufspace when RAM is small (kva scaling for this is
> not done so well).  nswbuf is restricted if nbuf is restricted, but
> not enough (except in my version).  It is normally 256, so the pbuf
> allocation used to be 32MB, and this is already a bit large compared
> with 100MB for maxbufspace.  Expanding pbufs by a factor of 0x9e9/0x100
> gives the silly combination of 100MB for maxbufspace and 317MB for
> pbufs.
>
> If kva is only 512MB instead of 1GB, then maxbufspace should be only
> 50MB and nswbuf should be smaller too.  Similarly for PAE on i386 back
> when it was configured with 1GB kva by default.  Only about 512MB are
> left after allocating space for page table metadata.  I have fixes
> that scale most of this better.
> Large subsystems starting with kmem
> get a hard-coded fraction of the usable kva.  E.g., kmem gets about
> 60% of usable kva instead of about 40% of nominal kva.  Most other
> large subsystems including the buffer cache get about 1/8 of the
> remaining 40% of usable kva.  Scaling for other subsystems is mostly
> worse than for kmem.  pbufs are part of the buffer cache allocation.
> The expansion factor of 0x9e9/0x100 breaks this.
>
> I don't understand how pbuf_preallocate() allocates for the other
> pbuf pools.  When I debugged this for clpbufs, the preallocation was
> not used.  pbuf types other than clpbufs seem to be unused in my
> configurations.  I thought that pbufs were used during initialization,
> since they end up with a nonzero FREE count, but their only use seems
> to be to preallocate them.

All of the pbuf zones share a common slab allocator.  The zones have
individual limits but can tap into the shared preallocation.