From owner-freebsd-fs@freebsd.org Wed May 22 15:47:44 2019 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 0B67A15AF199 for ; Wed, 22 May 2019 15:47:44 +0000 (UTC) (envelope-from karl@denninger.net) Received: from colo1.denninger.net (colo1.denninger.net [104.236.120.189]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 2E3D78BAF8 for ; Wed, 22 May 2019 15:47:43 +0000 (UTC) (envelope-from karl@denninger.net) Received: from denninger.net (ip68-1-57-197.pn.at.cox.net [68.1.57.197]) by colo1.denninger.net (Postfix) with ESMTP id E44672110BA for ; Wed, 22 May 2019 11:47:06 -0400 (EDT) Received: from [192.168.43.86] (unknown [172.56.20.174]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by denninger.net (Postfix) with ESMTPSA id 9CB1D109BEE for ; Wed, 22 May 2019 10:47:05 -0500 (CDT) Subject: Re: Commit r345200 (new ARC reclamation threads) looks suspicious to me - second potential problem To: freebsd-fs@freebsd.org References: <369cb1e9-f36a-a558-6941-23b9b811825a@FreeBSD.org> <20190520164202.GA2130@spy> <28c7430b-fb7c-6472-5c1b-fa3ff63a9e73@FreeBSD.org> From: Karl Denninger Message-ID: <89064e9c-251a-a065-3a72-ac65c884d51d@denninger.net> Date: Wed, 22 May 2019 10:47:00 -0500 User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:60.0) Gecko/20100101 Thunderbird/60.6.1 MIME-Version: 1.0 In-Reply-To: <28c7430b-fb7c-6472-5c1b-fa3ff63a9e73@FreeBSD.org> Content-Type: multipart/signed; protocol="application/pkcs7-signature"; micalg=sha-512; boundary="------------ms020004070907060108040100" X-Rspamd-Queue-Id: 2E3D78BAF8 X-Spamd-Bar: ------ Authentication-Results: mx1.freebsd.org X-Spamd-Result: default: False [-6.16 / 15.00]; RCVD_VIA_SMTP_AUTH(0.00)[]; HAS_ATTACHMENT(0.00)[]; TO_DN_NONE(0.00)[]; RCVD_COUNT_THREE(0.00)[3]; MX_GOOD(-0.01)[px.denninger.net]; NEURAL_HAM_SHORT(-0.35)[-0.348,0]; FROM_EQ_ENVFROM(0.00)[]; RCVD_TLS_LAST(0.00)[]; R_DKIM_NA(0.00)[]; ASN(0.00)[asn:14061, ipnet:104.236.64.0/18, country:US]; MIME_TRACE(0.00)[0:+,1:+,2:+]; MID_RHS_MATCH_FROM(0.00)[]; RECEIVED_SPAMHAUS_PBL(0.00)[197.57.1.68.zen.spamhaus.org : 127.0.0.11]; ARC_NA(0.00)[]; NEURAL_HAM_MEDIUM(-1.00)[-1.000,0]; FROM_HAS_DN(0.00)[]; SIGNED_SMIME(-2.00)[]; TO_MATCH_ENVRCPT_ALL(0.00)[]; NEURAL_HAM_LONG(-1.00)[-1.000,0]; MIME_GOOD(-0.20)[multipart/signed,multipart/alternative,text/plain]; PREVIOUSLY_DELIVERED(0.00)[freebsd-fs@freebsd.org]; AUTH_NA(1.00)[]; RCPT_COUNT_ONE(0.00)[1]; IP_SCORE(-2.60)[ip: (-9.88), ipnet: 104.236.64.0/18(-4.26), asn: 14061(1.18), country: US(-0.06)]; DMARC_NA(0.00)[denninger.net]; R_SPF_NA(0.00)[] X-Content-Filtered-By: Mailman/MimeDel 2.1.29 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 22 May 2019 15:47:44 -0000 This is a cryptographically signed message in MIME format. --------------ms020004070907060108040100 Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: quoted-printable On 5/22/2019 10:19 AM, Alexander Motin wrote: > On 20.05.2019 12:42, Mark Johnston wrote: >> On Mon, May 20, 2019 at 07:05:07PM +0300, Lev Serebryakov wrote: >>> I'm looking at last commit to >>> 'sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c' (r345200) and >>> have another question. >>> >>> Here are such code: >>> >>> 4960 /* >>> 4961 * Kick off asynchronous kmem_reap()'s of all our cache= s. >>> 4962 */ >>> 4963 arc_kmem_reap_soon(); >>> 4964 =09 >>> 4965 /* >>> 4966 * Wait at least arc_kmem_cache_reap_retry_ms between >>> 4967 * arc_kmem_reap_soon() calls. Without this check it is= >>> possible to >>> 4968 * end up in a situation where we spend lots of time re= aping >>> 4969 * caches, while we're near arc_c_min. Waiting here al= so >>> gives the >>> 4970 * subsequent free memory check a chance of finding tha= t the >>> 4971 * asynchronous reap has already freed enough memory, a= nd >>> we don't >>> 4972 * need to call arc_reduce_target_size(). >>> 4973 */ >>> 4974 delay((hz * arc_kmem_cache_reap_retry_ms + 999) / 1000)= ; >>> 4975 =09 >>> >>> But looks like `arc_kmem_reap_soon()` is synchronous on FreeBSD! So= , >>> this `delay()` looks very wrong. Am I right? > Why is it wrong? > >>> Looks like it should be `#ifdef illumos`. >> See also r338142, which I believe was reverted by the update. > My r345200 indeed reverted that value, but I don't see a problem there.= > When OS need more RAM, pagedaemon will drain UMA caches by itself. I > don't see a point in re-draining UMA caches in a tight loop without > delay. If caches are not sufficient to sustain one second of workload,= > then usefulness of such caches is not very clear and shrinking ARC to > free some space may be a right move. Also making ZFS drain its caches > more active then anything else in a system looks unfair to me. There is a long-lasting pathology with the older implementation. The=20 short answer is that if you have cache in UMA but unallocated to current = working set it's completely wasted -- unless quickly re-used.=C2=A0 So a = small buffer between current and allocation is ok, but the UMA system=20 will leave large amounts out but unused. Reclaiming that after a=20 reasonable amount of time is a very good thing. The other problem is that disk cache should NEVER be preferred over=20 working set space.=C2=A0 It's always wrong to do so because a working set= =20 page-out is 1 *guaranteed* I/O (to page it out) and possibly 2 I/Os (if=20 required again and thus must be recalled) while a disk cache page is 1=20 *possible* I/O avoided (if the disk cache block is requested again) It is never the right move to intentionally take an I/O in order to=20 avoid a *possible* I/O. Under certain workloads making that choice leads = to severe pathological behavior (~30 second "pauses" where the system is = doing I/O like crazy but a desired process -- such as a database, or=20 shell, does nothing waiting on working set to be paged back in) when=20 there are gigabytes (or 10s of gigabytes) of ARC outstanding. --=20 -- Karl Denninger /The Market-Ticker/ S/MIME Email accepted and preferred --------------ms020004070907060108040100 Content-Type: application/pkcs7-signature; name="smime.p7s" Content-Transfer-Encoding: base64 Content-Disposition: attachment; filename="smime.p7s" Content-Description: S/MIME Cryptographic Signature MIAGCSqGSIb3DQEHAqCAMIACAQExDzANBglghkgBZQMEAgMFADCABgkqhkiG9w0BBwEAAKCC DdgwggagMIIEiKADAgECAhMA5EiKghDOXrvfxYxjITXYDdhIMA0GCSqGSIb3DQEBCwUAMIGL MQswCQYDVQQGEwJVUzEQMA4GA1UECAwHRmxvcmlkYTESMBAGA1UEBwwJTmljZXZpbGxlMRkw FwYDVQQKDBBDdWRhIFN5c3RlbXMgTExDMRgwFgYDVQQLDA9DdWRhIFN5c3RlbXMgQ0ExITAf BgNVBAMMGEN1ZGEgU3lzdGVtcyBMTEMgMjAxNyBDQTAeFw0xNzA4MTcxNjQyMTdaFw0yNzA4 MTUxNjQyMTdaMHsxCzAJBgNVBAYTAlVTMRAwDgYDVQQIDAdGbG9yaWRhMRkwFwYDVQQKDBBD dWRhIFN5c3RlbXMgTExDMRgwFgYDVQQLDA9DdWRhIFN5c3RlbXMgQ0ExJTAjBgNVBAMMHEN1 ZGEgU3lzdGVtcyBMTEMgMjAxNyBJbnQgQ0EwggIiMA0GCSqGSIb3DQEBAQUAA4ICDwAwggIK AoICAQC1aJotNUI+W4jP7xQDO8L/b4XiF4Rss9O0B+3vMH7Njk85fZ052QhZpMVlpaaO+sCI KqG3oNEbuOHzJB/NDJFnqh7ijBwhdWutdsq23Ux6TvxgakyMPpT6TRNEJzcBVQA0kpby1DVD 0EKSK/FrWWBiFmSxg7qUfmIq/mMzgE6epHktyRM3OGq3dbRdOUgfumWrqHXOrdJz06xE9NzY vc9toqZnd79FUtE/nSZVm1VS3Grq7RKV65onvX3QOW4W1ldEHwggaZxgWGNiR/D4eosAGFxn uYeWlKEC70c99Mp1giWux+7ur6hc2E+AaTGh+fGeijO5q40OGd+dNMgK8Es0nDRw81lRcl24 SWUEky9y8DArgIFlRd6d3ZYwgc1DMTWkTavx3ZpASp5TWih6yI8ACwboTvlUYeooMsPtNa9E 6UQ1nt7VEi5syjxnDltbEFoLYcXBcqhRhFETJe9CdenItAHAtOya3w5+fmC2j/xJz29og1KH YqWHlo3Kswi9G77an+zh6nWkMuHs+03DU8DaOEWzZEav3lVD4u76bKRDTbhh0bMAk4eXriGL h4MUoX3Imfcr6JoyheVrAdHDL/BixbMH1UUspeRuqQMQ5b2T6pabXP0oOB4FqldWiDgJBGRd zWLgCYG8wPGJGYgHibl5rFiI5Ix3FQncipc6SdUzOQIDAQABo4IBCjCCAQYwHQYDVR0OBBYE FF3AXsKnjdPND5+bxVECGKtc047PMIHABgNVHSMEgbgwgbWAFBu1oRhUMNEzjODolDka5k4Q EDBioYGRpIGOMIGLMQswCQYDVQQGEwJVUzEQMA4GA1UECAwHRmxvcmlkYTESMBAGA1UEBwwJ TmljZXZpbGxlMRkwFwYDVQQKDBBDdWRhIFN5c3RlbXMgTExDMRgwFgYDVQQLDA9DdWRhIFN5 c3RlbXMgQ0ExITAfBgNVBAMMGEN1ZGEgU3lzdGVtcyBMTEMgMjAxNyBDQYIJAKxAy1WBo2kY MBIGA1UdEwEB/wQIMAYBAf8CAQAwDgYDVR0PAQH/BAQDAgGGMA0GCSqGSIb3DQEBCwUAA4IC AQCB5686UCBVIT52jO3sz9pKuhxuC2npi8ZvoBwt/IH9piPA15/CGF1XeXUdu2qmhOjHkVLN gO7XB1G8CuluxofOIUce0aZGyB+vZ1ylHXlMeB0R82f5dz3/T7RQso55Y2Vog2Zb7PYTC5B9 oNy3ylsnNLzanYlcW3AAfzZcbxYuAdnuq0Im3EpGm8DoItUcf1pDezugKm/yKtNtY6sDyENj tExZ377cYA3IdIwqn1Mh4OAT/Rmh8au2rZAo0+bMYBy9C11Ex0hQ8zWcvPZBDn4v4RtO8g+K uQZQcJnO09LJNtw94W3d2mj4a7XrsKMnZKvm6W9BJIQ4Nmht4wXAtPQ1xA+QpxPTmsGAU0Cv HmqVC7XC3qxFhaOrD2dsvOAK6Sn3MEpH/YrfYCX7a7cz5zW3DsJQ6o3pYfnnQz+hnwLlz4MK 17NIA0WOdAF9IbtQqarf44+PEyUbKtz1r0KGeGLs+VGdd2FLA0e7yuzxJDYcaBTVwqaHhU2/ Fna/jGU7BhrKHtJbb/XlLeFJ24yvuiYKpYWQSSyZu1R/gvZjHeGb344jGBsZdCDrdxtQQcVA 6OxsMAPSUPMrlg9LWELEEYnVulQJerWxpUecGH92O06wwmPgykkz//UmmgjVSh7ErNvL0lUY UMfunYVO/O5hwhW+P4gviCXzBFeTtDZH259O7TCCBzAwggUYoAMCAQICEwCg0WvVwekjGFiO 62SckFwepz0wDQYJKoZIhvcNAQELBQAwezELMAkGA1UEBhMCVVMxEDAOBgNVBAgMB0Zsb3Jp ZGExGTAXBgNVBAoMEEN1ZGEgU3lzdGVtcyBMTEMxGDAWBgNVBAsMD0N1ZGEgU3lzdGVtcyBD QTElMCMGA1UEAwwcQ3VkYSBTeXN0ZW1zIExMQyAyMDE3IEludCBDQTAeFw0xNzA4MTcyMTIx MjBaFw0yMjA4MTYyMTIxMjBaMFcxCzAJBgNVBAYTAlVTMRAwDgYDVQQIDAdGbG9yaWRhMRkw FwYDVQQKDBBDdWRhIFN5c3RlbXMgTExDMRswGQYDVQQDDBJrYXJsQGRlbm5pbmdlci5uZXQw ggIiMA0GCSqGSIb3DQEBAQUAA4ICDwAwggIKAoICAQC+HVSyxVtJhy3Ohs+PAGRuO//Dha9A 16l5FPATr6wude9zjX5f2lrkRyU8vhCXTZW7WbvWZKpcZ8r0dtZmiK9uF58Ec6hhvfkxJzbg 96WHBw5Fumd5ahZzuCJDtCAWW8R7/KN+zwzQf1+B3MVLmbaXAFBuKzySKhKMcHbK3/wjUYTg y+3UK6v2SBrowvkUBC+jxNg3Wy12GsTXcUS/8FYIXgVVPgfZZrbJJb5HWOQpvvhILpPCD3xs YJFNKEPltXKWHT7Qtc2HNqikgNwj8oqOb+PeZGMiWapsatKm8mxuOOGOEBhAoTVTwUHlMNTg 6QUCJtuWFCK38qOCyk9Haj+86lUU8RG6FkRXWgMbNQm1mWREQhw3axgGLSntjjnznJr5vsvX SYR6c+XKLd5KQZcS6LL8FHYNjqVKHBYM+hDnrTZMqa20JLAF1YagutDiMRURU23iWS7bA9tM cXcqkclTSDtFtxahRifXRI7Epq2GSKuEXe/1Tfb5CE8QsbCpGsfSwv2tZ/SpqVG08MdRiXxN 5tmZiQWo15IyWoeKOXl/hKxA9KPuDHngXX022b1ly+5ZOZbxBAZZMod4y4b4FiRUhRI97r9l CxsP/EPHuuTIZ82BYhrhbtab8HuRo2ofne2TfAWY2BlA7ExM8XShMd9bRPZrNTokPQPUCWCg CdIATQIDAQABo4IBzzCCAcswPAYIKwYBBQUHAQEEMDAuMCwGCCsGAQUFBzABhiBodHRwOi8v b2NzcC5jdWRhc3lzdGVtcy5uZXQ6ODg4ODAJBgNVHRMEAjAAMBEGCWCGSAGG+EIBAQQEAwIF oDAOBgNVHQ8BAf8EBAMCBeAwHQYDVR0lBBYwFAYIKwYBBQUHAwIGCCsGAQUFBwMEMDMGCWCG SAGG+EIBDQQmFiRPcGVuU1NMIEdlbmVyYXRlZCBDbGllbnQgQ2VydGlmaWNhdGUwHQYDVR0O BBYEFLElmNWeVgsBPe7O8NiBzjvjYnpRMIHKBgNVHSMEgcIwgb+AFF3AXsKnjdPND5+bxVEC GKtc047PoYGRpIGOMIGLMQswCQYDVQQGEwJVUzEQMA4GA1UECAwHRmxvcmlkYTESMBAGA1UE BwwJTmljZXZpbGxlMRkwFwYDVQQKDBBDdWRhIFN5c3RlbXMgTExDMRgwFgYDVQQLDA9DdWRh IFN5c3RlbXMgQ0ExITAfBgNVBAMMGEN1ZGEgU3lzdGVtcyBMTEMgMjAxNyBDQYITAORIioIQ zl6738WMYyE12A3YSDAdBgNVHREEFjAUgRJrYXJsQGRlbm5pbmdlci5uZXQwDQYJKoZIhvcN AQELBQADggIBAJXboPFBMLMtaiUt4KEtJCXlHO/3ZzIUIw/eobWFMdhe7M4+0u3te0sr77QR dcPKR0UeHffvpth2Mb3h28WfN0FmJmLwJk+pOx4u6uO3O0E1jNXoKh8fVcL4KU79oEQyYkbu 2HwbXBU9HbldPOOZDnPLi0whi/sbFHdyd4/w/NmnPgzAsQNZ2BYT9uBNr+jZw4SsluQzXG1X lFL/qCBoi1N2mqKPIepfGYF6drbr1RnXEJJsuD+NILLooTNf7PMgHPZ4VSWQXLNeFfygoOOK FiO0qfxPKpDMA+FHa8yNjAJZAgdJX5Mm1kbqipvb+r/H1UAmrzGMbhmf1gConsT5f8KU4n3Q IM2sOpTQe7BoVKlQM/fpQi6aBzu67M1iF1WtODpa5QUPvj1etaK+R3eYBzi4DIbCIWst8MdA 1+fEeKJFvMEZQONpkCwrJ+tJEuGQmjoQZgK1HeloepF0WDcviiho5FlgtAij+iBPtwMuuLiL shAXA5afMX1hYM4l11JXntle12EQFP1r6wOUkpOdxceCcMVDEJBBCHW2ZmdEaXgAm1VU+fnQ qS/wNw/S0X3RJT1qjr5uVlp2Y0auG/eG0jy6TT0KzTJeR9tLSDXprYkN2l/Qf7/nT6Q03qyE QnnKiBXWAZXveafyU/zYa7t3PTWFQGgWoC4w6XqgPo4KV44OMYIFBzCCBQMCAQEwgZIwezEL MAkGA1UEBhMCVVMxEDAOBgNVBAgMB0Zsb3JpZGExGTAXBgNVBAoMEEN1ZGEgU3lzdGVtcyBM TEMxGDAWBgNVBAsMD0N1ZGEgU3lzdGVtcyBDQTElMCMGA1UEAwwcQ3VkYSBTeXN0ZW1zIExM QyAyMDE3IEludCBDQQITAKDRa9XB6SMYWI7rZJyQXB6nPTANBglghkgBZQMEAgMFAKCCAkUw GAYJKoZIhvcNAQkDMQsGCSqGSIb3DQEHATAcBgkqhkiG9w0BCQUxDxcNMTkwNTIyMTU0NzAw WjBPBgkqhkiG9w0BCQQxQgRAXKTp7AjYPTf3iFKG4agUvWm1a1XfLC9Nexp0xMxVhydQcyc7 GAXX7JumqAn31M6fIolGEJy1Uug/Z1jvzZq/9jBsBgkqhkiG9w0BCQ8xXzBdMAsGCWCGSAFl AwQBKjALBglghkgBZQMEAQIwCgYIKoZIhvcNAwcwDgYIKoZIhvcNAwICAgCAMA0GCCqGSIb3 DQMCAgFAMAcGBSsOAwIHMA0GCCqGSIb3DQMCAgEoMIGjBgkrBgEEAYI3EAQxgZUwgZIwezEL MAkGA1UEBhMCVVMxEDAOBgNVBAgMB0Zsb3JpZGExGTAXBgNVBAoMEEN1ZGEgU3lzdGVtcyBM TEMxGDAWBgNVBAsMD0N1ZGEgU3lzdGVtcyBDQTElMCMGA1UEAwwcQ3VkYSBTeXN0ZW1zIExM QyAyMDE3IEludCBDQQITAKDRa9XB6SMYWI7rZJyQXB6nPTCBpQYLKoZIhvcNAQkQAgsxgZWg gZIwezELMAkGA1UEBhMCVVMxEDAOBgNVBAgMB0Zsb3JpZGExGTAXBgNVBAoMEEN1ZGEgU3lz dGVtcyBMTEMxGDAWBgNVBAsMD0N1ZGEgU3lzdGVtcyBDQTElMCMGA1UEAwwcQ3VkYSBTeXN0 ZW1zIExMQyAyMDE3IEludCBDQQITAKDRa9XB6SMYWI7rZJyQXB6nPTANBgkqhkiG9w0BAQEF AASCAgA1QX7gqdAKvs3MHipT1n0xw5c4NXuglQz+kka0pw40/DWZh40lQ1ZlrzmzEKHMyrxb j6PIGQRofns/sVfliyA18Iz0RoWL2YZ75yPFgXSAlmCVRrYwH2njVunxiH4LE2vvenuO4SQD AKy3k5xKVjBhUa5DJUqe7jMhzEqmsY1RgNy9IoBoI6ceVARlXqatZQfptZNQhiBz9T4meuPt 8jbV+ZeDSODJkGuTE3M3iZN3ZQEljjDQWSSld8NXJT0et/LY1IhB9VXHJe80AaJiDPIRRrN5 gFgFeU+BhGDSJT7U1Fk7+aVmp4aXc7vO7VSG83q7QFr66WBFOzEbRi8KT5+5hoXU3T/nw8XQ 6D68WRqg4jwMjZsronufsMDoz4lvELpqZJGnRgXOyDUuGW7Eyb3THtT4E5rrueVHhAGGBSLY PhFMmuAoms+XNZKYmKZu2y8LJUkDMd3CWfuo+Q0qeDwZaS2AGj6W9qsrqFqNYTDg8fXLRmTA W7rLR6+3JnBC58eUDEaaJ7UOXHZmh1lUoyxJSfDCkNcOhWDZ7e2VBksP7e+ndoZrHKeGzLXV CkEsLFl+Nw9psEhTna+pk3c8ea8AjddQSgLr16BSh4e1NfqJppMgalsRfRwbExKF/r1ALme6 FaGp4VbFOWvUwZEsIEMg8U7MWPzWZJ/LNxOc+vWNzwAAAAAAAA== --------------ms020004070907060108040100--