From: Karl Denninger <karl@denninger.net>
To: freebsd-hackers@freebsd.org
Subject: Re: ZFS ARC and mmap/page cache coherency question
Date: Tue, 5 Jul 2016 09:30:51 -0500
In-Reply-To: <272d657a-52ae-4f45-008c-3de6fb1b0c48@freebsd.org>
List-Id: Technical Discussions relating to FreeBSD

On 7/4/2016 22:01, Allan Jude wrote:
> On 2016-07-04 22:46, Karl Denninger wrote:
>>
>>> You keep saying per zvol. Do you mean per vdev? I am under the
>>> impression that no zvol's are involved in the use case this thread is
>>> about.
>> Sorry, per-vdev.  The problem with dmu_tx is that it's system-wide.
>> This is wildly inappropriate for several reasons -- first, it is
>> computed on size-of-RAM with a hard cap (which is stupid on its face)
>> and it is entirely insensitive to the performance of the vdev's in
>> question.  Specifically, it is very common for a system to have very
>> fast (e.g. SSD) disks, perhaps in a mirror configuration, and then
>> spinning rust in a RaidZ2 config for bulk storage.  Those are very, very
>> different performance wise and they should have wildly different
>> write-back cache sizes.  At present there is exactly one such write-back
>> cache and it's both system-wide and pays exactly zero attention to the
>> throughput of the underlying vdevs it is talking to.
>>
>> This is why you can provoke minute-long stalls on a system with moderate
>> (e.g. 32GB) amounts of RAM if there are spinning rust devices in the
>> configuration.
>>
>>>
>>> Improving the way ZFS frees memory, specifically UMA and the 'kmem
>>> caches' will help a lot as well.
>>>
>> Well, yeah.  But that means you have to police up the size of the UMA
>> vs. how much is actually in use in the UMA.  What the PR does is get
>> pretty aggressive with that whenever RAM is tight, and before the pager
>> can start playing hell with system performance.
>>
>>> In addition, another patch just went in to allow you to change the
>>> arc_max and arc_min on a running system.
>>>
>> Yes, the PR I did a long time ago made that "active" on a running
>> system.... so I've had that for quite some time.  Not that you really
>> ought to need to play with that (if you feel a need to then you're still
>> at step 1 or 2 of what I went through with analyzing and working on this
>> in the 10.x code.....)
>>
>
> Have you looked into the ZFS 'Write Throttle'? It seems like it
> was meant to solve the writeback problem you are describing. It starts
> sending back pressure up to the application by introducing larger and
> larger delays in the write() call until your disks can keep up with
> your applications.
>
> http://dtrace.org/blogs/ahl/2014/02/10/the-openzfs-write-throttle/
>
> http://dtrace.org/blogs/ahl/2014/08/31/openzfs-tuning/
>

I believe this has been brought into FreeBSD's implementation; I recall
going through it.
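For readers following along, here is a rough sketch of the two mechanisms discussed above: a system-wide dirty-data limit computed from RAM size with a hard cap, and a write delay that grows as outstanding dirty data approaches that limit. The function names are hypothetical. The default values (10% of RAM, 4 GiB cap, 60% start threshold, 500 us scale, 100 ms maximum delay) are assumptions based on how the write throttle is commonly described, not values read from any particular FreeBSD release. This is a simplified model, not the kernel code.

```python
# Illustrative model only. Names and defaults are assumptions for this
# sketch, not values taken from a running system or a specific release.

GIB = 1 << 30


def dirty_data_max(ram_bytes, percent=10, hard_cap=4 * GIB):
    """System-wide dirty-data limit: a fraction of RAM with a hard cap."""
    return min(ram_bytes * percent // 100, hard_cap)


def write_delay_ns(dirty, dirty_max, min_dirty_percent=60,
                   delay_scale_ns=500_000, delay_max_ns=100_000_000):
    """Delay applied to a write as dirty data approaches dirty_max."""
    delay_min_bytes = dirty_max * min_dirty_percent // 100
    if dirty <= delay_min_bytes:
        return 0  # below the threshold, writes are not delayed
    if dirty >= dirty_max:
        # Simplification: a real implementation would block the writer
        # until dirty data drains; here we just report the maximum delay.
        return delay_max_ns
    # The delay grows without bound as dirty approaches dirty_max,
    # so it is clamped to delay_max_ns.
    delay = delay_scale_ns * (dirty - delay_min_bytes) // (dirty_max - dirty)
    return min(delay, delay_max_ns)


if __name__ == "__main__":
    limit = dirty_data_max(32 * GIB)
    for pct in (50, 60, 70, 80, 90, 99):
        dirty = limit * pct // 100
        delay_us = write_delay_ns(dirty, limit) / 1000
        print(f"{pct:3d}% of limit dirty -> {delay_us:.0f} us delay")
```

In this model the limit is a function of RAM alone and no input describes how fast any individual vdev can retire writes, which is the per-vdev concern raised in the quoted text.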
-- 
Karl Denninger
karl@denninger.net
/The Market Ticker/
/[S/MIME encrypted email preferred]/