From owner-svn-src-all@freebsd.org Sun Dec 29 16:50:38 2019
Date: Sun, 29 Dec 2019 11:50:32 -0500
From: Mark Johnston
To: Oliver Pinter
Cc: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org
Subject: Re: svn commit: r356159 - head/sys/vm
Message-ID: <20191229165032.GC30375@raichu>
References: <201912281904.xBSJ4T19064948@repo.freebsd.org>
On Sun, Dec 29, 2019 at 03:39:55AM +0100, Oliver Pinter wrote:
> Is there any performance measurement from before and after. It would be
> nice to see them.

I did not do extensive benchmarking.  The aim of the patch set was
simply to remove the use of the hashed page lock, since it shows up
prominently in lock profiles of some workloads.  The problem is that we
acquire these locks any time a page's LRU state is updated, and the use
of the hash lock means that we get false sharing.  The solution is to
implement these state updates using atomic operations on the page
structure itself, making data contention much less likely.  Another
option was to embed a mutex into the vm_page structure, but this would
bloat a structure which is already too large.

A secondary goal was to reduce the number of locks held during page
queue scans.  Such scans frequently call pmap_ts_referenced() to collect
information about recent references to the page.  This operation can be
expensive since it may require a TLB shootdown, and it can block for a
long time on the pmap lock, for example if the lock holder is copying
the page tables as part of a fork().
Now, the active queue scan body is executed without any locks held, so a
page daemon thread blocked on a pmap lock no longer has the potential to
block other threads by holding on to a shared page lock.  Before, the
page daemon could block faulting threads for a long time, hurting
latency.  I don't have any benchmarks that capture this, but it's
something that I've observed in production workloads.

I used some microbenchmarks to verify that the change did not penalize
the single-threaded case.  Here are some results on a 64-core arm64
system I have been playing with:
https://people.freebsd.org/~markj/arm64_page_lock/

The benchmark from will-it-scale simply maps 128MB of anonymous memory,
faults on each page, and unmaps it, in a loop.  In the fault handler we
allocate a page and insert it into the active queue, and the unmap
operation removes all of those pages from the queue.  I collected the
throughput for 1, 2, 4, 8, 16 and 32 concurrent processes.

With my patches we see some modest gains at low concurrency.  At higher
levels of concurrency we actually get lower throughput than before, as
contention moves from the page locks and the page queue lock to just the
page queue lock.  I don't believe this is a real regression: first, the
benchmark is quite extreme relative to any useful workload, and second,
arm64 suffers from using a much smaller batch size than amd64 for
batched page queue operations.  Changing that pushes the results out
somewhat.  Some earlier testing on a 2-socket Xeon system showed a
similar pattern with smaller differences.