Date: Fri, 27 Sep 2019 15:24:34 -0400
From: Mark Johnston
To: Mark Millard
Cc: freebsd-amd64@freebsd.org, freebsd-hackers@freebsd.org
Subject: Re: head -r352341 example context on ThreadRipper 1950X: cpuset -n prefer:1 with -l 0-15 vs. -l 16-31 odd performance?
Message-ID: <20190927192434.GA93180@raichu>
In-Reply-To: <6BC5F6BE-5FC3-48FA-9873-B20141FEFDF5@yahoo.com>
References: <704D4CE4-865E-4C3C-A64E-9562F4D9FC4E@yahoo.com>
 <20190925170255.GA43643@raichu>
 <4F565B02-DC0D-4011-8266-D38E02788DD5@yahoo.com>
 <78A4D18C-89E6-48D8-8A99-5FAC4602AE19@yahoo.com>
 <26B47782-033B-40C8-B8F8-4C731B167243@yahoo.com>
 <20190926202936.GD5581@raichu>
 <2DE123BE-B0F8-43F6-B950-F41CF0DEC8AD@yahoo.com>
 <6BC5F6BE-5FC3-48FA-9873-B20141FEFDF5@yahoo.com>
List-Id: Porting FreeBSD to the AMD64 platform

On Thu, Sep 26, 2019 at 08:37:39PM -0700, Mark Millard wrote:
>
>
> On 2019-Sep-26, at 17:05, Mark Millard wrote:
>
> > On 2019-Sep-26, at 13:29, Mark Johnston wrote:
> >
> >> One possibility is that these are kernel memory allocations occurring in
> >> the context of the benchmark threads.  Such allocations may not respect
> >> the configured policy since they are not private to the allocating
> >> thread.  For instance, upon opening a file, the kernel may allocate a
> >> vnode structure for that file.  That vnode may be accessed by threads
> >> from many processes over its lifetime, and may be recycled many times
> >> before its memory is released back to the allocator.
> >
> > For -l0-15 -n prefer:1 :
> >
> > Looks like this reports sys_thr_new activity, sys_cpuset
> > activity, and 0xffffffff80bc09bd activity (whatever that
> > is). Mostly sys_thr_new activity, over 1300 of them . . .
> >
> > dtrace: pid 13553 has exited
> >
> >
> >               kernel`uma_small_alloc+0x61
> >               kernel`keg_alloc_slab+0x10b
> >               kernel`zone_import+0x1d2
> >               kernel`uma_zalloc_arg+0x62b
> >               kernel`thread_init+0x22
> >               kernel`keg_alloc_slab+0x259
> >               kernel`zone_import+0x1d2
> >               kernel`uma_zalloc_arg+0x62b
> >               kernel`thread_alloc+0x23
> >               kernel`thread_create+0x13a
> >               kernel`sys_thr_new+0xd2
> >               kernel`amd64_syscall+0x3ae
> >               kernel`0xffffffff811b7600
> >                 2
> >
> >               kernel`uma_small_alloc+0x61
> >               kernel`keg_alloc_slab+0x10b
> >               kernel`zone_import+0x1d2
> >               kernel`uma_zalloc_arg+0x62b
> >               kernel`cpuset_setproc+0x65
> >               kernel`sys_cpuset+0x123
> >               kernel`amd64_syscall+0x3ae
> >               kernel`0xffffffff811b7600
> >                 2
> >
> >               kernel`uma_small_alloc+0x61
> >               kernel`keg_alloc_slab+0x10b
> >               kernel`zone_import+0x1d2
> >               kernel`uma_zalloc_arg+0x62b
> >               kernel`uma_zfree_arg+0x36a
> >               kernel`thread_reap+0x106
> >               kernel`thread_alloc+0xf
> >               kernel`thread_create+0x13a
> >               kernel`sys_thr_new+0xd2
> >               kernel`amd64_syscall+0x3ae
> >               kernel`0xffffffff811b7600
> >                 6
> >
> >               kernel`uma_small_alloc+0x61
> >               kernel`keg_alloc_slab+0x10b
> >               kernel`zone_import+0x1d2
> >               kernel`uma_zalloc_arg+0x62b
> >               kernel`uma_zfree_arg+0x36a
> >               kernel`vm_map_process_deferred+0x8c
> >               kernel`vm_map_remove+0x11d
> >               kernel`vmspace_exit+0xd3
> >               kernel`exit1+0x5a9
> >               kernel`0xffffffff80bc09bd
> >               kernel`amd64_syscall+0x3ae
> >               kernel`0xffffffff811b7600
> >                 6
> >
> >               kernel`uma_small_alloc+0x61
> >               kernel`keg_alloc_slab+0x10b
> >               kernel`zone_import+0x1d2
> >               kernel`uma_zalloc_arg+0x62b
> >               kernel`thread_alloc+0x23
> >               kernel`thread_create+0x13a
> >               kernel`sys_thr_new+0xd2
> >               kernel`amd64_syscall+0x3ae
> >               kernel`0xffffffff811b7600
> >                22
> >
> >               kernel`vm_page_grab_pages+0x1b4
> >               kernel`vm_thread_stack_create+0xc0
> >               kernel`kstack_import+0x52
> >               kernel`uma_zalloc_arg+0x62b
> >               kernel`vm_thread_new+0x4d
> >               kernel`thread_alloc+0x31
> >               kernel`thread_create+0x13a
> >               kernel`sys_thr_new+0xd2
> >               kernel`amd64_syscall+0x3ae
> >               kernel`0xffffffff811b7600
> >              1324
>
> With sys_thr_new not respecting -n prefer:1 for
> -l0-15 (especially for the thread stacks), I
> looked some at the generated integration kernel
> code and it makes significant use of %rsp based
> memory accesses (read and write).
>
> That would get both memory controllers going in
> parallel (kernel vectors accesses to the preferred
> memory domain), so not slowing down as expected.
>
> If round-robin is not respected for thread stacks,
> and if threads migrate cpus across memory domains
> at times, there could be considerable variability
> for that context as well. (This may not be the
> only way to have different/extra variability for
> this context.)
>
> Overall: I'd be surprised if this was not
> contributing to what I thought was odd about
> the benchmark results.

Your tracing refers to kernel thread stacks though, not the stacks used
by threads when executing in user mode.  My understanding is that a HINT
implementation would spend virtually all of its time in user mode, so it
shouldn't matter much or at all if kernel thread stacks are backed by
memory from the "wrong" domain.

This also doesn't really explain some of the disparities in the plots
you sent me.  For instance, you get a much higher peak QUIPS on FreeBSD
than on Fedora with 16 threads and an interleave/round-robin domain
selection policy.
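
As a rough cross-check of the "over 1300" figure quoted above, the per-stack counts can be summed by syscall-level entry point. The following is only a minimal Python sketch, not anything used in this thread: `tally_by_entry_point`, `ENTRY_POINTS` and `SAMPLE` are hypothetical names, and the sample stacks are abbreviated to a couple of frames each while keeping the counts from the quoted output.

```python
from collections import Counter

# Hypothetical selection of entry points, taken from the quoted stacks.
ENTRY_POINTS = ("sys_thr_new", "sys_cpuset", "exit1")

# Abbreviated copy of the quoted stacks: frames trimmed, counts unchanged.
SAMPLE = """\
> >   kernel`thread_init+0x22
> >   kernel`sys_thr_new+0xd2
> >     2
> >
> >   kernel`cpuset_setproc+0x65
> >   kernel`sys_cpuset+0x123
> >     2
> >
> >   kernel`thread_reap+0x106
> >   kernel`sys_thr_new+0xd2
> >     6
> >
> >   kernel`vmspace_exit+0xd3
> >   kernel`exit1+0x5a9
> >     6
> >
> >   kernel`thread_alloc+0x23
> >   kernel`sys_thr_new+0xd2
> >    22
> >
> >   kernel`vm_thread_stack_create+0xc0
> >   kernel`sys_thr_new+0xd2
> >  1324
"""


def tally_by_entry_point(text, entry_points=ENTRY_POINTS):
    """Sum per-stack counts, keyed by the first known entry point in each stack."""
    totals = Counter()
    functions = []  # function names of the stack currently being read
    for raw in text.splitlines():
        # Drop mail quote markers and surrounding whitespace.
        line = raw.lstrip("> ").strip()
        if line.isdigit():
            # A bare number ends a stack: it is that stack's count.
            key = next((f for f in functions if f in entry_points), "other")
            totals[key] += int(line)
            functions = []
        elif "`" in line:
            # Frame of the form module`function+offset; keep the function name.
            functions.append(line.split("`", 1)[1].split("+", 1)[0])
    return totals


if __name__ == "__main__":
    for name, count in tally_by_entry_point(SAMPLE).most_common():
        print(f"{name:12s} {count}")
```

With the quoted counts, the sys_thr_new stacks sum to 2 + 6 + 22 + 1324 = 1354 of 1362 recorded allocations, nearly all of them from the kernel stack allocation path.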