From: Alexander Motin
Date: Fri, 28 Jun 2013 11:57:14 +0300
To: Konstantin Belousov
Cc: hackers@freebsd.org
Subject: Re: b_freelist TAILQ/SLIST
Message-ID: <51CD4FEA.7030605@FreeBSD.org>
References: <51CCAE14.6040504@FreeBSD.org> <20130628065732.GL91021@kib.kiev.ua>
In-Reply-To: <20130628065732.GL91021@kib.kiev.ua>

On 28.06.2013 09:57, Konstantin Belousov wrote:
> On Fri, Jun 28, 2013 at 12:26:44AM +0300, Alexander Motin wrote:
>> While doing some profiles of GEOM/CAM IOPS scalability, on some test
>> patterns I've noticed serious congestion with spinning on global
>> pbuf_mtx mutex inside getpbuf() and relpbuf(). Since that code is
>> already very simple, I've tried to optimize probably the only thing
>> possible there: switch bswlist from TAILQ to SLIST. As I can see,
>> b_freelist field of struct buf is really used as TAILQ in some other
>> places, so I've just added another SLIST_ENTRY field. And result
>> appeared to be surprising -- I can no longer reproduce the issue at all.
>> Maybe it was just unlucky synchronization of specific test, but I've
>> seen it on two different systems and rechecked results with/without
>> patch three times.
> This is too unbelievable.

I understand that it looks like magic.
I was very surprised to see contention there at all, but
`pmcstat -n 10000000 -TS unhalted-cycles` shows it too often and
repeatably:

PMC: [CPU_CLK_UNHALTED_CORE] Samples: 28052 (100.0%) , 12 unresolved

%SAMP IMAGE      FUNCTION             CALLERS
 46.4 kernel     __mtx_lock_sleep     relpbuf:22.3 getpbuf:22.0 xpt_run_devq:0.8
 13.3 kernel     _mtx_lock_spin_cooki turnstile_trywait
  4.3 kernel     cpu_search_lowest    cpu_search_lowest
  2.3 kernel     getpbuf              physio

The benchmark results confirm it.

> Could it be, e.g. some cache line conflicts
> which cause the thrashing, in fact ? Does it help if you add void *b_pad
> before b_freelist instead of adding b_freeslist ?

No, this doesn't help. And previously I've tested it also with
b_freeslist in place but without other changes -- it didn't help either.

>> The present patch is here:
>> http://people.freebsd.org/~mav/buf_slist.patch
>>
>> The question is how to do it better? What is the KPI/KBI policy for
>> struct buf? I could replace b_freelist by a union and keep KBI, but
>> partially break KPI. Or I could add another field, probably breaking
>> KBI, but keeping KPI. Or I could do something handmade with no breakage.
>> Or this change is just a bad idea?
> The same question about using union for b_freelist/b_freeslist, does the
> effect of magically fixing the contention still there if b_freeslist
> is on the same offset as the b_freelist ?

Yes, it is.

> There are no K{B,P}I policy for struct buf in HEAD, just change it as
> it fits.

Which one would you prefer, the original or
http://people.freebsd.org/~mav/buf_slist2.patch ?

Thank you.

-- 
Alexander Motin
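
The TAILQ-to-SLIST idea discussed in this thread can be sketched in userland C with the <sys/queue.h> macros. This is only a minimal sketch under stated assumptions, not the contents of either patch: struct xbuf, struct xbuf_pool, pool_init(), pool_get() and pool_rel() are hypothetical names standing in for struct buf, bswlist, getpbuf() and relpbuf(), and the pbuf_mtx locking that surrounds these operations in the kernel is left out. The union mirrors the variant where the SLIST linkage sits at the same offset as the TAILQ linkage.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/queue.h>

/*
 * Hypothetical stand-in for struct buf: only the free-list linkage is
 * modeled. The union keeps both linkages at the same offset.
 */
struct xbuf {
	int id;
	union {
		TAILQ_ENTRY(xbuf) tq;	/* doubly linked: next + prev link */
		SLIST_ENTRY(xbuf) sl;	/* singly linked: next only */
	} link;
};

SLIST_HEAD(xbuf_slist, xbuf);

/* Hypothetical pool; in the kernel a mutex would protect it. */
struct xbuf_pool {
	struct xbuf_slist free;		/* LIFO free list */
	int nfree;			/* number of buffers on the list */
};

/* Put all n buffers from the array on the free list. */
static void
pool_init(struct xbuf_pool *p, struct xbuf *bufs, int n)
{
	SLIST_INIT(&p->free);
	p->nfree = 0;
	for (int i = 0; i < n; i++) {
		bufs[i].id = i;
		SLIST_INSERT_HEAD(&p->free, &bufs[i], link.sl);
		p->nfree++;
	}
}

/* Take the buffer at the head of the list, or NULL if it is empty. */
static struct xbuf *
pool_get(struct xbuf_pool *p)
{
	struct xbuf *bp = SLIST_FIRST(&p->free);

	if (bp != NULL) {
		SLIST_REMOVE_HEAD(&p->free, link.sl);
		p->nfree--;
	}
	return (bp);
}

/* Return a buffer to the head of the list. */
static void
pool_rel(struct xbuf_pool *p, struct xbuf *bp)
{
	assert(bp != NULL);
	SLIST_INSERT_HEAD(&p->free, bp, link.sl);
	p->nfree++;
}
```

With a singly linked list, both taking and returning a buffer touch only the list head and the element itself, while the TAILQ equivalents also maintain back-links. Whether that difference alone explains the measurements above is not established in this thread.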