From owner-freebsd-hackers@FreeBSD.ORG Fri Sep 6 08:11:25 2013
Date: Fri, 06 Sep 2013 11:11:19 +0300
From: Alexander Motin
To: hackers@freebsd.org
Cc: Jeff Roberson, Andriy Gapon
Subject: Re: Again about pbuf_mtx
Message-ID: <52298E27.60200@FreeBSD.org>
In-Reply-To: <52287BCD.4090507@FreeBSD.org>

On 05.09.2013 15:40, Alexander Motin wrote:
> Some may remember that not so long ago I complained about high lock
> contention on pbuf_mtx. At that time, switching the mutex to padalign
> reduced the problem. But now, after improving scalability in CAM and
> GEOM and doing more than half a million IOPS on a 32-core system, I
> am hitting that problem hard again -- hwpmc shows about 30% of CPU
> time spent spinning on that mutex and another 30% spent by threads
> trying to go to sleep on it and colliding there.
>
> Trying to mitigate that, I've made a patch
> (http://people.freebsd.org/~mav/pcpu_pbuf.patch) to split the single
> queue of pbufs into several. That definitely costs some amount of KVA
> and memory, but in my tests it fixes the problem radically, removing
> any measurable contention there. The patch is not complete and
> doesn't even boot on i386 yet, but I would like to hear opinions
> about the approach, or maybe better proposals.
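For those who don't want to dig through the patch, the split reduces
to something like the following minimal sketch. The names, the MAXCPU
sizing and the "steal from another queue" policy are illustrative, not
the actual diff:

#include <sys/param.h>
#include <sys/systm.h>
#include <sys/lock.h>
#include <sys/mutex.h>
#include <sys/queue.h>
#include <sys/bio.h>
#include <sys/buf.h>
#include <sys/pcpu.h>

/*
 * One pbuf freelist and one mutex per CPU, each pair padded to its
 * own cache line so the locks do not false-share.
 */
struct pbuf_queue {
	struct mtx		pq_mtx;
	TAILQ_HEAD(, buf)	pq_free;
} __aligned(CACHE_LINE_SIZE);

static struct pbuf_queue pbuf_queues[MAXCPU];

static struct buf *
pbuf_get(void)
{
	struct pbuf_queue *pq;
	struct buf *bp;

	/* The fast path touches only the current CPU's queue and lock. */
	pq = &pbuf_queues[curcpu];
	mtx_lock(&pq->pq_mtx);
	bp = TAILQ_FIRST(&pq->pq_free);
	if (bp != NULL)
		TAILQ_REMOVE(&pq->pq_free, bp, b_freelist);
	mtx_unlock(&pq->pq_mtx);
	/* NULL here means stealing from another CPU's queue or sleeping. */
	return (bp);
}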
At kib@'s suggestion I've tried to reimplement that patch using
vmem(9). The code indeed looks much better (at least it did before the
workarounds): http://people.freebsd.org/~mav/pbuf_vmem.patch . It
works fast, but I have found a number of problems:
 - now we have only 256 (or even fewer) pbufs, while the UMA zones
that vmem uses for its quick caches tend to allocate up to 256 items
per CPU and never release them back. I've partially worked around that
by passing a fake MAGIC_SIZE value to vmem and down to UMA as the item
size, to make the initial bucket sizes smaller, but that is a hack and
not always sufficient, since the bucket size may grow under contention
and, again, never shrink back.
 - UMA panics with "uma_zalloc: Bucket pointer mangled." if I let vmem
hand out zero as a valid value. I've worked around that by adding an
offset to the value, but I think that assertion in UMA should be
removed if we are going to use it for abstract values now.
A rough sketch of what the vmem variant looks like is at the end of
this mail.

> Another patch I've made
> (http://people.freebsd.org/~mav/si_threadcount.patch) removes the
> lock acquisition from dev_relthread() by using atomics for reference
> counting. That fixes another contention point I see. This patch looks
> fine to me, and the only contention left after it is on the HBA
> driver locks, but maybe I am missing something?
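The core of that change is small; roughly the following (a sketch, not
the actual diff):

#include <sys/param.h>
#include <sys/systm.h>
#include <sys/conf.h>
#include <machine/atomic.h>

void
dev_relthread(struct cdev *dev, int ref)
{

	if (!ref)
		return;
	KASSERT(dev->si_threadcount > 0,
	    ("%s threadcount is wrong", dev->si_name));
	/*
	 * Release semantics order the caller's accesses to the device
	 * before the reference drop, so no dev_lock()/dev_unlock()
	 * pair is needed any more.
	 */
	atomic_subtract_rel_int(&dev->si_threadcount, 1);
}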
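And the promised sketch of the vmem variant, assuming an arena of
indices over the existing swbuf array; the base of 1 is the offset
workaround for the zero-value assertion, and the names are again
illustrative:

#include <sys/param.h>
#include <sys/malloc.h>
#include <sys/vmem.h>
#include <sys/bio.h>
#include <sys/buf.h>

static vmem_t *pbuf_arena;

static void
pbuf_arena_init(void)
{
	/*
	 * quantum = 1 index; a non-zero qcache_max enables the per-CPU
	 * UMA quick caches that cause the hoarding described above.
	 * Base 1 keeps vmem from ever handing out 0.
	 */
	pbuf_arena = vmem_create("pbuf arena", 1, nswbuf, 1, 1,
	    M_WAITOK);
}

static struct buf *
pbuf_get(void)
{
	vmem_addr_t idx;

	if (vmem_alloc(pbuf_arena, 1, M_WAITOK | M_BESTFIT, &idx) != 0)
		return (NULL);
	return (&swbuf[idx - 1]);	/* undo the base offset */
}

static void
pbuf_put(struct buf *bp)
{
	vmem_free(pbuf_arena, (vmem_addr_t)(bp - swbuf) + 1, 1);
}

-- 
Alexander Motin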