Date: Fri, 3 Jan 2014 19:25:21 -0800
In-Reply-To: <52C77DB8.5020305@gmail.com>
Subject: Re: Acquiring a lock on the same CPU that holds it - what can be done?
From: Adrian Chadd
To: David Xu
Cc: "freebsd-arch@freebsd.org"

Don't critical_enter() / critical_exit() disable and re-enable interrupts? We
don't necessarily want to do -that-, as that can be expensive. Just
not scheduling certain tasks that would interfere would be good
enough.


-a

On 3 January 2014 19:19, David Xu wrote:
> On 2014/01/04 08:55, Adrian Chadd wrote:
>> Hi,
>>
>> So here's a fun one.
>>
>> When doing TCP traffic + socket affinity + thread pinning experiments,
>> I seem to hit this very annoying scenario that caps my performance and
>> scalability.
>>
>> Assume I've lined up everything relating to a socket to run on the
>> same CPU (i.e., TX, RX, TCP timers, userland thread):
>>
>> * userland code calls something, let's say "kqueue"
>> * the kqueue lock gets grabbed
>> * an interrupt comes in for the NIC
>> * the NIC code runs some RX code, and eventually hits something that
>> wants to push a knote up
>> * and the knote is for the same kqueue above
>> * .. so it grabs the lock..
>> * .. contends..
>> * Then the scheduler flips us back to the original userland thread doing TX
>> * The userland thread finishes its kqueue manipulation and releases
>> the queue lock
>> * .. the scheduler then immediately flips back to the NIC thread
>> waiting for the lock, grabs the lock, does a bit of work, then
>> releases the lock
>>
>> I see this on kqueue locks, sendfile locks (for sendfile notification)
>> and vm locks (for the VM page referencing/dereferencing.)
>>
>> This happens very frequently.
>> It's very noticeable with large numbers
>> of sockets, as the chance of hitting a lock in the NIC RX path that
>> overlaps with something in the userland TX path that you are currently
>> fiddling with (e.g. kqueue manipulation) or sending data (e.g. vm_page
>> locks or sendfile locks for things you're currently transmitting) is
>> very high. As I increase traffic and the number of sockets, the number
>> of context switches goes way up (to 300,000+) and the lock contention
>> / time spent doing locking is non-trivial.
>>
>> Linux doesn't "have this" problem - the lock primitives let you
>> disable driver bottom halves. So, in this instance, I'd just grab the
>> lock with spin_lock_bh() and all the driver bottom halves would not be
>> run. I'd thus not have this scheduler ping-ponging and lock contention
>> as it'd never get a chance to happen.
>>
>> So, does anyone have any ideas? Has anyone seen this? Shall we just
>> implement a way of doing selective thread disabling, a la
>> spin_lock_bh() mixed with spl${foo}() style stuff?
>>
>> Thanks,
>>
>>
>> -adrian
>>
>
> This is how a turnstile-based mutex works. AFAIK it is for realtime,
> the same as a POSIX pthread priority inheritance mutex. Realtime does not
> mean high performance; in fact, it introduces more context switches
> and hurts throughput. I think the default mutex could be patched to
> call critical_enter when mutex_lock is called, and spin forever,
> and call critical_exit when the mutex is unlocked, bypassing the turnstile.
> The turnstile design assumes the whole system must be scheduled
> on a global thread priority, but who says a system must be based on this?
> Recently, I ported a Linux CFS-like scheduler to FreeBSD on our
> perforce server.
> It is based on start-time fair queueing, and I found the turnstile to be
> such a bad thing.
> It means I cannot schedule threads based on class: rt > timeshare > idle,
> but must instead deal with global thread priority changes.
> I have stopped porting it, although it now works fully on UP. It supports
> nested group scheduling, and I can watch video smoothly while doing "make
> -j10 buildworld" on the
> same UP machine. My scheduler does not work on SMP; too much priority
> propagation
> work drove me away. A non-preemptible spinlock works well for such
> a system; propagating thread weight on a scheduler tree is not practical.
>
> Regards,
> David Xu
>
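
For reference, the "critical_enter, spin, critical_exit" idea David describes above can be sketched in userland C11. This is only an illustration under loud assumptions: critical_enter_sketch() and critical_exit_sketch() are hypothetical stand-ins that merely count nesting depth (they do not block preemption the way the kernel's critical_enter() / critical_exit() do), and struct np_spin_mtx with its np_spin_* functions is a made-up name, not an existing FreeBSD type or API.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Hypothetical userland stand-ins for the kernel's critical_enter() and
 * critical_exit(). They only count nesting depth; they do not actually
 * prevent preemption.
 */
static _Thread_local int critnest;

static void
critical_enter_sketch(void)
{
	critnest++;
}

static void
critical_exit_sketch(void)
{
	assert(critnest > 0);
	critnest--;
}

/* Hypothetical non-preemptible spin mutex; not an existing kernel type. */
struct np_spin_mtx {
	atomic_flag held;
};

#define NP_SPIN_MTX_INIT { ATOMIC_FLAG_INIT }

/* Enter the (simulated) critical section first, then spin until acquired. */
static void
np_spin_lock(struct np_spin_mtx *m)
{
	critical_enter_sketch();
	while (atomic_flag_test_and_set_explicit(&m->held, memory_order_acquire))
		;	/* spin; no turnstile, no sleeping */
}

/* Single attempt; leaves the critical section again if the lock is held. */
static bool
np_spin_trylock(struct np_spin_mtx *m)
{
	critical_enter_sketch();
	if (atomic_flag_test_and_set_explicit(&m->held, memory_order_acquire)) {
		critical_exit_sketch();
		return (false);
	}
	return (true);
}

/* Release the lock, then leave the (simulated) critical section. */
static void
np_spin_unlock(struct np_spin_mtx *m)
{
	atomic_flag_clear_explicit(&m->held, memory_order_release);
	critical_exit_sketch();
}
```

The ordering is the point of the sketch: the holder enters the critical section before taking the lock and leaves it only after releasing, so on a single CPU nothing else should get to run and contend while the lock is held. Whether that trade-off is acceptable in the real kernel is exactly what this thread is debating.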