Date: Sat, 28 Mar 2015 16:58:53 -0700
Subject: Re: irq cpu binding
From: Adrian Chadd
To: Slawa Olhovchenkov
Cc: "freebsd-hackers@freebsd.org"

Hi,

* It turns out that fragments were being handled out of order 100% of the time (compared to non-fragments in the same stream) when doing fragment reassembly, because the current system was assuming direct dispatch netisr and not checking packet contents to see whether they're on the wrong CPU. I checked. It's not noticeable unless you go digging, but it's absolutely happening. That's why I spent a lot of cycles looking at the IP fragment reassembly path and which methods get called on the frames as they're reinjected.

* We're going to have to modify drivers, because the way drivers currently assign interrupts, pick CPUs for queues, auto-select how many queues to use, etc. is all completely ad hoc and inconsistent. So yeah, we're going to change the drivers and they're going to be consistent and configurable. That way you can choose how you want to distribute work and pin or not pin things, and it's not done ad hoc differently in each driver. Even igb, ixgbe and cxgbe differ in how they implement these three things.

* For RSS, there'll be a consistent configuration for what the hardware is doing with hashing, rather than it being driver dependent. Again, otherwise you may end up with some NICs doing 2-tuple hashing where others are doing 4-tuple hashing, and behaviour changes dramatically based on what NIC you're using.

* For applications - I'm not sure yet, but at a minimum the librss API I have vaguely sketched out and coded up in a git branch lets you pull out the list of buckets and which CPU each one is on.
I'm going to extend that a bit more, but it should be enough for things like nginx to say "ok, start up one nginx process per RSS bucket, and here's the CPU set for it to bind to." You said it has worker groups - that's great; I want that to be auto-configured.

-adrian
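
To illustrate the 2-tuple versus 4-tuple point: the rough sketch below computes a Toeplitz hash (the hash commonly used for RSS) over the same flow twice, once on addresses only and once on addresses plus ports. Non-first IP fragments carry no TCP/UDP header, so they can only be hashed on the 2-tuple, which is one plausible way a fragment ends up in a different bucket (and on a different CPU) than the rest of its stream. The key is the widely published sample RSS key; the addresses, ports, bucket count of 8 and all function names are assumptions made up for this illustration and are not taken from the librss branch or from any driver.

```python
import socket
import struct

# Widely published 40-byte sample RSS key. Real NICs may be programmed
# with a different key; treat this one as an assumption.
RSS_KEY = bytes.fromhex(
    "6d5a56da255b0ec24167253d43a38fb0"
    "d0ca2bcbae7b30b477cb2da38030f20c"
    "6a42b73bbeac01fa"
)


def toeplitz_hash(data: bytes, key: bytes = RSS_KEY) -> int:
    """Toeplitz hash: for every set input bit, XOR in the 32-bit window
    of the key that starts at that bit position."""
    key_bits = len(key) * 8
    if len(data) * 8 > key_bits - 32:
        raise ValueError("input too long for this key")
    key_int = int.from_bytes(key, "big")
    result = 0
    for i, byte in enumerate(data):
        for bit in range(8):
            if byte & (0x80 >> bit):
                shift = key_bits - 32 - (i * 8 + bit)
                result ^= (key_int >> shift) & 0xFFFFFFFF
    return result


def hash_2tuple(src: str, dst: str) -> int:
    """Hash on source and destination IPv4 address only."""
    return toeplitz_hash(socket.inet_aton(src) + socket.inet_aton(dst))


def hash_4tuple(src: str, dst: str, sport: int, dport: int) -> int:
    """Hash on addresses plus source and destination port."""
    ports = struct.pack("!HH", sport, dport)
    return toeplitz_hash(socket.inet_aton(src) + socket.inet_aton(dst) + ports)


def bucket(hash_value: int, nbuckets: int = 8) -> int:
    """Map a hash to a bucket using its low-order bits.
    nbuckets is assumed to be a power of two."""
    return hash_value & (nbuckets - 1)


if __name__ == "__main__":
    # Hypothetical flow, chosen only for illustration.
    src, dst, sport, dport = "192.0.2.10", "198.51.100.20", 40000, 80
    h4 = hash_4tuple(src, dst, sport, dport)
    h2 = hash_2tuple(src, dst)
    print("4-tuple (unfragmented): hash=%#010x bucket=%d" % (h4, bucket(h4)))
    print("2-tuple (fragment):     hash=%#010x bucket=%d" % (h2, bucket(h2)))
```

If the two buckets differ for a given flow, fragments and non-fragments of that flow are steered to different queues unless something later inspects the packet and moves it to the right CPU.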