From owner-freebsd-current@FreeBSD.ORG Wed Aug 7 07:18:03 2013 Return-Path: Delivered-To: current@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id B089A427; Wed, 7 Aug 2013 07:18:03 +0000 (UTC) (envelope-from rizzo.unipi@gmail.com) Received: from mail-lb0-x229.google.com (mail-lb0-x229.google.com [IPv6:2a00:1450:4010:c04::229]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 520FC2C56; Wed, 7 Aug 2013 07:18:02 +0000 (UTC) Received: by mail-lb0-f169.google.com with SMTP id u10so1255543lbi.28 for ; Wed, 07 Aug 2013 00:18:00 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:date:message-id:subject :from:to:cc:content-type; bh=0KxcqGC3WvWiRDlyeUqaPPUJNeQU620FtJBwCg/XN4c=; b=jOAWpdA8Cq+lwF5MdP33CeExc4xFk6p8i3RX3LRbwdgDZrnEeVhrAcO52oaRyFwnHr nZcOCyxtsjpifhxu8uHP6qOBWdnJyVdSrM8DiZFL0c8aNFuGqixiYeHd7ffakPCoSSI5 8P+WMo3JjgkYNt4ePl5tQkwfYbWvKgnjPGcwjruag4WWmXCt+B+wUrbufMG5jHGY4oOx iiQGkbZhzcsIjV8sq++GAS2cIXfYhKki4Z6OmMqtSfUpoGVwORkXatmMwSYIsD2xHB+j SBR7592Zybgygn3aoGOflCsyuWDDvaeZcMERGM/EGUGyWT8mezSSHob2ByQudeQtwdX1 uMTA== MIME-Version: 1.0 X-Received: by 10.152.19.97 with SMTP id d1mr795779lae.34.1375859880203; Wed, 07 Aug 2013 00:18:00 -0700 (PDT) Sender: rizzo.unipi@gmail.com Received: by 10.114.200.165 with HTTP; Wed, 7 Aug 2013 00:17:59 -0700 (PDT) In-Reply-To: <201308070326.r773QCLD035541@mail.karels.net> References: <20130805215319.GA43271@onelab2.iet.unipi.it> <201308070326.r773QCLD035541@mail.karels.net> Date: Wed, 7 Aug 2013 09:18:00 +0200 X-Google-Sender-Auth: a7yRlQjeUh-lwQ89EstnIQoZ08w Message-ID: Subject: Re: [net] protecting interfaces from races between control and data ? From: Luigi Rizzo To: mike@karels.net Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.14 Cc: Adrian Chadd , Andre Oppermann , current@freebsd.org, Bryan Venteicher , Navdeep Parhar , net@freebsd.org, Giuseppe Lettieri X-BeenThere: freebsd-current@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Discussions about the use of FreeBSD-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 07 Aug 2013 07:18:03 -0000 On Wed, Aug 7, 2013 at 5:26 AM, Mike Karels wrote: > I'm replying to one of the last messages of this thread, but in part going > back to the beginning; then I'm following up on Andre's proposal. > > Luigi wrote: > > i am slightly unclear of what mechanisms we use to prevent races > > between interface being reconfigured (up/down/multicast setting, etc, > > all causing reinitialization of the rx and tx rings) and > > > i) packets from the host stack being sent out; > > ii) interrupts from the network card being processed. > > > I think in the old times IFF_DRV_RUNNING was used for this purpose, > > but now it is not enough. > > Acquiring the "core lock" in the NIC does not seem enough, either, > > because newer drivers, especially multiqueue ones, have per-queue > > rx and tx locks. > > > Does anyone know if there is a generic mechanism, or each driver > > reimplements its own way ? > > I'm not sure I understand the question, or its motivation. What problem(s) > are we trying to solve here? It seems to me that this is mostly internal > to drivers, and I don't see the issue with races. In particular, the only > external guarantees that I see are that control operations will affect the > packet stream "soon" but at some undefined place. Not all of the cited > operations (e.g. multicast changes) need to cause the rings to be > reinitialized; if they do, that's a chip or driver flaw. Clearing the UP > flag should cause packets to stop arriving "soon", but presumably > processing > those already in memory; packets to stop being sent "soon", probably > including > some already accepted for transmission; and new attempts to transmit > receiving > an error "soon". And, of course, the driver should not crash or misbehave. > Other than that, I don't see what external guarantees need to be met. > i only want 'driver should not crash or misbehave', which is something that (I believe) many current drivers do not guarantee because of the races mentioned in the thread (control path reinitializes rings without waiting for the datapath to drain). My specific problem was achieving this safe behaviour when moving between netmap mode and regular mode; i hoped i could replicate whatever scheme was implemented by the drivers in 'normal' mode, and this is when i realized that there was no such protection in place. Jumping to (near) the end of the thread, I like most of Andre's proposal. > Running with minimal locks at this layer is an admirable goal, and I agree > with most of what was said. I have a few observations on the general > changes, > or related issues: > > There was mention of taskqueues. I think that with MSI-X, taskqueues > should not be needed or used. More specifically, having separate ithreads > and taskqueues, with ithreads deferring to taskqueues after some limit, > makes > sense only for MSI and legacy interrupts. With MSI-X, an interrupt thread > should be able to process packets indefinitely with sufficient CPU > resources, > and there is no reason to context switch to a different thread > periodically. > A periodic "yield" might be reasonable, but if it is necessary, small > packet > performance will suffer. However, most of this is internal to the driver. > i am not completely clear on what is the difference between ithreads and taskqueues. Also, Andre's proposal requires to force-kill the ithread, but i am unclear on how to do it safely (i.e. without leaving the data structures in some inconsistent state), unless ithread periodically yields the CPU when it is in a safe state. While this is internal to the driver, we should probably provide some template code to avoid that each driver implements its own way to shutdown the ithread. cheers luigi