From: Scott Long
Date: Sat, 24 Aug 2013 12:21:40 -0600
Subject: Re: [rfc] migrate lagg to an rmlock
To: Alfred Perlstein
Cc: FreeBSD Net, Adrian Chadd, freebsd-current, "Robert N. M. Watson",
    "Alexander V. Chernikov"
In-Reply-To: <5218F803.7000405@mu.org>
References: <5218AA36.1080807@ipfw.ru> <5218E108.6090901@mu.org>
    <5218F803.7000405@mu.org>

On Aug 24, 2013, at 12:14 PM, Alfred Perlstein wrote:

> On 8/24/13 10:47 AM, Robert N. M. Watson wrote:
>> On 24 Aug 2013, at 17:36, Alfred Perlstein wrote:
>>
>>>> We should distinguish "lock contention" from "line contention".
>>>> When acquiring a rwlock on multiple CPUs concurrently, the cache
>>>> lines used to implement the lock are contended, as they must bounce
>>>> between caches via the cache coherence protocol, also referred to
>>>> as "contention". In the if_lagg code, I assume that the read-only
>>>> acquire of the rwlock (and perhaps now rmlock) is for data
>>>> stability rather than mutual exclusion -- e.g., to allow processing
>>>> to completion against a stable version of the lagg configuration.
>>>> As such, indeed, there should be no lock contention unless a
>>>> configuration update takes place, and any line contention is a
>>>> property of the locking primitive rather than data model.
>>>>
>>>> There are a number of other places in the kernel where migration to
>>>> an rmlock makes sense -- however, some care must be taken for four
>>>> reasons: (1) while read locks don't experience line contention,
>>>> write locking becomes observably more expensive so is not suitable
>>>> for all rwlock line contention spots -- e.g., rmlocks might not be
>>>> suitable for tcbinfo; (2) rmlocks, unlike rwlocks, implement reader
>>>> priority propagation, so you must reason about it; and (3)
>>>> historically, rmlocks have not fully implemented WITNESS so you may
>>>> get less good debugging output. if_lagg is a nice place to use
>>>> rmlocks, as reconfigurations are very rare, and it's really all
>>>> about long-term data stability.
>>> Robert, what do you think about a quick swap of the ifnet structures
>>> to counter before 10.x?
>> Could you be more specific about the proposal you're making?
>>
>> Robert
>
> The lagg patch referred to in the thread seems to indicate that zero
> locking is needed if we just switched to counter(9); that makes me
> wonder if we could do better with locking in other places if we
> switched to counter(9) while we have the chance.
>
> This is the thread:
>
> http://lists.freebsd.org/pipermail/svn-src-all/2013-April/067570.html
>
>> Perfect solution would be to convert ifnet(9) to counters(9), but
>> this requires much more work, and unfortunately ABI change, so
>> temporarily patch lagg(4) manually.
>>
>> We store counters in the softc, and once per second push their
>> values to legacy ifnet counters.
>

Some sort of gatekeeper semantic is needed to ensure that configuration
changes to the lagg state don't cause incorrect behavior in the data
path. It's not about protecting the integrity of counters. This can be
done several ways, but right now it's via a read/write semantic.

Scott