From owner-freebsd-hackers@FreeBSD.ORG  Fri Feb 14 02:37:05 2014
Return-Path: <owner-freebsd-hackers@FreeBSD.ORG>
Delivered-To: freebsd-hackers@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id CA09D9E1;
 Fri, 14 Feb 2014 02:37:05 +0000 (UTC)
Received: from mail-ob0-x22b.google.com (mail-ob0-x22b.google.com
 [IPv6:2607:f8b0:4003:c01::22b])
 (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits))
 (No client certificate requested)
 by mx1.freebsd.org (Postfix) with ESMTPS id 740EC1FF2;
 Fri, 14 Feb 2014 02:37:05 +0000 (UTC)
Received: by mail-ob0-f171.google.com with SMTP id wp4so13291458obc.16
 for <multiple recipients>; Thu, 13 Feb 2014 18:37:04 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:in-reply-to:references:date:message-id:subject:from:to
 :cc:content-type;
 bh=MERphlNKfDE1BAVIvYiPemycFmItkkla/PRxwB3dpxM=;
 b=KiEsJ8C7bnC51BcfqhM8FStxMpvZIZjUYzzLbNXrcoGoQx1L2tuxg63IBCfl9fkbH2
 +G6wVB/38tnwx7NKlaFi0q2OMamqPvKLms2SkIg2vtxtGjNZqC0GnLsqniznupmRMI/l
 AQuQx667IRktJxrSBn+/HWfGTZs5L0HoLPPc/yy1GRAFHV2tYh4gWZMO8rftLZY7MylQ
 iFTKLcDIf4CwyaJ5fhEq/b5KS6kWZykuudpK1ckJt9tapXiymvxxBbiSB8ItVbqE2X/l
 m9BY6nGlkOO29PoG7GSgqy+cDk9vA5Nak4Z0jRpnVRJEavjVHj0Fd0x03CMRTotXi13P
 zc7Q==
MIME-Version: 1.0
X-Received: by 10.182.113.195 with SMTP id ja3mr4262074obb.46.1392345424631;
 Thu, 13 Feb 2014 18:37:04 -0800 (PST)
Received: by 10.76.130.196 with HTTP; Thu, 13 Feb 2014 18:37:04 -0800 (PST)
In-Reply-To: <CAJ-Vmo=7Nz1jqXy+rTQ7u9_ZP7jeFOKUJxU1O51tYJjvTUmWTg@mail.gmail.com>
References: <CAJ-Vmo=7Nz1jqXy+rTQ7u9_ZP7jeFOKUJxU1O51tYJjvTUmWTg@mail.gmail.com>
Date: Thu, 13 Feb 2014 21:37:04 -0500
Message-ID: <CAFMmRNxFbtegWMxfdD1=t7gHRVg_86FSxh0_R5_+h5JP+pw1Vw@mail.gmail.com>
Subject: Re: can the scheduler decide to schedule an interrupted but runnable
 thread on another CPU core? What are the implications for code?
From: Ryan Stone <rysto32@gmail.com>
To: Adrian Chadd <adrian@freebsd.org>
Content-Type: text/plain; charset=ISO-8859-1
Cc: "freebsd-hackers@freebsd.org" <freebsd-hackers@freebsd.org>,
 "freebsd-arch@freebsd.org" <freebsd-arch@freebsd.org>
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.17
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
 <freebsd-hackers.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers/>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>,
 <mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 14 Feb 2014 02:37:05 -0000

On Thu, Feb 13, 2014 at 6:57 PM, Adrian Chadd <adrian@freebsd.org> wrote:
> sequentially:
>
> * lookup occurs on CPU A;
> * lookup succeeds on CPU A for some almost-expired entry;
> * preemption occurs, and it gets scheduled to CPU B;
>
> then simultaneously:
>
> * CPU A's flowtable purge code runs, and decides to purge entries
> including the current one;
> * the code now running on CPU B has an item from the CPU A flowtable,
> and dereferences it as it's being freed, leading to potential badness.

This kind of scenario is definitely possible.  All of the FreeBSD
kernel code that I have seen dealing with lockless per-cpu data
structures (e.g. uma) uses critical_enter()/critical_exit() to
prevent preemption, and is careful to invalidate its references to
the per-cpu data if it has to drop the critical section.
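
A minimal userland sketch of that pattern follows.  It is not taken
from uma or flowtable: critical_enter()/critical_exit() are stubbed
out as simple counters so the example compiles outside the kernel, and
struct pcpu_table, curcpu_sim and lookup_value() are hypothetical
names made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Userland stand-ins (assumption) for the kernel's critical_enter()
 * and critical_exit().  In the kernel these disable preemption of the
 * current thread; here they only track nesting.
 */
static int crit_nesting;

static void critical_enter(void) { crit_nesting++; }
static void critical_exit(void) { assert(crit_nesting > 0); crit_nesting--; }

#define NCPU 2

struct entry {
	int key;
	int value;
};

/* Hypothetical lockless per-cpu table holding a single slot. */
struct pcpu_table {
	struct entry *slot;
};

static struct pcpu_table tables[NCPU];
static int curcpu_sim;		/* stand-in for the current CPU id */

/*
 * Look up a key in the current CPU's table.  The value is copied out
 * while still inside the critical section, and the local pointer is
 * invalidated before the section is dropped, so no reference to the
 * per-cpu data survives a possible preemption.
 */
static int
lookup_value(int key, int *out)
{
	struct entry *e;
	int found = 0;

	critical_enter();
	e = tables[curcpu_sim].slot;
	if (e != NULL && e->key == key) {
		*out = e->value;
		found = 1;
	}
	e = NULL;		/* do not carry the reference out */
	critical_exit();
	return (found);
}
```

A caller would only ever see the copied value, e.g.
"if (lookup_value(7, &v)) use(v);", and never a pointer into the
per-cpu table.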

I don't believe that sched_pin() is good enough.  It prevents
migration to another CPU but not preemption, so I don't believe it
handles the scenario where thread A gets a reference and is then
preempted, thread B frees the entry on the same CPU, and then A is
scheduled again and uses the now-freed entry.  However, I'm really not
familiar with flowtable at all, so maybe there's something preventing
that.
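
To make that interleaving concrete, here is a small deterministic
model in plain C.  It is only a sketch under stated assumptions, not
flowtable code: struct flow, purge_on_same_cpu() and use_entry() are
hypothetical names, a "freed" flag stands in for actually freeing the
memory, and the preemption is modelled as a direct function call at
the point where another thread could run.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-cpu entry; "freed" stands in for a real free(). */
struct flow {
	int id;
	bool freed;
};

static struct flow cpu0_flow = { 1, false };

/* Models thread B running the purge on the same CPU as thread A. */
static void
purge_on_same_cpu(void)
{
	cpu0_flow.freed = true;
}

/*
 * Models thread A.  With only pinning, A stays on its CPU but can
 * still be preempted between the lookup and the use, so the purge may
 * run in between.  Inside a critical section the purge cannot run on
 * this CPU until A is done.
 *
 * Returns true if the entry was still valid when A used it.
 */
static bool
use_entry(bool in_critical_section)
{
	struct flow *f = &cpu0_flow;	/* lookup succeeds */

	if (!in_critical_section)
		purge_on_same_cpu();	/* preemption point */

	return (!f->freed);		/* false: use after free */
}
```

In this model use_entry(true) sees a valid entry, while
use_entry(false) ends up touching an entry that has already been
purged, even though the thread never left its CPU.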