From owner-freebsd-current@FreeBSD.ORG Mon Nov 10 01:45:44 2014 Return-Path: Delivered-To: current@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 3E5D1A2B; Mon, 10 Nov 2014 01:45:44 +0000 (UTC) Received: from onelab2.iet.unipi.it (onelab2.iet.unipi.it [131.114.59.238]) by mx1.freebsd.org (Postfix) with ESMTP id 02A2E8B; Mon, 10 Nov 2014 01:45:43 +0000 (UTC) Received: by onelab2.iet.unipi.it (Postfix, from userid 275) id 9629E7300A; Mon, 10 Nov 2014 02:49:39 +0100 (CET) Date: Mon, 10 Nov 2014 02:49:39 +0100 From: Luigi Rizzo To: current@freebsd.org Subject: dev_lock() contention for fdesc syscalls -- possible fix Message-ID: <20141110014939.GA21626@onelab2.iet.unipi.it> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.5.20 (2009-06-14) Cc: jmg@freebsd.org, gnn@freebsd.org, adrian@freebsd.org X-BeenThere: freebsd-current@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Discussions about the use of FreeBSD-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 10 Nov 2014 01:45:44 -0000 It was noticed that there is huge dev_lock() contention when multiple processes do a poll() even on independent file descriptors. Turns out that not just poll but most syscalls on file descriptors (as opposed to sockets) in sys/fs/devfs/devfs_vnops.c including devfs_poll_f(), devfs_ioctl_f() and read/write share the problem as they use the following pattern devfs_poll_f() { ... devfs_fp_check(fp, ...) --> kern/kern_conf.c :: devvn_refthread(fp->f_vnode, ...) --> dev_lock(); dev = vp->v_rdev; // lock on vp ? ... check that dev != NULL atomic_add_long(&dev->si_threadcount, 1); dev_unlock(); dsw->d_poll() dev_relthread() --> atomic_subtract_rel_long(&dev->si_threadcount, 1); } I believe that dev_lock() in devvn_refthread() is protecting dev = vp->v_rdev (the cdev entry 'dev' cannot go away for the reasons stated below). However looking at places where vp->v_rdev is modified, turns out that it only happens when holding VI_LOCK(vp) -- the places are devfs_allocv() and devfs_reclaim(). There is one place in zfs where the vnode is created and v_rdev is set without holding a lock, so nobody can dereference it. As a consequence i believe that if in devfs_fp_check() we replace dev_lock() / dev_unlock() with VI_LOCK(vp) / VI_UNLOCK(vp), we make sure that we can safely dereference vp->v_rdev, and the cdev cannot go away because the vnode has a reference to it. The counter uses an atomic instruction (so the release is lockless) This should be enough to remove the contention. Anyone care to review/test the patch below ? (dev_refthread() still has the single dev_lock() contention, i don't know how to address that yet) cheers luigi Index: /home/luigi/FreeBSD/head/sys/kern/kern_conf.c =================================================================== --- /home/luigi/FreeBSD/head/sys/kern/kern_conf.c (revision 273452) +++ /home/luigi/FreeBSD/head/sys/kern/kern_conf.c (working copy) @@ -224,10 +224,11 @@ } csw = NULL; - dev_lock(); + ASSERT_VI_UNLOCKED(vp, __func__); + VI_LOCK(vp); // dev_lock(); dev = vp->v_rdev; if (dev == NULL) { - dev_unlock(); + VI_UNLOCK(vp); // dev_unlock(); return (NULL); } cdp = cdev2priv(dev); @@ -236,7 +237,7 @@ if (csw != NULL) atomic_add_long(&dev->si_threadcount, 1); } - dev_unlock(); + VI_UNLOCK(vp); // dev_unlock(); if (csw != NULL) { *devp = dev; *ref = 1;