From: Mark Johnston
Date: Tue, 10 Jan 2017 12:57:12 -0800
To: freebsd-hackers@FreeBSD.org
Subject: draining high-frequency callouts
Message-ID: <20170110205711.GA86449@wkstn-mjohnston.west.isilon.com>

I'm occasionally seeing an assertion failure in softclock_call_cc() when
running DTrace tests on a system with hz=10000. The assertion
(c->c_flags & CALLOUT_ACTIVE) != 0 is failing while a thread is
concurrently draining the callout, which runs at a high frequency. At
the time of the panic, that thread is spinning on the per-CPU callout
lock after having been awoken from "codrain", and CALLOUT_PENDING is set
on the callout. The callout is direct, i.e., it is executed in hard
interrupt context.

I think this is what's happening:
- callout_drain() is called while the callout is executing but after the
  callout has rescheduled itself, and goes to sleep after having cleared
  CALLOUT_ACTIVE.
- softclock_call_cc() wakes up the callout_drain() caller, but the
  callout fires again before the caller is scheduled.
- the second softclock_call_cc() call sees that CALLOUT_ACTIVE is
  cleared and panics.

Is there anything that prevents this scenario? Is it really correct to
leave CALLOUT_ACTIVE cleared when the per-CPU callout lock must be
dropped in order to acquire a sleepqueue lock?

Thanks,
-Mark