From: John Baldwin
To: freebsd-current@freebsd.org
Cc: mdf@freebsd.org, Hans Petter Selasky
Date: Mon, 15 Nov 2010 11:43:04 -0500
Subject: Re: sleep bug in taskqueue(9)
Message-Id: <201011151143.04825.jhb@freebsd.org>
References: <06D5F9F6F655AD4C92E28B662F7F853E039E389A@seaxch09.desktop.isilon.com> <201011122125.47922.hselasky@c2i.net>

On
Friday, November 12, 2010 4:24:51 pm mdf@freebsd.org wrote:
> On Fri, Nov 12, 2010 at 12:25 PM, Hans Petter Selasky wrote:
> > On Friday 12 November 2010 17:38:38 mdf@freebsd.org wrote:
> >> On Fri, Nov 12, 2010 at 6:23 AM, Hans Petter Selasky wrote:
> >> > On Friday 12 November 2010 15:18:46 mdf@freebsd.org wrote:
> >> >> On Fri, Nov 12, 2010 at 12:56 AM, Hans Petter Selasky wrote:
> >> >> > On Thursday 29 April 2010 01:59:58 Matthew Fleming wrote:
> >> >> >> It looks to me like taskqueue_drain(taskqueue_thread, foo) will not
> >> >> >> correctly detect whether or not a task is currently running. The
> >> >> >> check is against a field in the taskqueue struct, but for the
> >> >> >> taskqueue_thread queue with more than one thread, multiple threads
> >> >> >> can simultaneously be running a task, thus stomping over the
> >> >> >> tq_running field.
> >> >> >>
> >> >> >> I have not seen any problem with the code as-is in actual use, so
> >> >> >> this is purely an inspection bug.
> >> >> >>
> >> >> >> The following patch should fix the problem. Because it changes the
> >> >> >> size of struct task I'm not sure if it would be suitable for MFC.
> >> >> >
> >> >> > 1) The u_char is going to leave a hole in that structure on ARM
> >> >> > platforms for example.
> >> >> >
> >> >> > 2) The existing taskqueue implementation also has a missing check for
> >> >> > the pending count wrapping to zero. I.E. it should stick at 0xFFFF
> >> >> > and not wrap to 0.
> >> >>
> >> >> This commit mail is rather old, and this fix was incorrect, because
> >> >> the task cannot be referenced after it has been run. Some task
> >> >> handlers will free the task as part of the handler.
> >> >
> >> > Ok, maybe the e-mail got stuck somewhere. Have you fixed the above
> >> > mentioned issues in a newer patch?
> >>
> >> If you look at the file history for subr_taskqueue.c:
> >>
> >> http://svn.freebsd.org/viewvc/base/head/sys/kern/subr_taskqueue.c
> >>
> >> You will see quite a few commits by me. The most recent relating to
> >> detecting if a task is running is being MFC'd today:
> >
> > Yes, and I see that this code needs an overflow check, which is one of the
> > issues still not fixed:
>
> You keep bringing this up. It is not a new issue. It is not a bug in
> any of the patches. It is extremely unlikely that a task will be
> queued 65536 times before execution. It is more worthy of an assert
> rather than a check, because if a task is enqueued that many times
> without being run then there's likely a stuck task in the queue.
>
> The patch you posted will lie as well, so I would not consider it
> sufficient if someone wanted to address the issue.

I agree it should be an assert.

-- 
John Baldwin
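
For readers following the thread, the two options being weighed for the
16-bit pending count (stick at 0xFFFF, or assert that it never gets there)
can be sketched in user-space C roughly as below. This is only an
illustration under assumptions: struct demo_task and both demo_enqueue_*
functions are hypothetical names, not the code in subr_taskqueue.c and not
any patch posted in this thread, and kernel code would presumably use
KASSERT rather than assert().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a task with a 16-bit pending count. */
struct demo_task {
	uint16_t pending;	/* times enqueued since the task last ran */
};

/*
 * Option 1 (the "check"): saturate at 0xFFFF instead of wrapping to 0.
 * The count stops being exact once it sticks at the maximum.
 */
static void
demo_enqueue_saturate(struct demo_task *t)
{
	if (t->pending < UINT16_MAX)
		t->pending++;
}

/*
 * Option 2 (the "assert"): treat reaching the maximum as a bug, on the
 * reasoning that a task enqueued that often without running is stuck.
 */
static void
demo_enqueue_assert(struct demo_task *t)
{
	assert(t->pending < UINT16_MAX);
	t->pending++;
}
```

With the saturating variant the handler is later told a count that may be
lower than the real number of enqueues, whereas the assert variant stops at
the point where the count would become unreliable.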