From owner-freebsd-hackers@freebsd.org Wed Oct 5 23:59:17 2016
From: Julian Elischer
Date: Wed, 5 Oct 2016 16:59:05 -0700
Subject: Re: fix for use-after-free problem in 10.x [review please].
To: FreeBSD Stable, freebsd
Message-ID: <6eaa1131-e268-bfb4-9203-a93eaf296f6c@freebsd.org>
In-Reply-To: <7b732876-8cc3-a638-7ff1-e664060d4907@freebsd.org>

Please review: https://reviews.freebsd.org/D8160

This is a direct fix for stable/10, as the bug is not present in 11+ in this form.

Julian

On 4/10/2016 8:06 PM, Julian Elischer wrote:
> In 11 and 12 the taskqueue code has been rewritten in this area, but
> under 10 this bug still occurs.
>
> On our appliances this bug stops the system from mounting the ZFS
> root, so it is quite severe.
> Basically, while the thread is sleeping during the ZFS mount of root
> (in the while loop), another thread can free the 'task' item it is
> checking in that while loop, and it can be reused or filled with
> 'deadcode' etc., with the waiting code unaware of the change. The
> fix is to refetch the item at the end of the queue each time around
> the loop.
> I don't really want to do the bigger change of MFCing the change in
> 11, as it is more extensive, though if someone else does, that's OK
> by me (if it's ABI compatible).
>
> Any comments or suggestions?
>
> Here's the fix in diff form:

A slightly better fix is at https://reviews.freebsd.org/D8160

>
> [robot@porridge /usr/src]$ p4 diff -du ...
> --- //depot/pbranches/jelischer/FreeBSD-PZ/10.3/sys/kern/subr_taskqueue.c	2016-09-27 09:14:59.000000000 -0700
> +++ /usr/src/sys/kern/subr_taskqueue.c	2016-09-27 09:14:59.000000000 -0700
> @@ -441,9 +441,10 @@
>
>  	TQ_LOCK(queue);
>  	task = STAILQ_LAST(&queue->tq_queue, task, ta_link);
> -	if (task != NULL)
> -		while (task->ta_pending != 0)
> -			TQ_SLEEP(queue, task, &queue->tq_mutex, PWAIT, "-", 0);
> +	while (task != NULL && task->ta_pending != 0) {
> +		TQ_SLEEP(queue, task, &queue->tq_mutex, PWAIT, "-", 0);
> +		task = STAILQ_LAST(&queue->tq_queue, task, ta_link);
> +	}
>  	taskqueue_drain_running(queue);
>  	KASSERT(STAILQ_EMPTY(&queue->tq_queue),
>  	    ("taskqueue queue is not empty after draining"));
>
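
For anyone reading along without the source handy, the refetch-after-sleep pattern described above can be sketched in user space. This is only an illustration under assumptions: struct task, struct queue, enqueue(), worker_runs_during_sleep() and drain() are made-up stand-ins, not the kernel's definitions, and the "sleep" is simulated by a plain function call that completes and frees the oldest task, which is the kind of thing another thread may do while the drainer is asleep.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel structures (not FreeBSD's). */
struct task {
	struct task *next;
	int pending;		/* non-zero while the task is still queued */
};

struct queue {
	struct task *head;
	struct task *tail;
};

/* Append a new pending task to the tail of the queue. */
static void
enqueue(struct queue *q)
{
	struct task *t = malloc(sizeof(*t));

	if (t == NULL)
		abort();
	t->next = NULL;
	t->pending = 1;
	if (q->tail != NULL)
		q->tail->next = t;
	else
		q->head = t;
	q->tail = t;
}

/*
 * Simulates what can happen while the drainer sleeps: a worker removes
 * the oldest task, runs it, and frees it.  After this returns, any task
 * pointer the drainer held may point at freed memory.
 */
static void
worker_runs_during_sleep(struct queue *q)
{
	struct task *t = q->head;

	if (t == NULL)
		return;
	q->head = t->next;
	if (q->head == NULL)
		q->tail = NULL;
	t->pending = 0;
	free(t);
}

/*
 * Wait until the queue's last task is no longer pending.  The tail is
 * refetched after every sleep instead of reusing the pointer taken
 * before sleeping.  Returns the number of sleeps, for illustration.
 */
static int
drain(struct queue *q)
{
	struct task *task = q->tail;
	int sleeps = 0;

	while (task != NULL && task->pending != 0) {
		worker_runs_during_sleep(q);	/* stands in for the sleep */
		sleeps++;
		task = q->tail;			/* refetch: old pointer may be stale */
	}
	return (sleeps);
}
```

With the pre-patch loop shape (fetch the tail once, then test task->ta_pending after each wakeup), the last pass in this sketch would read a task that has already been freed. Refetching the tail means the loop sees NULL once the queue has emptied and exits without touching the old pointer.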