From: Mikolaj Golub
Date: Sun, 24 Nov 2013 19:25:21 +0200
To: Pete French
Cc: freebsd-stable@freebsd.org
Subject: Re: Hast locking up under 9.2
Message-ID: <20131124172520.GB17292@gmail.com>
In-Reply-To: <20131123215950.GA17292@gmail.com>
References: <20131121203711.GA3736@gmail.com> <20131123215950.GA17292@gmail.com>

On Sat, Nov 23, 2013 at 11:59:51PM +0200, Mikolaj Golub wrote:

> So I propose:
>
> 1) Use hio_countdown only for counting the components we are waiting
>    to complete, i.e. initially it is always 2 for any replication mode.
>
> 2) To distinguish between "memsync ack" and "memsync fin" responses
>    from the secondary, add and use a hio_memsyncacked field.
>
> 3) Call write_complete() in component threads _before_ releasing
>    hio_countdown (i.e. before the hio may be returned to the done
>    queue).
>
> 4) Add and use a hio_writecount refcounter to detect when
>    write_complete() should be called in the memsync case.
>
> 5) As hio_done is used only for async, rename it to hio_asyncdone and
>    check/modify it outside of the more generic write_complete(), only
>    when it is needed.
>
> Now, write_complete():
> - for fullsync is called by ggate_send_thread;
> - for async case -- either by local component thread or by ggate_send_thread;
> - for memsync case -- by one of the component threads.

I just realized that when the write has failed locally I can't call
write_complete() until "memsync fin" is received (to get the status
from the secondary), i.e. in this case write_complete() should be
called by ggate_send_thread.

Here is an updated patch:

http://people.freebsd.org/~trociny/patches/hast.primary.c.memsync_write_complete.2.patch

I have reverted (5), so hio_done is used to detect whether
write_complete() is needed in ggate_send_thread for the memsync case
too.

-- 
Mikolaj Golub
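
For readers following the thread, the memsync calling rule described above can be modelled roughly as below. This is only a toy, single-threaded sketch of one possible reading of the proposal, not code from the patch: struct toy_hio, the toy_* functions and the enum are hypothetical names, the real hastd code works on struct hio from several threads and would need atomic refcount operations, and the exact conditions in the patch may differ.

```c
#include <assert.h>
#include <stdbool.h>

/* Who ended up calling write_complete() in this toy model. */
enum toy_caller { TOY_NOBODY, TOY_LOCAL, TOY_REMOTE, TOY_GGATE_SEND };

/* Hypothetical, simplified stand-in for a HAST request; not the real struct hio. */
struct toy_hio {
	int countdown;      /* components not yet finished; always starts at 2 */
	int writecount;     /* events still missing before early completion */
	bool memsyncacked;  /* "memsync ack" seen, so the next reply is "fin" */
	bool done;          /* write_complete() has already been called */
	enum toy_caller completed_by;
};

static void
toy_write_complete(struct toy_hio *h, enum toy_caller who)
{
	if (!h->done) {
		h->done = true;
		h->completed_by = who;
	}
}

/* Drop one component reference; true means the request may be queued as done. */
static bool
toy_release(struct toy_hio *h)
{
	assert(h->countdown > 0);
	return (--h->countdown == 0);
}

/* Local component: complete early only if the local write succeeded. */
static bool
toy_local_finished(struct toy_hio *h, bool ok)
{
	if (ok && --h->writecount == 0)
		toy_write_complete(h, TOY_LOCAL);	/* before releasing countdown */
	return (toy_release(h));
}

/* Remote component: the first reply is the ack, the second is the fin. */
static bool
toy_remote_reply(struct toy_hio *h)
{
	if (!h->memsyncacked) {
		h->memsyncacked = true;
		if (--h->writecount == 0)
			toy_write_complete(h, TOY_REMOTE);
		return (false);	/* still waiting for "memsync fin" */
	}
	return (toy_release(h));
}

/* Run one memsync write and report who called write_complete(). */
static enum toy_caller
toy_run(bool local_ok, bool ack_first)
{
	struct toy_hio h = { 2, 2, false, false, TOY_NOBODY };
	bool queued;

	if (ack_first) {
		(void)toy_remote_reply(&h);
		(void)toy_local_finished(&h, local_ok);
	} else {
		(void)toy_local_finished(&h, local_ok);
		(void)toy_remote_reply(&h);
	}
	queued = toy_remote_reply(&h);	/* "memsync fin" */
	assert(queued);
	(void)queued;
	/* Stand-in for ggate_send_thread: complete only if nobody has yet. */
	if (!h.done)
		toy_write_complete(&h, TOY_GGATE_SEND);
	return (h.completed_by);
}
```

In this model a successful local write is completed by whichever component thread sees the second of the two events (local write done, "memsync ack"), while a failed local write is left for the ggate_send_thread stand-in, which runs only after "memsync fin" has arrived and the done flag is still unset.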