From: Mikolaj Golub
To: Pawel Jakub Dawidek
Cc: freebsd-fs
Organization: TOA Ukraine
Date: Thu, 29 Apr 2010 11:03:33 +0300
Subject: Re: HAST: primary might get stuck when there are connectivity problems with secondary

On Wed, 28 Apr 2010 23:46:36 +0200 Pawel Jakub Dawidek wrote:

 PJD> Could you see if the following patch fixes the problem for you:

 PJD>   http://people.freebsd.org/~pjd/patches/hastd_timeout.patch

 PJD> The patch sets timeout on both incoming and outgoing sockets on primary
 PJD> and on outgoing socket on secondary. Incoming socket on secondary is
 PJD> left with no timeout to avoid problem you described above.

The patch works for me. After disabling the network connection between the
primary and the secondary, FS operations on the primary do not get stuck and
the following messages are observed:

Apr 29 10:37:41 hasta hastd: [storage] (primary) Unable to receive reply header: Resource temporarily unavailable.
Apr 29 10:37:57 hasta hastd: [tank] (primary) Unable to receive reply header: Resource temporarily unavailable.
Apr 29 10:37:57 hasta hastd: [tank] (primary) Unable to send request (Resource temporarily unavailable): WRITE(972292096, 14336).
Apr 29 10:38:56 hasta hastd: [storage] (primary) Unable to connect to 172.20.66.202: Operation timed out.
Apr 29 10:39:12 hasta hastd: [tank] (primary) Unable to connect to 172.20.66.202: Operation timed out.
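
For reference, "Resource temporarily unavailable" is the standard error text
for EAGAIN, which is what a blocking recv() or send() reports when a timeout
set with SO_RCVTIMEO or SO_SNDTIMEO expires. The sketch below illustrates
that general mechanism only. It is not taken from hastd_timeout.patch, and it
is an assumption that the patch relies on socket-level timeouts of this kind.
The function names set_socket_timeouts() and recv_from_silent_peer() are
hypothetical.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

/*
 * Apply the same send and receive timeout (in seconds) to a socket.
 * Returns 0 on success, -1 on failure with errno set by setsockopt().
 * Hypothetical helper, not part of hastd.
 */
int
set_socket_timeouts(int fd, int seconds)
{
	struct timeval tv;

	tv.tv_sec = seconds;
	tv.tv_usec = 0;
	if (setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv)) == -1)
		return (-1);
	if (setsockopt(fd, SOL_SOCKET, SO_SNDTIMEO, &tv, sizeof(tv)) == -1)
		return (-1);
	return (0);
}

/*
 * Hypothetical demonstration: create a connected socket pair, set a
 * timeout on one end, and wait for data that the other end never sends.
 * Returns the errno reported by recv(), 0 if data unexpectedly arrived,
 * or -1 if the sockets could not be set up.
 */
int
recv_from_silent_peer(int seconds)
{
	int sv[2], error;
	char buf[16];

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
		return (-1);
	if (set_socket_timeouts(sv[0], seconds) == -1) {
		close(sv[0]);
		close(sv[1]);
		return (-1);
	}
	error = 0;
	/* Blocks until the receive timeout expires, then fails. */
	if (recv(sv[0], buf, sizeof(buf), 0) == -1)
		error = errno;
	close(sv[0]);
	close(sv[1]);
	return (error);
}
```

Calling recv_from_silent_peer(1) blocks for about one second and then
returns EAGAIN instead of hanging forever, which corresponds to the
"Unable to receive reply header" lines above.
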
After restoring the network connection, the primary reconnects to the
secondary and the status changes back from "degraded" to "complete".

Thank you.

-- 
Mikolaj Golub