From owner-freebsd-fs@FreeBSD.ORG Fri Apr 23 06:29:56 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 34E871065675 for ; Fri, 23 Apr 2010 06:29:56 +0000 (UTC) (envelope-from pjd@garage.freebsd.pl) Received: from mail.garage.freebsd.pl (chello089077043238.chello.pl [89.77.43.238]) by mx1.freebsd.org (Postfix) with ESMTP id 8FCD28FC16 for ; Fri, 23 Apr 2010 06:29:55 +0000 (UTC) Received: by mail.garage.freebsd.pl (Postfix, from userid 65534) id 0A6C545C99; Fri, 23 Apr 2010 08:29:54 +0200 (CEST) Received: from localhost (chello089077043238.chello.pl [89.77.43.238]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mail.garage.freebsd.pl (Postfix) with ESMTP id 3B4BE45683; Fri, 23 Apr 2010 08:29:48 +0200 (CEST) Date: Fri, 23 Apr 2010 08:29:50 +0200 From: Pawel Jakub Dawidek To: Mikolaj Golub Message-ID: <20100423062950.GD1670@garage.freebsd.pl> References: <86r5m9dvqf.fsf@zhuzha.ua1> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="YToU2i3Vx8H2dn7O" Content-Disposition: inline In-Reply-To: <86r5m9dvqf.fsf@zhuzha.ua1> User-Agent: Mutt/1.4.2.3i X-PGP-Key-URL: http://people.freebsd.org/~pjd/pjd.asc X-OS: FreeBSD 9.0-CURRENT i386 X-Spam-Checker-Version: SpamAssassin 3.0.4 (2005-06-05) on mail.garage.freebsd.pl X-Spam-Level: X-Spam-Status: No, score=-0.6 required=4.5 tests=BAYES_00,RCVD_IN_SORBS_DUL autolearn=no version=3.0.4 Cc: freebsd-fs Subject: Re: HAST: primary might get stuck when there are connectivity problems with secondary X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 23 Apr 2010 06:29:56 -0000 --YToU2i3Vx8H2dn7O Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Wed, Apr 21, 2010 at 12:02:16PM +0300, Mikolaj Golub wrote: > Restoring network after this changes nothing. Primary gets stuck. No mess= ages > in the log and "dirty" in status output does not change: >=20 > [root@hasta ~]# hastctl status > storage: > role: primary > provname: storage > localpath: /dev/ad4 > extentsize: 2097152 > keepdirty: 64 > remoteaddr: 172.20.66.202 > replication: memsync > status: complete > dirty: 2097152 bytes >=20 > On the secondary we have all this time: >=20 > tcp4 0 0 172.20.66.202.8457 172.20.66.201.57596 ESTABLI= SHED > tcp4 0 0 172.20.66.202.8457 172.20.66.201.41841 ESTABLI= SHED >=20 > The last messages in log: >=20 > Apr 21 10:50:21 hasta hastd: [storage] (primary) ggate_recv: (0x28411bc0)= Request received from the kernel: READ(13565952, 65536). > Apr 21 10:50:21 hasta hastd: [storage] (primary) ggate_recv: (0x28411bc0)= Moving request to the send queue. > Apr 21 10:50:21 hasta hastd: [storage] (primary) ggate_recv: Taking free = request. > Apr 21 10:50:21 hasta hastd: [storage] (primary) ggate_recv: (0x28411b80)= Got free request. > Apr 21 10:50:21 hasta hastd: [storage] (primary) ggate_recv: (0x28411b80)= Waiting for request from the kernel. > Apr 21 10:50:21 hasta hastd: [storage] (primary) local_send: (0x28411bc0)= Got request. > Apr 21 10:50:21 hasta hastd: [storage] (primary) local_send: (0x28411bc0)= Moving request to the done queue. > Apr 21 10:50:21 hasta hastd: [storage] (primary) local_send: Taking reque= st. > Apr 21 10:50:21 hasta hastd: [storage] (primary) ggate_send: (0x28411bc0)= Got request. > Apr 21 10:50:21 hasta hastd: [storage] (primary) ggate_send: (0x28411bc0)= Moving request to the free queue. > Apr 21 10:50:21 hasta hastd: [storage] (primary) ggate_send: Taking reque= st. > Apr 21 10:51:00 hasta hastd: [storage] (primary) ggate_recv: (0x28411b80)= Request received from the kernel: READ(1812529152, 65536). > Apr 21 10:51:00 hasta hastd: [storage] (primary) ggate_recv: (0x28411b80)= Moving request to the send queue. > Apr 21 10:51:00 hasta hastd: [storage] (primary) ggate_recv: Taking free = request. > Apr 21 10:51:00 hasta hastd: [storage] (primary) ggate_recv: (0x28411b00)= Got free request. > Apr 21 10:51:00 hasta hastd: [storage] (primary) ggate_recv: (0x28411b00)= Waiting for request from the kernel. > Apr 21 10:51:00 hasta hastd: [storage] (primary) local_send: (0x28411b80)= Got request. > Apr 21 10:51:00 hasta hastd: [storage] (primary) local_send: (0x28411b80)= Moving request to the done queue. > Apr 21 10:51:00 hasta hastd: [storage] (primary) local_send: Taking reque= st. > Apr 21 10:51:00 hasta hastd: [storage] (primary) ggate_send: (0x28411b80)= Got request. > Apr 21 10:51:00 hasta hastd: [storage] (primary) ggate_send: (0x28411b80)= Moving request to the free queue. > Apr 21 10:51:00 hasta hastd: [storage] (primary) ggate_send: Taking reque= st. >=20 > The backtrace of gotten stuck hastd is in the attach. >=20 > I interpret this in the following way. Although the network is down > hast_proto_send() in remote_send_thread() returns success (sent data are > stored in the kernel buffer). Then kernel tries to send data and eventual= ly > fails after timeout and close the socket. hastd is not aware about this, > remote_send_thread() is blocked in "Taking request" at this time, sync th= read > is waiting for status from the secondary about sent data but secondary do= es > not send it because it did not receive any data. "Taking request" only means that HAST is waiting for a request. In case of ggate_send thread it means it is waiting for I/O request from the kernel. It stops here, because there is no new request. > Restarting hastd on the secondary usually helps. A workaround is to set > net.inet.tcp.keepidle to some small value (e.g. 300 sec) on the > secondary. Then the secondary will notice much earlier that the peer has > closed the connection and will restart the worker itself: >=20 > Apr 21 11:52:21 hastb hastd: [storage] (secondary) Unable to receive requ= est header: Connection reset by peer. > Apr 21 11:52:21 hastb hastd: [storage] (secondary) Worker process (pid=3D= 1398) exited ungracefully: status=3D19200. What you are observing here is because currently there is no keep-alive mechanism in hastd, so if there are no I/O requests, primary won't notice that secondary is down until next I/O request arrives. I don't consider it critical, but confusing. Note that when secondary goes down and there are no I/O requests to primary, there will be nothing to synchronize, so fast reconnect isn't crucial in this case. We need one I/O request coming to primary to discover that secondary is down and start reconnect loop. Could you confirm this is what you are observing? If not, what FreeBSD version do you use? If it is HEAD, do you have r205738? --=20 Pawel Jakub Dawidek http://www.wheelsystems.com pjd@FreeBSD.org http://www.FreeBSD.org FreeBSD committer Am I Evil? Yes, I Am! --YToU2i3Vx8H2dn7O Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.14 (FreeBSD) iEYEARECAAYFAkvRPl0ACgkQForvXbEpPzQ2BwCgz94Ij8Xsz+SdrF+OvT3IOfev Q8IAoIIFvxvBuCU6/mPquHRcuaiQgChh =Ab6v -----END PGP SIGNATURE----- --YToU2i3Vx8H2dn7O--