Date: Sat, 24 Apr 2010 14:33:53 +0300
From: Mikolaj Golub <to.my.trociny@gmail.com>
To: Pawel Jakub Dawidek <pjd@FreeBSD.org>
Cc: freebsd-fs <freebsd-fs@freebsd.org>
Subject: Re: HAST: primary might get stuck when there are connectivity problems with secondary
Message-ID: <868w8dgk4e.fsf@kopusha.onet>
In-Reply-To: <20100424073031.GD3067@garage.freebsd.pl> (Pawel Jakub Dawidek's message of "Sat, 24 Apr 2010 09:30:31 +0200")
References: <86r5m9dvqf.fsf@zhuzha.ua1> <20100423062950.GD1670@garage.freebsd.pl> <86k4rye33e.fsf@zhuzha.ua1> <20100424073031.GD3067@garage.freebsd.pl>

On Sat, 24 Apr 2010 09:30:31 +0200 Pawel Jakub Dawidek wrote:

> If secondary is not going to reply, hast_proto_recv_hdr() should
> eventually timeout. On timeout, connection should be closed and this
> requests (and all the others) should be moved to done queue.
>
> It doesn't timeout at all or maybe the timeout is too long?

After the "outage" we have:

on the primary:

tcp4       0      0 172.20.66.201.57596    172.20.66.202.8457     ESTABLISHED
tcp4       0      0 172.20.66.201.41841    172.20.66.202.8457     CLOSED

on the secondary:

tcp4       0      0 172.20.66.202.8457     172.20.66.201.57596    ESTABLISHED
tcp4       0      0 172.20.66.202.8457     172.20.66.201.41841    ESTABLISHED

So one of the connections (the one used by primary/remote_send_thread()) is
broken, although the secondary is not aware of this because it is sitting in
recv() at that time, and the second connection (used by
primary/remote_recv_thread()) is alive.

It does time out after net.inet.tcp.keepidle (which is 2 hours by default),
when the secondary starts to send keepalive packets. The secondary receives
an RST in response to its keepalive packet, recv() returns with an error, and
the worker is restarted.

As I wrote in my first letter, the workaround is to set net.inet.tcp.keepidle
to some small value on the secondary so that it notices a broken connection
much earlier.

From the code I don't see how hast_proto_recv_hdr() may time out if the
connection is alive. Have I missed something?

-- 
Mikolaj Golub
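
The point about recv() on a live but silent connection can be illustrated
with a small, generic sketch. This is not hastd code: it is written in Python
for brevity, recv_with_timeout() is a hypothetical helper, and a local socket
pair stands in for the primary/secondary link. It only shows that a blocking
receive gives up on an idle peer when a per-socket timeout has been set
explicitly; without one, the call would wait until the peer sends data or the
connection is torn down.

```python
import socket


def recv_with_timeout(sock, nbytes, timeout_s):
    """Hypothetical helper: return received bytes, or None on timeout."""
    # Without an explicit timeout, recv() on a healthy but idle
    # connection blocks until data arrives or the connection fails.
    sock.settimeout(timeout_s)
    try:
        return sock.recv(nbytes)
    except socket.timeout:
        return None


# A connected local pair stands in for the two ends of the link.
primary, secondary = socket.socketpair()

# The peer is alive but silent: the receive gives up after the timeout.
idle_result = recv_with_timeout(primary, 16, 0.2)

# The peer replies: the data is returned before the timeout expires.
secondary.sendall(b"ack")
reply = recv_with_timeout(primary, 16, 0.2)

primary.close()
secondary.close()
```

In C, a comparable effect is usually obtained with the SO_RCVTIMEO socket
option or by waiting in select()/poll() before calling recv(). Whether and
where hastd does this is exactly the open question above.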