From: Daniel Kalchev
Date: Fri, 03 Jun 2011 19:18:29 +0300
To: freebsd-stable@freebsd.org
Subject: Re: HAST instability

Well, apparently my HAST joy was short. On a second run, I got stuck with

Jun 3 19:08:16 b1a hastd[1900]: [data2] (primary) Unable to receive reply header: Operation timed out.

on the primary. No messages on the secondary.

On primary:

# netstat -an | grep 8457
tcp4       0      0 10.2.101.11.42659      10.2.101.12.8457       FIN_WAIT_2
tcp4       0      0 10.2.101.11.62058      10.2.101.12.8457       CLOSE_WAIT
tcp4       0      0 10.2.101.11.34646      10.2.101.12.8457       FIN_WAIT_2
tcp4       0      0 10.2.101.11.11419      10.2.101.12.8457       CLOSE_WAIT
tcp4       0      0 10.2.101.11.37773      10.2.101.12.8457       FIN_WAIT_2
tcp4       0      0 10.2.101.11.21911      10.2.101.12.8457       FIN_WAIT_2
tcp4       0      0 10.2.101.11.40169      10.2.101.12.8457       CLOSE_WAIT
tcp4       0  97749 10.2.101.11.44360      10.2.101.12.8457       CLOSE_WAIT
tcp4       0      0 10.2.101.11.8457       *.*                    LISTEN

On secondary:

# netstat -an | grep 8457
tcp4       0      0 10.2.101.12.8457       10.2.101.11.42659      CLOSE_WAIT
tcp4       0      0 10.2.101.12.8457       10.2.101.11.62058      FIN_WAIT_2
tcp4       0      0 10.2.101.12.8457       10.2.101.11.34646      CLOSE_WAIT
tcp4       0      0 10.2.101.12.8457       10.2.101.11.11419      FIN_WAIT_2
tcp4       0      0 10.2.101.12.8457       10.2.101.11.37773      CLOSE_WAIT
tcp4       0      0 10.2.101.12.8457       10.2.101.11.21911      CLOSE_WAIT
tcp4       0      0 10.2.101.12.8457       10.2.101.11.40169      FIN_WAIT_2
tcp4   66415      0 10.2.101.12.8457       10.2.101.11.44360      FIN_WAIT_2
tcp4       0      0 10.2.101.12.8457       *.*                    LISTEN

On primary:

# hastctl status
data0:
  role: primary
  provname: data0
  localpath: /dev/gpt/data0
  extentsize: 2097152 (2.0MB)
  keepdirty: 64
  remoteaddr: 10.2.101.12
  sourceaddr: 10.2.101.11
  replication: fullsync
  status: complete
  dirty: 0 (0B)
data1:
  role: primary
  provname: data1
  localpath: /dev/gpt/data1
  extentsize: 2097152 (2.0MB)
  keepdirty: 64
  remoteaddr: 10.2.101.12
  sourceaddr: 10.2.101.11
  replication: fullsync
  status: complete
  dirty: 0 (0B)
data2:
  role: primary
  provname: data2
  localpath: /dev/gpt/data2
  extentsize: 2097152 (2.0MB)
  keepdirty: 64
  remoteaddr: 10.2.101.12
  sourceaddr: 10.2.101.11
  replication: fullsync
  status: complete
  dirty: 6291456 (6.0MB)
data3:
  role: primary
  provname: data3
  localpath: /dev/gpt/data3
  extentsize: 2097152 (2.0MB)
  keepdirty: 64
  remoteaddr: 10.2.101.12
  sourceaddr: 10.2.101.11
  replication: fullsync
  status: complete
  dirty: 0 (0B)

Sits in this state for over 10 minutes. Unfortunately, no KDB in kernel.

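
For anyone watching for the same hang, here is a minimal sketch (not part of HAST itself) that picks out resources with a non-zero dirty count from `hastctl status` text. It assumes the layout shown above, with a bare `name:` line per resource followed by one `key: value` field per line; the function name `dirty_resources` and the `SAMPLE` text are hypothetical and only for illustration.

```python
import re


def dirty_resources(status_text):
    """Return {resource: dirty_bytes} for resources whose dirty count is non-zero.

    Assumes the layout quoted in this post: a bare "name:" header line for each
    resource, followed by "key: value" lines, one of which is "dirty: N (...)".
    """
    result = {}
    current = None
    for raw in status_text.splitlines():
        line = raw.strip()
        # A resource header is a single token followed by a colon and nothing else.
        header = re.match(r"^(\S+):\s*$", line)
        if header:
            current = header.group(1)
            continue
        # The dirty field starts with the byte count, e.g. "dirty: 6291456 (6.0MB)".
        field = re.match(r"^dirty:\s*(\d+)", line)
        if field and current is not None:
            dirty = int(field.group(1))
            if dirty > 0:
                result[current] = dirty
    return result


# Abbreviated sample in the same shape as the status output in this post.
SAMPLE = """\
data0:
  role: primary
  status: complete
  dirty: 0 (0B)
data2:
  role: primary
  status: complete
  dirty: 6291456 (6.0MB)
"""

if __name__ == "__main__":
    print(dirty_resources(SAMPLE))
```

Fed the full status text, this would single out data2, the same resource named in the hastd timeout message.
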
Any ideas what else to look for?

Daniel