Date:      Fri, 03 Jun 2011 19:18:29 +0300
From:      Daniel Kalchev <daniel@digsys.bg>
To:        freebsd-stable@freebsd.org
Subject:   Re: HAST instability
Message-ID:  <4DE90955.9020505@digsys.bg>
In-Reply-To: <4DE8FE78.6070401@digsys.bg>
References:  <4DE21C64.8060107@digsys.bg>	<4DE3ACF8.4070809@digsys.bg>	<86d3j02fox.fsf@kopusha.home.net>	<4DE4E43B.7030302@digsys.bg>	<86zkm3t11g.fsf@in138.ua3>	<4DE5048B.3080206@digsys.bg> <4DE5D535.20804@digsys.bg> <4DE8FE78.6070401@digsys.bg>

Well, apparently my HAST joy was short-lived. On a second run, I got stuck with

Jun  3 19:08:16 b1a hastd[1900]: [data2] (primary) Unable to receive reply header: Operation timed out.

on the primary. No messages on the secondary.

On primary:

# netstat -an | grep 8457

tcp4       0      0 10.2.101.11.42659      10.2.101.12.8457       FIN_WAIT_2
tcp4       0      0 10.2.101.11.62058      10.2.101.12.8457       CLOSE_WAIT
tcp4       0      0 10.2.101.11.34646      10.2.101.12.8457       FIN_WAIT_2
tcp4       0      0 10.2.101.11.11419      10.2.101.12.8457       CLOSE_WAIT
tcp4       0      0 10.2.101.11.37773      10.2.101.12.8457       FIN_WAIT_2
tcp4       0      0 10.2.101.11.21911      10.2.101.12.8457       FIN_WAIT_2
tcp4       0      0 10.2.101.11.40169      10.2.101.12.8457       CLOSE_WAIT
tcp4       0  97749 10.2.101.11.44360      10.2.101.12.8457       CLOSE_WAIT
tcp4       0      0 10.2.101.11.8457       *.*                    LISTEN

On secondary:

# netstat -an | grep 8457

tcp4       0      0 10.2.101.12.8457       10.2.101.11.42659      CLOSE_WAIT
tcp4       0      0 10.2.101.12.8457       10.2.101.11.62058      FIN_WAIT_2
tcp4       0      0 10.2.101.12.8457       10.2.101.11.34646      CLOSE_WAIT
tcp4       0      0 10.2.101.12.8457       10.2.101.11.11419      FIN_WAIT_2
tcp4       0      0 10.2.101.12.8457       10.2.101.11.37773      CLOSE_WAIT
tcp4       0      0 10.2.101.12.8457       10.2.101.11.21911      CLOSE_WAIT
tcp4       0      0 10.2.101.12.8457       10.2.101.11.40169      FIN_WAIT_2
tcp4   66415      0 10.2.101.12.8457       10.2.101.11.44360      FIN_WAIT_2
tcp4       0      0 10.2.101.12.8457       *.*                    LISTEN
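
To compare the two sides without reading every line, the netstat output above can be tallied per TCP state and filtered for sockets with data still queued. This is only a rough sketch: the function names summarize_states and stuck_queues are made up for illustration and are not part of HAST, and it assumes the column layout shown above (protocol, Recv-Q, Send-Q, local address, foreign address, state).

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of HAST or FreeBSD.
# Both functions read `netstat -an` style output on stdin, so they also
# work on output saved to a file. The port defaults to 8457, the port
# used in the netstat commands above.

# Count TCP sockets per state for connections touching the given port.
summarize_states() {
    port="${1:-8457}"
    awk -v port="$port" '
        $1 ~ /^tcp/ && ($4 ~ ("\\." port "$") || $5 ~ ("\\." port "$")) {
            count[$6]++
        }
        END { for (s in count) printf "%s %d\n", s, count[s] }
    ' | sort
}

# List sockets on the given port that still have bytes in Recv-Q or Send-Q.
stuck_queues() {
    port="${1:-8457}"
    awk -v port="$port" '
        $1 ~ /^tcp/ && ($4 ~ ("\\." port "$") || $5 ~ ("\\." port "$")) &&
        ($2 + 0 > 0 || $3 + 0 > 0) {
            print $4, $5, $6, "recvq=" $2, "sendq=" $3
        }
    '
}
```

Usage would be along the lines of `netstat -an | summarize_states 8457` and `netstat -an | stuck_queues 8457` on each node.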

On primary:

# hastctl status
data0:
   role: primary
   provname: data0
   localpath: /dev/gpt/data0
   extentsize: 2097152 (2.0MB)
   keepdirty: 64
   remoteaddr: 10.2.101.12
   sourceaddr: 10.2.101.11
   replication: fullsync
   status: complete
   dirty: 0 (0B)
data1:
   role: primary
   provname: data1
   localpath: /dev/gpt/data1
   extentsize: 2097152 (2.0MB)
   keepdirty: 64
   remoteaddr: 10.2.101.12
   sourceaddr: 10.2.101.11
   replication: fullsync
   status: complete
   dirty: 0 (0B)
data2:
   role: primary
   provname: data2
   localpath: /dev/gpt/data2
   extentsize: 2097152 (2.0MB)
   keepdirty: 64
   remoteaddr: 10.2.101.12
   sourceaddr: 10.2.101.11
   replication: fullsync
   status: complete
   dirty: 6291456 (6.0MB)
data3:
   role: primary
   provname: data3
   localpath: /dev/gpt/data3
   extentsize: 2097152 (2.0MB)
   keepdirty: 64
   remoteaddr: 10.2.101.12
   sourceaddr: 10.2.101.11
   replication: fullsync
   status: complete
   dirty: 0 (0B)

It has been sitting in this state for over 10 minutes.

Unfortunately, there is no KDB in the kernel. Any ideas what else to look for?

Daniel


