From: Daniel Kalchev
Date: Fri, 03 Jun 2011 18:32:08 +0300
To: freebsd-stable@freebsd.org
Subject: Re: HAST instability
Message-ID: <4DE8FE78.6070401@digsys.bg>
In-Reply-To: <4DE5D535.20804@digsys.bg>

Decided to apply the patch proposed in -current by Mikolaj Golub:

http://people.freebsd.org/~trociny/uipc_socket.c.patch

This apparently fixed my issue as well.

Running without checksums for a full bonnie++ run (~100GB write/rewrite) produced no disconnects and no stalls, and reached up to 280MB/sec (4 drives in a striped zpool). Interestingly, the HAST devices' write latency, as observed by gstat, was under 30ms.

I believe this fix should be committed.

Here are the accumulated netstat -s outputs from both hosts, for comparison with previous runs. Retransmits etc. are far fewer.

http://news.digsys.bg/~admin/hast/test3jun-fix/b1a-netstat-s
http://news.digsys.bg/~admin/hast/test3jun-fix/b1b-netstat-s
http://news.digsys.bg/~admin/hast/test3jun-fix/b1b-systat-if-fix

Before applying the patch I verified that there were no network problems. I created a 1TB file from /dev/random on the first host and copied it over to the second host with ftp. Transfer speed was low, at 80MB/sec -- ftp would utilize one CPU core 100% at the receiving node. I then calculated md5 checksums on both sides, and they matched.

Daniel