From: "Michael W. Lucas"
To: fs@freebsd.org
Date: Mon, 11 Oct 2010 11:30:51 -0400
Subject: hast crash
Message-ID: <20101011153051.GA15699@bewilderbeast.blackhelicopters.org>

Hi,

I upgraded my HAST cluster to 8.1-stable on 6 October 2010, and am now
experiencing crashes in hastd. hastd debug output is showing:

...
[DEBUG][2] [mirror] (secondary) recv: (0x8013ecc40) Got request header: WRITE(11752701952, 131072).
[DEBUG][2] [mirror] (secondary) recv: (0x8013ecc40) Moving request to the disk queue.
[DEBUG][2] [mirror] (secondary) disk: (0x8013ecc40) Got request: WRITE(11752701952, 131072).
[DEBUG][2] [mirror] (secondary) recv: Taking free request.
[DEBUG][2] [mirror] (secondary) recv: (0x8013ecbf0) Got request.
[ERROR] [mirror] (secondary) Unable to receive request header: RPC version wrong.
[DEBUG][1] Unable to receive event header: Socket is not connected.
[DEBUG][1] Accepting connection to tcp4://0.0.0.0:8457.
[INFO] Connection from tcp4://192.168.0.1:21493 to tcp4://192.168.0.2:8457.
[DEBUG][2] tcp4://192.168.0.1:21493: resource=mirror
[DEBUG][1] [mirror] (secondary) Initial connection from tcp4://192.168.0.1:21493.
[DEBUG][1] [mirror] (secondary) Worker process exists (pid=8826), stopping it.
[ERROR] [mirror] (secondary) Worker process exited ungracefully (pid=8826, exitcode=75).
Assertion failed: (conn != NULL), function proto_close, file /usr/src/sbin/hastd/proto.c, line 287.
Abort (core dumped)

Both machines are running on VMware ESXi. The second machine is a
clone of the first.

Any thoughts, folks?

Thanks,
==ml

-- 
Michael W. Lucas	mwlucas@BlackHelicopters.org
http://www.MichaelWLucas.com/, http://blather.MichaelWLucas.com/
New book available: Network Flow Analysis http://www.networkflowanalysis.com/