From owner-freebsd-fs@FreeBSD.ORG Sun Apr 11 09:12:16 2010
To: freebsd-fs@FreeBSD.org
Organization: Home
From: Mikolaj Golub
Date: Sun, 11 Apr 2010 12:12:10 +0300
Message-ID: <86pr265p5h.fsf@kopusha.onet>
User-Agent: Gnus/5.11 (Gnus v5.11) Emacs/22.3 (berkeley-unix)
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="=-=-="
Subject: hastd: socket leakage on worker exit

--=-=-=

Hi,

While playing with HAST, I noticed the following issue with hastd:

....
Apr 10 22:32:36 hastb hastd: [storage] (secondary) Split-brain detected, exiting.
Apr 10 22:32:36 hastb hastd: [storage] (secondary) Worker process failed (pid=6474, status=78).
Apr 10 22:32:56 hastb hastd: [storage] (secondary) Split-brain detected, exiting.
Apr 10 22:32:56 hastb hastd: [storage] (secondary) Worker process failed (pid=6475, status=78).
Apr 10 22:32:56 hastb hastd: [storage] (secondary) Unable to create control sockets be: Too many open files

And sockstat shows:

root     hastd      711   4  stream /var/run/hastctl
root     hastd      711   5  tcp4   *:8457                *:*
root     hastd      711   7  dgram  -> /var/run/logpriv
root     hastd      711   8  stream (not connected)
root     hastd      711   9  stream -> ??
root     hastd      711   10 stream -> ??
root     hastd      711   12 stream -> ??
root     hastd      711   13 stream -> ??
root     hastd      711   14 stream -> ??
root     hastd      711   15 stream -> ??
[ ... and so on .. ]

Each time a worker exits, the parent keeps its end of the control socket
open, so descriptors accumulate until the limit is reached.

The patch below fixes the issue.

-- 
Mikolaj Golub

--=-=-=
Content-Type: text/x-diff
Content-Disposition: inline; filename=hastd.c.close_on_child_exit.patch

--- sbin/hastd/hastd.c.orig	2010-04-11 11:52:10.000000000 +0300
+++ sbin/hastd/hastd.c	2010-04-11 11:51:23.000000000 +0300
@@ -138,6 +138,7 @@ child_exit(void)
 			    (unsigned int)pid, WEXITSTATUS(status));
 		}
 		res->hr_workerpid = 0;
+		proto_close(res->hr_ctrl);
 		if (res->hr_role == HAST_ROLE_PRIMARY) {
 			sleep(1);
 			pjdlog_info("Restarting worker process.");

--=-=-=--
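
For readers who want to see this class of leak in isolation, here is a
minimal sketch. It is not hastd code: struct resource, worker_start(),
worker_reap(), count_open_fds() and leaked_after() are hypothetical names
made up for illustration, and plain socketpair(2)/close(2) stand in for
hastd's proto layer. It only assumes that the parent holds one end of a
per-worker control socket, as the sockstat output above suggests.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/wait.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical stand-in for a hastd resource: only the fields used here. */
struct resource {
	pid_t	workerpid;	/* 0 when no worker is running */
	int	ctrl;		/* parent's end of the control socket, -1 if none */
};

/* Count descriptors currently open in this process (probes fds 0..1023). */
int
count_open_fds(void)
{
	int fd, n = 0;

	for (fd = 0; fd < 1024; fd++)
		if (fcntl(fd, F_GETFD) != -1)
			n++;
	return (n);
}

/* Start a worker that exits at once with status 78, as in the log above. */
int
worker_start(struct resource *res)
{
	int sv[2];

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
		return (-1);
	res->workerpid = fork();
	if (res->workerpid == -1) {
		close(sv[0]);
		close(sv[1]);
		return (-1);
	}
	if (res->workerpid == 0) {
		close(sv[0]);	/* child drops the parent's end */
		_exit(78);
	}
	close(sv[1]);		/* parent keeps only its own end */
	res->ctrl = sv[0];
	return (0);
}

/* Reap the worker; close the control socket only if close_ctrl is set. */
void
worker_reap(struct resource *res, int close_ctrl)
{
	int status;

	(void)waitpid(res->workerpid, &status, 0);
	res->workerpid = 0;
	if (close_ctrl && res->ctrl != -1) {
		close(res->ctrl);	/* the step the patch adds, in spirit */
		res->ctrl = -1;
	}
}

/* Run n start/reap cycles and return how many descriptors were leaked. */
int
leaked_after(int n, int close_ctrl)
{
	struct resource res = { 0, -1 };
	int before = count_open_fds();
	int i;

	for (i = 0; i < n; i++) {
		if (worker_start(&res) == -1)
			break;
		worker_reap(&res, close_ctrl);
	}
	return (count_open_fds() - before);
}
```

Calling leaked_after(5, 0) should leave one descriptor behind per cycle,
since each new socketpair overwrites res.ctrl without closing the old one,
while leaked_after(5, 1) should leave none. In a long-running daemon the
first pattern eventually ends in "Too many open files".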