From owner-freebsd-fs@FreeBSD.ORG Sat Oct 2 16:26:06 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id D5E14106564A; Sat, 2 Oct 2010 16:26:06 +0000 (UTC) (envelope-from to.my.trociny@gmail.com) Received: from mail-fx0-f54.google.com (mail-fx0-f54.google.com [209.85.161.54]) by mx1.freebsd.org (Postfix) with ESMTP id 3872D8FC17; Sat, 2 Oct 2010 16:26:05 +0000 (UTC) Received: by fxm9 with SMTP id 9so3354844fxm.13 for ; Sat, 02 Oct 2010 09:26:05 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:received:from:to:cc:subject:references :x-comment-to:date:in-reply-to:message-id:user-agent:mime-version :content-type; bh=xcK9desYQpRHqcDeASxNdUusoVhCZM3HTGqO7r6gcLY=; b=q2MR3fEFZeIwGGqzSgMLaObtJkxNMGVrf4r8pv9vxQz17f7sLxS9NYwCq1mWHrxuEK 0TrzyBVJMpD1M6jR9pu7AeY7YNIarFElYCoLaNsCrc+/pGfiRuFa0mOEd+7Z+mShXo1t uKmC26r3OO/KaI0zd5Fqv7kVYoXo1iW4cX0Ew= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=from:to:cc:subject:references:x-comment-to:date:in-reply-to :message-id:user-agent:mime-version:content-type; b=fUwvsbBB0ZEzqYWOIgbeZdCavU6uQd/chjWIXabeFgvwl/LSN9mFfKEL/8HiKvJ6OP ePWYHTNreUCYBXpetIruK5j8yH7R61gUQIUDDus0eNci4vQ+OB+y9uiy6vU8+cvf8Zuw Z2z/4pmbz/ZAqxEtrfSbYZR0wm6L4MYNzQ+bA= Received: by 10.223.103.84 with SMTP id j20mr6907188fao.35.1286036765051; Sat, 02 Oct 2010 09:26:05 -0700 (PDT) Received: from localhost ([95.69.162.97]) by mx.google.com with ESMTPS id h12sm1250305faa.13.2010.10.02.09.26.03 (version=TLSv1/SSLv3 cipher=RC4-MD5); Sat, 02 Oct 2010 09:26:04 -0700 (PDT) From: Mikolaj Golub To: freebsd-fs@freebsd.org References: <86hbh44wgl.fsf@kopusha.home.net> X-Comment-To: Mikolaj Golub Date: Sat, 02 Oct 2010 19:26:05 +0300 In-Reply-To: <86hbh44wgl.fsf@kopusha.home.net> (Mikolaj Golub's message of "Sat, 02 Oct 2010 15:20:58 +0300") Message-ID: <86aamw4l42.fsf@kopusha.home.net> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/23.2 (berkeley-unix) MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="=-=-=" Cc: pjd@freebsd.org Subject: Re: hastd: assertion (res->hr_event != NULL) fails in secondary on split-brain X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 02 Oct 2010 16:26:07 -0000 --=-=-= On Sat, 02 Oct 2010 15:20:58 +0300 Mikolaj Golub wrote: MG> After recent changes in hastd (I think r213006: Fix descriptor leaks) if MG> split-brain occurs hastd will abort in child_cleanup() on assertion MG> (res->hr_event != NULL). ... MG> So we have double close of res->hr_event. The first time it is closed when MG> parent detects that worker exited in main_loop(), and the second time when a MG> new connection from primary comes and the parent does cleanup after previously MG> terminated child before starting new one. MG> The straightforward fix is to check res->hr_event before closing, like in the MG> patch below. MG> -- MG> Mikolaj Golub MG> Index: sbin/hastd/control.c MG> =================================================================== MG> --- sbin/hastd/control.c (revision 213357) MG> +++ sbin/hastd/control.c (working copy) MG> @@ -58,8 +58,10 @@ child_cleanup(struct hast_resource *res) MG> MG> proto_close(res->hr_ctrl); MG> res->hr_ctrl = NULL; MG> - proto_close(res->hr_event); MG> - res->hr_event = NULL; MG> + if (res->hr_event != NULL) { MG> + proto_close(res->hr_event); MG> + res->hr_event = NULL; MG> + } MG> res->hr_workerpid = 0; MG> } MG> Running with this fix another issue is observed. On split-brain `hastctl status' on secondary will return "[ERROR] Error 32 received from hastd" most of the times. And only for some runs an output will be returned. lolek# hastctl status storage [ERROR] Error 32 received from hastd. lolek# hastctl status storage [ERROR] Error 32 received from hastd. lolek# hastctl status storage storage: role: secondary provname: storage localpath: /dev/ad4 extentsize: 2097152 keepdirty: 0 remoteaddr: tcp4://bolek replication: memsync status: complete dirty: 0 bytes lolek# hastctl status storage [ERROR] Error 32 received from hastd. This is because hastd clears res->hr_workerpid only when a new connection from the primary comes. Whilst hastd checks res->hr_workerpid in control_status() and if it is not zero it tries to get info from the worker and returns error (broken pipe) if the worker is actually not running. So it looks like it is better not just to close res->hr_ctrl in main_loop() but to do full child cleanup here -- straight away its exit is detected. What do you think about the attached patch? -- Mikolaj Golub --=-=-= Content-Type: text/x-patch Content-Disposition: attachment; filename=hast.child_kill.patch Index: sbin/hastd/hastd.c =================================================================== --- sbin/hastd/hastd.c (revision 213357) +++ sbin/hastd/hastd.c (working copy) @@ -94,22 +94,6 @@ g_gate_load(void) } static void -child_exit_log(unsigned int pid, int status) -{ - - if (WIFEXITED(status) && WEXITSTATUS(status) == 0) { - pjdlog_debug(1, "Worker process exited gracefully (pid=%u).", - pid); - } else if (WIFSIGNALED(status)) { - pjdlog_error("Worker process killed (pid=%u, signal=%d).", - pid, WTERMSIG(status)); - } else { - pjdlog_error("Worker process exited ungracefully (pid=%u, exitcode=%d).", - pid, WIFEXITED(status) ? WEXITSTATUS(status) : -1); - } -} - -static void child_exit(void) { struct hast_resource *res; @@ -388,8 +372,6 @@ listen_accept(void) const unsigned char *token; char laddr[256], raddr[256]; size_t size; - pid_t pid; - int status; proto_local_address(cfg->hc_listenconn, laddr, sizeof(laddr)); pjdlog_debug(1, "Accepting connection to %s.", laddr); @@ -504,26 +486,7 @@ listen_accept(void) "Worker process exists (pid=%u), stopping it.", (unsigned int)res->hr_workerpid); /* Stop child process. */ - if (kill(res->hr_workerpid, SIGINT) < 0) { - pjdlog_errno(LOG_ERR, - "Unable to stop worker process (pid=%u)", - (unsigned int)res->hr_workerpid); - /* - * Other than logging the problem we - * ignore it - nothing smart to do. - */ - } - /* Wait for it to exit. */ - else if ((pid = waitpid(res->hr_workerpid, - &status, 0)) != res->hr_workerpid) { - /* We can only log the problem. */ - pjdlog_errno(LOG_ERR, - "Waiting for worker process (pid=%u) failed", - (unsigned int)res->hr_workerpid); - } else { - child_exit_log(res->hr_workerpid, status); - } - child_cleanup(res); + child_kill(res); } else if (res->hr_remotein != NULL) { char oaddr[256]; @@ -678,8 +641,8 @@ main_loop(void) if (event_recv(res) == 0) continue; /* The worker process exited? */ - proto_close(res->hr_event); - res->hr_event = NULL; + if (res->hr_workerpid != 0) + child_kill(res); } } } Index: sbin/hastd/control.c =================================================================== --- sbin/hastd/control.c (revision 213357) +++ sbin/hastd/control.c (working copy) @@ -63,6 +63,51 @@ child_cleanup(struct hast_resource *res) res->hr_workerpid = 0; } +void +child_exit_log(unsigned int pid, int status) +{ + + if (WIFEXITED(status) && WEXITSTATUS(status) == 0) { + pjdlog_debug(1, "Worker process exited gracefully (pid=%u).", + pid); + } else if (WIFSIGNALED(status)) { + pjdlog_error("Worker process killed (pid=%u, signal=%d).", + pid, WTERMSIG(status)); + } else { + pjdlog_error("Worker process exited ungracefully (pid=%u, exitcode=%d).", + pid, WIFEXITED(status) ? WEXITSTATUS(status) : -1); + } +} + +void +child_kill(struct hast_resource *res) +{ + pid_t pid; + int status; + + assert(res->hr_workerpid != 0); + + if (kill(res->hr_workerpid, SIGINT) < 0) { + pjdlog_errno(LOG_ERR, + "Unable to stop worker process (pid=%u)", + (unsigned int)res->hr_workerpid); + /* + * Other than logging the problem we + * ignore it - nothing smart to do. + */ + } + /* Wait for it to exit. */ + else if ((pid = waitpid(res->hr_workerpid, + &status, 0)) != res->hr_workerpid) { + /* We can only log the problem. */ + pjdlog_errno(LOG_ERR, + "Waiting for worker process (pid=%u) failed", + (unsigned int)res->hr_workerpid); + } + child_exit_log(res->hr_workerpid, status); + child_cleanup(res); +} + static void control_set_role_common(struct hastd_config *cfg, struct nv *nvout, uint8_t role, struct hast_resource *res, const char *name, unsigned int no) @@ -107,22 +152,8 @@ control_set_role_common(struct hastd_config *cfg, * If previous role was primary or secondary we have to kill process * doing that work. */ - if (res->hr_workerpid != 0) { - if (kill(res->hr_workerpid, SIGTERM) < 0) { - pjdlog_errno(LOG_WARNING, - "Unable to kill worker process %u", - (unsigned int)res->hr_workerpid); - } else if (waitpid(res->hr_workerpid, NULL, 0) != - res->hr_workerpid) { - pjdlog_errno(LOG_WARNING, - "Error while waiting for worker process %u", - (unsigned int)res->hr_workerpid); - } else { - pjdlog_debug(1, "Worker process %u stopped.", - (unsigned int)res->hr_workerpid); - } - child_cleanup(res); - } + if (res->hr_workerpid != 0) + child_kill(res); /* Start worker process if we are changing to primary. */ if (role == HAST_ROLE_PRIMARY) Index: sbin/hastd/control.h =================================================================== --- sbin/hastd/control.h (revision 213357) +++ sbin/hastd/control.h (working copy) @@ -39,6 +39,8 @@ struct hastd_config; struct hast_resource; void child_cleanup(struct hast_resource *res); +void child_exit_log(unsigned int pid, int status); +void child_kill(struct hast_resource *res); void control_set_role(struct hast_resource *res, uint8_t role); --=-=-=--