From: Frode Nordahl
To: current@freebsd.org
Date: Thu, 5 Feb 2004 18:59:22 +0100
Subject: Re: rpc.lockd(8) seg faults on 5.2-RELEASE
Message-Id: <0703C4CC-5805-11D8-951F-000A95A9A574@nordahl.net>

Hello,

I have an update on the rpc.lockd "hang" issue. Whenever I see it hang, I
kill it with kill -SEGV to get a core dump before restarting it. In one of
the dumps I found this:

(gdb) print *blockedlocklist_head->lh_first
$1 = {nfslocklist = {le_next = 0x8099000, le_prev = 0x8099000},
  filehandle = {fh_fsid = {val = {1074502253, -394432445}},
    fh_fid = {fid_len = 12, fid_reserved = 0,
      fid_data = "?\\@\0r?\202[\0\0\0\0\0\0\0"}}, addr = 0x80751e0,
  client = {exclusive = 1, svid = 19869, oh = {n_len = 24,
      n_bytes = 0x8056520 "19869@mail7.powertech.no", '?' ...},
    l_offset = 0, l_len = 0},
  client_cookie = {n_len = 4, n_bytes = 0x8075290 "?\221K\001", '?' , "udp6"},
  client_name = "mail7.powertech.no", '\0' , nsm_status = 0, status = 0,
  flags = 6, blocking = 0, locker = 0, fd = 0}
(gdb)

Looking at retry_blockingfilelocklist(), this kind of data in
blockedlocklist_head would most likely make it loop forever. I simulated
this behaviour in my own program as well. But how did le_next end up equal
to le_prev?

I also found this in send_granted():

lockd_lock.c:2161
        debuglog("About to send granted on blocked lock\n");
        sleep(1);
        debuglog("Blowing off return send\n");

Does anyone know what the sleep(1) is good for here?

Best regards,
Frode
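
The endless-loop behaviour described above can be reproduced with a small sketch. It assumes, hypothetically, that 0x8099000 is the address of the list entry itself, so that le_next points back at the same element; the dump does not confirm this. The names lock_entry, lock_list, walk_bounded and demo_steps are made up for illustration and are not taken from lockd_lock.c. The walk is capped so that the demonstration terminates.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/queue.h>

/* Hypothetical stand-in for the real lock structure: only the list
 * linkage matters here (the dump calls this field nfslocklist). */
struct lock_entry {
    LIST_ENTRY(lock_entry) link;
    int id;
};

LIST_HEAD(lock_list, lock_entry);

/* Walk the list the way a plain traversal would, but give up after
 * `limit` steps. Returns the number of entries visited. */
static int walk_bounded(struct lock_list *head, int limit)
{
    struct lock_entry *e;
    int steps = 0;

    assert(head != NULL);
    for (e = LIST_FIRST(head); e != NULL && steps < limit;
         e = LIST_NEXT(e, link))
        steps++;
    return steps;
}

/* Build a one-element list, optionally corrupt it so the entry links
 * to itself, and report how many steps a traversal takes. */
static int demo_steps(int self_link, int limit)
{
    struct lock_list head;
    struct lock_entry a;

    a.id = 1;
    LIST_INIT(&head);
    LIST_INSERT_HEAD(&head, &a, link);

    if (self_link)
        a.link.le_next = &a;    /* assumption: simulated corruption */

    return walk_bounded(&head, limit);
}
```

Calling demo_steps(0, 1000) visits the single entry and stops, while demo_steps(1, 1000) only stops because of the cap. A traversal without such a cap would never see a NULL le_next and would spin forever, which matches the hang described in the message.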