From owner-freebsd-current Sat Sep 28 02:12:24 1996
Return-Path: owner-current
Received: (from root@localhost)
        by freefall.freebsd.org (8.7.5/8.7.3) id CAA02130
        for current-outgoing; Sat, 28 Sep 1996 02:12:24 -0700 (PDT)
Received: from alpo.whistle.com (s205m1.whistle.com [207.76.205.1])
        by freefall.freebsd.org (8.7.5/8.7.3) with ESMTP id CAA02073
        for ; Sat, 28 Sep 1996 02:12:17 -0700 (PDT)
Received: from current1.whistle.com (current1.whistle.com [207.76.205.22])
        by alpo.whistle.com (8.7.5/8.7.3) with SMTP id CAA15432
        for ; Sat, 28 Sep 1996 02:11:00 -0700 (PDT)
Message-ID: <324CEB39.41C67EA6@whistle.com>
Date: Sat, 28 Sep 1996 02:09:13 -0700
From: Julian Elischer
Organization: Whistle Communications
X-Mailer: Mozilla 3.0b6 (X11; I; FreeBSD 2.2-CURRENT i386)
MIME-Version: 1.0
To: current@FreeBSD.org
Subject: Re: Locking snafu in -current
References: <324CDD35.167EB0E7@whistle.com>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-current@FreeBSD.org
X-Loop: FreeBSD.org
Precedence: bulk

Julian Elischer wrote:
>
>

This seems to have been present for a long time..

looking at the hung process

  447 sendmail 447 177 177 177 -1,-1 noflags -1,-1 0,369364 2,336634 lockf 0 0

we see it is waiting on a lock structure.. (lockf lock string)
in the proc structure:

  p_wchan = 0xf0ae58c0,
  p_wmesg = 0xf01e17d4 "lockf",

print *(struct lockf *) 0xf0ae58c0
yields

$20 = {
  lf_flags = 48,
  lf_type = 3,
  lf_start = 0x0000000000000000,
  lf_end = 0xffffffffffffffff,
  lf_id = 0xf0ad0cc0 "\200^\n@\001",
  lf_head = 0xf0a9d530,
  lf_next = 0xf0a9f900,
  lf_block = 0xf0ad8080
}

following lf_head yields:

$21 = (struct lockf *) 0xf0a9f900

which is the same as "lf_next" (?)
but backtracking through "lf_block" yields (in sequence):

  lf_head = 0xf0a9d530, lf_next = 0xf0a9f900, lf_block = 0xf0aebf00
  lf_head = 0xf0a9d530, lf_next = 0xf0a9f900, lf_block = 0xf0af4940
  lf_head = 0xf0a9d530, lf_next = 0xf0a9f900, lf_block = 0xf0af4100
  lf_head = 0xf0a9d530, lf_next = 0xf0a9f900, lf_block = 0xf0af9c00
  lf_head = 0xf0a9d530, lf_next = 0xf0a9f900, lf_block = 0xf0af9640
  lf_head = 0xf0a9d530, lf_next = 0xf0a9f900, lf_block = 0xf0afcc00
  lf_head = 0xf0a9d530, lf_next = 0xf0a9f900, lf_block = 0xf0afcd40

$29 = {
  lf_flags = 48,
  lf_type = 3,
  lf_start = 0x0000000000000000,
  lf_end = 0xffffffffffffffff,
  lf_id = 0xf0afca00 "",
  lf_head = 0xf0a9d530,
  lf_next = 0xf0a9f900,
  lf_block = 0x0
}

Now we have reached the head of the locking chain..
Now we notice that ALL of these lock structures are pointing to
lf_next = 0xf0a9f900, which is where the original 'head' was pointing.
If we take the hint and go there.. we see:

$30 = {
  lf_flags = 48,
  lf_type = 3,
  lf_start = 0x0000000000000000,
  lf_end = 0xffffffffffffffff,
  lf_id = 0xf0a9fb80 "\200",
  lf_head = 0xf0a9d530,
  lf_next = 0x0,
  lf_block = 0xf0a9f200
}

So if we start at the head, there is no forward pointer to all those
lock structures we back-tracked through..

This is all very confusing, as the lf_id is different for all of these
as well. It doesn't make sense to me..

If anyone has a suggestion as to what to look at next, I'll leave the
machine in the debugger and can look at anything over the weekend....

There's definitely a bug, but I don't know what. Possibly it's in
sendmail, but.. who knows.

julian
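
For anyone who wants to reason about the pointer shape in the dump
without a live kernel, here is a minimal sketch in C. It is not the
kernel's code: struct lock_node and all three functions are
hypothetical stand-ins, and only the roles of lf_next and lf_block are
modeled. It builds nine structures chained through "block" that all
share one "next" target, as in the dump, and checks that none of them
can be reached by walking "next" from that target. The target's own
lf_block (0xf0a9f200 in the dump) is left out as a simplification.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct lockf: only the two
 * link fields discussed in the message are modeled. */
struct lock_node {
    struct lock_node *next;   /* plays the role of lf_next  */
    struct lock_node *block;  /* plays the role of lf_block */
};

/* Count the nodes visited by following `block` from `start` until NULL.
 * `limit` bounds the walk so a cyclic chain cannot loop forever;
 * returns -1 if more than `limit` nodes are seen. */
int block_chain_length(const struct lock_node *start, int limit)
{
    int n = 0;
    for (const struct lock_node *p = start; p != NULL; p = p->block) {
        if (++n > limit)
            return -1;
    }
    return n;
}

/* Return 1 if `target` is found by following `next` from `head`,
 * 0 otherwise (including when the walk exceeds `limit` steps). */
int reachable_via_next(const struct lock_node *head,
                       const struct lock_node *target, int limit)
{
    int n = 0;
    for (const struct lock_node *p = head; p != NULL; p = p->next) {
        if (p == target)
            return 1;
        if (++n > limit)
            return 0;
    }
    return 0;
}

/* Build the shape seen in the dump: `count` nodes chained through
 * `block`, every one with `next` pointing at the same `active` node,
 * whose own `next` is NULL. */
void build_dump_shape(struct lock_node *chain, int count,
                      struct lock_node *active)
{
    assert(chain != NULL && active != NULL && count > 0);
    active->next = NULL;
    active->block = NULL;   /* simplification, see the note above */
    for (int i = 0; i < count; i++) {
        chain[i].next = active;
        chain[i].block = (i + 1 < count) ? &chain[i + 1] : NULL;
    }
}
```

Calling build_dump_shape() on an array of nine nodes and then
reachable_via_next(&active, &chain[i], ...) for each i returns 0 every
time, which is the "no forward pointer" observation above. Whether that
shape is actually wrong for the real lockf code is a separate question
this sketch does not answer.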