Date: Mon, 15 Feb 1999 11:43:57 -0800 (PST)
From: Matthew Dillon <dillon@apollo.backplane.com>
To: FreeBSD-gnats-submit@FreeBSD.ORG
Subject: kern/10107: inode vs exec_map interlock
Message-ID: <199902151943.LAA18818@apollo.backplane.com>
>Number: 10107
>Category: kern
>Synopsis: interlock situation with exec_map and a program binary inode
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: freebsd-bugs
>State: open
>Quarter:
>Keywords:
>Date-Required:
>Class: sw-bug
>Submitter-Id: current-users
>Arrival-Date: Mon Feb 15 11:50:01 PST 1999
>Closed-Date:
>Last-Modified:
>Originator: Matthew Dillon
>Release: FreeBSD 4.0-CURRENT i386
>Organization:
none
>Environment:
Heavily loaded test machine artificially limited to 16MB of main
memory, NFS swap, running a buildworld -j10.
>Description:
I found an interesting interlock situation between what I believe to
be a program binary's inode and the exec_map. The machine locked up
trying to exec new programs.
This happened while running a make -j10 buildworld on a machine
configured with 16MB of RAM, while testing my new VM system. I don't
think the lockup is due to my VM system, though. It took 7 hours of
extremely heavy paging before the machine locked up.
When I broke into DDB and did a ps, all of the cc processes were
stuck in 'inode' wait, while a single ld process was stuck in
'thrd_sleep'.
I tracked 'thrd_sleep' down to a vm_map lock and the map down to
the exec_map. I tracked down the inode lock to the 'cc' program binary.
The inode had one shared lock and 7 waiters. The exec_map appears to
have one shared lock held with 6 waiters (but most of those waiters are
due to me trying to run other programs before breaking into DDB).
I am guessing that there is an interlock situation with exec_map and
a program inode where one process locks exec_map followed by the program
inode, and another locks the program inode followed by exec_map. But
I'm not familiar with that section of the code so I would appreciate
any help.
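The suspected lock-order reversal can be illustrated with a minimal
sketch. This is not kernel code and does not claim to show the actual
code paths involved: the class LockOrderChecker, its methods, and the
two acquisition sequences are hypothetical, and simply model the guess
above that one process takes exec_map then the inode while another
takes the inode then exec_map. Each "acquired B while holding A" event
is recorded as an edge A -> B, and a cycle in that graph means the two
orders can deadlock.

```python
from collections import defaultdict


class LockOrderChecker:
    """Hypothetical helper: records lock acquisition order and finds reversals."""

    def __init__(self):
        # edges[a] contains b if some process acquired b while holding a.
        self.edges = defaultdict(set)

    def record_sequence(self, locks):
        """Record one process acquiring `locks` in order, holding all of them."""
        for i, held in enumerate(locks):
            for later in locks[i + 1:]:
                self.edges[held].add(later)

    def find_cycle(self):
        """Return a list of lock names forming a cycle, or None if none exists."""
        state = {}   # node -> "active" (on the DFS stack) or "done"
        path = []    # current DFS stack, in visit order

        def visit(node):
            state[node] = "active"
            path.append(node)
            for nxt in sorted(self.edges.get(node, ())):
                if state.get(nxt) == "active":
                    # Back edge: the cycle runs from nxt to node and back to nxt.
                    return path[path.index(nxt):] + [nxt]
                if nxt not in state:
                    found = visit(nxt)
                    if found:
                        return found
            path.pop()
            state[node] = "done"
            return None

        for node in sorted(self.edges):
            if node not in state:
                found = visit(node)
                if found:
                    return found
        return None


if __name__ == "__main__":
    checker = LockOrderChecker()
    # Assumed orders, taken from the guess in the description above.
    checker.record_sequence(["exec_map", "inode"])   # process A
    checker.record_sequence(["inode", "exec_map"])   # process B
    print("cycle:", checker.find_cycle())
```

If both sequences are recorded in the same order (for example,
exec_map before the inode in every path), find_cycle() returns None,
which is the usual way such an interlock is removed.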
>How-To-Repeat:
The problem was found by running a make -j10 buildworld on a machine
artificially limited to 16MB of main memory, with NFS swap. The problem
occurred approximately 7 hours into the buildworld, so it is presumably
difficult to reproduce and is likely caused by a small race window.
>Fix:
Unknown as yet.
>Release-Note:
>Audit-Trail:
>Unformatted:
