Date: Thu, 24 Sep 1998 09:32:27 -0400 (EDT)
From: Luoqi Chen
Message-Id: <199809241332.JAA23662@lor.watermarkgroup.com>
To: archer@lucky.net, luoqi@watermarkgroup.com
Subject: Re: deadlock in vm_fault()
Cc: current@FreeBSD.ORG

> In article <199809232224.SAA17260@lor.watermarkgroup.com> you wrote:
> LC> I ran into a deadlock in vm_fault code today while making -j12 world.
> LC> It's caused by a reversed lock acquisition order. The normal order of
> LC> acquisition is vm map lock first and vnode lock next (if the fault is in
> LC> a vnode backed object). During the course of the fault handling, lock on
> LC> the vm map is released prior to paging io and has to be reacquired if it's
> LC> modified by another process during the io. Before reacquiring the lock of
> LC> vm map, we have to release the vnode lock we still hold, otherwise another
> LC> page fault in the same map/vnode would send us into a deadlock.
>
> LC> Attached is a fix for this problem. Would any of the vm/lock experts out
> LC> there review this? Thanks.
>
> I've seen something strange during the same -j12 buildworld. In fact,
> it was just that ld hang, apparently not doing anything. The rest
> of the system seemed to be alive. Though in some 3 or 4 hours machine
> rebooted (it had kernel with broken crash dump generation, so I do
> not know what actually happened).
>
> May it be related?
>
> LC> -lq
>
> ---
> Reality is an obstacle to hallucination.
>

It could very well be. What I saw on my machine was a deadlock between
the exec_map and the sh inode. All existing processes kept running
fine, but any attempt to fork() and then exec() a new image hung
waiting for the exec_map. Children of cron piled up as time went by,
and eventually that could kill the system (it's not clear how: I didn't
wait for it to happen, I went into the debugger, took a dump and
rebooted).

-lq
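
For readers following the thread, the lock-ordering rule described in the
quoted message (vm map lock first, vnode lock second, and drop the vnode
lock before retaking the map lock) can be sketched in user space. This is
only an illustration with pthread mutexes, not the vm_fault() code and not
the attached patch. The names map_lock, vnode_lock and simulate_fault are
hypothetical, and the "violation" counter is an assumption added to make
the ordering visible without actually deadlocking.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the vm map lock and the vnode lock. */
static pthread_mutex_t map_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t vnode_lock = PTHREAD_MUTEX_INITIALIZER;

/* Per-thread record of whether this thread currently holds vnode_lock. */
static _Thread_local bool holds_vnode = false;

/* Take the map lock; return 1 if that breaks the "map before vnode" order. */
static int lock_map(void)
{
    int violation = holds_vnode ? 1 : 0;
    pthread_mutex_lock(&map_lock);
    return violation;
}

static void unlock_map(void)
{
    pthread_mutex_unlock(&map_lock);
}

static void lock_vnode(void)
{
    pthread_mutex_lock(&vnode_lock);
    holds_vnode = true;
}

static void unlock_vnode(void)
{
    holds_vnode = false;
    pthread_mutex_unlock(&vnode_lock);
}

/*
 * Walk through one simulated fault and return the number of lock-order
 * violations seen. map_changed models "the map was modified during the
 * I/O"; release_vnode_first selects the ordering described as the fix.
 */
int simulate_fault(bool map_changed, bool release_vnode_first)
{
    int violations = 0;
    bool map_held = false;

    violations += lock_map();   /* normal order: map first ... */
    lock_vnode();               /* ... then vnode */
    unlock_map();               /* map lock dropped before paging I/O */

    /* Paging I/O would happen here, with only the vnode lock held. */

    if (map_changed) {
        if (release_vnode_first) {
            unlock_vnode();             /* give up vnode before the map */
            violations += lock_map();
            lock_vnode();
        } else {
            violations += lock_map();   /* reversed order: deadlock risk */
        }
        map_held = true;
    }

    if (map_held)
        unlock_map();
    unlock_vnode();
    return violations;
}
```

Called from a single thread, simulate_fault(true, false) reports one
violation and simulate_fault(true, true) reports none. With two threads
faulting on the same map and vnode, the reversed ordering is the one that
can leave each thread waiting on the lock the other holds.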