From: Alan Cox
Date: Wed, 20 Jun 2012 11:44:01 -0500
To: Konstantin Belousov
Cc: Alan Cox, Steve Wills, Konstantin Belousov, current@freebsd.org
Subject: Re: panic with out of memory
Message-ID: <4FE1FDD1.3030208@rice.edu>
In-Reply-To: <20120620132542.GW2337@deviant.kiev.zoral.com.ua>
References: <4FE127D3.8070502@FreeBSD.org> <201206200819.39256.jhb@freebsd.org> <20120620132542.GW2337@deviant.kiev.zoral.com.ua>

On 06/20/2012 08:25, Konstantin Belousov wrote:
> On Wed, Jun 20, 2012 at 08:19:39AM -0400, John Baldwin wrote:
>> On Tuesday, June 19, 2012 9:30:59 pm Steve Wills wrote:
>>> Hi,
>>>
>>> I just got a panic out of my r237195 system. The panic looks like:
>>>
>>> Sleeping thread (tid 173153, pid 42034) owns a non-sleepable lock
>>> KDB: stack backtrace of thread 173153:
>>> sched_switch() at sched_switch+0x28a
>>> mi_switch() at mi_switch+0xdf
>>> sleepq_timedwait() at sleepq_timedwait+0x3a
>>> _sleep() at _sleep+0x266
>>> swp_pager_meta_build() at swp_pager_meta_build+0x259
>>> swap_pager_copy() at swap_pager_copy+0x17b
>>> vm_object_collapse() at vm_object_collapse+0x123
>>> vm_object_deallocate() at vm_object_deallocate+0x457
>>> vm_map_process_deferred() at vm_map_process_deferred+0x72
>>> vm_pageout_oom() at vm_pageout_oom+0x180
>>> swp_pager_meta_build() at swp_pager_meta_build+0x248
>>> swap_pager_copy() at swap_pager_copy+0x17b
>>> vm_object_collapse() at vm_object_collapse+0x123
>>> vm_object_deallocate() at vm_object_deallocate+0x457
>>> vm_map_process_deferred() at vm_map_process_deferred+0x72
>>> vm_map_remove() at vm_map_remove+0x116
>>> exec_new_vmspace() at exec_new_vmspace+0x1bc
>>> exec_elf64_imgact() at exec_elf64_imgact+0x5f4
>>> kern_execve() at kern_execve+0x6f0
>>> sys_execve() at sys_execve+0x37
>>> amd64_syscall() at amd64_syscall+0x351
>>> Xfast_syscall() at Xfast_syscall+0xfb
>>> --- syscall (59, FreeBSD ELF64, sys_execve), rip = 0x800d2eddc, rsp =
>>> 0x7fffffffd328, rbp = 0x7fffffffd470 ---
>>> panic: sleeping thread
>>> cpuid = 4
>>>
>>> The system was very busy and using lots of swap, but I didn't expect a
>>> panic. If any more detail is needed or I should just get more RAM, let
>>> me know. :)
>> Hmm, this is due to a bug I noticed recently as well. I had been talking
>> with Alan and Konstantin about the proper fix. Hmm, thinking about this some
>> more, perhaps a simpler fix would be to have a 'I'm already in
>> vm_map_process_deferred()' flag. Or even better, just move the entire list
>> off into a static variable so that we don't get caught in recursion.
>> Something like this:
>>
>> Index: vm_map.c
>> ===================================================================
>> --- vm_map.c (revision 237227)
>> +++ vm_map.c (working copy)
>> @@ -475,12 +475,14 @@ static void
>>  vm_map_process_deferred(void)
>>  {
>>  	struct thread *td;
>> -	vm_map_entry_t entry;
>> +	vm_map_entry_t entry, next;
>>  	vm_object_t object;
>>
>>  	td = curthread;
>> -	while ((entry = td->td_map_def_user) != NULL) {
>> -		td->td_map_def_user = entry->next;
>> +	entry = td->td_map_def_user;
>> +	td->td_map_def_user = NULL;
>> +	while (entry != NULL) {
>> +		next = entry->next;
>>  		if ((entry->eflags & MAP_ENTRY_VN_WRITECNT) != 0) {
>>  			/*
>>  			 * Decrement the object's writemappings and
>> @@ -494,6 +496,7 @@ vm_map_process_deferred(void)
>>  			    entry->end);
>>  		}
>>  		vm_map_entry_deallocate(entry, FALSE);
>> +		entry = next;
>>  	}
>>  }
> Yes, looks like it should work.

I'll add, "Me too." I'm much happier with this than the previous patch.

Alan
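
For readers following the thread, here is a minimal user-space sketch of the pattern the quoted patch uses: detach the whole deferred list before walking it, so that a re-entrant call finds the list empty. It is not kernel code. The names struct entry, deferred_head, entry_release(), defer_entry(), and the counters are hypothetical stand-ins for td_map_def_user and vm_map_entry_deallocate(), and the re-entry is simulated by calling process_deferred() directly from the release routine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a map entry queued for deferred cleanup. */
struct entry {
	struct entry *next;
};

/* Hypothetical stand-in for the per-thread td_map_def_user list head. */
struct entry *deferred_head;

int depth;       /* current nesting level of process_deferred() */
int freed;       /* entries released so far */
int nested_work; /* entries a nested (re-entrant) call had to process */

void process_deferred(void);

/*
 * Releasing an entry re-enters process_deferred() here, imitating the
 * recursion visible in the backtrace.
 */
static void
entry_release(struct entry *e)
{
	process_deferred();
	free(e);
	freed++;
}

void
process_deferred(void)
{
	struct entry *e, *next;

	depth++;
	/* Detach the whole list first, so a nested call finds it empty. */
	e = deferred_head;
	deferred_head = NULL;
	while (e != NULL) {
		next = e->next;	/* save the link before e is freed */
		if (depth > 1)
			nested_work++;
		entry_release(e);
		e = next;
	}
	depth--;
}

/* Queue one entry for deferred cleanup. */
void
defer_entry(void)
{
	struct entry *e = malloc(sizeof(*e));

	assert(e != NULL);
	e->next = deferred_head;
	deferred_head = e;
}
```

Calling defer_entry() three times and then process_deferred() once releases all three entries in the outermost call; each nested call sees an empty list and returns without touching any entry.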