From owner-freebsd-bugs@FreeBSD.ORG Fri Mar 2 18:20:05 2007
Return-Path:
X-Original-To: freebsd-bugs@hub.freebsd.org
Delivered-To: freebsd-bugs@hub.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
	by hub.freebsd.org (Postfix) with ESMTP id 273C516A405
	for ; Fri, 2 Mar 2007 18:20:05 +0000 (UTC)
	(envelope-from gnats@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org [69.147.83.40])
	by mx1.freebsd.org (Postfix) with ESMTP id 0926413C4BC
	for ; Fri, 2 Mar 2007 18:20:05 +0000 (UTC)
	(envelope-from gnats@FreeBSD.org)
Received: from freefall.freebsd.org (gnats@localhost [127.0.0.1])
	by freefall.freebsd.org (8.13.4/8.13.4) with ESMTP id l22IK4R4053427
	for ; Fri, 2 Mar 2007 18:20:04 GMT
	(envelope-from gnats@freefall.freebsd.org)
Received: (from gnats@localhost)
	by freefall.freebsd.org (8.13.4/8.13.4/Submit) id l22IK4BA053426;
	Fri, 2 Mar 2007 18:20:04 GMT
	(envelope-from gnats)
Resent-Date: Fri, 2 Mar 2007 18:20:04 GMT
Resent-Message-Id: <200703021820.l22IK4BA053426@freefall.freebsd.org>
Resent-From: FreeBSD-gnats-submit@FreeBSD.org (GNATS Filer)
Resent-To: freebsd-bugs@FreeBSD.org
Resent-Reply-To: FreeBSD-gnats-submit@FreeBSD.org, Andrew
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
	by hub.freebsd.org (Postfix) with ESMTP id 9CE1416A409
	for ; Fri, 2 Mar 2007 18:11:45 +0000 (UTC)
	(envelope-from nobody@FreeBSD.org)
Received: from www.freebsd.org (www.freebsd.org [69.147.83.33])
	by mx1.freebsd.org (Postfix) with ESMTP id 7647513C4B7
	for ; Fri, 2 Mar 2007 18:11:45 +0000 (UTC)
	(envelope-from nobody@FreeBSD.org)
Received: from www.freebsd.org (localhost [127.0.0.1])
	by www.freebsd.org (8.13.1/8.13.1) with ESMTP id l22IBjZF012464
	for ; Fri, 2 Mar 2007 18:11:45 GMT
	(envelope-from nobody@www.freebsd.org)
Received: (from nobody@localhost)
	by www.freebsd.org (8.13.1/8.13.1/Submit) id l22IBjDp012463;
	Fri, 2 Mar 2007 18:11:45 GMT
	(envelope-from nobody)
Message-Id: <200703021811.l22IBjDp012463@www.freebsd.org>
Date: Fri, 2 Mar 2007 18:11:45 GMT
From: Andrew
To: freebsd-gnats-submit@FreeBSD.org
X-Send-Pr-Version: www-3.0
Cc:
Subject: kern/109762: deadlock in g_down -> ahd_action -> contigmalloc
X-BeenThere: freebsd-bugs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Bug reports
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Fri, 02 Mar 2007 18:20:05 -0000

>Number:         109762
>Category:       kern
>Synopsis:       deadlock in g_down -> ahd_action -> contigmalloc
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    freebsd-bugs
>State:          open
>Quarter:
>Keywords:
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Fri Mar 02 18:20:04 GMT 2007
>Closed-Date:
>Last-Modified:
>Originator:     Andrew
>Release:        FreeBSD 6.2-20070202
>Organization:
Critical Path, Inc
>Environment:
FreeBSD volcano.supernews.net 6.2-20070202 FreeBSD 6.2-20070202 #0: Fri Feb 2 16:29:10 UTC 2007 root@supernews.net:/usr/obj/usr/src/sys/SUPERNEWS i386
>Description:
System hung during heavy file write activity (copying a large file
between filesystems). The cause of the hang was g_down being stuck as
follows:

Tracing pid 4 tid 100016 td 0xa8303a80
sched_switch(a8303a80,0,1) at sched_switch+0x14b
mi_switch(1,0,a8303a80,ca4249c4,a0515704,...) at mi_switch+0x1ba
sleepq_switch(bc3927f0) at sleepq_switch+0x87
sleepq_wait(bc3927f0,0,a8303a80,44,bc3927f0,...) at sleepq_wait+0x5c
msleep(bc3927f0,a073c7a0,44,a06f43d5,0) at msleep+0x269
bwait(bc3927f0,44,a06f43d5,bc3927f0,0,...) at bwait+0x5f
swap_pager_putpages(ac2d318c,ca424ac4,1,1,ca424a90,...) at swap_pager_putpages+0x48c
default_pager_putpages(ac2d318c,ca424ac4,1,1,ca424a90) at default_pager_putpages+0x18
vm_pageout_flush(ca424ac4,1,1) at vm_pageout_flush+0xcb
vm_contig_launder_page(a49de288) at vm_contig_launder_page+0x2a6
vm_page_alloc_contig(3,0,0,ffffffff,8,0) at vm_page_alloc_contig+0x25c
contigmalloc(3000,a0710ea0,1,0,ffffffff,...) at contigmalloc+0x97
bus_dmamem_alloc(a83d3e80,ab998618,1,ab998610) at bus_dmamem_alloc+0xb4
ahd_alloc_scbs(a8415000) at ahd_alloc_scbs+0x17a
ahd_get_scb(a8415000,8) at ahd_get_scb+0x57
ahd_action(a83f82c0,ab8a4800) at ahd_action+0x103
xpt_run_dev_sendq(a83f8280) at xpt_run_dev_sendq+0x175
xpt_action(ab8a4800) at xpt_action+0x269
dastart(a862c600,ab8a4800,ab8a4800,a86254c0,1) at dastart+0x149
xpt_run_dev_allocq(a83f8280) at xpt_run_dev_allocq+0x82
xpt_schedule(a862c600,1,a9da5bdc,ca424ce8,a04bd420,...) at xpt_schedule+0xef
dastrategy(a9da5bdc) at dastrategy+0x4a
g_disk_start(a9da5528) at g_disk_start+0x18c
g_io_schedule_down(a8303a80) at g_io_schedule_down+0x13b
g_down_procbody(0,ca424d38) at g_down_procbody+0x92
fork_exit(a04bf00c,0,ca424d38) at fork_exit+0x71
fork_trampoline() at fork_trampoline+0x8
--- trap 0x1, eip = 0, esp = 0xca424d6c, ebp = 0 ---

Clearly, having g_down waiting for a swap pageout to complete is a
deadlock.

The circumstances under which this happens are not particularly clear -
at the point of the hang, most of the system memory was in the
'inactive' queue, but the amount of free and/or cached memory was
substantial. Machine has 4GB of RAM of which about 3.3GB is usable
(i386, no PAE). Inactive memory was about 2.2GB, cache 900M, free 5M.

This has been observed twice so far, though attempts to reproduce it in
a consistent fashion have failed and it seems to be relatively rare.
>How-To-Repeat:
Initiate a burst of heavy i/o via the ahd driver, such as copying
multi-gigabyte files or using dd to create same. The other conditions
needed for it to happen are not known.
>Fix:
>Release-Note:
>Audit-Trail:
>Unformatted:
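
The reasoning in the Description (g_down sleeping on a swap pageout is a deadlock) can be illustrated with a small user-space toy model. This is only a sketch, not FreeBSD code: `ToyDownQueue`, `write_needing_memory` and `simulate` are hypothetical names invented for the illustration. It assumes, as the report implies, that the pageout's own I/O would have to be started by the same single dispatch thread that is asleep waiting for it.

```python
from collections import deque


class ToyDownQueue:
    """Toy stand-in for a single I/O dispatch thread (hypothetical, not GEOM)."""

    def __init__(self):
        self.pending = deque()  # requests only the dispatcher can start
        self.completed = set()
        self.busy = False       # True while the dispatcher is inside a handler

    def submit(self, name, handler=None):
        self.pending.append((name, handler))

    def wait_for(self, name):
        """Return True if a wait for `name` could ever finish in this model."""
        if name in self.completed:
            return True
        queued = any(n == name for n, _ in self.pending)
        # The dispatcher is the sole consumer of `pending`. If it sleeps while
        # the request it waits for is still queued, nobody will ever start it.
        return not (queued and self.busy)

    def run_one(self):
        name, handler = self.pending.popleft()
        self.busy = True
        ok = handler(self) if handler else True
        self.busy = False
        if ok:
            self.completed.add(name)
        return ok


def write_needing_memory(q):
    # Hypothetical driver path: the allocation triggers a pageout whose I/O is
    # queued on the same dispatcher, and the caller then waits for that I/O.
    q.submit("swap-pageout")
    return q.wait_for("swap-pageout")


def simulate():
    q = ToyDownQueue()
    q.submit("file-write", write_needing_memory)
    return "ok" if q.run_one() else "deadlock"
```

In this model `simulate()` reports a deadlock because the handler for the file write waits on a request that sits behind it in the dispatcher's own queue. A request whose handler never waits on queued I/O completes normally.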