From owner-freebsd-current@FreeBSD.ORG Fri Jan 29 22:58:32 2010
Message-ID: <4B636812.8060403@FreeBSD.org>
Date: Sat, 30 Jan 2010 00:58:26 +0200
From: Alexander Motin
To: freebsd-geom@freebsd.org, freebsd-hackers@freebsd.org,
FreeBSD-Current
Subject: Deadlock between GEOM and devfs device destroy and process exit.

Hi.

While experimenting with SATA hot-plug I've found a quite repeatable
deadlock. The problem shows up when several SATA devices, opened via
devfs, disappear at exactly the same time; in my case, when unplugging a
SATA Port Multiplier with several disks behind it. All I have to do is
run several `dd if=/dev/adaX of=/dev/null bs=1m &` commands and unplug
the multiplier. That causes the expected I/O errors and device
destruction, but with high probability several dd processes get stuck in
the kernel.

Here are the pieces of the problem I have found so far:

- CAM receives the disconnect event and starts device destruction. But as
  the device is still open, it can't finish immediately.

- dd receives an I/O error and exits.

- The exit1() call closes all descriptors, including the adaX device. That
  triggers the final device destruction by sending an event to geom_dev.

    adaclose(4571fa00,4,40c16576,76,0,...) at 0x4049c521
    g_disk_access(457e2200,ffffffff,0,0,0,...) at 0x4080b9a4
    g_access(45643d80,ffffffff,0,0,2000,...) at 0x40810ccb
    g_dev_close(45766500,1,2000,4569fd80,4569fd80,...) at 0x4080a425
    devfs_close(7b604aa8,80000,457f8000,80000,7b604acc,...) at 0x407f2762
    VOP_CLOSE_APV(40d03180,7b604aa8,40c2e681,128,0,...) at 0x40b6da55
    vn_close(457f8000,1,45624300,4569fd80,451271e0,...) at 0x40912750
    vn_closefile(4566da48,4569fd80,4566da48,0,7b604b58,...) at 0x40912854
    devfs_close_f(4566da48,4569fd80,3,0,4566da48,...) at 0x407f235b
    _fdrop(4566da48,4569fd80,7b604b8c,408b5cec,0,4569fe24,40eb23a8,40d10460,40c1a8bb,4560672c,721,40c1a8b2,7b604bb4,40878220,4560672c,8,40c1a8b2,721) at 0x40836da3
    closef(4566da48,4569fd80,721,71e,4569fe24,...) at 0x40838ad0
    fdfree(4569fd80,0,40c1b1a9,107,7b604c80,...) at 0x408394da
    exit1(4569fd80,100,7b604d2c,40b565c0,4569fd80,...) at 0x40844423
    sys_exit(4569fd80,7b604cf8,40c59d34,40c26be4,4569d2a8,...) at 0x408450fd
    syscall(7b604d38) at 0x40b565c0

- The GEOM event thread tries to destroy the /dev/adaX device (which should
  already be free at this moment), but for some reason freezes, waiting
  for the device to be freed:

    0     2     0   0  -8  0     0     8 devdrn DL    ??    0:02.89 [g_event]

- As the GEOM event is still not handled, exit1() waits for it:

    kdb_backtrace(40c16bc4,0,40c16ab1,56,4540e640,...) at 0x408a2909
    g_waitidle(4569fd80,0,40c1b1a9,107,7b604c80,...) at 0x4080cd1f
    exit1(4569fd80,100,7b604d2c,40b565c0,4569fd80,...) at 0x40844431
    sys_exit(4569fd80,7b604cf8,40c59d34,40c26be4,4569d2a8,...) at 0x408450fd
    syscall(7b604d38) at 0x40b565c0

- The system is stuck. GEOM is frozen. There is no way out of this except
  pushing reset.

    0  1065  1055   0  44  0  5344  3040 g_wait DE     0    0:00.43 dd if=/dev/ada1 of=/dev/null bs=1m
    0  1066  1055   0  44  0  5344  3040 GEOM t DE     0    0:00.07 dd if=/dev/ada2 of=/dev/null bs=1m

So, does anybody have a good idea why destroy_dev() can't complete?

-- 
Alexander Motin
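
For anyone trying to reproduce this, the reader setup described above could be scripted roughly as follows. This is only a sketch and not part of the original report: the `DISKS` list, the `DRY_RUN` switch, and the helper names `reader_cmd` and `start_readers` are made up for illustration, and the device names are assumptions that depend on what sits behind the port multiplier. By default it only prints the commands; set `DRY_RUN=0` to actually start the readers, then unplug the multiplier.

```shell
#!/bin/sh
# Sketch of the reproduction setup: one background dd reader per disk.
# DISKS and DRY_RUN are hypothetical knobs, not taken from the report.
DISKS="${DISKS:-ada1 ada2}"
DRY_RUN="${DRY_RUN:-1}"

# Print the dd command line used for one disk (same form as in the report).
reader_cmd() {
    printf 'dd if=/dev/%s of=/dev/null bs=1m' "$1"
}

# Start (or, in dry-run mode, just print) one reader per disk in DISKS.
start_readers() {
    for d in $DISKS; do
        if [ "$DRY_RUN" = 1 ]; then
            echo "$(reader_cmd "$d") &"
        else
            dd if="/dev/$d" of=/dev/null bs=1m &
        fi
    done
}

start_readers

# In a real run, block here; after the unplug, readers that never return
# are the ones stuck in the kernel.
[ "$DRY_RUN" = 1 ] || wait
```

With the readers running, unplugging the multiplier should produce the I/O errors, and the stuck processes and their wait channels can then be checked from another terminal with something like `ps -axl`.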