From owner-freebsd-bugs@FreeBSD.ORG Thu Nov 4 16:00:20 2010
Return-Path:
Delivered-To: freebsd-bugs@hub.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
    by hub.freebsd.org (Postfix) with ESMTP id 1D565106566B
    for ; Thu, 4 Nov 2010 16:00:20 +0000 (UTC)
    (envelope-from gnats@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28])
    by mx1.freebsd.org (Postfix) with ESMTP id CD3F58FC19
    for ; Thu, 4 Nov 2010 16:00:19 +0000 (UTC)
Received: from freefall.freebsd.org (localhost [127.0.0.1])
    by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id oA4G0JTF009304
    for ; Thu, 4 Nov 2010 16:00:19 GMT
    (envelope-from gnats@freefall.freebsd.org)
Received: (from gnats@localhost)
    by freefall.freebsd.org (8.14.4/8.14.4/Submit) id oA4G0JWN009286;
    Thu, 4 Nov 2010 16:00:19 GMT
    (envelope-from gnats)
Resent-Date: Thu, 4 Nov 2010 16:00:19 GMT
Resent-Message-Id: <201011041600.oA4G0JWN009286@freefall.freebsd.org>
Resent-From: FreeBSD-gnats-submit@FreeBSD.org (GNATS Filer)
Resent-To: freebsd-bugs@FreeBSD.org
Resent-Reply-To: FreeBSD-gnats-submit@FreeBSD.org, Andreas Longwitz
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
    by hub.freebsd.org (Postfix) with ESMTP id 7537C1065673
    for ; Thu, 4 Nov 2010 15:56:49 +0000 (UTC)
    (envelope-from nobody@FreeBSD.org)
Received: from www.freebsd.org (www.freebsd.org [IPv6:2001:4f8:fff6::21])
    by mx1.freebsd.org (Postfix) with ESMTP id 495008FC12
    for ; Thu, 4 Nov 2010 15:56:49 +0000 (UTC)
Received: from www.freebsd.org (localhost [127.0.0.1])
    by www.freebsd.org (8.14.3/8.14.3) with ESMTP id oA4Fumrv029171
    for ; Thu, 4 Nov 2010 15:56:48 GMT
    (envelope-from nobody@www.freebsd.org)
Received: (from nobody@localhost)
    by www.freebsd.org (8.14.3/8.14.3/Submit) id oA4FumJL029170;
    Thu, 4 Nov 2010 15:56:48 GMT
    (envelope-from nobody)
Message-Id: <201011041556.oA4FumJL029170@www.freebsd.org>
Date: Thu, 4 Nov 2010 15:56:48 GMT
From: Andreas Longwitz
To: freebsd-gnats-submit@FreeBSD.org
X-Send-Pr-Version: www-3.1
Cc:
Subject: kern/151941: FreeBSD RELENG_6 server freezes during create of a snapshot on a disk with mpt
X-BeenThere: freebsd-bugs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Bug reports
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Thu, 04 Nov 2010 16:00:20 -0000

>Number:         151941
>Category:       kern
>Synopsis:       FreeBSD RELENG_6 server freezes during create of a snapshot on a disk with mpt
>Confidential:   no
>Severity:       critical
>Priority:       medium
>Responsible:    freebsd-bugs
>State:          open
>Quarter:
>Keywords:
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Thu Nov 04 16:00:19 UTC 2010
>Closed-Date:
>Last-Modified:
>Originator:     Andreas Longwitz
>Release:        RELENG_6
>Organization:
Data Service Stockelsdorf
>Environment:
FreeBSD dssbkp1.incore 6.4-STABLE FreeBSD 6.4-STABLE #2: Wed Nov 3 18:18:31 CET 2010 root@dssbkp1.incore:/usr/src/sys/i386/compile/SERVER i386
>Description:
A current FreeBSD RELENG_6 system stops working when running the command

    mount -u -o snapshot /prod/.snap/fscktest prod

where prod is a 1 TB partition on a SCSI disk, /dev/da0p1, connected to an
mpt controller. The machine is semi-dead: all user processes are sleeping,
all CPUs are idle, and only ping and ddb still work. Giant is the only lock
shown by ps in ddb. The trace of the mount process causing the problem looks
like this:

Tracing command mount pid 7871 tid 100190 td 0xd1070480
sched_switch(d1070480,0,1) at sched_switch+0x14b
mi_switch(1,0,d1070480,f3408348,c03e6d10,...) at mi_switch+0x1ba
sleepq_switch(c0631cc4) at sleepq_switch+0x87
sleepq_wait(c0631cc4,0,d1070480,4,0,...) at sleepq_wait+0x5c
msleep(c0631cc4,c0631ce0,50,c05b02c2,0) at msleep+0x269
getnewbuf(0,0,4000,4000) at getnewbuf+0x6ce
getblk(d08de440,4cb7f440,0,4000,0,...) at getblk+0x360
breadn(d08de440,4cb7f440,0,4000,0,...) at breadn+0x31
bread(d08de440,4cb7f440,0,4000,0,f34084a8) at bread+0x20
ffs_alloccg(d134cad4,d5c,132dfce8,0,4000) at ffs_alloccg+0x13d
ffs_hashalloc(d134cad4,d5c,132dfce8,0,4000,...) at ffs_hashalloc+0x28
ffs_alloc(d134cad4,281ec6f,0,132dfce8,0,4000,d051d400,f34085e8,d134cad4,281ec6f,0,463,e8cae000) at ffs_alloc+0x20d
ffs_balloc_ufs2(d12dd880,7b1bc000,a0,4000,d051d400,0,f34087d8) at ffs_balloc_ufs2+0x16fc
ffs_snapshot(d099a2bc,d130e8a0,d130e8a0,d0982600,d08de440,...) at ffs_snapshot+0x89b
ffs_mount(d099a2bc,d1070480,10201000,0,d0520a80,...) at ffs_mount+0x991
vfs_domount(d1070480,d1125750,d0f8a250,11010000,d1125330) at vfs_domount+0x728
vfs_donmount(d1070480,11010000,f3408c04) at vfs_donmount+0x415
kernel_mount(d06599c0,11010000,804e040,0,fffffffe,...) at kernel_mount+0x38
ffs_cmount(d06599c0,bf7fdec0,11010000,d1070480,c05f84e0,...) at ffs_cmount+0x5d
mount(d1070480,f3408d04) at mount+0x18e
syscall(3b,3b,3b,804af21,bf7fe974,...) at syscall+0x2bf
Xint0x80_syscall() at Xint0x80_syscall+0x1f
--- syscall (21, FreeBSD ELF32, mount), eip = 0x880bfbcb, esp = 0xbf7fde9c, ebp = 0xbf7fdf38 ---

The problem occurs on both i386 and amd64 servers. Creating snapshots on
300 GB disks connected to an amr controller works without any problems.
>How-To-Repeat:
see above
>Fix:
The problem is caused by the update of ffs_balloc.c from revision 1.50.2.2
to 1.50.2.3 (SVN rev 196973 on 2009-09-08 14:19:14; MFC r180758). If I
revert this change in the kernel, the problem disappears.
>Release-Note:
>Audit-Trail:
>Unformatted:
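
The reproduction that >How-To-Repeat refers to can be sketched as a small shell helper. This is a hypothetical sketch and not part of the original report: the names SNAPFILE, FS, RUN and snapshot_cmd are assumptions made for illustration, and only the mount command line itself is taken from the description. By default the script only prints the command; on an affected machine, actually running it may freeze the system as described.

```shell
#!/bin/sh
# Hypothetical reproduction helper (not from the original report).
# SNAPFILE and FS default to the values quoted in the description.
SNAPFILE="${SNAPFILE:-/prod/.snap/fscktest}"
FS="${FS:-prod}"

# Print the snapshot command in the form quoted in the report.
snapshot_cmd() {
    printf 'mount -u -o snapshot %s %s\n' "$1" "$2"
}

# Dry run by default: only show what would be executed.
# Set RUN=1 to really create the snapshot (may hang an affected machine).
if [ "${RUN:-0}" = "1" ]; then
    mount -u -o snapshot "$SNAPFILE" "$FS"
else
    snapshot_cmd "$SNAPFILE" "$FS"
fi
```

Running it as `RUN=1 sh repro.sh` (file name assumed) from a console with ddb available would make it possible to capture a trace like the one above if the machine hangs.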