From owner-freebsd-stable@FreeBSD.ORG Fri May 31 12:41:40 2013 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 845FBB6F for ; Fri, 31 May 2013 12:41:40 +0000 (UTC) (envelope-from Andre.Albsmeier@siemens.com) Received: from david.siemens.de (david.siemens.de [192.35.17.14]) by mx1.freebsd.org (Postfix) with ESMTP id 1D7AEDB8 for ; Fri, 31 May 2013 12:41:39 +0000 (UTC) Received: from mail2.siemens.de (localhost [127.0.0.1]) by david.siemens.de (8.13.6/8.13.6) with ESMTP id r4VCQB34000975 for ; Fri, 31 May 2013 14:26:11 +0200 Received: from curry.mchp.siemens.de (curry.mchp.siemens.de [139.25.40.130]) by mail2.siemens.de (8.13.6/8.13.6) with ESMTP id r4VCQBlI031684 for ; Fri, 31 May 2013 14:26:11 +0200 Received: (from localhost) by curry.mchp.siemens.de (8.14.7/8.14.7) id r4VCQBgI029892; Date: Fri, 31 May 2013 14:26:11 +0200 From: Andre Albsmeier To: freebsd-stable@freebsd.org Subject: FreeBSD-9.1: machine reboots during snapshot creation, LORs found Message-ID: <20130531122611.GA6607@bali> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline X-Echelon: X-Advice: Drop that crappy M$-Outlook, I'm tired of your viruses! User-Agent: Mutt/1.5.21 (2010-09-15) X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 31 May 2013 12:41:40 -0000 Each day at 5:15 we are generating snapshots on various machines. This used to work perfectly under 7-STABLE for years but since we started to use 9.1-STABLE the machine reboots in about 10% of all cases. After rebooting we find a new snapshot file which is a bit smaller than the good ones and with different permissions It does not succeed a fsck. In this example it is the one whose name is beginning with s3: -r--r----- 1 root operator snapshot 72802894528 29 May 05:15 s2-2013.05.28-03.15.04 -r-------- 1 root operator snapshot 72802893824 29 May 05:15 s3-2013.05.29-03.15.03 -r--r----- 1 root operator snapshot 72802894528 28 May 14:22 s4-2013.05.23-06.38.44 -r--r----- 1 root operator snapshot 72802894528 28 May 14:22 s5-2013.05.24-03.15.03 -r--r----- 1 root operator snapshot 72802894528 28 May 14:22 s6-2013.05.25-03.15.03 After enabling DIAGNOSTIC, WITNESS and INVARIANTS in the kernel I see the following LORs (mksnap_ffs starts exactly at 5:15): May 29 05:15:00 palveli kernel: lock order reversal: May 29 05:15:00 palveli kernel: 1st 0xc2371da8 ufs (ufs) @ /src/src-9/sys/kern/vfs_mount.c:1240 May 29 05:15:00 palveli kernel: 2nd 0xc2371ec4 devfs (devfs) @ /src/src-9/sys/ufs/ffs/ffs_vfsops.c:1414 May 29 05:15:04 palveli kernel: lock order reversal: May 29 05:15:04 palveli kernel: 1st 0xc228471c snaplk (snaplk) @ /src/src-9/sys/ufs/ufs/ufs_vnops.c:976 May 29 05:15:04 palveli kernel: 2nd 0xc22f25e4 ufs (ufs) @ /src/src-9/sys/ufs/ffs/ffs_snapshot.c:1626 Unfortunatley no corefiles are being generated ;-(. I have checked and even rebuilt the (UFS1) fs in question from scratch. I have also seen this happen on an UFS2 on another machine and on a third one when running "dump -L" on a root fs. Any hints of how to proceed? -Andre