From owner-freebsd-fs@FreeBSD.ORG Mon Jul 17 09:02:54 2006
Date: Sun, 16 Jul 2006 12:51:55 +0300
From: Kostik Belousov
To: Mark Knight
Cc: freebsd-fs@freebsd.org, freebsd-stable@freebsd.org
Subject: Re: 6.1 panic after approx. 49 days uptime
Message-ID: <20060716095155.GN32624@deviant.kiev.zoral.com.ua>
References: <20060716084210.GL32624@deviant.kiev.zoral.com.ua> <20060716091925.GM32624@deviant.kiev.zoral.com.ua>
In-Reply-To: <20060716091925.GM32624@deviant.kiev.zoral.com.ua>

On Sun, Jul 16, 2006 at 12:19:25PM +0300, Kostik Belousov wrote:
> On Sun, Jul 16, 2006 at 09:46:49AM +0100, Mark Knight wrote:
>
> Index: mount.h
> ===================================================================
> RCS file: /usr/local/arch/ncvs/src/sys/sys/mount.h,v
> retrieving revision 1.210
> diff -u -r1.210 mount.h
> --- mount.h 5 May 2006 19:32:35 -0000 1.210
> +++ mount.h 16 Jul 2006 09:15:32 -0000
> @@ -578,7 +578,7 @@
>      int _locked;                                \
>      struct mount *_MP;                          \
>      _MP = (MP);                                 \
> -    if (VFS_NEEDSGIANT(_MP)) {                  \
> +    if (_MP != NULL && VFS_NEEDSGIANT(_MP)) {   \
>          mtx_lock(&Giant);                       \
>          _locked = 1;                            \
>      } else                                      \

This is not needed, since the problem was fixed on HEAD and RELENG_6 by
tegge, see rev. 1.210 of sys/sys/mount.h. Fix was committed after release
of 6.1.

From owner-freebsd-fs@FreeBSD.ORG Mon Jul 17 09:02:56 2006
Date: Sun, 16 Jul 2006 12:19:25 +0300
From: Kostik Belousov
To: Mark Knight
Cc: freebsd-fs@freebsd.org, freebsd-stable@freebsd.org
Subject: Re: 6.1 panic after approx. 49 days uptime
Message-ID: <20060716091925.GM32624@deviant.kiev.zoral.com.ua>
References: <20060716084210.GL32624@deviant.kiev.zoral.com.ua>

On Sun, Jul 16, 2006 at 09:46:49AM +0100, Mark Knight wrote:
> In message <20060716084210.GL32624@deviant.kiev.zoral.com.ua>, Kostik
> Belousov writes
> >On Sun, Jul 16, 2006 at 09:32:47AM +0100, Mark Knight wrote:
> >>Just awoke to find my home server (6.1-RELEASE) had panicked during its
> >>daily update of /usr/ports with an uptime of 49 days.
> >>
> >>Stack trace is here:
> >>
> >>
> >>
> >>Looks file system related to me. Any advice appreciated.
> >
> >If you still have the core dump at hand, go to frame #7 and post the
> >output of the commands "p *vp" and "p *(vp->v_mount)".
>
> Appended to log file (in case of mail formatting) and reproduced here:
>
> (kgdb) p *(vp)
> $3 = {v_type = VBAD, v_tag = 0xc0791704 "none", v_op = 0xc07d89e0, v_data = 0x0, v_mount = 0x0,
>   v_nmntvnodes = {tqe_next = 0x0, tqe_prev = 0xc3250014}, v_un = {vu_mount = 0x0, vu_socket = 0x0,
>     vu_cdev = 0x0, vu_fifoinfo = 0x0}, v_hashlist = {le_next = 0x0, le_prev = 0xc295f570},
>   v_hash = 3269747, v_cache_src = {lh_first = 0x0}, v_cache_dst = {tqh_first = 0x0, tqh_last = 0xc335cbe0},
>   v_dd = 0x0, v_cstart = 0, v_lasta = 0, v_lastw = 0, v_clen = 0, v_lock = {lk_interlock = 0xc08073dc,
>     lk_flags = 64, lk_sharecount = 0, lk_waitcount = 0, lk_exclusivecount = 0, lk_prio = 80,
>     lk_wmesg = 0xc07a24ed "ufs", lk_timo = 51, lk_lockholder = 0xffffffff, lk_newlock = 0x0},
>   v_interlock = {mtx_object = {lo_class = 0xc07e0644, lo_name = 0xc07a3a55 "vnode interlock",
>       lo_type = 0xc07a3a55 "vnode interlock", lo_flags = 196608, lo_list = {tqe_next = 0x0,
>         tqe_prev = 0x0}, lo_witness = 0x0}, mtx_lock = 4, mtx_recurse = 0}, v_vnlock = 0xc335cc08,
>   v_holdcnt = 1, v_usecount = 0, v_iflag = 128, v_vflag = 0, v_writecount = 0, v_freelist = {
>     tqe_next = 0xc3248990, tqe_prev = 0xc080d22c}, v_bufobj = {bo_mtx = 0xc335cc2c, bo_clean = {bv_hd = {
>         tqh_first = 0x0, tqh_last = 0xc335cc74}, bv_root = 0x0, bv_cnt = 0}, bo_dirty = {bv_hd = {
>         tqh_first = 0x0, tqh_last = 0xc335cc84}, bv_root = 0x0, bv_cnt = 0}, bo_numoutput = 0, bo_flag = 0,
>     bo_ops = 0xc07e6564, bo_bsize = 8192, bo_object = 0x0, bo_synclist = {le_next = 0x0, le_prev = 0x0},
>     bo_private = 0xc335cbb0, __bo_vnode = 0xc335cbb0}, v_pollinfo = 0x0, v_label = 0x0}
> (kgdb) p *(vp->v_mount)
> Cannot access memory at address 0x0
> (kgdb)
>
> Thanks,

Thank you for the data. As far as I can see, the problem could be worked
around by the following patch:

Index: mount.h
===================================================================
RCS file: /usr/local/arch/ncvs/src/sys/sys/mount.h,v
retrieving revision 1.210
diff -u -r1.210 mount.h
--- mount.h 5 May 2006 19:32:35 -0000 1.210
+++ mount.h 16 Jul 2006 09:15:32 -0000
@@ -578,7 +578,7 @@
     int _locked;                                \
     struct mount *_MP;                          \
     _MP = (MP);                                 \
-    if (VFS_NEEDSGIANT(_MP)) {                  \
+    if (_MP != NULL && VFS_NEEDSGIANT(_MP)) {   \
         mtx_lock(&Giant);                       \
         _locked = 1;                            \
     } else                                      \

What seems to be quite nontrivial is testing. Did you unmount some
filesystem before the panic happened?

To reproduce the situation, the following conjunction of events is needed:
1. you have free vnode pressure
2. some very active fs is unmounted
3. some further file activity is going on

From owner-freebsd-fs@FreeBSD.ORG Thu Jul 20 02:08:08 2006
Date: Wed, 19 Jul 2006 21:08:16 -0500
From: Eric Anderson
To: freebsd-fs@freebsd.org
Subject: Failed to flush worklist
Message-ID: <44BEE590.2030905@centtech.com>

On one of my servers, I'm unable to change a mount point from rw to ro
(something I do all the time). When I try it, I receive messages like:

Jul 19 08:57:18 snapshot1 kernel: softdep_waitidle: Failed to flush worklist for 0xc9f37400softdep_waitidle: Failed to flush worklist for 0xc9f37400
Jul 19 08:57:18 snapshot1 kernel: softdep_waitidle: Failed to flush worklist for 0xc9f37400
Jul 19 08:57:53 snapshot1 kernel: softdep_waitidle: Failed to flush worklist for 0xc9f37400
Jul 19 08:58:28 snapshot1 kernel: softdep_waitidle: Failed to flush worklist for 0xc9f37400

# df /vol4
Filesystem       1K-blocks       Used     Avail Capacity  Mounted on
/dev/label/vol4 1891668564 1769762782 -29427702   102%    /vol4

# uname -a
FreeBSD snapshot1.centtech.com 6.1-STABLE FreeBSD 6.1-STABLE #2: Tue May 16 08:34:43 CDT 2006 root@snapshot1.centtech.com:/usr/obj/usr/src/sys/SNAPSHOT i386

# mount
/dev/label/vol4 on /vol4 (ufs, local, noatime, soft-updates)

Now, here's how I think I broke it:

The filesystem was mounted rw, and all was well. A Perl program was running
in the background, accessing data in a directory deep down inside the
filesystem; maybe even its cwd was in there when it started. I did massive
rm's, hard links, and writing of new files (see rsync and the link-dest
option), probably including the directory the Perl program was in.

Later on, I tried to do:

mount -u -oro /vol4

which failed, with either permission denied or operation not permitted.

I cannot umount the filesystem, nor does the -f option help.

Here's a snippet from fstat. Nothing is currently running on that
filesystem that I can find, and this is the only thing that looks odd:

root     perl       99912 root /             2 drwxr-xr-x    1024  r
root     perl       99912   wd /             2 drwxr-xr-x    1024  r
root     perl       99912 text /        238402 -rwxr-xr-x   10088  r
root     perl       99912    0 -             -         bad       -
root     perl       99912    1 -             -         bad       -
root     perl       99912    2 -             -         bad       -
root     perl       99740 root /             2 drwxr-xr-x    1024  r
root     perl       99740   wd /             2 drwxr-xr-x    1024  r
root     perl       99740 text /        238402 -rwxr-xr-x   10088  r
root     perl       99740    0 -             -         bad       -
root     perl       99740    1 -             -         bad       -
root     perl       99740    2 -             -         bad       -

Any ideas??

Eric

--
------------------------------------------------------------------------
Eric Anderson        Sr. Systems Administrator        Centaur Technology
Anything that works is better than anything that doesn't.
------------------------------------------------------------------------