From: Fabian Keil
To: Roman Bogorodskiy
Cc: current@freebsd.org
Date: Tue, 18 May 2010 22:48:19 +0200
Subject: Re: ffs_copyonwrite panics
Message-ID: <20100518224819.28d9624b@r500.local>
In-Reply-To: <20100518185201.GA2745@fsol>

Roman Bogorodskiy wrote:

> I've been using -CURRENT last update in February for quite a long time
> and few weeks ago decided to finally update it. The update was quite
> unfortunate as system became very unstable: it just hangs few times a
> day and panics sometimes.
>
> Some things can be reproduced, some cannot. Reproducible ones:
>
> 1. background fsck always makes system hang
> 2. system crashes on operations with nullfs mounts (disabled that for
>    now)
>
> The most annoying one is ffs_copyonwrite panic which I cannot reproduce.
> The thing is that if I will run 'startx' on it with some X apps it will
> panic just in few minutes. When I leave the box with nearly no stress
> (just use it as internet gateway for my laptop) it behaves a little
> better but will eventually crash in few hours anyway.
>
> The even more annoying thing is that I cannot save the dump,
> because when the system boots and runs 'savecore' it leads to
> ffs_copyonwrite panic as well. The panic happens when about 90% complete
> (as seen via ctrl-t).
>
> Any ideas how to debug and get rid of this issue?
>
> System arch is amd64. I don't know what other details could be useful.

I'm not familiar with the background fsck issue, but if the nullfs
panic looks like this one, there's a fair chance it's already fixed:

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x10
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff82412f14
stack pointer           = 0x28:0xffffff803e564620
frame pointer           = 0x28:0xffffff803e564770
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 1825 (jail)
panic: from debugger
cpuid = 0
Uptime: 38s
Dumping 1992 MB (5 chunks)
  chunk 0: 1MB (155 pages) ... ok
  chunk 1: 1990MB (509345 pages) 1974 [...] 6 ... ok
  chunk 2: 2MB (273 pages) ... ok
  chunk 3: 1MB (184 pages)

#0  doadump () at pcpu.h:223
223     pcpu.h: No such file or directory.
        in pcpu.h
(kgdb) #0  doadump () at pcpu.h:223
#1  0xffffffff803c506f in boot (howto=260)
    at /usr/src/sys/kern/kern_shutdown.c:416
#2  0xffffffff803c546c in panic (fmt=Variable "fmt" is not available.
) at /usr/src/sys/kern/kern_shutdown.c:590
#3  0xffffffff801f6e77 in db_panic (addr=Variable "addr" is not available.
) at /usr/src/sys/ddb/db_command.c:478
#4  0xffffffff801f7281 in db_command (last_cmdp=0xffffffff808bfd80, cmd_table=Variable "cmd_table" is not available.
) at /usr/src/sys/ddb/db_command.c:445
#5  0xffffffff801f74d0 in db_command_loop ()
    at /usr/src/sys/ddb/db_command.c:498
#6  0xffffffff801f9429 in db_trap (type=Variable "type" is not available.
) at /usr/src/sys/ddb/db_main.c:229
#7  0xffffffff803f3c25 in kdb_trap (type=12, code=0, tf=0xffffff803e564570)
    at /usr/src/sys/kern/subr_kdb.c:535
#8  0xffffffff8062ad9d in trap_fatal (frame=0xffffff803e564570, eva=Variable "eva" is not available.
) at /usr/src/sys/amd64/amd64/trap.c:773
#9  0xffffffff8062b0fc in trap_pfault (frame=0xffffff803e564570, usermode=0)
    at /usr/src/sys/amd64/amd64/trap.c:694
#10 0xffffffff8062b8ff in trap (frame=0xffffff803e564570)
    at /usr/src/sys/amd64/amd64/trap.c:451
#11 0xffffffff80611f33 in calltrap ()
    at /usr/src/sys/amd64/amd64/exception.S:223
#12 0xffffffff82412f14 in null_bypass (ap=0xffffff803e564780)
    at /usr/src/sys/modules/nullfs/../../fs/nullfs/null_vnops.c:269
#13 0xffffffff80448104 in vgonel (vp=0xffffff0005e05780) at vnode_if.h:1099
#14 0xffffffff8044835e in vrecycle (vp=0xffffff0005e05780, td=Variable "td" is not available.
) at /usr/src/sys/kern/vfs_subr.c:2505
#15 0xffffffff82412e6f in null_inactive (ap=Variable "ap" is not available.
) at /usr/src/sys/modules/nullfs/../../fs/nullfs/null_vnops.c:665
#16 0xffffffff80444ff8 in vinactive (vp=0xffffff0005e05780,
    td=0xffffff00054743e0) at vnode_if.h:807
#17 0xffffffff804495dd in vputx (vp=0xffffff0005e05780, func=2)
    at /usr/src/sys/kern/vfs_subr.c:2226
#18 0xffffffff8043e1ae in lookup (ndp=0xffffff803e564a50)
    at /usr/src/sys/kern/vfs_lookup.c:905
#19 0xffffffff8043eef7 in namei (ndp=0xffffff803e564a50)
    at /usr/src/sys/kern/vfs_lookup.c:269
#20 0xffffffff8044ec86 in kern_accessat (td=0xffffff00054743e0, fd=-100,
    path=0x800537000
, pathseg=Variable "pathseg" is not available.
) at /usr/src/sys/kern/vfs_syscalls.c:2140
#21 0xffffffff8062b21d in syscall (frame=0xffffff803e564c80)
    at /usr/src/sys/amd64/amd64/trap.c:946
#22 0xffffffff80612211 in Xfast_syscall ()
    at /usr/src/sys/amd64/amd64/exception.S:374
#23 0x000000080050e5ec in ?? ()
Previous frame inner to this frame (corrupt stack?)
(kgdb)

I could reproduce it with:

FreeBSD 9.0-CURRENT #66 r+3fe665b: Fri May 14 17:45:10 CEST 2010
fk@r500.local:/usr/obj/usr/src/sys/ZOEY amd64

but it had already been fixed in Subversion/CVS on Saturday, so I didn't
investigate which commit caused it and which one fixed it.

My previous kernel without the issue was:
FreeBSD 9.0-CURRENT #65 r+6f48909: Sat May 8 19:28:58 CEST 2010

I'm currently using:
FreeBSD 9.0-CURRENT #69 r+3a7afc7: Sun May 16 20:04:53 CEST 2010
without any issues either. I don't use background fsck, though.

Fabian