Date: Mon, 16 Nov 2009 11:04:10 -0800
From: David Wolfskill <david@catwhisker.org>
To: Peter Jeremy <peterjeremy@acm.org>
Cc: hardware@freebsd.org
Subject: Re: 7.2-STABLE i386 box crashing -- clues?
Message-ID: <20091116190410.GA1589@albert.catwhisker.org>
In-Reply-To: <20091116182924.GA30969@server.vk2pj.dyndns.org>
References: <20091111173747.GA1150@albert.catwhisker.org>
    <20091112062708.GA16648@server.vk2pj.dyndns.org>
    <20091112125903.GA1631@albert.catwhisker.org>
    <20091116182924.GA30969@server.vk2pj.dyndns.org>

On Tue, Nov 17, 2009 at 05:29:24AM +1100, Peter Jeremy wrote:
> ...
> >Yes; the machine is configured to start xdm on transition to
> >multi-user, as my spouse used to use it as a desktop.
>
> Does the problem still appear if you don't start X?

I haven't tried that yet....

> Is it running anything unusual when it crashes?

Not that I can tell, no.  Though I did just notice that the whines about
icmp unreach responses are actually coming from the machine that's
crashing ("albert") vs. the firewall box (which is configured to log
everything to albert).

> >> At this stage, my suggestion would be to try swapping the PSU.
> >
> >Thanks.  I'll discuss it with the "family CFO."
>
> You can't swap it with another of your systems?  Even if it doesn't
> fit neatly into the case, a temporary swap would give you some
> confidence as to whether it was really faulty or not (especially if
> the random reboots move to the other system).

I think it's more a matter of "at all" rather than "neatly." :-}

I tend to have a variety of hardware, but each machine tends to be from
a different era or have other differences that cause each to be a
one-off.

But I'll see what I can find.

In the meantime, the machine crashed this morning after I got in to
work -- but wonder of wonders, it came back up again this time.  And the
typescript file that's capturing the serial console activity showed:

fxp0: link state changed to UP
Limiting icmp unreach response from 234 to 200 packets/sec

FreeBSD/i386 (albert.catwhisker.org) (ttyd0)

login: drm0: <Intel i865G GMCH> on vgapci0
vgapci0: child drm0 requested pci_enable_busmaster
info: [drm] AGP at 0xf0000000 128MB
info: [drm] Initialized i915 1.6.0 20080730
drm0: [ITHREAD]
Limiting icmp unreach response from 201 to 200 packets/sec
Limiting icmp unreach response from 201 to 200 packets/sec
Limiting icmp unreach response from 201 to 200 packets/sec
Limiting icmp unreach response from 201 to 200 packets/sec
Limiting icmp unreach response from 201 to 200 packets/sec
Limiting icmp unreach response from 201 to 200 packets/sec
Limiting icmp unreach response from 201 to 200 packets/sec
Limiting icmp unreach response from 201 to 200 packets/sec
Limiting icmp unreach response from 201 to 200 packets/sec
Limiting icmp unreach response from 201 to 200 packets/sec
panic: vm_fault: fault on nofault entry, addr: c3983000
cpuid = 0
KDB: stack backtrace:
db_trace_self_wrapper(c0bf0330,e7d168f8,c082cae9,c0c1237c,0,...) at 0xc049e9a6 = db_trace_self_wrapper+0x26
kdb_backtrace(c0c1237c,0,c0c07dfe,e7d16904,0,...) at 0xc085a239 = kdb_backtrace+0x29
panic(c0c07dfe,c3983000,2,e7d169fc,e7d169ec,...) at 0xc082cae9 = panic+0x119
vm_fault(c1471000,c3983000,2,0,e7d16a7c,...) at 0xc0a6ec88 = vm_fault+0x178
trap_pfault(c0d4de20,e7d16ac4,c0b2b675,0,c660cb00,...) at 0xc0b3f60e = trap_pfault+0x20e
trap(e7d16b3c) at 0xc0b400b5 = trap+0x445
calltrap() at 0xc0b22dbb = calltrap+0x6
--- trap 0xc, eip = 0xc0b37648, esp = 0xe7d16b7c, ebp = 0xe7d16b88 ---
pmap_try_insert_pv_entry(c08b14c4,c63c0cf0,c63c0cf0,e7d16bbc,c08b4b17,...) at 0xc0b37648 = pmap_try_insert_pv_entry+0x48
pmap_copy(c6cd2d74,c69dd350,33f7d000,f4000,33f7d000,...) at 0xc0b3c1e8 = pmap_copy+0x2e8
vmspace_fork(c69dd2c4,0,2,e7d16c5c,bfbfc824,...) at 0xc0a7698b = vmspace_fork+0x42b
fork1(c63c3b40,14,0,e7d16c78,0,...) at 0xc08051ee = fork1+0x30e
fork(c63c3b40,e7d16cfc,c,8001550d,369e99,...) at 0xc0806b79 = fork+0x29
syscall(e7d16d38) at 0xc0b3f9c5 = syscall+0x335
Xint0x80_syscall() at 0xc0b22e20 = Xint0x80_syscall+0x20
--- syscall (2, FreeBSD ELF32, fork), eip = 0x340cde4b, esp = 0xbfbfc7cc, ebp = 0xbfbfc858 ---
Uptime: 3d4h1m43s
Physical memory: 2033 MB
Dumping 179 MB: 164 148 132 116 100 84 68 52 36 20 4
Dump complete
Automatic reboot in 15 seconds - press a key on the console to abort
Rebooting...

Taking a quick look at vmcore.1, I see:

albert(7.2-S)[5] kgdb /boot/kernel/kernel vmcore.1
GNU gdb 6.1.1 [FreeBSD]
...
[above stuff...]
...
#0  doadump () at pcpu.h:196
196     pcpu.h: No such file or directory.
        in pcpu.h
(kgdb) bt
#0  doadump () at pcpu.h:196
#1  0xc082c817 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:418
#2  0xc082cb22 in panic (fmt=Variable "fmt" is not available.
) at /usr/src/sys/kern/kern_shutdown.c:574
#3  0xc0a6ec88 in vm_fault (map=0xc1471000, vaddr=3281530880,
    fault_type=2 '\002', fault_flags=0) at /usr/src/sys/vm/vm_fault.c:277
#4  0xc0b3f60e in trap_pfault (frame=0xe7d16b3c, usermode=0, eva=3281531764)
    at /usr/src/sys/i386/i386/trap.c:852
#5  0xc0b400b5 in trap (frame=0xe7d16b3c) at /usr/src/sys/i386/i386/trap.c:541
#6  0xc0b22dbb in calltrap () at /usr/src/sys/i386/i386/exception.s:166
#7  0xc0b37648 in pmap_try_insert_pv_entry (pmap=0xc6cd2d74, va=872140800,
    m=0xc2be5110) at /usr/src/sys/i386/i386/pmap.c:2229
#8  0xc0b3c1e8 in pmap_copy (dst_pmap=0xc6cd2d74, src_pmap=0xc69dd350,
    dst_addr=871878656, len=999424, src_addr=871878656)
    at /usr/src/sys/i386/i386/pmap.c:3677
#9  0xc0a7698b in vmspace_fork (vm1=0xc69dd2c4)
    at /usr/src/sys/vm/vm_map.c:2552
#10 0xc08051ee in fork1 (td=0xc63c3b40, flags=Variable "flags" is not available.
) at /usr/src/sys/kern/kern_fork.c:288
#11 0xc0806b79 in fork (td=0xc63c3b40, uap=0xe7d16cfc)
    at /usr/src/sys/kern/kern_fork.c:107
#12 0xc0b3f9c5 in syscall (frame=0xe7d16d38)
    at /usr/src/sys/i386/i386/trap.c:1101
#13 0xc0b22e20 in Xint0x80_syscall ()
    at /usr/src/sys/i386/i386/exception.s:262
#14 0x00000033 in ?? ()
Previous frame inner to this frame (corrupt stack?)

If there is an issue with the PSU, I'm not sure there's much to be
gained by spending much time on that dump -- I understand that there's
not much information to trust if the PSU is flaky.

Thanks for your help!

Peace,
david
-- 
David H. Wolfskill                              david@catwhisker.org
Depriving a girl or boy of an opportunity for education is evil.

See http://www.catwhisker.org/~david/publickey.gpg for my public key.
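
A side note on reading the kgdb output above: frames #3 and #4 print vaddr
and eva in decimal, while the panic message gives the address in hex.  A
minimal Python sketch for relating the two, assuming 4 KiB pages (the usual
i386 page size); PAGE_SIZE and describe() are names made up for this
illustration, not anything from the kernel sources:

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages, the usual i386 page size


def describe(addr: int) -> str:
    """Return the address in hex, split into page base and offset."""
    page = addr & ~(PAGE_SIZE - 1)  # clear the low 12 bits
    return f"0x{addr:08x} (page 0x{page:08x} + 0x{addr - page:x})"


# Decimal values copied from kgdb frames #3 (vaddr) and #4 (eva).
vaddr = 3281530880
eva = 3281531764

print("vaddr:", describe(vaddr))
print("eva:  ", describe(eva))
```

Under that assumption, vaddr comes out as 0xc3983000, the address in the
panic message, and eva lands at an offset inside that same page.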