From owner-freebsd-amd64@FreeBSD.ORG Tue Feb 13 19:23:55 2007 Return-Path: X-Original-To: amd64@freebsd.org Delivered-To: freebsd-amd64@FreeBSD.ORG Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 9EE8916A402 for ; Tue, 13 Feb 2007 19:23:55 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from relay02.kiev.sovam.com (relay02.kiev.sovam.com [62.64.120.197]) by mx1.freebsd.org (Postfix) with ESMTP id 3BFF813C467 for ; Tue, 13 Feb 2007 19:23:55 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from [212.82.216.227] (helo=fw.zoral.com.ua) by relay02.kiev.sovam.com with esmtps (TLSv1:AES256-SHA:256) (Exim 4.60) (envelope-from ) id 1HH2vC-000DXe-Mj; Tue, 13 Feb 2007 21:02:43 +0200 Received: from deviant.kiev.zoral.com.ua (root@deviant.kiev.zoral.com.ua [10.1.1.148]) by fw.zoral.com.ua (8.13.4/8.13.4) with ESMTP id l1DJ2NUG032330 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Tue, 13 Feb 2007 21:02:23 +0200 (EET) (envelope-from kostikbel@gmail.com) Received: from deviant.kiev.zoral.com.ua (kostik@localhost [127.0.0.1]) by deviant.kiev.zoral.com.ua (8.13.8/8.13.8) with ESMTP id l1DJ2NCl051100; Tue, 13 Feb 2007 21:02:23 +0200 (EET) (envelope-from kostikbel@gmail.com) Received: (from kostik@localhost) by deviant.kiev.zoral.com.ua (8.13.8/8.13.8/Submit) id l1DJ2N8T051099; Tue, 13 Feb 2007 21:02:23 +0200 (EET) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: deviant.kiev.zoral.com.ua: kostik set sender to kostikbel@gmail.com using -f Date: Tue, 13 Feb 2007 21:02:23 +0200 From: Kostik Belousov To: Kris Kennaway Message-ID: <20070213190222.GE25802@deviant.kiev.zoral.com.ua> References: <20070213185312.GF67616@xor.obsecurity.org> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="9l24NVCWtSuIVIod" Content-Disposition: inline In-Reply-To: <20070213185312.GF67616@xor.obsecurity.org> User-Agent: Mutt/1.4.2.2i 
X-Virus-Scanned: ClamAV version 0.88.7, clamav-milter version 0.88.7 on fw.zoral.com.ua X-Virus-Status: Clean X-Spam-Status: No, score=-0.1 required=5.0 tests=ALL_TRUSTED,SPF_NEUTRAL autolearn=failed version=3.1.7 X-Spam-Checker-Version: SpamAssassin 3.1.7 (2006-10-05) on fw.zoral.com.ua X-Scanner-Signature: 26e083d2ce24d3f2b762b2848bba661f X-DrWeb-checked: yes X-SpamTest-Envelope-From: kostikbel@gmail.com X-SpamTest-Group-ID: 00000000 X-SpamTest-Info: Profiles 770 [Feb 13 2007] X-SpamTest-Info: helo_type=3 X-SpamTest-Info: {received from trusted relay: not dialup} X-SpamTest-Method: none X-SpamTest-Method: Local Lists X-SpamTest-Rate: 0 X-SpamTest-Status: Not detected X-SpamTest-Status-Extended: not_detected X-SpamTest-Version: SMTP-Filter Version 3.0.0 [0255], KAS30/Release Cc: amd64@freebsd.org, current@freebsd.org Subject: Re: Page fault in amd64 pmap_qremove from vm_thread_new() X-BeenThere: freebsd-amd64@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Porting FreeBSD to the AMD64 platform List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 13 Feb 2007 19:23:55 -0000 --9l24NVCWtSuIVIod Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Tue, Feb 13, 2007 at 01:53:12PM -0500, Kris Kennaway wrote: > I get this frequently when running stress2 on an 8-core amd64 system: >=20 > Fatal trap 12: page fault while in kernel mode > Fatal trap 12: page fault while in kernel mode >=20 >=20 > cpuid =3D 2; >=20 >=20 > apic id =3D 02 >=20 > Fatal trap 12: page fault while in kernel mode >=20 > cpuid =3D 5; fault virtual address =3D 0xffff807ffffff040 > Fatal trap 12: page fault while in kernel mode > Fatal trap 12: page fault while in kernel mode >=20 > cpuid =3D 4; apic id =3D 05 > apic id =3D 04 > fault virtual address =3D 0xffff807ffffff0e0 > fault virtual address =3D 0xffff807ffffff0b8 > cpuid =3D 0; fault code =3D supervisor write data, page 
not pre= sent >=20 > instruction pointer =3D 0x8:0xffffffff803deedd > cpuid =3D 3; stack pointer =3D 0x10:0xffffffffc7647720 > fault code =3D supervisor write data, page not present >=20 > instruction pointer =3D 0x8:0xffffffff803deedd > apic id =3D 00 > stack pointer =3D 0x10:0xffffffffcfd7e720 > fault code =3D supervisor write data, page not present > frame pointer =3D 0x10:0xffffffffc7647730 > frame pointer =3D 0x10:0xffffffffcfd7e730 > Fatal trap 12: page fault while in kernel mode >=20 > cpuid =3D 6; > instruction pointer =3D 0x8:0xffffffff803deedd >=20 > stack pointer =3D 0x10:0xffffffffb2b93720 >=20 > frame pointer =3D 0x10:0xffffffffb2b93730 >=20 > code segment =3D base 0x0, limit 0xfffff, type 0x1b >=20 > =3D DPL 0, pres 1, long 1, def32 0, gran 1 >=20 > processor eflags =3D > interrupt enabled, > resume, Fatal trap 12: page fault while in kernel mode > apic id =3D 06 > cpuid =3D 7; fault virtual address =3D 0xffff807ffffff108 > apic id =3D 07 > fault code =3D supervisor write data, page not present > code segment =3D base 0x0, limit 0xfffff, type 0x1b > apic id =3D 03 > =3D DPL 0, pres 1, long 1, def32 0, gran 1 > fault virtual address =3D 0xffff807ffffff068 > IOPL =3D 0 > fault code =3D supervisor write data, page not present > fault virtual address =3D 0xffff807ffffff018 > instruction pointer =3D 0x8:0xffffffff803deedd > instruction pointer =3D 0x8:0xffffffff803deedd > Fatal trap 12: page fault while in kernel mode > stack pointer =3D 0x10:0xffffffffbf901720 > cpuid =3D 4; stack pointer =3D 0x10:0xffffffffb1c11720 > processor eflags =3D frame pointer =3D 0x10:0xffffffffb1c1= 1730 > interrupt enabled, resume, fault code =3D supervisor write data= , page not present > IOPL =3D 0 > instruction pointer =3D 0x8:0xffffffff803deedd > current process =3D stack pointer =3D 0x10:0xffffffffd5b2= 5720 > frame pointer =3D 0x10:0xffffffffbf901730 > frame pointer =3D 0x10:0xffffffffd5b25730 > code segment =3D base 0x0, limit 0xfffff, type 0x1b > current process =3D 
=3D DPL 0, pres 1, long= 1, def32 0, gran 1 > code segment =3D base 0x0, limit 0xfffff, type 0x1b > code segment =3D base 0x0, limit 0xfffff, type 0x1b > 18747 (thr2) > [thread pid 18747 tid 142909 ] > Stopped at pmap_qremove+0x2d: movq $0,(%rcx,%rax,8) > db> wh > Tracing pid 18747 tid 142909 td 0xffffff0095710cd0 > pmap_qremove() at pmap_qremove+0x2d > vm_thread_new() at vm_thread_new+0x8d > thread_init() at thread_init+0x16 > slab_zalloc() at slab_zalloc+0x282 > uma_zone_slab() at uma_zone_slab+0x1ae > uma_zalloc_bucket() at uma_zalloc_bucket+0x19d > uma_zalloc_arg() at uma_zalloc_arg+0x3a3 > thread_alloc() at thread_alloc+0x1f > create_thread() at create_thread+0xc5 > kern_thr_new() at kern_thr_new+0x75 > thr_new() at thr_new+0x62 > syscall() at syscall+0x310 > Xfast_syscall() at Xfast_syscall+0xab > --- syscall (455, FreeBSD ELF64, thr_new), rip =3D 0x8007a1cac, rsp =3D 0= x7fffffffdef8, rbp =3D 0 --- > db> show allpcpu > Current CPU: 2 >=20 > cpuid =3D 0 > curthread =3D 0xffffff00717e8290: pid 18944 "thr2" > curpcb =3D 0xffffffffe2e33d50 > fpcurthread =3D none > idlethread =3D 0xffffff00b9aa6520: pid 17 "idle: cpu0" > spin locks held: >=20 > cpuid =3D 1 > curthread =3D 0xffffff0015e9d7b0: pid 18736 "thr2" > curpcb =3D 0xffffffffbceefd50 > fpcurthread =3D none > idlethread =3D 0xffffff00b9aa6290: pid 16 "idle: cpu1" > spin locks held: > exclusive spin mutex sio r =3D 0 (0xffffffff806bf3c0) locked @ dev/sio/si= o.c:1390 >=20 > cpuid =3D 2 > curthread =3D 0xffffff0095710cd0: pid 18747 "thr2" > curpcb =3D 0xffffffffcfd7ed50 > fpcurthread =3D none > idlethread =3D 0xffffff00b9aa6000: pid 15 "idle: cpu2" > spin locks held: >=20 > cpuid =3D 3 > curthread =3D 0xffffff00ad485290: pid 18743 "thr2" > curpcb =3D 0xffffffffd5b25d50 > fpcurthread =3D none > idlethread =3D 0xffffff00b9a63cd0: pid 14 "idle: cpu3" > spin locks held: >=20 > cpuid =3D 4 > curthread =3D 0xffffff0098fc7000: pid 18942 "thr2" > curpcb =3D 0xffffffffc77fad50 > fpcurthread =3D none > idlethread =3D 
0xffffff00b9a63000: pid 13 "idle: cpu4" > spin locks held: > exclusive spin mutex turnstile chain r =3D 0 (0xffffffff80613ed8) locked = @ kern/subr_turnstile.c:489 >=20 > cpuid =3D 5 > curthread =3D 0xffffff00215b8cd0: pid 18708 "thr2" > curpcb =3D 0xffffffffb2b93d50 > fpcurthread =3D none > idlethread =3D 0xffffff00b9a8fcd0: pid 12 "idle: cpu5" > spin locks held: >=20 > cpuid =3D 6 > curthread =3D 0xffffff005b72d520: pid 18718 "thr2" > curpcb =3D 0xffffffffb1c11d50 > fpcurthread =3D none > idlethread =3D 0xffffff00b9a8fa40: pid 11 "idle: cpu6" > spin locks held: >=20 > cpuid =3D 7 > curthread =3D 0xffffff0078aae7b0: pid 18782 "thr2" > curpcb =3D 0xffffffffbf901d50 > fpcurthread =3D none > idlethread =3D 0xffffff00b9a8f7b0: pid 10 "idle: cpu7" > spin locks held: >=20 > For some reason ddb doesn't give sensible backtraces for the running thre= ads: >=20 > db> wh 18944 > Tracing pid 18944 tid 130433 td 0xffffff009daa7290 > fork_trampoline() at fork_trampoline > db> wh 18736 > Tracing pid 18736 tid 165977 td 0xffffff00632b2cd0 > fork_trampoline() at fork_trampoline > db> wh 18747 > Tracing pid 18747 tid 165890 td 0xffffff0037403000 > fork_trampoline() at fork_trampoline > db> wh 18743 > Tracing pid 18743 tid 165929 td 0xffffff004f59e000 > fork_trampoline() at fork_trampoline > db> wh 18942 > Tracing pid 18942 tid 130531 td 0xffffff000a166520 > fork_trampoline() at fork_trampoline > db> wh 18708 > Tracing pid 18708 tid 166269 td 0xffffff005c28a290 > fork_trampoline() at fork_trampoline > db> wh 18718 > Tracing pid 18718 tid 111088 td 0xffffff0081f51a40 > fork_trampoline() at fork_trampoline > db> wh 18782 > Tracing pid 18782 tid 166078 td 0xffffff0052b4c000 > fork_trampoline() at fork_trampoline

Is the backtrace for the faulted thread always the same? And is this on CURRENT? I'm staring at similar (seemingly random) corruption on amd64 6.2-RELEASE. The machine has already produced more than 2 core dumps.
--9l24NVCWtSuIVIod Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (FreeBSD) iD8DBQFF0gs+C3+MBN1Mb4gRAgWXAKDvh0JupoLv9cK9sfHgcf57cpNaQgCeIX5x P0K/un9sWC1Qh/3e8WVsTc4= =5WaY -----END PGP SIGNATURE----- --9l24NVCWtSuIVIod--