From: "Artem Belevich"
Date: Tue, 21 Oct 2008 17:58:57 -0700
To: freebsd-current@freebsd.org, jb@what-creek.com
Subject: Dtrace: hotkernel+buildworld -> crash

Enabling WITNESS and INVARIANTS didn't produce anything new. I've got a
serial console hooked up, so here's the detailed crash info.

It looks like the CPUs need to be as busy as possible. The crash seems to
happen only when all four cores are busy. Under lighter load it often
succeeds.

--Artem

kernel trap 12 with interrupts disabled

Fatal trap 12: page fault while in kernel mode
cpuid = 3; apic id = 03
fault virtual address   = 0x20
fault code              = supervisor read data, page not present
instruction pointer     = 0x8:0xffffffff80ad5173
stack pointer           = 0x10:0xffffffff22b7dc40
frame pointer           = 0x10:0xffffffff22b7dc50
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = resume, IOPL = 0
current process         = 37749 (cc1)
[thread pid 37749 tid 100194 ]
Stopped at      cyclic_disable_xcall+0x7:       movq    0x20(%rax),%rax
db> show regi
cs          0x8
ss          0
rax         0
rcx         0
rdx         0x1
rbx         0xffffffff8035dd06  smp_no_rendevous_barrier
rsp         0xffffffff22b7dc40
rbp         0xffffffff22b7dc50
rsi         0x23
rdi         0xffffffff229167d0
r8          0x6
r9          0
r10         0x1
r11         0xc8181c
r12         0xffffffff80ad516c  cyclic_disable_xcall
r13         0xffffffff229167d0
r14         0x801305da0
r15         0
rip         0xffffffff80ad5173  cyclic_disable_xcall+0x7
rflags      0x10086
cyclic_disable_xcall+0x7:       movq    0x20(%rax),%rax
db> trace
Tracing pid 37749 tid 100194 td 0xffffff00ce0d4370
cyclic_disable_xcall() at cyclic_disable_xcall+0x7
smp_rendezvous_action() at smp_rendezvous_action+0xb3
Xrendezvous() at Xrendezvous+0x64
--- interrupt, rip = 0x543faa, rsp = 0x7fffffffe090, rbp = 0x801305ae0 ---

Tracing command dtrace pid 25527 tid 100071 td 0xffffff00053a26e0
cpustop_handler() at cpustop_handler+0x47
ipi_nmi_handler() at ipi_nmi_handler+0x32
trap() at trap+0x26d
nmi_calltrap() at nmi_calltrap+0x8
--- trap 0x13, rip = 0xffffffff80538cb9, rsp = 0xfffffffe40016ff0, rbp = 0xffffffff229166a0 ---
smp_tlb_shootdown() at smp_tlb_shootdown+0x8a
pmap_invalidate_page() at pmap_invalidate_page+0x79
pmap_remove_pte() at pmap_remove_pte+0xd7
pmap_remove() at pmap_remove+0x2e7
vm_map_delete() at vm_map_delete+0xdc
vm_map_remove() at vm_map_remove+0x4a
uma_large_free() at uma_large_free+0x54
free() at free+0x6b
dtrace_buffer_free() at dtrace_buffer_free+0x1c
dtrace_state_destroy() at dtrace_state_destroy+0x3bb
dtrace_close() at dtrace_close+0x96
devfs_close() at devfs_close+0x16b
vn_close() at vn_close+0x74
vn_closefile() at vn_closefile+0xf1
devfs_close_f() at devfs_close_f+0x1e
_fdrop() at _fdrop+0x20
closef() at closef+0x4a
kern_close() at kern_close+0x13f
syscall() at syscall+0x255
Xfast_syscall() at Xfast_syscall+0xab
--- syscall (6, FreeBSD ELF64, close), rip = 0x800df7a3c, rsp = 0x7fffffffe7b8, rbp = 0x622000 ---

On Tue, Oct 21, 2008 at 2:41 PM, Artem Belevich wrote:
> Hi,
>
> I'm not sure if it's a known issue or not, but running the hotkernel
> script from DTraceToolkit-0.99 during "make buildworld -j8" easily
> crashes -current (cvsup'ed on Oct 20th) on amd64 (Quad core Q9450)
> when I press ^C to stop the script.
>
> Kernel is GENERIC with WITNESS/INVARIANTS disabled and some
> SCSI/wireless/NIC drivers removed.
>
> I was unable to dump kernel core - debugger always gets another trap
> and returns to the prompt. The box does not have serial ports, so I've
> typed in portions of stack traces below from the screen.
>
> One common thing across all crashes I've seen so far is that the crashed
> process always dies in the same place with the following backtrace.
> Apparently it attempts to dereference $rax, which is 0.
>
> cyclic_disable_xcall+0x7
> smp_rendezvous_action
> Xrendezvous
> ----interrupt
>
> Dtrace process itself always has the same stack trace:
>
> smp_tlb_shootdown
> pmap_invalidate_page
> pmap_remove_pte
> pmap_remove
> vm_map_delete
> vm_map_remove
> uma_large_free
> free
> dtrace_buffer_free
> dtrace_state_destroy
> dtrace_close
> ...
>
> I'll try to reproduce the issue with WITNESS/INVARIANTS turned ON.
> Perhaps that would provide more hints on what's wrong. Meanwhile, if
> someone can suggest anything I can do to help troubleshoot this, that
> would be great as I'm a bit out of my depth here.
>
> --
> --Artem
>

--
--Artem
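
As a cross-check of the register dump in this report, here is a minimal sketch (not part of the original message) that resolves the memory operand of the faulting instruction, movq 0x20(%rax),%rax, against the dumped register values. The helper name effective_address and its small operand parser are hypothetical and only handle the simple disp(%reg) form; the register and fault values are copied from the DDB output.

```python
import re

# Values copied from the DDB output in this report.
REGS = {"rax": 0x0, "rip": 0xFFFFFFFF80AD5173}
FAULT_VA = 0x20
INSN_OPERAND = "0x20(%rax)"  # source operand of: movq 0x20(%rax),%rax


def effective_address(operand, regs):
    """Resolve an AT&T-syntax 'disp(%reg)' operand to an address.

    Hypothetical helper: supports only an optional hex displacement plus
    one base register, which is all this instruction needs.
    """
    m = re.fullmatch(r"(0x[0-9a-fA-F]+)?\(%(\w+)\)", operand)
    if m is None:
        raise ValueError(f"unsupported operand: {operand!r}")
    disp = int(m.group(1), 16) if m.group(1) else 0
    # Wrap to 64 bits, as the address arithmetic on amd64 would.
    return (regs[m.group(2)] + disp) & 0xFFFFFFFFFFFFFFFF


if __name__ == "__main__":
    ea = effective_address(INSN_OPERAND, REGS)
    # Compare the computed address with the reported fault address.
    print(hex(ea), ea == FAULT_VA)
```

With rax = 0 the computed address is 0x20, the same as the reported fault virtual address. That is consistent with a NULL pointer being dereferenced at offset 0x20, while rip still points into cyclic_disable_xcall.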