Date: Sat, 3 Oct 2015 00:29:30 +0300
From: Andriy Gapon <avg@FreeBSD.org>
To: John Baldwin <jhb@FreeBSD.org>
Cc: Ryan Stone <rysto32@gmail.com>, "freebsd-hackers@freebsd.org" <freebsd-hackers@FreeBSD.org>
Subject: Re: How to get anything useful out of kgdb?
Message-ID: <560EF73A.8050505@FreeBSD.org>
In-Reply-To: <1595419.L0rkNTMkPe@ralph.baldwin.cx>
References: <554E41EE.2010202@ignoranthack.me> <CAFMmRNyM6Tc7P8rLJmMSVXOFkK4Tc0OCOtc=E9dLEtzKrEtjLg@mail.gmail.com> <560E238F.9050609@FreeBSD.org> <1595419.L0rkNTMkPe@ralph.baldwin.cx>

On 02/10/2015 19:12, John Baldwin wrote:
> On Friday, October 02, 2015 09:26:23 AM Andriy Gapon wrote:
>> On 15/05/2015 20:57, Ryan Stone wrote:
>>> *Sigh*, kgdb isn't unwinding the trap frame properly. You can try this to
>>> figure out where it was running:
>>
>> I wonder, what is a reason for this?
>> Can that be fixed in kgdb itself?
>> It seems that usually kgdb handles trap frames just fine, but not always?
>
> It should be fixable. If this doesn't work under newer kgdb let me know and I'll
> try to fix it.

Okay, letting you know :-)
The backtraces from the in-tree kgdb and the newer kgdb both abort at the same
frame (output from the newer kgdb is in my message in another kgdb related
thread).

> I did fix a few edge cases with special frame handling in the
> newer kgdb though those mostly had to do with fork_trampoline and possibly
> Xtimerint (and aside from fork_trampoline I think the fixes were mostly for i386
> where different handlers setup trapframes differently)
>
>>> That gives you the top of the callstack at the time that the core was
>>> taken. To get the rest of it, try:
>>>
>>> define trace_stack
>>>   set $frame_ptr=$arg0
>>>   set $iters=0
>>>   while $frame_ptr != 0 && $iters < $arg1
>>>     set $ret_addr=((char*)$frame_ptr) + sizeof(void*)
>>>     printf "frameptr=%p, ret_addr=%p\n", (void*)$frame_ptr, *(void**)$ret_addr
>>>     printf " "
>>>     info line **(void***)$ret_addr
>>>     set $frame_ptr=*(void**)$frame_ptr
>>>     set $iters=$iters+1
>>>   end
>>> end
>>>
>>> trace_stack frame->tf_rbp 20
>>
>> Thank you for this script.
>> Here is an example from my practice.
>>
>> (kgdb) bt
>> #0 doadump (textdump=1) at /usr/src/sys/kern/kern_shutdown.c:291
>> #1 0xffffffff8063453f in kern_reboot (howto=260) at
>> /usr/src/sys/kern/kern_shutdown.c:359
>> #2 0xffffffff80634ba4 in vpanic (fmt=<value optimized out>, ap=<value optimized
>> out>) at /usr/src/sys/kern/kern_shutdown.c:635
>> #3 0xffffffff806348a3 in panic (fmt=0x0) at /usr/src/sys/kern/kern_shutdown.c:568
>> #4 0xffffffff8041bba7 in db_panic (addr=<value optimized out>, have_addr=false,
>> count=0, modif=0x0) at /usr/src/sys/ddb/db_command.c:473
>> #5 0xffffffff8041b67b in db_command (cmd_table=0x0) at
>> /usr/src/sys/ddb/db_command.c:440
>> #6 0xffffffff8041b524 in db_command_loop () at /usr/src/sys/ddb/db_command.c:493
>> #7 0xffffffff8041de0b in db_trap (type=<value optimized out>, code=0) at
>> /usr/src/sys/ddb/db_main.c:251
>> #8 0xffffffff80669de8 in kdb_trap (type=19, code=0, tf=0xffffffff80f976d0) at
>> /usr/src/sys/kern/subr_kdb.c:653
>> #9 0xffffffff80820d26 in trap (frame=0xffffffff80f976d0) at
>> /usr/src/sys/amd64/amd64/trap.c:381
>> #10 0xffffffff80809623 in nmi_calltrap () at
>> /usr/src/sys/libkern/explicit_bzero.c:28
>
> This may be part of the problem. The trapframe unwinder depends on function names
> to know when it is crossing a trapframe. nmi_calltrap() is not the function at
> explicit_bzero.c:28. Usually debugging this sort of thing starts by going to frame 11
> and comparing its registers with the values in the trapframe. They should match, but
> sometimes you will find them shifted by one or two, etc.

And it seems that nmi_calltrap being a label within an assembler-defined
procedure confuses the in-tree kgdb quite a lot:

(kgdb) list *0xffffffff80809623
0xffffffff80809623 is at /usr/src/sys/libkern/explicit_bzero.c:28.
23      void
24      explicit_bzero(void *buf, size_t len)
25      {
26              memset(buf, 0, len);
27              __explicit_bzero_hook(buf, len);
28      }
(kgdb) list nmi_calltrap
23      void
24      explicit_bzero(void *buf, size_t len)
25      {
26              memset(buf, 0, len);
27              __explicit_bzero_hook(buf, len);
28      }
(kgdb) disassemble nmi_calltrap
Dump of assembler code for function nmi_calltrap:
0xffffffff8080961b <nmi_calltrap+0>: mov %rsp,%rdi
0xffffffff8080961e <nmi_calltrap+3>: callq 0xffffffff80820670 <trap>
0xffffffff80809623 <nmi_calltrap+8>: test %ebx,%ebx
0xffffffff80809625 <nmi_calltrap+10>: je 0xffffffff80809695 <nocallchain>
0xffffffff80809627 <nmi_calltrap+12>: mov %gs:0x0,%rax
0xffffffff80809630 <nmi_calltrap+21>: or %rax,%rax
0xffffffff80809633 <nmi_calltrap+24>: je 0xffffffff80809695 <nocallchain>
0xffffffff80809635 <nmi_calltrap+26>: testl $0x400000,0xec(%rax)
0xffffffff8080963f <nmi_calltrap+36>: je 0xffffffff80809695 <nocallchain>
0xffffffff80809641 <nmi_calltrap+38>: mov %rsp,%rsi
0xffffffff80809644 <nmi_calltrap+41>: mov $0xc0,%rcx
0xffffffff8080964b <nmi_calltrap+48>: mov %gs:0x220,%rdx
0xffffffff80809654 <nmi_calltrap+57>: sub %rcx,%rdx
0xffffffff80809657 <nmi_calltrap+60>: mov %rdx,%rdi
0xffffffff8080965a <nmi_calltrap+63>: shr $0x3,%rcx
0xffffffff8080965e <nmi_calltrap+67>: cld
0xffffffff8080965f <nmi_calltrap+68>: rep movsq %ds:(%rsi),%es:(%rdi)
0xffffffff80809662 <nmi_calltrap+71>: mov %ss,%eax
0xffffffff80809664 <nmi_calltrap+73>: push %rax
0xffffffff80809665 <nmi_calltrap+74>: push %rdx
0xffffffff80809666 <nmi_calltrap+75>: pushfq
0xffffffff80809667 <nmi_calltrap+76>: mov %cs,%eax
0xffffffff80809669 <nmi_calltrap+78>: push %rax
0xffffffff8080966a <nmi_calltrap+79>: pushq $0xffffffff80809671
0xffffffff8080966f <nmi_calltrap+84>: iretq
End of assembler dump.
(kgdb) disassemble explicit_bzero
Dump of assembler code for function explicit_bzero:
0xffffffff806e74c0 <explicit_bzero+0>: push %rbp
0xffffffff806e74c1 <explicit_bzero+1>: mov %rsp,%rbp
0xffffffff806e74c4 <explicit_bzero+4>: push %r14
0xffffffff806e74c6 <explicit_bzero+6>: push %rbx
0xffffffff806e74c7 <explicit_bzero+7>: mov %rsi,%r14
0xffffffff806e74ca <explicit_bzero+10>: mov %rdi,%rbx
0xffffffff806e74cd <explicit_bzero+13>: callq 0xffffffff806e74f0 <memset>
0xffffffff806e74d2 <explicit_bzero+18>: mov %rbx,%rdi
0xffffffff806e74d5 <explicit_bzero+21>: mov %r14,%rsi
0xffffffff806e74d8 <explicit_bzero+24>: callq 0xffffffff8088a2d0 <__explicit_bzero_hook>
0xffffffff806e74dd <explicit_bzero+29>: pop %rbx
0xffffffff806e74de <explicit_bzero+30>: pop %r14
0xffffffff806e74e0 <explicit_bzero+32>: pop %rbp
0xffffffff806e74e1 <explicit_bzero+33>: retq
End of assembler dump.

The newer kgdb is smarter about this situation:

(kgdb) list *0xffffffff80809623
0xffffffff80809623 is at /usr/src/sys/amd64/amd64/exception.S:527.
522      * - Check if the thread requires a user call chain to be
523      *   captured.
524      *
525      * We are still in NMI mode at this point.
526      */
527             testl %ebx,%ebx
528             jz nocallchain /* not from userspace */
529             movq PCPU(CURTHREAD),%rax
530             orq %rax,%rax /* curthread present? */
531             jz nocallchain

However, that does not seem to help with stack unwinding.

-- 
Andriy Gapon
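
For readers who want to see what the trace_stack macro quoted earlier in this message does, here is a rough Python sketch of the same frame-pointer walk. It is only an illustration: it assumes every frame saves %rbp with the return address one word above it, and the memory dictionary and all addresses in it are made up for the example, not taken from the dump discussed here.

```python
WORD = 8  # sizeof(void *) on amd64


def trace_stack(memory, frame_ptr, max_iters):
    """Walk a chain of saved frame pointers, like the quoted gdb macro.

    memory is a hypothetical stand-in for the core image: it maps an
    address to the 64-bit word stored there.
    Returns a list of (frame_ptr, return_address) pairs.
    """
    frames = []
    iters = 0
    while frame_ptr != 0 and iters < max_iters:
        # The return address sits one word above the saved frame pointer.
        ret_addr = memory[frame_ptr + WORD]
        frames.append((frame_ptr, ret_addr))
        # The word at frame_ptr is the caller's saved frame pointer.
        frame_ptr = memory[frame_ptr]
        iters += 1
    return frames


# Invented three-frame stack; the last saved frame pointer is 0,
# which terminates the walk.
memory = {
    0x1000: 0x1040, 0x1008: 0xffffffff80001111,
    0x1040: 0x1080, 0x1048: 0xffffffff80002222,
    0x1080: 0x0,    0x1088: 0xffffffff80003333,
}

for fp, ret in trace_stack(memory, 0x1000, 20):
    print("frameptr=%#x, ret_addr=%#x" % (fp, ret))
```

The second argument plays the role of frame->tf_rbp in the macro invocation, and the iteration limit guards against a corrupted or cyclic chain in the same way $arg1 does.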