Skip site navigation (1)Skip section navigation (2)
Date:      Fri, 09 Mar 2018 15:33:37 -0800
From:      John Baldwin <jhb@freebsd.org>
To:        Bruce Evans <brde@optusnet.com.au>
Cc:        Andriy Gapon <avg@freebsd.org>, src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org
Subject:   Re: svn commit: r330338 - head/sys/amd64/amd64
Message-ID:  <1997852.rATkjh2guz@ralph.baldwin.cx>
In-Reply-To: <20180310070452.P5919@besplex.bde.org>
References:  <201803031510.w23FAbeC065867@repo.freebsd.org> <2557369.6nFzd3kAUm@ralph.baldwin.cx> <20180310070452.P5919@besplex.bde.org>

next in thread | previous in thread | raw e-mail | index | archive | help
On Saturday, March 10, 2018 07:41:30 AM Bruce Evans wrote:
> On Fri, 9 Mar 2018, John Baldwin wrote:
> 
> > On Saturday, March 03, 2018 03:10:37 PM Andriy Gapon wrote:
> >> Author: avg
> >> Date: Sat Mar  3 15:10:37 2018
> >> New Revision: 330338
> >> URL: https://svnweb.freebsd.org/changeset/base/330338
> >>
> >> Log:
> >>   db_nextframe/amd64: catch up with r328083 to recognize fast_syscall_common
> >>
> >>   Since that change the system call stack traces look like this:
> >>     ...
> >>     sys___sysctl() at sys___sysctl+0x5f/frame 0xfffffe0028e13ac0
> >>     amd64_syscall() at amd64_syscall+0x79b/frame 0xfffffe0028e13bf0
> >>     fast_syscall_common() at fast_syscall_common+0x101/frame 0xfffffe0028e13bf0
> >>   So, db_nextframe() stopped recognizing the system call frame.
> >>   This commit should fix that.
> >>
> >>   Reviewed by:	kib
> >>   MFC after:	4 days
> >>
> >> Modified:
> >>   head/sys/amd64/amd64/db_trace.c
> >>
> >> Modified: head/sys/amd64/amd64/db_trace.c
> >> ==============================================================================
> >> --- head/sys/amd64/amd64/db_trace.c	Sat Mar  3 13:20:44 2018	(r330337)
> >> +++ head/sys/amd64/amd64/db_trace.c	Sat Mar  3 15:10:37 2018	(r330338)
> >> @@ -212,7 +212,9 @@ db_nextframe(struct amd64_frame **fp, db_addr_t *ip, s
> >>  		    strcmp(name, "Xcpususpend") == 0 ||
> >>  		    strcmp(name, "Xrendezvous") == 0)
> >>  			frame_type = INTERRUPT;
> >> -		else if (strcmp(name, "Xfast_syscall") == 0)
> >> +		else if (strcmp(name, "Xfast_syscall") == 0 ||
> >> +		    strcmp(name, "Xfast_syscall_pti") == 0 ||
> >> +		    strcmp(name, "fast_syscall_common") == 0)
> >>  			frame_type = SYSCALL;
> >
> > I think you actually just want to replace Xfast_syscall with
> > fast_syscall_common.  Neither Xfast_syscall nor Xfast_syscall_pti call any
> > functions before jumping to the common label, so when unwinding from a system
> > call you should always get the common label.  (That is, I think we should
> > remove Xfast_syscall and Xfast_syscall_pti here.  Any stack trace that
> > happens to find those symbols during unwinding won't have a valid SYSCALL
> > frame to unwind.)
> 
> No, it needs these symbols to decode the frame after reaching a point where
> the frame is actually set up.

My point is that during the instructions from Xfast_syscall to fast_syscall_common
there isn't a valid frame.  Xfast_syscall has two instructions and hasn't yet
decremented %rsp to create room for the trapframe for example.  If you wanted to
handle the special case of stepping through those functions you'd have to create
a new type of frame that used register values from the saved frame for the
debug trap.  You can't assume that there's a 'struct trapframe' at 'rbp + 16'.

> Also, in uncommitted fixes I add some decoding of the non-frame between
> the entry point and when the frame is set up.  Then the frame register
> is still the user's for the syscall case, so should not even be accessed.
> The i386 output looks like this:
> 
> current:
> XX12: Breakpoint at   Xint0x80_syscall:       pushl   $0x2
> XX12: db> t
> XX12: Tracing pid 1 tid 100001 td 0xc6fad000
> XX12: Xint0x80_syscall(33,282,bfbfee0c,3b,0,...) at Xint0x80_syscall/frame 0xd1d05bd8
> XX12: kernload(2,bfbfeec4,bfbfeed0,804812b,80a3d64,...) at 0x805188f/frame 0xbfbfee7c
> 
> fixed:
> XX12F: Breakpoint at   Xint0x80_syscall:       pushl   $0x2
> XX12F: db> t
> XX12F: Tracing pid 1 tid 100001 td 0xd4c23000
> XX12F: Xint0x80_syscall(2,bfbfeec4,bfbfeed0,804812b,80a3d64,...) at Xint0x80_syscall/frame 0xbfbfee7c
> XX12F: --- unknown trap, ebp = 0xbfbfee7c, sp = 0xd3399bdc, ks = 0xd3398000, kse = 0xd339a000 ---
> 
> where most of the values on the last line are for debugging (ebp is the user
> stack pointer which will become inaccessible, sp is the local variable sp
> and [ks, kse - 1] is the thread's kernel stack range (sp must be in this
> else the backtrace stops).

Yes, both of these symbols would only be found instructions for this type of
special frame.  Using the 'SYSCALL' frame type for the Xfast_syscall* symbols
would always be wrong.  Until such time as we have the new frame type we should
just ignore them.

> Jumps and labels with names inside functions complicate things.  I think
> fast_syscall_common needs to be in the list too, and the many alltraps
> labels should have been there.  This will be more useful with my fix.
> The label calltrap has always been in the list.  This works right since
> the frame has been set up then -- IIRC it is the first place where the
> frame has been set up, and label it more for gdb than for ddb, and decode
> the frame for ddb (presumably gdb decodes the frame too).

gdb does depend on the names, and I was looking at this commit again to see if
I needed to update gdb.  I thought I didn't, but now I see that gdb was depending
on the 'X' prefix for the old Xfast_syscall name and it now needs to check for
fast_syscall_common directly.

-- 
John Baldwin



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?1997852.rATkjh2guz>