Date: Mon, 01 Dec 2003 18:01:59 -0800
From: Peter Wemm <peter@wemm.org>
To: James Van Artsdalen <james-freebsd-amd64@jrv.org>
Cc: freebsd-amd64@freebsd.org
Subject: Re: Varargs issues
Message-ID: <20031202020159.2897C2A8DA@canning.wemm.org>
In-Reply-To: <200312020011.hB20B4A4068504@bigtex.jrv.org>

James Van Artsdalen wrote:
> I looked at the example I was suspicious of and realized that I had
> overlooked that the caller implicitly allocates 8 bytes in the stack
> due to the CALL opcode.  It's perfectly reasonable for a function to
> start with something like:
>
> doo:
> 	push (64 bit) val
> 	call z
>
> since the return address + the push makes an even 16 bytes.
>
> There are two different pthread_create () functions.  The one in
> /usr/src/lib/libc_r/uthread/uthread_create.c does this to ensure that
> the stack starts off right:
>
> 	SET_STACK_JB(new_thread->ctx.jb,
> 	    (long)new_thread->stack + pattr->stacksize_attr
> 	    - sizeof(double));
>
> sizeof (double) isn't really portable - sizeof (&_pthread_create) would
> be better - but it clearly leaves the stack with a value that gcc expects,
> i.e., an "odd" value in 8-byte units (0xfff8 instead of 0xfff0).

This stuff is all pretty broken for amd64.  I'm amazed that it even
remotely works.

The situation right now is that if a pthread_create()ed thread tries to
do floating point, it'll blow up sooner or later.  If it's not in
varargs, it'll be in the function prologue/epilogue when it saves and
restores the xmm registers.  This is usually an absolute cow to track
down.

Plus there is another hairball to deal with: the red zone.  The 128
bytes below the stack pointer are reserved for private use by leaf
functions or as a scratch area.  This means that the signal handlers
need to skip over it, and anything that copies the stack has to include
the extra 128 bytes.

ia64 does something similar.  On ia64, 16 bytes *above* the stack
pointer are available for scratch purposes.  So, if you want to use 32
bytes of stack space, you actually have to reserve 48 and not use the
bottom 16 bytes, since anything you call will likely use it.  But ia64
is a whole different animal.
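
To make the arithmetic above concrete, here is a minimal sketch in C.
It is not code from libc_r, libthr or the kernel; the helper names
(align_down16, entry_sp, below_red_zone) are made up for illustration,
and it assumes only the amd64 rules described in this thread: 16-byte
alignment at the call site, an 8-byte return address pushed by CALL,
and a 128-byte red zone below the stack pointer.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_ALIGN   16u   /* amd64 stack alignment at a call site */
#define RETADDR_SIZE  8u    /* bytes pushed by the CALL instruction */
#define RED_ZONE_SIZE 128u  /* bytes below sp owned by the running function */

/* Round an address down to a 16-byte boundary. */
static uintptr_t align_down16(uintptr_t addr)
{
    return addr & ~(uintptr_t)(STACK_ALIGN - 1);
}

/*
 * Stack pointer a C function expects at entry: the value it would see
 * if CALL had just pushed a return address onto an aligned stack.
 * Hypothetical helper, not an existing libc_r or kernel routine.
 */
static uintptr_t entry_sp(uintptr_t stack_top)
{
    return align_down16(stack_top) - RETADDR_SIZE;
}

/*
 * Highest 16-byte-aligned address that code interrupting a function
 * (a signal frame builder, for instance) can use without touching the
 * interrupted function's red zone.  Hypothetical helper.
 */
static uintptr_t below_red_zone(uintptr_t interrupted_sp)
{
    return align_down16(interrupted_sp - RED_ZONE_SIZE);
}

/* Worked examples using the addresses quoted in the thread. */
static void check_stack_examples(void)
{
    /* A stack ending at 0x10000 starts a thread at 0xfff8, not 0xfff0. */
    assert(entry_sp(0x10000) == 0xfff8);
    assert(entry_sp(0x10000) % STACK_ALIGN == RETADDR_SIZE);

    /* 0xfff8 - 128 = 0xff78, rounded down to 0xff70. */
    assert(below_red_zone(0xfff8) == 0xff70);
}
```

Calling check_stack_examples() reproduces the "odd value in 8-byte
units" from the quoted text.  Whether a given entry path needs the
8-byte adjustment still depends on whether execution really begins in C
code, so treat entry_sp as an illustration of the convention rather
than a drop-in fix.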

> But, in /usr/src/lib/libthr/thread/thr_create.c I see this:
>
> 	new_thread->ctx.uc_stack.ss_sp = new_thread->stack;
>
> and no evidence anywhere that the return address gcc expects to already
> be pushed on the stack is accounted for.

libthr isn't built on amd64....

> Perhaps this should be something
> like:
>
> 	new_thread->ctx.uc_stack.ss_sp = new_thread->stack;
> #if !defined(__ia64__)
> 	new_thread->ctx.uc_stack.ss_sp -= sizeof (&_pthread_create);
> #endif

Wrong cpu platform. :-)  Although again, this depends on the assumption
that you are beginning execution in C code.  If it's assembler, it might
just do a plain abi-breaking 'call' with odd padding to fix it up by
accident.  The kernel does a lot of this.  I lost lots of sleep over
this.

Oh, and what's even better is that in some cases the cpu hardware rounds
the stack for you and in other cases it doesn't... that caused some
excitement, I can assure you.

Cheers,
-Peter
--
Peter Wemm - peter@wemm.org; peter@FreeBSD.org; peter@yahoo-inc.com
"All of this is for nothing if we don't go to the stars" - JMS/B5