Date: Tue, 15 Apr 2003 22:58:38 -0400 (EDT)
From: Daniel Eischen <eischen@pcnet1.pcnet.com>
To: David Xu <davidxu@freebsd.org>
Cc: freebsd-threads@freebsd.org
Subject: Re: libpthread patch
Message-ID: <Pine.GSO.4.10.10304152253300.25176-100000@pcnet1.pcnet.com>
In-Reply-To: <005501c303b8$cda19390$f001a8c0@davidw2k>

On Wed, 16 Apr 2003, David Xu wrote:

> ----- Original Message -----
> From: "Daniel Eischen" <eischen@pcnet1.pcnet.com>
> To: "David Xu" <davidxu@freebsd.org>
> Cc: <freebsd-threads@freebsd.org>; "Craig Rodrigues" <rodrigc@attbi.com>
> Sent: Wednesday, April 16, 2003 5:26 AM
> Subject: Re: libpthread patch
>
>
> > There's an updated patch file available at (a slightly different place):
> >
> >   http://people.freebsd.org/~deischen/kse/libpthread.diffs
> >
>
> Will test it.

I found another problem with one of my other tests.  It doesn't
seem to affect any of the ACE tests, though.  I'll continue
debugging it.

> > There's also an html'ized log of the ACE tests:
> >
> >   http://people.freebsd.org/~deischen/kse/ace_build_logs/index.html
> >
> > The only real problems seem to be with the ACE tests:
> >
> >   Cached_Conn_Test
> >   Process_Manager_Test
> >
> > And I think these have something to do with wait() or waitpid()
> > not working correctly.  David, do you know of any problems in
> > this area?  It seems that sometimes waitpid() is returning 0
> > and the next time it is called it returns the process id.
> > I wonder if it is being interrupted by a signal (either the
> > kernel doing it or the UTS by use of kse_thr_interrupt)?
> >
>
> Remember current signal handling for threaded program is
> broken in kernel, any signal can be lost in kernel because
> of thread exiting, for our M:N based threaded process, the
> case is worse than 1:1 because we exit thread more often than
> 1:1 threading, so any signal related tests will frequently
> be failed.
> some code in ACE I find :
>   for (;;)
>     {
>       int result = ACE_OS::waitpid (this->getpid (),
>                                     status,
>                                     WNOHANG);
>       if (result != 0)
>         return result;
>
>       ACE_Sig_Set alarm_or_child;
>
>       alarm_or_child.sig_add (SIGALRM);
>       alarm_or_child.sig_add (SIGCHLD);
>       ACE_Time_Value time_left = wait_until - ACE_OS::gettimeofday ();
>
>       // If ACE_OS::ualarm doesn't have sub-second resolution:
>       time_left += ACE_Time_Value (0, 500000);
>       time_left.usec (0);
>
>       if (time_left <= ACE_Time_Value::zero)
>         return 0; // timeout
>
>       ACE_OS::ualarm (time_left);
>       if (ACE_OS::sigwait (alarm_or_child) == -1)
>         return ACE_INVALID_PID;
>     }
> ...
> so you see, the code expects SIGCHLD and SIGALRM, if
> SIGCHLD lost, it would timeout and return 0;
> I did not find waitpid has bug.

I thought it might also be the UTS trying to interrupt the
thread (kse_thr_interrupt) while it was in the kernel
(assuming the UTS did get the signal).

> BTW, I have a patch for kse_release to let it direct
> return to userland and not schedule an upcall.
> the bit 0 of km_flags in kse_mailbox is used as a hint
> to tell kernel not to schedule an upcall for the kse.
> http://people.freebsd.org/~davidxu/kse_release.diff

I haven't tested that yet; that's on my list of things
to do :-)

> If nobody objects it, I will commit it.

--
Dan Eischen