Date: Thu, 14 Feb 2013 22:05:56 -0500 (EST)
From: Rick Macklem <rmacklem@uoguelph.ca>
To: Marc Fournier <scrappy@hub.org>
Cc: Konstantin Belousov <kostikbel@gmail.com>, freebsd-stable@freebsd.org, John Baldwin <jhb@freebsd.org>
Subject: Re: 9-STABLE -> NFS -> NetAPP:
Message-ID: <1964289267.3041689.1360897556427.JavaMail.root@erie.cs.uoguelph.ca>
In-Reply-To: <9A149E78-BB4F-414D-AAE5-331C5934FF82@hub.org>
[-- Attachment #1 --]
Marc Fournier wrote:
> On 2013-02-13, at 3:54 PM, Rick Macklem <rmacklem@uoguelph.ca> wrote:
>
> >>
> > The pid that is in "T" state for the "ps auxlH".
>
> Different server, last kernel update on Jan 22nd, https process this
> time instead of du last time.
>
> I've attached:
>
> ps auxlH
> ps auxlH of just the processes that are in TJ state (6 httpd servers)
> procstat output for each of the 6 process
>
>
>
>
> They are included as attachments … if these don't make it through, let
> me know, just figured I'd try and keep it compact ...
Well, I've looked at this call path a little closer:
16693 104135 httpd - mi_switch+0x186 thread_suspend_check+0x19f sleepq_catch_signals+0x1c5
sleepq_timedwait_sig+0x19 _sleep+0x2ca clnt_vc_call+0x763 clnt_reconnect_call+0xfb newnfs_request+0xadb
nfscl_request+0x72 nfsrpc_accessrpc+0x1df nfs34_access_otw+0x56 nfs_access+0x306 vn_open_cred+0x5a8
kern_openat+0x20a amd64_syscall+0x540 Xfast_syscall+0xf7

I am probably way off, since I am not familiar with this stuff, but it
seems to me that thread_suspend_check() should just return 0 for the
case where stop_allowed == SIG_STOP_NOT_ALLOWED (TDF_SBDRY flag set)
instead of sitting in the loop and doing a mi_switch(). I'm not even
sure if it should call thread_suspend_check() for this case, but there
are cases in thread_suspend_check() that I don't understand.

Although I don't really understand thread_suspend_check(), I've attached
a simple patch that might be a starting point for fixing this?

I wouldn't recommend trying the patch until kib and/or jhb weigh in
on whether it makes any sense.

rick
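
The behaviour proposed above can be illustrated in isolation with a small userland model. This is only a sketch of the decision the attached patch adds, not kernel code: model_suspend_check(), the MODEL_* constants and the suspend_pending argument are hypothetical stand-ins, and the real thread_suspend_check() handles further cases (single-threading, exit) that are not modelled here.

```c
/* Hypothetical stand-ins for SIG_STOP_ALLOWED / SIG_STOP_NOT_ALLOWED. */
enum model_stop {
	MODEL_STOP_ALLOWED,
	MODEL_STOP_NOT_ALLOWED
};

/* Sentinel meaning "the real function would suspend the thread here". */
#define	MODEL_WOULD_SUSPEND	(-1)

/*
 * Toy model of the suspend decision.
 *
 * suspend_pending: nonzero if the process has asked its threads to stop
 *                  (an assumed simplification of the kernel's loop test).
 * return_instead:  nonzero if the caller wants an error code rather than
 *                  being suspended, as sleepq_catch_signals() does.
 * stop_allowed:    whether this sleep may be interrupted by a stop.
 */
static int
model_suspend_check(int suspend_pending, int return_instead,
    enum model_stop stop_allowed)
{
	if (!suspend_pending)
		return (0);

	/*
	 * The proposed early return: a sleeper that must not be stopped
	 * is told to carry on instead of being switched out.
	 */
	if (stop_allowed == MODEL_STOP_NOT_ALLOWED && return_instead)
		return (0);

	/* The kernel would suspend the thread here; the model reports it. */
	return (MODEL_WOULD_SUSPEND);
}
```

With a stop pending, model_suspend_check(1, 1, MODEL_STOP_NOT_ALLOWED) returns 0, which corresponds to the NFS RPC sleep in the trace continuing to wait for its reply, while model_suspend_check(1, 1, MODEL_STOP_ALLOWED) reports that the thread would be suspended.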
[-- Attachment #2 --]
--- kern/subr_sleepqueue.c.sav 2013-02-14 20:39:47.000000000 -0500
+++ kern/subr_sleepqueue.c 2013-02-14 21:03:03.000000000 -0500
@@ -443,7 +443,7 @@ sleepq_catch_signals(void *wchan, int pr
 	sig = cursig(td, stop_allowed);
 	if (sig == 0) {
 		mtx_unlock(&ps->ps_mtx);
-		ret = thread_suspend_check(1);
+		ret = thread_suspend_check(1, stop_allowed);
 		MPASS(ret == 0 || ret == EINTR || ret == ERESTART);
 	} else {
 		if (SIGISMEMBER(ps->ps_sigintr, sig))
--- kern/kern_exit.c.sav 2013-02-14 21:04:21.000000000 -0500
+++ kern/kern_exit.c 2013-02-14 21:04:50.000000000 -0500
@@ -159,7 +159,7 @@ exit1(struct thread *td, int rv)
 		 * First check if some other thread got here before us.
 		 * If so, act appropriately: exit or suspend.
 		 */
-		thread_suspend_check(0);
+		thread_suspend_check(0, SIG_STOP_ALLOWED);
 
 		/*
 		 * Kill off the other threads. This requires
--- kern/kern_sig.c.sav 2013-02-14 21:05:06.000000000 -0500
+++ kern/kern_sig.c 2013-02-14 21:05:40.000000000 -0500
@@ -1463,7 +1463,7 @@ kern_sigsuspend(struct thread *td, sigse
 		while (msleep(&p->p_sigacts, &p->p_mtx, PPAUSE|PCATCH, "pause",
 			0) == 0)
 			/* void */;
-		thread_suspend_check(0);
+		thread_suspend_check(0, SIG_STOP_ALLOWED);
 		mtx_lock(&p->p_sigacts->ps_mtx);
 		while ((sig = cursig(td, SIG_STOP_ALLOWED)) != 0)
 			has_sig += postsig(sig);
--- kern/kern_thread.c.sav 2013-02-14 21:07:06.000000000 -0500
+++ kern/kern_thread.c 2013-02-14 21:44:10.000000000 -0500
@@ -762,7 +762,7 @@ stopme:
  * return_instead is set.
  */
 int
-thread_suspend_check(int return_instead)
+thread_suspend_check(int return_instead, int stop_allowed)
 {
 	struct thread *td;
 	struct proc *p;
@@ -794,6 +794,9 @@ thread_suspend_check(int return_instead)
 		    (p->p_flag & P_SINGLE_BOUNDARY) && return_instead)
 			return (ERESTART);
 
+		if (stop_allowed == SIG_STOP_NOT_ALLOWED && return_instead)
+			return (0);
+
 		/*
 		 * If the process is waiting for us to exit,
 		 * this thread should just suicide.
--- kern/subr_trap.c.sav 2013-02-14 21:09:43.000000000 -0500
+++ kern/subr_trap.c 2013-02-14 21:10:02.000000000 -0500
@@ -283,7 +283,7 @@ ast(struct trapframe *framep)
 	 */
 	if (flags & TDF_NEEDSUSPCHK) {
 		PROC_LOCK(p);
-		thread_suspend_check(0);
+		thread_suspend_check(0, SIG_STOP_ALLOWED);
 		PROC_UNLOCK(p);
 	}
 
--- sys/proc.h.sav 2013-02-14 21:10:58.000000000 -0500
+++ sys/proc.h 2013-02-14 21:12:01.000000000 -0500
@@ -943,7 +943,7 @@ void thread_stopped(struct proc *p);
 void childproc_stopped(struct proc *child, int reason);
 void childproc_continued(struct proc *child);
 void childproc_exited(struct proc *child);
-int thread_suspend_check(int how);
+int thread_suspend_check(int how, int stop_allowed);
 void thread_suspend_switch(struct thread *);
 void thread_suspend_one(struct thread *td);
 void thread_unlink(struct thread *td);
