Date:      Mon, 06 Jun 2022 17:44:28 +0000
From:      bugzilla-noreply@freebsd.org
To:        bugs@FreeBSD.org
Subject:   [Bug 264441] Hang with Valgrind on single CPU systems
Message-ID:  <bug-264441-227-UgxGnk1TJQ@https.bugs.freebsd.org/bugzilla/>
In-Reply-To: <bug-264441-227@https.bugs.freebsd.org/bugzilla/>
References:  <bug-264441-227@https.bugs.freebsd.org/bugzilla/>

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=264441

Mark Johnston <markj@FreeBSD.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|bugs@FreeBSD.org            |markj@FreeBSD.org
                 CC|                            |markj@FreeBSD.org

--- Comment #1 from Mark Johnston <markj@FreeBSD.org> ---
Thanks for the repro steps, I was able to trigger the problem locally.  It's
enough to pin all of the valgrind threads to the same CPU:

/tmp/valgrind # cpuset -l 1 perl tests/vg_regtest none/tests/tls

Really there are two problems here.  First, it seems that one of the threads
(the one switched out in ast()) is simply getting starved.  There are several
other always-runnable threads in the process that have a slightly higher
scheduling priority, and they end up monopolizing the CPU.  ULE has decided
that the threads are "interactive", and in this case higher priority threads
are always scheduled first.
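
As a rough picture of how a scheduler can end up classifying busy threads as
"interactive", here is a toy model.  It is only a sketch: the function names,
the constants and the formula are assumptions for illustration (loosely in the
spirit of a run-time versus sleep-time ratio), not code taken from this report
or guaranteed to match ULE.  A lower score means "more interactive".

```c
/*
 * Toy interactivity model.  All names and constants here are
 * hypothetical; they are NOT taken from the bug report.
 */
#define TOY_INTERACT_MAX	100
#define TOY_INTERACT_HALF	(TOY_INTERACT_MAX / 2)
#define TOY_INTERACT_THRESH	30	/* assumed cut-off for "interactive" */

/*
 * Map accumulated run time and sleep time (arbitrary ticks) to a score
 * in [0, TOY_INTERACT_MAX].  Threads that sleep much more than they run
 * score low; threads that mostly run score high.
 */
int
toy_interact_score(unsigned long runtime, unsigned long slptime)
{
	unsigned long div;

	if (runtime > slptime) {
		div = runtime / TOY_INTERACT_HALF;
		if (div == 0)
			div = 1;
		return (TOY_INTERACT_HALF +
		    (TOY_INTERACT_HALF - (int)(slptime / div)));
	}
	if (slptime > runtime) {
		div = slptime / TOY_INTERACT_HALF;
		if (div == 0)
			div = 1;
		return ((int)(runtime / div));
	}
	/* Equal run and sleep time: middle of the range, or 0 if both are 0. */
	return (runtime != 0 ? TOY_INTERACT_HALF : 0);
}

/* A thread counts as interactive when its score is under the threshold. */
int
toy_is_interactive(int score)
{
	return (score < TOY_INTERACT_THRESH);
}
```

In this model a thread that sleeps often, even briefly, keeps a low score and
so keeps being preferred over a thread that is runnable all the time.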

With multiple CPUs available I suppose the problem may still exist; it just
becomes harder to trigger, since with more CPUs there's less chance that the
starved thread will get stuck.  I'm guessing that the stuck thread is the one
which is supposed to be writing to the pipe.

The second problem is a livelock in the pipe code.  It seems that
pipelock()/pipeunlock() can cause reader threads to wake each other up in a
loop even when there's nothing to do.  This is because the wait channel used by
readers and writers to signal each other is the same as the one used to
serialize I/O operations among multiple concurrent readers.  It's not too hard
to write a standalone test program which triggers this, albeit unreliably.  I
suspect the livelock causes the scheduler problem, since the reader threads
keep yielding the CPU to sleep, and this boosts their interactivity score.
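
For illustration, the general shape of such a workload (several threads
blocked in read(2) on one pipe, with a single writer feeding it) might look
like the sketch below.  This is not the test program referred to above; the
names run_pipe_stress(), reader_arg and MAX_READERS are made up here, and
whether it actually provokes the livelock on a given kernel is an open
question.

```c
#include <errno.h>
#include <pthread.h>
#include <unistd.h>

#define MAX_READERS	64	/* hypothetical upper bound for this sketch */

struct reader_arg {
	int	fd;		/* read end of the shared pipe */
	long	nread;		/* bytes consumed by this reader */
};

/* Each reader blocks in read(2) on the same pipe until EOF. */
static void *
reader(void *p)
{
	struct reader_arg *ra = p;
	ssize_t n;
	char c;

	for (;;) {
		n = read(ra->fd, &c, 1);
		if (n == 1)
			ra->nread++;
		else if (n == 0)
			break;		/* writer closed its end */
		else if (errno != EINTR)
			break;		/* unexpected error, give up */
	}
	return (NULL);
}

/*
 * Start up to nreaders threads on one pipe, write nbytes single bytes,
 * close the write end and return the total number of bytes the readers
 * saw, or -1 if the setup failed.
 */
long
run_pipe_stress(int nreaders, long nbytes)
{
	pthread_t tids[MAX_READERS];
	struct reader_arg args[MAX_READERS];
	int fds[2], started = 0;
	long total = 0;

	if (nreaders < 1 || nreaders > MAX_READERS || pipe(fds) != 0)
		return (-1);
	for (int i = 0; i < nreaders; i++) {
		args[i].fd = fds[0];
		args[i].nread = 0;
		if (pthread_create(&tids[i], NULL, reader, &args[i]) != 0)
			break;		/* carry on with the threads we have */
		started++;
	}
	if (started > 0) {
		for (long i = 0; i < nbytes; i++) {
			if (write(fds[1], "x", 1) != 1)
				break;
		}
	}
	close(fds[1]);			/* readers see EOF once drained */
	for (int i = 0; i < started; i++) {
		pthread_join(tids[i], NULL);
		total += args[i].nread;
	}
	close(fds[0]);
	return (started > 0 ? total : -1);
}
```

Calling run_pipe_stress() from a small main() and running the resulting binary
under "cpuset -l 1", as with the valgrind test above, would keep all of the
threads on one CPU.  On a healthy system the return value equals nbytes.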

-- 
You are receiving this mail because:
You are the assignee for the bug.


