Date: Mon, 20 Aug 2018 11:53:03 -0700
From: John-Mark Gurney <jmg@funkthat.com>
To: Ian Lepore <ian@freebsd.org>, freebsd-arm@FreeBSD.org
Cc: manu@FreeBSD.org
Subject: Re: sx_sleep not waking up when timo expires
Message-ID: <20180820185303.GH97145@funkthat.com>
In-Reply-To: <20180811020136.GD97145@funkthat.com>
References: <20180729010157.GC2884@funkthat.com> <1532874944.61594.110.camel@freebsd.org> <20180811020136.GD97145@funkthat.com>

John-Mark Gurney wrote this message on Fri, Aug 10, 2018 at 19:01 -0700:
> Also, I've had the shell command sleep hang as well..  I figure that's
> expected, but made me realized that a good test program could be to
> fire up a bunch of threads and sleep in them, to make finding the
> problem more quickly....
>
> Anything I can do to help debug/fix it?
>
> I have a couple spare LTS boards specifically to do stuff like this.

I wrote a program to trigger the issue.  It triggered the issue in only
an hour or two on both of the A64-LTS boards that I've tried it on.
Hopefully this can help others debug it.

On my firewall board, which handles a lot of interrupts, it happens a
lot more frequently.  In the last 4 hours or so of running the program,
I've had 6 threads hang in sleep.

# vmstat -i
interrupt                          total       rate
gic0,p11: +                    664239597        676
gic0,s0: uart0                     10659          0
gic0,s60: aw_mmc0                  70294          0
gic0,s82: awg0                 452027133        460
cpu0:ast                             511          0
cpu1:ast                              48          0
cpu2:ast                              34          0
cpu3:ast                              35          0
cpu0:preempt                    15682717         16
cpu1:preempt                    14384242         15
cpu2:preempt                    16722306         17
cpu3:preempt                    16798837         17
cpu0:rendezvous                   300161          0
cpu1:rendezvous                     8545          0
cpu2:rendezvous                   300183          0
cpu3:rendezvous                   300115          0
cpu0:hardclock                     35093          0
Total                         1180880510       1201
# uptime
11:50AM  up 11 days,  9:03, 8 users, load averages: 0.68, 0.83, 0.93

The other box, which has had only two threads freeze, has a total
interrupt rate of 77.

---- sleeptest.py ----
import Queue
import threading
import time
import random

def sleepfun(q, lngth, extlst, idobj):
    # Sleep in a loop and check in on the queue after every wake-up.
    while not extlst:
        #factor = (random.random() + 1) * 4
        factor = 1
        factor = (random.random() * .5 + 1)
        time.sleep(lngth * factor)
        q.put((idobj, time.time()))

def run():
    sleeplength = .5
    exitlist = []
    nthreads = 20
    q = Queue.Queue()
    thds = {}
    lastcheck = {}
    for i in xrange(nthreads):
        obj = object()
        thr = threading.Thread(target=sleepfun, args=(q, sleeplength, exitlist, obj))
        thds[obj] = thr
        lastcheck[obj] = time.time()
        thr.start()

    try:
        while True:
            # Drain a batch of check-ins before looking for stragglers.
            for i in xrange(nthreads*3):
                obj, tm = q.get()
                lastcheck[obj] = tm

            cur = time.time()
            for i in lastcheck.keys():
                if not thds[i].isAlive():
                    print 'thread died.'
                    del thds[i]
                    del lastcheck[i]
                    continue

                print 'last checkin:', cur - lastcheck[i]
                if cur - lastcheck[i] > 2 * sleeplength:
                    # Report the thread being checked (i), not the last
                    # object pulled off the queue.
                    print 'thread is stuck:', `i`, 'since:', time.ctime(lastcheck[i])
    except KeyboardInterrupt:
        print 'trying to exit...'
        print time.ctime(time.time())
        exitlist.append(True)
        for i in thds:
            thds[i].join()

if __name__ == '__main__':
    run()
---- sleeptest.py ----

-- 
  John-Mark Gurney				Voice: +1 415 225 5579

     "All that I will do, has been done, All that I have, has not."
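
For anyone without a Python 2 interpreter, the same idea (start a batch of threads that sleep in a loop, and flag any thread whose last check-in is more than twice the sleep length old) can be sketched in Python 3 roughly as follows. This is an approximation of the script above, not the script that was run on the boards. The names sleeper and find_stuck, the rounds parameter, the integer thread ids, the use of time.monotonic() and the daemon threads are all assumptions made for this sketch.

```python
import queue
import random
import threading
import time


def sleeper(q, length, stop, ident):
    """Sleep in a loop and report a timestamp after every wake-up."""
    while not stop.is_set():
        # Jitter the sleep between 1.0x and 1.5x of the base length.
        time.sleep(length * (random.random() * 0.5 + 1))
        q.put((ident, time.monotonic()))


def find_stuck(lastcheck, now, limit):
    """Return the ids whose last check-in is more than `limit` seconds old."""
    return [ident for ident, seen in lastcheck.items() if now - seen > limit]


def run(nthreads=20, sleeplength=0.5, rounds=None):
    """Start the sleepers and print any that stop checking in.

    `rounds` is a hypothetical knob for short runs; None loops until Ctrl-C.
    Returns the ids flagged as stuck in the last completed round.
    """
    q = queue.Queue()
    stop = threading.Event()
    threads = {}
    lastcheck = {}
    for ident in range(nthreads):
        # Daemon threads (an assumption of this sketch) so that a thread
        # that never wakes up cannot keep the interpreter from exiting.
        thr = threading.Thread(target=sleeper,
                               args=(q, sleeplength, stop, ident),
                               daemon=True)
        threads[ident] = thr
        lastcheck[ident] = time.monotonic()
        thr.start()

    stuck = []
    done = 0
    try:
        while rounds is None or done < rounds:
            # Drain a batch of check-ins before looking for stragglers.
            for _ in range(nthreads * 3):
                ident, seen = q.get()
                lastcheck[ident] = seen
            now = time.monotonic()
            stuck = find_stuck(lastcheck, now, 2 * sleeplength)
            for ident in stuck:
                print('thread is stuck:', ident,
                      'seconds since last checkin:', now - lastcheck[ident])
            done += 1
    except KeyboardInterrupt:
        print('trying to exit...')
    finally:
        stop.set()
        for thr in threads.values():
            # Bounded join: a hung sleeper would otherwise block the exit.
            thr.join(timeout=4 * sleeplength)
    return stuck
```

Calling run() with no arguments matches the 20 threads and .5 second sleep used above and loops until interrupted. Like the original, the main loop blocks in q.get() if every thread hangs at once.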