Date: Fri, 10 Aug 2018 19:01:36 -0700
From: John-Mark Gurney
To: Ian Lepore
Cc: freebsd-arm@FreeBSD.org
Subject: Re: sx_sleep not waking up when timo expires
Message-ID: <20180811020136.GD97145@funkthat.com>
References: <20180729010157.GC2884@funkthat.com> <1532874944.61594.110.camel@freebsd.org>
In-Reply-To: <1532874944.61594.110.camel@freebsd.org>

Ian Lepore wrote this message on Sun, Jul 29, 2018 at 08:35 -0600:
> On Sat, 2018-07-28 at 18:01 -0700, John-Mark Gurney wrote:
> > I recently upgraded my router to a Pine A64-LTS board, and have hit
> > the same issue as PR 222126[1].  The solution at the end does not work
> > for me, as I do not have that line in my loader.conf:
> > kern.timecounter.smp_tsc_adjust=1
> >
> > I have verified that the wake up does not happen, as I used a dtrace
> > script to verify that pf_purge_expired_states is called or not called..
> > When I change the timeout, pf will kick the thread and get things
> > running again, but it has stopped a couple times later...
> >
> > I'm running a recent SNAPSHOT:
> > FreeBSD gate2.funkthat.com 12.0-CURRENT FreeBSD 12.0-CURRENT #0 r336134: Mon Jul  9 19:20:11 UTC 2018     root@releng3.nyi.freebsd.org:/usr/obj/usr/src/arm64.aarch64/sys/GENERIC  arm64
> >
> > This is likely reproducible by just starting pf, even in a pass all
> > mode, and watching for when the function stops getting called...  I'll
> > see if I can't get an extremely minimal config to reproduce it.
> >
> > Any suggestions?
> >
> > [1] https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=222126
>
> Sounds like
>
> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=229644
>
> which has some patches attached which reduce but don't quite eliminate
> the occurrences, so nothing has been committed yet. I just ordered a
> SOPINE board so I can do some hands-on debugging.

That patch does not fix the problem.
It didn't even take 2 days with the patch applied before I got the
failure. The pf purge thread stopped with that patch applied:

# ps laxwww | grep 'pf purge' | grep -v grep; sleep 5; ps laxwww | grep 'pf purge' | grep -v grep
0 614 0 0 -16 0 0 16 pftm DL - 15:15.23 [pf purge]
0 614 0 0 -16 0 0 16 pftm DL - 15:15.23 [pf purge]

The thread uses CPU even without traffic running, so simply loading pf
and then waiting until the CPU time stops incrementing is another easy
way to test for it.

Also, I've had the shell command sleep hang as well. I figure that's
expected, but it made me realize that a good test program could be to
fire up a bunch of threads and sleep in them, to find the problem more
quickly.

Anything I can do to help debug/fix it?  I have a couple of spare LTS
boards specifically to do stuff like this.

Thanks.

-- 
  John-Mark Gurney				Voice: +1 415 225 5579

     "All that I will do, has been done, All that I have, has not."
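
For reference, the "bunch of threads that sleep" test idea from the
message could be sketched roughly as below. This is only an illustration
in Python, not a program from this thread: the names sleep_worker and
run_sleep_test, the thread count, the sleep interval and the grace period
are all assumptions, and it has not been tried on the affected boards.
The watchdog busy-polls the monotonic clock on purpose, so that it does
not itself depend on a timed sleep that might never wake up.

```python
import threading
import time


def sleep_worker(interval, rounds):
    """Sleep `rounds` times for `interval` seconds each, then exit."""
    for _ in range(rounds):
        time.sleep(interval)


def run_sleep_test(nthreads=8, interval=0.05, rounds=20, grace=5.0):
    """Start sleeping threads and return how many are still asleep
    `grace` seconds after they should all have finished.

    All parameters are hypothetical defaults chosen for illustration.
    """
    threads = [
        threading.Thread(target=sleep_worker, args=(interval, rounds),
                         daemon=True)
        for _ in range(nthreads)
    ]
    for t in threads:
        t.start()

    # Expected run time of each worker plus a generous grace period.
    deadline = time.monotonic() + interval * rounds + grace

    # Busy-poll instead of sleeping or join(timeout): the watchdog should
    # not rely on the same timed-wakeup path that is under suspicion.
    while time.monotonic() < deadline and any(t.is_alive() for t in threads):
        pass

    # Any thread still alive past the deadline never woke from its sleep.
    return sum(t.is_alive() for t in threads)


if __name__ == "__main__":
    stuck = run_sleep_test()
    print("threads still asleep past the deadline:", stuck)
```

Running it in a loop with more threads should raise the odds of hitting
a missed wakeup; a non-zero count means at least one sleep never
returned within the grace period.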