From: Nimrod Levy
Date: Tue, 30 Jan 2018 22:23:52 +0000
Subject: Re: Ryzen issues on FreeBSD ? (with sort of workaround)
To: Mike Tancsa
Cc: Don Lewis, freebsd-stable@freebsd.org, Peter Moody, Andriy Gapon, Pete French

That's really strange. I never saw those kinds of deadlocks, but I did
notice that if I kept the cpu busy using distributed.net I could keep the
full system lockups away for at least a week if not longer.

Not to keep harping on it, but what worked for me was lowering the memory
speed. I'm at 11 days of uptime so far without anything running the cpu.
Before the change it would lock up anywhere from an hour to a day.

On Tue, Jan 30, 2018 at 4:39 PM Mike Tancsa wrote:

> On 1/30/2018 2:51 PM, Mike Tancsa wrote:
> >
> > And sadly, I am still able to hang the compile in about the same place.
> > However, if I set
> >
>
> OK, here is a sort of work around. If I have the box a little more busy,
> I can avoid whatever deadlock is going on. In another console I have
> cat /dev/urandom | sha256
> running while the build runs
>
> ... and I can compile net/samba47 from scratch without the compile
> hanging. This problem also happens on HEAD from today. Should I start
> a new thread on freebsd-current ? Or just file a bug report ?
> The compile worked 4/4
>
> ---Mike
>
> >
> > hw.lower_amd64_sharedpage=0
> >
> > it seems to hang in a different way. CTRL+t shows
> >
> > load: 0.43 cmd: python2.7 15736 [umtxn] 165.00r 14.46u 6.65s 0% 233600k
> > make[1]: Working in: /usr/ports/net/samba47
> > make: Working in: /usr/ports/net/samba47
> >
> >
> > # procstat -t 15736
> >   PID    TID COMM      TDNAME CPU PRI STATE WCHAN
> > 15736 100855 python2.7 -       -1 152 sleep usem
> > 15736 100956 python2.7 -       -1 124 sleep umtxn
> > 15736 100957 python2.7 -       -1 126 sleep umtxn
> > 15736 100958 python2.7 -       -1 124 sleep umtxn
> > 15736 100959 python2.7 -       -1 127 sleep umtxn
> > 15736 100960 python2.7 -       -1 126 sleep umtxn
> > 15736 100961 python2.7 -       -1 126 sleep umtxn
> > 15736 100962 python2.7 -       -1 126 sleep umtxn
> > 15736 100963 python2.7 -       -1 126 sleep umtxn
> > 15736 100964 python2.7 -       -1 127 sleep umtxn
> > 15736 100965 python2.7 -       -1 126 sleep umtxn
> > 15736 100966 python2.7 -       -1 126 sleep umtxn
> > 15736 100967 python2.7 -       -1 126 sleep umtxn
> >
> > # procstat -kk 15736
> >   PID    TID COMM      TDNAME KSTACK
> >
> > 15736 100855 python2.7 - mi_switch+0xf5
> >     sleepq_catch_signals+0x405 sleepq_wait_sig+0xf _sleep+0x231
> >     umtxq_sleep+0x143 do_sem2_wait+0x68a __umtx_op_sem2_wait+0x4b
> >     amd64_syscall+0xa48 fast_syscall_common+0xfc
> > 15736 100956 python2.7 - mi_switch+0xf5
> >     sleepq_catch_signals+0x405 sleepq_wait_sig+0xf _sleep+0x231
> >     umtxq_sleep+0x143 do_lock_umutex+0x885 __umtx_op_wait_umutex+0x48
> >     amd64_syscall+0xa48 fast_syscall_common+0xfc
> > 15736 100957 python2.7 - mi_switch+0xf5
> >     sleepq_catch_signals+0x405 sleepq_wait_sig+0xf _sleep+0x231
> >     umtxq_sleep+0x143 do_lock_umutex+0x885 __umtx_op_wait_umutex+0x48
> >     amd64_syscall+0xa48 fast_syscall_common+0xfc
> > 15736 100958 python2.7 - mi_switch+0xf5
> >     sleepq_catch_signals+0x405 sleepq_wait_sig+0xf _sleep+0x231
> >     umtxq_sleep+0x143 do_lock_umutex+0x885 __umtx_op_wait_umutex+0x48
> >     amd64_syscall+0xa48 fast_syscall_common+0xfc
> > 15736 100959 python2.7 - mi_switch+0xf5
> >     sleepq_catch_signals+0x405 sleepq_wait_sig+0xf _sleep+0x231
> >     umtxq_sleep+0x143 do_lock_umutex+0x885 __umtx_op_wait_umutex+0x48
> >     amd64_syscall+0xa48 fast_syscall_common+0xfc
> > 15736 100960 python2.7 - mi_switch+0xf5
> >     sleepq_catch_signals+0x405 sleepq_wait_sig+0xf _sleep+0x231
> >     umtxq_sleep+0x143 do_lock_umutex+0x885 __umtx_op_wait_umutex+0x48
> >     amd64_syscall+0xa48 fast_syscall_common+0xfc
> > 15736 100961 python2.7 - mi_switch+0xf5
> >     sleepq_catch_signals+0x405 sleepq_wait_sig+0xf _sleep+0x231
> >     umtxq_sleep+0x143 do_lock_umutex+0x885 __umtx_op_wait_umutex+0x48
> >     amd64_syscall+0xa48 fast_syscall_common+0xfc
> > 15736 100962 python2.7 - mi_switch+0xf5
> >     sleepq_catch_signals+0x405 sleepq_wait_sig+0xf _sleep+0x231
> >     umtxq_sleep+0x143 do_lock_umutex+0x885 __umtx_op_wait_umutex+0x48
> >     amd64_syscall+0xa48 fast_syscall_common+0xfc
> > 15736 100963 python2.7 - mi_switch+0xf5
> >     sleepq_catch_signals+0x405 sleepq_wait_sig+0xf _sleep+0x231
> >     umtxq_sleep+0x143 do_lock_umutex+0x885 __umtx_op_wait_umutex+0x48
> >     amd64_syscall+0xa48 fast_syscall_common+0xfc
> > 15736 100964 python2.7 - mi_switch+0xf5
> >     sleepq_catch_signals+0x405 sleepq_wait_sig+0xf _sleep+0x231
> >     umtxq_sleep+0x143 do_lock_umutex+0x885 __umtx_op_wait_umutex+0x48
> >     amd64_syscall+0xa48 fast_syscall_common+0xfc
> > 15736 100965 python2.7 - mi_switch+0xf5
> >     sleepq_catch_signals+0x405 sleepq_wait_sig+0xf _sleep+0x231
> >     umtxq_sleep+0x143 do_lock_umutex+0x885 __umtx_op_wait_umutex+0x48
> >     amd64_syscall+0xa48 fast_syscall_common+0xfc
> > 15736 100966 python2.7 - mi_switch+0xf5
> >     sleepq_catch_signals+0x405 sleepq_wait_sig+0xf _sleep+0x231
> >     umtxq_sleep+0x143 do_lock_umutex+0x885 __umtx_op_wait_umutex+0x48
> >     amd64_syscall+0xa48 fast_syscall_common+0xfc
> > 15736 100967 python2.7 - mi_switch+0xf5
> >     sleepq_catch_signals+0x405 sleepq_wait_sig+0xf _sleep+0x231
> >     umtxq_sleep+0x143 do_lock_umutex+0x885 __umtx_op_wait_umutex+0x48
> >     amd64_syscall+0xa48 fast_syscall_common+0xfc
> >
> > If I kill the make, reboot and just type make, it completes after the
> > reboot. If after the reboot, I do an rm -R work, it will hang again.
> > With the default of
> > hw.lower_amd64_sharedpage: 1
> > post reboot,
> >
> > CTRL+T shows
> > load: 2.73 cmd: python2.7 15703 [usem] 40.92r 12.34u 3.45s 0% 233640k
> > make[1]: Working in: /usr/ports/net/samba47
> > make: Working in: /usr/ports/net/samba47
> >
> >
> > root@amdtestr12:/home/mdtancsa # procstat -kk 15703
> >   PID    TID COMM      TDNAME KSTACK
> >
> > 15703 100824 python2.7 - mi_switch+0xf5
> >     sleepq_catch_signals+0x405 sleepq_wait_sig+0xf _sleep+0x231
> >     umtxq_sleep+0x143 do_sem2_wait+0x68a __umtx_op_sem2_wait+0x4b
> >     amd64_syscall+0xa48 fast_syscall_common+0xfc
> > 15703 100956 python2.7 - mi_switch+0xf5
> >     sleepq_catch_signals+0x405 sleepq_wait_sig+0xf _sleep+0x231
> >     umtxq_sleep+0x143 do_sem2_wait+0x68a __umtx_op_sem2_wait+0x4b
> >     amd64_syscall+0xa48 fast_syscall_common+0xfc
> > 15703 100957 python2.7 - mi_switch+0xf5
> >     sleepq_catch_signals+0x405 sleepq_wait_sig+0xf _sleep+0x231
> >     umtxq_sleep+0x143 do_sem2_wait+0x68a __umtx_op_sem2_wait+0x4b
> >     amd64_syscall+0xa48 fast_syscall_common+0xfc
> > 15703 100958 python2.7 - mi_switch+0xf5
> >     sleepq_catch_signals+0x405 sleepq_wait_sig+0xf _sleep+0x231
> >     umtxq_sleep+0x143 do_sem2_wait+0x68a __umtx_op_sem2_wait+0x4b
> >     amd64_syscall+0xa48 fast_syscall_common+0xfc
> > 15703 100959 python2.7 - mi_switch+0xf5
> >     sleepq_catch_signals+0x405 sleepq_wait_sig+0xf _sleep+0x231
> >     umtxq_sleep+0x143 do_sem2_wait+0x68a __umtx_op_sem2_wait+0x4b
> >     amd64_syscall+0xa48 fast_syscall_common+0xfc
> > 15703 100960 python2.7 - mi_switch+0xf5
> >     sleepq_catch_signals+0x405 sleepq_wait_sig+0xf _sleep+0x231
> >     umtxq_sleep+0x143 do_sem2_wait+0x68a __umtx_op_sem2_wait+0x4b
> >     amd64_syscall+0xa48 fast_syscall_common+0xfc
> > 15703 100961 python2.7 - mi_switch+0xf5
> >     sleepq_catch_signals+0x405 sleepq_wait_sig+0xf _sleep+0x231
> >     umtxq_sleep+0x143 do_sem2_wait+0x68a __umtx_op_sem2_wait+0x4b
> >     amd64_syscall+0xa48 fast_syscall_common+0xfc
> > 15703 100962 python2.7 - mi_switch+0xf5
> >     sleepq_catch_signals+0x405 sleepq_wait_sig+0xf _sleep+0x231
> >     umtxq_sleep+0x143 do_sem2_wait+0x68a __umtx_op_sem2_wait+0x4b
> >     amd64_syscall+0xa48 fast_syscall_common+0xfc
> > 15703 100963 python2.7 - mi_switch+0xf5
> >     sleepq_catch_signals+0x405 sleepq_wait_sig+0xf _sleep+0x231
> >     umtxq_sleep+0x143 do_sem2_wait+0x68a __umtx_op_sem2_wait+0x4b
> >     amd64_syscall+0xa48 fast_syscall_common+0xfc
> > 15703 100964 python2.7 - mi_switch+0xf5
> >     sleepq_catch_signals+0x405 sleepq_wait_sig+0xf _sleep+0x231
> >     umtxq_sleep+0x143 do_sem2_wait+0x68a __umtx_op_sem2_wait+0x4b
> >     amd64_syscall+0xa48 fast_syscall_common+0xfc
> > 15703 100965 python2.7 - mi_switch+0xf5
> >     sleepq_catch_signals+0x405 sleepq_wait_sig+0xf _sleep+0x231
> >     umtxq_sleep+0x143 do_lock_umutex+0x885 __umtx_op_wait_umutex+0x48
> >     amd64_syscall+0xa48 fast_syscall_common+0xfc
> > 15703 100966 python2.7 - mi_switch+0xf5
> >     sleepq_catch_signals+0x405 sleepq_wait_sig+0xf _sleep+0x231
> >     umtxq_sleep+0x143 do_sem2_wait+0x68a __umtx_op_sem2_wait+0x4b
> >     amd64_syscall+0xa48 fast_syscall_common+0xfc
> > 15703 100967 python2.7 - mi_switch+0xf5
> >     sleepq_catch_signals+0x405 sleepq_wait_sig+0xf _sleep+0x231
> >     umtxq_sleep+0x143 do_sem2_wait+0x68a __umtx_op_sem2_wait+0x4b
> >     amd64_syscall+0xa48 fast_syscall_common+0xfc
> > root@amdtestr12:/home/mdtancsa # procstat -t 15703
> >   PID    TID COMM      TDNAME CPU PRI STATE WCHAN
> > 15703 100824 python2.7 -       -1 152 sleep usem
> > 15703 100956 python2.7 -       -1 125 sleep usem
> > 15703 100957 python2.7 -       -1 127 sleep usem
> > 15703 100958 python2.7 -       -1 125 sleep usem
> > 15703 100959 python2.7 -       -1 125 sleep usem
> > 15703 100960 python2.7 -       -1 126 sleep usem
> > 15703 100961 python2.7 -       -1 126 sleep usem
> > 15703 100962 python2.7 -       -1 126 sleep usem
> > 15703 100963 python2.7 -       -1 126 sleep usem
> > 15703 100964 python2.7 -       -1 126 sleep usem
> > 15703 100965 python2.7 -       -1 126 sleep umtxn
> > 15703 100966 python2.7 -       -1 126 sleep usem
> > 15703 100967 python2.7 -       -1 125 sleep usem
> > root@amdtestr12:/home/mdtancsa #
> >
> >
> > ---Mike
> >
> >
> >>
> >> ------------------------------------------------------------------------
> >> r321608 | kib | 2017-07-27 01:37:07 -0700 (Thu, 27 Jul 2017) | 9 lines
> >>
> >> Use MFENCE to serialize RDTSC on non-Intel CPUs.
> >>
> >> Kernel already used the stronger barrier instruction for AMDs, correct
> >> the userspace fast gettimeofday() implementation as well.
> >>
> >>
> >>
> >> I did go back and look at the build runaways that I've occasionally seen
> >> on my AMD FX-8320E package builder. I haven't seen the python issue
> >> there, but have seen gmake get stuck in a sleeping state with a bunch of
> >> zombie offspring.
> >>
> >>
> >
>
>
> --
> -------------------
> Mike Tancsa, tel +1 519 651 3400
> Sentex Communications, mike@sentex.net
> Providing Internet services since 1994 www.sentex.net
> Cambridge, Ontario Canada
>

--
Nimrod
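
A minimal sketch for summarizing `procstat -t` output like the listings quoted above: it tallies threads per wait channel (the WCHAN column), so a hang in which nearly every thread sleeps on usem or umtxn shows up at a glance. The function name tally_wchan is hypothetical, and the sketch assumes WCHAN is the last whitespace-separated field of each row, which holds for the rows shown in this thread but is not guaranteed in general.

```sh
# tally_wchan: hypothetical helper, not part of FreeBSD.
# Reads `procstat -t` output on stdin and prints "<wchan> <thread count>"
# per wait channel, sorted by channel name.
# Assumption: WCHAN is the last field of every data row.
tally_wchan() {
    awk '
        $1 == "PID" { next }            # skip the column header
        NF >= 8     { count[$NF]++ }    # count threads per wait channel
        END         { for (w in count) printf "%s %d\n", w, count[w] }
    ' | LC_ALL=C sort
}

# Intended use on the affected machine (PID taken from the CTRL+T line):
#   procstat -t 15703 | tally_wchan
```

For the second listing above (PID 15703) this kind of tally would report 12 threads on usem and 1 on umtxn; for the first (PID 15736) the split is reversed, 1 on usem and 12 on umtxn.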