From owner-freebsd-ppc@freebsd.org Sun Feb 24 00:22:00 2019
From: Mark Millard
Subject: powerpc64 (and more): procstat -kk does not report sched_switch, just mi_switch's call to it.
Date: Sat, 23 Feb 2019 16:21:54 -0800
To: FreeBSD PowerPC ML, freebsd-hackers Hackers
List-Id: Porting FreeBSD to the PowerPC

Take, for example (from a powerpc64 context),

# procstat -kk 23
  PID    TID COMM                TDNAME              KSTACK
   23 100074 bufdaemon           -                   mi_switch+0x134 sleepq_switch+0x2ec sleepq_timedwait+0x48 _sleep+0x41c buf_daemon+0x2f8 fork_exit+0xb0 fork_trampoline+0x18 .TOC.+0x1fffffff642c6efc
   23 100080 bufdaemon           bufspacedaemon-0    mi_switch+0x134 sleepq_switch+0x2ec sleepq_timedwait+0x48 _sleep+0x41c bufspace_daemon+0x438 fork_exit+0xb0 fork_trampoline+0x18 .TOC.+0x1fffffff642c6efc
   23 100081 bufdaemon           bufspacedaemon-1    mi_switch+0x134 sleepq_switch+0x2ec sleepq_timedwait+0x48 _sleep+0x41c bufspace_daemon+0x438 fork_exit+0xb0 fork_trampoline+0x18 .TOC.+0x1fffffff642c6efc
   23 100082 bufdaemon           bufspacedaemon-2    mi_switch+0x134 sleepq_switch+0x2ec sleepq_timedwait+0x48 _sleep+0x41c bufspace_daemon+0x438 fork_exit+0xb0 fork_trampoline+0x18 .TOC.+0x1fffffff642c6efc
   23 100083 bufdaemon           bufspacedaemon-3    mi_switch+0x134 sleepq_switch+0x2ec sleepq_timedwait+0x48 _sleep+0x41c bufspace_daemon+0x438 fork_exit+0xb0 fork_trampoline+0x18 .TOC.+0x1fffffff642c6efc
   23 100084 bufdaemon           bufspacedaemon-4    mi_switch+0x134 sleepq_switch+0x2ec sleepq_timedwait+0x48 _sleep+0x41c bufspace_daemon+0x438 fork_exit+0xb0 fork_trampoline+0x18 .TOC.+0x1fffffff642c6efc
   23 100085 bufdaemon           bufspacedaemon-5    mi_switch+0x134 sleepq_switch+0x2ec sleepq_timedwait+0x48 _sleep+0x41c bufspace_daemon+0x438 fork_exit+0xb0 fork_trampoline+0x18 .TOC.+0x1fffffff642c6efc
   23 100086 bufdaemon           bufspacedaemon-6    mi_switch+0x134 sleepq_switch+0x2ec sleepq_timedwait+0x48 _sleep+0x41c bufspace_daemon+0x438 fork_exit+0xb0 fork_trampoline+0x18 .TOC.+0x1fffffff642c6efc
   23 100106 bufdaemon           / worker            mi_switch+0x134 sleepq_switch+0x2ec sleepq_timedwait+0x48 _sleep+0x41c softdep_flush+0x38c fork_exit+0xb0 fork_trampoline+0x18 .TOC.+0x1fffffff642c6efc

then using objdump on /boot/kernel/kernel :

0000000000751868        mr      r3,r30
000000000075186c        bl      0000000000789f14
0000000000751870        nop

But I see the same sort of thing on amd64:

# procstat -kk 46
  PID    TID COMM                TDNAME              KSTACK
   46 100274 bufdaemon           -                   mi_switch+0x131 sleepq_timedwait+0x36 _sleep+0x289 buf_daemon+0x158 fork_exit+0xbd fork_trampoline+0xe
   46 100275 bufdaemon           bufspacedaemon-0    mi_switch+0x131 sleepq_timedwait+0x36 _sleep+0x289 bufspace_daemon+0x4d6 fork_exit+0xbd fork_trampoline+0xe
   46 100277 bufdaemon           bufspacedaemon-1    mi_switch+0x131 sleepq_timedwait+0x36 _sleep+0x289 bufspace_daemon+0x4d6 fork_exit+0xbd fork_trampoline+0xe
   46 100279 bufdaemon           bufspacedaemon-2    mi_switch+0x131 sleepq_timedwait+0x36 _sleep+0x289 bufspace_daemon+0x4d6 fork_exit+0xbd fork_trampoline+0xe
   46 100280 bufdaemon           bufspacedaemon-3    mi_switch+0x131 sleepq_timedwait+0x36 _sleep+0x289 bufspace_daemon+0x4d6 fork_exit+0xbd fork_trampoline+0xe
   46 100282 bufdaemon           bufspacedaemon-4    mi_switch+0x131 sleepq_timedwait+0x36 _sleep+0x289 bufspace_daemon+0x4d6 fork_exit+0xbd fork_trampoline+0xe
   46 100283 bufdaemon           bufspacedaemon-5    mi_switch+0x131 sleepq_timedwait+0x36 _sleep+0x289 bufspace_daemon+0x4d6 fork_exit+0xbd fork_trampoline+0xe
   46 100284 bufdaemon           bufspacedaemon-6    mi_switch+0x131 sleepq_timedwait+0x36 _sleep+0x289 bufspace_daemon+0x4d6 fork_exit+0xbd fork_trampoline+0xe
   46 100297 bufdaemon           / worker            mi_switch+0x131 sleepq_timedwait+0x36 _sleep+0x289 softdep_flush+0x2c9 fork_exit+0xbd fork_trampoline+0xe

ffffffff81139d0c        callq   ffffffff81177760
ffffffff81139d11        mov     %gs:0x18,%rbx

Is this lack of listing sched_switch information intended behavior?
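
A minimal sketch of how the listings above line up, assuming (nothing in this thread confirms it) that the top frame procstat prints is the saved return address of the call out of mi_switch rather than an address inside the callee. The helper names are made up for illustration and are not part of any FreeBSD API. The instruction lengths (4 bytes for the powerpc64 bl, 5 bytes for this amd64 callq) are read off the gaps between adjacent addresses in the objdump output.

```c
#include <stdint.h>

/*
 * Hypothetical helper: the address a callee returns to is the call
 * instruction's address plus that instruction's length in bytes.
 */
uint64_t
return_address(uint64_t call_site, unsigned int insn_len)
{
	return (call_site + insn_len);
}

/*
 * Hypothetical helper: if a backtrace frame reads "func+offset" and the
 * frame address is known, this is where "func" would have to start.
 */
uint64_t
implied_function_start(uint64_t frame_addr, uint64_t offset)
{
	return (frame_addr - offset);
}
```

With the addresses shown, return_address(0x75186c, 4) is 0x751870 (the nop) and return_address(0xffffffff81139d0c, 5) is 0xffffffff81139d11 (the mov). If those are the mi_switch+0x134 and mi_switch+0x131 frames, mi_switch would start at 0x75173c and 0xffffffff81139be0 respectively; the amd64 value is 16-byte aligned, which is at least plausible for a function start. Under that reading the trace begins at the instruction after the call, so the callee itself would not appear.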

Note for powerpc64 relative to the "+8":

0000000000789f0c        addis   r2,r12,190
0000000000789f10        addi    r2,r2,-12044
0000000000789f14        mflr    r0
0000000000789f18        std     r0,16(r1)

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went away in early 2018-Mar)

From owner-freebsd-ppc@freebsd.org Sun Feb 24 21:57:28 2019
Subject: Re: Installation fails on Tyan POWER8 box with ahci command timeouts.
To: Karel Gardas, freebsd-ppc@freebsd.org
From: Sean Bruno
Date: Sun, 24 Feb 2019 14:57:16 -0700
Message-ID: <6b41696d-548c-4674-d30f-3172b1c12b82@freebsd.org>
References: <6278e350-1cbd-2882-0a2e-7fd7f67b09e6@gmail.com>
In-Reply-To: <6278e350-1cbd-2882-0a2e-7fd7f67b09e6@gmail.com>

On 2/23/19 11:23 AM, Karel Gardas wrote:
> Hello,
>
> my attempt to install 12.0 on Tyan TN71-BP012 box with POWER8 CPU fails
> while extracting distribution files. The message -- probably a kernel
> complaint, as it "destroys" the nice curses dialog of Archive Extraction --
> contains this:
>
> ahcich1: Timeout on slot 0 port 0
> ahcich1: is 00000000 cs fdffffff ss ffffffff rs ffffffff tfd 40 serr
> 00000000 cmd 00719917
> (ada0:ahcich1:0:0:0): WRITE_FPDMA_QUEUED. ACB 61 00 28 e6 d0 40 06 00 00
> 01 00 00
> (ada0:ahcich1:0:0:0): CAM status: Command timeout
> (ada0:ahcich1:0:0:0): Retrying command, 3 more tries remain
> ahcich1: Timeout on slot 25 port 0
> ahcich1: is 00000000 cs 02000000 ss 00000000 rs 02000000 tfd 150 serr
> 000000000 cmd 0071c017
> (apropbe0:ahcich1:0:0:0): ATA_IDENTITY. ACB: ec 00 00 00 00 40 00 00 00
> 00 00 00
> (apropbe0:ahcich1:0:0:0): CAM status: Command timeout
> (apropbe0:ahcich1:0:0:0): Retrying command, 0 more tries remain
>
>
> I'm not sure if I've shot all before panic. I guess what happened after
> those messages is kernel panic and then straight reboot.
> The drive connected to the Marvell SATA mezzanine board in the Tyan is a WD GOLD
> 500GB SATA.
> Is there any workaround for this issue? As I'd like to have FreeBSD
> running on bare-metal/powernv instead of inside the PowerKVM provided in
> also installed ubuntu, where freebsd works fine.
>
>
> Thanks!
> Karel
>
>
>

I *think* this was fixed with a combination of:

r336760 | luporl | 2018-07-27 07:11:05 -0600 (Fri, 27 Jul 2018) | 10 lines

And:

r339589 | luporl | 2018-10-22 07:40:50 -0600 (Mon, 22 Oct 2018) | 12 lines

The current FreeBSD pkg builders are Tyan PPC8 based hosts as well. I
would strongly suggest using -current for now as we have not been
MFC'ing to stable/12 reliably (I'm not testing on stable/12).

sean

From owner-freebsd-ppc@freebsd.org Sun Feb 24 17:20:13 2019
Subject: Re: QEMU
From: Jason Bacon
Cc: freebsd-ppc@freebsd.org
Date: Sun, 24 Feb 2019 11:20:06 -0600
Message-ID: <6ebb36d4-8f6e-e4a1-3d96-094c1f883a6f@gmail.com>
References: <5f291124-612f-6d10-5012-a8701b1cf49e@gmail.com> <5302f073-b51b-c92f-ada2-f7123d27fa3d@gmail.com> <8213518b-14a5-aac2-bcbb-529e49c4f044@gmail.com> <9f96d3ac-ada3-8f41-6c2c-e6fab80e49e9@gmail.com>
In-Reply-To: <9f96d3ac-ada3-8f41-6c2c-e6fab80e49e9@gmail.com>

On 2/23/19 7:49 AM, Jason Bacon wrote:
> On 7/21/18 1:06 PM, Jason Bacon wrote:
>> On 7/19/18 4:32 PM, Jason Bacon wrote:
>>> On 07/19/18 14:09, Chuck Tuffli wrote:
>>>> On Thu, Jul 12, 2018 at 8:09 AM, Jason Bacon
>>>> wrote:
>>>>> FYI, I get the exact same behavior under qemu 2.8.1 on Debian.
>>>>>
>>>>> So now we have similar symptoms in qemu 2.8.1, 2.9, and 2.12.50 on
>>>>> FreeBSD
>>>>> and Linux hosts.
>>>> FWIW, on an Ubuntu 14.04 system with qemu-system-ppc64 version 2.0.0,
>>>> the ppc64 snapshot ISO of 12.0, the OS appears to install correctly
>>>> and subsequently boots correctly.
>>>>
>>>> --chuck
>>> That's worth a lot, actually.
>>>
>>> The 12.0 snapshot also works on my FreeBSD 11.1 host with the stock
>>> qemu package.  Both keyboard and mouse input are processed.
>>>
>>> Interestingly, though, while 12.0 works, it seems to be a lot slower
>>> than 11.1 under qemu.  Below are times to get to the install
>>> screen.  ( I just close the qemu window as soon as it reaches that
>>> point, where 11.1 won't accept keyboard input. )
>>>
>>> FreeBSD cray.acadix  bacon ~ 999: time qemu-ppc install
>>> freebsd-ppc.img
>>> FreeBSD-12.0-CURRENT-powerpc-powerpc64-20180709-r336134-disc1.iso
>>> + [ ! -e freebsd-ppc.img ]
>>> + qemu-system-ppc64 -cdrom
>>> FreeBSD-12.0-CURRENT-powerpc-powerpc64-20180709-r336134-disc1.iso
>>> -drive 'file=freebsd-ppc.img,format=raw' -boot d
>>> 217.327u 3.455s 4:21.41 84.4%    9628+6292k 94+2io 476pf+0w
>>>
>>>
>>> FreeBSD cray.acadix  bacon ~ 1000: time qemu-ppc install
>>> freebsd-ppc.img Save/FreeBSD-11.1-RELEASE-powerpc-powerpc64-disc1.iso
>>> + [ ! -e freebsd-ppc.img ]
>>> + qemu-system-ppc64 -cdrom
>>> Save/FreeBSD-11.1-RELEASE-powerpc-powerpc64-disc1.iso -drive
>>> 'file=freebsd-ppc.img,format=raw' -boot d
>>> 123.001u 1.748s 2:47.05 74.6%    9643+6302k 556+3io 0pf+0w
>>>
>>> Maybe these data will provide some clues to the ppc base developers...
>>>
>> I'm getting "lock order reversal" errors followed by stack traces
>> when running portsnap.  Bleeding-edge 12.0 issue?
>>
>
> Poked around at this a bit more and found a workaround.  It seems
> FreeBSD doesn't support the latest default PPC machine in qemu.
> Available options are listed below.  After switching from the default
> pseries-2.6 to pseries-2.5, FreeBSD 12.0 works flawlessly.
>
> I attached a script I'm using to install and then boot the VM.
>
> So now there's an easy way to test/fix ports for PPC64.  Runs about as
> fast as a 486, but that's fine since we can install dependencies via
> "pkg install" to reduce build time.
>
>
> FreeBSD cray.acadix  bacon ~ 1010: qemu-system-ppc -machine help
> Supported machines are:
> bamboo               bamboo
> g3beige              Heathrow based PowerMAC (default)
> mac99                Mac99 based PowerMAC
> mpc8544ds            mpc8544ds
> none                 empty machine
> ppce500              generic paravirt e500 platform
> prep                 PowerPC PREP platform
> ref405ep             ref405ep
> taihu                taihu
> virtex-ml507         Xilinx Virtex ML507 reference design
> FreeBSD cray.acadix  bacon ~ 1011: qemu-system-ppc64 -machine help
> Supported machines are:
> bamboo               bamboo
> g3beige              Heathrow based PowerMAC
> mac99                Mac99 based PowerMAC
> mpc8544ds            mpc8544ds
> none                 empty machine
> ppce500              generic paravirt e500 platform
> prep                 PowerPC PREP platform
> pseries-2.1          pSeries Logical Partition (PAPR compliant)
> pseries-2.2          pSeries Logical Partition (PAPR compliant)
> pseries-2.3          pSeries Logical Partition (PAPR compliant)
> pseries-2.4          pSeries Logical Partition (PAPR compliant)
> pseries-2.5          pSeries Logical Partition (PAPR compliant)
> pseries              pSeries Logical Partition (PAPR compliant) (alias
> of pseries-2.6)
> pseries-2.6          pSeries Logical Partition (PAPR compliant) (default)
> ref405ep             ref405ep
> taihu                taihu
> virtex-ml507         Xilinx Virtex ML507 reference design
>
>

Just to see how far I could push qemu, I tried running
sysutils/desktop-installer.  It actually worked well after a few minor
fixes and answering "no" to using the LATEST packages instead of
quarterly.  Took a few hours whereas it generally finishes a lightweight
desktop in about 20 minutes on bare metal with a fast Internet connection.

Lumina is apparently broken on powerpc64, but LXDE worked fine.  It
fails to start from XDM, but this may not be powerpc-specific.
I've seen issues like this before due to changes in the desktop env's
startup scripts.  It starts up fine using startx.

The working version is in my work-in-progress collection:

    https://github.com/outpaddling/freebsd-ports-wip

If I had a Mac G5 I'd do more testing, but it will probably work with a
supported video card.

Cheers,

    JB

--
Earth is a beta site.

From owner-freebsd-ppc@freebsd.org Sun Feb 24 21:07:36 2019
From: Justin Hibbits
Date: Sun, 24 Feb 2019 15:07:20 -0600
Subject: Re: An experimental hack that appears to allow old PowerMacG5 4-core (system total) system to boot reliably (head -r343884 based context)
To: Mark Millard
Cc: FreeBSD PowerPC ML, Dennis Clarke

On Sat, Feb 23, 2019 at 1:36 PM Mark Millard wrote:
>
> For sys/powerpc/aim/mp_cpudep.c 's cpudep_ap_bootstrap I added as shown below:
>
> +extern void hack_into_slb_if_needed(void* vap); // HACK!!!
> +
>  uintptr_t
>  cpudep_ap_bootstrap(void)
>  {
> . . .
> +	hack_into_slb_if_needed(pcpup->pc_curpcb); // HACK!!!
> +
>  	sp = pcpup->pc_curpcb->pcb_sp;
>
> and in src/sys/powerpc/aim/slb.c I added an implementation:
>
> +void hack_into_slb_if_needed(void* vap); // HACK!!!
> +void hack_into_slb_if_needed(void* vap) // HACK!!!
> +{ // HACK!!!
> +	struct slb *cache= PCPU_GET(aim.slb);
> +	vm_offset_t va= (vm_offset_t)vap;
> +	uint64_t slbv= kernel_va_to_slbv(va);
> +	uint64_t esid= va>>ADDR_SR_SHFT;
> +	uint64_t slbe= (esid<
> +	int i;
> +
> +	for (i = 0; i < n_slbs; i++) {
> +		if (i == USER_SLB_SLOT)
> +			continue;
> +		if (cache[i].slbe == (slbe | i))
> +			break;
> +	}
> +
> +	if (i==n_slbs)
> +		slb_insert_kernel(slbe,slbv);
> +} // HACK!!!
> +
>
> So far I've not had any boot hang-ups after this.
>
> Given the random nature of the hang-ups it will be a
> while before I conclude for sure how reliable this
> change makes booting, but so far so good.
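
For readers who want to experiment with the slot search in the quoted patch outside the kernel, here is a minimal userland sketch of just that loop, written with an early return in place of the break. It is not the kernel code: struct sketch_slb, SKETCH_USER_SLB_SLOT and sketch_slb_cached are hypothetical stand-ins, the slot value 0 is an assumption, and the slot count is passed in rather than read from n_slbs.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed stand-in for the kernel's USER_SLB_SLOT; the value is a guess. */
#define SKETCH_USER_SLB_SLOT	0

/* Hypothetical stand-in for the per-CPU SLB cache entries. */
struct sketch_slb {
	uint64_t slbv;
	uint64_t slbe;
};

/*
 * Return true if some non-user slot already caches the entry. As in the
 * quoted patch, a cached slbe is compared against the wanted value with
 * the slot index folded in ("slbe | i").
 */
bool
sketch_slb_cached(const struct sketch_slb *cache, int n_slbs, uint64_t slbe)
{
	for (int i = 0; i < n_slbs; i++) {
		if (i == SKETCH_USER_SLB_SLOT)
			continue;
		if (cache[i].slbe == (slbe | (uint64_t)i))
			return (true);
	}
	return (false);
}
```

A caller shaped like the patch would then insert only on a miss, along the lines of "if (!sketch_slb_cached(cache, n, slbe)) insert it", so no loop index needs to be checked after the loop.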
>
> (I recognize that the "break" could be "return"
> and then the "if (i==n_slbs)" would not be
> needed.)
>
>
> Other issues not fixed by this:
>
> This does not change the buf*daemon* randomly getting
> hung up (and so timing out on shutdown). This appears
> to be the same issue that leads to the fans sometimes
> starting to run full-rate because of pmac_thermal
> being hung up.
>
> For buf*daemon* "top -SHIopid" before shutdown shows
> just the ones that will not hang-up. The same goes for
> seeing beforehand for pmac_thermal vs. the fans.
>
> ===
> Mark Millard

Hi Mark,

Fantastic work tracking this down!  So the problem is we now can
fault when accessing KVA space.  I think we should allow this,
otherwise we can hamper performance with reduced KVA size.  I'll have
to think about how best to do this.  Would you be willing to test
patches I come up with?

- Justin

From owner-freebsd-ppc@freebsd.org Sun Feb 24 21:50:32 2019
Subject: Re: An experimental hack that appears to allow old PowerMacG5 4-core (system total) system to boot reliably (head -r343884 based context)
From: Mark Millard
Date: Sun, 24 Feb 2019 13:50:17 -0800
Cc: FreeBSD PowerPC ML, Dennis Clarke
Message-Id: <466B6E08-5631-41FB-A1FD-263C27519F65@yahoo.com>
To: Justin Hibbits

On 2019-Feb-24, at 13:07, Justin Hibbits wrote:

> On Sat, Feb 23, 2019 at 1:36 PM Mark Millard wrote:
>>
>> For sys/powerpc/aim/mp_cpudep.c 's cpudep_ap_bootstrap I added as shown below:
>>
>> +extern void hack_into_slb_if_needed(void* vap); // HACK!!!
>> +
>>  uintptr_t
>>  cpudep_ap_bootstrap(void)
>>  {
>> . . .
>> +	hack_into_slb_if_needed(pcpup->pc_curpcb); // HACK!!!
>> +
>>  	sp = pcpup->pc_curpcb->pcb_sp;
>>
>> and in src/sys/powerpc/aim/slb.c I added an implementation:
>>
>> +void hack_into_slb_if_needed(void* vap); // HACK!!!
>> +void hack_into_slb_if_needed(void* vap) // HACK!!!
>> +{ // HACK!!!
>> + struct slb *cache=3D PCPU_GET(aim.slb); >> + vm_offset_t va=3D (vm_offset_t)vap; >> + uint64_t slbv=3D kernel_va_to_slbv(va); >> + uint64_t esid=3D va>>ADDR_SR_SHFT; >> + uint64_t slbe=3D (esid<> + int i; >> + >> + for (i =3D 0; i < n_slbs; i++) { >> + if (i =3D=3D USER_SLB_SLOT) >> + continue; >> + if (cache[i].slbe =3D=3D (slbe | i)) >> + break; >> + } >> + >> + if (i=3D=3Dn_slbs) >> + slb_insert_kernel(slbe,slbv); >> +} // HACK!!! >> + >>=20 >> So far I've not had any boot hang-ups after this. >>=20 >> Given the random nature of the hang-ups it will be a >> while before I conclude for sure how reliable this >> change makes booting, but so far so good. >>=20 >> (I recognize that the "break" could be "return" >> and then then the "if (i=3D=3Dn_slbs)" would not be >> needed.) >>=20 >>=20 >> Other issues not fixed by this: >>=20 >> This does not change the buf*daemon* randomly getting >> hung up (and so timing out on shutdown). This appears >> to be the same issue that leads to the fans sometimes >> starting to run full-rate because of pmac_thermal >> being hun -up. >>=20 >> For buf*daemon* "top -SHIopid" before shutdown shows >> just the ones that will not hang-up. The same goes for >> seeing before hand for pmac_thermal vs. the fans. >>=20 >> =3D=3D=3D >> Mark Millard >=20 > Hi Mark, >=20 > Fantastic work tracking this down! So the problem is we now can fault > when accessing KVA space. I think we should allow this, otherwise we > can hamper performance with reduced KVA size. I'll have to think > about how best to do this. Would you be willing to test patches I > come up with? I'll try to test whatever updates you want but there may be some issues with timeliness. 

The reason for the "sometimes" boot-failure is that the entry in the
slb for the PCB/stack of the CPU being added has sometimes already
been replaced before that CPU has been sufficiently configured to
allow automatic handling, and other times has not yet been replaced:
the random slb replacement mechanism.

There already is code to handle slb entry replacements but it does
not work for a CPU still being set up (at the stage of the sometimes
failure). At least that is what I expect for:

# grep -r "handle_kernel_slb_spill" /usr/src/sys/powerpc/
/usr/src/sys/powerpc/aim/trap_subr64.S: bl handle_kernel_slb_spill
/usr/src/sys/powerpc/powerpc/trap.c: void handle_kernel_slb_spill(int, register_t, register_t);
/usr/src/sys/powerpc/powerpc/trap.c:handle_kernel_slb_spill(int type, register_t dar, register_t srr0)

So my hack was to separately do the potential replacement in that
early time frame to allow the configuration for the CPU to get far
enough along for the existing mechanism to work. (At least that is
what I expect that I did.)

So far I've had no boot failures of any kind with the hack. I've
removed the hacks for reporting information and things still work.

But I've not tried anything extensive after booting because things
like buf*daemon* threads and pmac_thermal are randomly hanging up
in/at:

mi_switch+0x134 sleepq_switch+0x2ec sleepq_timedwait+0x48 _sleep+0x41c

(mi_switch seems to have called sched_switch based on the "+0x134"
and the code in that area, but sched_switch is not listed.)

I've no clue what is safe when one or more buf*daemon* threads
make no progress. For shutdown that frequently leads to timeouts
for stopping some buf*daemon* threads (when all 8 time out it
takes about 8 minutes). The buf*daemon* threads that fail are the
ones that "top -SHIopid" no longer shows.
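
As a rough illustration of the check-then-insert idea behind hack_into_slb_if_needed, here is a minimal user-space sketch. It is not the kernel code: the toy_* names, the table size of 64, and reserved slot 0 are all assumptions made for the example, and the victim slot is passed in explicitly to stand in for the random replacement described in this message.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_N_SLBS	64	/* hypothetical table size */
#define TOY_USER_SLOT	0	/* hypothetical slot reserved for user mappings */

/* Toy stand-in for one cached SLB entry (not the kernel's struct slb). */
struct toy_slb {
	uint64_t slbe;	/* entry value with the slot index OR-ed into the low bits */
	uint64_t slbv;
};

/* True if some non-user slot already caches this entry (same test as the hack's loop). */
static bool
toy_slb_present(const struct toy_slb *cache, uint64_t slbe)
{
	for (int i = 0; i < TOY_N_SLBS; i++) {
		if (i == TOY_USER_SLOT)
			continue;
		if (cache[i].slbe == (slbe | (uint64_t)i))
			return (true);
	}
	return (false);
}

/* Overwrite the chosen victim slot, never the reserved user slot. */
static void
toy_slb_insert(struct toy_slb *cache, int victim, uint64_t slbe, uint64_t slbv)
{
	if (victim == TOY_USER_SLOT)
		victim = (victim + 1) % TOY_N_SLBS;
	cache[victim].slbe = slbe | (uint64_t)victim;
	cache[victim].slbv = slbv;
}

/* Insert only when the entry is missing; returns true if an insert happened. */
static bool
toy_slb_ensure(struct toy_slb *cache, int victim, uint64_t slbe, uint64_t slbv)
{
	if (toy_slb_present(cache, slbe))
		return (false);
	toy_slb_insert(cache, victim, slbe, slbv);
	return (true);
}
```

In this model, an entry that was inserted and later overwritten by an unrelated insert into the same slot is reported as absent, so a second toy_slb_ensure call puts it back. That is the situation the hack is meant to cover for the new CPU's PCB/stack mapping before the normal spill handling can work.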

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went away in early 2018-Mar)

From owner-freebsd-ppc@freebsd.org Sun Feb 24 23:58:58 2019
From: Mark Millard
Subject: powerpc64 head -r344018 based: "Entering uma_startup1 with 0 boot pages left": such okay?
Date: Sun, 24 Feb 2019 15:58:45 -0800
Cc: freebsd-hackers Hackers
To: FreeBSD PowerPC ML

I switched to a debug kernel build for powerpc64 head and it reports:

FreeBSD 13.0-CURRENT #2 r344018M: Sun Feb 24 15:21:37 PST 2019
    markmi@FBSDFSSD:/usr/obj/powerpc64vtscdbg_xtoolchain-gcc/powerpc.powerpc64/usr/src/powerpc.powerpc64/sys/GENERIC64vtsc-DBG powerpc
gcc version 6.4.0 (FreeBSD Ports Collection for powerpc64)
WARNING: WITNESS option enabled, expect reduced performance.
WARNING: DIAGNOSTIC option enabled, expect reduced performance.
Entering uma_startup with 6 boot pages configured
startup_alloc from "UMA Kegs", 5 boot pages left
startup_alloc from "UMA Zones", 4 boot pages left
startup_alloc from "UMA Zones", 3 boot pages left
startup_alloc from "UMA Zones", 2 boot pages left
startup_alloc from "UMA Hash", 1 boot pages left
Entering uma_startup1 with 0 boot pages left
Entering uma_startup2 with 0 boot pages left
cpu0: IBM PowerPC 970MP revision 1.1, 2500.46 MHz
cpu0: Features dc000000
cpu0: HID0 1511081
real memory = 17133703168 (16339 MB)
avail memory = 16382566400 (15623 MB)
FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs

Is this a problem? If yes: Any clue what should be done for it?

Note: This is from a devel/powerpc64-xtoolchain-gcc based build.

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went away in early 2018-Mar)

From owner-freebsd-ppc@freebsd.org Mon Feb 25 22:47:01 2019
From: Mark Millard
Subject: head -r344018 based powerpc64 pmac_thermal hangup (stuck sleeping): some preliminary evidence
Message-Id: <40D1DDA1-10FB-4F2C-B38B-C7FED5795542@yahoo.com>
Date: Mon, 25 Feb 2019 14:46:55 -0800
Cc: freebsd-hackers Hackers
To: FreeBSD PowerPC ML

I adjusted what KTR_PROC does to show just some of its pid 26
(pmac_thermal) messages and added some extra output as well. I'll
list some of that output later. I'll note that beyond pmac_thermal
the buf*daemon* threads also seem to be subject to being stuck
sleeping in (offsets are for a specific build of mine):

mi_switch+0x134 sleepq_switch+0x2ec sleepq_timedwait+0x48 _sleep+0x41c


So far for pmac_thermal I've seen that until the failing case:

sleepq_set_timeout_sbt was being given: sbt==0xcccccbe0 pr==0x0 flags==0x100
and in turn was using: prec==0xcccccbe flags==0x501 (of course the used
td_sleeptimo varies).

[I note that 16*0xcccccbe == 0xcccccbe0, the original sbt value,
not that I know yet if this matters.]

But the sequence leading to failures is different:

sleepq_set_timeout_sbt was given: sbt==0xfffffed8 pr==0x0 flags==0x100
and in turn was using: prec==0xfffffed flags==0x501

[I note that 16*0xfffffed == 0xfffffed0, so less than the original
sbt value, not that I know this matters at this point.]

For sbt==0xfffffed8, the callout to sleepq_timeout ends up with values
like (a particular example):

td_sleeptimo=0x470d360fe5 sbinuptime=0x46c869f6aa

where the reporting code looks like:

static void
sleepq_timeout(void *arg)
{
    struct sleepqueue_chain *sc __unused;
    struct sleepqueue *sq;
    struct thread *td;
    void *wchan;
    int wakeup_swapper;
    sbintime_t sbut; // HACK!!!

    td = arg;
    wakeup_swapper = 0;
    if (26 == td->td_proc->p_pid) // HACK!!!
        CTR3(KTR_PROC, "sleepq_timeout: thread %p (pid %ld, %s)",
            (void *)td, (long)td->td_proc->p_pid, (void *)td->td_name);

    thread_lock(td);

    sbut= sbinuptime(); // HACK!!!
    if (td->td_sleeptimo > sbut || td->td_sleeptimo == 0) {
        /*
         * The thread does not want a timeout (yet).
         */
        if (26 == td->td_proc->p_pid) // HACK!!!
            CTR5(KTR_PROC, "sleepq_timeout thread not want timeout yet: thread %p (pid %ld, %s) td_sleeptimo=%jx sbinuptime=%jx",
                (void *)td, (long)td->td_proc->p_pid, (void *)td->td_name, (uintmax_t)td->td_sleeptimo, (uintmax_t)sbut);
. . .

So far sleepq_set_timeout_sbt being given sbt==0xfffffed8 instead of
sbt==0xcccccbe0 seems to be an accurate indicator of whether the problem
will happen in sleepq_timeout. (But I've only a few examples so far.)

I'll note that the sleepq_timeout code for this case does not set up
another callout to itself for later and the sleep then continues
indefinitely.

I've not yet gotten into finding evidence for why the callout to
sleepq_timeout itself happens. Hopefully I can find some.


An example of some modified KTR_PROC output is:

493 (0xc000000001eb45a0:cpu3) 3278339934 /usr/src/sys/kern/subr_sleepqueue.c.1026: sleepq_timeout thread not want timeout yet: thread 0xc00000000a8315a0 (pid 26, pmac_thermal) td_sleeptimo=470d360fe5 sbinuptime=46c869f6aa
492 (0xc000000001eb45a0:cpu3) 3278339919 /usr/src/sys/kern/subr_sleepqueue.c.1015: sleepq_timeout: thread 0xc00000000a8315a0 (pid 26, pmac_thermal)
491 (0xc00000000a8315a0:cpu3) 3253979006 /usr/src/sys/kern/kern_synch.c.435: mi_switch: old thread 100078 (td_sched 0xc00000000a831ad8, pid 26, pmac_thermal)
490 (0xc00000000a8315a0:cpu3) 3253978972 /usr/src/sys/kern/subr_sleepqueue.c.408: sleepq_set_timeout_sbt: sbt=fffffed8 pr=0 flags=100 td_sleeptimo=470d360fe5 prec=fffffed flags=501
489 (0xc00000000a8315a0:cpu3) 3253978965 /usr/src/sys/kern/subr_sleepqueue.c.406: sleepq_set_timeout_sbt installing sleepq_timeout: thread 0xc00000000a8315a0 (pid 26, pmac_thermal)
488 (0xc00000000a8315a0:cpu3) 3253978940 /usr/src/sys/kern/kern_synch.c.180: sleep: thread 100078 (pid 26, pmac_thermal) on pmac_therm (0xc0000000015082c3)
487 (0xc00000000a8315a0:cpu3) 3253978890 /usr/src/sys/kern/subr_sleepqueue.c.630: sleepq resume: thread 0xc00000000a8315a0 (pid 26, pmac_thermal)
486 (0xc00000000a8315a0:cpu3) 3253978883 /usr/src/sys/kern/kern_synch.c.445: mi_switch: new thread 100078 (td_sched 0xc00000000a831ad8, pid 26, pmac_thermal)
485 (0xc000000001f6e000:cpu2) 3253978598 /usr/src/sys/kern/subr_sleepqueue.c.836: sleepq_wakeup: thread 0xc00000000a8315a0 (pid 26, pmac_thermal)
484 (0xc000000001f6e000:cpu2) 3253978580 /usr/src/sys/kern/subr_sleepqueue.c.988: sleepq_remove_matching calling sleepq_resume_thread: thread 0xc00000000a8315a0 (pid 26, pmac_thermal)
483 (0xc00000000a8315a0:cpu3) 3252269011 /usr/src/sys/kern/kern_synch.c.435: mi_switch: old thread 100078 (td_sched 0xc00000000a831ad8, pid 26, pmac_thermal)
482 (0xc00000000a8315a0:cpu3) 3252268968 /usr/src/sys/kern/subr_sleepqueue.c.408: sleepq_set_timeout_sbt: sbt=cccccbe0 pr=0 flags=100 td_sleeptimo=46ccf486f8 prec=cccccbe flags=501
481 (0xc00000000a8315a0:cpu3) 3252268962 /usr/src/sys/kern/subr_sleepqueue.c.406: sleepq_set_timeout_sbt installing sleepq_timeout: thread 0xc00000000a8315a0 (pid 26, pmac_thermal)
480 (0xc00000000a8315a0:cpu3) 3252268935 /usr/src/sys/kern/kern_synch.c.180: sleep: thread 100078 (pid 26, pmac_thermal) on smu (0xe000000087fd1670)
479 (0xc00000000a8315a0:cpu3) 3252268793 /usr/src/sys/kern/subr_sleepqueue.c.630: sleepq resume: thread 0xc00000000a8315a0 (pid 26, pmac_thermal)
478 (0xc00000000a8315a0:cpu3) 3252268778 /usr/src/sys/kern/kern_synch.c.445: mi_switch: new thread 100078 (td_sched 0xc00000000a831ad8, pid 26, pmac_thermal)
477 (0xc000000001f6e000:cpu2) 3252268391 /usr/src/sys/kern/subr_sleepqueue.c.836: sleepq_wakeup: thread 0xc00000000a8315a0 (pid 26, pmac_thermal)
476 (0xc000000001f6e000:cpu2) 3252268385 /usr/src/sys/kern/subr_sleepqueue.c.988: sleepq_remove_matching calling sleepq_resume_thread: thread 0xc00000000a8315a0 (pid 26, pmac_thermal)
475 (0xc00000000a8315a0:cpu2) 3250546514 /usr/src/sys/kern/kern_synch.c.435: mi_switch: old thread 100078 (td_sched 0xc00000000a831ad8, pid 26, pmac_thermal)
474 (0xc00000000a8315a0:cpu2) 3250546468 /usr/src/sys/kern/subr_sleepqueue.c.408: sleepq_set_timeout_sbt: sbt=cccccbe0 pr=0 flags=100 td_sleeptimo=46bfa4a7cc prec=cccccbe flags=501
473 (0xc00000000a8315a0:cpu2) 3250546462 /usr/src/sys/kern/subr_sleepqueue.c.406: sleepq_set_timeout_sbt installing sleepq_timeout: thread 0xc00000000a8315a0 (pid 26, pmac_thermal)
472 (0xc00000000a8315a0:cpu2) 3250546436 /usr/src/sys/kern/kern_synch.c.180: sleep: thread 100078 (pid 26, pmac_thermal) on smu (0xe000000087fd1670)
471 (0xc00000000a8315a0:cpu2) 3250546295 /usr/src/sys/kern/subr_sleepqueue.c.630: sleepq resume: thread 0xc00000000a8315a0 (pid 26, pmac_thermal)
470 (0xc00000000a8315a0:cpu2) 3250546286 /usr/src/sys/kern/kern_synch.c.445: mi_switch: new thread 100078 (td_sched 0xc00000000a831ad8, pid 26, pmac_thermal)
469 (0xc000000001f6e000:cpu0) 3250545941 /usr/src/sys/kern/subr_sleepqueue.c.836: sleepq_wakeup: thread 0xc00000000a8315a0 (pid 26, pmac_thermal)
468 (0xc000000001f6e000:cpu0) 3250545934 /usr/src/sys/kern/subr_sleepqueue.c.988: sleepq_remove_matching calling sleepq_resume_thread: thread 0xc00000000a8315a0 (pid 26, pmac_thermal)
. . .
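
To make the hexadecimal values in this message easier to read, here is a small sketch that treats them as 32.32 fixed-point seconds, which is how FreeBSD's sbintime_t is laid out. The type name sbt64 and the helper names are invented for this example, and the 4-bit shift only restates the relation visible in the logged sbt/prec pairs; it is not a claim about how the callout code derives prec.

```c
#include <stdint.h>

/*
 * Stand-in for FreeBSD's sbintime_t: a signed 64-bit value holding
 * seconds in 32.32 fixed point (upper 32 bits are whole seconds, the
 * lower 32 bits a binary fraction). The name sbt64 is local to this sketch.
 */
typedef int64_t sbt64;

/* Convert a 32.32 fixed-point time to seconds as a double. */
static double
sbt_to_seconds(sbt64 sbt)
{
	return ((double)sbt / 4294967296.0);	/* divide by 2^32 */
}

/* Whole-second part of a 32.32 fixed-point time. */
static int64_t
sbt_whole_seconds(sbt64 sbt)
{
	return (sbt >> 32);
}

/* How far the wakeup deadline still lies ahead of "now", in 32.32 units. */
static sbt64
sbt_remaining(sbt64 td_sleeptimo, sbt64 now)
{
	return (td_sleeptimo - now);
}

/* The logged prec values match the requested sbt shifted right by 4 bits. */
static sbt64
sbt_shift4(sbt64 sbt)
{
	return (sbt >> 4);
}
```

Applied to the numbers above: 0xcccccbe0 and 0xfffffed8 are just under 0.8 and 1.0 seconds, and sbt_remaining(0x470d360fe5, 0x46c869f6aa) is 1154226491, which is about 0.27 seconds. So in the logged failing example sleepq_timeout ran roughly a quarter of a second before td_sleeptimo, which is why the "does not want a timeout (yet)" branch was taken.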

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went away in early 2018-Mar)

From owner-freebsd-ppc@freebsd.org Mon Feb 25 23:50:02 2019
Subject: Re: head -r344018 based powerpc64 pmac_thermal hangup (stuck sleeping): some preliminary evidence [not as uniform as I initially saw]
From: Mark Millard
In-Reply-To: <40D1DDA1-10FB-4F2C-B38B-C7FED5795542@yahoo.com>
Date: Mon, 25 Feb 2019 15:49:56 -0800
Cc: freebsd-hackers Hackers
References: <40D1DDA1-10FB-4F2C-B38B-C7FED5795542@yahoo.com>
To: FreeBSD PowerPC ML

[I've now seen examples of sbt==0xfffffed8 that did not
lead to a hangup.]

On 2019-Feb-25, at 14:46, Mark Millard wrote:

> I adjusted what KTR_PROC does to show just some of its pid 26
> (pmac_thermal) messages and added some extra output as well. I'll
> list some of that output later. I'll note that beyond pmac_thermal
> the buf*daemon* threads also seem to be subject to being stuck
> sleeping in (offsets are for a specific build of mine):
>
> mi_switch+0x134 sleepq_switch+0x2ec sleepq_timedwait+0x48 _sleep+0x41c
>
>
> So far for pmac_thermal I've seen that until the failing case:
>
> sleepq_set_timeout_sbt was being given: sbt==0xcccccbe0 pr==0x0 flags==0x100
> and in turn was using: prec==0xcccccbe flags==0x501 (of course the used
> td_sleeptimo varies).
>
> [I note that 16*0xcccccbe == 0xcccccbe0, the original sbt value,
> not that I know yet if this matters.]
>
> But the sequence leading to failures is different:

I've now seen examples of sbt==0xfffffed8 that did not
lead to a hangup. So it is not a reliable predictor of
the hang-up in sleep. I'm trying to see if I can observe
a failure with a different value.

> sleepq_set_timeout_sbt was given: sbt==0xfffffed8 pr==0x0 flags==0x100
> and in turn was using: prec==0xfffffed flags==0x501
>
> [I note that 16*0xfffffed == 0xfffffed0, so less than the original
> sbt value, not that I know this matters at this point.]
>
> For sbt==0xfffffed8, the callout to sleepq_timeout ends up with values
> like (a particular example):
>
> td_sleeptimo=0x470d360fe5 sbinuptime=0x46c869f6aa
>
> where the reporting code looks like:
>
> static void
> sleepq_timeout(void *arg)
> {
>     struct sleepqueue_chain *sc __unused;
>     struct sleepqueue *sq;
>     struct thread *td;
>     void *wchan;
>     int wakeup_swapper;
>     sbintime_t sbut; // HACK!!!
>
>     td = arg;
>     wakeup_swapper = 0;
>     if (26 == td->td_proc->p_pid) // HACK!!!
>         CTR3(KTR_PROC, "sleepq_timeout: thread %p (pid %ld, %s)",
>             (void *)td, (long)td->td_proc->p_pid, (void *)td->td_name);
>
>     thread_lock(td);
>
>     sbut= sbinuptime(); // HACK!!!
>     if (td->td_sleeptimo > sbut || td->td_sleeptimo == 0) {
>         /*
>          * The thread does not want a timeout (yet).
>          */
>         if (26 == td->td_proc->p_pid) // HACK!!!
>             CTR5(KTR_PROC, "sleepq_timeout thread not want timeout yet: thread %p (pid %ld, %s) td_sleeptimo=%jx sbinuptime=%jx",
>                 (void *)td, (long)td->td_proc->p_pid, (void *)td->td_name, (uintmax_t)td->td_sleeptimo, (uintmax_t)sbut);
> . . .
>
> So far sleepq_set_timeout_sbt being given sbt==0xfffffed8 instead of
> sbt==0xcccccbe0 seems to be an accurate indicator of whether the problem
> will happen in sleepq_timeout. (But I've only a few examples so far.)
>

I've now seen examples of sbt==0xfffffed8 that did not
lead to a hangup. So it is not a reliable predictor of
the hang-up in sleep. It looks like the values are sometimes
more varied than I'd seen before as well.

> I'll note that the sleepq_timeout code for this case does not set up
> another callout to itself for later and the sleep then continues
> indefinitely.
>
> I've not yet gotten into finding evidence for why the callout to
> sleepq_timeout itself happens. Hopefully I can find some.
>
>
> An example of some modified KTR_PROC output is:
>
> . . .

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went away in
early 2018-Mar)

From owner-freebsd-ppc@freebsd.org Tue Feb 26 12:29:57 2019
From: Mark Millard
Subject: The FreeBSD code structure that sometimes break powerpc64's pmac_thermal, buf*daemon* threads, etc. (stuck sleeping)
Message-Id: <1C8071BF-5B3B-416D-8CE4-437D4555EEB5@yahoo.com>
Date: Tue, 26 Feb 2019 04:29:44 -0800
To: Justin Hibbits, FreeBSD PowerPC ML, freebsd-hackers Hackers

[head -r334018 based. I temporarily changed what messaging
happens for KTR_PROC in order to gather information. As stands
it was reporting on pmac_thermal because I can listen for the
fans to know the stuck-sleeping has happened recently.]

A powerpc64 sleepq_timeout callout usage has the structure:

0xe00000009af7c730: at sleepq_timeout+0x148
0xe00000009af7c7d0: at softclock_call_cc+0x234
0xe00000009af7c910: at callout_process+0x2e0
0xe00000009af7c9f0: at handleevents+0x22c
0xe00000009af7caa0: at timercb+0x340
0xe00000009af7cba0: at decr_intr+0x140
0xe00000009af7cbd0: at powerpc_interrupt+0x268

timercb does:

        now = sbinuptime();
. . .
        handleevents(now, 0);

That in turn leads to:

        if (now >= state->nextcallopt || now >= state->nextcall) {
                state->nextcall = state->nextcallopt = SBT_MAX;
                callout_process(now);
        }

Which leads to:

                        if (tmp->c_time <= now) {
. . .
                                if (tmp->c_iflags & CALLOUT_DIRECT) {
. . .
                                        softclock_call_cc(tmp, cc,
#ifdef CALLOUT_PROFILING
                                            &mpcalls_dir, &lockcalls_dir, NULL,
#endif
                                            1);

So softclock_call_cc and sleepq_timeout do not get a copy
of timercb's now value. (I will refer to this now value
in contexts for which it is not accessible.)

sleepq_timeout uses:

        td->td_sleeptimo > sbinuptime()

to decide not to do something like:

        sleepq_resume_thread(. . .)

but also does not use sleepq_set_timeout_sbt to
set up another sleepq_timeout callout or do some
other such under this condition. (So what is
effectively a no-op ends up being the last activity
before the thread is stuck asleep.)

(I will continue to write sbinuptime() to reference
the value used in that test.)

With multiple processors, it is observed that,
despite sbinuptime() being physically later:

        sbinuptime() < now

does sometimes happen for the code involved,
for example:

sbinuptime(): 0x11abf13bd142
now         : 0x11accb5df419

This is even though:

        tmp->c_time == td->td_sleeptimo

is observed.
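
The interaction of the two tests can be modelled in a few lines of
userland C. This is only a sketch under stated assumptions: the names
callout_due, sleepq_timeout_model and callout_consumed_without_wakeup
are hypothetical and do not exist in the kernel, and the model keeps
only the time comparisons quoted above, with none of the locking or
sleep-queue handling.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* FreeBSD's sbintime_t is a signed 64-bit value (32.32 fixed point). */
typedef int64_t sbintime_t;

enum timeout_action { TIMEOUT_SKIPPED, TIMEOUT_WAKEUP };

/* Model of the callout_process() test: run the callout once c_time <= now,
 * where now is the value timercb sampled at the start of the chain. */
static bool
callout_due(sbintime_t c_time, sbintime_t now)
{
	return (c_time <= now);
}

/* Model of the first branch of sleepq_timeout(): it re-reads the clock
 * (fresh_uptime stands in for that second sbinuptime() call) and does
 * nothing when the deadline still looks like it is in the future. */
static enum timeout_action
sleepq_timeout_model(sbintime_t td_sleeptimo, sbintime_t fresh_uptime)
{
	if (td_sleeptimo > fresh_uptime || td_sleeptimo == 0)
		return (TIMEOUT_SKIPPED);
	return (TIMEOUT_WAKEUP);
}

/* True when the callout fires (so it is used up) but the handler skips
 * the wakeup. Assumes c_time == td_sleeptimo and a nonzero deadline. */
static bool
callout_consumed_without_wakeup(sbintime_t c_time, sbintime_t now,
    sbintime_t fresh_uptime)
{
	assert(c_time != 0);
	return (callout_due(c_time, now) &&
	    sleepq_timeout_model(c_time, fresh_uptime) == TIMEOUT_SKIPPED);
}
```

With the values reported in this message (c_time == td_sleeptimo ==
0x11ac2096af71, now == 0x11accb5df419, second reading ==
0x11abf13bd142) the model reports the callout as due and the wakeup as
skipped. Passing the same now value to both tests makes it report a
wakeup instead.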
sbinuptime() < now sometimes leads to:

        sbinuptime() < tmp->c_time == td->td_sleeptimo <= now

for example the following happened:

sbinuptime(): 0x11abf13bd142
tmp->c_time : 0x11ac2096af71 [== td->td_sleeptimo]
now         : 0x11accb5df419

As I understand it, keeping the various
powerpc64 CPUs' sbinuptime() values fully
synchronized is unlikely so the problem would
still exist even if "closer but not identical"
across CPUs was achieved.

[The testing context here is an old PowerMac G5
4-core (system total), which actually involves
2 sockets, 2 cores per socket.]

Overall this structure does not seem to be
designed to handle variations in sbinuptime()
values across CPUs.


The sleepq_timeout source code (before my
limited modifications for reporting more):

static void
sleepq_timeout(void *arg)
{
        struct sleepqueue_chain *sc __unused;
        struct sleepqueue *sq;
        struct thread *td;
        void *wchan;
        int wakeup_swapper;

        td = arg;
        wakeup_swapper = 0;
        CTR3(KTR_PROC, "sleepq_timeout: thread %p (pid %ld, %s)",
            (void *)td, (long)td->td_proc->p_pid, (void *)td->td_name);

        thread_lock(td);

        if (td->td_sleeptimo > sbinuptime() || td->td_sleeptimo == 0) {
                /*
                 * The thread does not want a timeout (yet).
                 */
        } else if (TD_IS_SLEEPING(td) && TD_ON_SLEEPQ(td)) {
                /*
                 * See if the thread is asleep and get the wait
                 * channel if it is.
                 */
                wchan = td->td_wchan;
                sc = SC_LOOKUP(wchan);
                THREAD_LOCKPTR_ASSERT(td, &sc->sc_lock);
                sq = sleepq_lookup(wchan);
                MPASS(sq != NULL);
                td->td_flags |= TDF_TIMEOUT;
                wakeup_swapper = sleepq_resume_thread(sq, td, 0);
        } else if (TD_ON_SLEEPQ(td)) {
                /*
                 * If the thread is on the SLEEPQ but isn't sleeping
                 * yet, it can either be on another CPU in between
                 * sleepq_add() and one of the sleepq_*wait*()
                 * routines or it can be in sleepq_catch_signals().
                 */
                td->td_flags |= TDF_TIMEOUT;
        }

        thread_unlock(td);
        if (wakeup_swapper)
                kick_proc0();
}


I wonder if just not having:

        td->td_sleeptimo > sbinuptime() ||

in the if (. . .)
is appropriate for the powerpc64 context in
use: presume callout_process's tmp->c_time <= now is
sufficient? (Vs: race issues?)



Adjusted KTR_PROC output example ("show ktr /v" in ddb
captured, so newest to oldest order, flags and such are
in hex despite my not providing 0x prefixes):

922 (0xc00000000d12f000:cpu3) 151713214399 /usr/src/sys/kern/subr_sleepqueue.c.1027: sleepq_timeout thread not want timeout yet: thread 0xc00000000a8315a0 (pid 26, pmac_thermal) td_sleeptimo=11ac2096af71 sbinuptime=11abf13bd142
921 (0xc00000000d12f000:cpu3) 151713214374 /usr/src/sys/kern/subr_sleepqueue.c.1016: sleepq_timeout: thread 0xc00000000a8315a0 (pid 26, pmac_thermal)
920 (0xc00000000d12f000:cpu3) 151713214337 /usr/src/sys/kern/kern_timeout.c.503: callout_process to call softclock_clock_cc: thread 0xc00000000a8315a0 c_time=11ac2096af71 now=11accb5df419
919 (0xc00000000a8315a0:cpu3) 151686051357 /usr/src/sys/kern/kern_synch.c.435: mi_switch: old thread 100078 (td_sched 0xc00000000a831ad8, pid 26, pmac_thermal)
918 (0xc00000000a8315a0:cpu3) 151686051310 /usr/src/sys/kern/subr_sleepqueue.c.409: sleepq_set_timeout_sbt: sbt=fffffed8 pr=0 flags=100 td_sleeptimo=11ac2096af71 prec=fffffed flags=501
917 (0xc00000000a8315a0:cpu3) 151686051306 /usr/src/sys/kern/subr_sleepqueue.c.407: sleepq_set_timeout_sbt installing sleepq_timeout: thread 0xc00000000a8315a0 (pid 26, pmac_thermal)
916 (0xc00000000a8315a0:cpu3) 151686051267 /usr/src/sys/kern/kern_synch.c.180: sleep: thread 100078 (pid 26, pmac_thermal) on pmac_therm (0xc0000000015082c3)
915 (0xc00000000a8315a0:cpu3) 151686051162 /usr/src/sys/kern/subr_sleepqueue.c.631: sleepq resume: thread 0xc00000000a8315a0 (pid 26, pmac_thermal)
914 (0xc00000000a8315a0:cpu3) 151686051144 /usr/src/sys/kern/kern_synch.c.445: mi_switch: new thread 100078 (td_sched 0xc00000000a831ad8, pid 26, pmac_thermal)
913 (0xc000000001f6e000:cpu0) 151686050654 /usr/src/sys/kern/subr_sleepqueue.c.837: sleepq_wakeup: thread 0xc00000000a8315a0 (pid 26, pmac_thermal)
912 (0xc000000001f6e000:cpu0) 151686050634 /usr/src/sys/kern/subr_sleepqueue.c.989: sleepq_remove_matching calling sleepq_resume_thread: thread 0xc00000000a8315a0 (pid 26, pmac_thermal)
911 (0xc00000000a8315a0:cpu3) 151684339557 /usr/src/sys/kern/kern_synch.c.435: mi_switch: old thread 100078 (td_sched 0xc00000000a831ad8, pid 26, pmac_thermal)
910 (0xc00000000a8315a0:cpu3) 151684339525 /usr/src/sys/kern/subr_sleepqueue.c.409: sleepq_set_timeout_sbt: sbt=cccccbe0 pr=0 flags=100 td_sleeptimo=11abe0139d4d prec=cccccbe flags=501
909 (0xc00000000a8315a0:cpu3) 151684339517 /usr/src/sys/kern/subr_sleepqueue.c.407: sleepq_set_timeout_sbt installing sleepq_timeout: thread 0xc00000000a8315a0 (pid 26, pmac_thermal)
908 (0xc00000000a8315a0:cpu3) 151684339498 /usr/src/sys/kern/kern_synch.c.180: sleep: thread 100078 (pid 26, pmac_thermal) on smu (0xe000000087fd1670)
907 (0xc00000000a8315a0:cpu3) 151684339326 /usr/src/sys/kern/subr_sleepqueue.c.631: sleepq resume: thread 0xc00000000a8315a0 (pid 26, pmac_thermal)
906 (0xc00000000a8315a0:cpu3) 151684339313 /usr/src/sys/kern/kern_synch.c.445: mi_switch: new thread 100078 (td_sched 0xc00000000a831ad8, pid 26, pmac_thermal)
905 (0xc000000001f6e000:cpu3) 151684338924 /usr/src/sys/kern/subr_sleepqueue.c.837: sleepq_wakeup: thread 0xc00000000a8315a0 (pid 26, pmac_thermal)
904 (0xc000000001f6e000:cpu3) 151684338918 /usr/src/sys/kern/subr_sleepqueue.c.989: sleepq_remove_matching calling sleepq_resume_thread: thread 0xc00000000a8315a0 (pid 26, pmac_thermal)
903 (0xc00000000a8315a0:cpu3) 151682628069 /usr/src/sys/kern/kern_synch.c.435: mi_switch: old thread 100078 (td_sched 0xc00000000a831ad8, pid 26, pmac_thermal)
902 (0xc00000000a8315a0:cpu3) 151682628004 /usr/src/sys/kern/subr_sleepqueue.c.409: sleepq_set_timeout_sbt: sbt=cccccbe0 pr=0 flags=100 td_sleeptimo=11abd3054758 prec=cccccbe flags=501
901 (0xc00000000a8315a0:cpu3) 151682628000 /usr/src/sys/kern/subr_sleepqueue.c.407: sleepq_set_timeout_sbt installing sleepq_timeout: thread 0xc00000000a8315a0 (pid 26, pmac_thermal)
900 (0xc00000000a8315a0:cpu3) 151682627960 /usr/src/sys/kern/kern_synch.c.180: sleep: thread 100078 (pid 26, pmac_thermal) on smu (0xe000000087fd1670)
899 (0xc00000000a8315a0:cpu3) 151682627706 /usr/src/sys/kern/subr_sleepqueue.c.631: sleepq resume: thread 0xc00000000a8315a0 (pid 26, pmac_thermal)
898 (0xc00000000a8315a0:cpu3) 151682627683 /usr/src/sys/kern/kern_synch.c.445: mi_switch: new thread 100078 (td_sched 0xc00000000a831ad8, pid 26, pmac_thermal)
897 (0xc000000001f6e000:cpu0) 151682627254 /usr/src/sys/kern/subr_sleepqueue.c.837: sleepq_wakeup: thread 0xc00000000a8315a0 (pid 26, pmac_thermal)
896 (0xc000000001f6e000:cpu0) 151682627242 /usr/src/sys/kern/subr_sleepqueue.c.989: sleepq_remove_matching calling sleepq_resume_thread: thread 0xc00000000a8315a0 (pid 26, pmac_thermal)
895 (0xc00000000a8315a0:cpu0) 151680915222 /usr/src/sys/kern/kern_synch.c.435: mi_switch: old thread 100078 (td_sched 0xc00000000a831ad8, pid 26, pmac_thermal)
894 (0xc00000000a8315a0:cpu0) 151680915139 /usr/src/sys/kern/subr_sleepqueue.c.409: sleepq_set_timeout_sbt: sbt=cccccbe0 pr=0 flags=100 td_sleeptimo=11abc5f6f163 prec=cccccbe flags=501
893 (0xc00000000a8315a0:cpu0) 151680915133 /usr/src/sys/kern/subr_sleepqueue.c.407: sleepq_set_timeout_sbt installing sleepq_timeout: thread 0xc00000000a8315a0 (pid 26, pmac_thermal)
892 (0xc00000000a8315a0:cpu0) 151680915085 /usr/src/sys/kern/kern_synch.c.180: sleep: thread 100078 (pid 26, pmac_thermal) on smu (0xe000000087fd1670)
891 (0xc00000000a8315a0:cpu0) 151680914811 /usr/src/sys/kern/subr_sleepqueue.c.631: sleepq resume: thread 0xc00000000a8315a0 (pid 26, pmac_thermal)
890 (0xc00000000a8315a0:cpu0) 151680914784 /usr/src/sys/kern/kern_synch.c.445: mi_switch: new thread 100078 (td_sched 0xc00000000a831ad8, pid 26, pmac_thermal)
889 (0xc000000001f6e000:cpu2) 151680914304 /usr/src/sys/kern/subr_sleepqueue.c.837: sleepq_wakeup: thread 0xc00000000a8315a0 (pid 26, pmac_thermal)
888 (0xc000000001f6e000:cpu2) 151680914284 /usr/src/sys/kern/subr_sleepqueue.c.989: sleepq_remove_matching calling sleepq_resume_thread: thread 0xc00000000a8315a0 (pid 26, pmac_thermal)
. . .

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went away in
early 2018-Mar)

From owner-freebsd-ppc@freebsd.org Tue Feb 26 20:25:10 2019
From: Mark Millard
Subject: Re: The FreeBSD code structure that sometimes break powerpc64's pmac_thermal, buf*daemon* threads, etc. (stuck sleeping)
Date: Tue, 26 Feb 2019 12:25:00 -0800
References: <1C8071BF-5B3B-416D-8CE4-437D4555EEB5@yahoo.com>
To: Justin Hibbits, FreeBSD PowerPC ML, freebsd-hackers Hackers
In-Reply-To: <1C8071BF-5B3B-416D-8CE4-437D4555EEB5@yahoo.com>
Message-Id: <4DFDC776-BE0E-4952-8414-6B7ACFEC815B@yahoo.com>

[In one aspect of my comments I seem to have misapplied some
generic background information: cross CPU sbinuptime() values
are not involved for timercb vs. sleepq_timeout in the call
chain. It sometimes goes backwards on the same CPU during the
call chain.]
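
The size of such a backwards step can be checked from two readings
taken in order on the same CPU. What follows is a minimal userland
sketch, not kernel code: backward_step and sbt_to_ms are hypothetical
helper names, and the only fact relied on is that sbintime_t is a
signed 64-bit 32.32 fixed-point count of seconds.

```c
#include <assert.h>
#include <stdint.h>

/* FreeBSD's sbintime_t: upper 32 bits are seconds, lower 32 bits are
 * the binary fraction of a second. */
typedef int64_t sbintime_t;

/* Given two readings taken in program order on one CPU, return how far
 * the later reading lies behind the earlier one, or 0 when the clock
 * did not go backwards. */
static sbintime_t
backward_step(sbintime_t earlier, sbintime_t later)
{
	return ((later < earlier) ? earlier - later : 0);
}

/* Convert a non-negative sbintime_t interval to whole milliseconds
 * (truncating). Intended for short intervals such as a backward step;
 * the multiplication would overflow for extremely large inputs. */
static int64_t
sbt_to_ms(sbintime_t sbt)
{
	assert(sbt >= 0);
	return ((sbt * 1000) >> 32);
}
```

Applied to the pair from the trace (now=11accb5df419 sampled first by
timercb, sbinuptime=11abf13bd142 read afterwards in sleepq_timeout),
backward_step gives 0xda2222d7, which is somewhat under one second.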

On 2019-Feb-26, at 04:29, Mark Millard wrote:

> [head -r334018 based. I temporarily changed what messaging
> happens for KTR_PROC in order to gather information. As stands
> it was reporting on pmac_thermal because I can listen for the
> fans to know the stuck-sleeping has happened recently.]
>
> A powerpc64 sleepq_timeout callout usage has the structure:
>
> 0xe00000009af7c730: at sleepq_timeout+0x148
> 0xe00000009af7c7d0: at softclock_call_cc+0x234
> 0xe00000009af7c910: at callout_process+0x2e0
> 0xe00000009af7c9f0: at handleevents+0x22c
> 0xe00000009af7caa0: at timercb+0x340
> 0xe00000009af7cba0: at decr_intr+0x140
> 0xe00000009af7cbd0: at powerpc_interrupt+0x268
>
> timercb does:
>
>         now = sbinuptime();
> . . .
>         handleevents(now, 0);
>
> That in turn leads to:
>
>         if (now >= state->nextcallopt || now >= state->nextcall) {
>                 state->nextcall = state->nextcallopt = SBT_MAX;
>                 callout_process(now);
>         }
>
> Which leads to:
>
>                         if (tmp->c_time <= now) {
> . . .
>                                 if (tmp->c_iflags & CALLOUT_DIRECT) {
> . . .
>                                         softclock_call_cc(tmp, cc,
> #ifdef CALLOUT_PROFILING
>                                             &mpcalls_dir, &lockcalls_dir, NULL,
> #endif
>                                             1);
>
> So softclock_call_cc and sleepq_timeout do not get a copy
> of timercb's now value. (I will refer to this now value
> in contexts for which it is not accessible.)
>
> sleepq_timeout uses:
>
>         td->td_sleeptimo > sbinuptime()
>
> to decide not to do something like:
>
>         sleepq_resume_thread(. . .)
>
> but also does not use sleepq_set_timeout_sbt to
> set up another sleepq_timeout callout or do some
> other such under this condition. (So what is
> effectively a no-op ends up being the last activity
> before the thread is stuck asleep.)
>
> (I will continue to write sbinuptime() to reference
> the value used in that test.)
>
> With multiple processors, it is observed that,
> despite sbinuptime() being physically later:
>
>         sbinuptime() < now
>
> does sometimes happen for the code involved,
> for example:
>
> sbinuptime(): 0x11abf13bd142
> now         : 0x11accb5df419
>
> This is even though:
>
>         tmp->c_time == td->td_sleeptimo
>
> is observed. sbinuptime() < now sometimes leads
> to:
>
>         sbinuptime() < tmp->c_time == td->td_sleeptimo <= now
>
> for example the following happened:
>
> sbinuptime(): 0x11abf13bd142
> tmp->c_time : 0x11ac2096af71 [== td->td_sleeptimo]
> now         : 0x11accb5df419
>
> As I understand it, keeping the various
> powerpc64 CPUs' sbinuptime() values fully
> synchronized is unlikely so the problem would
> still exist even if "closer but not identical"
> across CPUs was achieved.

The comments about various CPUs having somewhat
mismatched sbinuptime() values do not seem to
apply to timercb vs. softclock_call_cc in:

0xe00000009af7c7d0: at softclock_call_cc+0x234
0xe00000009af7c910: at callout_process+0x2e0
0xe00000009af7c9f0: at handleevents+0x22c
0xe00000009af7caa0: at timercb+0x340

But that means that sbinuptime() values are going
backwards on the same CPU during that call chain's
activity! Ouch.

> [The testing context here is an old PowerMac G5
> 4-core (system total), which actually involves
> 2 sockets, 2 cores per socket.]
>
> Overall this structure does not seem to be
> designed to handle variations in sbinuptime()
> values across CPUs.
>
>
> The sleepq_timeout source code (before my
> limited modifications for reporting more):
>
> static void
> sleepq_timeout(void *arg)
> {
>         struct sleepqueue_chain *sc __unused;
>         struct sleepqueue *sq;
>         struct thread *td;
>         void *wchan;
>         int wakeup_swapper;
>
>         td = arg;
>         wakeup_swapper = 0;
>         CTR3(KTR_PROC, "sleepq_timeout: thread %p (pid %ld, %s)",
>             (void *)td, (long)td->td_proc->p_pid, (void *)td->td_name);
>
>         thread_lock(td);
>
>         if (td->td_sleeptimo > sbinuptime() || td->td_sleeptimo == 0) {
>                 /*
>                  * The thread does not want a timeout (yet).
>                  */
>         } else if (TD_IS_SLEEPING(td) && TD_ON_SLEEPQ(td)) {
>                 /*
>                  * See if the thread is asleep and get the wait
>                  * channel if it is.
>                  */
>                 wchan = td->td_wchan;
>                 sc = SC_LOOKUP(wchan);
>                 THREAD_LOCKPTR_ASSERT(td, &sc->sc_lock);
>                 sq = sleepq_lookup(wchan);
>                 MPASS(sq != NULL);
>                 td->td_flags |= TDF_TIMEOUT;
>                 wakeup_swapper = sleepq_resume_thread(sq, td, 0);
>         } else if (TD_ON_SLEEPQ(td)) {
>                 /*
>                  * If the thread is on the SLEEPQ but isn't sleeping
>                  * yet, it can either be on another CPU in between
>                  * sleepq_add() and one of the sleepq_*wait*()
>                  * routines or it can be in sleepq_catch_signals().
>                  */
>                 td->td_flags |= TDF_TIMEOUT;
>         }
>
>         thread_unlock(td);
>         if (wakeup_swapper)
>                 kick_proc0();
> }
>
>
> I wonder if just not having:
>
>         td->td_sleeptimo > sbinuptime() ||
>
> in the if (. . .) is appropriate for the powerpc64 context in
> use: presume callout_process's tmp->c_time <= now is
> sufficient? (Vs: race issues?)
>=20 >=20 >=20 > Adjusted KTR_PROC output example ("show ktr /v" in ddb > captured, so newest to oldest order, flags and such are > in hex despite my not providing 0x prefixes): >=20 > 922 (0xc00000000d12f000:cpu3) 151713214399 = /usr/src/sys/kern/subr_sleepqueue.c.1027: sleepq_timeout thread not want = timeout yet: thread 0xc00000000a8315a0 (pid 26, pmac_thermal) = td_sleeptimo=3D11ac2096af71 sbinuptime=3D11abf13bd142 > 921 (0xc00000000d12f000:cpu3) 151713214374 = /usr/src/sys/kern/subr_sleepqueue.c.1016: sleepq_timeout: thread = 0xc00000000a8315a0 (pid 26, pmac_thermal) > 920 (0xc00000000d12f000:cpu3) 151713214337 = /usr/src/sys/kern/kern_timeout.c.503: callout_process to call = softclock_clock_cc: thread 0xc00000000a8315a0 c_time=3D11ac2096af71 = now=3D11accb5df419 > 919 (0xc00000000a8315a0:cpu3) 151686051357 = /usr/src/sys/kern/kern_synch.c.435: mi_switch: old thread 100078 = (td_sched 0xc00000000a831ad8, pid 26, pmac_thermal) > 918 (0xc00000000a8315a0:cpu3) 151686051310 = /usr/src/sys/kern/subr_sleepqueue.c.409: sleepq_set_timeout_sbt: = sbt=3Dfffffed8 pr=3D0 flags=3D100 td_sleeptimo=3D11ac2096af71 = prec=3Dfffffed flags=3D501 > 917 (0xc00000000a8315a0:cpu3) 151686051306 = /usr/src/sys/kern/subr_sleepqueue.c.407: sleepq_set_timeout_sbt = installing sleepq_timeout: thread 0xc00000000a8315a0 (pid 26, = pmac_thermal) > 916 (0xc00000000a8315a0:cpu3) 151686051267 = /usr/src/sys/kern/kern_synch.c.180: sleep: thread 100078 (pid 26, = pmac_thermal) on pmac_therm (0xc0000000015082c3) > 915 (0xc00000000a8315a0:cpu3) 151686051162 = /usr/src/sys/kern/subr_sleepqueue.c.631: sleepq resume: thread = 0xc00000000a8315a0 (pid 26, pmac_thermal) > 914 (0xc00000000a8315a0:cpu3) 151686051144 = /usr/src/sys/kern/kern_synch.c.445: mi_switch: new thread 100078 = (td_sched 0xc00000000a831ad8, pid 26, pmac_thermal) > 913 (0xc000000001f6e000:cpu0) 151686050654 = /usr/src/sys/kern/subr_sleepqueue.c.837: sleepq_wakeup: thread = 0xc00000000a8315a0 (pid 26, pmac_thermal) > 912 
(0xc000000001f6e000:cpu0) 151686050634 = /usr/src/sys/kern/subr_sleepqueue.c.989: sleepq_remove_matching calling = sleepq_resume_thread: thread 0xc00000000a8315a0 (pid 26, pmac_thermal) > 911 (0xc00000000a8315a0:cpu3) 151684339557 = /usr/src/sys/kern/kern_synch.c.435: mi_switch: old thread 100078 = (td_sched 0xc00000000a831ad8, pid 26, pmac_thermal) > 910 (0xc00000000a8315a0:cpu3) 151684339525 = /usr/src/sys/kern/subr_sleepqueue.c.409: sleepq_set_timeout_sbt: = sbt=3Dcccccbe0 pr=3D0 flags=3D100 td_sleeptimo=3D11abe0139d4d = prec=3Dcccccbe flags=3D501 > 909 (0xc00000000a8315a0:cpu3) 151684339517 = /usr/src/sys/kern/subr_sleepqueue.c.407: sleepq_set_timeout_sbt = installing sleepq_timeout: thread 0xc00000000a8315a0 (pid 26, = pmac_thermal) > 908 (0xc00000000a8315a0:cpu3) 151684339498 = /usr/src/sys/kern/kern_synch.c.180: sleep: thread 100078 (pid 26, = pmac_thermal) on smu (0xe000000087fd1670) > 907 (0xc00000000a8315a0:cpu3) 151684339326 = /usr/src/sys/kern/subr_sleepqueue.c.631: sleepq resume: thread = 0xc00000000a8315a0 (pid 26, pmac_thermal) > 906 (0xc00000000a8315a0:cpu3) 151684339313 = /usr/src/sys/kern/kern_synch.c.445: mi_switch: new thread 100078 = (td_sched 0xc00000000a831ad8, pid 26, pmac_thermal) > 905 (0xc000000001f6e000:cpu3) 151684338924 = /usr/src/sys/kern/subr_sleepqueue.c.837: sleepq_wakeup: thread = 0xc00000000a8315a0 (pid 26, pmac_thermal) > 904 (0xc000000001f6e000:cpu3) 151684338918 = /usr/src/sys/kern/subr_sleepqueue.c.989: sleepq_remove_matching calling = sleepq_resume_thread: thread 0xc00000000a8315a0 (pid 26, pmac_thermal) > 903 (0xc00000000a8315a0:cpu3) 151682628069 = /usr/src/sys/kern/kern_synch.c.435: mi_switch: old thread 100078 = (td_sched 0xc00000000a831ad8, pid 26, pmac_thermal) > 902 (0xc00000000a8315a0:cpu3) 151682628004 = /usr/src/sys/kern/subr_sleepqueue.c.409: sleepq_set_timeout_sbt: = sbt=3Dcccccbe0 pr=3D0 flags=3D100 td_sleeptimo=3D11abd3054758 = prec=3Dcccccbe flags=3D501 > 901 (0xc00000000a8315a0:cpu3) 151682628000 = 
/usr/src/sys/kern/subr_sleepqueue.c.407: sleepq_set_timeout_sbt = installing sleepq_timeout: thread 0xc00000000a8315a0 (pid 26, = pmac_thermal) > 900 (0xc00000000a8315a0:cpu3) 151682627960 = /usr/src/sys/kern/kern_synch.c.180: sleep: thread 100078 (pid 26, = pmac_thermal) on smu (0xe000000087fd1670) > 899 (0xc00000000a8315a0:cpu3) 151682627706 = /usr/src/sys/kern/subr_sleepqueue.c.631: sleepq resume: thread = 0xc00000000a8315a0 (pid 26, pmac_thermal) > 898 (0xc00000000a8315a0:cpu3) 151682627683 = /usr/src/sys/kern/kern_synch.c.445: mi_switch: new thread 100078 = (td_sched 0xc00000000a831ad8, pid 26, pmac_thermal) > 897 (0xc000000001f6e000:cpu0) 151682627254 = /usr/src/sys/kern/subr_sleepqueue.c.837: sleepq_wakeup: thread = 0xc00000000a8315a0 (pid 26, pmac_thermal) > 896 (0xc000000001f6e000:cpu0) 151682627242 = /usr/src/sys/kern/subr_sleepqueue.c.989: sleepq_remove_matching calling = sleepq_resume_thread: thread 0xc00000000a8315a0 (pid 26, pmac_thermal) > 895 (0xc00000000a8315a0:cpu0) 151680915222 = /usr/src/sys/kern/kern_synch.c.435: mi_switch: old thread 100078 = (td_sched 0xc00000000a831ad8, pid 26, pmac_thermal) > 894 (0xc00000000a8315a0:cpu0) 151680915139 = /usr/src/sys/kern/subr_sleepqueue.c.409: sleepq_set_timeout_sbt: = sbt=3Dcccccbe0 pr=3D0 flags=3D100 td_sleeptimo=3D11abc5f6f163 = prec=3Dcccccbe flags=3D501 > 893 (0xc00000000a8315a0:cpu0) 151680915133 = /usr/src/sys/kern/subr_sleepqueue.c.407: sleepq_set_timeout_sbt = installing sleepq_timeout: thread 0xc00000000a8315a0 (pid 26, = pmac_thermal) > 892 (0xc00000000a8315a0:cpu0) 151680915085 = /usr/src/sys/kern/kern_synch.c.180: sleep: thread 100078 (pid 26, = pmac_thermal) on smu (0xe000000087fd1670) > 891 (0xc00000000a8315a0:cpu0) 151680914811 = /usr/src/sys/kern/subr_sleepqueue.c.631: sleepq resume: thread = 0xc00000000a8315a0 (pid 26, pmac_thermal) > 890 (0xc00000000a8315a0:cpu0) 151680914784 = /usr/src/sys/kern/kern_synch.c.445: mi_switch: new thread 100078 = (td_sched 0xc00000000a831ad8, pid 26, 
pmac_thermal) > 889 (0xc000000001f6e000:cpu2) 151680914304 = /usr/src/sys/kern/subr_sleepqueue.c.837: sleepq_wakeup: thread = 0xc00000000a8315a0 (pid 26, pmac_thermal) > 888 (0xc000000001f6e000:cpu2) 151680914284 = /usr/src/sys/kern/subr_sleepqueue.c.989: sleepq_remove_matching calling = sleepq_resume_thread: thread 0xc00000000a8315a0 (pid 26, pmac_thermal) > . . . =3D=3D=3D Mark Millard marklmi at yahoo.com ( dsl-only.net went away in early 2018-Mar) From owner-freebsd-ppc@freebsd.org Tue Feb 26 21:11:57 2019 Return-Path: Delivered-To: freebsd-ppc@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id A47701504135 for ; Tue, 26 Feb 2019 21:11:57 +0000 (UTC) (envelope-from marklmi@yahoo.com) Received: from sonic305-22.consmr.mail.ne1.yahoo.com (sonic305-22.consmr.mail.ne1.yahoo.com [66.163.185.148]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 862C8737A6 for ; Tue, 26 Feb 2019 21:11:56 +0000 (UTC) (envelope-from marklmi@yahoo.com) X-YMail-OSG: OLsV_NQVM1mvRRyCyIsKLRyN9sXbjQhUgFV0ofmvE5TNVz2Y_L2KAEQt8EeYe1H Oj5AyG_pUrXTpfBw1AS.ZzOvfnV1RSE5Wc1K9ejhagYG_NRU6pyzAmJG0XQb6lDv27FtzGU97ora RLbbQl2.djkUSNKSbsq3b8U2TFWZD8SMIj7B5xgZsegO.2aMCEJ53U542_celG584K3rMl3XUY2M gjOfBlvYFE_TgGp3kcVGvnwIHSn06f4sZWPMhSx6Z6Gh12djB_Ri3UvsfDQ6waKQ4ECRtp3kurwI 3BdWUDh8RnWn9iaqDnMyxFEzNvyOXKg01iZLvl50u3cEWnpDD6vFKQpwEBIIhUbShLSDiAjGITQV KS0pu_s8Kwj.vL2S9rnylxSrKOEmiROaOHMxIDwcS9ezCzJVmhVIyhnLiqm2Wsgel0kUD94AY9_F r88xDw4Hk99JWRUwrNIHsERiDf9L9vuR6raVHg8N.DRmC__3JuehgmVH5mngIaNKySa._VGDxs.P byKTOq853yZrOkVSKKNTMAkKMgGRHGlwQfN.mEZ7w4fvgn0vTJyI3nbXa1hq_7acyzsnYlyTz_ou BOpLFxRFcELaEUEQ3iTKZmj7YSYTlE.tFHi7fq7MA7vEGgAu5UN996YcARn5W7qNqBpQSGAApn4w J0vjkoxBe1akY9xn1dCcd1Rm_YGl1uZhJElCV.UokT7yed7789dXS_bpLtefE6hYlftNBFhhJZM. 
ktDynL7ALqbIaZ1n_s67Wc3dY7nWifa9fBPH_sjE8ELMGgSdRvcz6cmmlXRbhBJBI_SLlpjscrdv X_yA8atcd8VkJB9Md7pHKWVui26aSxVDXorwVEJLE2tqC4yAK9lHAlV9V42rxKDR5md5Giva.tIf I0hdcgg91iOsi0xGVJ70lXvgUoJ7UgPlIsHPJ7J6Gj2dvdhFhigQ5JLhBDHkyr3ukNH9vFARw2Dd T0Bgu.Jx6NbFl4rpHy4JZfsMd48ZQx1zsqWdMjpfdxd6lxFa3hF4CVtH9IZ7JFubQKZROpK8t27n cOUnBPQnYBrURql.Bvypq06p6lhvtcBuwWjpQCqEAobFghmysKwjfswL4j84- Received: from sonic.gate.mail.ne1.yahoo.com by sonic305.consmr.mail.ne1.yahoo.com with HTTP; Tue, 26 Feb 2019 21:11:55 +0000 Received: from c-67-170-167-181.hsd1.or.comcast.net (EHLO [192.168.1.115]) ([67.170.167.181]) by smtp417.mail.ne1.yahoo.com (Oath Hermes SMTP Server) with ESMTPA ID 5a67ab55de4bbffc4121695e74b1b6c7; Tue, 26 Feb 2019 21:11:53 +0000 (UTC) Content-Type: text/plain; charset=us-ascii Mime-Version: 1.0 (Mac OS X Mail 12.2 \(3445.102.3\)) Subject: Re: An experimental hack that appears to allow old PowerMacG5 4-core (system total) system to boot reliably (head -r343884 based context) From: Mark Millard In-Reply-To: <466B6E08-5631-41FB-A1FD-263C27519F65@yahoo.com> Date: Tue, 26 Feb 2019 13:11:52 -0800 Cc: FreeBSD PowerPC ML , Dennis Clarke Content-Transfer-Encoding: quoted-printable Message-Id: References: <466B6E08-5631-41FB-A1FD-263C27519F65@yahoo.com> To: Justin Hibbits X-Mailer: Apple Mail (2.3445.102.3) X-Rspamd-Queue-Id: 862C8737A6 X-Spamd-Bar: ++ X-Spamd-Result: default: False [2.11 / 15.00]; RCVD_VIA_SMTP_AUTH(0.00)[]; R_SPF_ALLOW(-0.20)[+ptr:yahoo.com]; MV_CASE(0.50)[]; FREEMAIL_FROM(0.00)[yahoo.com]; RCVD_COUNT_THREE(0.00)[3]; TO_DN_ALL(0.00)[]; MX_GOOD(-0.01)[cached: mta6.am0.yahoodns.net]; DKIM_TRACE(0.00)[yahoo.com:+]; DMARC_POLICY_ALLOW(-0.50)[yahoo.com,reject]; FREEMAIL_TO(0.00)[gmail.com]; FROM_EQ_ENVFROM(0.00)[]; RCVD_TLS_LAST(0.00)[]; MIME_TRACE(0.00)[0:+]; FREEMAIL_ENVFROM(0.00)[yahoo.com]; ASN(0.00)[asn:36646, ipnet:66.163.184.0/21, country:US]; MID_RHS_MATCH_FROM(0.00)[]; DWL_DNSWL_NONE(0.00)[yahoo.com.dwl.dnswl.org : 127.0.5.0]; ARC_NA(0.00)[]; 
R_DKIM_ALLOW(-0.20)[yahoo.com:s=s2048]; FROM_HAS_DN(0.00)[]; RCPT_COUNT_THREE(0.00)[3]; NEURAL_SPAM_SHORT(0.96)[0.959,0]; MIME_GOOD(-0.10)[text/plain]; IP_SCORE(1.36)[ip: (4.57), ipnet: 66.163.184.0/21(1.27), asn: 36646(1.01), country: US(-0.07)]; NEURAL_SPAM_MEDIUM(0.29)[0.289,0]; TO_MATCH_ENVRCPT_SOME(0.00)[]; NEURAL_SPAM_LONG(0.02)[0.019,0]; RCVD_IN_DNSWL_NONE(0.00)[148.185.163.66.list.dnswl.org : 127.0.5.0] X-BeenThere: freebsd-ppc@freebsd.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Porting FreeBSD to the PowerPC List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 26 Feb 2019 21:11:57 -0000

[I explicitly note that my hack is racy. It appears that
I've finally had an example.]

On 2019-Feb-24, at 13:50, Mark Millard wrote:

> On 2019-Feb-24, at 13:07, Justin Hibbits wrote:
>
>> On Sat, Feb 23, 2019 at 1:36 PM Mark Millard wrote:
>>>
>>> For sys/powerpc/aim/mp_cpudep.c 's cpudep_ap_bootstrap I added as shown below:
>>>
>>> +extern void hack_into_slb_if_needed(void* vap); // HACK!!!
>>> +
>>> uintptr_t
>>> cpudep_ap_bootstrap(void)
>>> {
>>> . . .
>>> +    hack_into_slb_if_needed(pcpup->pc_curpcb); // HACK!!!
>>> +
>>>      sp = pcpup->pc_curpcb->pcb_sp;

In the above, after the implicit slb_insert_kernel,
but before the pcpup->pc_curpcb-> attempt,
the slb entry could be replaced again. There are, after all,
other threads in operation before SI_SUB_SMP starts:

    SI_SUB_KTHREAD_INIT     = 0xe000000,    /* init process*/
    SI_SUB_KTHREAD_PAGE     = 0xe400000,    /* pageout daemon*/
    SI_SUB_KTHREAD_VM       = 0xe800000,    /* vm daemon*/
    SI_SUB_KTHREAD_BUF      = 0xea00000,    /* buffer daemon*/
    SI_SUB_KTHREAD_UPDATE   = 0xec00000,    /* update daemon*/
    SI_SUB_KTHREAD_IDLE     = 0xee00000,    /* idle procs*/
#ifndef EARLY_AP_STARTUP
    SI_SUB_SMP              = 0xf000000,    /* start the APs*/
#endif

I've finally had one boot hang-up, apparently from this happening.

>>> and in src/sys/powerpc/aim/slb.c I added an implementation:
>>>
>>> +void hack_into_slb_if_needed(void* vap); // HACK!!!
>>> +void hack_into_slb_if_needed(void* vap) // HACK!!!
>>> +{ // HACK!!!
>>> +    struct slb *cache= PCPU_GET(aim.slb);
>>> +    vm_offset_t va= (vm_offset_t)vap;
>>> +    uint64_t slbv= kernel_va_to_slbv(va);
>>> +    uint64_t esid= va>>ADDR_SR_SHFT;
>>> +    uint64_t slbe= (esid<
>>> +    int i;
>>> +
>>> +    for (i = 0; i < n_slbs; i++) {
>>> +        if (i == USER_SLB_SLOT)
>>> +            continue;
>>> +        if (cache[i].slbe == (slbe | i))
>>> +            break;
>>> +    }
>>> +
>>> +    if (i==n_slbs)
>>> +        slb_insert_kernel(slbe,slbv);
>>> +} // HACK!!!
>>> +
>>>
>>> So far I've not had any boot hang-ups after this.
>>>
>>> Given the random nature of the hang-ups it will be a
>>> while before I conclude for sure how reliable this
>>> change makes booting, but so far so good.
>>>
>>> (I recognize that the "break" could be "return"
>>> and then the "if (i==n_slbs)" would not be
>>> needed.)
>>>
>>>
>>> Other issues not fixed by this:
>>>
>>> This does not change the buf*daemon* randomly getting
>>> hung up (and so timing out on shutdown). This appears
>>> to be the same issue that leads to the fans sometimes
>>> starting to run full-rate because of pmac_thermal
>>> being hung up.
>>>
>>> For buf*daemon* "top -SHIopid" before shutdown shows
>>> just the ones that will not hang up. The same goes for
>>> seeing beforehand for pmac_thermal vs. the fans.
>>>
>>> ===
>>> Mark Millard
>>
>> Hi Mark,
>>
>> Fantastic work tracking this down! So the problem is we now can fault
>> when accessing KVA space. I think we should allow this, otherwise we
>> can hamper performance with reduced KVA size. I'll have to think
>> about how best to do this. Would you be willing to test patches I
>> come up with?
>
> I'll try to test whatever updates you want but there may be some
> issues with timeliness.
>
>
>
> The reason for the "sometimes" boot-failure is that the entry in the
> slb for the PCB/stack for the CPU being added has sometimes been
> replaced already before the CPU the pcb is for has been sufficiently
> configured to allow automatic handling --and other times has not
> yet been replaced: the random slb replacement mechanism.
>
> There already is code to handle slb entry replacements but it does
> not work for a CPU still being set up (at the stage of the
> sometimes failure). At least that is what I expect for:
>
> # grep -r "handle_kernel_slb_spill" /usr/src/sys/powerpc/
> /usr/src/sys/powerpc/aim/trap_subr64.S: bl handle_kernel_slb_spill
> /usr/src/sys/powerpc/powerpc/trap.c: void handle_kernel_slb_spill(int, register_t, register_t);
> /usr/src/sys/powerpc/powerpc/trap.c:handle_kernel_slb_spill(int type, register_t dar, register_t srr0)
>
> So my hack was to separately do the potential replacement in that
> early time frame to allow the configuration for the CPU to get
> far enough along for the existing mechanism to work. (At least
> that is what I expect that I did.)
>
> So far I've had no boot failures of any kind with the hack.
> I've removed the hacks for reporting information and things
> still work.
>
> But I've not tried anything extensive after booting because
> things like buf*daemon* threads and pmac_thermal are randomly
> hanging up in/at:
>
> mi_switch+0x134 sleepq_switch+0x2ec sleepq_timedwait+0x48 _sleep+0x41c
> (mi_switch seems to have called sched_switch based on the
> "+0x134" and the code in that area --but sched_switch is not
> listed)
>
> I've no clue what is safe when one or more buf*daemon* threads
> make no progress.
>
> For shutdown that frequently leads to timeouts for stopping some
> buf*daemon* threads (when all 8 time out it takes about 8 minutes).
> The buf*daemon* threads that fail are the ones that "top -SHIopid" no
> longer shows.

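A minimal sketch of the early-return form of the lookup in the quoted hack_into_slb_if_needed (the "break" could be "return" remark). It is not from the thread: struct slb_entry, N_SLBS, the USER_SLB_SLOT value and slb_cache_has are hypothetical stand-ins so that it compiles outside the kernel. It models only the search; the check-then-insert sequence stays racy, as noted at the top of this message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel definitions; values are illustrative. */
#define N_SLBS          64
#define USER_SLB_SLOT   0

struct slb_entry {
    uint64_t slbv;
    uint64_t slbe;
};

/*
 * Return true when the cache already holds an entry for slbe.
 * Slot i is expected to store (slbe | i), mirroring the comparison
 * used in the quoted hack. Returning from inside the loop removes
 * the need for a separate "i == n" test afterwards.
 */
static bool
slb_cache_has(const struct slb_entry *cache, int n, uint64_t slbe)
{
    assert(n >= 0);
    for (int i = 0; i < n; i++) {
        if (i == USER_SLB_SLOT)
            continue;           /* the user slot is never a kernel match */
        if (cache[i].slbe == (slbe | (uint64_t)i))
            return (true);
    }
    return (false);
}
```

A caller would insert only when slb_cache_has() returns false, which is the same decision the original loop reaches with its index test.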
=3D=3D=3D Mark Millard marklmi at yahoo.com ( dsl-only.net went away in early 2018-Mar) From owner-freebsd-ppc@freebsd.org Tue Feb 26 21:47:01 2019 Return-Path: Delivered-To: freebsd-ppc@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id C0E2615050E7 for ; Tue, 26 Feb 2019 21:47:01 +0000 (UTC) (envelope-from dclarke@blastwave.org) Received: from atl4mhfb01.myregisteredsite.com (atl4mhfb01.myregisteredsite.com [209.17.115.55]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id CCB8874F4A for ; Tue, 26 Feb 2019 21:46:58 +0000 (UTC) (envelope-from dclarke@blastwave.org) Received: from atl4mhob07.registeredsite.com (atl4mhob07.registeredsite.com [209.17.115.45]) by atl4mhfb01.myregisteredsite.com (8.14.4/8.14.4) with ESMTP id x1QLkpnn012794 for ; Tue, 26 Feb 2019 16:46:51 -0500 Received: from mailpod.hostingplatform.com (atl4qobmail02pod2.registeredsite.com [10.30.77.36]) by atl4mhob07.registeredsite.com (8.14.4/8.14.4) with ESMTP id x1QLkiMl026169 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=FAIL) for ; Tue, 26 Feb 2019 16:46:44 -0500 Received: (qmail 39651 invoked by uid 0); 26 Feb 2019 21:46:44 -0000 X-TCPREMOTEIP: 174.118.245.214 X-Authenticated-UID: dclarke@blastwave.org Received: from unknown (HELO ?172.16.35.3?) 
(dclarke@blastwave.org@174.118.245.214) by 0 with ESMTPA; 26 Feb 2019 21:46:44 -0000 Subject: Re: An experimental hack that appears to allow old PowerMacG5 4-core (system total) system to boot reliably (head -r343884 based context) To: Mark Millard , Justin Hibbits Cc: FreeBSD PowerPC ML References: <466B6E08-5631-41FB-A1FD-263C27519F65@yahoo.com> From: Dennis Clarke Openpgp: preference=signencrypt Autocrypt: addr=dclarke@blastwave.org; keydata= mQINBFxoSrYBEAC1M5KicBVclSHf6d81rxTQYgFhIMhNxekNQgNsB39lCWcq3zSZi75Rflb0 Q74b+lIjBi7a5XygweXgFINPNVLpknrG8y7jA/8jrKqVy5qQ/7Mw/uVou4culndNOkXwNyW9 WTNoAzAtKlDEmzIX/pfaqrulAP8se3ci9vqXInIHpRHZithrrvAsWQWuhC200PYvBlA/Vmv6 3UxV26LVa1uNYgJSgiBbCI9VTv14YSnFRG6WWXTRmVksJMiNY7fZnKGNhFkrcnGxVqVKnCgj enG67ms6uwzhkfa/F1C3BPljb5WcApJwph/Iaq+7EpVD6DmE1xYP6pgqFX4yW5MVRMn6XaIR rbkP90CodrCOTedyrB1E7N8xNZKX+sUwWBnfqv7n8rBGnlNzo2GOBHVxqw7EGYoQItlHDmhx deOOgq6VmmL1kZn4D+5BLUw/w2SljDqXpdF/Gnm3WXGe+ooBGcoMXeiqv+4PM5k11CIBLjRK p2cD51upwccFILPDF8Wipy8t6Oc+ToLz80zb5kiBR9dggORbPr4WHCt7VS4s24mAX7wBQ/EB ePRUykvES3WJLuRBdFAPtXBc9m/q0gzU9iPx3eIm8u2SbO7kUMBESexeBpJ8cIfJ7/LX2LV8 UoWxfJieklheUPZtOA06pyMcb37/A/HZNMOUYh83TKVCnv7FxwARAQABtCVEZW5uaXMgQ2xh cmtlIDxkY2xhcmtlQGJsYXN0d2F2ZS5vcmc+iQJOBBMBCAA4FiEE1j0Rv6qd1s9jGqtWj5Fg Cl9xztwFAlxoSrYCGwMFCwkIBwIGFQoJCAsCBBYCAwECHgECF4AACgkQj5FgCl9xztz1Bg/+ KIyWqzrWfTexJ0+9S0EhCNwkb8aCaGKde+dqiqTFFobS5UWphhAtMtLnU4tZG2K+GPIBnMpC 6tC5gxB4TppgcGzqRNle4CjY4Lt7SQs23V+hbTZJLDwlBWbbuqDIvkNiO1pFuaHGNJVYaQ5y qlm156/Y+GmarfVGbVhjelRq3DjDwTcdo1J36UUo3GS8/g1uXX84Va71nAeyivtzwNbU18F4 Bcbmo7fMS0nBUmEqJJWftjmz2ihP1opz2HOEzv9q7uU8q3yfg1pweT8Zscx+Y5dtUd3d4dRL iXJxm2Z2dVcWabMmlhOnLqhPaf39WjKkxr2mHiYN2sUJ5S6yKUM6HKVM7ZE/1HRYo1OZgsEC PQka65hK36ezldtQplKcGlG7DjIW3Vi1BK6o70/7Hvdyfqdeft3qY1bs8BcHfNyan/DBGgTe 34eGnqqU+YY0mRTCpukbC2/MYYEYdeS9/RYiwCf1Tn8x232iVpX6wYx8+L8Nb3QEkTNM3VP0 ArAoF1EE9RZ2jLBV9g+vKRRiatPN8pGMv9on0pO6HhAp19Db4owW/pcgsAXsLS/mjjkxo1Br Gu0shJZ6o6SqDfMpfdNyUVdzvAgAUwWtdSXlgXpn6oCn7B7YhEkj+jQ9p8Y398o9YAybe70v 
7GLkZqcPkCv9GQ3Cw5a+i/FNm4JCDeD99ZC5Ag0EXGhKtgEQAMZCBzuT2z/PWurlNcc/ChFy 4sRHrDXL/pwGOy9Ue0s/busdKxPWomOMbFA4PIILaxrT0L1w6xb1Svj2CgYbhSDsW12SdqsA C5MrqQi/j5S/H4rEsZt8nsSbSx6JF+tP5x0i14zG2GXv7+DjxrDMfFThejeEeIcHU//Ip1MQ CF7uGv4ug3WUSKHR7wVTceq5T3oR9kLguszBhavyJZrYte6r0TDG0GdFAGQMAau4FcHsOHyf 46Gx66rGoWmgH+938kodF71d7a0FXpUUI9RAhL1MepR78QkyjGTocBKRbrcXZPO8ya9/Tcmp fRxlJNeMM9TQKND3GYSzZrsYWdmXPdx18R0rzfBOCdDPUjVJhcV9AbeH4EApDPxjDSADQ0X9 SmSoMd27MjU8rFG+Mfu0gbK/OG4kPga/2MO5lU3sublv0PMYcsQqYOcqSBDxBdkAZMDFt376 lCSxau0Ijj2bb49ippjjH6gQU5iA6ASLSFN8AWs80dVeIUt964RAc/XY8QAW621Qe6OaSqh3 M+Umdf38Cc6qySjphSEF6i+YQ1FlbmK09yyEEpDuaFejgRXXaMxj6sF+b/g4JTqxlHDEc9Nd 8+L/zrtPkUXWAss9a8jtm5hGquc37EjyZyLr+35dtyEJBJ2o0G9Len2F9+mfDdRRKJAiqqLL 3JxHKFTZ4cShABEBAAGJAjYEGAEIACAWIQTWPRG/qp3Wz2Maq1aPkWAKX3HO3AUCXGhKtgIb DAAKCRCPkWAKX3HO3MYdEACW614cKJJT9/M2wPyYecKj+KR5tv+oTdGdcZl87mG47XWn4fKI kpyTR9EGVHGbSbrCyG8qMvz+vhe+Aj9SbJ4ccr+1KIaNkBcACOSJdU2UC2sqOBxckki0ArbB ds3efHBaAEKCZv4Qfj5sHILLkImaCtR+FjvP0fr5ankJkbOeucqgxPmkKJxFBgiotWQxPp59 Sl5uzNGeLPBmkleYQMQFAOK6Yhrgsh35AmYNgNoPR6KWsfaIh9BPgEOOxc3Zl99fsZogbt1U 2YUj7L0nCa5s1AMTftZDTBsqZyotDO8/TpwSEC0EOHvcg/GAj+ocMgVPTHaTrgCV2Yy2lCVG u1Mu2T7zsCRMDJNvhC7LA3Qo8Fdc7SFJekr7TllTWB4mbQyYj9/vjQINxoKZV6v7Yfw/rYcm xY2fVsSdxZFvDIM/VRryQpoqzPv9YQrDVWDEb139NtvrNEeUnIXv+cRBKFMBxQ0PIHDkwNAb cmXY5/R58QiqnGE23je0WQNg+iBrbJN9P7inp178m6j9SFor+5pW567vYakASRQn5GPqHqt9 fRQvz5E3aa8xDscR6Gs9HQAhsA5kDqvH/XxQRD7Y1jG9T73WMlS6j928qHfMwQ6EvNuIQwqN PToVd6cMhrTJKE5gUVLVs9Oa81zr/5pNCKJ9upm6cU349JNDO/SDKSTtLA== Message-ID: <54edf264-4f3a-55ac-9c69-914244eaf605@blastwave.org> Date: Tue, 26 Feb 2019 16:46:43 -0500 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.5.1 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 7bit X-Rspamd-Queue-Id: CCB8874F4A X-Spamd-Bar: ++++ Authentication-Results: mx1.freebsd.org X-Spamd-Result: default: False [4.19 / 15.00]; ARC_NA(0.00)[]; 
RCVD_VIA_SMTP_AUTH(0.00)[]; RCVD_COUNT_FIVE(0.00)[5]; FROM_HAS_DN(0.00)[]; RCPT_COUNT_THREE(0.00)[3]; NEURAL_SPAM_SHORT(0.92)[0.917,0]; IP_SCORE(0.50)[ipnet: 209.17.112.0/21(1.42), asn: 19871(1.13), country: US(-0.07)]; MIME_GOOD(-0.10)[text/plain]; PREVIOUSLY_DELIVERED(0.00)[freebsd-ppc@freebsd.org]; DMARC_NA(0.00)[blastwave.org]; AUTH_NA(1.00)[]; NEURAL_SPAM_MEDIUM(0.95)[0.950,0]; TO_MATCH_ENVRCPT_SOME(0.00)[]; TO_DN_ALL(0.00)[]; MX_GOOD(-0.01)[mx1.netsolmail.net]; NEURAL_SPAM_LONG(0.94)[0.942,0]; RCVD_IN_DNSWL_NONE(0.00)[55.115.17.209.list.dnswl.org : 127.0.5.0]; R_SPF_NA(0.00)[]; FREEMAIL_TO(0.00)[yahoo.com]; FROM_EQ_ENVFROM(0.00)[]; R_DKIM_NA(0.00)[]; MIME_TRACE(0.00)[0:+]; ASN(0.00)[asn:19871, ipnet:209.17.112.0/21, country:US]; MID_RHS_MATCH_FROM(0.00)[]; RCVD_TLS_LAST(0.00)[] X-BeenThere: freebsd-ppc@freebsd.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Porting FreeBSD to the PowerPC List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 26 Feb 2019 21:47:02 -0000

On 2/26/19 4:11 PM, Mark Millard wrote:
> [I explicitly note that my hack is racy. It appears that
> I've finally had an example.]
>

I'll try to test whatever comes down from HEAD later today and see
what happens with and without kern.smp.disabled as well as usefdt=1.

-- Dennis Clarke RISC-V/SPARC/PPC/ARM/CISC UNIX and Linux spoken GreyBeard and suspenders optional From owner-freebsd-ppc@freebsd.org Wed Feb 27 03:01:18 2019 Return-Path: Delivered-To: freebsd-ppc@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 225EB150D415 for ; Wed, 27 Feb 2019 03:01:18 +0000 (UTC) (envelope-from marklmi@yahoo.com) Received: from sonic311-15.consmr.mail.bf2.yahoo.com (sonic311-15.consmr.mail.bf2.yahoo.com [74.6.131.125]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id C4D47889F8 for ; Wed, 27 Feb 2019 03:01:16 +0000 (UTC) (envelope-from marklmi@yahoo.com) X-YMail-OSG: 0DucC9IVM1lygUDru_n4jdC6KxmK5akptpu38wqgOSCt8FrSh9DvCyHOn6DTJ3m izwEJrLBvVMOGYpXuqBqd44Y5PjDaC4xWM4.cWlPsJ9U3EQoe8QQPcu2DzC7nIIxx2Wfs8HnUuwc zGmrI6T.HMeGcDtifHv0LUY.krFiuvV9gfdQVRCjnIJQfG6XB6v_ZpeR9R_lKQVx3v9xgM8MVoki _cDHftLouPXSZS2lNVhpsiXtWKQtqtrjDVB9pd.oYiXVWvKSZ4uy7qZbNuJLiVwTzoV_ZIrx1cqX HUHbkgzfDLzLs4sWsurlHEMwdATk1yiOF3zz0uRAPtdr7k.bWRUxkn__Cjv9Hu6Z9a0sNOLve1us lLDWWsf1w_h8ajVz61J37BgHzXEqFTLna2BJcJphP3VFoHwtFTwo_dRY42u2qqtLns_3QsUFsvnt WCL8WlfUrDZ3uyhx7f1KEmkD6xK4s3UdKcMEYSePM.OxG1fkEom2ZeRMCLps4G8eFCUQ43zbTAs9 crVlUlMtj05.2E0Uac3o7x1BifJRDNj2eQyLvCxf7hzoddXVluGugSy9CWZnXSATB2ta.DsWNsyK tmIfvn8FFY3IOl5yHQgZIBW7FRM5_c2pknivAbBlk7AqroVCPqA48F8i6Z7VDBhJfJH.cyVByiI_ 07JmTx5C6fqRaF0FmgJk_h.1cple5yZ.3XQc8LDE7bVveuM7nJX5hd_TcgV9FPovCoxUDlNVQyl9 Pc0QjYQK_.cIULKaQ8BbxeFJQzGOV9HbPRldC3Da6ieu224R.UAJ9jD4Ch8__2_gAi0XZw8Ua8Pm zZI21jFn_rZANSU3K_LEObzZ9hTea1K90vT73MSUoHcHc2_Vq.hRa2DtHB_sJkOfk4fdHOLmXIT6 JkIAtB1Nipyq.Vg9Qo1AgM_LS2VoM9qmvlY29nhcgCuWCpMGht47_50zDa0eHJipL6aEYmTfWF8j x8Fw.Wbh78oWIOtirOeVvTg7TS7yrkR9FiRiYoTPt8ppaXQ5yZ4w87uC0NUCDVTEZvbLPdTuMrK9 hCUXRFIC9Tx7eYBqFJWT5PA5XO_qpq2vlTiBpwlgrL_Nq6w-- Received: from sonic.gate.mail.ne1.yahoo.com by 
sonic311.consmr.mail.bf2.yahoo.com with HTTP; Wed, 27 Feb 2019 03:01:09 +0000 Received: from c-67-170-167-181.hsd1.or.comcast.net (EHLO [192.168.1.115]) ([67.170.167.181]) by smtp401.mail.bf1.yahoo.com (Oath Hermes SMTP Server) with ESMTPA ID d8191612de33c9557a5da1427097446f for ; Wed, 27 Feb 2019 03:01:04 +0000 (UTC) From: Mark Millard Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: quoted-printable Mime-Version: 1.0 (Mac OS X Mail 12.2 \(3445.102.3\)) Subject: head -r334018 powerpc64 PowerMac G5 pmac_thermal: which sleep sometimes gets hung up vs. which ones have not (so far) Message-Id: <4BDD3946-C7E8-4333-847E-3374864A105E@yahoo.com> Date: Tue, 26 Feb 2019 19:01:02 -0800 To: FreeBSD PowerPC ML X-Mailer: Apple Mail (2.3445.102.3) X-Rspamd-Queue-Id: C4D47889F8 X-Spamd-Bar: + X-Spamd-Result: default: False [1.73 / 15.00]; RCVD_VIA_SMTP_AUTH(0.00)[]; R_SPF_ALLOW(-0.20)[+ptr:yahoo.com]; MV_CASE(0.50)[]; FREEMAIL_FROM(0.00)[yahoo.com]; RCVD_COUNT_THREE(0.00)[3]; TO_DN_ALL(0.00)[]; MX_GOOD(-0.01)[cached: mta6.am0.yahoodns.net]; DKIM_TRACE(0.00)[yahoo.com:+]; DMARC_POLICY_ALLOW(-0.50)[yahoo.com,reject]; FROM_EQ_ENVFROM(0.00)[]; RCVD_TLS_LAST(0.00)[]; MIME_TRACE(0.00)[0:+]; FREEMAIL_ENVFROM(0.00)[yahoo.com]; ASN(0.00)[asn:26101, ipnet:74.6.128.0/21, country:US]; MID_RHS_MATCH_FROM(0.00)[]; DWL_DNSWL_NONE(0.00)[yahoo.com.dwl.dnswl.org : 127.0.5.0]; ARC_NA(0.00)[]; R_DKIM_ALLOW(-0.20)[yahoo.com:s=s2048]; FROM_HAS_DN(0.00)[]; TO_MATCH_ENVRCPT_ALL(0.00)[]; NEURAL_SPAM_SHORT(0.68)[0.680,0]; MIME_GOOD(-0.10)[text/plain]; PREVIOUSLY_DELIVERED(0.00)[freebsd-ppc@freebsd.org]; NEURAL_SPAM_MEDIUM(0.42)[0.417,0]; RCPT_COUNT_ONE(0.00)[1]; IP_SCORE(1.06)[ip: (2.67), ipnet: 74.6.128.0/21(1.50), asn: 26101(1.20), country: US(-0.07)]; NEURAL_SPAM_LONG(0.08)[0.079,0]; RCVD_IN_DNSWL_NONE(0.00)[125.131.6.74.list.dnswl.org : 127.0.5.0] X-BeenThere: freebsd-ppc@freebsd.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Porting FreeBSD to the PowerPC 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 27 Feb 2019 03:01:18 -0000

fan_management_proc uses:

    pause("pmac_therm", hz);

which leads to (via my replacing what KTR_PROC tracing reports):

sleepq_set_timeout_sbt: sbt=fffffed8 pr=0 flags=100 td_sleeptimo=... prec=fffffed flags=501

(the figures are all hexadecimal)

So far all pmac_thermal stuck-sleeps have been based on this
timeout happening but failing the comparison to sbinuptime()
in the sleepq_timeout callout (instead of the original --but
inaccessible "now"). But the comparison does not always
fail --and so the pause does not always get stuck.

Note: (timeo/hz) seconds is used (approximately), here
(hz/hz) seconds: approximately 1 second.

smu_run_cmd uses:

    error = tsleep(cmd, 0, "smu", 800 * hz / 1000);
    . . .

which leads to:

sleepq_set_timeout_sbt: sbt=cccccbe0 pr=0 flags=100 td_sleeptimo=... prec=cccccbe flags=501

Note: (timeo/hz) seconds is used, here approximately 0.8 seconds.

kiic_transfer uses:

    timo = 100;
    . . .
    mtx_sleep(dev, &sc->sc_mutex, 0, "kiic", timo);
    . . .
    err = mtx_sleep(dev, &sc->sc_mutex, 0, "kiic", timo);

which leads to:

sleepq_set_timeout_sbt: sbt=653332be30 pr=0 flags=100 td_sleeptimo=... prec=653332be3 flags=501

Note: (100/hz) seconds is used here.

smuiic_transfer uses:

    mtx_sleep(sc, &sc->sc_mtx, 0, "smuiic", 100);
    . . .
    mtx_sleep(sc, &sc->sc_mtx, 0, "smuiic", 10);

which leads to:

sleepq_set_timeout_sbt: sbt=28f5c26 pr=0 flags=100 td_sleeptimo=... prec=28f5c2 flags=501

Note: (100/hz) seconds is used and (10/hz) seconds is used.

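Several of the sbt figures above are consistent with a per-tick step of one second divided by hz, with hz = 1000. The sketch below redoes that arithmetic in user space. It assumes, rather than quotes from the kernel source, that the tick count is multiplied by a truncated per-tick step; HZ and ticks_to_sbt are names made up for the example, and HZ = 1000 is inferred from the trace values.

```c
#include <assert.h>
#include <stdint.h>

typedef int64_t sbintime_t;             /* 32.32 fixed point: seconds.fraction */

#define SBT_1S  ((sbintime_t)1 << 32)   /* one second */
#define HZ      1000                    /* assumed; inferred from the traces */

/* Convert a timeout in ticks to an sbintime_t interval. */
static sbintime_t
ticks_to_sbt(int timo)
{
    const sbintime_t tick_sbt = SBT_1S / HZ;    /* 4294967, fraction dropped */

    assert(timo >= 0);
    return (tick_sbt * timo);
}
```

Under these assumptions ticks_to_sbt(1000) is 0xfffffed8, ticks_to_sbt(800) is 0xcccccbe0 and ticks_to_sbt(10) is 0x28f5c26, matching the sbt= figures reported for the pause, the tsleep and the 10-tick mtx_sleep. Each is slightly short of the exact value because the per-tick step is truncated before the multiply.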
=3D=3D=3D Mark Millard marklmi at yahoo.com ( dsl-only.net went away in early 2018-Mar) From owner-freebsd-ppc@freebsd.org Thu Feb 28 13:06:30 2019 Return-Path: Delivered-To: freebsd-ppc@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 533341506B0B for ; Thu, 28 Feb 2019 13:06:30 +0000 (UTC) (envelope-from marklmi@yahoo.com) Received: from sonic313-19.consmr.mail.gq1.yahoo.com (sonic313-19.consmr.mail.gq1.yahoo.com [98.137.65.82]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id DFD5C6A03D for ; Thu, 28 Feb 2019 13:06:28 +0000 (UTC) (envelope-from marklmi@yahoo.com) X-YMail-OSG: uS5nA9MVM1kgQOAg.OjvK9ohSvowwPCWEH6z6nkRCCbAtSGxWRzU21GZFDeY.to vy76VoNkeSGzQVh3mE0ao1t5uPYaxRZw3V18KZjQTCej4MovNEmKPdqfPcmz014J8CjMh0050Fdw YBcrNXcIrZfdfKldTC41dPrPQD4LtPf3POGVdQnbt5NPif9xpQnLIHy4wgdKbPFSDq6VeobOykEv pZAxijmxTDvU.Ove_piQ3gCM3r0GHI6F_AuZ.wWP7RIKiH9RRdwoitB.K1jfCWOaBhSHH8X7bxAp pmBQRiR5KudKoWvTCq2aXNX3PtXGwYfpBCou_rVjU11_eYjywNbbjy6yXqjc0fZmB1oYm60z1m6z owghgL0y7EerB0zwLebae6TpelC_Iv8Wa533DHtN8I5kVMem3JKlt10C.YxYBk00U41_SjDkN69e OjRcBJj8iBKc29tFoCJyhVh4NtalXSy1eyAEgxGcIuXC78kFNGpTeiKIyFQmqh_FD5KZx_o7LHFB C81K2a6UglMDl8p_NpnzuifeqCTMwVGmPIBx9xEfZhLVO8KsVFRXnWoqcmznh7PZHLEXX3GCYrEG YgbXv99F4mDYemczik_3M.WtCwdS1jVQzAKGVpdQy_yxqbYatk8VavJ7HRnHl7U05oSEmNdrlHHL SR33J98IAywWjF4PDndC7yezqyq2XOVwrSf0sXc1QLbBE_VPvnhkSH6e8WZhXSGEg81pg8BHs5vZ XygZmGJZUCrRThdu8Sa.xAQmb0mnvRjb9BtKCdSLdN7dm8XDHI.3EAFpZJZe6S1nzeh_LEd7dBCM d0rx.fr.Owq.K9g0QH6dsb.hX1LN5HpyNMlgQcajUSOhBgjJiXZoNyX1e1Rn8vJFK9mMgAuM6Kel _yXJky5nwV4cTvy6b_L_mAMWT1H1iOjWxNqoALuKCiXCTNZ.lg4hxicogP7dXyK8v5Wbw1guER.u acb6WRfSIPMAGRUdPMJPmLQtJDFVJwnhjUqWgIk7z0vds3K_W8g5xgljdmFxKdBsg6qrjGMmzBXs hxynt3cPlxvncCJJ6aabDL_ztJ5BVp_aZK7t6MlxM4xNc0iBkus_sTetaeSUG7NlveHIJMdiO Received: from sonic.gate.mail.ne1.yahoo.com by 
sonic313.consmr.mail.gq1.yahoo.com with HTTP; Thu, 28 Feb 2019 13:06:26 +0000 Received: from c-67-170-167-181.hsd1.or.comcast.net (EHLO [192.168.1.115]) ([67.170.167.181]) by smtp405.mail.gq1.yahoo.com (Oath Hermes SMTP Server) with ESMTPA ID 9137d10ee3f5edb79ba565dfae703c50; Thu, 28 Feb 2019 13:06:24 +0000 (UTC) From: Mark Millard Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: quoted-printable Mime-Version: 1.0 (Mac OS X Mail 12.2 \(3445.102.3\)) Subject: powerpc64 head -r344018 stuck sleeping problems: th->th_scale * tc_delta(th) overflows unsigned 64 bits sometimes Message-Id: Date: Thu, 28 Feb 2019 05:06:23 -0800 Cc: freebsd-hackers Hackers To: FreeBSD PowerPC ML X-Mailer: Apple Mail (2.3445.102.3) X-Rspamd-Queue-Id: DFD5C6A03D X-Spamd-Bar: +++ X-Spamd-Result: default: False [3.80 / 15.00]; RCVD_VIA_SMTP_AUTH(0.00)[]; R_SPF_ALLOW(-0.20)[+ptr:yahoo.com]; MV_CASE(0.50)[]; FREEMAIL_FROM(0.00)[yahoo.com]; RCVD_COUNT_THREE(0.00)[3]; TO_DN_ALL(0.00)[]; DKIM_TRACE(0.00)[yahoo.com:+]; MX_GOOD(-0.01)[cached: mta6.am0.yahoodns.net]; RCPT_COUNT_TWO(0.00)[2]; DMARC_POLICY_ALLOW(-0.50)[yahoo.com,reject]; FROM_EQ_ENVFROM(0.00)[]; RCVD_TLS_LAST(0.00)[]; MIME_TRACE(0.00)[0:+]; FREEMAIL_ENVFROM(0.00)[yahoo.com]; ASN(0.00)[asn:36647, ipnet:98.137.64.0/21, country:US]; MID_RHS_MATCH_FROM(0.00)[]; DWL_DNSWL_NONE(0.00)[yahoo.com.dwl.dnswl.org : 127.0.5.0]; ARC_NA(0.00)[]; R_DKIM_ALLOW(-0.20)[yahoo.com:s=s2048]; FROM_HAS_DN(0.00)[]; NEURAL_SPAM_SHORT(0.97)[0.970,0]; MIME_GOOD(-0.10)[text/plain]; IP_SCORE(1.77)[ip: (7.09), ipnet: 98.137.64.0/21(1.01), asn: 36647(0.81), country: US(-0.07)]; NEURAL_SPAM_MEDIUM(0.66)[0.664,0]; TO_MATCH_ENVRCPT_SOME(0.00)[]; NEURAL_SPAM_LONG(0.91)[0.911,0]; RCVD_IN_DNSWL_NONE(0.00)[82.65.137.98.list.dnswl.org : 127.0.5.0] X-BeenThere: freebsd-ppc@freebsd.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Porting FreeBSD to the PowerPC List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , 
X-List-Received-Date: Thu, 28 Feb 2019 13:06:30 -0000

Basic context:

The code for sleeps of various forms depends
on calls to:

static __inline sbintime_t
sbinuptime(void)
{
    struct bintime _bt;

    binuptime(&_bt);
    return (bttosbt(_bt));
}

and comparisons with the return values, such
as checking for timeouts. The upper 32 bits
of the signed 64 bit result hold the seconds
and the lower 32 bits hold the fraction as
a multiplier of 1sec/(2**32).

An observed problem is that later sbinuptime calls
sometimes end up with smaller values than earlier
ones. (Past freebsd-ppc list messages give details.)
This makes for problems with checking for timeouts
when using later sbinuptime() calls after a timeout
was initially detected against an earlier value:

A.0) timercb getting the earlier sbinuptime() value
A.1) callout_process using that to detect a timeout,
B) sleepq_timeout checking the timeout again,
   using a separate sbinuptime() call.

Some details about example values, overflows, and such follow.

I used the following sort of hacked code to report values when
overflows happen:

#if defined(__powerpc64__) && defined(AIM)
void
binuptime(struct bintime *bt)
{
    struct timehands *th;
    u_int gen;

    struct timecounter *tc; // HACK!!!
    u_int tim_cnt, tim_offset, tim_diff; // HACK!!!
    uint64_t freq, scale_factor, diff_scaled; // HACK!!!

    do {
        th = timehands;
        tc = th->th_counter; // HACK!!!
        gen = atomic_load_acq_int(&th->th_generation);
        *bt = th->th_offset;
        tim_cnt= tc->tc_get_timecount(tc); // HACK!!! (steps of tc_delta with values recorded)
        tim_offset= th->th_offset_count; // HACK!!!
        tim_diff= (tim_cnt - tim_offset) & tc->tc_counter_mask; // HACK!!!
        scale_factor= th->th_scale; // HACK!!!
        diff_scaled= scale_factor * tim_diff; // HACK!!!
        bintime_addx(bt, diff_scaled); // HACK!!!
        freq= tc->tc_frequency; // HACK!!!
        atomic_thread_fence_acq();
    } while (gen == 0 || gen != th->th_generation);

    if (*(volatile uint64_t*)0xc0000000000000f0==0u && (0xffffffffffffffffull/scale_factor)<tim_diff) {
        *(volatile uint64_t*)0xc0000000000000d0= freq;
        *(volatile uint64_t*)0xc0000000000000d8= scale_factor;
        *(volatile u_int*)0xc0000000000000e0= tim_offset;
        *(volatile u_int*)0xc0000000000000e4= tim_cnt;
        *(volatile u_int*)0xc0000000000000e8= tim_diff;
        *(volatile uint64_t*)0xc0000000000000f0= diff_scaled;
        *(volatile uint64_t*)0xc0000000000000f8= scale_factor*freq;
        __asm__ ("sync");
    }
}
#else
. . .
#endif

(mftb() is used to provide the tc->tc_get_timecount(tc)
value --but only the lower 32 bits are extracted and
returned.)

Basically whenever tim_diff is such that:

(0xffffffffffffffff/scale_factor)<tim_diff

then diff_scaled overflows an unsigned, 64 bit representation,
ending up with just the least 64 bits. This truncated value
ends up being used in:

bintime_addx(bt, diff_scaled);

Observed consistently for tc->tc_frequency:

tc->tc_frequency == 0x1fca055 (i.e., 33333333)

( tc->tc_counter_mask is 0xfffffffful as well. )

An example observation of diff_scaled having an overflowed
value is:

scale_factor == 0x80da2067ac
scale_factor*freq overflows unsigned, 64 bit representation.
tim_offset == 0x3da0eaeb
tim_cnt == 0x42dea3c4
tim_diff == 0x53db8d9
For reference: 0x1fc9d43 == 0xffffffffffffffffull/scale_factor
diff_scaled == 0xA353A5BF3FF780CC (truncated to 64 bits)

So scale_factor * tim_diff leaves diff_scaled truncated to
the least significant 64 bits, which does not preserve
ordering properties.

Another example:

scale_factor == 0x80d95962c0
scale_factor*freq == 0xfffffffffd65c9c0
tim_offset == 0x4d1fb8e2
tim_cnt == 0x4d1fb8e1
tim_diff == 0xffffffff
For reference: 0x1fca055 == 0xffffffffffffffffull/scale_factor
diff_scaled == 0xD959623F26A69D40 (truncated to 64 bits)

Again the diff_scaled holds a truncated value from
scale_factor * tim_diff .

Another example:

scale_factor == 0x80da20c940
scale_factor*freq overflows unsigned, 64 bit representation.
tim_offset == 0x9a7f5cdb
tim_cnt == 0xb26bbd5
tim_diff == 0x70a75efa
For reference: 0x1fc9d41 == 0xffffffffffffffffull/scale_factor
diff_scaled == 0xB3AC715C56AA0880 (truncated to 64 bits)

Again the diff_scaled holds a truncated value from
scale_factor * tim_diff .

Note that the scale_factor does vary.

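A user-space sketch of the arithmetic described above, checked against the first two reported examples. mul_overflows_u64 tests whether scale_factor * tim_diff exceeds 64 bits, and scale_delta shows one conventional way to keep the bits that the 64-bit multiply drops, by splitting the scale factor into 32-bit halves. Neither function is from kern_tc.c; struct bt_sketch and both function names are invented for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True when scale * delta cannot be represented in 64 bits. */
static bool
mul_overflows_u64(uint64_t scale, uint32_t delta)
{
    return (delta != 0 && scale > UINT64_MAX / delta);
}

/* Whole seconds plus a 64-bit fraction in units of 2^-64 s. */
struct bt_sketch {
    uint64_t sec;
    uint64_t frac;
};

/*
 * Full-width scale * delta without a 128-bit type.  The product is
 * hi * 2^32 + lo, where both partial products fit in 64 bits.
 */
static struct bt_sketch
scale_delta(uint64_t scale, uint32_t delta)
{
    struct bt_sketch r;
    uint64_t lo = (scale & 0xffffffffu) * delta;    /* weight 2^0 */
    uint64_t hi = (scale >> 32) * delta;            /* weight 2^32 */

    r.sec = hi >> 32;           /* upper half of hi is whole seconds */
    r.frac = (hi << 32) + lo;   /* lower half of hi joins the fraction */
    if (r.frac < lo)            /* the addition wrapped: carry a second */
        r.sec++;
    assert(r.sec <= UINT32_MAX);
    return (r);
}
```

For the first example (scale_factor 0x80da2067ac, tim_diff 0x53db8d9) the full product is 2 seconds plus the fraction 0xA353A5BF3FF780CC, so the truncated value is short by 2 seconds. For the second (0x80d95962c0, 0xffffffff) it is 128 seconds plus 0xD959623F26A69D40.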
=3D=3D=3D Mark Millard marklmi at yahoo.com ( dsl-only.net went away in early 2018-Mar) From owner-freebsd-ppc@freebsd.org Thu Feb 28 14:55:55 2019 Return-Path: Delivered-To: freebsd-ppc@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 941A6150A71C; Thu, 28 Feb 2019 14:55:55 +0000 (UTC) (envelope-from kib@freebsd.org) Received: from kib.kiev.ua (kib.kiev.ua [IPv6:2001:470:d5e7:1::1]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 162356E768; Thu, 28 Feb 2019 14:55:54 +0000 (UTC) (envelope-from kib@freebsd.org) Received: from tom.home (kib@localhost [127.0.0.1]) by kib.kiev.ua (8.15.2/8.15.2) with ESMTPS id x1SEthXj081104 (version=TLSv1.2 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO); Thu, 28 Feb 2019 16:55:46 +0200 (EET) (envelope-from kib@freebsd.org) DKIM-Filter: OpenDKIM Filter v2.10.3 kib.kiev.ua x1SEthXj081104 Received: (from kostik@localhost) by tom.home (8.15.2/8.15.2/Submit) id x1SEtgqQ081103; Thu, 28 Feb 2019 16:55:42 +0200 (EET) (envelope-from kib@freebsd.org) X-Authentication-Warning: tom.home: kostik set sender to kib@freebsd.org using -f Date: Thu, 28 Feb 2019 16:55:42 +0200 From: Konstantin Belousov To: Mark Millard Cc: FreeBSD PowerPC ML , freebsd-hackers Hackers Subject: Re: powerpc64 head -r344018 stuck sleeping problems: th->th_scale * tc_delta(th) overflows unsigned 64 bits sometimes Message-ID: <20190228145542.GT2420@kib.kiev.ua> References: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.11.2 (2019-01-07) X-Spam-Status: No, score=-2.9 required=5.0 tests=ALL_TRUSTED,BAYES_00 autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on tom.home X-BeenThere: freebsd-ppc@freebsd.org X-Mailman-Version: 2.1.29 
Precedence: list List-Id: Porting FreeBSD to the PowerPC List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 28 Feb 2019 14:55:55 -0000

On Thu, Feb 28, 2019 at 05:06:23AM -0800, Mark Millard via freebsd-ppc wrote:
> Basic context:
>
> The code for sleeps of various forms depends
> on calls to:
>
> static __inline sbintime_t
> sbinuptime(void)
> {
>     struct bintime _bt;
>
>     binuptime(&_bt);
>     return (bttosbt(_bt));
> }
>
> and comparisons with the return values, such
> as checking for timeouts. The upper 32 bits
> of the signed 64 bit result hold the seconds
> and the lower 32 bits hold the fraction as
> a multiplier of 1sec/(2**32).
>
> An observed problem is that later sbinuptime calls
> sometimes end up with smaller values than earlier
> ones. (Past freebsd-ppc list messages give details.)
> This makes for problems with checking for timeouts
> when using later sbinuptime() calls after a timeout
> was initially detected against an earlier value:
>
> A.0) timercb getting the earlier sbinuptime() value
> A.1) callout_process using that to detect a timeout,
> B) sleepq_timeout checking the timeout again,
>    using a separate sbinuptime() call.
>
> Some details about example values, overflows, and such follow.
>
> I used the following sort of hacked code to report values when
> overflows happen:
>
> #if defined(__powerpc64__) && defined(AIM)
> void
> binuptime(struct bintime *bt)
> {
>     struct timehands *th;
>     u_int gen;
>
>     struct timecounter *tc; // HACK!!!
>     u_int tim_cnt, tim_offset, tim_diff; // HACK!!!
>     uint64_t freq, scale_factor, diff_scaled; // HACK!!!
>
>     do {
>         th = timehands;
>         tc = th->th_counter; // HACK!!!
>         gen = atomic_load_acq_int(&th->th_generation);
>         *bt = th->th_offset;
>         tim_cnt= tc->tc_get_timecount(tc); // HACK!!! (steps of tc_delta with values recorded)
>         tim_offset= th->th_offset_count; // HACK!!!
>         tim_diff= (tim_cnt - tim_offset) & tc->tc_counter_mask; // HACK!!!
>         scale_factor= th->th_scale; // HACK!!!
>         diff_scaled= scale_factor * tim_diff; // HACK!!!
>         bintime_addx(bt, diff_scaled); // HACK!!!
>         freq= tc->tc_frequency; // HACK!!!
>         atomic_thread_fence_acq();
>     } while (gen == 0 || gen != th->th_generation);
>
>     if (*(volatile uint64_t*)0xc0000000000000f0==0u && (0xffffffffffffffffull/scale_factor)<tim_diff) {
>         *(volatile uint64_t*)0xc0000000000000d0= freq;
>         *(volatile uint64_t*)0xc0000000000000d8= scale_factor;
>         *(volatile u_int*)0xc0000000000000e0= tim_offset;
>         *(volatile u_int*)0xc0000000000000e4= tim_cnt;
>         *(volatile u_int*)0xc0000000000000e8= tim_diff;
>         *(volatile uint64_t*)0xc0000000000000f0= diff_scaled;
>         *(volatile uint64_t*)0xc0000000000000f8= scale_factor*freq;
>         __asm__ ("sync");
>     }
> }
> #else
> . . .
> #endif
>
> (mftb() is used to provide the tc->tc_get_timecount(tc)
> value --but only the lower 32 bits are extracted and
> returned.)
>
> Basically whenever tim_diff is such that:
>
> (0xffffffffffffffff/scale_factor)<tim_diff
>
> then diff_scaled overflows an unsigned, 64 bit representation,
> ending up with just the least 64 bits. This truncated value
> ends up being used in:
>
> bintime_addx(bt, diff_scaled);
>
> Observed consistently for tc->tc_frequency:
>
> tc->tc_frequency == 0x1fca055 (i.e., 33333333)
>
> ( tc->tc_counter_mask is 0xfffffffful as well. )
>
> An example observation of diff_scaled having an overflowed
> value is:
>
> scale_factor == 0x80da2067ac
> scale_factor*freq overflows unsigned, 64 bit representation.
> tim_offset == 0x3da0eaeb
> tim_cnt == 0x42dea3c4
> tim_diff == 0x53db8d9
> For reference: 0x1fc9d43 == 0xffffffffffffffffull/scale_factor
> diff_scaled == 0xA353A5BF3FF780CC (truncated to 64 bits)
>
> So scale_factor * tim_diff leaves diff_scaled truncated to
> the least significant 64 bits, which does not preserve
> ordering properties.
>
> Another example:
>
> scale_factor == 0x80d95962c0
> scale_factor*freq == 0xfffffffffd65c9c0
> tim_offset == 0x4d1fb8e2
> tim_cnt == 0x4d1fb8e1
> tim_diff == 0xffffffff
> For reference: 0x1fca055 == 0xffffffffffffffffull/scale_factor
> diff_scaled == 0xD959623F26A69D40 (truncated to 64 bits)
>
> Again the diff_scaled holds a truncated value from
> scale_factor * tim_diff .
>
> Another example:
>
> scale_factor == 0x80da20c940
> scale_factor*freq overflows unsigned, 64 bit representation.
> tim_offset == 0x9a7f5cdb
> tim_cnt == 0xb26bbd5
> tim_diff == 0x70a75efa
> For reference: 0x1fc9d41 == 0xffffffffffffffffull/scale_factor
> diff_scaled == 0xB3AC715C56AA0880 (truncated to 64 bits)
>
> Again the diff_scaled holds a truncated value from
> scale_factor * tim_diff .
>
> Note that the scale_factor does vary.
>

Try the following (I did not even boot it). If it works out,
the ffclock counterpart also needs the patching.

diff --git a/sys/kern/kern_tc.c b/sys/kern/kern_tc.c
index 2656fb4d22f..19e81bbf023 100644
--- a/sys/kern/kern_tc.c
+++ b/sys/kern/kern_tc.c
@@ -355,13 +355,20 @@ void
 binuptime(struct bintime *bt)
 {
     struct timehands *th;
-    u_int gen;
+    uint64_t scale;
+    u_int delta, gen;
 
     do {
         th = timehands;
         gen = atomic_load_acq_int(&th->th_generation);
         *bt = th->th_offset;
-        bintime_addx(bt, th->th_scale * tc_delta(th));
+        scale = th->th_scale;
+        delta = tc_delta(th);
+        if (fls(scale) + fls(delta) > 63) {
+            bt->sec += (scale >> 32) * delta;
+            scale &= UINT_MAX;
+        }
+        bintime_addx(bt, scale * delta);
         atomic_thread_fence_acq();
     } while (gen == 0 || gen != th->th_generation);
 }
@@ -388,13 +395,20 @@ void
 bintime(struct bintime *bt)
 {
     struct timehands *th;
-    u_int gen;
+    uint64_t scale;
+    u_int delta, gen;
 
     do {
         th = timehands;
         gen = atomic_load_acq_int(&th->th_generation);
         *bt = th->th_bintime;
-        bintime_addx(bt, th->th_scale * tc_delta(th));
+        scale = th->th_scale;
+        delta = tc_delta(th);
+        if (fls(scale) + fls(delta) > 63) {
+            bt->sec += (scale >> 32) * delta;
+            scale &= UINT_MAX;
+        }
+        bintime_addx(bt, scale * delta);
         atomic_thread_fence_acq();
     } while (gen == 0 || gen != th->th_generation);
 }

From owner-freebsd-ppc@freebsd.org Thu Feb 28 15:08:19 2019 Return-Path: Delivered-To: freebsd-ppc@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 2EEDE150ACD3; Thu, 28 Feb 2019 15:08:19 +0000 (UTC) (envelope-from kib@freebsd.org) Received: from kib.kiev.ua (kib.kiev.ua [IPv6:2001:470:d5e7:1::1]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 9F8A96EEB4; Thu, 28 Feb 2019 15:08:18 +0000 (UTC) (envelope-from kib@freebsd.org) Received: from tom.home (kib@localhost [127.0.0.1]) by kib.kiev.ua (8.15.2/8.15.2) with ESMTPS id x1SF8B1b083568 (version=TLSv1.2 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO); Thu, 28 Feb 2019 17:08:14 +0200 (EET) (envelope-from kib@freebsd.org) DKIM-Filter: OpenDKIM Filter v2.10.3 kib.kiev.ua x1SF8B1b083568 Received: (from kostik@localhost) by tom.home (8.15.2/8.15.2/Submit) id x1SF8BDh083567; Thu, 28 Feb 2019 17:08:11 +0200 (EET) (envelope-from kib@freebsd.org) X-Authentication-Warning: tom.home: kostik set sender to kib@freebsd.org using -f Date: Thu, 28 Feb 2019 17:08:11 +0200 From: Konstantin Belousov To: Mark Millard Cc: freebsd-hackers Hackers , FreeBSD PowerPC ML Subject: Re: powerpc64 head -r344018 stuck sleeping problems: th->th_scale * tc_delta(th) overflows unsigned 64 bits sometimes Message-ID: <20190228150811.GU2420@kib.kiev.ua> References: <20190228145542.GT2420@kib.kiev.ua> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20190228145542.GT2420@kib.kiev.ua> User-Agent: Mutt/1.11.2 (2019-01-07) X-Spam-Status: No, score=-2.9 required=5.0 tests=ALL_TRUSTED,BAYES_00 autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version:
SpamAssassin 3.4.2 (2018-09-13) on tom.home
X-BeenThere: freebsd-ppc@freebsd.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Porting FreeBSD to the PowerPC
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Thu, 28 Feb 2019 15:08:19 -0000

On Thu, Feb 28, 2019 at 04:55:42PM +0200, Konstantin Belousov wrote:
> On Thu, Feb 28, 2019 at 05:06:23AM -0800, Mark Millard via freebsd-ppc wrote:
> > Basic context:
> >
> > The code for sleeps of various forms depends
> > on calls to:
> >
> > static __inline sbintime_t
> > sbinuptime(void)
> > {
> > 	struct bintime _bt;
> >
> > 	binuptime(&_bt);
> > 	return (bttosbt(_bt));
> > }
> >
> > and comparisons with the return values, such
> > as checking for timeouts. The upper 32 bits
> > of the 64 bit result hold the seconds
> > and the lower 32 bits hold the fraction as
> > a multiplier of 1sec/(2**32).
> >
> > An observed problem is that later sbinuptime calls
> > sometimes end up with smaller values than earlier
> > ones. (Past freebsd-ppc list messages give details.)
> > This makes for problems with checking for timeouts
> > when using later sbinuptime() calls after a timeout
> > was initially detected against an earlier value:
> >
> > A.0) timercb getting the earlier sbinuptime() value
> > A.1) callout_process using that to detect a timeout,
> > B) sleepq_timeout checking the timeout again,
> >    using a separate sbinuptime() call.
> >
> > Some details about example values, overflows, and such follow.
> >
> > I used the following sort of hacked code to report values when
> > overflows happen:
> >
> > #if defined(__powerpc64__) && defined(AIM)
> > void
> > binuptime(struct bintime *bt)
> > {
> > 	struct timehands *th;
> > 	u_int gen;
> >
> > 	struct timecounter *tc; // HACK!!!
> > 	u_int tim_cnt, tim_offset, tim_diff; // HACK!!!
> > 	uint64_t freq, scale_factor, diff_scaled; // HACK!!!
> >
> > 	do {
> > 		th = timehands;
> > 		tc = th->th_counter; // HACK!!!
> > 		gen = atomic_load_acq_int(&th->th_generation);
> > 		*bt = th->th_offset;
> > 		tim_cnt= tc->tc_get_timecount(tc); // HACK!!! (steps of tc_delta with values recorded)
> > 		tim_offset= th->th_offset_count; // HACK!!!
> > 		tim_diff= (tim_cnt - tim_offset) & tc->tc_counter_mask; // HACK!!!
> > 		scale_factor= th->th_scale; // HACK!!!
> > 		diff_scaled= scale_factor * tim_diff; // HACK!!!
> > 		bintime_addx(bt, diff_scaled); // HACK!!!
> > 		freq= tc->tc_frequency; // HACK!!!
> > 		atomic_thread_fence_acq();
> > 	} while (gen == 0 || gen != th->th_generation);
> >
> > 	if (*(volatile uint64_t*)0xc0000000000000f0==0u && (0xffffffffffffffffull/scale_factor) < tim_diff) {
> > 		*(volatile uint64_t*)0xc0000000000000d0= freq;
> > 		*(volatile uint64_t*)0xc0000000000000d8= scale_factor;
> > 		*(volatile u_int*)0xc0000000000000e0= tim_offset;
> > 		*(volatile u_int*)0xc0000000000000e4= tim_cnt;
> > 		*(volatile u_int*)0xc0000000000000e8= tim_diff;
> > 		*(volatile uint64_t*)0xc0000000000000f0= diff_scaled;
> > 		*(volatile uint64_t*)0xc0000000000000f8= scale_factor*freq;
> > 		__asm__ ("sync");
> > 	}
> > }
> > #else
> > . . .
> > #endif
> >
> > (mftb() is used to provide the tc->tc_get_timecount(tc)
> > value, but only the lower 32 bits are extracted and
> > returned.)
> >
> > Basically whenever tim_diff is such that:
> >
> > 	(0xffffffffffffffff/scale_factor) < tim_diff
> >
> > then diff_scaled overflows an unsigned, 64 bit representation,
> > ending up with just the least significant 64 bits. This truncated
> > value ends up being used in:
> >
> > 	bintime_addx(bt, diff_scaled);
> >
> > Observed consistently for tc->tc_frequency:
> >
> > tc->tc_frequency == 0x1fca055 (i.e., 33333333)
> >
> > ( tc->tc_counter_mask is 0xfffffffful as well. )
> >
> > An example observation of diff_scaled having an overflowed
> > value is:
> >
> > scale_factor == 0x80da2067ac
> > scale_factor*freq overflows unsigned, 64 bit representation.
> > tim_offset == 0x3da0eaeb
> > tim_cnt == 0x42dea3c4
> > tim_diff == 0x53db8d9
> > For reference: 0x1fc9d43 == 0xffffffffffffffffull/scale_factor
> > diff_scaled == 0xA353A5BF3FF780CC (truncated to 64 bits)
> >
> > So scale_factor * tim_diff leaves diff_scaled truncated to
> > the least significant 64 bits, which does not preserve
> > ordering properties.
> >
> > Another example:
> >
> > scale_factor == 0x80d95962c0
> > scale_factor*freq == 0xfffffffffd65c9c0
> > tim_offset == 0x4d1fb8e2
> > tim_cnt == 0x4d1fb8e1
> > tim_diff == 0xffffffff
> > For reference: 0x1fca055 == 0xffffffffffffffffull/scale_factor
> > diff_scaled == 0xD959623F26A69D40 (truncated to 64 bits)
> >
> > Again the diff_scaled holds a truncated value from
> > scale_factor * tim_diff .
> >
> > Another example:
> >
> > scale_factor == 0x80da20c940
> > scale_factor*freq overflows unsigned, 64 bit representation.
> > tim_offset == 0x9a7f5cdb
> > tim_cnt == 0xb26bbd5
> > tim_diff == 0x70a75efa
> > For reference: 0x1fc9d41 == 0xffffffffffffffffull/scale_factor
> > diff_scaled == 0xB3AC715C56AA0880 (truncated to 64 bits)
> >
> > Again the diff_scaled holds a truncated value from
> > scale_factor * tim_diff .
> >
> > Note that the scale_factor does vary.
> >
>
> Try the following (I did not even boot it). If it works out, the ffclock
> counterpart also needs the patching.
>
> diff --git a/sys/kern/kern_tc.c b/sys/kern/kern_tc.c
> index 2656fb4d22f..19e81bbf023 100644
> --- a/sys/kern/kern_tc.c
> +++ b/sys/kern/kern_tc.c
> @@ -355,13 +355,20 @@ void
>  binuptime(struct bintime *bt)
>  {
>  	struct timehands *th;
> -	u_int gen;
> +	uint64_t scale;
> +	u_int delta, gen;
>
>  	do {
>  		th = timehands;
>  		gen = atomic_load_acq_int(&th->th_generation);
>  		*bt = th->th_offset;
> -		bintime_addx(bt, th->th_scale * tc_delta(th));
> +		scale = th->th_scale;
> +		delta = tc_delta(th);
> +		if (fls(scale) + fls(delta) > 63) {
> +			bt->sec += (scale >> 32) * delta;
> +			scale &= UINT_MAX;
> +		}
> +		bintime_addx(bt, scale * delta);
>  		atomic_thread_fence_acq();
>  	} while (gen == 0 || gen != th->th_generation);

Of course I botched the formula, please try this instead:

diff --git a/sys/kern/kern_tc.c b/sys/kern/kern_tc.c
index 2656fb4d22f..fdd4f4f6a52 100644
--- a/sys/kern/kern_tc.c
+++ b/sys/kern/kern_tc.c
@@ -355,13 +355,22 @@ void
 binuptime(struct bintime *bt)
 {
 	struct timehands *th;
-	u_int gen;
+	uint64_t scale, x;
+	u_int delta, gen;
 
 	do {
 		th = timehands;
 		gen = atomic_load_acq_int(&th->th_generation);
 		*bt = th->th_offset;
-		bintime_addx(bt, th->th_scale * tc_delta(th));
+		scale = th->th_scale;
+		delta = tc_delta(th);
+		if (fls(scale) + fls(delta) > 63) {
+			x = (scale >> 32) * delta;
+			scale &= UINT_MAX;
+			bt->sec += x >> 32;
+			bintime_addx(bt, x << 32);
+		}
+		bintime_addx(bt, scale * delta);
 		atomic_thread_fence_acq();
 	} while (gen == 0 || gen != th->th_generation);
 }
@@ -388,13 +397,22 @@ void
 bintime(struct bintime *bt)
 {
 	struct timehands *th;
-	u_int gen;
+	uint64_t scale, x;
+	u_int delta, gen;
 
 	do {
 		th = timehands;
 		gen = atomic_load_acq_int(&th->th_generation);
 		*bt = th->th_bintime;
-		bintime_addx(bt, th->th_scale * tc_delta(th));
+		scale = th->th_scale;
+		delta = tc_delta(th);
+		if (fls(scale) + fls(delta) > 63) {
+			x = (scale >> 32) * delta;
+			scale &= UINT_MAX;
+			bt->sec += x >> 32;
+			bintime_addx(bt, x << 32);
+		}
+		bintime_addx(bt, scale * delta);
atomic_thread_fence_acq(); } while (gen == 0 || gen != th->th_generation); } From owner-freebsd-ppc@freebsd.org Thu Feb 28 21:46:17 2019 Return-Path: Delivered-To: freebsd-ppc@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 327C1151A75F for ; Thu, 28 Feb 2019 21:46:17 +0000 (UTC) (envelope-from marklmi@yahoo.com) Received: from sonic303-22.consmr.mail.ne1.yahoo.com (sonic303-22.consmr.mail.ne1.yahoo.com [66.163.188.148]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 887178AD5F for ; Thu, 28 Feb 2019 21:46:16 +0000 (UTC) (envelope-from marklmi@yahoo.com) X-YMail-OSG: 8kxB3nYVM1kBOQtOEtPR3odYJ_0nZ_nrZUVL6HZ.74dh9izCUPgbpiEhTXh3WHZ YFefPes4x_6YiI3_CrDGU6k2vPKA8lpFOvU4MFaGg42RwJpAWtbhUzG5PkrPKel61b4XufkqqPom fcCmrEK0BmS.AK_Z1z1CKA.VSqB25WcxHn9VnUCusp08xQIjpTLPBttSVkg5x45nFthW.3tphtML IuWHyeGTx9Qb1Ifhu7JVOYlf_jORM5rWXt1m.IvwWJRXtBsutNBdMOpUdfNx9rGjfyEL1YzWbXPp kcYaYvykvlGYDVQuWnejnFTKavkBwkribzNi.mRwlU2Kz1Lrmw7vXalUwzgHZ9GaV0k3lnCniOBN k0q21ptC66I4aFTyMM7X48Te9Exm.bpv.nqLreatvU_4xN3T0zwiYfztOh99RYWsAxmIZz7HHSiX 7p9KV5vGrKrlQC09Nlq.9UsSPeh86KlcRSsvhmRkj.lz0CnE0MVJ8G6kgxclAgJTewSbc6oZkKmo 1kwuSnh23Dlya.llvQpkPDEH6fb3v1ZqrKeToSwNt27Otbnu7nemRkbE1QFIoRxSJsjLFO_9hIHn tfTJOIgE7W6JEHAOWhqMaVe5_1WZJ9v741aHMsM3G3ujbVl4iqvwMoFIbjv6.HiqKD35pYRfstO6 xia8IQtF825sLt8uWaxXCxlxG0_Odzh8t49KD19TGupd14.ywmdfrCf3vKzXTPyg7kPKF3MWNtr0 PGqtH3dzg7030Jp2mADQRvTMLP..EaIZzNuwhyuBN0eHf2uTeupelgb8.5pI73QaovznbxQeGLvn I0JoynI2MuBmQwiefcIoEHEubBjfYx7CT6wghCvXtnJs3I11CPRSO_WIeDlVHZ2QmW5Sq3iqFemt YN_dnaNXgsuHUXuApv0J16p2RfEkJQCWq57AoAZyqjFn982MR7OKYwM8UwHjQ41MJMwJr0.4L.85 qM2ICSJ.OPqNvnbfFNj9JzW46evGyUjs4W4SqiiDttYcFDaTHAkmxF6sKq_DoGiac7BJMdfP78.E NBhgq066CM3bWvN.Kjq6tWCSBkSiSEbvw1PhLfq3KhOd6H5.6hAGLiWXnctQaWsylJHSzLSxK3w- - Received: from sonic.gate.mail.ne1.yahoo.com by 
sonic303.consmr.mail.ne1.yahoo.com with HTTP; Thu, 28 Feb 2019 21:46:14 +0000
Received: from c-67-170-167-181.hsd1.or.comcast.net (EHLO [192.168.1.113]) ([67.170.167.181]) by smtp426.mail.ne1.yahoo.com (Oath Hermes SMTP Server) with ESMTPA ID 4664458328c716d9daee1a813f76f6e2; Thu, 28 Feb 2019 21:46:12 +0000 (UTC)
Content-Type: text/plain; charset=us-ascii
Mime-Version: 1.0 (Mac OS X Mail 12.2 \(3445.102.3\))
Subject: Re: powerpc64 head -r344018 stuck sleeping problems: th->th_scale * tc_delta(th) overflows unsigned 64 bits sometimes
From: Mark Millard
In-Reply-To: <20190228150811.GU2420@kib.kiev.ua>
Date: Thu, 28 Feb 2019 13:46:11 -0800
Cc: freebsd-hackers Hackers , FreeBSD PowerPC ML
Content-Transfer-Encoding: quoted-printable
Message-Id: <0A345E1F-7675-4B4B-8A74-ACD59E90E72F@yahoo.com>
References: <20190228145542.GT2420@kib.kiev.ua> <20190228150811.GU2420@kib.kiev.ua>
To: Konstantin Belousov
X-Mailer: Apple Mail (2.3445.102.3)
X-Rspamd-Queue-Id: 887178AD5F
X-Spamd-Bar: ------
X-Spamd-Result: default: False [-6.99 / 15.00]; NEURAL_HAM_MEDIUM(-1.00)[-1.000,0]; NEURAL_HAM_LONG(-1.00)[-1.000,0]; REPLY(-4.00)[]; NEURAL_HAM_SHORT(-0.99)[-0.994,0]
X-BeenThere: freebsd-ppc@freebsd.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Porting FreeBSD to the PowerPC
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Thu, 28 Feb 2019 21:46:17 -0000

On 2019-Feb-28, at 07:08, Konstantin Belousov wrote:

> On Thu, Feb 28, 2019 at 04:55:42PM +0200, Konstantin Belousov wrote:
>> On Thu, Feb 28, 2019 at 05:06:23AM -0800, Mark Millard via freebsd-ppc wrote:
>>> . . .
>>
>> . . .
>
> Of course I botched the formula, please try this instead:
>
> diff --git a/sys/kern/kern_tc.c b/sys/kern/kern_tc.c
> index 2656fb4d22f..fdd4f4f6a52 100644
> --- a/sys/kern/kern_tc.c
> +++ b/sys/kern/kern_tc.c
> @@ -355,13 +355,22 @@ void
>  binuptime(struct bintime *bt)
>  {
>  	struct timehands *th;
> -	u_int gen;
> +	uint64_t scale, x;
> +	u_int delta, gen;
>
>  	do {
>  		th = timehands;
>  		gen = atomic_load_acq_int(&th->th_generation);
>  		*bt = th->th_offset;
> -		bintime_addx(bt, th->th_scale * tc_delta(th));
> +		scale = th->th_scale;
> +		delta = tc_delta(th);
> +		if (fls(scale) + fls(delta) > 63) {
> +			x = (scale >> 32) * delta;
> +			scale &= UINT_MAX;

The following two lines confuse me overall:

> +			bt->sec += x >> 32;
> +			bintime_addx(bt, x << 32);

bintime_addx does:

static __inline void
bintime_addx(struct bintime *_bt, uint64_t _x)
{
	uint64_t _u;

	_u = _bt->frac;
	_bt->frac += _x;
	if (_u > _bt->frac)
		_bt->sec++;
}

So I'd expect:

	bintime_addx(bt, x << 32)

to find _u > _bt->frac and to also do _bt->sec++ .
So overall (as a means of summarizing for
bt->sec):

	bt->sec += (x >> 32) + 1;

Is that the intent?
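
The carry question above can be checked in userland. What follows is only a sketch under stated assumptions, not the kernel code: struct bt, bt_addx(), scaled_add() and ref_add() are hypothetical stand-ins. bt_addx() copies the bintime_addx() logic quoted above, scaled_add() applies the patch's high/low split unconditionally (the patch does it only when the fls() test fires), and ref_add() relies on unsigned __int128, which assumes GCC or Clang on a 64-bit target.

```c
#include <stdint.h>

/* Hypothetical stand-in for struct bintime: seconds plus a 64-bit binary fraction. */
struct bt {
	uint64_t sec;
	uint64_t frac;
};

/* Same logic as bintime_addx(): carry into sec only when frac wraps around. */
static void
bt_addx(struct bt *b, uint64_t x)
{
	uint64_t u = b->frac;

	b->frac += x;
	if (u > b->frac)
		b->sec++;
}

/* High/low split of scale * delta, as in the second patch (applied unconditionally). */
static void
scaled_add(struct bt *b, uint64_t scale, uint32_t delta)
{
	/* (scale >> 32) and delta are both below 2**32, so x fits in 64 bits. */
	uint64_t x = (scale >> 32) * delta;

	b->sec += x >> 32;	/* bits 64 and up of the full product */
	bt_addx(b, x << 32);	/* bits 32..63 coming from the high half */
	/* Low half of scale times delta also fits in 64 bits. */
	bt_addx(b, (scale & UINT32_MAX) * delta);
}

/* Reference result: full 128-bit product added to the 128-bit value sec:frac. */
static void
ref_add(struct bt *b, uint64_t scale, uint32_t delta)
{
	unsigned __int128 v = ((unsigned __int128)b->sec << 64) | b->frac;

	v += (unsigned __int128)scale * delta;
	b->sec = (uint64_t)(v >> 64);
	b->frac = (uint64_t)v;
}
```

Starting from sec = 0 and frac = 0, scaled_add() with scale_factor 0x80d95962c0 and tim_diff 0xffffffff gives frac == 0xD959623F26A69D40, the truncated value reported in the second example, and sec == 0x80, the part that the plain 64-bit multiply loses. bt_addx(b, x << 32) carries only when the existing fraction plus the added value wraps past 2**64, so it does not add an extra second for every nonzero x.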
> +		}
> +		bintime_addx(bt, scale * delta);
>  		atomic_thread_fence_acq();
>  	} while (gen == 0 || gen != th->th_generation);
>  }
> @@ -388,13 +397,22 @@ void
>  bintime(struct bintime *bt)
>  {
>  	struct timehands *th;
> -	u_int gen;
> +	uint64_t scale, x;
> +	u_int delta, gen;
>
>  	do {
>  		th = timehands;
>  		gen = atomic_load_acq_int(&th->th_generation);
>  		*bt = th->th_bintime;
> -		bintime_addx(bt, th->th_scale * tc_delta(th));
> +		scale = th->th_scale;
> +		delta = tc_delta(th);
> +		if (fls(scale) + fls(delta) > 63) {
> +			x = (scale >> 32) * delta;
> +			scale &= UINT_MAX;

The same for the below two lines:

> +			bt->sec += x >> 32;
> +			bintime_addx(bt, x << 32);
> +		}
> +		bintime_addx(bt, scale * delta);
>  		atomic_thread_fence_acq();
>  	} while (gen == 0 || gen != th->th_generation);
>  }
>

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went away in
early 2018-Mar)

From owner-freebsd-ppc@freebsd.org Thu Feb 28 21:50:57 2019
Return-Path:
Delivered-To: freebsd-ppc@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 6E7DC151AA92 for ; Thu, 28 Feb 2019 21:50:57 +0000 (UTC) (envelope-from marklmi@yahoo.com)
Received: from sonic301-4.consmr.mail.bf2.yahoo.com (sonic301-4.consmr.mail.bf2.yahoo.com [74.6.129.43]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 662EF8B15D for ; Thu, 28 Feb 2019 21:50:56 +0000 (UTC) (envelope-from marklmi@yahoo.com)
X-YMail-OSG: YhBD9cQVM1m4p0osGu5oRLz1yqBtp_MDtFkIOTKUx5klwK3bPHLl4H.zMWngptV d3xFpCSg0S9i9cCDPFn8LLRv7oJDwki2J26NnIib4B0QUYkqyGORM8sLN.0GK9m11gmOrBvqceo7 YUJpjgW3r1kYSem_nEL1NwjRGZy1wBZST9SQ5l3FfEGRQxWN21dHcgNWmJBdUyoG74bCh.ER791N sv9uDLiAJeKPWAG95TJAzZSj5gxfyIaSG1EOTBB5svjZNsgc1AMNnLttwCEFnejqW2V0Iilih6jS YX6fZhYVkSBg9TMV_VLS2lWHgnmR2dLLG04ob789ByPaTTfDsvIIi3kD8vzNZ2vlXE25c1IK7Ew8
MBrwcvIAjmnbu44eu3UGpM4xoL2DeqMxTcRij6rRWPOtzrDpydN6S9XAF5peVwqaUVRDscZ5KSM1 ZLAhxycppl1Qjp09jXcTxmWvNZXjQ2Gfs_RzNg58uymtNekTsWYZqxBeWd0ptUsGq43.NzIZ3zOZ 2GDg3QX3T4.8Ygzwdn0d.QL1VcdRxspa3q9vwATEC6EhA61s420YTlz1JcXhTpePJd9GghsY2mvR 413oSFjZ22MZJ7tG8lBfj0Iqbo4cHYtgEXnTnw97o6gdG2zoOa5GeVn8mRNOunqtqr01UxKZT0uV iwU6qbKt2TfGTgUXXStBH9CRxs0RwyAwsUeIdiMAR2lBOGoisDcWI6q170rM1db94HIVQfvPkC7W OB3xsYDwWgMA.lqvxpVbsFHn8yHe9eyf0K0CeUS3elqCojsY9PCUNCPtP1TXwgBakPJmPBa7UDvV epDrCHwV1xrbadA3F_daZVaZ1nCX6hz74Cs8L_rV.Y3EZBbTQ_0S4pUKngv8lMvI5mZgXYnU5o2k ILgHfM264VpkgUPWW.Lnf3Eb6uMJfBHHa1.dFKZxlmwmhhGhFmNQz_5YWBBjkzgCuDfY7Gx.Awik kSWZ00d1vHH5njMr6g2jcnBgUAWSoiFnjnTBup448bXvM2pwnAFcPLPEaeXPUxq94_ZYvzTHr.kb t79WIta2BoK2wBok_PdsYtKw6RYxXkZhn8lgV0P7i_uN3dM3m Received: from sonic.gate.mail.ne1.yahoo.com by sonic301.consmr.mail.bf2.yahoo.com with HTTP; Thu, 28 Feb 2019 21:50:55 +0000 Received: from c-67-170-167-181.hsd1.or.comcast.net (EHLO [192.168.1.113]) ([67.170.167.181]) by smtp426.mail.bf1.yahoo.com (Oath Hermes SMTP Server) with ESMTPA ID 3349b256950a530d4ab78852d54d1707; Thu, 28 Feb 2019 21:50:51 +0000 (UTC) Content-Type: text/plain; charset=us-ascii Mime-Version: 1.0 (Mac OS X Mail 12.2 \(3445.102.3\)) Subject: Re: powerpc64 head -r344018 stuck sleeping problems: th->th_scale * tc_delta(th) overflows unsigned 64 bits sometimes From: Mark Millard In-Reply-To: <0A345E1F-7675-4B4B-8A74-ACD59E90E72F@yahoo.com> Date: Thu, 28 Feb 2019 13:50:49 -0800 Cc: freebsd-hackers Hackers , FreeBSD PowerPC ML Content-Transfer-Encoding: quoted-printable Message-Id: References: <20190228145542.GT2420@kib.kiev.ua> <20190228150811.GU2420@kib.kiev.ua> <0A345E1F-7675-4B4B-8A74-ACD59E90E72F@yahoo.com> To: Konstantin Belousov X-Mailer: Apple Mail (2.3445.102.3) X-Rspamd-Queue-Id: 662EF8B15D X-Spamd-Bar: +++ X-Spamd-Result: default: False [3.00 / 15.00]; RCVD_VIA_SMTP_AUTH(0.00)[]; R_SPF_ALLOW(-0.20)[+ptr:yahoo.com]; MV_CASE(0.50)[]; FREEMAIL_FROM(0.00)[yahoo.com]; RCVD_COUNT_THREE(0.00)[3]; TO_DN_ALL(0.00)[]; 
MX_GOOD(-0.01)[cached: mta6.am0.yahoodns.net]; DKIM_TRACE(0.00)[yahoo.com:+]; DMARC_POLICY_ALLOW(-0.50)[yahoo.com,reject]; FROM_EQ_ENVFROM(0.00)[]; RCVD_TLS_LAST(0.00)[]; MIME_TRACE(0.00)[0:+]; FREEMAIL_ENVFROM(0.00)[yahoo.com]; ASN(0.00)[asn:26101, ipnet:74.6.128.0/21, country:US]; MID_RHS_MATCH_FROM(0.00)[]; DWL_DNSWL_NONE(0.00)[yahoo.com.dwl.dnswl.org : 127.0.5.0]; ARC_NA(0.00)[]; R_DKIM_ALLOW(-0.20)[yahoo.com:s=s2048]; FROM_HAS_DN(0.00)[]; RCPT_COUNT_THREE(0.00)[3]; NEURAL_SPAM_SHORT(0.99)[0.993,0]; MIME_GOOD(-0.10)[text/plain]; IP_SCORE(1.36)[ip: (4.13), ipnet: 74.6.128.0/21(1.53), asn: 26101(1.22), country: US(-0.07)]; NEURAL_SPAM_MEDIUM(0.54)[0.540,0]; TO_MATCH_ENVRCPT_SOME(0.00)[]; NEURAL_SPAM_LONG(0.62)[0.619,0]; RCVD_IN_DNSWL_NONE(0.00)[43.129.6.74.list.dnswl.org : 127.0.5.0]
X-BeenThere: freebsd-ppc@freebsd.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Porting FreeBSD to the PowerPC
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Thu, 28 Feb 2019 21:50:57 -0000

[I left implicit that I was summarizing for x!=0.]

On 2019-Feb-28, at 13:46, Mark Millard wrote:

> On 2019-Feb-28, at 07:08, Konstantin Belousov wrote:
>
>> On Thu, Feb 28, 2019 at 04:55:42PM +0200, Konstantin Belousov wrote:
>>> On Thu, Feb 28, 2019 at 05:06:23AM -0800, Mark Millard via freebsd-ppc wrote:
>>>> . . .
>>>
>>> . . .
>>
>> Of course I botched the formula, please try this instead:
>>
>> diff --git a/sys/kern/kern_tc.c b/sys/kern/kern_tc.c
>> index 2656fb4d22f..fdd4f4f6a52 100644
>> --- a/sys/kern/kern_tc.c
>> +++ b/sys/kern/kern_tc.c
>> @@ -355,13 +355,22 @@ void
>>  binuptime(struct bintime *bt)
>>  {
>>  	struct timehands *th;
>> -	u_int gen;
>> +	uint64_t scale, x;
>> +	u_int delta, gen;
>>
>>  	do {
>>  		th = timehands;
>>  		gen = atomic_load_acq_int(&th->th_generation);
>>  		*bt = th->th_offset;
>> -		bintime_addx(bt, th->th_scale * tc_delta(th));
>> +		scale = th->th_scale;
>> +		delta = tc_delta(th);
>> +		if (fls(scale) + fls(delta) > 63) {
>> +			x = (scale >> 32) * delta;
>> +			scale &= UINT_MAX;
>
> The following two lines confuse me overall:
>
>> +			bt->sec += x >> 32;
>> +			bintime_addx(bt, x << 32);
>
> bintime_addx does:
>
> static __inline void
> bintime_addx(struct bintime *_bt, uint64_t _x)
> {
> 	uint64_t _u;
>
> 	_u = _bt->frac;
> 	_bt->frac += _x;
> 	if (_u > _bt->frac)
> 		_bt->sec++;
> }
>
> So I'd expect:

I forgot to indicate the context: when x!=0

> 	bintime_addx(bt, x << 32)
>
> to find _u > _bt->frac and to also do _bt->sec++ .
> So overall (as a means of summarizing for
> bt->sec):
>
> 	bt->sec += (x >> 32) + 1;
>
> Is that the intent?
>
>
>> +		}
>> +		bintime_addx(bt, scale * delta);
>>  		atomic_thread_fence_acq();
>>  	} while (gen == 0 || gen != th->th_generation);
>>  }
>> @@ -388,13 +397,22 @@ void
>>  bintime(struct bintime *bt)
>>  {
>>  	struct timehands *th;
>> -	u_int gen;
>> +	uint64_t scale, x;
>> +	u_int delta, gen;
>>
>>  	do {
>>  		th = timehands;
>>  		gen = atomic_load_acq_int(&th->th_generation);
>>  		*bt = th->th_bintime;
>> -		bintime_addx(bt, th->th_scale * tc_delta(th));
>> +		scale = th->th_scale;
>> +		delta = tc_delta(th);
>> +		if (fls(scale) + fls(delta) > 63) {
>> +			x = (scale >> 32) * delta;
>> +			scale &= UINT_MAX;
>
> The same for the below two lines:
>
>> +			bt->sec += x >> 32;
>> +			bintime_addx(bt, x << 32);
>
>
>> +		}
>> +		bintime_addx(bt, scale * delta);
>>  		atomic_thread_fence_acq();
>>  	} while (gen == 0 || gen != th->th_generation);
>>  }
>

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went away in
early 2018-Mar)

From owner-freebsd-ppc@freebsd.org Thu Feb 28 22:23:35 2019
Return-Path:
Delivered-To: freebsd-ppc@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 7888F151C0C8 for ; Thu, 28 Feb 2019 22:23:35 +0000 (UTC) (envelope-from marklmi@yahoo.com)
Received: from sonic313-14.consmr.mail.bf2.yahoo.com (sonic313-14.consmr.mail.bf2.yahoo.com [74.6.133.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 7DD208CC7D for ; Thu, 28 Feb 2019 22:23:34 +0000 (UTC) (envelope-from marklmi@yahoo.com)
X-YMail-OSG: Yxkyu2gVM1mrnnvGcuTlHIEWQmEoiysHtza7GiZuVgBPTgPGxTjxHnSdUHavV4M eAcLDhd9fw49La.wIUuVV44oFuLwX5leMf5Bx7F.2RXevsapGNbikariuAOJXxSKtk1HXlvfW6B8 NOul1933tY7DcJ47v0gNASRLfypf5ADJ1DkWnqswP0atTKHMDQoMfKH6vnKhD_x6InSAdiCltpGJ gqQ1TIUd9DTbN9GhgmugtfiZLhKexAxuT_wAag7OJRW5gGLMcA0NNGgj5H0viSo.Ddxt7MV.8NWd
7vsMkXxMg5Y1Sq2o9iytNVcgeGG_HWOgBj100T4wRXHyCI_H0P_C6hMJtVmUCZ6Bp4WjwzlwTQVW qOypIUWZwBe_mHHCQFbIxnAtBBr.Os0H2pILS.ygyN8hSIYJr0ggUwg6c9JxkCKkNZTMsUKpEJFD nrMRxyaRO9nPj0lQYTk1aiLbKMmJbDgGQp1k7yLwWXZ1r6TZlMiFzXl1TEu21ALhUKHOHQTTIqw8 arwhj4k4Mp9tp8otkUGrkAYGWGK3DQRj6H_H0c3mxeB942XcJQz_DAWrjBLE5fTH0iZOI11.OMTn rHR34j3wZ5VbS6IF0T_4WNtoLg1s.HkMwGntmzi6waotAO4ZJFyHB4FiiOlkzLyewnw8mHWHGyYd 4a9Mc6lHGWkpwbln6Oxmo4BetU0_jpE6KXs3q9U3nOVV1.nrmpZa04Fbss3ZOFW2lqEw8bSjh2MH N8A7h6yPbx4gflOkWFgKovytFtxTqQbtJqpZSzZS74oNQJLdC0R3ZoLQc6Kin_ZwVTR0tvRCGvxz vcZwIzTHFUYnmg8o_trCxy8QhdIAMgv8dZ1ql0BPPtXYSdfWIFMzsHfNROfDDa395OMJe2TZ4AKG YjZBKCqohyhgdnTvxnxj4VDBDXUZ3WcLiB0adreaOXCBOgtlyyB_jpM2XZIfhZyg8tiN7sE9Se7P 8vlwQb3TbNYh2kVsVw5L2s3U8qacQPHPqq_1P88TaKcMqSQ864cVAl1rcJwtcBPZ0iZ2wGD2WaaV IKBwtSXdNLkRR.K6KXuRCfoZNPcfA7214wCtZeObqaSYPKPOxBkPIMNJkn4WGxTTdGrAQxKs- Received: from sonic.gate.mail.ne1.yahoo.com by sonic313.consmr.mail.bf2.yahoo.com with HTTP; Thu, 28 Feb 2019 22:23:32 +0000 Received: from c-67-170-167-181.hsd1.or.comcast.net (EHLO [192.168.1.113]) ([67.170.167.181]) by smtp402.mail.bf1.yahoo.com (Oath Hermes SMTP Server) with ESMTPA ID ea3e99fc12ba6d8e3cb563de5a3d005b; Thu, 28 Feb 2019 22:23:28 +0000 (UTC) Content-Type: text/plain; charset=us-ascii Mime-Version: 1.0 (Mac OS X Mail 12.2 \(3445.102.3\)) Subject: Re: powerpc64 head -r344018 stuck sleeping problems: th->th_scale * tc_delta(th) overflows unsigned 64 bits sometimes From: Mark Millard In-Reply-To: Date: Thu, 28 Feb 2019 14:23:26 -0800 Cc: freebsd-hackers Hackers , FreeBSD PowerPC ML Content-Transfer-Encoding: quoted-printable Message-Id: <5C2FABEC-C56A-4153-A3FD-D5E1D54F1F6E@yahoo.com> References: <20190228145542.GT2420@kib.kiev.ua> <20190228150811.GU2420@kib.kiev.ua> <0A345E1F-7675-4B4B-8A74-ACD59E90E72F@yahoo.com> To: Konstantin Belousov X-Mailer: Apple Mail (2.3445.102.3) X-Rspamd-Queue-Id: 7DD208CC7D X-Spamd-Bar: +++ X-Spamd-Result: default: False [3.18 / 15.00]; RCVD_VIA_SMTP_AUTH(0.00)[]; 
R_SPF_ALLOW(-0.20)[+ptr:yahoo.com]; MV_CASE(0.50)[]; FREEMAIL_FROM(0.00)[yahoo.com]; RCVD_COUNT_THREE(0.00)[3]; TO_DN_ALL(0.00)[]; DKIM_TRACE(0.00)[yahoo.com:+]; MX_GOOD(-0.01)[cached: mta6.am0.yahoodns.net]; DMARC_POLICY_ALLOW(-0.50)[yahoo.com,reject]; FROM_EQ_ENVFROM(0.00)[]; MIME_TRACE(0.00)[0:+]; FREEMAIL_ENVFROM(0.00)[yahoo.com]; ASN(0.00)[asn:26101, ipnet:74.6.128.0/21, country:US]; MID_RHS_MATCH_FROM(0.00)[]; DWL_DNSWL_NONE(0.00)[yahoo.com.dwl.dnswl.org : 127.0.5.0]; ARC_NA(0.00)[]; R_DKIM_ALLOW(-0.20)[yahoo.com:s=s2048]; FROM_HAS_DN(0.00)[]; RCPT_COUNT_THREE(0.00)[3]; NEURAL_SPAM_SHORT(0.99)[0.994,0]; MIME_GOOD(-0.10)[text/plain]; RCVD_TLS_LAST(0.00)[]; NEURAL_SPAM_MEDIUM(0.58)[0.583,0]; TO_MATCH_ENVRCPT_SOME(0.00)[]; NEURAL_SPAM_LONG(0.68)[0.676,0]; RCVD_IN_DNSWL_NONE(0.00)[124.133.6.74.list.dnswl.org : 127.0.5.0]; IP_SCORE(1.44)[ip: (4.50), ipnet: 74.6.128.0/21(1.53), asn: 26101(1.22), country: US(-0.07)]
X-BeenThere: freebsd-ppc@freebsd.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Porting FreeBSD to the PowerPC
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Thu, 28 Feb 2019 22:23:35 -0000

[I was distracted and made a stupid mistake. Sorry for the noise.]

On 2019-Feb-28, at 13:50, Mark Millard wrote:

> [I left implicit that I was summarizing for x!=0.]
>
> On 2019-Feb-28, at 13:46, Mark Millard wrote:
>
>
>
>> On 2019-Feb-28, at 07:08, Konstantin Belousov wrote:
>>
>>> On Thu, Feb 28, 2019 at 04:55:42PM +0200, Konstantin Belousov wrote:
>>>> On Thu, Feb 28, 2019 at 05:06:23AM -0800, Mark Millard via freebsd-ppc wrote:
>>>>> . . .
>>>>
>>>> . . .
>>>
>>> Of course I botched the formula, please try this instead:
>>>
>>> diff --git a/sys/kern/kern_tc.c b/sys/kern/kern_tc.c
>>> index 2656fb4d22f..fdd4f4f6a52 100644
>>> --- a/sys/kern/kern_tc.c
>>> +++ b/sys/kern/kern_tc.c
>>> @@ -355,13 +355,22 @@ void
>>>  binuptime(struct bintime *bt)
>>>  {
>>>  	struct timehands *th;
>>> -	u_int gen;
>>> +	uint64_t scale, x;
>>> +	u_int delta, gen;
>>>
>>>  	do {
>>>  		th = timehands;
>>>  		gen = atomic_load_acq_int(&th->th_generation);
>>>  		*bt = th->th_offset;
>>> -		bintime_addx(bt, th->th_scale * tc_delta(th));
>>> +		scale = th->th_scale;
>>> +		delta = tc_delta(th);
>>> +		if (fls(scale) + fls(delta) > 63) {
>>> +			x = (scale >> 32) * delta;
>>> +			scale &= UINT_MAX;
>>
>> The following two lines confuse me overall:
>>
>>> +			bt->sec += x >> 32;
>>> +			bintime_addx(bt, x << 32);
>>
>> bintime_addx does:
>>
>> static __inline void
>> bintime_addx(struct bintime *_bt, uint64_t _x)
>> {
>> 	uint64_t _u;
>>
>> 	_u = _bt->frac;
>> 	_bt->frac += _x;
>> 	if (_u > _bt->frac)
>> 		_bt->sec++;
>> }
>>
>> So I'd expect:
>
> I forgot to indicate the context: when x!=0
>
>> 	bintime_addx(bt, x << 32)
>>
>> to find _u > _bt->frac and to also do _bt->sec++ .
>> So overall (as a means of summarizing for
>> bt->sec):

How I got _u > _bt->frac for that call in general when
the call's (x<<32) != 0 I do not know. So ignore
my stupid, mistaken questions.

>> 	bt->sec += (x >> 32) + 1;
>>
>> Is that the intent?
>>
>>
>>> +		}
>>> +		bintime_addx(bt, scale * delta);
>>>  		atomic_thread_fence_acq();
>>>  	} while (gen == 0 || gen != th->th_generation);
>>>  }
>>> @@ -388,13 +397,22 @@ void
>>>  bintime(struct bintime *bt)
>>>  {
>>>  	struct timehands *th;
>>> -	u_int gen;
>>> +	uint64_t scale, x;
>>> +	u_int delta, gen;
>>>
>>>  	do {
>>>  		th = timehands;
>>>  		gen = atomic_load_acq_int(&th->th_generation);
>>>  		*bt = th->th_bintime;
>>> -		bintime_addx(bt, th->th_scale * tc_delta(th));
>>> +		scale = th->th_scale;
>>> +		delta = tc_delta(th);
>>> +		if (fls(scale) + fls(delta) > 63) {
>>> +			x = (scale >> 32) * delta;
>>> +			scale &= UINT_MAX;
>>
>> The same for the below two lines:
>>
>>> +			bt->sec += x >> 32;
>>> +			bintime_addx(bt, x << 32);
>>
>>
>>> +		}
>>> +		bintime_addx(bt, scale * delta);
>>>  		atomic_thread_fence_acq();
>>>  	} while (gen == 0 || gen != th->th_generation);
>>>  }
>>
>
>

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went away in
early 2018-Mar)

From owner-freebsd-ppc@freebsd.org Fri Mar 1 01:55:18 2019
Return-Path:
Delivered-To: freebsd-ppc@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id EC6F7152253B for ; Fri, 1 Mar 2019 01:55:17 +0000 (UTC) (envelope-from marklmi@yahoo.com)
Received: from sonic307-4.consmr.mail.bf2.yahoo.com (sonic307-4.consmr.mail.bf2.yahoo.com [74.6.134.43]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 754336D7EE for ; Fri, 1 Mar 2019 01:55:17 +0000 (UTC) (envelope-from marklmi@yahoo.com)
X-YMail-OSG: jT3IWfoVM1kZB7yR6G7_dt2A3l7AF_OhcbajmfnevZx6kFf3qqShpcHcmhpji.s HYDTKXOD9qz2AmFDVp8lAyy6UQ7Kpv1wvkjn33_flUtR1fbNRvyOqFhZADXPzfbd6zaqtL21zkzE bYNrmkdzyYuxPQB5dOayqklxAKJ1Ziigkg7A.bBwGTiitUpu.QPT3ArPZlpXF3bgB7.bTNOKaG15
6Zvba2h.5iLAAVXnWSS1Ds1U9NfN8..8456D5Sum0MrLakh.84UuXaZcsZXsjzZwRNzVdrLulNwl SZBRRzeUIR7L.hcfizvvInDZQTeVXonKm.qS1ZDm3Fz_1yACFsQoThkB5l8Uz4hKC3SxCQDqgrYV AtIFQeNKbLnL3h9yXyitzdZnRa3zhhkg2cvmkvneeCakN51Hbtq31kOAfrUaeY_p9wwNZDfFaueN I108fd3coL.vYEfk7iYUAk.lp.ik14.dEH1.4oLLHpMehLL8yk6mwd0oh9GsAZ_XazXax1zB2jXQ I_CjXlKy8.QVLggrWZispV9aEcBkgsKLhky1TfjyyLj_bze4fItbAAfuRCMRv9axu25osG8sYPtZ gEdYgi5YQsVKaRHclic5zM1Pfy74zq_U72phYCWHYH1VupGv50rxyxKPb3VyT7W2oV4g4hTanOX. tGT_QTUHpjcqb8VuaEc_gzYW8kJogZ4XN0KGkHiF7vw56KxjDh4561k9tq09vM0fMqksh8XFRW92 4_7v0JIA.adzqKwWZLduBkabtKGQtY0Zxf5LdFoyvhbAM3TS5efDOpFKAs0fB2sPuQdjgj8bNUf3 B94uga..BVVd905ybHOJhTiMJm13Kk1YAktKbNoTKu_D6def5NomYv1MjyzXSBPYWXJxk10VOS2j DXZoBPSQtSIcBWbogQJK5Yc8gCciF0BwbDhVcbh_um1dxFODs4Ln7vBGCqvuT9QbA90I7E9FeZqx yPuUeyqPwKIfX8IZpbtx9EWc73HTpPVUDMbkidYKd6sjoVOkF2mI5HK5pJjUIhRyIfAKjP50wM9I A1bBnbAmiqOnfHQLXLD7BR5NLryHCHwZ0yqaFfJqs0JOUE5sB8aRcBe3M6qmaLUuLv1MEjkeEFA- - Received: from sonic.gate.mail.ne1.yahoo.com by sonic307.consmr.mail.bf2.yahoo.com with HTTP; Fri, 1 Mar 2019 01:55:09 +0000 Received: from c-67-170-167-181.hsd1.or.comcast.net (EHLO [192.168.1.113]) ([67.170.167.181]) by smtp425.mail.bf1.yahoo.com (Oath Hermes SMTP Server) with ESMTPA ID 7611b58950e303ae4e9242192a9f1369; Fri, 01 Mar 2019 01:55:06 +0000 (UTC) Content-Type: text/plain; charset=us-ascii Mime-Version: 1.0 (Mac OS X Mail 12.2 \(3445.102.3\)) Subject: Re: powerpc64 head -r344018 stuck sleeping problems: th->th_scale * tc_delta(th) overflows unsigned 64 bits sometimes [patched failed] From: Mark Millard In-Reply-To: <20190228150811.GU2420@kib.kiev.ua> Date: Thu, 28 Feb 2019 17:55:04 -0800 Cc: freebsd-hackers Hackers , FreeBSD PowerPC ML Content-Transfer-Encoding: quoted-printable Message-Id: <962D78C3-65BE-40C1-BB50-A0088223C17B@yahoo.com> References: <20190228145542.GT2420@kib.kiev.ua> <20190228150811.GU2420@kib.kiev.ua> To: Konstantin Belousov X-Mailer: Apple Mail (2.3445.102.3) X-Rspamd-Queue-Id: 754336D7EE X-Spamd-Bar: ------ X-Spamd-Result: 
default: False [-6.99 / 15.00]; NEURAL_HAM_MEDIUM(-1.00)[-1.000,0]; NEURAL_HAM_LONG(-1.00)[-1.000,0]; REPLY(-4.00)[]; NEURAL_HAM_SHORT(-0.99)[-0.991,0]
X-BeenThere: freebsd-ppc@freebsd.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Porting FreeBSD to the PowerPC
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Fri, 01 Mar 2019 01:55:18 -0000

[The PowerMac becomes non-responsive for significant periods of time.]

On 2019-Feb-28, at 07:08, Konstantin Belousov wrote:

> On Thu, Feb 28, 2019 at 04:55:42PM +0200, Konstantin Belousov wrote:
>> On Thu, Feb 28, 2019 at 05:06:23AM -0800, Mark Millard via freebsd-ppc wrote:
>>> . . .
>>
>> . . .
>
> Of course I botched the formula, please try this instead:
>
> diff --git a/sys/kern/kern_tc.c b/sys/kern/kern_tc.c
> index 2656fb4d22f..fdd4f4f6a52 100644
> --- a/sys/kern/kern_tc.c
> +++ b/sys/kern/kern_tc.c
> @@ -355,13 +355,22 @@ void
>  binuptime(struct bintime *bt)
>  {
>  	struct timehands *th;
> -	u_int gen;
> +	uint64_t scale, x;
> +	u_int delta, gen;
>
>  	do {
>  		th = timehands;
>  		gen = atomic_load_acq_int(&th->th_generation);
>  		*bt = th->th_offset;
> -		bintime_addx(bt, th->th_scale * tc_delta(th));
> +		scale = th->th_scale;
> +		delta = tc_delta(th);
> +		if (fls(scale) + fls(delta) > 63) {
> +			x = (scale >> 32) * delta;
> +			scale &= UINT_MAX;
> +			bt->sec += x >> 32;
> +			bintime_addx(bt, x << 32);
> +		}
> +		bintime_addx(bt, scale * delta);
>  		atomic_thread_fence_acq();
>  	} while (gen == 0 || gen != th->th_generation);
>  }
> @@ -388,13 +397,22 @@ void
>  bintime(struct bintime *bt)
>  {
>  	struct timehands *th;
> -	u_int gen;
> +	uint64_t scale, x;
> +	u_int delta, gen;
>
>  	do {
>  		th = timehands;
>  		gen = atomic_load_acq_int(&th->th_generation);
>  		*bt = th->th_bintime;
> -		bintime_addx(bt, th->th_scale * tc_delta(th));
> +		scale = th->th_scale;
> +		delta = tc_delta(th);
> +		if (fls(scale) + fls(delta) > 63) {
> +			x = (scale >>
32) * delta; > + scale &=3D UINT_MAX; > + bt->sec +=3D x >> 32; > + bintime_addx(bt, x << 32); > + } > + bintime_addx(bt, scale * delta); > atomic_thread_fence_acq(); > } while (gen =3D=3D 0 || gen !=3D th->th_generation); > } The PowerPC G5 ends up not responsive for long periods and responsive for only very short periods, such as being able to type in a few letters and have them show up at the time. I've only barely started investigating what is going on and I'll be rechecking my instrumented variant for stupid mistakes and such. I"ll try your un-instrumented binuptime as well. As stands I'll be updating the kernel via booting a 2nd disk that is not being experimented with. My information gathering may not be very timely. =3D=3D=3D Mark Millard marklmi at yahoo.com ( dsl-only.net went away in early 2018-Mar) From owner-freebsd-ppc@freebsd.org Fri Mar 1 07:39:19 2019 Return-Path: Delivered-To: freebsd-ppc@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 336091511C80 for ; Fri, 1 Mar 2019 07:39:19 +0000 (UTC) (envelope-from marklmi@yahoo.com) Received: from sonic317-36.consmr.mail.ne1.yahoo.com (sonic317-36.consmr.mail.ne1.yahoo.com [66.163.184.47]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 018ED8EBB8 for ; Fri, 1 Mar 2019 07:39:17 +0000 (UTC) (envelope-from marklmi@yahoo.com) X-YMail-OSG: GQvOgkcVM1nNAdUTGoEw1LalJgadNmUZXhiKmIkRvgZI7BkoUJoHv34lA0Up7xQ w3HYK.VTKqcOo29ou5xBJW0z_4qWucPaM7U25UJON6jbWkDTH.RKfBw_iOj6X6JPW.QprdNaSTnp v08Tjov1e2STDGCipg30jM4.CFAweMHN_Y9AuJzFHMUwPMKsOjw74i4I4N5zoT9Rg2xRNhimLtf2 EvLpC_.j9203EhmF3DnnxPYEwNSCUT7O17PyxfkYffNBoujEAYY5R.YV3wyCwDGsVgwv9jOrCFwi tmew_b5luMNiugXoc4ZVJnq1HENueQ9j2FhpcDENRVYBWLvmyHHpEbP67JNeuyZyEGFEfYfljGBF hSslaqfN7gvS.tl__16S3EFAutsk9i48voOX0KwHwGJd1TAkZQYyWhMWvjCFuFJufJqNNPEzYkFR 
HN9yYMvcfgojXPJjABinpt0.xIlr6V7Ug2Z6SweEAVtq_HQtU360hRMUzyWqgbFKUJ8dNlLj87VP FgR.lUjr.R.7lRzOBVkJKo.QWtjzY2kjIn8.uqf_jNAGetbTFYzuRCIOWz45PcEkwlfXHXMkerHP tvPwdWUK8dHp6HgqXiQkda6TfwUfOCL1R31dGcwINTaAKttO_2XtO.EsGTCIwDtmazxxtoNBbNKu 6qpQXE34VRVkifFz.14NaBpt11CFll4EWAStl8tQNZ6G3Hqj0Q58HEBvHO8AkmgpSw57KEAfmeST K6p_Hwa3qachQG0MJ8EjJLS60DoS0aSu2UXJ4R0PRDMiJQZUF8v3u0lFIZ72G6NmKHEomxtJ8iRP 12AS6LCoriLvt_nS4exRvVThWf0M4t1E7j.6LL05cp.ahYpa2qQLU_.p0yUVMvdKHEYaEXWj0XwE kbtLBJuXKwp1nXuWaX9MvRzO44MrROBw6W3G12nb458gnlB6qqtr8pEt4YXzxmdgVGw2XLu4nKxF zUBQ1wLeCMkgaIezR5LyYOySgP4yEyOaYNJ3KHfzhtohCdinwHrtZu5piDcKGDjJmEntx1Dh2Ame SxyZSkPV.9oGVORlwNeHzrzsG_wvQ0iFPUJUUPC40PwDjSLYOON91vi6xc2I3gWCcr0dgUw-- Received: from sonic.gate.mail.ne1.yahoo.com by sonic317.consmr.mail.ne1.yahoo.com with HTTP; Fri, 1 Mar 2019 07:39:11 +0000 Received: from c-67-170-167-181.hsd1.or.comcast.net (EHLO [192.168.1.115]) ([67.170.167.181]) by smtp401.mail.ne1.yahoo.com (Oath Hermes SMTP Server) with ESMTPA ID e0b5cb1f365c3e7b491adcee9e36fbab; Fri, 01 Mar 2019 07:39:08 +0000 (UTC) Content-Type: text/plain; charset=us-ascii Mime-Version: 1.0 (Mac OS X Mail 12.2 \(3445.102.3\)) Subject: Re: powerpc64 head -r344018 stuck sleeping problems: th->th_scale * tc_delta(th) overflows unsigned 64 bits sometimes [patched failed] From: Mark Millard In-Reply-To: <962D78C3-65BE-40C1-BB50-A0088223C17B@yahoo.com> Date: Thu, 28 Feb 2019 23:39:06 -0800 Cc: freebsd-hackers Hackers , FreeBSD PowerPC ML Content-Transfer-Encoding: quoted-printable Message-Id: <28C2BB0A-3DAA-4D18-A317-49A8DD52778F@yahoo.com> References: <20190228145542.GT2420@kib.kiev.ua> <20190228150811.GU2420@kib.kiev.ua> <962D78C3-65BE-40C1-BB50-A0088223C17B@yahoo.com> To: Konstantin Belousov X-Mailer: Apple Mail (2.3445.102.3) X-Rspamd-Queue-Id: 018ED8EBB8 X-Spamd-Bar: ++ X-Spamd-Result: default: False [2.99 / 15.00]; RCVD_VIA_SMTP_AUTH(0.00)[]; R_SPF_ALLOW(-0.20)[+ptr:yahoo.com]; MV_CASE(0.50)[]; FREEMAIL_FROM(0.00)[yahoo.com]; RCVD_COUNT_THREE(0.00)[3]; 
TO_DN_ALL(0.00)[]; DKIM_TRACE(0.00)[yahoo.com:+]; MX_GOOD(-0.01)[cached: mta6.am0.yahoodns.net]; DMARC_POLICY_ALLOW(-0.50)[yahoo.com,reject]; FROM_EQ_ENVFROM(0.00)[]; RCVD_TLS_LAST(0.00)[]; MIME_TRACE(0.00)[0:+]; FREEMAIL_ENVFROM(0.00)[yahoo.com]; ASN(0.00)[asn:36646, ipnet:66.163.184.0/21, country:US]; MID_RHS_MATCH_FROM(0.00)[]; DWL_DNSWL_NONE(0.00)[yahoo.com.dwl.dnswl.org : 127.0.5.0]; ARC_NA(0.00)[]; R_DKIM_ALLOW(-0.20)[yahoo.com:s=s2048]; FROM_HAS_DN(0.00)[]; RCPT_COUNT_THREE(0.00)[3]; NEURAL_SPAM_SHORT(0.97)[0.973,0]; MIME_GOOD(-0.10)[text/plain]; IP_SCORE(1.36)[ip: (4.52), ipnet: 66.163.184.0/21(1.30), asn: 36646(1.04), country: US(-0.07)]; NEURAL_SPAM_MEDIUM(0.53)[0.531,0]; TO_MATCH_ENVRCPT_SOME(0.00)[]; NEURAL_SPAM_LONG(0.64)[0.638,0]; RCVD_IN_DNSWL_NONE(0.00)[47.184.163.66.list.dnswl.org : 127.0.5.0] X-BeenThere: freebsd-ppc@freebsd.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Porting FreeBSD to the PowerPC List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 01 Mar 2019 07:39:19 -0000 [The new, trial code also has truncation occurring.] On 2019-Feb-28, at 17:55, Mark Millard wrote: > [The PowerMac becomes non-responsive for significant periods of time.] >=20 > On 2019-Feb-28, at 07:08, Konstantin Belousov = wrote: >=20 >> On Thu, Feb 28, 2019 at 04:55:42PM +0200, Konstantin Belousov wrote: >>> On Thu, Feb 28, 2019 at 05:06:23AM -0800, Mark Millard via = freebsd-ppc wrote: >>>> . . . >>>=20 >>> . . . 
>>
>> Of course I botched the formula, please try this instead:
>>
>> diff --git a/sys/kern/kern_tc.c b/sys/kern/kern_tc.c
>> index 2656fb4d22f..fdd4f4f6a52 100644
>> --- a/sys/kern/kern_tc.c
>> +++ b/sys/kern/kern_tc.c
>> @@ -355,13 +355,22 @@ void
>>  binuptime(struct bintime *bt)
>>  {
>>  	struct timehands *th;
>> -	u_int gen;
>> +	uint64_t scale, x;
>> +	u_int delta, gen;
>>
>>  	do {
>>  		th = timehands;
>>  		gen = atomic_load_acq_int(&th->th_generation);
>>  		*bt = th->th_offset;
>> -		bintime_addx(bt, th->th_scale * tc_delta(th));
>> +		scale = th->th_scale;
>> +		delta = tc_delta(th);
>> +		if (fls(scale) + fls(delta) > 63) {
>> +			x = (scale >> 32) * delta;
>> +			scale &= UINT_MAX;
>> +			bt->sec += x >> 32;
>> +			bintime_addx(bt, x << 32);
>> +		}
>> +		bintime_addx(bt, scale * delta);
>>  		atomic_thread_fence_acq();
>>  	} while (gen == 0 || gen != th->th_generation);
>>  }
>> @@ -388,13 +397,22 @@ void
>>  bintime(struct bintime *bt)
>>  {
>>  	struct timehands *th;
>> -	u_int gen;
>> +	uint64_t scale, x;
>> +	u_int delta, gen;
>>
>>  	do {
>>  		th = timehands;
>>  		gen = atomic_load_acq_int(&th->th_generation);
>>  		*bt = th->th_bintime;
>> -		bintime_addx(bt, th->th_scale * tc_delta(th));
>> +		scale = th->th_scale;
>> +		delta = tc_delta(th);
>> +		if (fls(scale) + fls(delta) > 63) {
>> +			x = (scale >> 32) * delta;
>> +			scale &= UINT_MAX;
>> +			bt->sec += x >> 32;
>> +			bintime_addx(bt, x << 32);
>> +		}
>> +		bintime_addx(bt, scale * delta);
>>  		atomic_thread_fence_acq();
>>  	} while (gen == 0 || gen != th->th_generation);
>>  }
>
> The PowerPC G5 ends up not responsive for long periods and
> responsive for only very short periods, such as being able
> to type in a few letters and have them show up at the time.
>
> I've only barely started investigating what is going on
> and I'll be rechecking my instrumented variant for stupid
> mistakes and such. I'll try your un-instrumented binuptime
> as well.
>
> As stands I'll be updating the kernel via booting a 2nd
> disk that is not being experimented with.
>
> My information gathering may not be very timely.
>

Live experimenting and inspection has proved problematical.
Below I experiment with the prior scale_factor and tim_diff
figures from my original code that recorded such, but showing
some of what your new code does for them. (In part the below
is text from the original list submittal to have a context.)

Observed consistently for tc->tc_frequency:

tc->tc_frequency == 0x1fca055 (i.e., 33333333)
( tc->tc_counter_mask is 0xfffffffful as well. )

An example observation of diff_scaled having an overflowed
value was:

scale_factor == 0x80da2067ac
scale_factor*freq overflows unsigned, 64 bit representation.
tim_offset == 0x3da0eaeb
tim_cnt == 0x42dea3c4
tim_diff == 0x53db8d9
For reference: 0x1fc9d43 == 0xffffffffffffffffull/scale_factor
scaled_diff == 0xA353A5BF3FF780CC (truncated to 64 bits)

But for the new, trial code:

0x80da2067ac is 40 bits
0x53db8d9 is 27 bits
So 67 bits, more than 63. Then:

x
== (0x80da2067ac>>32) * 0x53db8d9
== 0x80 * 0x53db8d9
== 0x29EDC6C80

x>>32
== 0x2

x<<32
== 0x9EDC6C8000000000 (limited to 64 bits)
Note the truncation of: 0x29EDC6C8000000000.

Thus the "bintime_addx(bt, x << 32)" is still
based on a truncated value.

I'll not bother with the other two examples unless
you want such.

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went away in
early 2018-Mar)

From owner-freebsd-ppc@freebsd.org Fri Mar 1 11:27:26 2019
Date: Fri, 1 Mar 2019 13:27:17 +0200
From: Konstantin Belousov
To: Mark Millard, bde@freebsd.org
Cc: freebsd-hackers Hackers, FreeBSD PowerPC ML
Subject: Re: powerpc64 head -r344018 stuck sleeping problems: th->th_scale * tc_delta(th) overflows unsigned 64 bits sometimes [patched failed]
Message-ID: <20190301112717.GW2420@kib.kiev.ua>
References: <20190228145542.GT2420@kib.kiev.ua> <20190228150811.GU2420@kib.kiev.ua> <962D78C3-65BE-40C1-BB50-A0088223C17B@yahoo.com> <28C2BB0A-3DAA-4D18-A317-49A8DD52778F@yahoo.com>
In-Reply-To: <28C2BB0A-3DAA-4D18-A317-49A8DD52778F@yahoo.com>

On Thu, Feb 28, 2019 at 11:39:06PM -0800, Mark Millard wrote:
> [The new, trial code also has truncation occurring.]
In fact no, I do not think it is.

> An example observation of diff_scaled having an overflowed
> value was:
>
> scale_factor == 0x80da2067ac
> scale_factor*freq overflows unsigned, 64 bit representation.
> tim_offset == 0x3da0eaeb
> tim_cnt == 0x42dea3c4
> tim_diff == 0x53db8d9
> For reference: 0x1fc9d43 == 0xffffffffffffffffull/scale_factor
> scaled_diff == 0xA353A5BF3FF780CC (truncated to 64 bits)
>
> But for the new, trial code:
>
> 0x80da2067ac is 40 bits
> 0x53db8d9 is 27 bits
> So 67 bits, more than 63. Then:
>
> x
> == (0x80da2067ac>>32) * 0x53db8d9
> == 0x80 * 0x53db8d9
> == 0x29EDC6C80
>
> x>>32
> == 0x2
>
> x<<32
> == 0x9EDC6C8000000000 (limited to 64 bits)
> Note the truncation of: 0x29EDC6C8000000000.
Right, this is how the patch is supposed to work. Note that the overflow
bits 'lost' due to overflow of the left shift are the same bits that are
used to increment bt->sec:
	bt->sec += x >> 32;
So the 2 seconds are accounted for.

>
> Thus the "bintime_addx(bt, x << 32)" is still
> based on a truncated value.
>

I must admit that 2 seconds of interval where the timehands were
not updated is too much. This might be the real cause of all ppc
troubles. I tried to see if the overflow case is possible on amd64,
and did not get a single case of the '> 63' branch executed during the
/usr/tests/lib/libc run.

Actually, the same overflow-prone code exists in libc, so below is the
updated patch:
- I added __predict_false()
- libc multiplication is also done separately for high-order bits.
(ffclock counterpart is still pending).

diff --git a/lib/libc/sys/__vdso_gettimeofday.c b/lib/libc/sys/__vdso_gettimeofday.c
index 3749e0473af..a14576988ff 100644
--- a/lib/libc/sys/__vdso_gettimeofday.c
+++ b/lib/libc/sys/__vdso_gettimeofday.c
@@ -32,6 +32,8 @@ __FBSDID("$FreeBSD$");
 #include
 #include
 #include
+#include
+#include
 #include
 #include
 #include "libc_private.h"
@@ -62,6 +64,7 @@ binuptime(struct bintime *bt, struct vdso_timekeep *tk, int abs)
 {
 	struct vdso_timehands *th;
 	uint32_t curr, gen;
+	uint64_t scale, x;
 	u_int delta;
 	int error;

@@ -78,7 +81,14 @@ binuptime(struct bintime *bt, struct vdso_timekeep *tk, int abs)
 			continue;
 		if (error != 0)
 			return (error);
-		bintime_addx(bt, th->th_scale * delta);
+		scale = th->th_scale;
+		if (__predict_false(fls(scale) + fls(delta) > 63)) {
+			x = (scale >> 32) * delta;
+			scale &= UINT_MAX;
+			bt->sec += x >> 32;
+			bintime_addx(bt, x << 32);
+		}
+		bintime_addx(bt, scale * delta);
 		if (abs)
 			bintime_add(bt, &th->th_boottime);

diff --git a/sys/kern/kern_tc.c b/sys/kern/kern_tc.c
index 2656fb4d22f..be75781e000 100644
--- a/sys/kern/kern_tc.c
+++ b/sys/kern/kern_tc.c
@@ -355,13 +355,22 @@ void
 binuptime(struct bintime *bt)
 {
 	struct timehands *th;
-	u_int gen;
+	uint64_t scale, x;
+	u_int delta, gen;

 	do {
 		th = timehands;
 		gen = atomic_load_acq_int(&th->th_generation);
 		*bt = th->th_offset;
-		bintime_addx(bt, th->th_scale * tc_delta(th));
+		scale = th->th_scale;
+		delta = tc_delta(th);
+		if (__predict_false(fls(scale) + fls(delta) > 63)) {
+			x = (scale >> 32) * delta;
+			scale &= UINT_MAX;
+			bt->sec += x >> 32;
+			bintime_addx(bt, x << 32);
+		}
+		bintime_addx(bt, scale * delta);
 		atomic_thread_fence_acq();
 	} while (gen == 0 || gen != th->th_generation);
 }
@@ -388,13 +397,22 @@ void
 bintime(struct bintime *bt)
 {
 	struct timehands *th;
-	u_int gen;
+	uint64_t scale, x;
+	u_int delta, gen;

 	do {
 		th = timehands;
 		gen = atomic_load_acq_int(&th->th_generation);
 		*bt = th->th_bintime;
-		bintime_addx(bt, th->th_scale * tc_delta(th));
+		scale = th->th_scale;
+		delta = tc_delta(th);
+		if (__predict_false(fls(scale) + fls(delta) > 63)) {
+			x = (scale >> 32) * delta;
+			scale &= UINT_MAX;
+			bt->sec += x >> 32;
+			bintime_addx(bt, x << 32);
+		}
+		bintime_addx(bt, scale * delta);
 		atomic_thread_fence_acq();
 	} while (gen == 0 || gen != th->th_generation);
 }

From owner-freebsd-ppc@freebsd.org Fri Mar 1 18:41:02 2019
Date: Sat, 2 Mar 2019 05:40:58 +1100 (EST)
From: Bruce Evans
To: Konstantin Belousov
cc: Mark Millard, freebsd-hackers Hackers, FreeBSD PowerPC ML
Subject: Re: powerpc64 head -r344018 stuck sleeping problems: th->th_scale * tc_delta(th) overflows unsigned 64 bits sometimes [patched failed]
In-Reply-To: <20190301112717.GW2420@kib.kiev.ua>
Message-ID: <20190302043936.A4444@besplex.bde.org>
References: <20190228145542.GT2420@kib.kiev.ua> <20190228150811.GU2420@kib.kiev.ua> <962D78C3-65BE-40C1-BB50-A0088223C17B@yahoo.com> <28C2BB0A-3DAA-4D18-A317-49A8DD52778F@yahoo.com> <20190301112717.GW2420@kib.kiev.ua>

On Fri, 1 Mar 2019, Konstantin Belousov wrote:

> On Thu, Feb 28, 2019 at 11:39:06PM -0800, Mark Millard wrote:
>> [The new, trial code also has truncation occurring.]
> In fact no, I do not think it is.
>
>> An example observation of diff_scaled having an overflowed
>> value was:
>>
>> scale_factor == 0x80da2067ac
>> scale_factor*freq overflows unsigned, 64 bit representation.
>> tim_offset == 0x3da0eaeb
>> tim_cnt == 0x42dea3c4
>> tim_diff == 0x53db8d9
>> For reference: 0x1fc9d43 == 0xffffffffffffffffull/scale_factor
>> scaled_diff == 0xA353A5BF3FF780CC (truncated to 64 bits)
>>
>> But for the new, trial code:
>>
>> 0x80da2067ac is 40 bits
>> 0x53db8d9 is 27 bits
>> So 67 bits, more than 63. Then:
>>
>> x
>> == (0x80da2067ac>>32) * 0x53db8d9
>> == 0x80 * 0x53db8d9
>> == 0x29EDC6C80
>>
>> x>>32
>> == 0x2
>>
>> x<<32
>> == 0x9EDC6C8000000000 (limited to 64 bits)
>> Note the truncation of: 0x29EDC6C8000000000.
> Right, this is how the patch is supposed to work. Note that the overflow
> bits 'lost' due to overflow of the left shift are the same bits that are
> used to increment bt->sec:
> 	bt->sec += x >> 32;
> So the 2 seconds are accounted for.
>
>>
>> Thus the "bintime_addx(bt, x << 32)" is still
>> based on a truncated value.
>
> I must admit that 2 seconds of interval where the timehands were
> not updated is too much. This might be the real cause of all ppc
> troubles. I tried to see if the overflow case is possible on amd64,
> and did not get a single case of the '> 63' branch executed during the
> /usr/tests/lib/libc run.

The algorithm requires the update interval to be less than 1 second.
th_scale is 2**64 / tc_frequency, so whatever tc_frequency is, after
1 second the value of the multiplication is approximately 2**64 so
it overflows about then (depending on rounding).

The most useful timecounters are TSC's, and these give another overflow
in tc_delta() after 1 second when their frequency is 4 GHz (except the
bogus TSC-low timecounter reduces the frequency to below 2 binary GHz,
so the usual case is overflow after 2 seconds).

> Actually, the same overflow-prone code exists in libc, so below is the
> updated patch:
> - I added __predict_false()
> - libc multiplication is also done separately for high-order bits.
> (ffclock counterpart is still pending).
>
> diff --git a/lib/libc/sys/__vdso_gettimeofday.c b/lib/libc/sys/__vdso_gettimeofday.c
> index 3749e0473af..a14576988ff 100644
> --- a/lib/libc/sys/__vdso_gettimeofday.c
> +++ b/lib/libc/sys/__vdso_gettimeofday.c
> @@ -32,6 +32,8 @@ __FBSDID("$FreeBSD$");
>  #include
>  #include
>  #include
> +#include
> +#include
>  #include
>  #include
>  #include "libc_private.h"
> @@ -62,6 +64,7 @@ binuptime(struct bintime *bt, struct vdso_timekeep *tk, int abs)
>  {
>  	struct vdso_timehands *th;
>  	uint32_t curr, gen;
> +	uint64_t scale, x;
>  	u_int delta;
>  	int error;
>
> @@ -78,7 +81,14 @@ binuptime(struct bintime *bt, struct vdso_timekeep *tk, int abs)
>  			continue;
>  		if (error != 0)
>  			return (error);
> -		bintime_addx(bt, th->th_scale * delta);
> +		scale = th->th_scale;
> +		if (__predict_false(fls(scale) + fls(delta) > 63)) {

This is unnecessarily pessimal. Updates must be frequent enough to prevent
tc_delta() overflowing, and it is even easier to arrange that this
multiplication doesn't overflow (since the necessary update interval for
the latter is constant).

`scale' is 64 bits, so fls(scale) is broken on 32-bit arches, and
flsll(scale) is an especially large pessimization. I saw this on my
calcru1() fixes -- the flsll()s take almost as long as long long
divisions when they use the pessimal C versions.

The algorithm requires tc_delta() to be only 32 bits, since otherwise the
multiplication would be 64 x 64 bits so would be much slower and harder
to write.

If tc_frequency is far above 4GHz, then th_scale is far below 4G, so the
scaling is not so accurate. But 0.25 parts per billion is much more than
enough. Even 1 part per million is enough for a TSC, since TSC instability
is more than 1ppm. The overflows could be pushed off to 1024 seconds by
dividing by 1024 at suitable places. A 32-bit scale times a 64-bit delta
would be simple compared with both 64 bits (much like the code here).

The code here can be optimized using values calculated at initialization
time instead of fls*(). The overflow threshold for delta is approximately
2**64 / tc_frequency.

> +			x = (scale >> 32) * delta;
> +			scale &= UINT_MAX;
> +			bt->sec += x >> 32;
> +			bintime_addx(bt, x << 32);
> +		}
> +		bintime_addx(bt, scale * delta);
>  		if (abs)
>  			bintime_add(bt, &th->th_boottime);

When the timecounter is the i8254, as it often was when timecounters
were new, tc_windup() had to be called more often than every i8254 rollover
(in practice once every hardclock tick), partly to keep tc_delta() small
(since rollover gives a form of overflow). This was not so easy to arrange.
It requires not losing any hardclock ticks and also not having any with
high latency and also complications to detect the rollover when there is
small latency. Most hardware is easier to handle now. With tickless
kernels, hardclock() is often not called for about 1 second, but it must
be called at least that often to prevent the overflow here.

Bruce

From owner-freebsd-ppc@freebsd.org Fri Mar 1 19:42:31 2019
Date: Fri, 1 Mar 2019 21:42:17 +0200
From: Konstantin Belousov
To: Bruce Evans
Cc: Mark Millard, freebsd-hackers Hackers, FreeBSD PowerPC ML
Subject: Re: powerpc64 head -r344018 stuck sleeping problems: th->th_scale * tc_delta(th) overflows unsigned 64 bits sometimes [patched failed]
Message-ID: <20190301194217.GB68879@kib.kiev.ua>
References: <20190228145542.GT2420@kib.kiev.ua> <20190228150811.GU2420@kib.kiev.ua> <962D78C3-65BE-40C1-BB50-A0088223C17B@yahoo.com> <28C2BB0A-3DAA-4D18-A317-49A8DD52778F@yahoo.com> <20190301112717.GW2420@kib.kiev.ua> <20190302043936.A4444@besplex.bde.org>
In-Reply-To: <20190302043936.A4444@besplex.bde.org>

On Sat, Mar 02, 2019 at 05:40:58AM +1100, Bruce Evans wrote:
> On Fri, 1 Mar 2019, Konstantin Belousov wrote:
>
> > On Thu, Feb 28, 2019 at 11:39:06PM -0800, Mark Millard wrote:
> >> [The new, trial code also has truncation occurring.]
> > In fact no, I do not think it is.
> >
> >> An example observation of diff_scaled having an overflowed
> >> value was:
> >>
> >> scale_factor == 0x80da2067ac
> >> scale_factor*freq overflows unsigned, 64 bit representation.
> >> tim_offset == 0x3da0eaeb
> >> tim_cnt == 0x42dea3c4
> >> tim_diff == 0x53db8d9
> >> For reference: 0x1fc9d43 == 0xffffffffffffffffull/scale_factor
> >> scaled_diff == 0xA353A5BF3FF780CC (truncated to 64 bits)
> >>
> >> But for the new, trial code:
> >>
> >> 0x80da2067ac is 40 bits
> >> 0x53db8d9 is 27 bits
> >> So 67 bits, more than 63. Then:
> >>
> >> x
> >> == (0x80da2067ac>>32) * 0x53db8d9
> >> == 0x80 * 0x53db8d9
> >> == 0x29EDC6C80
> >>
> >> x>>32
> >> == 0x2
> >>
> >> x<<32
> >> == 0x9EDC6C8000000000 (limited to 64 bits)
> >> Note the truncation of: 0x29EDC6C8000000000.
> > Right, this is how the patch is supposed to work. Note that the overflow
> > bits 'lost' due to overflow of the left shift are the same bits that are
> > used to increment bt->sec:
> > 	bt->sec += x >> 32;
> > So the 2 seconds are accounted for.
> >
> >>
> >> Thus the "bintime_addx(bt, x << 32)" is still
> >> based on a truncated value.
> >
> > I must admit that 2 seconds of interval where the timehands were
> > not updated is too much. This might be the real cause of all ppc
> > troubles. I tried to see if the overflow case is possible on amd64,
> > and did not get a single case of the '> 63' branch executed during the
> > /usr/tests/lib/libc run.
>
> The algorithm requires the update interval to be less than 1 second.
> th_scale is 2**64 / tc_frequency, so whatever tc_frequency is, after
> 1 second the value of the multiplication is approximately 2**64 so
> it overflows about then (depending on rounding).
>
> The most useful timecounters are TSC's, and these give another overflow
> in tc_delta() after 1 second when their frequency is 4 GHz (except the
> bogus TSC-low timecounter reduces the frequency to below 2 binary GHz,
> so the usual case is overflow after 2 seconds).
As I said, I was unable to trigger the overflow on amd64.

>
> > Actually, the same overflow-prone code exists in libc, so below is the
> > updated patch:
> > - I added __predict_false()
> > - libc multiplication is also done separately for high-order bits.
> > (ffclock counterpart is still pending).
> >
> > diff --git a/lib/libc/sys/__vdso_gettimeofday.c b/lib/libc/sys/__vdso_gettimeofday.c
> > index 3749e0473af..a14576988ff 100644
> > --- a/lib/libc/sys/__vdso_gettimeofday.c
> > +++ b/lib/libc/sys/__vdso_gettimeofday.c
> > @@ -32,6 +32,8 @@ __FBSDID("$FreeBSD$");
> >  #include
> >  #include
> >  #include
> > +#include
> > +#include
> >  #include
> >  #include
> >  #include "libc_private.h"
> > @@ -62,6 +64,7 @@ binuptime(struct bintime *bt, struct vdso_timekeep *tk, int abs)
> >  {
> >  	struct vdso_timehands *th;
> >  	uint32_t curr, gen;
> > +	uint64_t scale, x;
> >  	u_int delta;
> >  	int error;
> >
> > @@ -78,7 +81,14 @@ binuptime(struct bintime *bt, struct vdso_timekeep *tk, int abs)
> >  			continue;
> >  		if (error != 0)
> >  			return (error);
> > -		bintime_addx(bt, th->th_scale * delta);
> > +		scale = th->th_scale;
> > +		if (__predict_false(fls(scale) + fls(delta) > 63)) {
>
> This is unnecessarily pessimal. Updates must be frequent enough to prevent
> tc_delta() overflowing, and it is even easier to arrange that this
> multiplication doesn't overflow (since the necessary update interval for
> the latter is constant).
>
> `scale' is 64 bits, so fls(scale) is broken on 32-bit arches, and
> flsll(scale) is an especially large pessimization. I saw this on my
> calcru1() fixes -- the flsll()s take almost as long as long long
> divisions when they use the pessimal C versions.
Ok, fixed.

>
> The algorithm requires tc_delta() to be only 32 bits, since otherwise the
> multiplication would be 64 x 64 bits so would be much slower and harder
> to write.
>
> If tc_frequency is far above 4GHz, then th_scale is far below 4G, so the
> scaling is not so accurate. But 0.25 parts per billion is much more than
> enough. Even 1 part per million is enough for a TSC, since TSC instability
> is more than 1ppm. The overflows could be pushed off to 1024 seconds by
> dividing by 1024 at suitable places. A 32-bit scale times a 64-bit delta
> would be simple compared with both 64 bits (much like the code here).
>
> The code here can be optimized using values calculated at initialization
> time instead of fls*(). The overflow threshold for delta is approximately
> 2**64 / tc_frequency.
>
> > +			x = (scale >> 32) * delta;
> > +			scale &= UINT_MAX;
> > +			bt->sec += x >> 32;
> > +			bintime_addx(bt, x << 32);
> > +		}
> > +		bintime_addx(bt, scale * delta);
> >  		if (abs)
> >  			bintime_add(bt, &th->th_boottime);
>
> When the timecounter is the i8254, as it often was when timecounters
> were new, tc_windup() had to be called more often than every i8254 rollover
> (in practice once every hardclock tick), partly to keep tc_delta() small
> (since rollover gives a form of overflow). This was not so easy to arrange.
> It requires not losing any hardclock ticks and also not having any with
> high latency and also complications to detect the rollover when there is
> small latency. Most hardware is easier to handle now. With tickless
> kernels, hardclock() is often not called for about 1 second, but it must
> be called at least that often to prevent the overflow here.

Updated patch.

diff --git a/lib/libc/sys/__vdso_gettimeofday.c b/lib/libc/sys/__vdso_gettimeofday.c
index 3749e0473af..fdefda08e39 100644
--- a/lib/libc/sys/__vdso_gettimeofday.c
+++ b/lib/libc/sys/__vdso_gettimeofday.c
@@ -32,6 +32,8 @@ __FBSDID("$FreeBSD$");
 #include
 #include
 #include
+#include
+#include
 #include
 #include
 #include "libc_private.h"
@@ -62,7 +64,8 @@ binuptime(struct bintime *bt, struct vdso_timekeep *tk, int abs)
 {
 	struct vdso_timehands *th;
 	uint32_t curr, gen;
-	u_int delta;
+	uint64_t scale, x;
+	u_int delta, scale_bits;
 	int error;

 	do {
@@ -78,7 +81,19 @@ binuptime(struct bintime *bt, struct vdso_timekeep *tk, int abs)
 			continue;
 		if (error != 0)
 			return (error);
-		bintime_addx(bt, th->th_scale * delta);
+		scale = th->th_scale;
+#ifdef _LP64
+		scale_bits = ffsl(scale);
+#else
+		scale_bits = ffsll(scale);
+#endif
+		if (__predict_false(scale_bits + fls(delta) > 63)) {
+			x = (scale >> 32) * delta;
+			scale &= UINT_MAX;
+			bt->sec += x >> 32;
+			bintime_addx(bt, x << 32);
+		}
+		bintime_addx(bt, scale * delta);
 		if (abs)
 			bintime_add(bt, &th->th_boottime);

diff --git a/sys/kern/kern_tc.c b/sys/kern/kern_tc.c
index 2656fb4d22f..eedea5183c0 100644
--- a/sys/kern/kern_tc.c
+++ b/sys/kern/kern_tc.c
@@ -72,6 +72,7 @@ struct timehands {
 	struct timecounter *th_counter;
 	int64_t th_adjustment;
 	uint64_t th_scale;
+	u_int th_scale_bits;
 	u_int th_offset_count;
 	struct bintime th_offset;
 	struct bintime th_bintime;
@@ -355,13 +356,22 @@ void
 binuptime(struct bintime *bt)
 {
 	struct timehands *th;
-	u_int gen;
+	uint64_t scale, x;
+	u_int delta, gen;

 	do {
 		th = timehands;
 		gen = atomic_load_acq_int(&th->th_generation);
 		*bt = th->th_offset;
-		bintime_addx(bt, th->th_scale * tc_delta(th));
+		scale = th->th_scale;
+		delta = tc_delta(th);
+		if (__predict_false(th->th_scale_bits + fls(delta) > 63)) {
+			x = (scale >> 32) * delta;
+			scale &= UINT_MAX;
+			bt->sec += x >> 32;
+			bintime_addx(bt, x << 32);
+		}
+		bintime_addx(bt, scale * delta);
 		atomic_thread_fence_acq();
 	} while (gen == 0 || gen != th->th_generation);
 }
@@ -388,13 +398,22 @@ void
 bintime(struct bintime *bt)
 {
 	struct timehands *th;
-	u_int gen;
+	uint64_t scale, x;
+	u_int delta, gen;

 	do {
 		th = timehands;
 		gen = atomic_load_acq_int(&th->th_generation);
 		*bt = th->th_bintime;
-		bintime_addx(bt, th->th_scale * tc_delta(th));
+		scale = th->th_scale;
+		delta = tc_delta(th);
+		if (__predict_false(th->th_scale_bits + fls(delta) > 63)) {
+			x = (scale >> 32) * delta;
+			scale &= UINT_MAX;
+			bt->sec += x >> 32;
+			bintime_addx(bt, x << 32);
+		}
+		bintime_addx(bt, scale * delta);
 		atomic_thread_fence_acq();
 	} while (gen == 0 || gen != th->th_generation);
 }
@@ -1464,6 +1483,11 @@ tc_windup(struct bintime *new_boottimebin)
 	scale += (th->th_adjustment / 1024) * 2199;
 	scale /= th->th_counter->tc_frequency;
 	th->th_scale = scale * 2;
+#ifdef _LP64
+	th->th_scale_bits = ffsl(th->th_scale);
+#else
+	th->th_scale_bits = ffsll(th->th_scale);
+#endif

 	/*
 	 * Now that the struct timehands is again consistent, set the new

From owner-freebsd-ppc@freebsd.org Fri Mar 1 20:15:46 2019
Subject: Re: powerpc64 head -r344018 stuck sleeping problems: th->th_scale * tc_delta(th) overflows unsigned 64 bits sometimes [patched failed]
From: Mark Millard
In-Reply-To: <20190302043936.A4444@besplex.bde.org>
Date: Fri, 1 Mar 2019 12:15:35 -0800
Cc: Konstantin Belousov, freebsd-hackers Hackers, FreeBSD PowerPC ML
Message-Id: <908615FE-0638-4A80-A3C9-6FC36219ECC3@yahoo.com>
References: <20190228145542.GT2420@kib.kiev.ua> <20190228150811.GU2420@kib.kiev.ua> <962D78C3-65BE-40C1-BB50-A0088223C17B@yahoo.com> <28C2BB0A-3DAA-4D18-A317-49A8DD52778F@yahoo.com> <20190301112717.GW2420@kib.kiev.ua> <20190302043936.A4444@besplex.bde.org>
To: Bruce Evans

On 2019-Mar-1, at 10:40, Bruce Evans wrote:

> On Fri, 1 Mar 2019, Konstantin Belousov wrote:
>
>> On Thu, Feb 28, 2019 at 11:39:06PM -0800, Mark Millard wrote:
>>> [The new, trial code also has truncation occurring.]
>> In fact no, I do not think it is.
>>
>>> An example observation of diff_scaled having an overflowed
>>> value was:
>>>
>>> scale_factor == 0x80da2067ac
>>> scale_factor*freq overflows unsigned, 64 bit representation.
>>> tim_offset == 0x3da0eaeb
>>> tim_cnt == 0x42dea3c4
>>> tim_diff == 0x53db8d9
>>> For reference: 0x1fc9d43 == 0xffffffffffffffffull/scale_factor
>>> scaled_diff == 0xA353A5BF3FF780CC (truncated to 64 bits)
>>>
>>> But for the new, trial code:
>>>
>>> 0x80da2067ac is 40 bits
>>> 0x53db8d9 is 27 bits
>>> So 67 bits, more than 63. Then:
>>>
>>> x
>>> == (0x80da2067ac>>32) * 0x53db8d9
>>> == 0x80 * 0x53db8d9
>>> == 0x29EDC6C80
>>>
>>> x>>32
>>> == 0x2
>>>
>>> x<<32
>>> == 0x9EDC6C8000000000 (limited to 64 bits)
>>> Note the truncation of: 0x29EDC6C8000000000.
>> Right, this is how the patch is supposed to work. Note that the overflow
>> bits 'lost' due to overflow of the left shift are the same bits that are
>> used to increment bt->sec:
>> bt->sec += x >> 32;
>> So the 2 seconds are accounted for.
>>
>>>
>>> Thus the "bintime_addx(bt, x << 32)" is still
>>> based on a truncated value.
>>
>> I must admit that 2 seconds of interval where the timehands were
>> not updated is too much. This might be the real cause of all ppc
>> troubles. I tried to see if the overflow case is possible on amd64,
>> and did not get a single case of the '> 63' branch executed during the
>> /usr/tests/lib/libc run.
>
> The algorithm requires the update interval to be less than 1 second.
> th_scale is 2**64 / tc_frequency, so whatever tc_frequency is, after
> 1 second the value of the multiplication is approximately 2**64 so
> it overflows about then (depending on rounding).

I've not tracked down evidence below binuptime on powerpc64 yet.
I'm not sure how much of the substructure I can investigate.

The context here is 2 sockets, each with 2 cores.

> The most useful timecounters are TSC's, and these give another overflow
> in tc_delta() after 1 second when their frequency is 4 GHz (except the
> bogus TSC-low timecounter reduces the frequency to below 2 binary GHz,
> so the usual case is overflow after 2 seconds).
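
As a cross-check of the arithmetic quoted above, here is a minimal sketch of the split multiplication applied to the observed values. It is an illustration, not the patch itself: `struct split_result` and `scale_delta_split()` are hypothetical names, and only the shifts and the 32-bit mask are taken from the code under discussion.

```c
#include <stdint.h>

/* Hypothetical stand-in for the sec/frac pair of struct bintime. */
struct split_result {
	uint64_t sec;	/* whole seconds, i.e. multiples of 2**64 fractions */
	uint64_t frac;	/* fraction of a second in units of 2**-64 s */
};

/*
 * Compute scale * delta without losing the bits above 2**64: the high
 * 32 bits of scale are multiplied separately, as in the patch.
 */
static struct split_result
scale_delta_split(uint64_t scale, uint32_t delta)
{
	struct split_result r;
	uint64_t x, lo;

	x = (scale >> 32) * delta;		/* high part, weight 2**32 */
	lo = (scale & 0xffffffffULL) * delta;	/* low part, fits in 64 bits */

	r.sec = x >> 32;			/* bits at or above 2**64 */
	r.frac = x << 32;			/* bits below 2**64 */
	r.frac += lo;
	if (r.frac < lo)			/* carry out of the fraction */
		r.sec++;
	return (r);
}
```

With scale == 0x80da2067ac and delta == 0x53db8d9 this yields sec == 2 and frac == 0xA353A5BF3FF780CC. The fraction equals the truncated scaled_diff reported above, and the 2 seconds are exactly what the plain 64-bit multiplication dropped.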

The wording suggests an amd64/i386 context but my report was
for powerpc64, specifically for old PowerMac G5's. (I currently
have access to only one but I've seen the behavior on others
in the past.) FreeBSD reports:

# sysctl kern.timecounter
kern.timecounter.tc.timebase.quality: 0
kern.timecounter.tc.timebase.frequency: 33333333
kern.timecounter.tc.timebase.counter: 1831468476
kern.timecounter.tc.timebase.mask: 4294967295
kern.timecounter.stepwarnings: 0
kern.timecounter.alloweddeviation: 5
kern.timecounter.hardware: timebase
kern.timecounter.choice: timebase(0) dummy(-1000000)
kern.timecounter.tick: 1
kern.timecounter.fast_gettime: 1

FreeBSD uses the lower 32 bits of the tbr (via mftb). As I understand
it, there are problems with how closely the tbr's can be kept in
agreement.

>> Actually, the same overflow-prone code exists in libc, so below is the
>> updated patch:
>> - I added __predict_false()
>> - libc multiplication is also done separately for high-order bits.
>> (ffclock counterpart is still pending).
>>
>> diff --git a/lib/libc/sys/__vdso_gettimeofday.c b/lib/libc/sys/__vdso_gettimeofday.c
>> index 3749e0473af..a14576988ff 100644
>> --- a/lib/libc/sys/__vdso_gettimeofday.c
>> +++ b/lib/libc/sys/__vdso_gettimeofday.c
>> @@ -32,6 +32,8 @@ __FBSDID("$FreeBSD$");
>>  #include
>>  #include
>>  #include
>> +#include
>> +#include
>>  #include
>>  #include
>>  #include "libc_private.h"
>> @@ -62,6 +64,7 @@ binuptime(struct bintime *bt, struct vdso_timekeep *tk, int abs)
>>  {
>>  	struct vdso_timehands *th;
>>  	uint32_t curr, gen;
>> +	uint64_t scale, x;
>>  	u_int delta;
>>  	int error;
>>
>> @@ -78,7 +81,14 @@ binuptime(struct bintime *bt, struct vdso_timekeep *tk, int abs)
>>  			continue;
>>  		if (error != 0)
>>  			return (error);
>> -		bintime_addx(bt, th->th_scale * delta);
>> +		scale = th->th_scale;
>> +		if (__predict_false(fls(scale) + fls(delta) > 63)) {
>
> This is unnecessarily pessimal.
> Updates must be frequent enough to prevent
> tc_delta() overflowing, and it is even easier to arrange that this
> multiplication doesn't overflow (since the necessary update interval for
> the latter is constant).
>
> `scale' is 64 bits, so fls(scale) is broken on 32-bit arches, and
> flsll(scale) is an especially large pessimization. I saw this on my
> calcru1() fixes -- the flsll()s take almost as long as long long
> divisions when they use the pessimal C versions.

Unlike i386's and amd64's:

static __inline __pure2 int
fls(int mask)
{

	return (mask == 0 ? mask : (int)bsrl((u_int)mask) + 1);
}

powerpc64's apparently use /usr/src/sys/libkern/fls.c 's:

int
fls(int mask)
{
	int bit;

	if (mask == 0)
		return (0);
	for (bit = 1; mask != 1; bit++)
		mask = (unsigned int)mask >> 1;
	return (bit);
}

(At least I did not find an alternate for the powerpc64 context.)

So, if I got that right, fls does a lot of looping on powerpc64.

> The algorithm requires tc_delta() to be only 32 bits, since otherwise the
> multiplication would be 64 x 64 bits so would be much slower and harder
> to write.
>
> If tc_frequency is far above 4GHz, then th_scale is far below 4G, so the
> scaling is not so accurate. But 0.25 parts per billion is much more than
> enough. Even 1 part per million is enough for a TSC, since TSC instability
> is more than 1ppm. The overflows could be pushed off to 1024 seconds by
> dividing by 1024 at suitable places. A 32-bit scale times a 64-bit delta
> would be simple compared with both 64 bits (much like the code here).
>
> The code here can be optimized using values calculated at initialization
> time instead of fls*(). The overflow threshold for delta is approximately
> 2**64 / tc_frequency.
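
The idea of replacing the per-read fls*() calls with a value computed once per windup could look roughly like the sketch below. Everything named here is hypothetical (`struct th_sketch`, `max_safe_delta`, `th_set_scale()`, `th_delta_is_large()`); the real struct timehands has many more fields and the committed code may differ. The only fact relied on is that th_scale * delta fits in 64 bits exactly when delta does not exceed UINT64_MAX / th_scale.

```c
#include <assert.h>
#include <stdint.h>

struct th_sketch {
	uint64_t th_scale;	 /* roughly 2**64 / frequency */
	uint64_t max_safe_delta; /* largest delta with no 64-bit overflow */
};

/* Done once per windup, so the division stays off the hot path. */
static void
th_set_scale(struct th_sketch *th, uint64_t scale)
{
	assert(scale != 0);
	th->th_scale = scale;
	th->max_safe_delta = UINT64_MAX / scale;
}

/* Hot path: one comparison instead of two bit scans. */
static int
th_delta_is_large(const struct th_sketch *th, uint32_t delta)
{
	return (delta > th->max_safe_delta);
}
```

For the scale seen in this thread, 0x80da2067ac, the quotient is 0x1fc9d43, about 33.3 million counts or roughly one second at the reported timebase frequency, so the observed delta of 0x53db8d9 is well past it.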
>
>> +			x = (scale >> 32) * delta;
>> +			scale &= UINT_MAX;
>> +			bt->sec += x >> 32;
>> +			bintime_addx(bt, x << 32);
>> +		}
>> +		bintime_addx(bt, scale * delta);
>>  		if (abs)
>>  			bintime_add(bt, &th->th_boottime);
>
> When the timecounter is the i8254, as it often was when timecounters
> were new, tc_windup() had to be called more often than every i8254 rollover
> (in practice once every hardclock tick), partly to keep tc_delta() small
> (since rollover gives a form of overflow). This was not so easy to arrange.
> It requires not losing any hardclock ticks and also not having any with
> high latency and also complications to detect the rollover when there is
> small latency. Most hardware is easier to handle now. With tickless
> kernels, hardclock() is often not called for about 1 second, but it must
> be called at least that often to prevent the overflow here.

Thanks for the notes. I've been wandering around in unfamiliar territory.

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went away in early 2018-Mar)

From owner-freebsd-ppc@freebsd.org Fri Mar 1 20:38:33 2019
Date: Sat, 2 Mar 2019 07:38:20 +1100 (EST)
From: Bruce Evans
To: Konstantin Belousov
cc: Bruce Evans, Mark Millard, freebsd-hackers Hackers, FreeBSD PowerPC ML
Subject: Re: powerpc64 head -r344018 stuck sleeping problems: th->th_scale * tc_delta(th) overflows unsigned 64 bits sometimes [patched failed]
In-Reply-To: <20190301194217.GB68879@kib.kiev.ua>
Message-ID: <20190302071425.G5025@besplex.bde.org>
References: <20190228145542.GT2420@kib.kiev.ua> <20190228150811.GU2420@kib.kiev.ua> <962D78C3-65BE-40C1-BB50-A0088223C17B@yahoo.com> <28C2BB0A-3DAA-4D18-A317-49A8DD52778F@yahoo.com> <20190301112717.GW2420@kib.kiev.ua> <20190302043936.A4444@besplex.bde.org> <20190301194217.GB68879@kib.kiev.ua>

On Fri, 1 Mar 2019, Konstantin Belousov wrote:

> On Sat, Mar 02, 2019 at 05:40:58AM +1100, Bruce Evans wrote:
>> On Fri, 1 Mar 2019, Konstantin Belousov wrote:
>>
>>> On Thu, Feb 28, 2019 at 11:39:06PM -0800, Mark Millard wrote:
>>>> [The new, trial code also has truncation occurring.]
>>> In fact no, I do not think it is.
>>>
>>>> An example observation of diff_scaled having an overflowed
>>>> value was:
>>>>
>>>> scale_factor == 0x80da2067ac
>>>> scale_factor*freq overflows unsigned, 64 bit representation.
>>>> tim_offset == 0x3da0eaeb
>>>> tim_cnt == 0x42dea3c4
>>>> tim_diff == 0x53db8d9
>>>> For reference: 0x1fc9d43 == 0xffffffffffffffffull/scale_factor
>>>> scaled_diff == 0xA353A5BF3FF780CC (truncated to 64 bits)
>>>>
>>>> But for the new, trial code:
>>>>
>>>> 0x80da2067ac is 40 bits
>>>> 0x53db8d9 is 27 bits
>>>> So 67 bits, more than 63. Then:
>>>>
>>>> x
>>>> == (0x80da2067ac>>32) * 0x53db8d9
>>>> == 0x80 * 0x53db8d9
>>>> == 0x29EDC6C80
>>>>
>>>> x>>32
>>>> == 0x2
>>>>
>>>> x<<32
>>>> == 0x9EDC6C8000000000 (limited to 64 bits)
>>>> Note the truncation of: 0x29EDC6C8000000000.
>>> Right, this is how the patch is supposed to work. Note that the overflow
>>> bits 'lost' due to overflow of the left shift are the same bits that are
>>> used to increment bt->sec:
>>> bt->sec += x >> 32;
>>> So the 2 seconds are accounted for.

67 bits is 4-8 seconds, and it is an error for the adjustment to be >= 1
second.

>>>
>>>>
>>>> Thus the "bintime_addx(bt, x << 32)" is still
>>>> based on a truncated value.
>>>
>>> I must admit that 2 seconds of interval where the timehands were
>>> not updated is too much. This might be the real cause of all ppc
>>> troubles. I tried to see if the overflow case is possible on amd64,
>>> and did not get a single case of the '> 63' branch executed during the
>>> /usr/tests/lib/libc run.
>>
>> The algorithm requires the update interval to be less than 1 second.
>> th_scale is 2**64 / tc_frequency, so whatever tc_frequency is, after
>> 1 second the value of the multiplication is approximately 2**64 so
>> it overflows about then (depending on rounding).
>>
>> The most useful timecounters are TSC's, and these give another overflow
>> in tc_delta() after 1 second when their frequency is 4 GHz (except the
>> bogus TSC-low timecounter reduces the frequency to below 2 binary GHz,
>> so the usual case is overflow after 2 seconds).
> As I said, I was unable to trigger the overflow on amd64.

Yes, amd64 doesn't have the bug.
It gets tested more with fast TSCs that
overflow after about 1 second for other reasons. Try amd64 with an ACPI
or HPET timecounter. I sometimes do this to avoid the overflow in
tc_delta() while stopped in ddb. This seemed to work. Actually, it
doesn't work since the overflow under discussion still occurs.

> ...
>>> @@ -78,7 +81,14 @@ binuptime(struct bintime *bt, struct vdso_timekeep *tk, int abs)
>>>  			continue;
>>>  		if (error != 0)
>>>  			return (error);
>>> -		bintime_addx(bt, th->th_scale * delta);
>>> +		scale = th->th_scale;
>>> +		if (__predict_false(fls(scale) + fls(delta) > 63)) {
>>
>> This is unnecessarily pessimal. Updates must be frequent enough to prevent
>> tc_delta() overflowing, and it is even easier to arrange that this
>> multiplication doesn't overflow (since the necessary update interval for
>> the latter is constant).
>>
>> `scale' is 64 bits, so fls(scale) is broken on 32-bit arches, and
>> flsll(scale) is an especially large pessimization. I saw this on my
>> calcru1() fixes -- the flsll()s take almost as long as long long
>> divisions when they use the pessimal C versions.
> Ok, fixed.

It is still very slow.

> Updated patch.
>
> diff --git a/lib/libc/sys/__vdso_gettimeofday.c b/lib/libc/sys/__vdso_gettimeofday.c
> index 3749e0473af..fdefda08e39 100644
> --- a/lib/libc/sys/__vdso_gettimeofday.c
> +++ b/lib/libc/sys/__vdso_gettimeofday.c
> ...
> @@ -78,7 +81,19 @@ binuptime(struct bintime *bt, struct vdso_timekeep *tk, int abs)
>  			continue;
>  		if (error != 0)
>  			return (error);
> -		bintime_addx(bt, th->th_scale * delta);
> +		scale = th->th_scale;
> +#ifdef _LP64
> +		scale_bits = ffsl(scale);
> +#else
> +		scale_bits = ffsll(scale);
> +#endif
> +		if (__predict_false(scale_bits + fls(delta) > 63)) {

Userland is still fully pessimized, except when the compiler auto-inlines
ffs*().
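
One detail that may be worth checking in the hunk above is which bit scan is used: ffs*() reports the position of the lowest set bit, whereas a bit-width estimate needs the position of the highest set bit, which is what fls*() reports. The sketch below uses two hypothetical portable helpers (deliberately not the libc routines) to show the difference on the values from this thread.

```c
#include <stdint.h>

/* 1-based index of the least significant set bit, 0 if none (ffs-like). */
static int
lowest_set_bit64(uint64_t v)
{
	int bit;

	if (v == 0)
		return (0);
	for (bit = 1; (v & 1) == 0; bit++)
		v >>= 1;
	return (bit);
}

/* 1-based index of the most significant set bit, 0 if none (fls-like). */
static int
highest_set_bit64(uint64_t v)
{
	int bit;

	for (bit = 0; v != 0; bit++)
		v >>= 1;
	return (bit);
}
```

For scale 0x80da2067ac the lowest set bit is at position 3 and the highest at position 40; for delta 0x53db8d9 the highest is at position 27. A sum built from the ffs-like value is 3 + 27 = 30 and would not exceed 63, while the fls-like sum is 40 + 27 = 67, matching the bit counts quoted earlier in the thread.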

> diff --git a/sys/kern/kern_tc.c b/sys/kern/kern_tc.c
> index 2656fb4d22f..eedea5183c0 100644
> --- a/sys/kern/kern_tc.c
> +++ b/sys/kern/kern_tc.c
> @@ -355,13 +356,22 @@ void
>  binuptime(struct bintime *bt)
>  {
>  	struct timehands *th;
> -	u_int gen;
> +	uint64_t scale, x;
> +	u_int delta, gen;
>
>  	do {
>  		th = timehands;
>  		gen = atomic_load_acq_int(&th->th_generation);
>  		*bt = th->th_offset;
> -		bintime_addx(bt, th->th_scale * tc_delta(th));
> +		scale = th->th_scale;
> +		delta = tc_delta(th);
> +		if (__predict_false(th->th_scale_bits + fls(delta) > 63)) {

Better, but shouldn't be changed (and the bug that causes the large
intervals remains unlocated), and if it is changed then it should use:

 		if (delta >= th->th_large_delta)

> @@ -1464,6 +1483,11 @@ tc_windup(struct bintime *new_boottimebin)
>  	scale += (th->th_adjustment / 1024) * 2199;
>  	scale /= th->th_counter->tc_frequency;
>  	th->th_scale = scale * 2;
> +#ifdef _LP64
> +	th->th_scale_bits = ffsl(th->th_scale);
> +#else
> +	th->th_scale_bits = ffsll(th->th_scale);
> +#endif

 	th->th_large_delta = ((uint64_t)1 << 63) / scale;

Bruce

From owner-freebsd-ppc@freebsd.org Fri Mar 1 20:45:45 2019
Subject: Re: powerpc64 head -r344018 stuck sleeping problems: th->th_scale * tc_delta(th) overflows unsigned 64 bits sometimes [patched failed]
From: Mark Millard
In-Reply-To: <20190301112717.GW2420@kib.kiev.ua>
Date: Fri, 1 Mar 2019 12:45:29 -0800
Cc: bde@freebsd.org, freebsd-hackers Hackers, FreeBSD PowerPC ML
Message-Id: <679402FF-907C-43AF-B18C-8C9CC857D7A6@yahoo.com>
References: <20190228145542.GT2420@kib.kiev.ua> <20190228150811.GU2420@kib.kiev.ua> <962D78C3-65BE-40C1-BB50-A0088223C17B@yahoo.com> <28C2BB0A-3DAA-4D18-A317-49A8DD52778F@yahoo.com> <20190301112717.GW2420@kib.kiev.ua>
To: Konstantin Belousov

On 2019-Mar-1, at 03:27, Konstantin Belousov wrote:

> On Thu, Feb 28, 2019 at 11:39:06PM -0800, Mark Millard wrote:
>> [The new, trial code also has truncation occurring.]
> In fact no, I do not think it is.
>
>> An example observation of diff_scaled having an overflowed
>> value was:
>>
>> scale_factor == 0x80da2067ac
>> scale_factor*freq overflows unsigned, 64 bit representation.
>> tim_offset == 0x3da0eaeb
>> tim_cnt == 0x42dea3c4
>> tim_diff == 0x53db8d9
>> For reference: 0x1fc9d43 == 0xffffffffffffffffull/scale_factor
>> scaled_diff == 0xA353A5BF3FF780CC (truncated to 64 bits)
>>
>> But for the new, trial code:
>>
>> 0x80da2067ac is 40 bits
>> 0x53db8d9 is 27 bits
>> So 67 bits, more than 63. Then:
>>
>> x
>> == (0x80da2067ac>>32) * 0x53db8d9
>> == 0x80 * 0x53db8d9
>> == 0x29EDC6C80
>>
>> x>>32
>> == 0x2
>>
>> x<<32
>> == 0x9EDC6C8000000000 (limited to 64 bits)
>> Note the truncation of: 0x29EDC6C8000000000.
> Right, this is how the patch is supposed to work. Note that the overflow
> bits 'lost' due to overflow of the left shift are the same bits that are
> used to increment bt->sec:
> bt->sec += x >> 32;
> So the 2 seconds are accounted for.

Good to know.

>>
>> Thus the "bintime_addx(bt, x << 32)" is still
>> based on a truncated value.
>>
>
> I must admit that 2 seconds of interval where the timehands were
> not updated is too much. This might be the real cause of all ppc
> troubles. I tried to see if the overflow case is possible on amd64,
> and did not get a single case of the '> 63' branch executed during the
> /usr/tests/lib/libc run.
>
> Actually, the same overflow-prone code exists in libc, so below is the
> updated patch:
> - I added __predict_false()
> - libc multiplication is also done separately for high-order bits.
> (ffclock counterpart is still pending).
>
> diff --git a/lib/libc/sys/__vdso_gettimeofday.c b/lib/libc/sys/__vdso_gettimeofday.c
> index 3749e0473af..a14576988ff 100644
> --- a/lib/libc/sys/__vdso_gettimeofday.c
> +++ b/lib/libc/sys/__vdso_gettimeofday.c
> @@ -32,6 +32,8 @@ __FBSDID("$FreeBSD$");
>  #include
>  #include
>  #include
> +#include
> +#include
>  #include
>  #include
>  #include "libc_private.h"
> @@ -62,6 +64,7 @@ binuptime(struct bintime *bt, struct vdso_timekeep *tk, int abs)
>  {
>  	struct vdso_timehands *th;
>  	uint32_t curr, gen;
> +	uint64_t scale, x;
>  	u_int delta;
>  	int error;
>
> @@ -78,7 +81,14 @@ binuptime(struct bintime *bt, struct vdso_timekeep *tk, int abs)
>  			continue;
>  		if (error != 0)
>  			return (error);
> -		bintime_addx(bt, th->th_scale * delta);
> +		scale = th->th_scale;
> +		if (__predict_false(fls(scale) + fls(delta) > 63)) {
> +			x = (scale >> 32) * delta;
> +			scale &= UINT_MAX;
> +			bt->sec += x >> 32;
> +			bintime_addx(bt, x << 32);
> +		}
> +		bintime_addx(bt, scale * delta);
>  		if (abs)
>  			bintime_add(bt, &th->th_boottime);
>
> diff --git a/sys/kern/kern_tc.c b/sys/kern/kern_tc.c
> index 2656fb4d22f..be75781e000 100644
> --- a/sys/kern/kern_tc.c
> +++ b/sys/kern/kern_tc.c
> @@ -355,13 +355,22 @@ void
>  binuptime(struct bintime *bt)
>  {
>  	struct timehands *th;
> -	u_int gen;
> +	uint64_t scale, x;
> +	u_int delta, gen;
>
>  	do {
>  		th = timehands;
>  		gen = atomic_load_acq_int(&th->th_generation);
>  		*bt = th->th_offset;
> -		bintime_addx(bt, th->th_scale * tc_delta(th));
> +		scale = th->th_scale;
> +		delta = tc_delta(th);
> +		if (__predict_false(fls(scale) + fls(delta) > 63)) {
> +			x = (scale >> 32) * delta;
> +			scale &= UINT_MAX;
> +			bt->sec += x >> 32;
> +			bintime_addx(bt, x << 32);
> +		}
> +		bintime_addx(bt, scale * delta);
>  		atomic_thread_fence_acq();
>  	} while (gen == 0 || gen != th->th_generation);
>  }
> @@ -388,13 +397,22 @@ void
>  bintime(struct bintime *bt)
>  {
>  	struct timehands *th;
> -	u_int gen;
> +	uint64_t scale, x;
> +	u_int delta, gen;
>
>  	do {
>  		th = timehands;
>  		gen = atomic_load_acq_int(&th->th_generation);
>  		*bt = th->th_bintime;
> -		bintime_addx(bt, th->th_scale * tc_delta(th));
> +		scale = th->th_scale;
> +		delta = tc_delta(th);
> +		if (__predict_false(fls(scale) + fls(delta) > 63)) {
> +			x = (scale >> 32) * delta;
> +			scale &= UINT_MAX;
> +			bt->sec += x >> 32;
> +			bintime_addx(bt, x << 32);
> +		}
> +		bintime_addx(bt, scale * delta);
>  		atomic_thread_fence_acq();
>  	} while (gen == 0 || gen != th->th_generation);
>  }

Thanks.

I'll note that powerpc64 seems to use:

int
fls(int mask)
{
	int bit;

	if (mask == 0)
		return (0);
	for (bit = 1; mask != 1; bit++)
		mask = (unsigned int)mask >> 1;
	return (bit);
}

from /usr/src/sys/libkern/fls.c (unless I missed finding an
alternate machine-dependent substitution somewhere). So
lots of looping.

I've no clue if this contributes to why, with your prior patch, the
powerpc64 spent most of its time being non-responsive and
I was unable to do anything effective but escape into and
use ddb. (I had to power off to shut down. I've had to
boot from other media to repair things after such.)

I expect I need to figure out how to investigate the operation
of the substructure that contributes to binuptime in order to figure
out what contributes to the large deltas.
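
To put the observed delta in perspective, a rough conversion from timebase counts to elapsed time helps. The sketch below assumes the 33333333 Hz frequency and the 32-bit counter mask reported by sysctl kern.timecounter; `counts_to_ms()` is a hypothetical helper, not a kernel function.

```c
#include <stdint.h>

/*
 * Convert a number of timecounter counts to whole milliseconds.
 * Counts are assumed to fit in 32 bits, so counts * 1000 cannot
 * overflow 64 bits.
 */
static uint64_t
counts_to_ms(uint64_t counts, uint64_t frequency_hz)
{
	return (counts * 1000 / frequency_hz);
}
```

At 33333333 Hz the observed delta of 0x53db8d9 counts comes out as 2637 ms, well beyond the sub-second update interval the algorithm expects, while a full 2^32 counts is 128849 ms, so the 32-bit counter itself is nowhere near wrapping in that time.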

One point from my general understanding is that keeping the tbr's
tracking each other reasonably across the 2 sockets (2 cores per
socket) in my context is problematical/racy. (Not that I know the
details at this point.)

FYI:

# sysctl kern.timecounter
kern.timecounter.tc.timebase.quality: 0
kern.timecounter.tc.timebase.frequency: 33333333
kern.timecounter.tc.timebase.counter: 1831468476
kern.timecounter.tc.timebase.mask: 4294967295
kern.timecounter.stepwarnings: 0
kern.timecounter.alloweddeviation: 5
kern.timecounter.hardware: timebase
kern.timecounter.choice: timebase(0) dummy(-1000000)
kern.timecounter.tick: 1
kern.timecounter.fast_gettime: 1

At around 33.3 MHz 2^32 counts would be about 128 seconds to wrap.

I do know that the tbr is 64 bit on powerpc64 -- and is truncated to
32 bits for its use:

static unsigned
decr_get_timecount(struct timecounter *tc)
{
	return (mftb());
}

in /usr/src/sys/powerpc/powerpc/clock.c .

I'll keep trying to get evidence.

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went away in early 2018-Mar)

From owner-freebsd-ppc@freebsd.org Fri Mar 1 20:57:04 2019
To: Mark Millard, Mark Millard via freebsd-hackers
cc: Konstantin Belousov, bde@freebsd.org, FreeBSD PowerPC ML
Subject: Re: powerpc64 head -r344018 stuck sleeping problems: th->th_scale * tc_delta(th) overflows unsigned 64 bits sometimes [patched failed]
In-reply-to: <679402FF-907C-43AF-B18C-8C9CC857D7A6@yahoo.com>
From: "Poul-Henning Kamp"
References: <20190228145542.GT2420@kib.kiev.ua> <20190228150811.GU2420@kib.kiev.ua> <962D78C3-65BE-40C1-BB50-A0088223C17B@yahoo.com> <28C2BB0A-3DAA-4D18-A317-49A8DD52778F@yahoo.com> <20190301112717.GW2420@kib.kiev.ua> <679402FF-907C-43AF-B18C-8C9CC857D7A6@yahoo.com>
Date: Fri, 01 Mar 2019 20:57:01 +0000
Message-ID: <6669.1551473821@critter.freebsd.dk>

--------
In message <679402FF-907C-43AF-B18C-8C9CC857D7A6@yahoo.com>, Mark Millard via freebsd-hackers writes:

>> I must admit that 2 seconds of interval where the timehands were
>> not updated is too much.

I have no idea how you got in that situation, but it is very far from
how timecounters were designed to work.

--
Poul-Henning Kamp | UNIX since Zilog Zeus 3.20
phk@FreeBSD.ORG | TCP/IP since RFC 956
FreeBSD committer | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.

From owner-freebsd-ppc@freebsd.org Fri Mar 1 21:11:44 2019
Message-ID:
<210dfd0f50ee6b1149c914ee503502654eb5f328.camel@freebsd.org> Subject: Re: powerpc64 head -r344018 stuck sleeping problems: th->th_scale * tc_delta(th) overflows unsigned 64 bits sometimes [patched failed] From: Ian Lepore To: Poul-Henning Kamp , Mark Millard , Mark Millard via freebsd-hackers Cc: Konstantin Belousov , bde@freebsd.org, FreeBSD PowerPC ML Date: Fri, 01 Mar 2019 14:11:33 -0700 In-Reply-To: <6669.1551473821@critter.freebsd.dk> References: <20190228145542.GT2420@kib.kiev.ua> <20190228150811.GU2420@kib.kiev.ua> <962D78C3-65BE-40C1-BB50-A0088223C17B@yahoo.com> <28C2BB0A-3DAA-4D18-A317-49A8DD52778F@yahoo.com> <20190301112717.GW2420@kib.kiev.ua> <679402FF-907C-43AF-B18C-8C9CC857D7A6@yahoo.com> <6669.1551473821@critter.freebsd.dk> Content-Type: text/plain; charset="UTF-8" X-Mailer: Evolution 3.28.5 FreeBSD GNOME Team Mime-Version: 1.0 Content-Transfer-Encoding: 7bit X-Rspamd-Queue-Id: 1C60891968 X-Spamd-Bar: -- Authentication-Results: mx1.freebsd.org X-Spamd-Result: default: False [-2.98 / 15.00]; local_wl_from(0.00)[freebsd.org]; NEURAL_HAM_MEDIUM(-1.00)[-1.000,0]; NEURAL_HAM_SHORT(-0.98)[-0.984,0]; ASN(0.00)[asn:16509, ipnet:54.186.0.0/15, country:US]; NEURAL_HAM_LONG(-1.00)[-1.000,0] X-BeenThere: freebsd-ppc@freebsd.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Porting FreeBSD to the PowerPC List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 01 Mar 2019 21:11:44 -0000 On Fri, 2019-03-01 at 20:57 +0000, Poul-Henning Kamp wrote: > -------- > In message <679402FF-907C-43AF-B18C-8C9CC857D7A6@yahoo.com>, Mark > Millard via freebsd-hackers writes: > > > > I must admit that 2 seconds of interval where the timehands where > > > not updated is too much. > > I have no idea how you got in that situation, but it is very far > from how timecounters were designed to work. > I wonder if it's fallout from reducing the number of timehands to 2, which always struck me as a really bad idea. 
I know of at least one arm configuration which fails because of it (it takes a combo of a single- core system, and a pps capture driver that uses hardware latching of the timer and the polling method for reading the latched value; given all that, at least 4 sets of timehands are needed to avoid losing PPS events due to generation changes). -- Ian From owner-freebsd-ppc@freebsd.org Fri Mar 1 21:19:12 2019 Return-Path: Delivered-To: freebsd-ppc@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 9255D15011FB for ; Fri, 1 Mar 2019 21:19:12 +0000 (UTC) (envelope-from marklmi@yahoo.com) Received: from sonic316-21.consmr.mail.ne1.yahoo.com (sonic316-21.consmr.mail.ne1.yahoo.com [66.163.187.147]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 26DD391E1E for ; Fri, 1 Mar 2019 21:19:12 +0000 (UTC) (envelope-from marklmi@yahoo.com) X-YMail-OSG: wlXP_3IVM1kk39c3jBAvbQ2snU_4bKfLFMIwgAPdcS.VwNo6ZM.RqTnO0kKxG_p rB4zu0T6hnRiSHiogb5z9M8oltnJ2TfbiOQPc1_lf5MPdR5ydMxLbUVftkG_feI1Du2upbVDdXKC 6a9pnHJ_EmM9xwxQHZiEJVWzmaJl7Ra9ObcGTkTcZ_tlEcrhDDfZ.2ciJs3u9ExVyRAvHOWO0nuo oP.AxELNOwpHSeGFjC4yAp8zAC._Vqj5wFbWXrmdAj7b4AetcmgJ.aOSPxJJBiG38Q48DkL.wgMx F8JF1hLSYMCVFeVlkWgT3YXkZ5ncLrdtQCAvI.vIqPyIXCTuMBwcc58PAdqHiXRUzvMGswqtHm8R mzuOcya28CYxmqdeXjpNbdXaQDwxcVp_vr_8.pvCfAu9X2wNvGmHga2Mgj4vHvNdUCtk7AJdvkWQ jU5GU17s9ta26D4DIkmyI3z0hlP4vw7V_HbJ8Jv_feBAWADLh4bL0Fx4Hno6YKxsvmlbmFfM.rtW Qt0_oqhLOq4M_SiKp.pwulzX59vzqusA7UKMb0oRvMbHLJx88TekCA3kC4_cDwBYR4S3JnP7uvkZ e6Dy_KN6D4d.gcckTRdDYNZv485cODO6z5vRj6Ma5bd_9j5Ow8E03QI2HgKKfVK2oyh3nA1WKaBs BPyPYInXxvxz_d1pTIP5UsNGJ.MKS5b3UorN6w9ZePPfPN9NUzzYhBri1VznMWdVE1HiKZkVy.ZI zbJKVYC3cg9oJ5BypClgzvEkP1ptbFf22Vas.zwgpvvF0kNqm77Zm_NqWrtVhlNmw88wW0WGcVUe l_lLr4JwKZr9keb..TlXUhSmUFM4_FgADWSw1IMHbOLemrNhpkaMiTyGqGw454uKW79LBAvHCXjH 
Sxjc8DyOCCNTxcAKiTCm98_vnekZUQlHPtYbMBQLak1GLochHRRXjgaZShta9432.DeF5NXS1NDZ 1MRjEhgP3vPeBKRsGXYN8cZTLsFBWkFC8mCY1FIZFGChyzxjlsfBNOgXAR9Sk4ZSTIpxqSTE2rIQ HweHk0l9nFCa2YbjD1V69tCzUbmT5cqiykdhh_L7H39MMmeASwDjE74LhHlo1X7ICS87YHFtL7ZW V Received: from sonic.gate.mail.ne1.yahoo.com by sonic316.consmr.mail.ne1.yahoo.com with HTTP; Fri, 1 Mar 2019 21:19:10 +0000 Received: from c-67-170-167-181.hsd1.or.comcast.net (EHLO [192.168.1.113]) ([67.170.167.181]) by smtp406.mail.ne1.yahoo.com (Oath Hermes SMTP Server) with ESMTPA ID 80f6d1e2fd7aed759a41de2e5e57562f; Fri, 01 Mar 2019 21:19:06 +0000 (UTC) Content-Type: text/plain; charset=us-ascii Mime-Version: 1.0 (Mac OS X Mail 12.2 \(3445.102.3\)) Subject: Re: powerpc64 head -r344018 stuck sleeping problems: th->th_scale * tc_delta(th) overflows unsigned 64 bits sometimes [patched failed] From: Mark Millard In-Reply-To: <20190301194217.GB68879@kib.kiev.ua> Date: Fri, 1 Mar 2019 13:19:05 -0800 Cc: Bruce Evans , freebsd-hackers Hackers , FreeBSD PowerPC ML Content-Transfer-Encoding: 7bit Message-Id: <87D6CBD5-AE55-4EC8-8797-D8A9DC3D5A5A@yahoo.com> References: <20190228145542.GT2420@kib.kiev.ua> <20190228150811.GU2420@kib.kiev.ua> <962D78C3-65BE-40C1-BB50-A0088223C17B@yahoo.com> <28C2BB0A-3DAA-4D18-A317-49A8DD52778F@yahoo.com> <20190301112717.GW2420@kib.kiev.ua> <20190302043936.A4444@besplex.bde.org> <20190301194217.GB68879@kib.kiev.ua> To: Konstantin Belousov X-Mailer: Apple Mail (2.3445.102.3) X-Rspamd-Queue-Id: 26DD391E1E X-Spamd-Bar: ------ X-Spamd-Result: default: False [-6.98 / 15.00]; NEURAL_HAM_MEDIUM(-1.00)[-1.000,0]; NEURAL_HAM_SHORT(-0.98)[-0.977,0]; REPLY(-4.00)[]; NEURAL_HAM_LONG(-1.00)[-1.000,0] X-BeenThere: freebsd-ppc@freebsd.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Porting FreeBSD to the PowerPC List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 01 Mar 2019 21:19:12 -0000 On 2019-Mar-1, at 11:42, Konstantin Belousov wrote: > . . . 
> +#ifdef _LP64
> +	scale_bits = ffsl(scale);
> +#else
> +	scale_bits = ffsll(scale);
> +#endif
. . .
> +	if (__predict_false(scale_bits + fls(delta) > 63)) {

The patch from yesterday uniformly used:

int
fls(int mask)
{
	int bit;

	if (mask == 0)
		return (0);
	for (bit = 1; mask != 1; bit++)
		mask = (unsigned int)mask >> 1;
	return (bit);
}

that looks for the most significant 1 bit.

The new patch uses in some places:

int
ffsl(long mask)
{
	int bit;

	if (mask == 0)
		return (0);
	for (bit = 1; !(mask & 1); bit++)
		mask = (unsigned long)mask >> 1;
	return (bit);
}

that looks for the least significant 1 bit. Similarly for:

int
ffsll(long long mask)
{
	int bit;

	if (mask == 0)
		return (0);
	for (bit = 1; !(mask & 1); bit++)
		mask = (unsigned long long)mask >> 1;
	return (bit);
}

Was that deliberate?

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went away in
early 2018-Mar)

From owner-freebsd-ppc@freebsd.org Fri Mar 1 21:35:10 2019
Return-Path:
Delivered-To: freebsd-ppc@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id C24B71501C3E; Fri, 1 Mar 2019 21:35:10 +0000 (UTC) (envelope-from freebsd-rwg@pdx.rh.CN85.dnsmgr.net)
Received: from pdx.rh.CN85.dnsmgr.net (br1.CN84in.dnsmgr.net [69.59.192.140]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id A6FEE92999; Fri, 1 Mar 2019 21:35:09 +0000 (UTC) (envelope-from freebsd-rwg@pdx.rh.CN85.dnsmgr.net)
Received: from pdx.rh.CN85.dnsmgr.net (localhost [127.0.0.1]) by pdx.rh.CN85.dnsmgr.net (8.13.3/8.13.3) with ESMTP id x21LZ4D5063886; Fri, 1 Mar 2019 13:35:04 -0800 (PST) (envelope-from freebsd-rwg@pdx.rh.CN85.dnsmgr.net)
Received: (from freebsd-rwg@localhost) by pdx.rh.CN85.dnsmgr.net (8.13.3/8.13.3/Submit) id x21LZ4fM063885; Fri, 1 Mar 2019 13:35:04 -0800 (PST) (envelope-from freebsd-rwg)
From: "Rodney W.
Grimes" Message-Id: <201903012135.x21LZ4fM063885@pdx.rh.CN85.dnsmgr.net> Subject: Re: powerpc64 head -r344018 stuck sleeping problems: th->th_scale * tc_delta(th) overflows unsigned 64 bits sometimes [patched failed] In-Reply-To: <908615FE-0638-4A80-A3C9-6FC36219ECC3@yahoo.com> To: Mark Millard Date: Fri, 1 Mar 2019 13:35:04 -0800 (PST) CC: Bruce Evans , freebsd-hackers Hackers , Konstantin Belousov , FreeBSD PowerPC ML X-Mailer: ELM [version 2.4ME+ PL121h (25)] MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Type: text/plain; charset=US-ASCII X-Rspamd-Queue-Id: A6FEE92999 X-Spamd-Bar: +++ Authentication-Results: mx1.freebsd.org X-Spamd-Result: default: False [3.12 / 15.00]; ARC_NA(0.00)[]; FROM_HAS_DN(0.00)[]; NEURAL_SPAM_SHORT(0.91)[0.913,0]; IP_SCORE(0.00)[ip: (0.06), ipnet: 69.59.192.0/19(0.03), asn: 13868(0.01), country: US(-0.07)]; MIME_GOOD(-0.10)[text/plain]; RCVD_TLS_LAST(0.00)[]; DMARC_NA(0.00)[dnsmgr.net]; AUTH_NA(1.00)[]; RCPT_COUNT_FIVE(0.00)[5]; RCVD_COUNT_THREE(0.00)[3]; TO_MATCH_ENVRCPT_SOME(0.00)[]; TO_DN_ALL(0.00)[]; MX_GOOD(-0.01)[cached: pdx.rh.CN85.dnsmgr.net]; NEURAL_SPAM_LONG(0.46)[0.463,0]; NEURAL_SPAM_MEDIUM(0.85)[0.845,0]; R_SPF_NA(0.00)[]; FREEMAIL_TO(0.00)[yahoo.com]; FROM_EQ_ENVFROM(0.00)[]; R_DKIM_NA(0.00)[]; MIME_TRACE(0.00)[0:+]; ASN(0.00)[asn:13868, ipnet:69.59.192.0/19, country:US]; FREEMAIL_CC(0.00)[optusnet.com.au]; MID_RHS_MATCH_FROM(0.00)[] X-BeenThere: freebsd-ppc@freebsd.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Porting FreeBSD to the PowerPC List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 01 Mar 2019 21:35:10 -0000 ( ... trimmed ... ) > > The most useful timecounters are TSC's, and these give another overflow > > in tc_delta() after 1 second when their frequency is 4 GHz (except the > > bogus TSC-low timecounter reduces the frequency to below 2 binary GHz, > > so the usual case is overflow after 2 seconds). 
>
> The wording suggests an amd64/i386 context but my report was for
> powerpc64, specifically for old PowerMac G5's. (I currently have access
> to only one but I've seen the behavior on others in the past.)
> FreeBSD reports:

I have access to 2: one is mine and the other is dexter@'s. Both of
them are now located at his place, but are available for testing.

--
Rod Grimes                                                 rgrimes@freebsd.org

From owner-freebsd-ppc@freebsd.org Sat Mar 2 06:57:36 2019
Return-Path:
Delivered-To: freebsd-ppc@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 20C5B1515AC6; Sat, 2 Mar 2019 06:57:36 +0000 (UTC) (envelope-from phk@critter.freebsd.dk)
Received: from phk.freebsd.dk (phk.freebsd.dk [130.225.244.222]) by mx1.freebsd.org (Postfix) with ESMTP id 43BEE8165F; Sat, 2 Mar 2019 06:57:34 +0000 (UTC) (envelope-from phk@critter.freebsd.dk)
Received: from critter.freebsd.dk (v-critter.freebsd.dk [192.168.55.3]) by phk.freebsd.dk (Postfix) with ESMTP id D1299202561B; Sat, 2 Mar 2019 06:57:32 +0000 (UTC)
Received: from critter.freebsd.dk (localhost [127.0.0.1]) by critter.freebsd.dk (8.15.2/8.15.2) with ESMTPS id x226vW4b008567 (version=TLSv1.3 cipher=TLS_AES_256_GCM_SHA384 bits=256 verify=NO); Sat, 2 Mar 2019 06:57:32 GMT (envelope-from phk@critter.freebsd.dk)
Received: (from phk@localhost) by critter.freebsd.dk (8.15.2/8.15.2/Submit) id x226vV7W008566; Sat, 2 Mar 2019 06:57:31 GMT (envelope-from phk)
To: Ian Lepore
cc: Mark Millard , Mark Millard via freebsd-hackers , Konstantin Belousov , bde@freebsd.org, FreeBSD PowerPC ML
Subject: Re: powerpc64 head -r344018 stuck sleeping problems: th->th_scale * tc_delta(th) overflows unsigned 64 bits sometimes [patched failed]
In-reply-to: <210dfd0f50ee6b1149c914ee503502654eb5f328.camel@freebsd.org>
From: "Poul-Henning Kamp"
References: <20190228145542.GT2420@kib.kiev.ua> <20190228150811.GU2420@kib.kiev.ua>
<962D78C3-65BE-40C1-BB50-A0088223C17B@yahoo.com> <28C2BB0A-3DAA-4D18-A317-49A8DD52778F@yahoo.com> <20190301112717.GW2420@kib.kiev.ua> <679402FF-907C-43AF-B18C-8C9CC857D7A6@yahoo.com> <6669.1551473821@critter.freebsd.dk> <210dfd0f50ee6b1149c914ee503502654eb5f328.camel@freebsd.org> MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-ID: <8564.1551509851.1@critter.freebsd.dk> Content-Transfer-Encoding: quoted-printable Date: Sat, 02 Mar 2019 06:57:31 +0000 Message-ID: <8565.1551509851@critter.freebsd.dk> X-Rspamd-Queue-Id: 43BEE8165F X-Spamd-Bar: +++ Authentication-Results: mx1.freebsd.org X-Spamd-Result: default: False [3.41 / 15.00]; ARC_NA(0.00)[]; FROM_HAS_DN(0.00)[]; TO_DN_SOME(0.00)[]; NEURAL_SPAM_SHORT(0.22)[0.223,0]; MIME_GOOD(-0.10)[text/plain]; DMARC_NA(0.00)[freebsd.dk]; AUTH_NA(1.00)[]; NEURAL_SPAM_MEDIUM(0.91)[0.911,0]; RCPT_COUNT_FIVE(0.00)[6]; RCVD_COUNT_THREE(0.00)[4]; TO_MATCH_ENVRCPT_SOME(0.00)[]; MX_GOOD(-0.01)[cached: phk.freebsd.dk]; NEURAL_SPAM_LONG(0.87)[0.871,0]; R_SPF_NA(0.00)[]; FORGED_SENDER(0.30)[phk@phk.freebsd.dk,phk@critter.freebsd.dk]; RCVD_NO_TLS_LAST(0.10)[]; R_DKIM_NA(0.00)[]; MIME_TRACE(0.00)[0:+]; ASN(0.00)[asn:1835, ipnet:130.225.0.0/16, country:EU]; FROM_NEQ_ENVFROM(0.00)[phk@phk.freebsd.dk,phk@critter.freebsd.dk]; IP_SCORE(0.12)[ip: (0.17), ipnet: 130.225.0.0/16(0.08), asn: 1835(0.32), country: EU(-0.00)]; FREEMAIL_CC(0.00)[yahoo.com] X-BeenThere: freebsd-ppc@freebsd.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Porting FreeBSD to the PowerPC List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 02 Mar 2019 06:57:36 -0000 -------- In message <210dfd0f50ee6b1149c914ee503502654eb5f328.camel@freebsd.org>, I= an Lepore writes: >On Fri, 2019-03-01 at 20:57 +0000, Poul-Henning Kamp wrote: >> -------- >> In message <679402FF-907C-43AF-B18C-8C9CC857D7A6@yahoo.com>, Mark >> Millard via freebsd-hackers writes: >> = >> > > I must admit that 2 seconds of interval 
where the timehands where >> > > not updated is too much. >> >> I have no idea how you got in that situation, but it is very far >> from how timecounters were designed to work. >> > >I wonder if it's fallout from reducing the number of timehands to 2, Unless somebody added refcounting, that sounds even less safe. -- Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 phk@FreeBSD.ORG | TCP/IP since RFC 956 FreeBSD committer | BSD since 4.3-tahoe Never attribute to malice what can adequately be explained by incompetence.
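The overflow named in the subject line can be reproduced outside the
kernel. What follows is only a minimal sketch under stated assumptions,
not code from the tree: struct bt_sketch, bt_addx(), bt_add_naive() and
bt_add_split() are hypothetical names chosen for illustration, and
bt_addx() is written to mirror the carry behaviour of bintime_addx().
It contrasts the single 64 x 32 multiplication with a version split
into two 32 x 32 -> 64 products.

```c
#include <stdint.h>

/* Hypothetical stand-in for the kernel's struct bintime. */
struct bt_sketch {
	uint64_t sec;	/* whole seconds */
	uint64_t frac;	/* fraction of a second, in units of 2^-64 s */
};

/* Add a 64-bit fraction and carry into seconds on wraparound. */
static void
bt_addx(struct bt_sketch *bt, uint64_t x)
{
	uint64_t old = bt->frac;

	bt->frac += x;
	if (old > bt->frac)
		bt->sec++;
}

/* Naive form: scale * delta wraps modulo 2^64 for large deltas. */
static void
bt_add_naive(struct bt_sketch *bt, uint64_t scale, uint32_t delta)
{
	bt_addx(bt, scale * delta);
}

/* Split form: neither 32 x 32 -> 64 product can overflow. */
static void
bt_add_split(struct bt_sketch *bt, uint64_t scale, uint32_t delta)
{
	uint64_t x = (scale >> 32) * delta;

	bt->sec += x >> 32;	/* the seconds the naive product drops */
	bt_addx(bt, x << 32);
	bt_addx(bt, (scale & 0xffffffff) * delta);
}
```

As a usage example, assume the 33333333 Hz timebase reported earlier
and approximate scale as UINT64_MAX / 33333333 (an assumption; the
kernel derives th_scale in tc_windup()). A delta of 66666666 counts,
about two seconds, then leaves sec at 0 in the naive form and at 1 in
the split form, with the same frac in both.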
From owner-freebsd-ppc@freebsd.org Sat Mar 2 13:03:31 2019 Return-Path: Delivered-To: freebsd-ppc@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 41C2715220E0; Sat, 2 Mar 2019 13:03:31 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from mail106.syd.optusnet.com.au (mail106.syd.optusnet.com.au [211.29.132.42]) by mx1.freebsd.org (Postfix) with ESMTP id 728A38D5CE; Sat, 2 Mar 2019 13:03:29 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from [192.168.0.102] (c110-21-101-228.carlnfd1.nsw.optusnet.com.au [110.21.101.228]) by mail106.syd.optusnet.com.au (Postfix) with ESMTPS id 0D6423DDDC0; Sun, 3 Mar 2019 00:03:19 +1100 (AEDT) Date: Sun, 3 Mar 2019 00:03:18 +1100 (EST) From: Bruce Evans X-X-Sender: bde@besplex.bde.org To: Konstantin Belousov cc: Mark Millard , freebsd-hackers Hackers , FreeBSD PowerPC ML Subject: Re: powerpc64 head -r344018 stuck sleeping problems: th->th_scale * tc_delta(th) overflows unsigned 64 bits sometimes [patched failed] In-Reply-To: <20190302105140.GC68879@kib.kiev.ua> Message-ID: <20190302225513.W3408@besplex.bde.org> References: <20190228145542.GT2420@kib.kiev.ua> <20190228150811.GU2420@kib.kiev.ua> <962D78C3-65BE-40C1-BB50-A0088223C17B@yahoo.com> <28C2BB0A-3DAA-4D18-A317-49A8DD52778F@yahoo.com> <20190301112717.GW2420@kib.kiev.ua> <20190302043936.A4444@besplex.bde.org> <20190301194217.GB68879@kib.kiev.ua> <20190302071425.G5025@besplex.bde.org> <20190302105140.GC68879@kib.kiev.ua> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Optus-CM-Score: 0 X-Optus-CM-Analysis: v=2.2 cv=P6RKvmIu c=1 sm=1 tr=0 a=PalzARQSbocsUSjMRkwAPg==:117 a=PalzARQSbocsUSjMRkwAPg==:17 a=kj9zAlcOel0A:10 a=ZdOe7bxP52gbij4XtzYA:9 a=CjuIK1q_8ugA:10 X-Rspamd-Queue-Id: 728A38D5CE X-Spamd-Bar: ------ Authentication-Results: mx1.freebsd.org X-Spamd-Result: default: False [-6.94 / 15.00]; NEURAL_HAM_MEDIUM(-1.00)[-1.000,0]; 
NEURAL_HAM_SHORT(-0.94)[-0.941,0]; NEURAL_HAM_LONG(-1.00)[-1.000,0]; REPLY(-4.00)[]
X-BeenThere: freebsd-ppc@freebsd.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Porting FreeBSD to the PowerPC
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Sat, 02 Mar 2019 13:03:31 -0000

On Sat, 2 Mar 2019, Konstantin Belousov wrote:

> On Sat, Mar 02, 2019 at 07:38:20AM +1100, Bruce Evans wrote:
>> On Fri, 1 Mar 2019, Konstantin Belousov wrote:
>>> +	scale = th->th_scale;
>>> +	delta = tc_delta(th);
>>> +	if (__predict_false(th->th_scale_bits + fls(delta) > 63)) {
>>
>> Better, but shouldn't be changed (and the bug that causes the large intervals
>> remains unlocated), and if it is changed then it should use:
>>
>> 	if (delta >= th->th_large_delta)
>>
>>> @@ -1464,6 +1483,11 @@ tc_windup(struct bintime *new_boottimebin)
>>> 	scale += (th->th_adjustment / 1024) * 2199;
>>> 	scale /= th->th_counter->tc_frequency;
>>> 	th->th_scale = scale * 2;
>>> +#ifdef _LP64
>>> +	th->th_scale_bits = ffsl(th->th_scale);
>>> +#else
>>> +	th->th_scale_bits = ffsll(th->th_scale);
>>> +#endif
>>
>> 	th->th_large_delta = ((uint64_t)1 << 63) / scale;
>
> So I am able to reproduce it with some surprising ease on HPET running
> on Haswell.

So what is the cause of it? Maybe the tickless code doesn't generate
fake clock ticks right. Or it is just a library bug. The kernel has
to be slightly real-time to satisfy the requirement of 1 update per
second. Applications are further from being real-time. But isn't it
enough for the kernel to ensure that the timehands cycle more than
once per second?

> I looked at the generated code for libc which still uses ffsll() on 32bit,
> due to the ABI issues. At least clang generates two BSF instructions for
> this code, so I think that forking vdso_timehands ABI for this is not
> reasonable right now.
>
> diff --git a/lib/libc/sys/__vdso_gettimeofday.c b/lib/libc/sys/__vdso_gettimeofday.c
> index 3749e0473af..3c3c71207bd 100644
> --- a/lib/libc/sys/__vdso_gettimeofday.c
> +++ b/lib/libc/sys/__vdso_gettimeofday.c
> @@ -32,6 +32,8 @@ __FBSDID("$FreeBSD$");
> #include
> #include
> #include
> +#include
> +#include
> #include
> #include
> #include "libc_private.h"
> @@ -62,7 +64,8 @@ binuptime(struct bintime *bt, struct vdso_timekeep *tk, int abs)
> {
> 	struct vdso_timehands *th;
> 	uint32_t curr, gen;
> -	u_int delta;
> +	uint64_t scale, x;
> +	u_int delta, scale_bits;
> 	int error;
>
> 	do {
> @@ -78,7 +81,19 @@ binuptime(struct bintime *bt, struct vdso_timekeep *tk, int abs)
> 			continue;
> 		if (error != 0)
> 			return (error);
> -		bintime_addx(bt, th->th_scale * delta);
> +		scale = th->th_scale;
> +#ifdef _LP64
> +		scale_bits = ffsl(scale);
> +#else
> +		scale_bits = ffsll(scale);
> +#endif

I see that there is an ABI problem adding th_large_delta.

> +		if (__predict_false(scale_bits + fls(delta) > 63)) {
> +			x = (scale >> 32) * delta;
> +			scale &= UINT_MAX;

Should be UINT32_MAX or better 0xffffffff.

> +			bt->sec += x >> 32;
> +			bintime_addx(bt, x << 32);
> +		}
> +		bintime_addx(bt, scale * delta);
> 		if (abs)
> 			bintime_add(bt, &th->th_boottime);

I don't like changing this at all. binuptime() was carefully written
to not need so much 64-bit arithmetic.

If this pessimization is allowed, then it can also handle 64-bit
deltas. Using the better kernel method:

 		if (__predict_false(delta >= th->th_large_delta)) {
 			bt->sec += (scale >> 32) * (delta >> 32);
 			x = (scale >> 32) * (delta & 0xffffffff);
 			bt->sec += x >> 32;
 			bintime_addx(bt, x << 32);
 			x = (scale & 0xffffffff) * (delta >> 32);
 			bt->sec += x >> 32;
 			bintime_addx(bt, x << 32);
 			bintime_addx(bt, (scale & 0xffffffff) *
 			    (delta & 0xffffffff));
 		} else
 			bintime_addx(bt, scale * (delta & 0xffffffff));

I just noticed that there is a 64 x 32 -> 64 bit multiplication in the
current method.
This can be changed to do explicit 32 x 32 -> 64 bit
multiplications and fix the overflow problem at small extra cost on
32-bit arches:

 		/* 32-bit arches did the next multiplication implicitly. */
 		x = (scale >> 32) * delta;
 		/*
 		 * And they did the following shifts and most of the adds
 		 * implicitly too.  Except shifting x left by 32 lost the
 		 * seconds part that the next line handles.  The next line
 		 * is the only extra cost for them.
 		 */
 		bt->sec += x >> 32;
 		bintime_addx(bt, (x << 32) + (scale & 0xffffffff) * delta);

Bruce

From owner-freebsd-ppc@freebsd.org Sat Mar 2 08:14:26 2019
Return-Path:
Delivered-To: freebsd-ppc@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 0D7F215182FE for ; Sat, 2 Mar 2019 08:14:26 +0000 (UTC) (envelope-from marklmi@yahoo.com)
Received: from sonic317-29.consmr.mail.bf2.yahoo.com (sonic317-29.consmr.mail.bf2.yahoo.com [74.6.129.84]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 9FBD483EAA for ; Sat, 2 Mar 2019 08:14:24 +0000 (UTC) (envelope-from marklmi@yahoo.com)
X-YMail-OSG: YWyHKJMVM1m4vDc0lWtWFK1KbWkc7MblAPE0nnQd8U2mTnU6vpbWp9yAlC8_vX1
 zWYwAmk7TsnXGF.Jtwl4Ps0LtFQzc5AA.ccMrm8hZQtpllCEei4mLEEjYUkDnZVVeMrxXz0i_jO9
 eiZYdNOlnLYka__8i4SIFwe7cdjQf5Ak5bFJ2R8WwUZEsxqUJW7kQkF0nuKGsspKbSvKJWu5X4m.
TBggkqT8euRsZE_8sI8._aEbYeC69OrRQCB91lkajG_sYncNbeGDdEQGXC5eqc62h7gi2ojzXggc U6mAIgi4z3jraovF4YfcIgtNSmoB41ciufdY8XtZjQrlQN38DiZn5xXyv2Im.Q2PKpkvSSGvpY0b gnpc4MjFhGsV6DhxxwrlVGO89DllAEOzydKllAXhYitXwRcOJ8L79kpUr_M6kYUpqmvfV_phHs0P SM9tTDsiwKdxO.hkCksWD_wHH8om21YjFTvnWvIO.8B5rwr9RXoTsACi.HjXdq6LEL7veJ4Ja3NJ cZD5oywfEQEnLQoC87PB1lCKWlMId2XR3m80bF4t427J5kpZAw8BhtiUdpRcQlRO0MZy0y.FWNfR nXltvuTJb9DNMJ3Speg0biD5VwCnvnvDNnQHnISbKd3EbL0DoKGVdXSitmuinGAdJbH3WTMCuSpJ EsHzsjVOl06LF2n5N_sQPv7kRxcmVmfePhEIi5KZw0lfUK.tL_G_3cDCURvRnXhp.GxbUC2Id66p PuC3K1sxzBPLhcysYOtvUqnDDhBjZqnU0N0_xlL_H7xjRB.2i0dftOBgxmYgJwOkkeK3cUV.IbY0 sarEOpsjtw6NXAJXJLTP9_8ciJpw7VwDzkWZa_fU8lSazg0bIjwoU4rfPBuYBNScljHQDLewiUC4 oUfTZQgIfZuvgUkSM57_dfp0p_LRMbhy1Q8ZNXDhkb86j_KJRbMYC1ZVWl4AOqIrqLMDHc3uryPW VorIILvI1v82Bb6S1Yd2FuVuQM3ymW0KacXwcKbyG5Ip8ysBVPDtF68ZFB6uUYOuJqvMU.X6JMd0 qM8TSj2Xwo1Pegv.mf7XnsyaEZF3x_txPpgDAlpYU5MU4mEu035bJF1eBX2b72Y1IRfFvQQQj Received: from sonic.gate.mail.ne1.yahoo.com by sonic317.consmr.mail.bf2.yahoo.com with HTTP; Sat, 2 Mar 2019 08:14:17 +0000 Received: from c-67-170-167-181.hsd1.or.comcast.net (EHLO [192.168.1.113]) ([67.170.167.181]) by smtp430.mail.bf1.yahoo.com (Oath Hermes SMTP Server) with ESMTPA ID 8afb086aa699a02bd342eeec9a180a5e; Sat, 02 Mar 2019 08:14:13 +0000 (UTC) Content-Type: text/plain; charset=us-ascii Mime-Version: 1.0 (Mac OS X Mail 12.2 \(3445.102.3\)) Subject: Re: powerpc64 head -r344018 stuck sleeping problems: th->th_scale * tc_delta(th) overflows unsigned 64 bits sometimes [patched failed] From: Mark Millard In-Reply-To: <210dfd0f50ee6b1149c914ee503502654eb5f328.camel@freebsd.org> Date: Sat, 2 Mar 2019 00:14:11 -0800 Cc: Poul-Henning Kamp , Mark Millard via freebsd-hackers , Konstantin Belousov , bde@freebsd.org, FreeBSD PowerPC ML Content-Transfer-Encoding: 7bit Message-Id: References: <20190228145542.GT2420@kib.kiev.ua> <20190228150811.GU2420@kib.kiev.ua> <962D78C3-65BE-40C1-BB50-A0088223C17B@yahoo.com> <28C2BB0A-3DAA-4D18-A317-49A8DD52778F@yahoo.com> 
<20190301112717.GW2420@kib.kiev.ua> <679402FF-907C-43AF-B18C-8C9CC857D7A6@yahoo.com> <6669.1551473821@critter.freebsd.dk> <210dfd0f50ee6b1149c914ee503502654eb5f328.camel@freebsd.org> To: Ian Lepore X-Mailer: Apple Mail (2.3445.102.3) X-Rspamd-Queue-Id: 9FBD483EAA X-Spamd-Bar: +++ X-Spamd-Result: default: False [3.60 / 15.00]; RCVD_VIA_SMTP_AUTH(0.00)[]; TO_DN_SOME(0.00)[]; R_SPF_ALLOW(-0.20)[+ptr:yahoo.com]; MV_CASE(0.50)[]; FREEMAIL_FROM(0.00)[yahoo.com]; RCPT_COUNT_FIVE(0.00)[6]; RCVD_COUNT_THREE(0.00)[3]; DKIM_TRACE(0.00)[yahoo.com:+]; MX_GOOD(-0.01)[cached: mta6.am0.yahoodns.net]; DMARC_POLICY_ALLOW(-0.50)[yahoo.com,reject]; FROM_EQ_ENVFROM(0.00)[]; RCVD_TLS_LAST(0.00)[]; MIME_TRACE(0.00)[0:+]; FREEMAIL_ENVFROM(0.00)[yahoo.com]; ASN(0.00)[asn:26101, ipnet:74.6.128.0/21, country:US]; MID_RHS_MATCH_FROM(0.00)[]; DWL_DNSWL_NONE(0.00)[yahoo.com.dwl.dnswl.org : 127.0.5.0]; ARC_NA(0.00)[]; R_DKIM_ALLOW(-0.20)[yahoo.com:s=s2048]; FROM_HAS_DN(0.00)[]; NEURAL_SPAM_SHORT(0.88)[0.876,0]; MIME_GOOD(-0.10)[text/plain]; IP_SCORE(1.58)[ip: (5.23), ipnet: 74.6.128.0/21(1.52), asn: 26101(1.22), country: US(-0.07)]; NEURAL_SPAM_MEDIUM(0.97)[0.967,0]; TO_MATCH_ENVRCPT_SOME(0.00)[]; NEURAL_SPAM_LONG(0.68)[0.683,0]; RCVD_IN_DNSWL_NONE(0.00)[84.129.6.74.list.dnswl.org : 127.0.5.0] X-BeenThere: freebsd-ppc@freebsd.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Porting FreeBSD to the PowerPC List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 02 Mar 2019 08:14:26 -0000 On 2019-Mar-1, at 13:11, Ian Lepore wrote: > On Fri, 2019-03-01 at 20:57 +0000, Poul-Henning Kamp wrote: >> -------- >> In message <679402FF-907C-43AF-B18C-8C9CC857D7A6@yahoo.com>, Mark >> Millard via freebsd-hackers writes: >> >>>> I must admit that 2 seconds of interval where the timehands where >>>> not updated is too much. >> >> I have no idea how you got in that situation, but it is very far >> from how timecounters were designed to work. 
>> > > I wonder if it's fallout from reducing the number of timehands to 2, > which always struck me as a really bad idea. I know of at least one arm > configuration which fails because of it (it takes a combo of a single- > core system, and a pps capture driver that uses hardware latching of > the timer and the polling method for reading the latched value; given > all that, at least 4 sets of timehands are needed to avoid losing PPS > events due to generation changes). Thanks for the suggestion. I tried putting back having 10 timehands structures with my variant of the original binuptime code. (My investigative code records some information.) Unfortunately, having the extra timehands's did not change the PowerMac G5's behavior. === Mark Millard marklmi at yahoo.com ( dsl-only.net went away in early 2018-Mar) From owner-freebsd-ppc@freebsd.org Sat Mar 2 14:17:18 2019 Return-Path: Delivered-To: freebsd-ppc@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id A5A96152522E; Sat, 2 Mar 2019 14:17:18 +0000 (UTC) (envelope-from phk@critter.freebsd.dk) Received: from phk.freebsd.dk (phk.freebsd.dk [130.225.244.222]) by mx1.freebsd.org (Postfix) with ESMTP id 3B905685BF; Sat, 2 Mar 2019 14:17:17 +0000 (UTC) (envelope-from phk@critter.freebsd.dk) Received: from critter.freebsd.dk (v-critter.freebsd.dk [192.168.55.3]) by phk.freebsd.dk (Postfix) with ESMTP id A76AB202563B; Sat, 2 Mar 2019 14:17:11 +0000 (UTC) Received: from critter.freebsd.dk (localhost [127.0.0.1]) by critter.freebsd.dk (8.15.2/8.15.2) with ESMTPS id x22EHBQ4009995 (version=TLSv1.3 cipher=TLS_AES_256_GCM_SHA384 bits=256 verify=NO); Sat, 2 Mar 2019 14:17:11 GMT (envelope-from phk@critter.freebsd.dk) Received: (from phk@localhost) by critter.freebsd.dk (8.15.2/8.15.2/Submit) id x22EHAxh009994; Sat, 2 Mar 2019 14:17:10 GMT (envelope-from phk) To: Konstantin Belousov cc: Ian Lepore , Mark Millard , Mark Millard 
via freebsd-hackers , Konstantin Belousov , bde@freebsd.org, FreeBSD PowerPC ML Subject: Re: powerpc64 head -r344018 stuck sleeping problems: th->th_scale * tc_delta(th) overflows unsigned 64 bits sometimes [patched failed] In-reply-to: <20190302105652.GD68879@kib.kiev.ua> From: "Poul-Henning Kamp" References: <20190228145542.GT2420@kib.kiev.ua> <20190228150811.GU2420@kib.kiev.ua> <962D78C3-65BE-40C1-BB50-A0088223C17B@yahoo.com> <28C2BB0A-3DAA-4D18-A317-49A8DD52778F@yahoo.com> <20190301112717.GW2420@kib.kiev.ua> <679402FF-907C-43AF-B18C-8C9CC857D7A6@yahoo.com> <6669.1551473821@critter.freebsd.dk> <210dfd0f50ee6b1149c914ee503502654eb5f328.camel@freebsd.org> <20190302105652.GD68879@kib.kiev.ua> MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-ID: <9992.1551536230.1@critter.freebsd.dk> Date: Sat, 02 Mar 2019 14:17:10 +0000 Message-ID: <9993.1551536230@critter.freebsd.dk> X-Rspamd-Queue-Id: 3B905685BF X-Spamd-Bar: ------ Authentication-Results: mx1.freebsd.org X-Spamd-Result: default: False [-6.93 / 15.00]; NEURAL_HAM_MEDIUM(-1.00)[-1.000,0]; NEURAL_HAM_SHORT(-0.93)[-0.930,0]; REPLY(-4.00)[]; NEURAL_HAM_LONG(-1.00)[-1.000,0] X-BeenThere: freebsd-ppc@freebsd.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Porting FreeBSD to the PowerPC List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 02 Mar 2019 14:17:18 -0000 -------- In message <20190302105652.GD68879@kib.kiev.ua>, Konstantin Belousov writes: >Using more than two timehands increases a chance of reader to try to >use outdated timehands. No, using only two timehands increase the chance that the reader tries to use the timehand which is being updated. As long as the reader does not use the timehand being updated, using a one or two generations old timehand is OK. At worst a frequency change happened since then, in which case the timestamp will be "delta-f * delta-t" wrong. 
Delta-f is in 1e-7 territory on a system running ntpd(8), so this is
below noise level for anything but high-precision timekeeping.

The target-value for delta-t was "a few milliseconds" when I wrote
timecounters, if somebody has changed that since, I hope they did
their math first.

--
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk@FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.

From owner-freebsd-ppc@freebsd.org Sat Mar 2 14:25:32 2019
Date: Sat, 2 Mar 2019 16:25:21 +0200
From: Konstantin Belousov
To: Bruce Evans
Cc: Mark Millard, freebsd-hackers Hackers, FreeBSD PowerPC ML
Subject: Re: powerpc64 head -r344018 stuck sleeping problems: th->th_scale * tc_delta(th) overflows unsigned 64 bits sometimes [patched failed]
Message-ID: <20190302142521.GE68879@kib.kiev.ua>
In-Reply-To: <20190302225513.W3408@besplex.bde.org>

On Sun, Mar 03, 2019 at 12:03:18AM +1100, Bruce Evans wrote:
> On Sat, 2 Mar 2019, Konstantin Belousov wrote:
>
> > On Sat, Mar 02, 2019 at 07:38:20AM +1100, Bruce Evans wrote:
> >> On Fri, 1 Mar 2019, Konstantin Belousov wrote:
> >>> +		scale = th->th_scale;
> >>> +		delta = tc_delta(th);
> >>> +		if (__predict_false(th->th_scale_bits + fls(delta) > 63)) {
> >>
> >> Better, but shouldn't be changed (and the bug that causes the large intervals
> >> remains unlocated), and if it is changed then it should use:
> >>
> >> 		if (delta >= th->th_large_delta)
> >>
> >>> @@ -1464,6 +1483,11 @@ tc_windup(struct bintime *new_boottimebin)
> >>>  	scale += (th->th_adjustment / 1024) * 2199;
> >>>  	scale /= th->th_counter->tc_frequency;
> >>>  	th->th_scale = scale * 2;
> >>> +#ifdef _LP64
> >>> +	th->th_scale_bits = ffsl(th->th_scale);
> >>> +#else
> >>> +	th->th_scale_bits = ffsll(th->th_scale);
> >>> +#endif
> >>
> >> 	th->th_large_delta = ((uint64_t)1 << 63) / scale;
> >
> > So I am able to reproduce it with some surprising ease on HPET running
> > on Haswell.
>
> So what is the cause of it?  Maybe the tickless code doesn't generate
> fake clock ticks right.  Or it is just a library bug.  The kernel has
> to be slightly real-time to satisfy the requirement of 1 update per.
> Applications are further from being real-time.  But isn't it enough
> for the kernel to ensure that the timehands cycle more than once per
> second?
No, I entered ddb as you suggested.

>
> > I looked at the generated code for libc which still uses ffsll() on 32bit,
> > due to the ABI issues.  At least clang generates two BSF instructions for
> > this code, so I think that forking vdso_timehands ABI for this is not
> > reasonable right now.
> >
> > diff --git a/lib/libc/sys/__vdso_gettimeofday.c b/lib/libc/sys/__vdso_gettimeofday.c
> > index 3749e0473af..3c3c71207bd 100644
> > --- a/lib/libc/sys/__vdso_gettimeofday.c
> > +++ b/lib/libc/sys/__vdso_gettimeofday.c
> > @@ -32,6 +32,8 @@ __FBSDID("$FreeBSD$");
> >  #include
> >  #include
> >  #include
> > +#include
> > +#include
> >  #include
> >  #include
> >  #include "libc_private.h"
> > @@ -62,7 +64,8 @@ binuptime(struct bintime *bt, struct vdso_timekeep *tk, int abs)
> >  {
> >  	struct vdso_timehands *th;
> >  	uint32_t curr, gen;
> > -	u_int delta;
> > +	uint64_t scale, x;
> > +	u_int delta, scale_bits;
> >  	int error;
> >
> >  	do {
> > @@ -78,7 +81,19 @@ binuptime(struct bintime *bt, struct vdso_timekeep *tk, int abs)
> >  			continue;
> >  		if (error != 0)
> >  			return (error);
> > -		bintime_addx(bt, th->th_scale * delta);
> > +		scale = th->th_scale;
> > +#ifdef _LP64
> > +		scale_bits = ffsl(scale);
> > +#else
> > +		scale_bits = ffsll(scale);
> > +#endif
>
> I see that there is an ABI problem adding th_large_delta.
>
> > +		if (__predict_false(scale_bits + fls(delta) > 63)) {
> > +			x = (scale >> 32) * delta;
> > +			scale &= UINT_MAX;
>
> Should be UINT32_MAX or better 0xffffffff.
Ok.

>
> > +			bt->sec += x >> 32;
> > +			bintime_addx(bt, x << 32);
> > +		}
> > +		bintime_addx(bt, scale * delta);
> >  		if (abs)
> >  			bintime_add(bt, &th->th_boottime);
>
> I don't like changing this at all.  binuptime() was carefully written
> to not need so much 64-bit arithmetic.
>
> If this pessimization is allowed, then it can also handle 64-bit
> deltas.  Using the better kernel method:
>
> 		if (__predict_false(delta >= th->th_large_delta)) {
> 			bt->sec += (scale >> 32) * (delta >> 32);
> 			x = (scale >> 32) * (delta & 0xffffffff);
> 			bt->sec += x >> 32;
> 			bintime_addx(bt, x << 32);
> 			x = (scale & 0xffffffff) * (delta >> 32);
> 			bt->sec += x >> 32;
> 			bintime_addx(bt, x << 32);
> 			bintime_addx(bt, (scale & 0xffffffff) *
> 			    (delta & 0xffffffff));
> 		} else
> 			bintime_addx(bt, scale * (delta & 0xffffffff));
This only makes sense if delta is extended to uint64_t, which requires
the pass over timecounters.

>
> I just noticed that there is a 64 x 32 -> 64 bit multiplication in the
> current method.  This can be changed to do explicit 32 x 32 -> 64 bit
> multiplications and fix the overflow problem at small extra cost on
> 32-bit arches:
>
> 		/* 32-bit arches did the next multiplication implicitly. */
> 		x = (scale >> 32) * delta;
> 		/*
> 		 * And they did the following shifts and most of the adds
> 		 * implicitly too.  Except shifting x left by 32 lost the
> 		 * seconds part that the next line handles.  The next line
> 		 * is the only extra cost for them.
> 		 */
> 		bt->sec += x >> 32;
> 		bintime_addx(bt, (x << 32) + (scale & 0xffffffff) * delta);

Ok, what about the following.

diff --git a/lib/libc/sys/__vdso_gettimeofday.c b/lib/libc/sys/__vdso_gettimeofday.c
index 3749e0473af..cfe3d96d001 100644
--- a/lib/libc/sys/__vdso_gettimeofday.c
+++ b/lib/libc/sys/__vdso_gettimeofday.c
@@ -32,6 +32,8 @@ __FBSDID("$FreeBSD$");
 #include
 #include
 #include
+#include
+#include
 #include
 #include
 #include "libc_private.h"
@@ -62,7 +64,8 @@ binuptime(struct bintime *bt, struct vdso_timekeep *tk, int abs)
 {
 	struct vdso_timehands *th;
 	uint32_t curr, gen;
-	u_int delta;
+	uint64_t scale, x;
+	u_int delta, scale_bits;
 	int error;
 
 	do {
@@ -78,7 +81,19 @@ binuptime(struct bintime *bt, struct vdso_timekeep *tk, int abs)
 			continue;
 		if (error != 0)
 			return (error);
-		bintime_addx(bt, th->th_scale * delta);
+		scale = th->th_scale;
+#ifdef _LP64
+		scale_bits = ffsl(scale);
+#else
+		scale_bits = ffsll(scale);
+#endif
+		if (__predict_false(scale_bits + fls(delta) > 63)) {
+			x = (scale >> 32) * delta;
+			scale &= 0xffffffff;
+			bt->sec += x >> 32;
+			bintime_addx(bt, x << 32);
+		}
+		bintime_addx(bt, scale * delta);
 		if (abs)
 			bintime_add(bt, &th->th_boottime);
 
diff --git a/sys/kern/kern_tc.c b/sys/kern/kern_tc.c
index 2656fb4d22f..2e28f872229 100644
--- a/sys/kern/kern_tc.c
+++ b/sys/kern/kern_tc.c
@@ -72,6 +72,7 @@ struct timehands {
 	struct timecounter	*th_counter;
 	int64_t			th_adjustment;
 	uint64_t		th_scale;
+	uint64_t		th_large_delta;
 	u_int			th_offset_count;
 	struct bintime		th_offset;
 	struct bintime		th_bintime;
@@ -351,17 +352,44 @@ fbclock_getmicrotime(struct timeval *tvp)
 	} while (gen == 0 || gen != th->th_generation);
 }
 #else /* !FFCLOCK */
+
+static void
+bintime_helper(struct bintime *bt, uint64_t *scale, u_int delta)
+{
+	uint64_t x;
+
+	x = (*scale >> 32) * delta;
+	*scale &= 0xffffffff;
+	bt->sec += x >> 32;
+	bintime_addx(bt, x << 32);
+}
+
 void
 binuptime(struct bintime *bt)
 {
 	struct timehands *th;
-	u_int gen;
+	uint64_t scale;
+	u_int delta, gen;
 
 	do {
 		th = timehands;
 		gen = atomic_load_acq_int(&th->th_generation);
 		*bt = th->th_offset;
-		bintime_addx(bt, th->th_scale * tc_delta(th));
+		scale = th->th_scale;
+		delta = tc_delta(th);
+#ifdef _LP64
+		/* Avoid overflow for scale * delta. */
+		if (__predict_false(th->th_large_delta <= delta))
+			bintime_helper(bt, &scale, delta);
+		bintime_addx(bt, scale * delta);
+#else
+		/*
+		 * Also avoid (uint64_t, uint32_t) -> uint64_t
+		 * multiplication on 32bit arches.
+		 */
+		bintime_helper(bt, &scale, delta);
+		bintime_addx(bt, (u_int)scale * delta);
+#endif
 		atomic_thread_fence_acq();
 	} while (gen == 0 || gen != th->th_generation);
 }
@@ -388,13 +416,28 @@ void
 bintime(struct bintime *bt)
 {
 	struct timehands *th;
-	u_int gen;
+	uint64_t scale;
+	u_int delta, gen;
 
 	do {
 		th = timehands;
 		gen = atomic_load_acq_int(&th->th_generation);
 		*bt = th->th_bintime;
-		bintime_addx(bt, th->th_scale * tc_delta(th));
+		scale = th->th_scale;
+		delta = tc_delta(th);
+#ifdef _LP64
+		/* Avoid overflow for scale * delta. */
+		if (__predict_false(th->th_large_delta <= delta))
+			bintime_helper(bt, &scale, delta);
+		bintime_addx(bt, scale * delta);
+#else
+		/*
+		 * Also avoid (uint64_t, uint32_t) -> uint64_t
+		 * multiplication on 32bit arches.
+		 */
+		bintime_helper(bt, &scale, delta);
+		bintime_addx(bt, (u_int)scale * delta);
+#endif
 		atomic_thread_fence_acq();
 	} while (gen == 0 || gen != th->th_generation);
 }
@@ -1464,6 +1507,7 @@ tc_windup(struct bintime *new_boottimebin)
 	scale += (th->th_adjustment / 1024) * 2199;
 	scale /= th->th_counter->tc_frequency;
 	th->th_scale = scale * 2;
+	th->th_large_delta = ((uint64_t)1 << 63) / scale;
 
 	/*
 	 * Now that the struct timehands is again consistent, set the new

From owner-freebsd-ppc@freebsd.org Sat Mar 2 14:28:22 2019
Date: Sat, 2 Mar 2019 16:28:14 +0200
From: Konstantin Belousov
To: Poul-Henning Kamp
Cc: Ian Lepore, Mark Millard, Mark Millard via freebsd-hackers, bde@freebsd.org, FreeBSD PowerPC ML
Subject: Re: powerpc64 head -r344018 stuck sleeping problems: th->th_scale * tc_delta(th) overflows unsigned 64 bits sometimes [patched failed]
Message-ID: <20190302142814.GF68879@kib.kiev.ua>
In-Reply-To: <9993.1551536230@critter.freebsd.dk>

On Sat, Mar 02, 2019 at 02:17:10PM +0000, Poul-Henning Kamp wrote:
> --------
> In message <20190302105652.GD68879@kib.kiev.ua>, Konstantin Belousov writes:
>
> >Using more than two timehands increases a chance of reader to try to
> >use outdated timehands.
>
> No, using only two timehands increases the chance that the reader tries
> to use the timehand which is being updated.
There is no problem with using a timehands that is being updated; it is
detected if writes propagate to the reader CPU at all.  The problem is
with too-late propagation.  The more timehands, the more the propagation
can be delayed by the hardware.

>
> As long as the reader does not use the timehand being updated, using
> a one or two generations old timehand is OK.
>
> At worst a frequency change happened since then, in which case the
> timestamp will be "delta-f * delta-t" wrong.  Delta-f is in 1e-7
> territory on a system running ntpd(8), so this is below noise level
> for anything but high-precision timekeeping.
>
> The target-value for delta-t was "a few milliseconds" when I wrote
> timecounters, if somebody has changed that since, I hope they did
> their math first.
>
> --
> Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
> phk@FreeBSD.ORG         | TCP/IP since RFC 956
> FreeBSD committer       | BSD since 4.3-tahoe
> Never attribute to malice what can adequately be explained by incompetence.

From owner-freebsd-ppc@freebsd.org Sat Mar 2 14:50:02 2019
To: Konstantin Belousov
cc: Ian Lepore, Mark Millard, Mark Millard via freebsd-hackers, bde@freebsd.org, FreeBSD PowerPC ML
Subject: Re: powerpc64 head -r344018 stuck sleeping problems: th->th_scale * tc_delta(th) overflows unsigned 64 bits sometimes [patched failed]
In-reply-to: <20190302142814.GF68879@kib.kiev.ua>
From: "Poul-Henning Kamp"
Date: Sat, 02 Mar 2019 14:49:59 +0000
Message-ID: <10227.1551538199@critter.freebsd.dk>

--------
In message <20190302142814.GF68879@kib.kiev.ua>, Konstantin Belousov writes:
>On Sat, Mar 02, 2019 at 02:17:10PM +0000, Poul-Henning Kamp wrote:

>More the timehands, more the propagation can be delayed by the hardware.

One of the two of us doesn't understand how timecounters work.

--
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk@FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.
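
[Illustration for readers of this thread; not code from any posted patch.]
The arithmetic being argued over can be sketched in plain C.  The sketch
below assumes a simplified stand-in for struct bintime (the names bt,
bt_addx, bt_add_naive and bt_add_split are hypothetical, chosen only for
this example).  It shows how scale * delta truncated to 64 bits silently
drops whole seconds, and how splitting scale into 32-bit halves, in the
style of the bintime_helper() approach discussed above, recovers them.

```c
#include <stdint.h>

/* Simplified stand-in for struct bintime: seconds plus a 64-bit binary fraction. */
struct bt {
	int64_t		sec;
	uint64_t	frac;
};

/* Add a 64-bit fraction, carrying into seconds when the fraction wraps. */
static void
bt_addx(struct bt *bt, uint64_t x)
{
	uint64_t old = bt->frac;

	bt->frac += x;
	if (old > bt->frac)
		bt->sec++;
}

/* Naive form: the product is truncated to 64 bits, so whole seconds vanish. */
static void
bt_add_naive(struct bt *bt, uint64_t scale, uint32_t delta)
{
	bt_addx(bt, scale * delta);
}

/*
 * Split form: two 32 x 32 -> 64 bit products.  The upper 32 bits of the
 * high product are whole seconds; the rest is added as fraction.  Two
 * separate additions are used so that a carry between them is not lost.
 */
static void
bt_add_split(struct bt *bt, uint64_t scale, uint32_t delta)
{
	uint64_t x = (scale >> 32) * delta;

	bt->sec += x >> 32;
	bt_addx(bt, x << 32);
	bt_addx(bt, (scale & 0xffffffff) * delta);
}
```

With a made-up 1 kHz counter (scale = UINT64_MAX / 1000) and delta = 2500
ticks, i.e. 2.5 seconds without a timehands update, the naive form adds
only the half-second fraction, while the split form adds 2 seconds plus
the same fraction.  For deltas below one second the two forms agree.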
From owner-freebsd-ppc@freebsd.org Sat Mar 2 10:57:02 2019
Date: Sat, 2 Mar 2019 12:56:52 +0200
From: Konstantin Belousov
To: Ian Lepore
Cc: Poul-Henning Kamp, Mark Millard, Mark Millard via freebsd-hackers, Konstantin Belousov, bde@freebsd.org, FreeBSD PowerPC ML
Subject: Re: powerpc64 head -r344018 stuck sleeping problems: th->th_scale * tc_delta(th) overflows unsigned 64 bits sometimes [patched failed]
Message-ID: <20190302105652.GD68879@kib.kiev.ua>
In-Reply-To: <210dfd0f50ee6b1149c914ee503502654eb5f328.camel@freebsd.org>

On Fri, Mar 01, 2019 at 02:11:33PM -0700, Ian Lepore wrote:
> On Fri, 2019-03-01 at 20:57 +0000, Poul-Henning Kamp wrote:
> > --------
> > In message <679402FF-907C-43AF-B18C-8C9CC857D7A6@yahoo.com>, Mark
> > Millard via freebsd-hackers writes:
> >
> > > > I must admit that 2 seconds of interval where the timehands were
> > > > not updated is too much.
> >
> > I have no idea how you got in that situation, but it is very far
> > from how timecounters were designed to work.
> >
>
> I wonder if it's fallout from reducing the number of timehands to 2,
> which always struck me as a really bad idea. I know of at least one arm
> configuration which fails because of it (it takes a combo of a single-
> core system, and a pps capture driver that uses hardware latching of
> the timer and the polling method for reading the latched value; given
> all that, at least 4 sets of timehands are needed to avoid losing PPS
> events due to generation changes).

Using more than two timehands increases the chance of a reader trying to
use outdated timehands.  In theory, on hardware with very high
inter-core propagation latency (e.g. with sufficiently large store
buffers) it is possible for another CPU to use non-current timehands.
The more timehands you have, the larger the relative error.

If you have a very specific configuration which contradicts the typical
modern hardware configuration, I do not see a problem with restoring some
more timehands entries under a config option.

From owner-freebsd-ppc@freebsd.org Sat Mar 2 10:51:50 2019
Date: Sat, 2 Mar 2019 12:51:40 +0200
From: Konstantin Belousov
To: Bruce Evans
Cc: Mark Millard, freebsd-hackers Hackers, FreeBSD PowerPC ML
Subject: Re: powerpc64 head -r344018 stuck sleeping problems: th->th_scale * tc_delta(th) overflows unsigned 64 bits sometimes [patched failed]
Message-ID: <20190302105140.GC68879@kib.kiev.ua>
In-Reply-To: <20190302071425.G5025@besplex.bde.org>

On Sat, Mar 02, 2019 at 07:38:20AM +1100, Bruce Evans wrote:
> On Fri, 1 Mar 2019, Konstantin Belousov wrote:
> > +		scale = th->th_scale;
> > +		delta = tc_delta(th);
> > +		if (__predict_false(th->th_scale_bits + fls(delta) > 63)) {
>
> Better, but shouldn't be changed (and the bug that causes the large intervals
> remains unlocated), and if it is changed then it should use:
>
> 		if (delta >= th->th_large_delta)
>
> > @@ -1464,6 +1483,11 @@ tc_windup(struct bintime *new_boottimebin)
> >  	scale += (th->th_adjustment / 1024) * 2199;
> >  	scale /= th->th_counter->tc_frequency;
> >  	th->th_scale = scale * 2;
> > +#ifdef _LP64
> > +	th->th_scale_bits = ffsl(th->th_scale);
> > +#else
> > +	th->th_scale_bits = ffsll(th->th_scale);
> > +#endif
>
> 	th->th_large_delta = ((uint64_t)1 << 63) / scale;

So I am able to reproduce it with some surprising ease on HPET running
on Haswell.

I looked at the generated code for libc, which still uses ffsll() on 32bit
due to the ABI issues.  At least clang generates two BSF instructions for
this code, so I think that forking the vdso_timehands ABI for this is not
reasonable right now.

diff --git a/lib/libc/sys/__vdso_gettimeofday.c b/lib/libc/sys/__vdso_gettimeofday.c index 3749e0473af..3c3c71207bd 100644 --- a/lib/libc/sys/__vdso_gettimeofday.c +++ b/lib/libc/sys/__vdso_gettimeofday.c @@ -32,6 +32,8 @@ __FBSDID("$FreeBSD$"); #include #include #include +#include +#include #include #include #include "libc_private.h" @@ -62,7 +64,8 @@ binuptime(struct bintime *bt, struct vdso_timekeep *tk, int abs) { struct vdso_timehands *th; uint32_t curr, gen; - u_int delta; + uint64_t scale, x; + u_int delta, scale_bits; int error; do { @@ -78,7 +81,19 @@ binuptime(struct bintime *bt, struct vdso_timekeep *tk, int abs) continue; if (error != 0) return (error); - bintime_addx(bt, th->th_scale * delta); + scale = th->th_scale; +#ifdef _LP64 + scale_bits = ffsl(scale); +#else + scale_bits = ffsll(scale); +#endif + if (__predict_false(scale_bits + fls(delta) > 63)) { + x = (scale >> 32) * delta; + scale &= UINT_MAX; + bt->sec += x >> 32; + bintime_addx(bt, x << 32); + } + bintime_addx(bt, scale * delta); if (abs) bintime_add(bt, &th->th_boottime); diff --git a/sys/kern/kern_tc.c b/sys/kern/kern_tc.c index 2656fb4d22f..0a11c726e3c 100644 --- a/sys/kern/kern_tc.c +++ b/sys/kern/kern_tc.c @@ -72,6 +72,7 @@ struct timehands { struct timecounter *th_counter; int64_t th_adjustment; uint64_t th_scale; + uint64_t th_large_delta; u_int th_offset_count; struct bintime th_offset; struct bintime th_bintime; @@ -355,13 +356,22 @@ void binuptime(struct bintime *bt) { struct timehands *th; - u_int gen; + uint64_t scale, x; + u_int delta, gen; do { th = timehands; gen = atomic_load_acq_int(&th->th_generation); *bt = th->th_offset; - bintime_addx(bt, th->th_scale * tc_delta(th)); + scale = th->th_scale; + delta = tc_delta(th); + if (__predict_false(th->th_large_delta <= delta)) { + x = (scale >> 32) * delta; + scale &= UINT_MAX; + bt->sec += x >> 32; + bintime_addx(bt, x << 32); + } + bintime_addx(bt, scale * delta); atomic_thread_fence_acq(); } while (gen == 0 || gen != 
th->th_generation); } @@ -388,13 +398,22 @@ void bintime(struct bintime *bt) { struct timehands *th; - u_int gen; + uint64_t scale, x; + u_int delta, gen; do { th = timehands; gen = atomic_load_acq_int(&th->th_generation); *bt = th->th_bintime; - bintime_addx(bt, th->th_scale * tc_delta(th)); + scale = th->th_scale; + delta = tc_delta(th); + if (__predict_false(th->th_large_delta <= delta)) { + x = (scale >> 32) * delta; + scale &= UINT_MAX; + bt->sec += x >> 32; + bintime_addx(bt, x << 32); + } + bintime_addx(bt, scale * delta); atomic_thread_fence_acq(); } while (gen == 0 || gen != th->th_generation); } @@ -1464,6 +1483,7 @@ tc_windup(struct bintime *new_boottimebin) scale += (th->th_adjustment / 1024) * 2199; scale /= th->th_counter->tc_frequency; th->th_scale = scale * 2; + th->th_large_delta = ((uint64_t)1 << 63) / scale; /* * Now that the struct timehands is again consistent, set the new From owner-freebsd-ppc@freebsd.org Sat Mar 2 17:43:24 2019 Return-Path: Delivered-To: freebsd-ppc@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 993A2150643F; Sat, 2 Mar 2019 17:43:24 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from mail104.syd.optusnet.com.au (mail104.syd.optusnet.com.au [211.29.132.246]) by mx1.freebsd.org (Postfix) with ESMTP id E180270CEF; Sat, 2 Mar 2019 17:43:23 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from [192.168.0.102] (c110-21-101-228.carlnfd1.nsw.optusnet.com.au [110.21.101.228]) by mail104.syd.optusnet.com.au (Postfix) with ESMTPS id 02F9B432B9E; Sun, 3 Mar 2019 04:43:20 +1100 (AEDT) Date: Sun, 3 Mar 2019 04:43:20 +1100 (EST) From: Bruce Evans X-X-Sender: bde@besplex.bde.org To: Konstantin Belousov cc: Mark Millard , freebsd-hackers Hackers , FreeBSD PowerPC ML Subject: Re: powerpc64 head -r344018 stuck sleeping problems: th->th_scale * tc_delta(th) overflows unsigned 64 bits sometimes [patched failed] 
In-Reply-To: <20190302142521.GE68879@kib.kiev.ua> Message-ID: <20190303041441.V4781@besplex.bde.org> References: <20190228145542.GT2420@kib.kiev.ua> <20190228150811.GU2420@kib.kiev.ua> <962D78C3-65BE-40C1-BB50-A0088223C17B@yahoo.com> <28C2BB0A-3DAA-4D18-A317-49A8DD52778F@yahoo.com> <20190301112717.GW2420@kib.kiev.ua> <20190302043936.A4444@besplex.bde.org> <20190301194217.GB68879@kib.kiev.ua> <20190302071425.G5025@besplex.bde.org> <20190302105140.GC68879@kib.kiev.ua> <20190302225513.W3408@besplex.bde.org> <20190302142521.GE68879@kib.kiev.ua> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Optus-CM-Score: 0 X-Optus-CM-Analysis: v=2.2 cv=P6RKvmIu c=1 sm=1 tr=0 a=PalzARQSbocsUSjMRkwAPg==:117 a=PalzARQSbocsUSjMRkwAPg==:17 a=kj9zAlcOel0A:10 a=14Grze90KK8wkU9TH5gA:9 a=CjuIK1q_8ugA:10 X-Rspamd-Queue-Id: E180270CEF X-Spamd-Bar: ------ Authentication-Results: mx1.freebsd.org X-Spamd-Result: default: False [-6.98 / 15.00]; NEURAL_HAM_MEDIUM(-1.00)[-1.000,0]; NEURAL_HAM_SHORT(-0.98)[-0.983,0]; NEURAL_HAM_LONG(-1.00)[-1.000,0]; REPLY(-4.00)[] X-BeenThere: freebsd-ppc@freebsd.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Porting FreeBSD to the PowerPC List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 02 Mar 2019 17:43:24 -0000 On Sat, 2 Mar 2019, Konstantin Belousov wrote: > On Sun, Mar 03, 2019 at 12:03:18AM +1100, Bruce Evans wrote: >> On Sat, 2 Mar 2019, Konstantin Belousov wrote: >>> ... >>> So I am able to reproduce it with some surprising ease on HPET running >>> on Haswell. >> >> So what is the cause of it? Maybe the tickless code doesn't generate >> fake clock ticks right. Or it is just a library bug. The kernel has >> to be slightly real-time to satisfy the requirement of 1 update per. >> Applications are further from being real-time. But isn't it enough >> for the kernel to ensure that the timehands cycle more than once per >> second? > No, I entered ddb as you suggested. 
But using ddb is not normal.  It is convenient that this fixes HPET and
ACPI timecounters after using ddb, but this method doesn't help for
timecounters that wrap fast.  TSC-low at 2GHz wraps in 2 seconds, and
i8254 wraps in a few milliseconds.

>> I don't like changing this at all.  binuptime() was carefully written
>> to not need so much 64-bit arithmetic.
>>
>> If this pessimization is allowed, then it can also handle 64-bit
>> deltas.  Using the better kernel method:
>>
>> 		if (__predict_false(delta >= th->th_large_delta)) {
>> 			bt->sec += (scale >> 32) * (delta >> 32);
>> 			x = (scale >> 32) * (delta & 0xffffffff);
>> 			bt->sec += x >> 32;
>> 			bintime_addx(bt, x << 32);
>> 			x = (scale & 0xffffffff) * (delta >> 32);
>> 			bt->sec += x >> 32;
>> 			bintime_addx(bt, x << 32);
>> 			bintime_addx(bt, (scale & 0xffffffff) *
>> 			    (delta & 0xffffffff));
>> 		} else
>> 			bintime_addx(bt, scale * (delta & 0xffffffff));
> This only makes sense if delta is extended to uint64_t, which requires
> the pass over timecounters.

Yes, that was its point.  It is a bit annoying to have a hardware
timecounter like the TSC that doesn't wrap naturally, but then make it
wrap by masking high bits.

The masking step is also a bit wasteful.  For the TSC, it is 1 step to
discard high bits at the register level, then another step to apply the
mask to discard the high bits again.

>> I just noticed that there is a 64 x 32 -> 64 bit multiplication in the
>> current method.  This can be changed to do explicit 32 x 32 -> 64 bit
>> multiplications and fix the overflow problem at small extra cost on
>> 32-bit arches:
>>
>> 		/* 32-bit arches did the next multiplication implicitly. */
>> 		x = (scale >> 32) * delta;
>> 		/*
>> 		 * And they did the following shifts and most of the adds
>> 		 * implicitly too.  Except shifting x left by 32 lost the
>> 		 * seconds part that the next line handles.  The next line
>> 		 * is the only extra cost for them.
>> */ >> bt->sec += x >> 32; >> bintime_addx(bt, (x << 32) + (scale & 0xffffffff) * delta); > > Ok, what about the following. I'm not sure that I really want this, even if the pessimization is done. But it avoids using fls*(), so is especially good for 32-bit systems and OK for 64-bit systems too, especially in userland where fls*() is in the fast path. > > diff --git a/lib/libc/sys/__vdso_gettimeofday.c b/lib/libc/sys/__vdso_gettimeofday.c > index 3749e0473af..cfe3d96d001 100644 > --- a/lib/libc/sys/__vdso_gettimeofday.c > +++ b/lib/libc/sys/__vdso_gettimeofday.c > @@ -32,6 +32,8 @@ __FBSDID("$FreeBSD$"); > #include > #include > #include > +#include Not needed with 0xffffffff instead of UINT_MAX. The userland part is otherwise little changed. > diff --git a/sys/kern/kern_tc.c b/sys/kern/kern_tc.c > index 2656fb4d22f..2e28f872229 100644 > --- a/sys/kern/kern_tc.c > +++ b/sys/kern/kern_tc.c > ... > @@ -351,17 +352,44 @@ fbclock_getmicrotime(struct timeval *tvp) > } while (gen == 0 || gen != th->th_generation); > } > #else /* !FFCLOCK */ > + > +static void > +bintime_helper(struct bintime *bt, uint64_t *scale, u_int delta) > +{ > + uint64_t x; > + > + x = (*scale >> 32) * delta; > + *scale &= 0xffffffff; > + bt->sec += x >> 32; > + bintime_addx(bt, x << 32); > +} It is probably best to not inline the slow path, but clang tends to inline everything anyway. I prefer my way of writing this in 3 lines. Modifying 'scale' for the next step is especially ugly and pessimal when the next step is in the caller and this function is not inlined. > + > void > binuptime(struct bintime *bt) > { > struct timehands *th; > - u_int gen; > + uint64_t scale; > + u_int delta, gen; > > do { > th = timehands; > gen = atomic_load_acq_int(&th->th_generation); > *bt = th->th_offset; > - bintime_addx(bt, th->th_scale * tc_delta(th)); > + scale = th->th_scale; > + delta = tc_delta(th); > +#ifdef _LP64 > + /* Avoid overflow for scale * delta. 
*/ > + if (__predict_false(th->th_large_delta <= delta)) > + bintime_helper(bt, &scale, delta); > + bintime_addx(bt, scale * delta); > +#else > + /* > + * Also avoid (uint64_t, uint32_t) -> uint64_t > + * multiplication on 32bit arches. > + */ "Also avoid overflow for ..." > + bintime_helper(bt, &scale, delta); > + bintime_addx(bt, (u_int)scale * delta); The cast should be to uint32_t, but better write it as & 0xffffffff as elsewhere. bintime_helper() already reduced 'scale' to 32 bits. The cast might be needed to tell the compiler this, especially when the function is not inlined. Better not do it in the function. The function doesn't even use the reduced value. bintime_helper() is in the fast path in this case, so should be inlined. > +#endif > atomic_thread_fence_acq(); > } while (gen == 0 || gen != th->th_generation); > } This needs lots of testing of course. Bruce From owner-freebsd-ppc@freebsd.org Sat Mar 2 17:14:31 2019 Return-Path: Delivered-To: freebsd-ppc@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 45B3E150565B; Sat, 2 Mar 2019 17:14:31 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from mail106.syd.optusnet.com.au (mail106.syd.optusnet.com.au [211.29.132.42]) by mx1.freebsd.org (Postfix) with ESMTP id 78E0B700AC; Sat, 2 Mar 2019 17:14:29 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from [192.168.0.102] (c110-21-101-228.carlnfd1.nsw.optusnet.com.au [110.21.101.228]) by mail106.syd.optusnet.com.au (Postfix) with ESMTPS id 5BD983DD847; Sun, 3 Mar 2019 04:14:24 +1100 (AEDT) Date: Sun, 3 Mar 2019 04:14:23 +1100 (EST) From: Bruce Evans X-X-Sender: bde@besplex.bde.org To: Poul-Henning Kamp cc: Konstantin Belousov , Ian Lepore , Mark Millard , Mark Millard via freebsd-hackers , Konstantin Belousov , FreeBSD PowerPC ML Subject: Re: powerpc64 head -r344018 stuck sleeping problems: th->th_scale * tc_delta(th) overflows unsigned 64 
bits sometimes [patched failed] In-Reply-To: <9993.1551536230@critter.freebsd.dk> Message-ID: <20190303032006.T4781@besplex.bde.org> References: <20190228145542.GT2420@kib.kiev.ua> <20190228150811.GU2420@kib.kiev.ua> <962D78C3-65BE-40C1-BB50-A0088223C17B@yahoo.com> <28C2BB0A-3DAA-4D18-A317-49A8DD52778F@yahoo.com> <20190301112717.GW2420@kib.kiev.ua> <679402FF-907C-43AF-B18C-8C9CC857D7A6@yahoo.com> <6669.1551473821@critter.freebsd.dk> <210dfd0f50ee6b1149c914ee503502654eb5f328.camel@freebsd.org> <20190302105652.GD68879@kib.kiev.ua> <9993.1551536230@critter.freebsd.dk> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Optus-CM-Score: 0 X-Optus-CM-Analysis: v=2.2 cv=UJetJGXy c=1 sm=1 tr=0 a=PalzARQSbocsUSjMRkwAPg==:117 a=PalzARQSbocsUSjMRkwAPg==:17 a=kj9zAlcOel0A:10 a=nwOOQBBF5AvJ24hNhIcA:9 a=CjuIK1q_8ugA:10 X-Rspamd-Queue-Id: 78E0B700AC X-Spamd-Bar: ------ Authentication-Results: mx1.freebsd.org; spf=pass (mx1.freebsd.org: domain of brde@optusnet.com.au designates 211.29.132.42 as permitted sender) smtp.mailfrom=brde@optusnet.com.au X-Spamd-Result: default: False [-6.25 / 15.00]; ARC_NA(0.00)[]; NEURAL_HAM_MEDIUM(-1.00)[-1.000,0]; RCVD_IN_DNSWL_LOW(-0.10)[42.132.29.211.list.dnswl.org : 127.0.5.1]; FROM_HAS_DN(0.00)[]; R_SPF_ALLOW(-0.20)[+ip4:211.29.132.0/23]; FREEMAIL_FROM(0.00)[optusnet.com.au]; MIME_GOOD(-0.10)[text/plain]; MIME_TRACE(0.00)[0:+]; DMARC_NA(0.00)[optusnet.com.au]; NEURAL_HAM_LONG(-1.00)[-1.000,0]; TO_MATCH_ENVRCPT_SOME(0.00)[]; TO_DN_ALL(0.00)[]; MX_GOOD(-0.01)[cached: extmail.optusnet.com.au]; NEURAL_HAM_SHORT(-0.83)[-0.827,0]; RCPT_COUNT_SEVEN(0.00)[7]; IP_SCORE(-3.11)[ip: (-8.30), ipnet: 211.28.0.0/14(-4.01), asn: 4804(-3.19), country: AU(-0.04)]; RCVD_NO_TLS_LAST(0.10)[]; FROM_EQ_ENVFROM(0.00)[]; R_DKIM_NA(0.00)[]; FREEMAIL_ENVFROM(0.00)[optusnet.com.au]; ASN(0.00)[asn:4804, ipnet:211.28.0.0/14, country:AU]; FREEMAIL_CC(0.00)[gmail.com]; RCVD_COUNT_TWO(0.00)[2] X-BeenThere: freebsd-ppc@freebsd.org X-Mailman-Version: 
2.1.29 Precedence: list List-Id: Porting FreeBSD to the PowerPC List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 02 Mar 2019 17:14:31 -0000

On Sat, 2 Mar 2019, Poul-Henning Kamp wrote:

> --------
> In message <20190302105652.GD68879@kib.kiev.ua>, Konstantin Belousov writes:
>
>> Using more than two timehands increases the chance that a reader tries
>> to use outdated timehands.
>
> No, using only two timehands increases the chance that the reader tries
> to use the timehand which is being updated.

Then it sees the generation change and retries.  We fixed the ordering
of accesses to the generation count so that this is robust.  1 timehands
is always valid, so with 2 timehands there is no wait for the retry
except in the very unlikely event that the generation changes for the
new timehands too.  1 timehands would work too, but the retries would
have to wait while it is updated.

> As long as the reader does not use the timehand being updated, using
> a one or two generations old timehand is OK.

In old versions, there were races checking the generation count.  Having
multiple timehands made these races more unlikely to matter.

> The target-value for delta-t was "a few milliseconds" when I wrote
> timecounters, if somebody has changed that since, I hope they did
> their math first.

Tickless kernels complicate things.  It's surprising that tc_ticktock()
works so well with them.  Calls to hardclock() are not periodic, so
calls to tc_ticktock() are not periodic either.  It has to handle
coalesced and 1/hz ticks.  Too much coalescing would break it.  With my
normal hz = 100, cpu0:timer interrupts still occur at a rate of at least
100 Hz.  These presumably go to hardclock(), so the timing is satisfied.
With hz = 1000, cpu0:timer interrupts only occur at a rate of at least
200 Hz.  This is less than tc_ticktock() expects, but it still works.

Bruce From owner-freebsd-ppc@freebsd.org Sat Mar 2 17:12:25 2019 Return-Path: Delivered-To: freebsd-ppc@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 1D12115054E9; Sat, 2 Mar 2019 17:12:25 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from kib.kiev.ua (kib.kiev.ua [IPv6:2001:470:d5e7:1::1]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 4FEA76FFD4; Sat, 2 Mar 2019 17:12:24 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from tom.home (kib@localhost [127.0.0.1]) by kib.kiev.ua (8.15.2/8.15.2) with ESMTPS id x22HCHPJ034795 (version=TLSv1.2 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO); Sat, 2 Mar 2019 19:12:20 +0200 (EET) (envelope-from kostikbel@gmail.com) DKIM-Filter: OpenDKIM Filter v2.10.3 kib.kiev.ua x22HCHPJ034795 Received: (from kostik@localhost) by tom.home (8.15.2/8.15.2/Submit) id x22HCHq6034794; Sat, 2 Mar 2019 19:12:17 +0200 (EET) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: tom.home: kostik set sender to kostikbel@gmail.com using -f Date: Sat, 2 Mar 2019 19:12:17 +0200 From: Konstantin Belousov To: Poul-Henning Kamp Cc: Ian Lepore , Mark Millard , Mark Millard via freebsd-hackers , bde@freebsd.org, FreeBSD PowerPC ML Subject: Re: powerpc64 head -r344018 stuck sleeping problems: th->th_scale * tc_delta(th) overflows unsigned 64 bits sometimes [patched failed] Message-ID: <20190302171217.GG68879@kib.kiev.ua> References: <962D78C3-65BE-40C1-BB50-A0088223C17B@yahoo.com> <28C2BB0A-3DAA-4D18-A317-49A8DD52778F@yahoo.com> <20190301112717.GW2420@kib.kiev.ua> <679402FF-907C-43AF-B18C-8C9CC857D7A6@yahoo.com> <6669.1551473821@critter.freebsd.dk> <210dfd0f50ee6b1149c914ee503502654eb5f328.camel@freebsd.org> <20190302105652.GD68879@kib.kiev.ua> <9993.1551536230@critter.freebsd.dk> 
<20190302142814.GF68879@kib.kiev.ua> <10227.1551538199@critter.freebsd.dk> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <10227.1551538199@critter.freebsd.dk> User-Agent: Mutt/1.11.3 (2019-02-01) X-Spam-Status: No, score=-1.0 required=5.0 tests=ALL_TRUSTED,BAYES_00, DKIM_ADSP_CUSTOM_MED,FORGED_GMAIL_RCVD,FREEMAIL_FROM, NML_ADSP_CUSTOM_MED autolearn=no autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on tom.home X-BeenThere: freebsd-ppc@freebsd.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Porting FreeBSD to the PowerPC List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 02 Mar 2019 17:12:25 -0000

On Sat, Mar 02, 2019 at 02:49:59PM +0000, Poul-Henning Kamp wrote:
> --------
> In message <20190302142814.GF68879@kib.kiev.ua>, Konstantin Belousov writes:
> >On Sat, Mar 02, 2019 at 02:17:10PM +0000, Poul-Henning Kamp wrote:
>
> >The more timehands there are, the more the propagation can be delayed
> >by the hardware.
>
> One of the two of us doesn't understand how timecounters work.

Perhaps look at the code?

Pre-2015 (or so), timecounters did not work on non-TSO machines.
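
As background to the generation-count and memory-ordering remarks in this part of the thread, here is a rough sketch of a generation-protected read and update written with C11 atomics. It is only an illustration under stated assumptions: struct th_sketch, th_sketch_read() and th_sketch_write() are hypothetical names, the payload is a single word rather than a full struct timehands, and the kernel uses its own atomic(9) primitives rather than <stdatomic.h>.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical, much reduced stand-in for struct timehands. */
struct th_sketch {
	atomic_uint	 gen;		/* 0 means "update in progress" */
	_Atomic uint64_t offset;	/* payload guarded by gen */
};

/* Reader: retry until the generation is nonzero and unchanged. */
static uint64_t
th_sketch_read(struct th_sketch *th)
{
	unsigned int gen;
	uint64_t val;

	do {
		gen = atomic_load_explicit(&th->gen, memory_order_acquire);
		val = atomic_load_explicit(&th->offset, memory_order_relaxed);
		/* Order the payload load before the generation re-check. */
		atomic_thread_fence(memory_order_acquire);
	} while (gen == 0 ||
	    gen != atomic_load_explicit(&th->gen, memory_order_relaxed));
	return (val);
}

/* Writer: invalidate, update the payload, then publish a new generation. */
static void
th_sketch_write(struct th_sketch *th, uint64_t val)
{
	unsigned int ogen;

	ogen = atomic_load_explicit(&th->gen, memory_order_relaxed);
	atomic_store_explicit(&th->gen, 0, memory_order_relaxed);
	/* Order the invalidation before the payload store. */
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&th->offset, val, memory_order_relaxed);
	if (++ogen == 0)	/* skip 0, which is reserved for "in progress" */
		ogen = 1;
	atomic_store_explicit(&th->gen, ogen, memory_order_release);
}
```

Without the acquire and release ordering, a weakly ordered (non-TSO) machine may let the payload accesses move across the generation checks, so a reader could accept a half-updated value. The sketch assumes a single writer.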
From owner-freebsd-ppc@freebsd.org Sat Mar 2 20:17:52 2019 Return-Path: Delivered-To: freebsd-ppc@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 9F6E5150DB01 for ; Sat, 2 Mar 2019 20:17:52 +0000 (UTC) (envelope-from marklmi@yahoo.com) Received: from sonic315-9.consmr.mail.gq1.yahoo.com (sonic315-9.consmr.mail.gq1.yahoo.com [98.137.65.33]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 45EF7814F3 for ; Sat, 2 Mar 2019 20:17:51 +0000 (UTC) (envelope-from marklmi@yahoo.com) X-YMail-OSG: B0Oa4z4VM1m21146NCZx7fPGXyvSCM_EpoYPTMp_66BUKwgT.De9yYKgqAscm64 WNFw_Xyuh9hWZxzEn0.VdxWee4FADJk4ghthbR9eNwccNc0td9hsqYl3v4EGdLkx6oheOq19z06M 0gyFBnoBmWjrD7Wl8eDaxQY6jMqSPWgHZTIPdpzKnuB9Zg_uiK1tHxP4Yuwukd0wkklUhC_clI66 9vnlWuMU2P6e68TyYcGOIGv9IYsJng7wzBYAAbQS1w4YQ2q9VWwk4UeJdYTHwY_o5RXfWpLGesLJ DYA4nmqDmq4.E_eX56.gZA7hEPYhj0ADpep_fTLlwL2A0eyrUBpEHGoIBTFu.r7tdQykr167rhKf jBTDg1Chuwlf545Eu7HNKIUvPIyu8m1PR0nQWzxVOm7Qs6B4_PkYgxih8jLmdRhkfZd3fwf.2O.2 2hUqAVqmGv4p0d0DA_U2bw5GbzpZpnjIUoSwj4RmS0xNog1n4Vsu8AvgyYu.trVQ1F0lsQeNwQWC DinXFFG3WljJpSky1da1erxk8prQZYWni.kBf_4QefOsyayfYcrnu1NFrcteL_6JKZ5CRBKIre6n BX3LVPYt1jZp1o5vFyOWR5GpQwEpjRd5hCeaFxvux6Yfc7YVhrAPYxhrO.jZKFPXfpDjQ8BvLU3X BdWRihB71NLNR8iLvBHb.LX8i2n0CIV0.9UdTFduKcYKs.ZKlDnZ21PVP_YEZNlW5bDrwY6oWpbQ 7iZkyCqwk6N0PYiGcx4lyd6vO521iwUwBOyuiEv5ewCrJj9iuJE_gHjZl_H1SWCXmgS.E6id020X l9isSoq1LoRbRZsGBtzzkG2AlIhaTYFVYnYgKRevaqXFtf1SnPnTttgbR7tdXbltuzuXSVj37fi3 SANixPdBH8744R5QA.dUwZ8c6qxWf5qSWwv6ph8kzAh5Sqs8SmEEMgS.Rlf63ZZNBoVf.25PLJHz 8743x4YhrbybYWjZHHvhsSV2hRLm.ah9VcTnHBLzfFys1YCJ135goLeHqEmiZvIkXyFJxOAEQjSU ICQtQKdIvhkufThOD.MIw.kvCJ5AlbCLgEaVYAodNX1wsi2ElwVyMbbQqoF82ZPpwLjA- Received: from sonic.gate.mail.ne1.yahoo.com by sonic315.consmr.mail.gq1.yahoo.com with HTTP; Sat, 2 Mar 2019 20:17:44 +0000 Received: from 
c-67-170-167-181.hsd1.or.comcast.net (EHLO [192.168.1.113]) ([67.170.167.181]) by smtp416.mail.gq1.yahoo.com (Oath Hermes SMTP Server) with ESMTPA ID 0ea625337651e9767f232be8b735f5f8; Sat, 02 Mar 2019 20:07:34 +0000 (UTC) Content-Type: text/plain; charset=us-ascii Mime-Version: 1.0 (Mac OS X Mail 12.2 \(3445.102.3\)) Subject: Re: powerpc64 head -r344018 stuck sleeping problems: th->th_scale * tc_delta(th) overflows unsigned 64 bits sometimes [patched failed] From: Mark Millard In-Reply-To: <87D6CBD5-AE55-4EC8-8797-D8A9DC3D5A5A@yahoo.com> Date: Sat, 2 Mar 2019 12:07:33 -0800 Cc: Bruce Evans , freebsd-hackers Hackers , FreeBSD PowerPC ML Content-Transfer-Encoding: 7bit Message-Id: References: <20190228145542.GT2420@kib.kiev.ua> <20190228150811.GU2420@kib.kiev.ua> <962D78C3-65BE-40C1-BB50-A0088223C17B@yahoo.com> <28C2BB0A-3DAA-4D18-A317-49A8DD52778F@yahoo.com> <20190301112717.GW2420@kib.kiev.ua> <20190302043936.A4444@besplex.bde.org> <20190301194217.GB68879@kib.kiev.ua> <87D6CBD5-AE55-4EC8-8797-D8A9DC3D5A5A@yahoo.com> To: Konstantin Belousov X-Mailer: Apple Mail (2.3445.102.3) X-Rspamd-Queue-Id: 45EF7814F3 X-Spamd-Bar: ++++ X-Spamd-Result: default: False [4.12 / 15.00]; RCVD_VIA_SMTP_AUTH(0.00)[]; R_SPF_ALLOW(-0.20)[+ptr:yahoo.com]; MV_CASE(0.50)[]; FREEMAIL_FROM(0.00)[yahoo.com]; RCVD_COUNT_THREE(0.00)[3]; TO_DN_ALL(0.00)[]; DKIM_TRACE(0.00)[yahoo.com:+]; MX_GOOD(-0.01)[cached: mta6.am0.yahoodns.net]; DMARC_POLICY_ALLOW(-0.50)[yahoo.com,reject]; FROM_EQ_ENVFROM(0.00)[]; IP_SCORE(1.91)[ip: (7.81), ipnet: 98.137.64.0/21(1.00), asn: 36647(0.80), country: US(-0.07)]; MIME_TRACE(0.00)[0:+]; FREEMAIL_ENVFROM(0.00)[yahoo.com]; ASN(0.00)[asn:36647, ipnet:98.137.64.0/21, country:US]; MID_RHS_MATCH_FROM(0.00)[]; DWL_DNSWL_NONE(0.00)[yahoo.com.dwl.dnswl.org : 127.0.5.0]; ARC_NA(0.00)[]; R_DKIM_ALLOW(-0.20)[yahoo.com:s=s2048]; FROM_HAS_DN(0.00)[]; RCPT_COUNT_THREE(0.00)[4]; NEURAL_SPAM_SHORT(0.82)[0.820,0]; MIME_GOOD(-0.10)[text/plain]; RCVD_TLS_LAST(0.00)[]; 
NEURAL_SPAM_MEDIUM(0.98)[0.981,0]; TO_MATCH_ENVRCPT_SOME(0.00)[]; NEURAL_SPAM_LONG(0.92)[0.925,0]; RCVD_IN_DNSWL_NONE(0.00)[33.65.137.98.list.dnswl.org : 127.0.5.0]; FREEMAIL_CC(0.00)[optusnet.com.au] X-BeenThere: freebsd-ppc@freebsd.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Porting FreeBSD to the PowerPC List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 02 Mar 2019 20:17:52 -0000

On 2019-Mar-1, at 13:19, Mark Millard wrote:

> On 2019-Mar-1, at 11:42, Konstantin Belousov wrote:
>
>> . . .
>> +#ifdef _LP64
>> +	scale_bits = ffsl(scale);
>> +#else
>> +	scale_bits = ffsll(scale);
>> +#endif. . .
>> +		if (__predict_false(scale_bits + fls(delta) > 63)) {
>
>
> The patch from yesterday uniformly used:
>
> int
> fls(int mask)
> {
> 	int bit;
>
> 	if (mask == 0)
> 		return (0);
> 	for (bit = 1; mask != 1; bit++)
> 		mask = (unsigned int)mask >> 1;
> 	return (bit);
> }
>
> that looks for the most significant 1 bit.
>
> The new patch uses in some places:
>
> int
> ffsl(long mask)
> {
> 	int bit;
>
> 	if (mask == 0)
> 		return (0);
> 	for (bit = 1; !(mask & 1); bit++)
> 		mask = (unsigned long)mask >> 1;
> 	return (bit);
> }
>
> that looks for the least significant 1 bit. Similarly
> for:
>
> int
> ffsll(long long mask)
> {
> 	int bit;
>
> 	if (mask == 0)
> 		return (0);
> 	for (bit = 1; !(mask & 1); bit++)
> 		mask = (unsigned long long)mask >> 1;
> 	return (bit);
> }
>
> Was that deliberate?

Be that as it may: I've been watching you and Bruce work on a code
update. I intend to wait until you let me know that you want me to
test before trying again (on the PowerMac G5). (I've not been testing
on anything else: I did not intend to test systems I've not seen a
problem with until after the G5 seemed to be working.)

In part my waiting is because the first patch that I tried left
things unusable and I had to recover from the consequences of a forced
power off. It took a fair amount of time.

I'd not be surprised if the G5 type of context has another problem, separate from what I reported and what you are working on. If so I may not be able to be an effective tester: the fix may just repeat what I saw the first time (based on a messed up context). === Mark Millard marklmi at yahoo.com ( dsl-only.net went away in early 2018-Mar)
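
To make the fls versus ffs distinction raised above concrete, here is a small sketch showing how an overflow check built on the least significant set bit can miss an overflow that a check built on the most significant set bit catches. The helpers sketch_flsll(), sketch_ffsll() and sketch_may_overflow() are hypothetical local functions written for this illustration (they follow the loops quoted above but take unsigned 64-bit arguments); they are not the libc or kernel routines.

```c
#include <assert.h>
#include <stdint.h>

/* 1-based index of the most significant set bit, 0 if mask is 0. */
static int
sketch_flsll(uint64_t mask)
{
	int bit;

	if (mask == 0)
		return (0);
	for (bit = 1; mask != 1; bit++)
		mask >>= 1;
	return (bit);
}

/* 1-based index of the least significant set bit, 0 if mask is 0. */
static int
sketch_ffsll(uint64_t mask)
{
	int bit;

	if (mask == 0)
		return (0);
	for (bit = 1; !(mask & 1); bit++)
		mask >>= 1;
	return (bit);
}

/*
 * Conservative test for whether scale * delta might not fit in 64 bits:
 * the product needs at most flsll(scale) + flsll(delta) bits.
 */
static int
sketch_may_overflow(uint64_t scale, uint32_t delta)
{
	return (sketch_flsll(scale) + sketch_flsll(delta) > 63);
}
```

For scale = 0x8000000000000001 and delta = 2 the product does not fit in 64 bits. The fls-based test flags it (64 + 2 > 63), while the same sum computed with sketch_ffsll(scale) is only 1 + 2 = 3 and would not.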