From: Mark Millard
Subject: Re: svn commit: r346588 - head/lib/libc/powerpc64/string
Date: Sat, 27 Apr 2019 01:50:48 -0700
To: svn-src-head@freebsd.org, Justin Hibbits
X-List-Received-Date: Sat, 27 Apr 2019
 08:51:02 -0000

Justin Hibbits jhibbits at FreeBSD.org wrote on
Fri Apr 26 16:21:47 UTC 2019 :

> This actually uses 'cmpb' which is only available on PowerISA 2.05+, so
> I'll need to pull it out for now, and re-enable it once we have
> ifuncs.  As it stands, this commit broke the G5 and POWER4/POWER5.

As I understand the code like:

        xor     %r8,%r8,%r8     /* %r8 <- Zero. */
        xor     %r0,%r5,%r6     /* Check if double words are different. */
        cmpb    %r7,%r5,%r8     /* Check if double words contain zero. */

        /*
         * If double words are different or contain zero,
         * find what byte is different or contains zero,
         * else load next double words.
         */
        or.     %r9,%r7,%r0
        bne     .Lstrcmp_check_zeros_differences

(and similarly for the loop. . .):

A) Each byte of %r5 that is non-zero needs that byte of %r7
   to be zero.

B) Each byte of %r5 that is zero needs that byte of %r7
   to be non-zero.

(cmpb assigns 0xff as the non-zero value, as I understand, but
even one non-zero bit is sufficient for the overall code
structure.)

If I've got that much correct, then the following might be
an alternative to cmpb for now. I'll explain the code via
commented c/c++-ish code and then show the assembler notation:

unsigned long ul_has_zero_byte(unsigned long b)
{
    unsigned long constexpr low_7bits_of_bytes{0x7f7f7f7f'7f7f7f7ful};
                                                         // Illustrating byte transformations:
    unsigned long const x= b & low_7bits_of_bytes;       // 0x00->0x00, 0x80->0x00, other->ms-bit-in-byte==0
    unsigned long const y= x + low_7bits_of_bytes;       //     ->0x7f,     ->0x7f,      ->ms-bit-in-byte==1
    unsigned long const z= b | y | low_7bits_of_bytes;   //     ->0x7f,     ->0xff,      ->0xff
    return ~z;                                           //     ->0x80,     ->0x00,      ->0x00
}

(used in a powerpc64 context, so unsigned long being 64 bits).
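
As a cross-check of the byte transformations above, here is a minimal C sketch that compares the bit-trick against a plain byte-by-byte loop. The names has_zero_byte, has_zero_byte_ref, count_mismatches and self_test are hypothetical helpers made up for this illustration (they are not from the commit), and uint64_t is assumed in place of the 64-bit unsigned long:

```c
#include <assert.h>
#include <stdint.h>

/* Same steps as ul_has_zero_byte, written with a fixed 64-bit type. */
static uint64_t has_zero_byte(uint64_t b)
{
    const uint64_t low7 = UINT64_C(0x7f7f7f7f7f7f7f7f);
    uint64_t x = b & low7;      /* clear the top bit of every byte        */
    uint64_t y = x + low7;      /* top bit set iff low 7 bits were != 0   */
    return ~(b | y | low7);     /* 0x80 where the byte was 0x00, else 0   */
}

/* Hypothetical reference: 0x80 in each byte position where b holds 0x00. */
static uint64_t has_zero_byte_ref(uint64_t b)
{
    uint64_t r = 0;
    for (int i = 0; i < 8; i++) {
        if (((b >> (8 * i)) & 0xff) == 0)
            r |= UINT64_C(0x80) << (8 * i);
    }
    return r;
}

/* Put every byte value into every position (other bytes 0x01) and
 * count the cases where the two functions disagree. */
static int count_mismatches(void)
{
    const uint64_t filler = UINT64_C(0x0101010101010101);
    int bad = 0;
    for (int pos = 0; pos < 8; pos++) {
        for (unsigned v = 0; v < 256; v++) {
            uint64_t hole = ~(UINT64_C(0xff) << (8 * pos));
            uint64_t b = (filler & hole) | ((uint64_t)v << (8 * pos));
            if (has_zero_byte(b) != has_zero_byte_ref(b))
                bad++;
        }
    }
    return bad;
}

/* Small self-check covering conditions A) and B). */
static void self_test(void)
{
    assert(count_mismatches() == 0);
    /* B) a zero byte gives a non-zero result byte */
    assert(has_zero_byte(UINT64_C(0x4142430045464748)) ==
           UINT64_C(0x0000008000000000));
    /* A) non-zero bytes, including 0x80, give zero result bytes */
    assert(has_zero_byte(UINT64_C(0x8080808080808080)) == 0);
}
```

Calling self_test() from a small main() is enough to exercise it. The result bytes are 0x80 rather than cmpb's 0xff, which is why the sketch only claims zero versus non-zero per byte.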

So, not using %r8 as zero but for a different value, each
cmpb can be replaced by:

# Only once to set up the value in %r8 (Note: 32639=0x7f7f):
        lis     r8,32639
        ori     r8,r8,32639
        rldimi  r8,r8,32,0

# each "cmpb %r7,%r5,%r8" replaced by:
        and     r7,r5,r8
        add     r7,r7,r8
        nor     r7,r7,r5
        andc    r7,r7,r8

(The code is from compiler output, but with registers
adjusted to match the context.)

The c/c++-ish code came from thinking about material from
Hacker's Delight Second Edition and the specific criteria
needed here: it uses part of Figure 6-2 "Find First 0-Byte,
branch-free code", adjusted for width and for returning
something sufficient here.

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went away in early 2018-Mar)
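
For anyone wanting to try the instruction sequence without powerpc64 hardware, here is a rough C model of it, one statement per instruction. It is only a sketch: build_mask, zero_byte_mask, must_stop and self_test are hypothetical names for this illustration, and must_stop is my reading of the xor / zero-check / or. / bne structure rather than anything taken from strcmp.S itself.

```c
#include <assert.h>
#include <stdint.h>

/* Model of the one-time setup of the 0x7f7f... constant. */
static uint64_t build_mask(void)
{
    uint64_t r8 = (uint64_t)32639 << 16;        /* lis    r8,32639    */
    r8 |= 32639;                                /* ori    r8,r8,32639 */
    r8 = (r8 << 32) | (r8 & 0xffffffffu);       /* rldimi r8,r8,32,0  */
    return r8;
}

/* Model of the and/add/nor/andc replacement for cmpb:
 * returns 0x80 in each byte where dword holds 0x00, else 0x00. */
static uint64_t zero_byte_mask(uint64_t dword, uint64_t mask)
{
    uint64_t t = dword & mask;  /* and:  drop the top bit of each byte   */
    t = t + mask;               /* add:  top bit set iff low bits != 0   */
    t = ~(t | dword);           /* nor:  top bit set only for 0x00 bytes */
    t = t & ~mask;              /* andc: keep just the top bits          */
    return t;
}

/* Assumed model of the fast-path test: non-zero means the two
 * doublewords differ or the first one contains a zero byte. */
static int must_stop(uint64_t a, uint64_t b, uint64_t mask)
{
    uint64_t diff = a ^ b;                      /* xor */
    uint64_t zero = zero_byte_mask(a, mask);    /* cmpb replacement */
    return (zero | diff) != 0;                  /* or. / bne */
}

static void self_test(void)
{
    const uint64_t m = build_mask();
    assert(m == UINT64_C(0x7f7f7f7f7f7f7f7f));
    /* identical doublewords with no zero byte: keep looping */
    assert(!must_stop(UINT64_C(0x6162636465666768),
                      UINT64_C(0x6162636465666768), m));
    /* identical doublewords containing a zero byte: stop */
    assert(must_stop(UINT64_C(0x6162630000000000),
                     UINT64_C(0x6162630000000000), m));
    /* different doublewords: stop */
    assert(must_stop(UINT64_C(0x6162636465666768),
                     UINT64_C(0x6162636465666769), m));
}
```

The intermediate sum cannot carry from one byte into the next, since each masked byte is at most 0x7f and 0x7f + 0x7f = 0xfe, so the bytes stay independent.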