From owner-svn-src-projects@freebsd.org Mon Aug 3 17:05:52 2020 Return-Path: Delivered-To: svn-src-projects@mailman.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.nyi.freebsd.org (Postfix) with ESMTP id 13AF73A19AD for ; Mon, 3 Aug 2020 17:05:52 +0000 (UTC) (envelope-from jrtc27@jrtc27.com) Received: from mail-wm1-f68.google.com (mail-wm1-f68.google.com [209.85.128.68]) (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256 client-signature RSA-PSS (2048 bits) client-digest SHA256) (Client CN "smtp.gmail.com", Issuer "GTS CA 1O1" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 4BL45q2P60z46fb for ; Mon, 3 Aug 2020 17:05:51 +0000 (UTC) (envelope-from jrtc27@jrtc27.com) Received: by mail-wm1-f68.google.com with SMTP id x5so260182wmi.2 for ; Mon, 03 Aug 2020 10:05:51 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:subject:from:in-reply-to:date:cc :content-transfer-encoding:message-id:references:to; bh=D8AoVUDToB57LttVSmp2lD7qamBQIousaWaQJbknaKM=; b=SYKuxybgKIUgjBbzE70C18IvmJyolNROmr8qMZHxCDfbu8GzibxpXbfqgUu0CkktxW nUq3QvXg6Q9Y1BC7usC84IjAuTM6F/WxhJD5TBSL8Mzc3tO0X73teH8qdlj02k+2B1UG ZLzMmsN1bNrO2agGuCUBvS5diLSBXg1iGWtYL0F5rQeihczI5yj209CQK3SxUMPsrrST f1jMoFuI1dZM15ofYrhorHtQzQ3IUaULb8fEpd3UcicU3HAdtyiMIZ66yRhUv1AIEnD4 31BOxSkKIIxQeZTisrpYJNPtQ4eS0OSJb1AuMZMQ2oaQJxC3Ui4ooe9X0JNs3HQGQKGu q+sw== X-Gm-Message-State: AOAM530SelfIJfLwQVQ0We+B1NJKqm5ONVamBau9mj1myXahIB0GtVK+ WcWKvp4lEep6kZWAg0I0zsSWIg== X-Google-Smtp-Source: ABdhPJw1hU7a0B+z91iH4is/U91PFxhrKYLi06iG8c43tnji0mBPxPC++5AVMJf9t4ZPMIW3R0k5wA== X-Received: by 2002:a1c:19c2:: with SMTP id 185mr225415wmz.8.1596474349703; Mon, 03 Aug 2020 10:05:49 -0700 (PDT) Received: from [192.168.149.251] (trinity-students-nat.trin.cam.ac.uk. [131.111.193.104]) by smtp.gmail.com with ESMTPSA id f124sm364004wmf.7.2020.08.03.10.05.48 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Mon, 03 Aug 2020 10:05:48 -0700 (PDT) Content-Type: text/plain; charset=us-ascii Mime-Version: 1.0 (Mac OS X Mail 13.4 \(3608.80.23.2.2\)) Subject: Re: svn commit: r363773 - projects/clang1100-import/contrib/llvm-project/compiler-rt/lib/builtins From: Jessica Clarke In-Reply-To: Date: Mon, 3 Aug 2020 18:05:47 +0100 Cc: src-committers , svn-src-projects@freebsd.org, Ed Schouten , John Baldwin Content-Transfer-Encoding: quoted-printable Message-Id: References: <202008021807.072I7GM9059504@repo.freebsd.org> <5D6813EB-F270-412D-8C15-8A05CB6353DE@freebsd.org> To: Dimitry Andric X-Mailer: Apple Mail (2.3608.80.23.2.2) X-Rspamd-Queue-Id: 4BL45q2P60z46fb X-Spamd-Bar: -- Authentication-Results: mx1.freebsd.org; dkim=none; dmarc=none; spf=pass (mx1.freebsd.org: domain of jrtc27@jrtc27.com designates 209.85.128.68 as permitted sender) smtp.mailfrom=jrtc27@jrtc27.com X-Spamd-Result: default: False [-2.44 / 15.00]; RCVD_VIA_SMTP_AUTH(0.00)[]; ARC_NA(0.00)[]; MID_RHS_MATCH_FROM(0.00)[]; FROM_HAS_DN(0.00)[]; TO_DN_SOME(0.00)[]; MV_CASE(0.50)[]; R_SPF_ALLOW(-0.20)[+ip4:209.85.128.0/17]; MIME_GOOD(-0.10)[text/plain]; PREVIOUSLY_DELIVERED(0.00)[svn-src-projects@freebsd.org]; DMARC_NA(0.00)[freebsd.org]; RCPT_COUNT_FIVE(0.00)[5]; NEURAL_HAM_LONG(-0.90)[-0.904]; RCVD_COUNT_THREE(0.00)[3]; TO_MATCH_ENVRCPT_SOME(0.00)[]; NEURAL_HAM_SHORT(-1.13)[-1.133]; RCVD_IN_DNSWL_NONE(0.00)[209.85.128.68:from]; NEURAL_HAM_MEDIUM(-0.90)[-0.903]; FORGED_SENDER(0.30)[jrtc27@freebsd.org,jrtc27@jrtc27.com]; RWL_MAILSPIKE_POSSIBLE(0.00)[209.85.128.68:from]; R_DKIM_NA(0.00)[]; MIME_TRACE(0.00)[0:+]; ASN(0.00)[asn:15169, ipnet:209.85.128.0/17, country:US]; FROM_NEQ_ENVFROM(0.00)[jrtc27@freebsd.org,jrtc27@jrtc27.com]; RCVD_TLS_ALL(0.00)[] X-BeenThere: svn-src-projects@freebsd.org X-Mailman-Version: 2.1.33 Precedence: list List-Id: "SVN commit messages for the src " projects" tree" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 03 Aug 2020 17:05:52 -0000 On 3 Aug 2020, at 17:50, Dimitry Andric wrote: >=20 > On 2 Aug 2020, at 22:50, Jessica Clarke wrote: >>=20 >> On 2 Aug 2020, at 19:07, Dimitry Andric wrote: >>>=20 >>> Author: dim >>> Date: Sun Aug 2 18:07:16 2020 >>> New Revision: 363773 >>> URL: https://svnweb.freebsd.org/changeset/base/363773 >>>=20 >>> Log: >>> Reapply r230021, r276851 and a few other commits to compiler-rt >>>=20 >>> Reapply r230021 (by ed): >>>=20 >>> Add a workaround to prevent endless recursion in compiler-rt. >>>=20 >>> SPARC and MIPS CPUs don't have special instructions to count >>> leading/trailing zeroes. The compiler-rt library provides fallback >>> rountines for these. The 64-bit routines, __clzdi2 and __ctzdi2, are >>> implemented as simple wrappers around the compiler built-in >>> __builtin_clz(), assuming these will expand to either 32-bit >>> CPU instructions or calls to __clzsi2 and __ctzsi2. > ... >> Are you sure this is still necessary? https://reviews.llvm.org/D42902 >> (which landed in in 2018 for 7.0.0, way after the original r230021 in >> 2012) followed by a follow-up commit for the correct SPARC macro, = fixed >> this in an OS-independent way upstream, but inside c?zdi2.c = themselves >> so you won't notice a merge conflict. >=20 > Yes, you are probably right, though the preceding comment in int_lib.h > specifically mentions: >=20 > * Instead of placing this workaround in c?zdi2.c, put it in this > * global header to prevent other C files from making the detour > * through __c?zdi2() as well. >=20 > There are indeed quite a lot of calls to __builtin_c[lt]z throughout > compiler-rt, so removing this workaround from the int_lib.h header > will possibly pessimize all of those. Is that going to work alright = for > all affected architectures, which appear to be mips64, riscv and > sparc64? I don't think it matters much. __clzdi2 is this (the preprocessed version of which you provided): dwords x; x.all =3D a; const si_int f =3D -(x.s.high =3D=3D 0); return __builtin_clz((x.s.high & ~f) | (x.s.low & f)) + (f & ((si_int)(sizeof(si_int) * CHAR_BIT))); The implementation of __clzsi2 does a bunch more work than that (not crazy, but still several times more), so I don't think you'll notice all that much. Besides, Clang is our primary compiler and won't suffer from this workaround, so I personally have no qualms about a small performance hit when using GCC and what should be a relatively cold[1] function. I'm of the view we should be as close to upstream as possible, ideally with zero diffs, so personally wouldn't carry patches like this unless it had a noticeable affect on system performance, but even then would encourage others to patch it upstream first and foremost. Jess [1] Even if compiler-rt uses it a lot, I would expect the work done by its callers, and the callers of the callers etc, to far outweigh the time taken within __clzdi2 (and similarly for other related variants). > Note also that https://bugs.llvm.org/show_bug.cgi?id=3D11663 is = *still* > open, possibly because of that reason. But it looks like interest in > the bug simply waned, and it was never closed. >=20 > In any case, I managed to get the new-new-new-mk2-style version of > mips64-xtoolchain working, and got these warnings, so indeed the fix > is now applied in two places: >=20 > /usr/src/contrib/llvm-project/compiler-rt/lib/builtins/clzdi2.c:23: = warning: "__builtin_clz" redefined > 23 | #define __builtin_clz(a) __clzsi2(a) > | > In file included from = /usr/src/contrib/llvm-project/compiler-rt/lib/builtins/clzdi2.c:13: > /usr/src/contrib/llvm-project/compiler-rt/lib/builtins/int_lib.h:112: = note: this is the location of the previous definition > 112 | #define __builtin_clz __clzsi2 > | >=20 > The preprocessed C code looks OK, in the sense that it delegates to > __clzsi2 now: >=20 > extern si_int __clzsi2(si_int); >=20 > si_int __clzdi2(di_int a) { > dwords x; > x.all =3D a; > const si_int f =3D -(x.s.high =3D=3D 0); > return __clzsi2((x.s.high & ~f) | (x.s.low & f)) + > (f & ((si_int)(sizeof(si_int) * 8))); > } >=20 > And the resulting assembly shows no further external calls: >=20 > 0000000000000000 <__clzdi2>: > 0: 67bdffe0 daddiu sp,sp,-32 > 4: ffbf0018 sd ra,24(sp) > 8: ffbc0010 sd gp,16(sp) > c: ffb00008 sd s0,8(sp) > 10: 3c1c0000 lui gp,0x0 > 14: 0399e02d daddu gp,gp,t9 > 18: 679c0000 daddiu gp,gp,0 > 1c: 0004103f dsra32 v0,a0,0x0 > 20: 2c430001 sltiu v1,v0,1 > 24: 0003802f dnegu s0,v1 > 28: 2463ffff addiu v1,v1,-1 > 2c: 00621824 and v1,v1,v0 > 30: 00042000 sll a0,a0,0x0 > 34: 00902024 and a0,a0,s0 > 38: df990000 ld t9,0(gp) > 3c: 0320f809 jalr t9 > 40: 00642025 or a0,v1,a0 > 44: 32100020 andi s0,s0,0x20 > 48: 02021021 addu v0,s0,v0 > 4c: dfbf0018 ld ra,24(sp) > 50: dfbc0010 ld gp,16(sp) > 54: dfb00008 ld s0,8(sp) > 58: 03e00008 jr ra > 5c: 67bd0020 daddiu sp,sp,32 >=20 > In summary, I'm fine with dropping this part of r230021, as long as > everybody is convinced this does not pessimize the rest too much. >=20 > -Dimitry