From: Dimitry Andric
Date: Mon, 25 Feb 2019 19:07:16 +0000 (UTC)
To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-vendor@freebsd.org
Subject: svn commit: r344535 - in vendor/llvm/dist-release_80: docs lib/MC lib/Target/X86 lib/Transforms/InstCombine lib/Transforms/Scalar test/CodeGen/AArch64 test/CodeGen/X86 test/MC/ELF test/Transforms/I...
Message-Id: <201902251907.x1PJ7Go4040459@repo.freebsd.org>
X-SVN-Group: vendor
X-SVN-Commit-Author: dim
X-SVN-Commit-Paths: in vendor/llvm/dist-release_80: docs lib/MC lib/Target/X86 lib/Transforms/InstCombine lib/Transforms/Scalar test/CodeGen/AArch64 test/CodeGen/X86 test/MC/ELF test/Transforms/InstCombine
X-SVN-Commit-Revision: 344535
X-SVN-Commit-Repository: base

Author: dim
Date: Mon Feb 25 19:07:16 2019
New Revision: 344535
URL: https://svnweb.freebsd.org/changeset/base/344535

Log:
  Vendor import of llvm release_80 branch r354799:
  https://llvm.org/svn/llvm-project/llvm/branches/release_80@354799

Added:
  vendor/llvm/dist-release_80/test/CodeGen/X86/pr40730.ll

Modified:
  vendor/llvm/dist-release_80/docs/ReleaseNotes.rst
  vendor/llvm/dist-release_80/docs/index.rst
  vendor/llvm/dist-release_80/lib/MC/ELFObjectWriter.cpp
  vendor/llvm/dist-release_80/lib/Target/X86/X86ISelLowering.cpp
  vendor/llvm/dist-release_80/lib/Transforms/InstCombine/InstructionCombining.cpp
  vendor/llvm/dist-release_80/lib/Transforms/Scalar/MergeICmps.cpp
  vendor/llvm/dist-release_80/test/CodeGen/AArch64/machine-outliner-bad-adrp.mir
  vendor/llvm/dist-release_80/test/MC/ELF/invalid-symver.s
  vendor/llvm/dist-release_80/test/MC/ELF/multiple-different-symver.s
  vendor/llvm/dist-release_80/test/Transforms/InstCombine/vec_shuffle.ll

Modified:
vendor/llvm/dist-release_80/docs/ReleaseNotes.rst ============================================================================== --- vendor/llvm/dist-release_80/docs/ReleaseNotes.rst Mon Feb 25 18:52:47 2019 (r344534) +++ vendor/llvm/dist-release_80/docs/ReleaseNotes.rst Mon Feb 25 19:07:16 2019 (r344535) @@ -5,12 +5,6 @@ LLVM 8.0.0 Release Notes .. contents:: :local: -.. warning:: - These are in-progress notes for the upcoming LLVM 8 release. - Release notes for previous releases can be found on - `the Download Page `_. - - Introduction ============ @@ -26,11 +20,25 @@ have questions or comments, the `LLVM Developer's Mail `_ is a good place to send them. -Note that if you are reading this file from a Subversion checkout or the main -LLVM web page, this document applies to the *next* release, not the current -one. To see the release notes for a specific release, please see the `releases -page `_. +Minimum Required Compiler Version +================================= +As `discussed on the mailing list +`_, +building LLVM will soon require more recent toolchains as follows: +============= ==== +Clang 3.5 +Apple Clang 6.0 +GCC 5.1 +Visual Studio 2017 +============= ==== + +A new CMake check when configuring LLVM provides a soft-error if your +toolchain will become unsupported soon. You can opt out of the soft-error by +setting the ``LLVM_TEMPORARILY_ALLOW_OLD_TOOLCHAIN`` CMake variable to +``ON``. + + Non-comprehensive list of changes in this release ================================================= .. NOTE @@ -40,27 +48,11 @@ Non-comprehensive list of changes in this release functionality, or simply have a lot to talk about), see the `NOTE` below for adding a new subsection. 
-* As `discussed on the mailing list - `_, - building LLVM will soon require more recent toolchains as follows: - - ============= ==== - Clang 3.5 - Apple Clang 6.0 - GCC 5.1 - Visual Studio 2017 - ============= ==== - - A new CMake check when configuring LLVM provides a soft-error if your - toolchain will become unsupported soon. You can opt out of the soft-error by - setting the ``LLVM_TEMPORARILY_ALLOW_OLD_TOOLCHAIN`` CMake variable to - ``ON``. - * The **llvm-cov** tool can now export lcov trace files using the `-format=lcov` option of the `export` command. -* The add_llvm_loadable_module CMake macro has been removed. The - add_llvm_library macro with the MODULE argument now provides the same +* The ``add_llvm_loadable_module`` CMake macro has been removed. The + ``add_llvm_library`` macro with the ``MODULE`` argument now provides the same functionality. See `Writing an LLVM Pass `_. @@ -70,6 +62,24 @@ Non-comprehensive list of changes in this release * Added support for labels as offsets in ``.reloc`` directive. +* Support for precise identification of X86 instructions with memory operands, + by using debug information. This supports profile-driven cache prefetching. + It is enabled with the ``-x86-discriminate-memops`` LLVM Flag. + +* Support for profile-driven software cache prefetching on X86. This is part of + a larger system, consisting of: an offline cache prefetches recommender, + AutoFDO tooling, and LLVM. In this system, a binary compiled with + ``-x86-discriminate-memops`` is run under the observation of the recommender. + The recommender identifies certain memory access instructions by their binary + file address, and recommends a prefetch of a specific type (NTA, T0, etc) be + performed at a specified fixed offset from such an instruction's memory + operand. 
Next, this information needs to be converted to the AutoFDO syntax + and the resulting profile may be passed back to the compiler with the LLVM + flag ``-prefetch-hints-file``, together with the exact same set of + compilation parameters used for the original binary. More information is + available in the `RFC + `_. + .. NOTE If you would like to document a larger change, then you can add a subsection about it right here. You can copy the following boilerplate @@ -83,10 +93,19 @@ Non-comprehensive list of changes in this release Changes to the LLVM IR ---------------------- +* Function attribute ``speculative_load_hardening`` has been introduced to + allow indicating that `Speculative Load Hardening + `_ must be enabled for the function body. + Changes to the AArch64 Target ----------------------------- +* Support for Speculative Load Hardening has been added. + +* Initial support for the Tiny code model, where code and its statically + defined symbols must live within 1MB of each other. + * Added support for the ``.arch_extension`` assembler directive, just like on ARM. @@ -126,14 +145,59 @@ Changes to the MIPS Target Changes to the PowerPC Target ----------------------------- - During this release ... 
+* Switched to non-PIC default +* Deprecated Darwin support + +* Enabled Out-of-Order scheduling for P9 + +* Better overload rules for compatible vector type parameter + +* Support constraint ‘wi’, modifier ‘x’ and VSX registers in inline asm + +* More ``__float128`` support + +* Added new builtins like vector int128 ``pack``/``unpack`` and + ``stxvw4x.be``/``stxvd2x.be`` + +* Provided significant improvements to the automatic vectorizer + +* Code-gen improvements (especially for Power9) + +* Fixed some long-standing bugs in the back end + +* Added experimental prologue/epilogue improvements + +* Enabled builtins tests in compiler-rt + +* Add ``___fixunstfti``/``floattitf`` in compiler-rt to support conversion + between IBM double-double and unsigned int128 + +* Disable randomized address space when running the sanitizers on Linux ppc64le + +* Completed support in LLD for ELFv2 + +* Enabled llvm-exegesis latency mode for PPC + + Changes to the X86 Target ------------------------- * Machine model for AMD bdver2 (Piledriver) CPU was added. It is used to support instruction scheduling and other instruction cost heuristics. +* New AVX512F gather and scatter intrinsics were added that take a mask + instead of a scalar integer. This removes the need for a bitcast in IR. The + new intrinsics are named like the old intrinsics with ``llvm.avx512.`` + replaced with ``llvm.avx512.mask.``. The old intrinsics will be removed in a + future release. + +* Added ``cascadelake`` as a CPU name for -march. This is ``skylake-avx512`` + with the addition of the ``avx512vnni`` instruction set. + +* ADCX instruction will no longer be emitted. This instruction is rarely better + than the legacy ADC instruction and just increased code size. + Changes to the AMDGPU Target ----------------------------- @@ -156,7 +220,11 @@ use for it will be to add support for returning small return values, once the underlying WebAssembly platform itself supports it. 
Additionally, multithreading support is not yet included in the stable ABI. +Changes to the Nios2 Target +--------------------------- +* The Nios2 target was removed from this release. + Changes to the OCaml bindings ----------------------------- @@ -168,6 +236,14 @@ Changes to the C API Changes to the DAG infrastructure --------------------------------- + +Changes to LLDB +=============== +* Printed source code is now syntax highlighted in the terminal (only for C + languages). + +* The expression command now supports tab completing expressions. + External Open Source Projects Using LLVM 8 ========================================== Modified: vendor/llvm/dist-release_80/docs/index.rst ============================================================================== --- vendor/llvm/dist-release_80/docs/index.rst Mon Feb 25 18:52:47 2019 (r344534) +++ vendor/llvm/dist-release_80/docs/index.rst Mon Feb 25 19:07:16 2019 (r344535) @@ -1,11 +1,6 @@ Overview ======== -.. warning:: - - If you are using a released version of LLVM, see `the download page - `_ to find your documentation. - The LLVM compiler infrastructure supports a wide range of projects, from industrial strength compilers to specialized JIT applications to small research projects. Modified: vendor/llvm/dist-release_80/lib/MC/ELFObjectWriter.cpp ============================================================================== --- vendor/llvm/dist-release_80/lib/MC/ELFObjectWriter.cpp Mon Feb 25 18:52:47 2019 (r344534) +++ vendor/llvm/dist-release_80/lib/MC/ELFObjectWriter.cpp Mon Feb 25 19:07:16 2019 (r344535) @@ -1275,14 +1275,20 @@ void ELFObjectWriter::executePostLayoutBinding(MCAssem if (!Symbol.isUndefined() && !Rest.startswith("@@@")) continue; - // FIXME: produce a better error message. + // FIXME: Get source locations for these errors or diagnose them earlier. 
if (Symbol.isUndefined() && Rest.startswith("@@") && - !Rest.startswith("@@@")) - report_fatal_error("A @@ version cannot be undefined"); + !Rest.startswith("@@@")) { + Asm.getContext().reportError(SMLoc(), "versioned symbol " + AliasName + + " must be defined"); + continue; + } - if (Renames.count(&Symbol) && Renames[&Symbol] != Alias) - report_fatal_error(llvm::Twine("Multiple symbol versions defined for ") + - Symbol.getName()); + if (Renames.count(&Symbol) && Renames[&Symbol] != Alias) { + Asm.getContext().reportError( + SMLoc(), llvm::Twine("multiple symbol versions defined for ") + + Symbol.getName()); + continue; + } Renames.insert(std::make_pair(&Symbol, Alias)); } Modified: vendor/llvm/dist-release_80/lib/Target/X86/X86ISelLowering.cpp ============================================================================== --- vendor/llvm/dist-release_80/lib/Target/X86/X86ISelLowering.cpp Mon Feb 25 18:52:47 2019 (r344534) +++ vendor/llvm/dist-release_80/lib/Target/X86/X86ISelLowering.cpp Mon Feb 25 19:07:16 2019 (r344535) @@ -13884,7 +13884,6 @@ static SDValue lowerVectorShuffleAsLanePermuteAndPermu int NumEltsPerLane = NumElts / NumLanes; SmallVector SrcLaneMask(NumLanes, SM_SentinelUndef); - SmallVector LaneMask(NumElts, SM_SentinelUndef); SmallVector PermMask(NumElts, SM_SentinelUndef); for (int i = 0; i != NumElts; ++i) { @@ -13899,8 +13898,18 @@ static SDValue lowerVectorShuffleAsLanePermuteAndPermu return SDValue(); SrcLaneMask[DstLane] = SrcLane; - LaneMask[i] = (SrcLane * NumEltsPerLane) + (i % NumEltsPerLane); PermMask[i] = (DstLane * NumEltsPerLane) + (M % NumEltsPerLane); + } + + // Make sure we set all elements of the lane mask, to avoid undef propagation. 
+ SmallVector LaneMask(NumElts, SM_SentinelUndef); + for (int DstLane = 0; DstLane != NumLanes; ++DstLane) { + int SrcLane = SrcLaneMask[DstLane]; + if (0 <= SrcLane) + for (int j = 0; j != NumEltsPerLane; ++j) { + LaneMask[(DstLane * NumEltsPerLane) + j] = + (SrcLane * NumEltsPerLane) + j; + } } // If we're only shuffling a single lowest lane and the rest are identity Modified: vendor/llvm/dist-release_80/lib/Transforms/InstCombine/InstructionCombining.cpp ============================================================================== --- vendor/llvm/dist-release_80/lib/Transforms/InstCombine/InstructionCombining.cpp Mon Feb 25 18:52:47 2019 (r344534) +++ vendor/llvm/dist-release_80/lib/Transforms/InstCombine/InstructionCombining.cpp Mon Feb 25 19:07:16 2019 (r344535) @@ -1376,7 +1376,8 @@ Instruction *InstCombiner::foldVectorBinop(BinaryOpera if (match(LHS, m_ShuffleVector(m_Value(L0), m_Value(L1), m_Constant(Mask))) && match(RHS, m_ShuffleVector(m_Value(R0), m_Value(R1), m_Specific(Mask))) && LHS->hasOneUse() && RHS->hasOneUse() && - cast(LHS)->isConcat()) { + cast(LHS)->isConcat() && + cast(RHS)->isConcat()) { // This transform does not have the speculative execution constraint as // below because the shuffle is a concatenation. The new binops are // operating on exactly the same elements as the existing binop. Modified: vendor/llvm/dist-release_80/lib/Transforms/Scalar/MergeICmps.cpp ============================================================================== --- vendor/llvm/dist-release_80/lib/Transforms/Scalar/MergeICmps.cpp Mon Feb 25 18:52:47 2019 (r344534) +++ vendor/llvm/dist-release_80/lib/Transforms/Scalar/MergeICmps.cpp Mon Feb 25 19:07:16 2019 (r344535) @@ -11,21 +11,37 @@ // later typically inlined as a chain of efficient hardware comparisons). This // typically benefits c++ member or nonmember operator==(). 
// -// The basic idea is to replace a larger chain of integer comparisons loaded -// from contiguous memory locations into a smaller chain of such integer +// The basic idea is to replace a longer chain of integer comparisons loaded +// from contiguous memory locations into a shorter chain of larger integer // comparisons. Benefits are double: // - There are less jumps, and therefore less opportunities for mispredictions // and I-cache misses. // - Code size is smaller, both because jumps are removed and because the // encoding of a 2*n byte compare is smaller than that of two n-byte // compares. - +// +// Example: +// +// struct S { +// int a; +// char b; +// char c; +// uint16_t d; +// bool operator==(const S& o) const { +// return a == o.a && b == o.b && c == o.c && d == o.d; +// } +// }; +// +// Is optimized as : +// +// bool S::operator==(const S& o) const { +// return memcmp(this, &o, 8) == 0; +// } +// +// Which will later be expanded (ExpandMemCmp) as a single 8-bytes icmp. +// //===----------------------------------------------------------------------===// -#include -#include -#include -#include #include "llvm/Analysis/Loads.h" #include "llvm/Analysis/TargetLibraryInfo.h" #include "llvm/Analysis/TargetTransformInfo.h" @@ -34,6 +50,10 @@ #include "llvm/Pass.h" #include "llvm/Transforms/Scalar.h" #include "llvm/Transforms/Utils/BuildLibCalls.h" +#include +#include +#include +#include using namespace llvm; @@ -50,76 +70,95 @@ static bool isSimpleLoadOrStore(const Instruction *I) return false; } -// A BCE atom. +// A BCE atom "Binary Compare Expression Atom" represents an integer load +// that is a constant offset from a base value, e.g. `a` or `o.c` in the example +// at the top. struct BCEAtom { - BCEAtom() : GEP(nullptr), LoadI(nullptr), Offset() {} + BCEAtom() = default; + BCEAtom(GetElementPtrInst *GEP, LoadInst *LoadI, int BaseId, APInt Offset) + : GEP(GEP), LoadI(LoadI), BaseId(BaseId), Offset(Offset) {} - const Value *Base() const { return GEP ? 
GEP->getPointerOperand() : nullptr; } - + // We want to order BCEAtoms by (Base, Offset). However we cannot use + // the pointer values for Base because these are non-deterministic. + // To make sure that the sort order is stable, we first assign to each atom + // base value an index based on its order of appearance in the chain of + // comparisons. We call this index `BaseOrdering`. For example, for: + // b[3] == c[2] && a[1] == d[1] && b[4] == c[3] + // | block 1 | | block 2 | | block 3 | + // b gets assigned index 0 and a index 1, because b appears as LHS in block 1, + // which is before block 2. + // We then sort by (BaseOrdering[LHS.Base()], LHS.Offset), which is stable. bool operator<(const BCEAtom &O) const { - assert(Base() && "invalid atom"); - assert(O.Base() && "invalid atom"); - // Just ordering by (Base(), Offset) is sufficient. However because this - // means that the ordering will depend on the addresses of the base - // values, which are not reproducible from run to run. To guarantee - // stability, we use the names of the values if they exist; we sort by: - // (Base.getName(), Base(), Offset). - const int NameCmp = Base()->getName().compare(O.Base()->getName()); - if (NameCmp == 0) { - if (Base() == O.Base()) { - return Offset.slt(O.Offset); - } - return Base() < O.Base(); - } - return NameCmp < 0; + return BaseId != O.BaseId ? BaseId < O.BaseId : Offset.slt(O.Offset); } - GetElementPtrInst *GEP; - LoadInst *LoadI; + GetElementPtrInst *GEP = nullptr; + LoadInst *LoadI = nullptr; + unsigned BaseId = 0; APInt Offset; }; +// A class that assigns increasing ids to values in the order in which they are +// seen. See comment in `BCEAtom::operator<()``. +class BaseIdentifier { +public: + // Returns the id for value `Base`, after assigning one if `Base` has not been + // seen before. 
+ int getBaseId(const Value *Base) { + assert(Base && "invalid base"); + const auto Insertion = BaseToIndex.try_emplace(Base, Order); + if (Insertion.second) + ++Order; + return Insertion.first->second; + } + +private: + unsigned Order = 1; + DenseMap BaseToIndex; +}; + // If this value is a load from a constant offset w.r.t. a base address, and // there are no other users of the load or address, returns the base address and // the offset. -BCEAtom visitICmpLoadOperand(Value *const Val) { - BCEAtom Result; - if (auto *const LoadI = dyn_cast(Val)) { - LLVM_DEBUG(dbgs() << "load\n"); - if (LoadI->isUsedOutsideOfBlock(LoadI->getParent())) { - LLVM_DEBUG(dbgs() << "used outside of block\n"); - return {}; - } - // Do not optimize atomic loads to non-atomic memcmp - if (!LoadI->isSimple()) { - LLVM_DEBUG(dbgs() << "volatile or atomic\n"); - return {}; - } - Value *const Addr = LoadI->getOperand(0); - if (auto *const GEP = dyn_cast(Addr)) { - LLVM_DEBUG(dbgs() << "GEP\n"); - if (GEP->isUsedOutsideOfBlock(LoadI->getParent())) { - LLVM_DEBUG(dbgs() << "used outside of block\n"); - return {}; - } - const auto &DL = GEP->getModule()->getDataLayout(); - if (!isDereferenceablePointer(GEP, DL)) { - LLVM_DEBUG(dbgs() << "not dereferenceable\n"); - // We need to make sure that we can do comparison in any order, so we - // require memory to be unconditionnally dereferencable. 
- return {}; - } - Result.Offset = APInt(DL.getPointerTypeSizeInBits(GEP->getType()), 0); - if (GEP->accumulateConstantOffset(DL, Result.Offset)) { - Result.GEP = GEP; - Result.LoadI = LoadI; - } - } +BCEAtom visitICmpLoadOperand(Value *const Val, BaseIdentifier &BaseId) { + auto *const LoadI = dyn_cast(Val); + if (!LoadI) + return {}; + LLVM_DEBUG(dbgs() << "load\n"); + if (LoadI->isUsedOutsideOfBlock(LoadI->getParent())) { + LLVM_DEBUG(dbgs() << "used outside of block\n"); + return {}; } - return Result; + // Do not optimize atomic loads to non-atomic memcmp + if (!LoadI->isSimple()) { + LLVM_DEBUG(dbgs() << "volatile or atomic\n"); + return {}; + } + Value *const Addr = LoadI->getOperand(0); + auto *const GEP = dyn_cast(Addr); + if (!GEP) + return {}; + LLVM_DEBUG(dbgs() << "GEP\n"); + if (GEP->isUsedOutsideOfBlock(LoadI->getParent())) { + LLVM_DEBUG(dbgs() << "used outside of block\n"); + return {}; + } + const auto &DL = GEP->getModule()->getDataLayout(); + if (!isDereferenceablePointer(GEP, DL)) { + LLVM_DEBUG(dbgs() << "not dereferenceable\n"); + // We need to make sure that we can do comparison in any order, so we + // require memory to be unconditionnally dereferencable. + return {}; + } + APInt Offset = APInt(DL.getPointerTypeSizeInBits(GEP->getType()), 0); + if (!GEP->accumulateConstantOffset(DL, Offset)) + return {}; + return BCEAtom(GEP, LoadI, BaseId.getBaseId(GEP->getPointerOperand()), + Offset); } -// A basic block with a comparison between two BCE atoms. +// A basic block with a comparison between two BCE atoms, e.g. `a == o.a` in the +// example at the top. // The block might do extra work besides the atom comparison, in which case // doesOtherWork() returns true. 
Under some conditions, the block can be // split into the atom comparison part and the "other work" part @@ -137,9 +176,7 @@ class BCECmpBlock { if (Rhs_ < Lhs_) std::swap(Rhs_, Lhs_); } - bool IsValid() const { - return Lhs_.Base() != nullptr && Rhs_.Base() != nullptr; - } + bool IsValid() const { return Lhs_.BaseId != 0 && Rhs_.BaseId != 0; } // Assert the block is consistent: If valid, it should also have // non-null members besides Lhs_ and Rhs_. @@ -265,7 +302,8 @@ bool BCECmpBlock::doesOtherWork() const { // Visit the given comparison. If this is a comparison between two valid // BCE atoms, returns the comparison. BCECmpBlock visitICmp(const ICmpInst *const CmpI, - const ICmpInst::Predicate ExpectedPredicate) { + const ICmpInst::Predicate ExpectedPredicate, + BaseIdentifier &BaseId) { // The comparison can only be used once: // - For intermediate blocks, as a branch condition. // - For the final block, as an incoming value for the Phi. @@ -275,25 +313,27 @@ BCECmpBlock visitICmp(const ICmpInst *const CmpI, LLVM_DEBUG(dbgs() << "cmp has several uses\n"); return {}; } - if (CmpI->getPredicate() == ExpectedPredicate) { - LLVM_DEBUG(dbgs() << "cmp " - << (ExpectedPredicate == ICmpInst::ICMP_EQ ? "eq" : "ne") - << "\n"); - auto Lhs = visitICmpLoadOperand(CmpI->getOperand(0)); - if (!Lhs.Base()) return {}; - auto Rhs = visitICmpLoadOperand(CmpI->getOperand(1)); - if (!Rhs.Base()) return {}; - const auto &DL = CmpI->getModule()->getDataLayout(); - return BCECmpBlock(std::move(Lhs), std::move(Rhs), - DL.getTypeSizeInBits(CmpI->getOperand(0)->getType())); - } - return {}; + if (CmpI->getPredicate() != ExpectedPredicate) + return {}; + LLVM_DEBUG(dbgs() << "cmp " + << (ExpectedPredicate == ICmpInst::ICMP_EQ ? 
"eq" : "ne") + << "\n"); + auto Lhs = visitICmpLoadOperand(CmpI->getOperand(0), BaseId); + if (!Lhs.BaseId) + return {}; + auto Rhs = visitICmpLoadOperand(CmpI->getOperand(1), BaseId); + if (!Rhs.BaseId) + return {}; + const auto &DL = CmpI->getModule()->getDataLayout(); + return BCECmpBlock(std::move(Lhs), std::move(Rhs), + DL.getTypeSizeInBits(CmpI->getOperand(0)->getType())); } // Visit the given comparison block. If this is a comparison between two valid // BCE atoms, returns the comparison. BCECmpBlock visitCmpBlock(Value *const Val, BasicBlock *const Block, - const BasicBlock *const PhiBlock) { + const BasicBlock *const PhiBlock, + BaseIdentifier &BaseId) { if (Block->empty()) return {}; auto *const BranchI = dyn_cast(Block->getTerminator()); if (!BranchI) return {}; @@ -306,7 +346,7 @@ BCECmpBlock visitCmpBlock(Value *const Val, BasicBlock auto *const CmpI = dyn_cast(Val); if (!CmpI) return {}; LLVM_DEBUG(dbgs() << "icmp\n"); - auto Result = visitICmp(CmpI, ICmpInst::ICMP_EQ); + auto Result = visitICmp(CmpI, ICmpInst::ICMP_EQ, BaseId); Result.CmpI = CmpI; Result.BranchI = BranchI; return Result; @@ -323,7 +363,8 @@ BCECmpBlock visitCmpBlock(Value *const Val, BasicBlock assert(BranchI->getNumSuccessors() == 2 && "expecting a cond branch"); BasicBlock *const FalseBlock = BranchI->getSuccessor(1); auto Result = visitICmp( - CmpI, FalseBlock == PhiBlock ? ICmpInst::ICMP_EQ : ICmpInst::ICMP_NE); + CmpI, FalseBlock == PhiBlock ? 
ICmpInst::ICMP_EQ : ICmpInst::ICMP_NE, + BaseId); Result.CmpI = CmpI; Result.BranchI = BranchI; return Result; @@ -335,9 +376,9 @@ static inline void enqueueBlock(std::vectorgetName() << "': Found cmp of " << Comparison.SizeBits() - << " bits between " << Comparison.Lhs().Base() << " + " + << " bits between " << Comparison.Lhs().BaseId << " + " << Comparison.Lhs().Offset << " and " - << Comparison.Rhs().Base() << " + " + << Comparison.Rhs().BaseId << " + " << Comparison.Rhs().Offset << "\n"); LLVM_DEBUG(dbgs() << "\n"); Comparisons.push_back(Comparison); @@ -360,8 +401,8 @@ class BCECmpChain { private: static bool IsContiguous(const BCECmpBlock &First, const BCECmpBlock &Second) { - return First.Lhs().Base() == Second.Lhs().Base() && - First.Rhs().Base() == Second.Rhs().Base() && + return First.Lhs().BaseId == Second.Lhs().BaseId && + First.Rhs().BaseId == Second.Rhs().BaseId && First.Lhs().Offset + First.SizeBits() / 8 == Second.Lhs().Offset && First.Rhs().Offset + First.SizeBits() / 8 == Second.Rhs().Offset; } @@ -385,11 +426,12 @@ BCECmpChain::BCECmpChain(const std::vector Comparisons; + BaseIdentifier BaseId; for (size_t BlockIdx = 0; BlockIdx < Blocks.size(); ++BlockIdx) { BasicBlock *const Block = Blocks[BlockIdx]; assert(Block && "invalid block"); BCECmpBlock Comparison = visitCmpBlock(Phi.getIncomingValueForBlock(Block), - Block, Phi.getParent()); + Block, Phi.getParent(), BaseId); Comparison.BB = Block; if (!Comparison.IsValid()) { LLVM_DEBUG(dbgs() << "chain with invalid BCECmpBlock, no merge.\n"); @@ -466,9 +508,10 @@ BCECmpChain::BCECmpChain(const std::vector @shuffle_v8i32_0dcd3f14(<8 x i32> %a, <8 x i32> %b) { +; CHECK-LABEL: shuffle_v8i32_0dcd3f14: +; CHECK: # %bb.0: +; CHECK-NEXT: vextractf128 $1, %ymm0, %xmm2 +; CHECK-NEXT: vblendps {{.*#+}} xmm2 = xmm2[0],xmm0[1,2,3] +; CHECK-NEXT: vpermilps {{.*#+}} xmm2 = xmm2[3,1,1,0] +; CHECK-NEXT: vinsertf128 $1, %xmm2, %ymm0, %ymm0 +; CHECK-NEXT: vperm2f128 {{.*#+}} ymm1 = ymm1[2,3,2,3] +; CHECK-NEXT: 
vpermilpd {{.*#+}} ymm1 = ymm1[0,0,3,2] +; CHECK-NEXT: vblendps {{.*#+}} ymm0 = ymm0[0],ymm1[1,2,3],ymm0[4],ymm1[5],ymm0[6,7] +; CHECK-NEXT: retq + %shuffle = shufflevector <8 x i32> %a, <8 x i32> %b, <8 x i32> + ret <8 x i32> %shuffle +} + +; CHECK: .LCPI1_0: +; CHECK-NEXT: .quad 60129542157 +; CHECK-NEXT: .quad 60129542157 +; CHECK-NEXT: .quad 68719476736 +; CHECK-NEXT: .quad 60129542157 + +define <8 x i32> @shuffle_v8i32_0dcd3f14_constant(<8 x i32> %a0) { +; CHECK-LABEL: shuffle_v8i32_0dcd3f14_constant: +; CHECK: # %bb.0: +; CHECK-NEXT: vextractf128 $1, %ymm0, %xmm1 +; CHECK-NEXT: vblendps {{.*#+}} xmm1 = xmm1[0],xmm0[1,2,3] +; CHECK-NEXT: vpermilps {{.*#+}} xmm1 = xmm1[3,1,1,0] +; CHECK-NEXT: vinsertf128 $1, %xmm1, %ymm0, %ymm0 +; CHECK-NEXT: vblendps {{.*#+}} ymm0 = ymm0[0],mem[1,2,3],ymm0[4],mem[5],ymm0[6,7] +; CHECK-NEXT: retq + %res = shufflevector <8 x i32> %a0, <8 x i32> , <8 x i32> + ret <8 x i32> %res +} Modified: vendor/llvm/dist-release_80/test/MC/ELF/invalid-symver.s ============================================================================== --- vendor/llvm/dist-release_80/test/MC/ELF/invalid-symver.s Mon Feb 25 18:52:47 2019 (r344534) +++ vendor/llvm/dist-release_80/test/MC/ELF/invalid-symver.s Mon Feb 25 19:07:16 2019 (r344535) @@ -1,7 +1,7 @@ // RUN: not llvm-mc -filetype=obj -triple x86_64-pc-linux-gnu %s -o %t 2> %t.out // RUN: FileCheck --input-file=%t.out %s -// CHECK: A @@ version cannot be undefined +// CHECK: error: versioned symbol foo@@bar must be defined .symver undefined, foo@@bar .long undefined Modified: vendor/llvm/dist-release_80/test/MC/ELF/multiple-different-symver.s ============================================================================== --- vendor/llvm/dist-release_80/test/MC/ELF/multiple-different-symver.s Mon Feb 25 18:52:47 2019 (r344534) +++ vendor/llvm/dist-release_80/test/MC/ELF/multiple-different-symver.s Mon Feb 25 19:07:16 2019 (r344535) @@ -1,6 +1,6 @@ // RUN: not llvm-mc -filetype=obj -triple 
x86_64-pc-linux-gnu %s -o %t 2>&1 | FileCheck %s -// CHECK: Multiple symbol versions defined for foo +// CHECK: error: multiple symbol versions defined for foo .symver foo, foo@1 .symver foo, foo@2 Modified: vendor/llvm/dist-release_80/test/Transforms/InstCombine/vec_shuffle.ll ============================================================================== --- vendor/llvm/dist-release_80/test/Transforms/InstCombine/vec_shuffle.ll Mon Feb 25 18:52:47 2019 (r344534) +++ vendor/llvm/dist-release_80/test/Transforms/InstCombine/vec_shuffle.ll Mon Feb 25 19:07:16 2019 (r344535) @@ -1114,3 +1114,18 @@ define <2 x float> @frem_splat_constant1(<2 x float> % ret <2 x float> %r } +; Equivalent shuffle masks, but only one is a narrowing op. + +define <2 x i1> @PR40734(<1 x i1> %x, <4 x i1> %y) { +; CHECK-LABEL: @PR40734( +; CHECK-NEXT: [[WIDEN:%.*]] = shufflevector <1 x i1> zeroinitializer, <1 x i1> [[X:%.*]], <2 x i32> +; CHECK-NEXT: [[NARROW:%.*]] = shufflevector <4 x i1> [[Y:%.*]], <4 x i1> undef, <2 x i32> +; CHECK-NEXT: [[R:%.*]] = and <2 x i1> [[WIDEN]], [[NARROW]] +; CHECK-NEXT: ret <2 x i1> [[R]] +; + %widen = shufflevector <1 x i1> zeroinitializer, <1 x i1> %x, <2 x i32> + %narrow = shufflevector <4 x i1> %y, <4 x i1> undef, <2 x i32> + %r = and <2 x i1> %widen, %narrow + ret <2 x i1> %r +} +