From: Dimitry Andric
Date: Fri, 31 Jul 2020 22:12:35 +0000 (UTC)
To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-vendor@freebsd.org
Subject: svn commit: r363744 - in vendor/llvm-project/release-11.x: clang/lib/Sema lld/COFF llvm/include/llvm/CodeGen llvm/lib/Analysis llvm/lib/CodeGen llvm/lib/CodeGen/SelectionDAG llvm/lib/MC
 llvm/lib/Ta...

Author: dim
Date: Fri Jul 31 22:12:34 2020
New Revision: 363744
URL: https://svnweb.freebsd.org/changeset/base/363744

Log:
  Vendor import of llvm-project branch release/11.x
  llvmorg-11.0.0-rc1-25-g903c872b169.

Modified:
  vendor/llvm-project/release-11.x/clang/lib/Sema/SemaOpenMP.cpp
  vendor/llvm-project/release-11.x/lld/COFF/Config.h
  vendor/llvm-project/release-11.x/lld/COFF/Driver.cpp
  vendor/llvm-project/release-11.x/lld/COFF/InputFiles.cpp
  vendor/llvm-project/release-11.x/lld/COFF/MinGW.cpp
  vendor/llvm-project/release-11.x/lld/COFF/Options.td
  vendor/llvm-project/release-11.x/lld/COFF/Writer.cpp
  vendor/llvm-project/release-11.x/llvm/include/llvm/CodeGen/TargetFrameLowering.h
  vendor/llvm-project/release-11.x/llvm/lib/Analysis/BasicAliasAnalysis.cpp
  vendor/llvm-project/release-11.x/llvm/lib/CodeGen/LocalStackSlotAllocation.cpp
  vendor/llvm-project/release-11.x/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
  vendor/llvm-project/release-11.x/llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
  vendor/llvm-project/release-11.x/llvm/lib/MC/WinCOFFObjectWriter.cpp
  vendor/llvm-project/release-11.x/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
  vendor/llvm-project/release-11.x/llvm/lib/Target/AArch64/AArch64FrameLowering.h
  vendor/llvm-project/release-11.x/llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp
  vendor/llvm-project/release-11.x/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
  vendor/llvm-project/release-11.x/llvm/lib/Target/AArch64/AArch64InstrFormats.td
  vendor/llvm-project/release-11.x/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
  vendor/llvm-project/release-11.x/llvm/lib/Target/AArch64/AArch64RegisterInfo.cpp
  vendor/llvm-project/release-11.x/llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
  vendor/llvm-project/release-11.x/llvm/lib/Target/AArch64/SVEInstrFormats.td
  vendor/llvm-project/release-11.x/llvm/lib/Target/PowerPC/PPCISelLowering.cpp
  vendor/llvm-project/release-11.x/llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp
  vendor/llvm-project/release-11.x/llvm/lib/Target/RISCV/RISCVISelDAGToDAG.h
  vendor/llvm-project/release-11.x/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
  vendor/llvm-project/release-11.x/llvm/lib/Target/RISCV/RISCVInstrInfoB.td
  vendor/llvm-project/release-11.x/llvm/lib/Target/X86/X86ISelLowering.cpp
  vendor/llvm-project/release-11.x/llvm/lib/ToolDrivers/llvm-lib/LibDriver.cpp
  vendor/llvm-project/release-11.x/llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp
  vendor/llvm-project/release-11.x/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
  vendor/llvm-project/release-11.x/llvm/lib/Transforms/Scalar/JumpThreading.cpp
  vendor/llvm-project/release-11.x/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
  vendor/llvm-project/release-11.x/openmp/runtime/src/kmp_ftn_entry.h
  vendor/llvm-project/release-11.x/openmp/runtime/src/kmp_os.h
  vendor/llvm-project/release-11.x/openmp/runtime/src/ompt-specific.cpp

Modified: vendor/llvm-project/release-11.x/clang/lib/Sema/SemaOpenMP.cpp
==============================================================================
--- vendor/llvm-project/release-11.x/clang/lib/Sema/SemaOpenMP.cpp Fri Jul 31 21:43:56 2020 (r363743)
+++ vendor/llvm-project/release-11.x/clang/lib/Sema/SemaOpenMP.cpp Fri Jul 31 22:12:34 2020 (r363744)
@@ -2244,7 +2244,11 @@ OpenMPClauseKind Sema::isOpenMPPrivateDecl(ValueDecl *
             [](OpenMPDirectiveKind K) { return isOpenMPTaskingDirective(K); },
             Level)) {
       bool IsTriviallyCopyable =
-          D->getType().getNonReferenceType().isTriviallyCopyableType(Context);
+          D->getType().getNonReferenceType().isTriviallyCopyableType(Context) &&
+          !D->getType()
+               .getNonReferenceType()
+               .getCanonicalType()
+               ->getAsCXXRecordDecl();
       OpenMPDirectiveKind DKind = DSAStack->getDirective(Level);
       SmallVector CaptureRegions;
       getOpenMPCaptureRegions(CaptureRegions, DKind);

Modified: vendor/llvm-project/release-11.x/lld/COFF/Config.h
==============================================================================
--- vendor/llvm-project/release-11.x/lld/COFF/Config.h Fri Jul 31 21:43:56 2020 (r363743)
+++ vendor/llvm-project/release-11.x/lld/COFF/Config.h Fri Jul 31 22:12:34 2020 (r363744)
@@ -140,6 +140,7 @@ struct Configuration {
   bool safeSEH = false;
   Symbol *sehTable = nullptr;
   Symbol *sehCount = nullptr;
+  bool noSEH = false;

   // Used for /opt:lldlto=N
   unsigned ltoo = 2;

Modified: vendor/llvm-project/release-11.x/lld/COFF/Driver.cpp
==============================================================================
--- vendor/llvm-project/release-11.x/lld/COFF/Driver.cpp Fri Jul 31 21:43:56 2020 (r363743)
+++ vendor/llvm-project/release-11.x/lld/COFF/Driver.cpp Fri Jul 31 22:12:34 2020 (r363744)
@@ -1700,9 +1700,10 @@ void LinkerDriver::link(ArrayRef argsArr
   config->wordsize = config->is64() ? 8 : 4;

   // Handle /safeseh, x86 only, on by default, except for mingw.
-  if (config->machine == I386 &&
-      args.hasFlag(OPT_safeseh, OPT_safeseh_no, !config->mingw))
-    config->safeSEH = true;
+  if (config->machine == I386) {
+    config->safeSEH = args.hasFlag(OPT_safeseh, OPT_safeseh_no, !config->mingw);
+    config->noSEH = args.hasArg(OPT_noseh);
+  }

   // Handle /functionpadmin
   for (auto *arg : args.filtered(OPT_functionpadmin, OPT_functionpadmin_opt))

Modified: vendor/llvm-project/release-11.x/lld/COFF/InputFiles.cpp
==============================================================================
--- vendor/llvm-project/release-11.x/lld/COFF/InputFiles.cpp Fri Jul 31 21:43:56 2020 (r363743)
+++ vendor/llvm-project/release-11.x/lld/COFF/InputFiles.cpp Fri Jul 31 22:12:34 2020 (r363744)
@@ -348,13 +348,13 @@ void ObjFile::recordPrevailingSymbolForMingw(
   // of the section chunk we actually include instead of discarding it,
   // add the symbol to a map to allow using it for implicitly
   // associating .[px]data$ sections to it.
+  // Use the suffix from the .text$ instead of the leader symbol
+  // name, for cases where the names differ (i386 mangling/decorations,
+  // cases where the leader is a weak symbol named .weak.func.default*).
   int32_t sectionNumber = sym.getSectionNumber();
   SectionChunk *sc = sparseChunks[sectionNumber];
   if (sc && sc->getOutputCharacteristics() & IMAGE_SCN_MEM_EXECUTE) {
-    StringRef name;
-    name = check(coffObj->getSymbolName(sym));
-    if (getMachineType() == I386)
-      name.consume_front("_");
+    StringRef name = sc->getSectionName().split('$').second;
     prevailingSectionMap[name] = sectionNumber;
   }
 }

Modified: vendor/llvm-project/release-11.x/lld/COFF/MinGW.cpp
==============================================================================
--- vendor/llvm-project/release-11.x/lld/COFF/MinGW.cpp Fri Jul 31 21:43:56 2020 (r363743)
+++ vendor/llvm-project/release-11.x/lld/COFF/MinGW.cpp Fri Jul 31 22:12:34 2020 (r363744)
@@ -34,6 +34,11 @@ AutoExporter::AutoExporter() {
       "libclang_rt.builtins-arm",
       "libclang_rt.builtins-i386",
       "libclang_rt.builtins-x86_64",
+      "libclang_rt.profile",
+      "libclang_rt.profile-aarch64",
+      "libclang_rt.profile-arm",
+      "libclang_rt.profile-i386",
+      "libclang_rt.profile-x86_64",
       "libc++",
       "libc++abi",
       "libunwind",
@@ -57,6 +62,10 @@ AutoExporter::AutoExporter() {
       "__builtin_",
       // Artificial symbols such as .refptr
       ".",
+      // profile generate symbols
+      "__profc_",
+      "__profd_",
+      "__profvp_",
   };

   excludeSymbolSuffixes = {

Modified: vendor/llvm-project/release-11.x/lld/COFF/Options.td
==============================================================================
--- vendor/llvm-project/release-11.x/lld/COFF/Options.td Fri Jul 31 21:43:56 2020 (r363743)
+++ vendor/llvm-project/release-11.x/lld/COFF/Options.td Fri Jul 31 22:12:34 2020 (r363744)
@@ -204,6 +204,7 @@ def include_optional : Joined<["/", "-", "/?", "-?"],
     HelpText<"Add symbol as undefined, but allow it to remain undefined">;
 def kill_at : F<"kill-at">;
 def lldmingw : F<"lldmingw">;
+def noseh : F<"noseh">;
 def output_def : Joined<["/", "-", "/?", "-?"], "output-def:">;
 def pdb_source_path : P<"pdbsourcepath",
     "Base path used to make relative source file path absolute in PDB">;

Modified: vendor/llvm-project/release-11.x/lld/COFF/Writer.cpp
==============================================================================
--- vendor/llvm-project/release-11.x/lld/COFF/Writer.cpp Fri Jul 31 21:43:56 2020 (r363743)
+++ vendor/llvm-project/release-11.x/lld/COFF/Writer.cpp Fri Jul 31 22:12:34 2020 (r363744)
@@ -1393,7 +1393,7 @@ template void Writer::writeHeade
     pe->DLLCharacteristics |= IMAGE_DLL_CHARACTERISTICS_GUARD_CF;
   if (config->integrityCheck)
     pe->DLLCharacteristics |= IMAGE_DLL_CHARACTERISTICS_FORCE_INTEGRITY;
-  if (setNoSEHCharacteristic)
+  if (setNoSEHCharacteristic || config->noSEH)
     pe->DLLCharacteristics |= IMAGE_DLL_CHARACTERISTICS_NO_SEH;
   if (config->terminalServerAware)
     pe->DLLCharacteristics |= IMAGE_DLL_CHARACTERISTICS_TERMINAL_SERVER_AWARE;

Modified: vendor/llvm-project/release-11.x/llvm/include/llvm/CodeGen/TargetFrameLowering.h
==============================================================================
--- vendor/llvm-project/release-11.x/llvm/include/llvm/CodeGen/TargetFrameLowering.h Fri Jul 31 21:43:56 2020 (r363743)
+++ vendor/llvm-project/release-11.x/llvm/include/llvm/CodeGen/TargetFrameLowering.h Fri Jul 31 22:12:34 2020 (r363744)
@@ -134,6 +134,12 @@ class TargetFrameLowering { (public)
   /// was called).
   virtual unsigned getStackAlignmentSkew(const MachineFunction &MF) const;

+  /// This method returns whether or not it is safe for an object with the
+  /// given stack id to be bundled into the local area.
+  virtual bool isStackIdSafeForLocalArea(unsigned StackId) const {
+    return true;
+  }
+
   /// getOffsetOfLocalArea - This method returns the offset of the local area
   /// from the stack pointer on entrance to a function.
   ///

Modified: vendor/llvm-project/release-11.x/llvm/lib/Analysis/BasicAliasAnalysis.cpp
==============================================================================
--- vendor/llvm-project/release-11.x/llvm/lib/Analysis/BasicAliasAnalysis.cpp Fri Jul 31 21:43:56 2020 (r363743)
+++ vendor/llvm-project/release-11.x/llvm/lib/Analysis/BasicAliasAnalysis.cpp Fri Jul 31 22:12:34 2020 (r363744)
@@ -1648,8 +1648,32 @@ AliasResult BasicAAResult::aliasPHI(const PHINode *PN,
   }

   SmallVector V1Srcs;
+  // For a recursive phi, that recurses through a contant gep, we can perform
+  // aliasing calculations using the other phi operands with an unknown size to
+  // specify that an unknown number of elements after the initial value are
+  // potentially accessed.
   bool isRecursive = false;
-  if (PV) {
+  auto CheckForRecPhi = [&](Value *PV) {
+    if (!EnableRecPhiAnalysis)
+      return false;
+    if (GEPOperator *PVGEP = dyn_cast(PV)) {
+      // Check whether the incoming value is a GEP that advances the pointer
+      // result of this PHI node (e.g. in a loop). If this is the case, we
+      // would recurse and always get a MayAlias. Handle this case specially
+      // below. We need to ensure that the phi is inbounds and has a constant
+      // positive operand so that we can check for alias with the initial value
+      // and an unknown but positive size.
+      if (PVGEP->getPointerOperand() == PN && PVGEP->isInBounds() &&
+          PVGEP->getNumIndices() == 1 && isa(PVGEP->idx_begin()) &&
+          !cast(PVGEP->idx_begin())->isNegative()) {
+        isRecursive = true;
+        return true;
+      }
+    }
+    return false;
+  };
+
+  if (PV) {
     // If we have PhiValues then use it to get the underlying phi values.
     const PhiValues::ValueSet &PhiValueSet = PV->getValuesForPhi(PN);
     // If we have more phi values than the search depth then return MayAlias
@@ -1660,19 +1684,8 @@ AliasResult BasicAAResult::aliasPHI(const PHINode *PN,
       return MayAlias;
     // Add the values to V1Srcs
     for (Value *PV1 : PhiValueSet) {
-      if (EnableRecPhiAnalysis) {
-        if (GEPOperator *PV1GEP = dyn_cast(PV1)) {
-          // Check whether the incoming value is a GEP that advances the pointer
-          // result of this PHI node (e.g. in a loop). If this is the case, we
-          // would recurse and always get a MayAlias. Handle this case specially
-          // below.
-          if (PV1GEP->getPointerOperand() == PN && PV1GEP->getNumIndices() == 1 &&
-              isa(PV1GEP->idx_begin())) {
-            isRecursive = true;
-            continue;
-          }
-        }
-      }
+      if (CheckForRecPhi(PV1))
+        continue;
       V1Srcs.push_back(PV1);
     }
   } else {
@@ -1687,18 +1700,8 @@ AliasResult BasicAAResult::aliasPHI(const PHINode *PN,
         // and 'n' are the number of PHI sources.
         return MayAlias;

-      if (EnableRecPhiAnalysis)
-        if (GEPOperator *PV1GEP = dyn_cast(PV1)) {
-          // Check whether the incoming value is a GEP that advances the pointer
-          // result of this PHI node (e.g. in a loop). If this is the case, we
-          // would recurse and always get a MayAlias. Handle this case specially
-          // below.
-          if (PV1GEP->getPointerOperand() == PN && PV1GEP->getNumIndices() == 1 &&
-              isa(PV1GEP->idx_begin())) {
-            isRecursive = true;
-            continue;
-          }
-        }
+      if (CheckForRecPhi(PV1))
+        continue;

       if (UniqueSrc.insert(PV1).second)
         V1Srcs.push_back(PV1);

Modified: vendor/llvm-project/release-11.x/llvm/lib/CodeGen/LocalStackSlotAllocation.cpp
==============================================================================
--- vendor/llvm-project/release-11.x/llvm/lib/CodeGen/LocalStackSlotAllocation.cpp Fri Jul 31 21:43:56 2020 (r363743)
+++ vendor/llvm-project/release-11.x/llvm/lib/CodeGen/LocalStackSlotAllocation.cpp Fri Jul 31 22:12:34 2020 (r363744)
@@ -220,6 +220,8 @@ void LocalStackSlotPass::calculateFrameObjectOffsets(M
         continue;
       if (StackProtectorFI == (int)i)
         continue;
+      if (!TFI.isStackIdSafeForLocalArea(MFI.getStackID(i)))
+        continue;

       switch (MFI.getObjectSSPLayout(i)) {
       case MachineFrameInfo::SSPLK_None:
@@ -253,6 +255,8 @@ void LocalStackSlotPass::calculateFrameObjectOffsets(M
     if (MFI.getStackProtectorIndex() == (int)i)
       continue;
     if (ProtectedObjs.count(i))
+      continue;
+    if (!TFI.isStackIdSafeForLocalArea(MFI.getStackID(i)))
       continue;

     AdjustStackOffset(MFI, i, Offset, StackGrowsDown, MaxAlign);

Modified: vendor/llvm-project/release-11.x/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
==============================================================================
--- vendor/llvm-project/release-11.x/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp Fri Jul 31 21:43:56 2020 (r363743)
+++ vendor/llvm-project/release-11.x/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp Fri Jul 31 22:12:34 2020 (r363744)
@@ -11372,9 +11372,10 @@ SDValue DAGCombiner::visitTRUNCATE(SDNode *N) {
       // Stop if more than one members are non-undef.
       if (NumDefs > 1)
         break;
+
       VTs.push_back(EVT::getVectorVT(*DAG.getContext(),
                                      VT.getVectorElementType(),
-                                     X.getValueType().getVectorNumElements()));
+                                     X.getValueType().getVectorElementCount()));
     }

     if (NumDefs == 0)
@@ -18795,6 +18796,11 @@ static SDValue combineConcatVectorOfScalars(SDNode *N,
 static SDValue combineConcatVectorOfExtracts(SDNode *N, SelectionDAG &DAG) {
   EVT VT = N->getValueType(0);
   EVT OpVT = N->getOperand(0).getValueType();
+
+  // We currently can't generate an appropriate shuffle for a scalable vector.
+  if (VT.isScalableVector())
+    return SDValue();
+
   int NumElts = VT.getVectorNumElements();
   int NumOpElts = OpVT.getVectorNumElements();

@@ -19055,11 +19061,14 @@ SDValue DAGCombiner::visitCONCAT_VECTORS(SDNode *N) {
     return V;

   // Type legalization of vectors and DAG canonicalization of SHUFFLE_VECTOR
-  // nodes often generate nop CONCAT_VECTOR nodes.
-  // Scan the CONCAT_VECTOR operands and look for a CONCAT operations that
-  // place the incoming vectors at the exact same location.
+  // nodes often generate nop CONCAT_VECTOR nodes. Scan the CONCAT_VECTOR
+  // operands and look for a CONCAT operations that place the incoming vectors
+  // at the exact same location.
+  //
+  // For scalable vectors, EXTRACT_SUBVECTOR indexes are implicitly scaled.
   SDValue SingleSource = SDValue();
-  unsigned PartNumElem = N->getOperand(0).getValueType().getVectorNumElements();
+  unsigned PartNumElem =
+      N->getOperand(0).getValueType().getVectorMinNumElements();

   for (unsigned i = 0, e = N->getNumOperands(); i != e; ++i) {
     SDValue Op = N->getOperand(i);
@@ -19181,7 +19190,10 @@ static SDValue narrowExtractedVectorBinOp(SDNode *Extr

   // The binop must be a vector type, so we can extract some fraction of it.
   EVT WideBVT = BinOp.getValueType();
-  if (!WideBVT.isVector())
+  // The optimisations below currently assume we are dealing with fixed length
+  // vectors. It is possible to add support for scalable vectors, but at the
+  // moment we've done no analysis to prove whether they are profitable or not.
+  if (!WideBVT.isFixedLengthVector())
     return SDValue();

   EVT VT = Extract->getValueType(0);

Modified: vendor/llvm-project/release-11.x/llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
==============================================================================
--- vendor/llvm-project/release-11.x/llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp Fri Jul 31 21:43:56 2020 (r363743)
+++ vendor/llvm-project/release-11.x/llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp Fri Jul 31 22:12:34 2020 (r363744)
@@ -2151,7 +2151,7 @@ SDValue DAGTypeLegalizer::SplitVecOp_UnaryOp(SDNode *N
   EVT InVT = Lo.getValueType();

   EVT OutVT = EVT::getVectorVT(*DAG.getContext(), ResVT.getVectorElementType(),
-                               InVT.getVectorNumElements());
+                               InVT.getVectorElementCount());

   if (N->isStrictFPOpcode()) {
     Lo = DAG.getNode(N->getOpcode(), dl, { OutVT, MVT::Other },
@@ -2197,13 +2197,19 @@ SDValue DAGTypeLegalizer::SplitVecOp_EXTRACT_SUBVECTOR
   SDValue Idx = N->getOperand(1);
   SDLoc dl(N);
   SDValue Lo, Hi;
+
+  if (SubVT.isScalableVector() !=
+      N->getOperand(0).getValueType().isScalableVector())
+    report_fatal_error("Extracting a fixed-length vector from an illegal "
+                       "scalable vector is not yet supported");
+
   GetSplitVector(N->getOperand(0), Lo, Hi);

-  uint64_t LoElts = Lo.getValueType().getVectorNumElements();
+  uint64_t LoElts = Lo.getValueType().getVectorMinNumElements();
   uint64_t IdxVal = cast(Idx)->getZExtValue();

   if (IdxVal < LoElts) {
-    assert(IdxVal + SubVT.getVectorNumElements() <= LoElts &&
+    assert(IdxVal + SubVT.getVectorMinNumElements() <= LoElts &&
            "Extracted subvector crosses vector split!");
     return DAG.getNode(ISD::EXTRACT_SUBVECTOR, dl, SubVT, Lo, Idx);
   } else {
@@ -2559,13 +2565,9 @@ SDValue DAGTypeLegalizer::SplitVecOp_TruncateHelper(SD
   SDValue InVec = N->getOperand(OpNo);
   EVT InVT = InVec->getValueType(0);
   EVT OutVT = N->getValueType(0);
-  unsigned NumElements = OutVT.getVectorNumElements();
+  ElementCount NumElements = OutVT.getVectorElementCount();
   bool IsFloat = OutVT.isFloatingPoint();

-  // Widening should have already made sure this is a power-two vector
-  // if we're trying to split it at all. assert() that's true, just in case.
-  assert(!(NumElements & 1) && "Splitting vector, but not in half!");
-
   unsigned InElementSize = InVT.getScalarSizeInBits();
   unsigned OutElementSize = OutVT.getScalarSizeInBits();

@@ -2595,6 +2597,9 @@ SDValue DAGTypeLegalizer::SplitVecOp_TruncateHelper(SD
   GetSplitVector(InVec, InLoVec, InHiVec);

   // Truncate them to 1/2 the element size.
+  //
+  // This assumes the number of elements is a power of two; any vector that
+  // isn't should be widened, not split.
   EVT HalfElementVT = IsFloat ?
     EVT::getFloatingPointVT(InElementSize/2) :
     EVT::getIntegerVT(*DAG.getContext(), InElementSize/2);
@@ -3605,16 +3610,15 @@ SDValue DAGTypeLegalizer::WidenVecRes_CONCAT_VECTORS(S
   EVT InVT = N->getOperand(0).getValueType();
   EVT WidenVT = TLI.getTypeToTransformTo(*DAG.getContext(), N->getValueType(0));
   SDLoc dl(N);
-  unsigned WidenNumElts = WidenVT.getVectorNumElements();
-  unsigned NumInElts = InVT.getVectorNumElements();
   unsigned NumOperands = N->getNumOperands();

   bool InputWidened = false; // Indicates we need to widen the input.
   if (getTypeAction(InVT) != TargetLowering::TypeWidenVector) {
-    if (WidenVT.getVectorNumElements() % InVT.getVectorNumElements() == 0) {
+    unsigned WidenNumElts = WidenVT.getVectorMinNumElements();
+    unsigned NumInElts = InVT.getVectorMinNumElements();
+    if (WidenNumElts % NumInElts == 0) {
       // Add undef vectors to widen to correct length.
-      unsigned NumConcat = WidenVT.getVectorNumElements() /
-                           InVT.getVectorNumElements();
+      unsigned NumConcat = WidenNumElts / NumInElts;
       SDValue UndefVal = DAG.getUNDEF(InVT);
       SmallVector Ops(NumConcat);
       for (unsigned i=0; i < NumOperands; ++i)
@@ -3638,6 +3642,11 @@ SDValue DAGTypeLegalizer::WidenVecRes_CONCAT_VECTORS(S
       return GetWidenedVector(N->getOperand(0));

     if (NumOperands == 2) {
+      assert(!WidenVT.isScalableVector() &&
+             "Cannot use vector shuffles to widen CONCAT_VECTOR result");
+      unsigned WidenNumElts = WidenVT.getVectorNumElements();
+      unsigned NumInElts = InVT.getVectorNumElements();
+
       // Replace concat of two operands with a shuffle.
       SmallVector MaskOps(WidenNumElts, -1);
       for (unsigned i = 0; i < NumInElts; ++i) {
@@ -3652,6 +3661,11 @@ SDValue DAGTypeLegalizer::WidenVecRes_CONCAT_VECTORS(S
     }
   }

+  assert(!WidenVT.isScalableVector() &&
+         "Cannot use build vectors to widen CONCAT_VECTOR result");
+  unsigned WidenNumElts = WidenVT.getVectorNumElements();
+  unsigned NumInElts = InVT.getVectorNumElements();
+
   // Fall back to use extracts and build vector.
   EVT EltVT = WidenVT.getVectorElementType();
   SmallVector Ops(WidenNumElts);
@@ -4913,7 +4927,8 @@ SDValue DAGTypeLegalizer::GenWidenVectorLoads(SmallVec
   int LdWidth = LdVT.getSizeInBits();
   int WidthDiff = WidenWidth - LdWidth;

-  // Allow wider loads.
+  // Allow wider loads if they are sufficiently aligned to avoid memory faults
+  // and if the original load is simple.
   unsigned LdAlign = (!LD->isSimple()) ? 0 : LD->getAlignment();

   // Find the vector type that can load from.
@@ -4965,19 +4980,6 @@ SDValue DAGTypeLegalizer::GenWidenVectorLoads(SmallVec
                       LD->getPointerInfo().getWithOffset(Offset),
                       LD->getOriginalAlign(), MMOFlags, AAInfo);
       LdChain.push_back(L.getValue(1));
-      if (L->getValueType(0).isVector() && NewVTWidth >= LdWidth) {
-        // Later code assumes the vector loads produced will be mergeable, so we
-        // must pad the final entry up to the previous width. Scalars are
-        // combined separately.
-        SmallVector Loads;
-        Loads.push_back(L);
-        unsigned size = L->getValueSizeInBits(0);
-        while (size < LdOp->getValueSizeInBits(0)) {
-          Loads.push_back(DAG.getUNDEF(L->getValueType(0)));
-          size += L->getValueSizeInBits(0);
-        }
-        L = DAG.getNode(ISD::CONCAT_VECTORS, dl, LdOp->getValueType(0), Loads);
-      }
     } else {
       L = DAG.getLoad(NewVT, dl, Chain, BasePtr,
                       LD->getPointerInfo().getWithOffset(Offset),
@@ -5018,8 +5020,17 @@ SDValue DAGTypeLegalizer::GenWidenVectorLoads(SmallVec
       EVT NewLdTy = LdOps[i].getValueType();
       if (NewLdTy != LdTy) {
         // Create a larger vector.
+        unsigned NumOps = NewLdTy.getSizeInBits() / LdTy.getSizeInBits();
+        assert(NewLdTy.getSizeInBits() % LdTy.getSizeInBits() == 0);
+        SmallVector WidenOps(NumOps);
+        unsigned j = 0;
+        for (; j != End-Idx; ++j)
+          WidenOps[j] = ConcatOps[Idx+j];
+        for (; j != NumOps; ++j)
+          WidenOps[j] = DAG.getUNDEF(LdTy);
+
         ConcatOps[End-1] = DAG.getNode(ISD::CONCAT_VECTORS, dl, NewLdTy,
-                                       makeArrayRef(&ConcatOps[Idx], End - Idx));
+                                       WidenOps);
         Idx = End - 1;
         LdTy = NewLdTy;
       }

Modified: vendor/llvm-project/release-11.x/llvm/lib/MC/WinCOFFObjectWriter.cpp
==============================================================================
--- vendor/llvm-project/release-11.x/llvm/lib/MC/WinCOFFObjectWriter.cpp Fri Jul 31 21:43:56 2020 (r363743)
+++ vendor/llvm-project/release-11.x/llvm/lib/MC/WinCOFFObjectWriter.cpp Fri Jul 31 22:12:34 2020 (r363744)
@@ -375,6 +375,7 @@ void WinCOFFObjectWriter::DefineSymbol(const MCSymbol
   COFFSymbol *Local = nullptr;
   if (cast(MCSym).isWeakExternal()) {
     Sym->Data.StorageClass = COFF::IMAGE_SYM_CLASS_WEAK_EXTERNAL;
+    Sym->Section = nullptr;

     COFFSymbol *WeakDefault = getLinkedSymbol(MCSym);
     if (!WeakDefault) {

Modified: vendor/llvm-project/release-11.x/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
==============================================================================
--- vendor/llvm-project/release-11.x/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp Fri Jul 31 21:43:56 2020 (r363743)
+++ vendor/llvm-project/release-11.x/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp Fri Jul 31 22:12:34 2020 (r363744)
@@ -1192,7 +1192,7 @@ void AArch64FrameLowering::emitPrologue(MachineFunctio

   // Process the SVE callee-saves to determine what space needs to be
   // allocated.
-  if (AFI->getSVECalleeSavedStackSize()) {
+  if (int64_t CalleeSavedSize = AFI->getSVECalleeSavedStackSize()) {
     // Find callee save instructions in frame.
     CalleeSavesBegin = MBBI;
     assert(IsSVECalleeSave(CalleeSavesBegin) && "Unexpected instruction");
@@ -1200,11 +1200,7 @@ void AArch64FrameLowering::emitPrologue(MachineFunctio
       ++MBBI;
     CalleeSavesEnd = MBBI;

-    int64_t OffsetToFirstCalleeSaveFromSP =
-        MFI.getObjectOffset(AFI->getMaxSVECSFrameIndex());
-    StackOffset OffsetToCalleeSavesFromSP =
-        StackOffset(OffsetToFirstCalleeSaveFromSP, MVT::nxv1i8) + SVEStackSize;
-    AllocateBefore -= OffsetToCalleeSavesFromSP;
+    AllocateBefore = {CalleeSavedSize, MVT::nxv1i8};
     AllocateAfter = SVEStackSize - AllocateBefore;
   }

@@ -1582,7 +1578,7 @@ void AArch64FrameLowering::emitEpilogue(MachineFunctio
   // deallocated.
   StackOffset DeallocateBefore = {}, DeallocateAfter = SVEStackSize;
   MachineBasicBlock::iterator RestoreBegin = LastPopI, RestoreEnd = LastPopI;
-  if (AFI->getSVECalleeSavedStackSize()) {
+  if (int64_t CalleeSavedSize = AFI->getSVECalleeSavedStackSize()) {
     RestoreBegin = std::prev(RestoreEnd);;
     while (IsSVECalleeSave(RestoreBegin) &&
            RestoreBegin != MBB.begin())
@@ -1592,23 +1588,21 @@ void AArch64FrameLowering::emitEpilogue(MachineFunctio
     assert(IsSVECalleeSave(RestoreBegin) &&
            IsSVECalleeSave(std::prev(RestoreEnd)) && "Unexpected instruction");

-    int64_t OffsetToFirstCalleeSaveFromSP =
-        MFI.getObjectOffset(AFI->getMaxSVECSFrameIndex());
-    StackOffset OffsetToCalleeSavesFromSP =
-        StackOffset(OffsetToFirstCalleeSaveFromSP, MVT::nxv1i8) + SVEStackSize;
-    DeallocateBefore = OffsetToCalleeSavesFromSP;
-    DeallocateAfter = SVEStackSize - DeallocateBefore;
+    StackOffset CalleeSavedSizeAsOffset = {CalleeSavedSize, MVT::nxv1i8};
+    DeallocateBefore = SVEStackSize - CalleeSavedSizeAsOffset;
+    DeallocateAfter = CalleeSavedSizeAsOffset;
   }

   // Deallocate the SVE area.
   if (SVEStackSize) {
     if (AFI->isStackRealigned()) {
-      if (AFI->getSVECalleeSavedStackSize())
-        // Set SP to start of SVE area, from which the callee-save reloads
-        // can be done. The code below will deallocate the stack space
+      if (int64_t CalleeSavedSize = AFI->getSVECalleeSavedStackSize())
+        // Set SP to start of SVE callee-save area from which they can
+        // be reloaded. The code below will deallocate the stack space
         // space by moving FP -> SP.
         emitFrameOffset(MBB, RestoreBegin, DL, AArch64::SP, AArch64::FP,
-                        -SVEStackSize, TII, MachineInstr::FrameDestroy);
+                        {-CalleeSavedSize, MVT::nxv1i8}, TII,
+                        MachineInstr::FrameDestroy);
     } else {
       if (AFI->getSVECalleeSavedStackSize()) {
         // Deallocate the non-SVE locals first before we can deallocate (and
@@ -2595,25 +2589,23 @@ static int64_t determineSVEStackObjectOffsets(MachineF
                                               int &MinCSFrameIndex,
                                               int &MaxCSFrameIndex,
                                               bool AssignOffsets) {
+#ifndef NDEBUG
   // First process all fixed stack objects.
-  int64_t Offset = 0;
   for (int I = MFI.getObjectIndexBegin(); I != 0; ++I)
-    if (MFI.getStackID(I) == TargetStackID::SVEVector) {
-      int64_t FixedOffset = -MFI.getObjectOffset(I);
-      if (FixedOffset > Offset)
-        Offset = FixedOffset;
-    }
+    assert(MFI.getStackID(I) != TargetStackID::SVEVector &&
+           "SVE vectors should never be passed on the stack by value, only by "
+           "reference.");
+#endif

   auto Assign = [&MFI](int FI, int64_t Offset) {
     LLVM_DEBUG(dbgs() << "alloc FI(" << FI << ") at SP[" << Offset << "]\n");
     MFI.setObjectOffset(FI, Offset);
   };

+  int64_t Offset = 0;
+
   // Then process all callee saved slots.
   if (getSVECalleeSaveSlotRange(MFI, MinCSFrameIndex, MaxCSFrameIndex)) {
-    // Make sure to align the last callee save slot.
-    MFI.setObjectAlignment(MaxCSFrameIndex, Align(16));
-
     // Assign offsets to the callee save slots.
     for (int I = MinCSFrameIndex; I <= MaxCSFrameIndex; ++I) {
       Offset += MFI.getObjectSize(I);
@@ -2622,6 +2614,9 @@ static int64_t determineSVEStackObjectOffsets(MachineF
       Assign(I, -Offset);
     }
   }
+
+  // Ensure that the Callee-save area is aligned to 16bytes.
+  Offset = alignTo(Offset, Align(16U));

   // Create a buffer of SVE objects to allocate and sort it.
   SmallVector ObjectsToAllocate;

Modified: vendor/llvm-project/release-11.x/llvm/lib/Target/AArch64/AArch64FrameLowering.h
==============================================================================
--- vendor/llvm-project/release-11.x/llvm/lib/Target/AArch64/AArch64FrameLowering.h Fri Jul 31 21:43:56 2020 (r363743)
+++ vendor/llvm-project/release-11.x/llvm/lib/Target/AArch64/AArch64FrameLowering.h Fri Jul 31 22:12:34 2020 (r363744)
@@ -105,6 +105,12 @@ class AArch64FrameLowering : public TargetFrameLowerin
     }
   }

+  bool isStackIdSafeForLocalArea(unsigned StackId) const override {
+    // We don't support putting SVE objects into the pre-allocated local
+    // frame block at the moment.
+    return StackId != TargetStackID::SVEVector;
+  }
+
 private:
   bool shouldCombineCSRLocalStackBump(MachineFunction &MF,
                                       uint64_t StackBumpBytes) const;

Modified: vendor/llvm-project/release-11.x/llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp
==============================================================================
--- vendor/llvm-project/release-11.x/llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp Fri Jul 31 21:43:56 2020 (r363743)
+++ vendor/llvm-project/release-11.x/llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp Fri Jul 31 22:12:34 2020 (r363744)
@@ -245,7 +245,8 @@ class AArch64DAGToDAGISel : public SelectionDAGISel {
                          unsigned SubRegIdx);
   void SelectLoadLane(SDNode *N, unsigned NumVecs, unsigned Opc);
   void SelectPostLoadLane(SDNode *N, unsigned NumVecs, unsigned Opc);
-  void SelectPredicatedLoad(SDNode *N, unsigned NumVecs, const unsigned Opc);
+  void SelectPredicatedLoad(SDNode *N, unsigned NumVecs, unsigned Scale,
+                            unsigned Opc_rr, unsigned Opc_ri);

   bool SelectAddrModeFrameIndexSVE(SDValue N, SDValue &Base, SDValue &OffImm);
   /// SVE Reg+Imm addressing mode.
@@ -262,14 +263,12 @@ class AArch64DAGToDAGISel : public SelectionDAGISel {
   void SelectPostStore(SDNode *N, unsigned NumVecs, unsigned Opc);
   void SelectStoreLane(SDNode *N, unsigned NumVecs, unsigned Opc);
   void SelectPostStoreLane(SDNode *N, unsigned NumVecs, unsigned Opc);
-  template
-  void SelectPredicatedStore(SDNode *N, unsigned NumVecs, const unsigned Opc_rr,
-                             const unsigned Opc_ri);
-  template
+  void SelectPredicatedStore(SDNode *N, unsigned NumVecs, unsigned Scale,
+                             unsigned Opc_rr, unsigned Opc_ri);
   std::tuple
-  findAddrModeSVELoadStore(SDNode *N, const unsigned Opc_rr,
-                           const unsigned Opc_ri, const SDValue &OldBase,
-                           const SDValue &OldOffset);
+  findAddrModeSVELoadStore(SDNode *N, unsigned Opc_rr, unsigned Opc_ri,
+                           const SDValue &OldBase, const SDValue &OldOffset,
+                           unsigned Scale);
 
   bool tryBitfieldExtractOp(SDNode *N);
   bool tryBitfieldExtractOpFromSExt(SDNode *N);
@@ -1414,12 +1413,12 @@ void AArch64DAGToDAGISel::SelectPostLoad(SDNode *N, un
 /// Optimize \param OldBase and \param OldOffset selecting the best addressing
 /// mode. Returns a tuple consisting of an Opcode, an SDValue representing the
 /// new Base and an SDValue representing the new offset.
-template
 std::tuple
-AArch64DAGToDAGISel::findAddrModeSVELoadStore(SDNode *N, const unsigned Opc_rr,
-                                              const unsigned Opc_ri,
+AArch64DAGToDAGISel::findAddrModeSVELoadStore(SDNode *N, unsigned Opc_rr,
+                                              unsigned Opc_ri,
                                               const SDValue &OldBase,
-                                              const SDValue &OldOffset) {
+                                              const SDValue &OldOffset,
+                                              unsigned Scale) {
   SDValue NewBase = OldBase;
   SDValue NewOffset = OldOffset;
   // Detect a possible Reg+Imm addressing mode.
@@ -1429,21 +1428,30 @@ AArch64DAGToDAGISel::findAddrModeSVELoadStore(SDNode *
   // Detect a possible reg+reg addressing mode, but only if we haven't already
   // detected a Reg+Imm one.
   const bool IsRegReg =
-      !IsRegImm && SelectSVERegRegAddrMode(OldBase, NewBase, NewOffset);
+      !IsRegImm && SelectSVERegRegAddrMode(OldBase, Scale, NewBase, NewOffset);
 
   // Select the instruction.
   return std::make_tuple(IsRegReg ? Opc_rr : Opc_ri, NewBase, NewOffset);
 }
 
 void AArch64DAGToDAGISel::SelectPredicatedLoad(SDNode *N, unsigned NumVecs,
-                                               const unsigned Opc) {
+                                               unsigned Scale, unsigned Opc_ri,
+                                               unsigned Opc_rr) {
+  assert(Scale < 4 && "Invalid scaling value.");
   SDLoc DL(N);
   EVT VT = N->getValueType(0);
   SDValue Chain = N->getOperand(0);
 
+  // Optimize addressing mode.
+  SDValue Base, Offset;
+  unsigned Opc;
+  std::tie(Opc, Base, Offset) = findAddrModeSVELoadStore(
+      N, Opc_rr, Opc_ri, N->getOperand(2),
+      CurDAG->getTargetConstant(0, DL, MVT::i64), Scale);
+
   SDValue Ops[] = {N->getOperand(1), // Predicate
-                   N->getOperand(2), // Memory operand
-                   CurDAG->getTargetConstant(0, DL, MVT::i64), Chain};
+                   Base,             // Memory operand
+                   Offset, Chain};
 
   const EVT ResTys[] = {MVT::Untyped, MVT::Other};
 
@@ -1479,10 +1487,9 @@ void AArch64DAGToDAGISel::SelectStore(SDNode *N, unsig
   ReplaceNode(N, St);
 }
 
-template
 void AArch64DAGToDAGISel::SelectPredicatedStore(SDNode *N, unsigned NumVecs,
-                                                const unsigned Opc_rr,
-                                                const unsigned Opc_ri) {
+                                                unsigned Scale, unsigned Opc_rr,
+                                                unsigned Opc_ri) {
   SDLoc dl(N);
 
   // Form a REG_SEQUENCE to force register allocation.
@@ -1492,9 +1499,9 @@ void AArch64DAGToDAGISel::SelectPredicatedStore(SDNode
   // Optimize addressing mode.
   unsigned Opc;
   SDValue Offset, Base;
-  std::tie(Opc, Base, Offset) = findAddrModeSVELoadStore(
+  std::tie(Opc, Base, Offset) = findAddrModeSVELoadStore(
       N, Opc_rr, Opc_ri, N->getOperand(NumVecs + 3),
-      CurDAG->getTargetConstant(0, dl, MVT::i64));
+      CurDAG->getTargetConstant(0, dl, MVT::i64), Scale);
 
   SDValue Ops[] = {RegSeq, N->getOperand(NumVecs + 2), // predicate
                    Base,                               // address
@@ -4085,63 +4092,51 @@ void AArch64DAGToDAGISel::Select(SDNode *Node) {
     }
     case Intrinsic::aarch64_sve_st2: {
       if (VT == MVT::nxv16i8) {
-        SelectPredicatedStore(Node, 2, AArch64::ST2B,
-                              AArch64::ST2B_IMM);
+        SelectPredicatedStore(Node, 2, 0, AArch64::ST2B, AArch64::ST2B_IMM);
         return;
       } else if (VT == MVT::nxv8i16 || VT == MVT::nxv8f16 ||
                  (VT == MVT::nxv8bf16 && Subtarget->hasBF16())) {
-        SelectPredicatedStore(Node, 2, AArch64::ST2H,
-                              AArch64::ST2H_IMM);
+        SelectPredicatedStore(Node, 2, 1, AArch64::ST2H, AArch64::ST2H_IMM);
         return;
       } else if (VT == MVT::nxv4i32 || VT == MVT::nxv4f32) {
-        SelectPredicatedStore(Node, 2, AArch64::ST2W,
-                              AArch64::ST2W_IMM);
+        SelectPredicatedStore(Node, 2, 2, AArch64::ST2W, AArch64::ST2W_IMM);
         return;
       } else if (VT == MVT::nxv2i64 || VT == MVT::nxv2f64) {
-        SelectPredicatedStore(Node, 2, AArch64::ST2D,
-                              AArch64::ST2D_IMM);
+        SelectPredicatedStore(Node, 2, 3, AArch64::ST2D, AArch64::ST2D_IMM);
         return;
       }
       break;
     }
     case Intrinsic::aarch64_sve_st3: {
       if (VT == MVT::nxv16i8) {
-        SelectPredicatedStore(Node, 3, AArch64::ST3B,
-                              AArch64::ST3B_IMM);
+        SelectPredicatedStore(Node, 3, 0, AArch64::ST3B, AArch64::ST3B_IMM);
         return;
       } else if (VT == MVT::nxv8i16 || VT == MVT::nxv8f16 ||
                  (VT == MVT::nxv8bf16 && Subtarget->hasBF16())) {
-        SelectPredicatedStore(Node, 3, AArch64::ST3H,
-                              AArch64::ST3H_IMM);
+        SelectPredicatedStore(Node, 3, 1, AArch64::ST3H, AArch64::ST3H_IMM);
         return;
       } else if (VT == MVT::nxv4i32 || VT == MVT::nxv4f32) {
-        SelectPredicatedStore(Node, 3, AArch64::ST3W,
-                              AArch64::ST3W_IMM);
+        SelectPredicatedStore(Node, 3, 2, AArch64::ST3W, AArch64::ST3W_IMM);
         return;
       } else if (VT == MVT::nxv2i64 || VT == MVT::nxv2f64) {
-        SelectPredicatedStore(Node, 3, AArch64::ST3D,
-                              AArch64::ST3D_IMM);
+        SelectPredicatedStore(Node, 3, 3, AArch64::ST3D, AArch64::ST3D_IMM);
         return;
       }
       break;
     }
     case Intrinsic::aarch64_sve_st4: {
       if (VT == MVT::nxv16i8) {
-        SelectPredicatedStore(Node, 4, AArch64::ST4B,
-                              AArch64::ST4B_IMM);
+        SelectPredicatedStore(Node, 4, 0, AArch64::ST4B, AArch64::ST4B_IMM);
         return;
       } else if (VT == MVT::nxv8i16 || VT == MVT::nxv8f16 ||
                  (VT == MVT::nxv8bf16 && Subtarget->hasBF16())) {
-        SelectPredicatedStore(Node, 4, AArch64::ST4H,
-                              AArch64::ST4H_IMM);
+        SelectPredicatedStore(Node, 4, 1, AArch64::ST4H, AArch64::ST4H_IMM);
         return;
       } else if (VT == MVT::nxv4i32 || VT == MVT::nxv4f32) {
-        SelectPredicatedStore(Node, 4, AArch64::ST4W,
-                              AArch64::ST4W_IMM);
+        SelectPredicatedStore(Node, 4, 2, AArch64::ST4W, AArch64::ST4W_IMM);
         return;
       } else if (VT == MVT::nxv2i64 || VT == MVT::nxv2f64) {
-        SelectPredicatedStore(Node, 4, AArch64::ST4D,
-                              AArch64::ST4D_IMM);
+        SelectPredicatedStore(Node, 4, 3, AArch64::ST4D, AArch64::ST4D_IMM);
         return;
       }
       break;
@@ -4741,51 +4736,51 @@ void AArch64DAGToDAGISel::Select(SDNode *Node) {
   }
   case AArch64ISD::SVE_LD2_MERGE_ZERO: {
     if (VT == MVT::nxv16i8) {
-      SelectPredicatedLoad(Node, 2, AArch64::LD2B_IMM);
+      SelectPredicatedLoad(Node, 2, 0, AArch64::LD2B_IMM, AArch64::LD2B);
       return;
     } else if (VT == MVT::nxv8i16 || VT == MVT::nxv8f16 ||
                (VT == MVT::nxv8bf16 && Subtarget->hasBF16())) {
-      SelectPredicatedLoad(Node, 2, AArch64::LD2H_IMM);
+      SelectPredicatedLoad(Node, 2, 1, AArch64::LD2H_IMM, AArch64::LD2H);
       return;
     } else if (VT == MVT::nxv4i32 || VT == MVT::nxv4f32) {
-      SelectPredicatedLoad(Node, 2, AArch64::LD2W_IMM);
+      SelectPredicatedLoad(Node, 2, 2, AArch64::LD2W_IMM, AArch64::LD2W);
       return;
     } else if (VT == MVT::nxv2i64 || VT == MVT::nxv2f64) {
-      SelectPredicatedLoad(Node, 2, AArch64::LD2D_IMM);
+      SelectPredicatedLoad(Node, 2, 3, AArch64::LD2D_IMM, AArch64::LD2D);
       return;
     }
     break;
   }
   case AArch64ISD::SVE_LD3_MERGE_ZERO: {
     if (VT == MVT::nxv16i8) {
-      SelectPredicatedLoad(Node, 3, AArch64::LD3B_IMM);
+      SelectPredicatedLoad(Node, 3, 0, AArch64::LD3B_IMM, AArch64::LD3B);
       return;
     } else if (VT == MVT::nxv8i16 || VT == MVT::nxv8f16 ||
                (VT == MVT::nxv8bf16 && Subtarget->hasBF16())) {
-      SelectPredicatedLoad(Node, 3, AArch64::LD3H_IMM);
+      SelectPredicatedLoad(Node, 3, 1, AArch64::LD3H_IMM, AArch64::LD3H);
       return;
     } else if (VT == MVT::nxv4i32 || VT == MVT::nxv4f32) {
-      SelectPredicatedLoad(Node, 3, AArch64::LD3W_IMM);
+      SelectPredicatedLoad(Node, 3, 2, AArch64::LD3W_IMM, AArch64::LD3W);
       return;
     } else if (VT == MVT::nxv2i64 || VT == MVT::nxv2f64) {
-      SelectPredicatedLoad(Node, 3, AArch64::LD3D_IMM);
+      SelectPredicatedLoad(Node, 3, 3, AArch64::LD3D_IMM, AArch64::LD3D);
       return;
     }
     break;
   }
   case AArch64ISD::SVE_LD4_MERGE_ZERO: {
     if (VT == MVT::nxv16i8) {
-      SelectPredicatedLoad(Node, 4, AArch64::LD4B_IMM);
+      SelectPredicatedLoad(Node, 4, 0, AArch64::LD4B_IMM, AArch64::LD4B);
       return;
     } else if (VT == MVT::nxv8i16 || VT == MVT::nxv8f16 ||
                (VT == MVT::nxv8bf16 && Subtarget->hasBF16())) {
-      SelectPredicatedLoad(Node, 4, AArch64::LD4H_IMM);
+      SelectPredicatedLoad(Node, 4, 1, AArch64::LD4H_IMM, AArch64::LD4H);
       return;
     } else if (VT == MVT::nxv4i32 || VT == MVT::nxv4f32) {
-      SelectPredicatedLoad(Node, 4, AArch64::LD4W_IMM);
+      SelectPredicatedLoad(Node, 4, 2, AArch64::LD4W_IMM, AArch64::LD4W);
       return;
     } else if (VT == MVT::nxv2i64 || VT == MVT::nxv2f64) {
-      SelectPredicatedLoad(Node, 4, AArch64::LD4D_IMM);
+      SelectPredicatedLoad(Node, 4, 3, AArch64::LD4D_IMM, AArch64::LD4D);
       return;
     }
     break;
@@ -4805,10 +4800,14 @@ FunctionPass *llvm::createAArch64ISelDag(AArch64Target
 
 /// When \p PredVT is a scalable vector predicate in the form
 /// MVT::nxxi1, it builds the correspondent scalable vector of
-/// integers MVT::nxxi s.t. M x bits = 128. If the input
+/// integers MVT::nxxi s.t. M x bits = 128. When targeting
+/// structured vectors (NumVec >1), the output data type is
+/// MVT::nxxi s.t. M x bits = 128. If the input
 /// PredVT is not in the form MVT::nxxi1, it returns an invalid
 /// EVT.
-static EVT getPackedVectorTypeFromPredicateType(LLVMContext &Ctx, EVT PredVT) {
+static EVT getPackedVectorTypeFromPredicateType(LLVMContext &Ctx, EVT PredVT,
+                                                unsigned NumVec) {
+  assert(NumVec > 0 && NumVec < 5 && "Invalid number of vectors.");
   if (!PredVT.isScalableVector() || PredVT.getVectorElementType() != MVT::i1)
     return EVT();
 
@@ -4818,7 +4817,8 @@ static EVT getPackedVectorTypeFromPredicateType(LLVMCo
 
   ElementCount EC = PredVT.getVectorElementCount();
   EVT ScalarVT = EVT::getIntegerVT(Ctx, AArch64::SVEBitsPerBlock / EC.Min);
-  EVT MemVT = EVT::getVectorVT(Ctx, ScalarVT, EC);
+  EVT MemVT = EVT::getVectorVT(Ctx, ScalarVT, EC * NumVec);
+
   return MemVT;
 }
 
@@ -4842,6 +4842,15 @@ static EVT getMemVTFromNode(LLVMContext &Ctx, SDNode *
     return cast(Root->getOperand(3))->getVT();
   case AArch64ISD::ST1_PRED:
     return cast(Root->getOperand(4))->getVT();
+  case AArch64ISD::SVE_LD2_MERGE_ZERO:
+    return getPackedVectorTypeFromPredicateType(
+        Ctx, Root->getOperand(1)->getValueType(0), /*NumVec=*/2);
+  case AArch64ISD::SVE_LD3_MERGE_ZERO:
+    return getPackedVectorTypeFromPredicateType(
+        Ctx, Root->getOperand(1)->getValueType(0), /*NumVec=*/3);
+  case AArch64ISD::SVE_LD4_MERGE_ZERO:
+    return getPackedVectorTypeFromPredicateType(
+        Ctx, Root->getOperand(1)->getValueType(0), /*NumVec=*/4);
   default:
     break;
   }
@@ -4857,7 +4866,7 @@ static EVT getMemVTFromNode(LLVMContext &Ctx, SDNode *
   // We are using an SVE prefetch intrinsic. Type must be inferred
   // from the width of the predicate.
   return getPackedVectorTypeFromPredicateType(
-      Ctx, Root->getOperand(2)->getValueType(0));
+      Ctx, Root->getOperand(2)->getValueType(0), /*NumVec=*/1);
 }
 
 /// SelectAddrModeIndexedSVE - Attempt selection of the addressing mode:

Modified: vendor/llvm-project/release-11.x/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
==============================================================================
--- vendor/llvm-project/release-11.x/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp Fri Jul 31 21:43:56 2020 (r363743)
+++ vendor/llvm-project/release-11.x/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp Fri Jul 31 22:12:34 2020 (r363744)
@@ -932,8 +932,11 @@ AArch64TargetLowering::AArch64TargetLowering(const Tar
       setOperationAction(ISD::SHL, VT, Custom);
       setOperationAction(ISD::SRL, VT, Custom);
       setOperationAction(ISD::SRA, VT, Custom);
-      if (VT.getScalarType() == MVT::i1)
+      if (VT.getScalarType() == MVT::i1) {
         setOperationAction(ISD::SETCC, VT, Custom);
+        setOperationAction(ISD::TRUNCATE, VT, Custom);
+        setOperationAction(ISD::CONCAT_VECTORS, VT, Legal);
+      }
     }
   }
 
@@ -8858,6 +8861,16 @@ SDValue AArch64TargetLowering::LowerTRUNCATE(SDValue O
                                              SelectionDAG &DAG) const {
   EVT VT = Op.getValueType();
 
+  if (VT.getScalarType() == MVT::i1) {
+    // Lower i1 truncate to `(x & 1) != 0`.
+    SDLoc dl(Op);
+    EVT OpVT = Op.getOperand(0).getValueType();
+    SDValue Zero = DAG.getConstant(0, dl, OpVT);
+    SDValue One = DAG.getConstant(1, dl, OpVT);
+    SDValue And = DAG.getNode(ISD::AND, dl, OpVT, Op.getOperand(0), One);
+    return DAG.getSetCC(dl, VT, And, Zero, ISD::SETNE);
+  }
+
   if (!VT.isVector() || VT.isScalableVector())
     return Op;
 
@@ -12288,6 +12301,9 @@ static SDValue performLD1ReplicateCombine(SDNode *N, S
          "Unsupported opcode.");
   SDLoc DL(N);
   EVT VT = N->getValueType(0);
+  if (VT == MVT::nxv8bf16 &&
+      !static_cast(DAG.getSubtarget()).hasBF16())
+    return SDValue();
 
   EVT LoadVT = VT;
   if (VT.isFloatingPoint())
@@ -14908,6 +14924,11 @@ bool AArch64TargetLowering::fallBackToDAGISel(const In
   for (unsigned i = 0; i < Inst.getNumOperands(); ++i)
     if (isa(Inst.getOperand(i)->getType()))
       return true;
+
+  if (const AllocaInst *AI = dyn_cast(&Inst)) {
+    if (isa(AI->getAllocatedType()))
+      return true;
+  }
 
   return false;
 }

Modified: vendor/llvm-project/release-11.x/llvm/lib/Target/AArch64/AArch64InstrFormats.td
==============================================================================
--- vendor/llvm-project/release-11.x/llvm/lib/Target/AArch64/AArch64InstrFormats.td Fri Jul 31 21:43:56 2020 (r363743)
+++ vendor/llvm-project/release-11.x/llvm/lib/Target/AArch64/AArch64InstrFormats.td Fri Jul 31 22:12:34 2020 (r363744)
@@ -495,6 +495,9 @@ def SImmS4XForm : SDNodeXFormgetTargetConstant(N->getSExtValue() / 16, SDLoc(N), MVT::i64);
 }]>;
+def SImmS32XForm : SDNodeXFormgetTargetConstant(N->getSExtValue() / 32, SDLoc(N), MVT::i64);
+}]>;
 
 // simm6sN predicate - True if the immediate is a multiple of N in the range
 // [-32 * N, 31 * N].
@@ -546,7 +549,7 @@ def simm4s16 : Operand, ImmLeaf, ImmLeaf=-256 && Imm <= 224 && (Imm % 32) == 0x0; }]> {
+[{ return Imm >=-256 && Imm <= 224 && (Imm % 32) == 0x0; }], SImmS32XForm> {
   let PrintMethod = "printImmScale<32>";
   let ParserMatchClass = SImm4s32Operand;
   let DecoderMethod = "DecodeSImm<4>";

Modified: vendor/llvm-project/release-11.x/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
==============================================================================
--- vendor/llvm-project/release-11.x/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp Fri Jul 31 21:43:56 2020 (r363743)
+++ vendor/llvm-project/release-11.x/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp Fri Jul 31 22:12:34 2020 (r363744)
@@ -2744,6 +2744,35 @@ void AArch64InstrInfo::copyPhysReg(MachineBasicBlock &
     return;
   }
 
+  // Copy a Z register pair by copying the individual sub-registers.
+  if (AArch64::ZPR2RegClass.contains(DestReg) &&
+      AArch64::ZPR2RegClass.contains(SrcReg)) {
+    static const unsigned Indices[] = {AArch64::zsub0, AArch64::zsub1};
+    copyPhysRegTuple(MBB, I, DL, DestReg, SrcReg, KillSrc, AArch64::ORR_ZZZ,
+                     Indices);
+    return;
+  }
+
+  // Copy a Z register triple by copying the individual sub-registers.
+  if (AArch64::ZPR3RegClass.contains(DestReg) &&
+      AArch64::ZPR3RegClass.contains(SrcReg)) {
+    static const unsigned Indices[] = {AArch64::zsub0, AArch64::zsub1,
+                                       AArch64::zsub2};
+    copyPhysRegTuple(MBB, I, DL, DestReg, SrcReg, KillSrc, AArch64::ORR_ZZZ,
+                     Indices);
+    return;
+  }
+
+  // Copy a Z register quad by copying the individual sub-registers.
+  if (AArch64::ZPR4RegClass.contains(DestReg) &&
+      AArch64::ZPR4RegClass.contains(SrcReg)) {
+    static const unsigned Indices[] = {AArch64::zsub0, AArch64::zsub1,
+                                       AArch64::zsub2, AArch64::zsub3};
+    copyPhysRegTuple(MBB, I, DL, DestReg, SrcReg, KillSrc, AArch64::ORR_ZZZ,
+                     Indices);
+    return;
+  }
+
   if (AArch64::GPR64spRegClass.contains(DestReg) &&
       (AArch64::GPR64spRegClass.contains(SrcReg) || SrcReg == AArch64::XZR)) {
     if (DestReg == AArch64::SP || SrcReg == AArch64::SP) {

Modified: vendor/llvm-project/release-11.x/llvm/lib/Target/AArch64/AArch64RegisterInfo.cpp
==============================================================================
--- vendor/llvm-project/release-11.x/llvm/lib/Target/AArch64/AArch64RegisterInfo.cpp Fri Jul 31 21:43:56 2020 (r363743)
+++ vendor/llvm-project/release-11.x/llvm/lib/Target/AArch64/AArch64RegisterInfo.cpp Fri Jul 31 22:12:34 2020 (r363744)
@@ -40,6 +40,14 @@ AArch64RegisterInfo::AArch64RegisterInfo(const Triple
   AArch64_MC::initLLVMToCVRegMapping(this);
 }
 
+static bool hasSVEArgsOrReturn(const MachineFunction *MF) {
+  const Function &F = MF->getFunction();
+  return isa(F.getReturnType()) ||
+         any_of(F.args(), [](const Argument &Arg) {
+           return isa(Arg.getType());
+         });
+}
+
 const MCPhysReg *
 AArch64RegisterInfo::getCalleeSavedRegs(const MachineFunction *MF) const {
   assert(MF && "Invalid MachineFunction pointer.");
@@ -75,6 +83,8 @@ AArch64RegisterInfo::getCalleeSavedRegs(const MachineF
     // This is for OSes other than Windows; Windows is a separate case further
     // above.
     return CSR_AArch64_AAPCS_X18_SaveList;
+  if (hasSVEArgsOrReturn(MF))
+    return CSR_AArch64_SVE_AAPCS_SaveList;

*** DIFF OUTPUT TRUNCATED AT 1000 LINES ***