From owner-svn-src-all@freebsd.org Sat Jan 19 18:44:31 2019
Message-Id: <201901191844.x0JIiMB6010822@repo.freebsd.org>
From: Dimitry Andric <dim@FreeBSD.org>
Date: Sat, 19 Jan 2019 18:44:22 +0000 (UTC)
To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-vendor@freebsd.org
Subject: svn commit: r343191 - in vendor/llvm/dist-release_80: . include/llvm/CodeGen include/llvm/IR lib/CodeGen/AsmPrinter lib/CodeGen/SelectionDAG lib/MC lib/Target/AArch64 lib/Target/AMDGPU lib/Target/M...
Author: dim
Date: Sat Jan 19 18:44:22 2019
New Revision: 343191
URL: https://svnweb.freebsd.org/changeset/base/343191

Log:
  Vendor import of llvm release_80 branch r351543:
  https://llvm.org/svn/llvm-project/llvm/branches/release_80@351543

Added:
  vendor/llvm/dist-release_80/test/CodeGen/AArch64/seh-finally.ll
  vendor/llvm/dist-release_80/test/CodeGen/AArch64/seh-localescape.ll
  vendor/llvm/dist-release_80/test/CodeGen/AArch64/wineh8.mir
  vendor/llvm/dist-release_80/test/CodeGen/AMDGPU/llvm.amdgcn.ds.ordered.add.ll
  vendor/llvm/dist-release_80/test/CodeGen/AMDGPU/llvm.amdgcn.ds.ordered.swap.ll
  vendor/llvm/dist-release_80/test/Transforms/InstCombine/sink-alloca.ll
Modified:
  vendor/llvm/dist-release_80/CMakeLists.txt
  vendor/llvm/dist-release_80/include/llvm/CodeGen/MachineFunction.h
  vendor/llvm/dist-release_80/include/llvm/IR/IntrinsicsAMDGPU.td
  vendor/llvm/dist-release_80/lib/CodeGen/AsmPrinter/WinException.cpp
  vendor/llvm/dist-release_80/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
  vendor/llvm/dist-release_80/lib/MC/MCWin64EH.cpp
  vendor/llvm/dist-release_80/lib/Target/AArch64/AArch64AsmPrinter.cpp
  vendor/llvm/dist-release_80/lib/Target/AArch64/AArch64FrameLowering.cpp
  vendor/llvm/dist-release_80/lib/Target/AArch64/AArch64ISelLowering.cpp
  vendor/llvm/dist-release_80/lib/Target/AArch64/AArch64InstrInfo.td
  vendor/llvm/dist-release_80/lib/Target/AArch64/AArch64RegisterInfo.cpp
  vendor/llvm/dist-release_80/lib/Target/AMDGPU/AMDGPU.h
  vendor/llvm/dist-release_80/lib/Target/AMDGPU/AMDGPUISelLowering.cpp
  vendor/llvm/dist-release_80/lib/Target/AMDGPU/AMDGPUISelLowering.h
  vendor/llvm/dist-release_80/lib/Target/AMDGPU/AMDGPUSearchableTables.td
  vendor/llvm/dist-release_80/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp
  vendor/llvm/dist-release_80/lib/Target/AMDGPU/DSInstructions.td
  vendor/llvm/dist-release_80/lib/Target/AMDGPU/GCNHazardRecognizer.cpp
  vendor/llvm/dist-release_80/lib/Target/AMDGPU/SIISelLowering.cpp
  vendor/llvm/dist-release_80/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
  vendor/llvm/dist-release_80/lib/Target/AMDGPU/SIInstrInfo.cpp
  vendor/llvm/dist-release_80/lib/Target/AMDGPU/SIInstrInfo.h
  vendor/llvm/dist-release_80/lib/Target/AMDGPU/SIInstrInfo.td
  vendor/llvm/dist-release_80/lib/Target/MSP430/MSP430AsmPrinter.cpp
  vendor/llvm/dist-release_80/lib/Target/X86/X86ISelLowering.cpp
  vendor/llvm/dist-release_80/lib/Target/X86/X86ISelLowering.h
  vendor/llvm/dist-release_80/lib/Target/X86/X86InstrAVX512.td
  vendor/llvm/dist-release_80/lib/Target/X86/X86InstrFragmentsSIMD.td
  vendor/llvm/dist-release_80/lib/Target/X86/X86InstrSSE.td
  vendor/llvm/dist-release_80/lib/Target/X86/X86IntrinsicsInfo.h
  vendor/llvm/dist-release_80/lib/Transforms/InstCombine/InstructionCombining.cpp
  vendor/llvm/dist-release_80/lib/Transforms/Scalar/SROA.cpp
  vendor/llvm/dist-release_80/lib/Transforms/Vectorize/SLPVectorizer.cpp
  vendor/llvm/dist-release_80/test/CodeGen/AArch64/wineh4.mir
  vendor/llvm/dist-release_80/test/CodeGen/MSP430/2009-12-21-FrameAddr.ll
  vendor/llvm/dist-release_80/test/CodeGen/MSP430/fp.ll
  vendor/llvm/dist-release_80/test/CodeGen/MSP430/interrupt.ll
  vendor/llvm/dist-release_80/test/CodeGen/X86/avx2-intrinsics-x86.ll
  vendor/llvm/dist-release_80/test/CodeGen/X86/avx512-intrinsics.ll
  vendor/llvm/dist-release_80/test/CodeGen/X86/avx512bw-intrinsics.ll
  vendor/llvm/dist-release_80/test/CodeGen/X86/avx512bwvl-intrinsics.ll
  vendor/llvm/dist-release_80/test/Transforms/SLPVectorizer/X86/PR39774.ll
  vendor/llvm/dist-release_80/test/Transforms/SLPVectorizer/X86/PR40310.ll
  vendor/llvm/dist-release_80/test/Transforms/SROA/basictest.ll
  vendor/llvm/dist-release_80/utils/release/build_llvm_package.bat

Modified: vendor/llvm/dist-release_80/CMakeLists.txt
==============================================================================
--- vendor/llvm/dist-release_80/CMakeLists.txt	Sat Jan 19 16:04:26 2019	(r343190)
+++ vendor/llvm/dist-release_80/CMakeLists.txt	Sat Jan 19 18:44:22 2019	(r343191)
@@ -21,7 +21,7 @@ if(NOT DEFINED LLVM_VERSION_PATCH)
   set(LLVM_VERSION_PATCH 0)
 endif()
 if(NOT DEFINED LLVM_VERSION_SUFFIX)
-  set(LLVM_VERSION_SUFFIX svn)
+  set(LLVM_VERSION_SUFFIX "")
 endif()
 
 if (NOT PACKAGE_VERSION)

Modified: vendor/llvm/dist-release_80/include/llvm/CodeGen/MachineFunction.h
==============================================================================
--- vendor/llvm/dist-release_80/include/llvm/CodeGen/MachineFunction.h	Sat Jan 19 16:04:26 2019	(r343190)
+++ vendor/llvm/dist-release_80/include/llvm/CodeGen/MachineFunction.h	Sat Jan 19 18:44:22 2019	(r343191)
@@ -329,6 +329,7 @@ class MachineFunction {
   bool CallsUnwindInit = false;
   bool HasEHScopes = false;
   bool HasEHFunclets = false;
+  bool HasLocalEscape = false;
 
   /// List of C++ TypeInfo used.
  std::vector<const GlobalValue *> TypeInfos;

@@ -810,6 +811,9 @@ class MachineFunction {
   bool hasEHFunclets() const { return HasEHFunclets; }
   void setHasEHFunclets(bool V) { HasEHFunclets = V; }
 
+  bool hasLocalEscape() const { return HasLocalEscape; }
+  void setHasLocalEscape(bool V) { HasLocalEscape = V; }
+
   /// Find or create an LandingPadInfo for the specified MachineBasicBlock.
   LandingPadInfo &getOrCreateLandingPadInfo(MachineBasicBlock *LandingPad);

Modified: vendor/llvm/dist-release_80/include/llvm/IR/IntrinsicsAMDGPU.td
==============================================================================
--- vendor/llvm/dist-release_80/include/llvm/IR/IntrinsicsAMDGPU.td	Sat Jan 19 16:04:26 2019	(r343190)
+++ vendor/llvm/dist-release_80/include/llvm/IR/IntrinsicsAMDGPU.td	Sat Jan 19 18:44:22 2019	(r343191)
@@ -392,6 +392,24 @@ class AMDGPULDSF32Intrin :
   [IntrArgMemOnly, NoCapture<0>]
 >;
 
+class AMDGPUDSOrderedIntrinsic : Intrinsic<
+  [llvm_i32_ty],
+  // M0 = {hi16:address, lo16:waveID}. Allow passing M0 as a pointer, so that
+  // the bit packing can be optimized at the IR level.
+  [LLVMQualPointerType<llvm_i32_ty, 2>, // IntToPtr(M0)
+   llvm_i32_ty, // value to add or swap
+   llvm_i32_ty, // ordering
+   llvm_i32_ty, // scope
+   llvm_i1_ty,  // isVolatile
+   llvm_i32_ty, // ordered count index (OA index), also added to the address
+   llvm_i1_ty,  // wave release, usually set to 1
+   llvm_i1_ty], // wave done, set to 1 for the last ordered instruction
+  [NoCapture<0>]
+>;
+
+def int_amdgcn_ds_ordered_add : AMDGPUDSOrderedIntrinsic;
+def int_amdgcn_ds_ordered_swap : AMDGPUDSOrderedIntrinsic;
+
 def int_amdgcn_ds_fadd : AMDGPULDSF32Intrin<"__builtin_amdgcn_ds_faddf">;
 def int_amdgcn_ds_fmin : AMDGPULDSF32Intrin<"__builtin_amdgcn_ds_fminf">;
 def int_amdgcn_ds_fmax : AMDGPULDSF32Intrin<"__builtin_amdgcn_ds_fmaxf">;

Modified: vendor/llvm/dist-release_80/lib/CodeGen/AsmPrinter/WinException.cpp
==============================================================================
--- vendor/llvm/dist-release_80/lib/CodeGen/AsmPrinter/WinException.cpp	Sat Jan 19 16:04:26 2019	(r343190)
+++ vendor/llvm/dist-release_80/lib/CodeGen/AsmPrinter/WinException.cpp	Sat Jan 19 18:44:22 2019	(r343191)
@@ -545,15 +545,17 @@ void WinException::emitCSpecificHandlerTable(const Mac
     OS.AddComment(Comment);
   };
 
-  // Emit a label assignment with the SEH frame offset so we can use it for
-  // llvm.eh.recoverfp.
-  StringRef FLinkageName =
-      GlobalValue::dropLLVMManglingEscape(MF->getFunction().getName());
-  MCSymbol *ParentFrameOffset =
-      Ctx.getOrCreateParentFrameOffsetSymbol(FLinkageName);
-  const MCExpr *MCOffset =
-      MCConstantExpr::create(FuncInfo.SEHSetFrameOffset, Ctx);
-  Asm->OutStreamer->EmitAssignment(ParentFrameOffset, MCOffset);
+  if (!isAArch64) {
+    // Emit a label assignment with the SEH frame offset so we can use it for
+    // llvm.eh.recoverfp.
+    StringRef FLinkageName =
+        GlobalValue::dropLLVMManglingEscape(MF->getFunction().getName());
+    MCSymbol *ParentFrameOffset =
+        Ctx.getOrCreateParentFrameOffsetSymbol(FLinkageName);
+    const MCExpr *MCOffset =
+        MCConstantExpr::create(FuncInfo.SEHSetFrameOffset, Ctx);
+    Asm->OutStreamer->EmitAssignment(ParentFrameOffset, MCOffset);
+  }
 
   // Use the assembler to compute the number of table entries through label
   // difference and division.
@@ -937,6 +939,9 @@ void WinException::emitEHRegistrationOffsetLabel(const
   if (FI != INT_MAX) {
     const TargetFrameLowering *TFI = Asm->MF->getSubtarget().getFrameLowering();
     unsigned UnusedReg;
+    // FIXME: getFrameIndexReference needs to match the behavior of
+    // AArch64RegisterInfo::hasBasePointer in which one of the scenarios where
+    // SP is used is if frame size >= 256.
     Offset = TFI->getFrameIndexReference(*Asm->MF, FI, UnusedReg);
   }

Modified: vendor/llvm/dist-release_80/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
==============================================================================
--- vendor/llvm/dist-release_80/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp	Sat Jan 19 16:04:26 2019	(r343190)
+++ vendor/llvm/dist-release_80/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp	Sat Jan 19 18:44:22 2019	(r343191)
@@ -6182,6 +6182,8 @@ SelectionDAGBuilder::visitIntrinsicCall(const CallInst
           .addFrameIndex(FI);
     }
 
+    MF.setHasLocalEscape(true);
+
     return nullptr;
   }

Modified: vendor/llvm/dist-release_80/lib/MC/MCWin64EH.cpp
==============================================================================
--- vendor/llvm/dist-release_80/lib/MC/MCWin64EH.cpp	Sat Jan 19 16:04:26 2019	(r343190)
+++ vendor/llvm/dist-release_80/lib/MC/MCWin64EH.cpp	Sat Jan 19 18:44:22 2019	(r343191)
@@ -453,6 +453,38 @@ static void ARM64EmitUnwindCode(MCStreamer &streamer,
   }
 }
 
+// Returns the epilog symbol of an epilog with the exact same unwind code
+// sequence, if it exists.  Otherwise, returns nullptr.
+// EpilogInstrs - Unwind codes for the current epilog.
+// Epilogs - Epilogs that potentially match the current epilog.
+static MCSymbol*
+FindMatchingEpilog(const std::vector<WinEH::Instruction>& EpilogInstrs,
+                   const std::vector<MCSymbol *>& Epilogs,
+                   const WinEH::FrameInfo *info) {
+  for (auto *EpilogStart : Epilogs) {
+    auto InstrsIter = info->EpilogMap.find(EpilogStart);
+    assert(InstrsIter != info->EpilogMap.end() &&
+           "Epilog not found in EpilogMap");
+    const auto &Instrs = InstrsIter->second;
+
+    if (Instrs.size() != EpilogInstrs.size())
+      continue;
+
+    bool Match = true;
+    for (unsigned i = 0; i < Instrs.size(); ++i)
+      if (Instrs[i].Operation != EpilogInstrs[i].Operation ||
+          Instrs[i].Offset != EpilogInstrs[i].Offset ||
+          Instrs[i].Register != EpilogInstrs[i].Register) {
+        Match = false;
+        break;
+      }
+
+    if (Match)
+      return EpilogStart;
+  }
+
+  return nullptr;
+}
+
 // Populate the .xdata section.  The format of .xdata on ARM64 is documented at
 // https://docs.microsoft.com/en-us/cpp/build/arm64-exception-handling
 static void ARM64EmitUnwindInfo(MCStreamer &streamer, WinEH::FrameInfo *info) {
@@ -477,12 +509,28 @@ static void ARM64EmitUnwindInfo(MCStreamer &streamer,
   // Process epilogs.
   MapVector<MCSymbol *, uint32_t> EpilogInfo;
+  // Epilogs processed so far.
+  std::vector<MCSymbol *> AddedEpilogs;
+
   for (auto &I : info->EpilogMap) {
     MCSymbol *EpilogStart = I.first;
     auto &EpilogInstrs = I.second;
     uint32_t CodeBytes = ARM64CountOfUnwindCodes(EpilogInstrs);
-    EpilogInfo[EpilogStart] = TotalCodeBytes;
-    TotalCodeBytes += CodeBytes;
+
+    MCSymbol* MatchingEpilog =
+        FindMatchingEpilog(EpilogInstrs, AddedEpilogs, info);
+    if (MatchingEpilog) {
+      assert(EpilogInfo.find(MatchingEpilog) != EpilogInfo.end() &&
+             "Duplicate epilog not found");
+      EpilogInfo[EpilogStart] = EpilogInfo[MatchingEpilog];
+      // Clear the unwind codes in the EpilogMap, so that they don't get output
+      // in the logic below.
+      EpilogInstrs.clear();
+    } else {
+      EpilogInfo[EpilogStart] = TotalCodeBytes;
+      TotalCodeBytes += CodeBytes;
+      AddedEpilogs.push_back(EpilogStart);
+    }
   }
 
   // Code Words, Epilog count, E, X, Vers, Function Length

Modified: vendor/llvm/dist-release_80/lib/Target/AArch64/AArch64AsmPrinter.cpp
==============================================================================
--- vendor/llvm/dist-release_80/lib/Target/AArch64/AArch64AsmPrinter.cpp	Sat Jan 19 16:04:26 2019	(r343190)
+++ vendor/llvm/dist-release_80/lib/Target/AArch64/AArch64AsmPrinter.cpp	Sat Jan 19 18:44:22 2019	(r343191)
@@ -694,6 +694,34 @@ void AArch64AsmPrinter::EmitInstruction(const MachineI
   switch (MI->getOpcode()) {
   default:
     break;
+  case AArch64::MOVMCSym: {
+    unsigned DestReg = MI->getOperand(0).getReg();
+    const MachineOperand &MO_Sym = MI->getOperand(1);
+    MachineOperand Hi_MOSym(MO_Sym), Lo_MOSym(MO_Sym);
+    MCOperand Hi_MCSym, Lo_MCSym;
+
+    Hi_MOSym.setTargetFlags(AArch64II::MO_G1 | AArch64II::MO_S);
+    Lo_MOSym.setTargetFlags(AArch64II::MO_G0 | AArch64II::MO_NC);
+
+    MCInstLowering.lowerOperand(Hi_MOSym, Hi_MCSym);
+    MCInstLowering.lowerOperand(Lo_MOSym, Lo_MCSym);
+
+    MCInst MovZ;
+    MovZ.setOpcode(AArch64::MOVZXi);
+    MovZ.addOperand(MCOperand::createReg(DestReg));
+    MovZ.addOperand(Hi_MCSym);
+    MovZ.addOperand(MCOperand::createImm(16));
+    EmitToStreamer(*OutStreamer, MovZ);
+
+    MCInst MovK;
+    MovK.setOpcode(AArch64::MOVKXi);
+    MovK.addOperand(MCOperand::createReg(DestReg));
+    MovK.addOperand(MCOperand::createReg(DestReg));
+    MovK.addOperand(Lo_MCSym);
+    MovK.addOperand(MCOperand::createImm(0));
+    EmitToStreamer(*OutStreamer, MovK);
+    return;
+  }
   case AArch64::MOVIv2d_ns:
     // If the target has , lower this
     // instruction to movi.16b instead.
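The MOVMCSym lowering in the AArch64AsmPrinter hunk above materializes a symbol's 32-bit value with a MOVZ/MOVK pair: MOVZ writes the high halfword (bits 31:16, MO_G1) and zeroes the rest of the register, then MOVK inserts the low halfword (bits 15:0, MO_G0) without disturbing the upper bits. A minimal sketch of just that bit arithmetic, with helper names of our own invention (the real code emits MCInsts, not integer math):

```cpp
#include <cassert>
#include <cstdint>

// MOVZ Xd, #imm16, lsl #16: register becomes imm16 << 16, all other bits zero.
uint64_t movz_lsl16(uint16_t imm) { return uint64_t(imm) << 16; }

// MOVK Xd, #imm16: replaces bits 15:0, keeps everything else.
uint64_t movk_lsl0(uint64_t reg, uint16_t imm) {
  return (reg & ~uint64_t(0xFFFF)) | imm;
}

// The two-instruction sequence emitted for MOVMCSym, in arithmetic form.
uint64_t materialize32(uint32_t value) {
  uint64_t x = movz_lsl16(uint16_t(value >> 16)); // MOVZ: bits 31:16 (MO_G1)
  x = movk_lsl0(x, uint16_t(value & 0xFFFF));     // MOVK: bits 15:0  (MO_G0)
  return x;
}
```

The MO_S / MO_NC flags in the hunk select the signed and no-overflow-check relocation variants for the two halfword fixups; the arithmetic above only illustrates how the halves recombine.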
Modified: vendor/llvm/dist-release_80/lib/Target/AArch64/AArch64FrameLowering.cpp
==============================================================================
--- vendor/llvm/dist-release_80/lib/Target/AArch64/AArch64FrameLowering.cpp	Sat Jan 19 16:04:26 2019	(r343190)
+++ vendor/llvm/dist-release_80/lib/Target/AArch64/AArch64FrameLowering.cpp	Sat Jan 19 18:44:22 2019	(r343191)
@@ -228,6 +228,10 @@ bool AArch64FrameLowering::hasFP(const MachineFunction
       MFI.getMaxCallFrameSize() > DefaultSafeSPDisplacement)
     return true;
 
+  // Win64 SEH requires frame pointer if funclets are present.
+  if (MF.hasLocalEscape())
+    return true;
+
   return false;
 }

Modified: vendor/llvm/dist-release_80/lib/Target/AArch64/AArch64ISelLowering.cpp
==============================================================================
--- vendor/llvm/dist-release_80/lib/Target/AArch64/AArch64ISelLowering.cpp	Sat Jan 19 16:04:26 2019	(r343190)
+++ vendor/llvm/dist-release_80/lib/Target/AArch64/AArch64ISelLowering.cpp	Sat Jan 19 18:44:22 2019	(r343191)
@@ -2743,6 +2743,34 @@ SDValue AArch64TargetLowering::LowerINTRINSIC_WO_CHAIN
   case Intrinsic::aarch64_neon_umin:
     return DAG.getNode(ISD::UMIN, dl, Op.getValueType(),
                       Op.getOperand(1), Op.getOperand(2));
+
+  case Intrinsic::localaddress: {
+    // Returns one of the stack, base, or frame pointer registers, depending on
+    // which is used to reference local variables.
+    MachineFunction &MF = DAG.getMachineFunction();
+    const AArch64RegisterInfo *RegInfo = Subtarget->getRegisterInfo();
+    unsigned Reg;
+    if (RegInfo->hasBasePointer(MF))
+      Reg = RegInfo->getBaseRegister();
+    else // This function handles the SP or FP case.
+      Reg = RegInfo->getFrameRegister(MF);
+    return DAG.getCopyFromReg(DAG.getEntryNode(), dl, Reg,
+                              Op.getSimpleValueType());
+  }
+
+  case Intrinsic::eh_recoverfp: {
+    // FIXME: This needs to be implemented to correctly handle highly aligned
+    // stack objects. For now we simply return the incoming FP. Refer D53541
+    // for more details.
+    SDValue FnOp = Op.getOperand(1);
+    SDValue IncomingFPOp = Op.getOperand(2);
+    GlobalAddressSDNode *GSD = dyn_cast<GlobalAddressSDNode>(FnOp);
+    auto *Fn = dyn_cast_or_null<Function>(GSD ? GSD->getGlobal() : nullptr);
+    if (!Fn)
+      report_fatal_error(
+          "llvm.eh.recoverfp must take a function as the first argument");
+    return IncomingFPOp;
+  }
   }
 }

Modified: vendor/llvm/dist-release_80/lib/Target/AArch64/AArch64InstrInfo.td
==============================================================================
--- vendor/llvm/dist-release_80/lib/Target/AArch64/AArch64InstrInfo.td	Sat Jan 19 16:04:26 2019	(r343190)
+++ vendor/llvm/dist-release_80/lib/Target/AArch64/AArch64InstrInfo.td	Sat Jan 19 18:44:22 2019	(r343191)
@@ -133,7 +133,11 @@ def UseNegativeImmediates
     : Predicate<"false">, AssemblerPredicate<"!FeatureNoNegativeImmediates",
                                              "NegativeImmediates">;
 
+def AArch64LocalRecover : SDNode<"ISD::LOCAL_RECOVER",
+                                 SDTypeProfile<1, 1, [SDTCisSameAs<0, 1>,
+                                                      SDTCisInt<1>]>>;
+
 //===----------------------------------------------------------------------===//
 // AArch64-specific DAG Nodes.
 //

@@ -6800,6 +6804,9 @@ def : Pat<(AArch64tcret tglobaladdr:$dst, (i32 timm:$F
           (TCRETURNdi texternalsym:$dst, imm:$FPDiff)>;
 def : Pat<(AArch64tcret texternalsym:$dst, (i32 timm:$FPDiff)),
           (TCRETURNdi texternalsym:$dst, imm:$FPDiff)>;
+
+def MOVMCSym : Pseudo<(outs GPR64:$dst), (ins i64imm:$sym), []>, Sched<[]>;
+def : Pat<(i64 (AArch64LocalRecover mcsym:$sym)), (MOVMCSym mcsym:$sym)>;
 
 include "AArch64InstrAtomics.td"
 include "AArch64SVEInstrInfo.td"

Modified: vendor/llvm/dist-release_80/lib/Target/AArch64/AArch64RegisterInfo.cpp
==============================================================================
--- vendor/llvm/dist-release_80/lib/Target/AArch64/AArch64RegisterInfo.cpp	Sat Jan 19 16:04:26 2019	(r343190)
+++ vendor/llvm/dist-release_80/lib/Target/AArch64/AArch64RegisterInfo.cpp	Sat Jan 19 18:44:22 2019	(r343191)
@@ -466,6 +466,13 @@ void AArch64RegisterInfo::eliminateFrameIndex(MachineB
   // Modify MI as necessary to handle as much of 'Offset' as possible
   Offset = TFI->resolveFrameIndexReference(MF, FrameIndex, FrameReg);
+
+  if (MI.getOpcode() == TargetOpcode::LOCAL_ESCAPE) {
+    MachineOperand &FI = MI.getOperand(FIOperandNum);
+    FI.ChangeToImmediate(Offset);
+    return;
+  }
+
   if (rewriteAArch64FrameIndex(MI, FIOperandNum, FrameReg, Offset, TII))
     return;

Modified: vendor/llvm/dist-release_80/lib/Target/AMDGPU/AMDGPU.h
==============================================================================
--- vendor/llvm/dist-release_80/lib/Target/AMDGPU/AMDGPU.h	Sat Jan 19 16:04:26 2019	(r343190)
+++ vendor/llvm/dist-release_80/lib/Target/AMDGPU/AMDGPU.h	Sat Jan 19 18:44:22 2019	(r343191)
@@ -254,7 +254,7 @@ namespace AMDGPUAS {
   FLAT_ADDRESS = 0,     ///< Address space for flat memory.
   GLOBAL_ADDRESS = 1,   ///< Address space for global memory (RAT0, VTX0).
-  REGION_ADDRESS = 2,   ///< Address space for region memory.
+  REGION_ADDRESS = 2,   ///< Address space for region memory. (GDS)
 
   CONSTANT_ADDRESS = 4, ///< Address space for constant memory (VTX2)
   LOCAL_ADDRESS = 3,    ///< Address space for local memory.

Modified: vendor/llvm/dist-release_80/lib/Target/AMDGPU/AMDGPUISelLowering.cpp
==============================================================================
--- vendor/llvm/dist-release_80/lib/Target/AMDGPU/AMDGPUISelLowering.cpp	Sat Jan 19 16:04:26 2019	(r343190)
+++ vendor/llvm/dist-release_80/lib/Target/AMDGPU/AMDGPUISelLowering.cpp	Sat Jan 19 18:44:22 2019	(r343191)
@@ -4192,6 +4192,7 @@ const char* AMDGPUTargetLowering::getTargetNodeName(un
   NODE_NAME_CASE(TBUFFER_STORE_FORMAT_D16)
   NODE_NAME_CASE(TBUFFER_LOAD_FORMAT)
   NODE_NAME_CASE(TBUFFER_LOAD_FORMAT_D16)
+  NODE_NAME_CASE(DS_ORDERED_COUNT)
   NODE_NAME_CASE(ATOMIC_CMP_SWAP)
   NODE_NAME_CASE(ATOMIC_INC)
   NODE_NAME_CASE(ATOMIC_DEC)

Modified: vendor/llvm/dist-release_80/lib/Target/AMDGPU/AMDGPUISelLowering.h
==============================================================================
--- vendor/llvm/dist-release_80/lib/Target/AMDGPU/AMDGPUISelLowering.h	Sat Jan 19 16:04:26 2019	(r343190)
+++ vendor/llvm/dist-release_80/lib/Target/AMDGPU/AMDGPUISelLowering.h	Sat Jan 19 18:44:22 2019	(r343191)
@@ -474,6 +474,7 @@ enum NodeType : unsigned {
   TBUFFER_STORE_FORMAT_D16,
   TBUFFER_LOAD_FORMAT,
   TBUFFER_LOAD_FORMAT_D16,
+  DS_ORDERED_COUNT,
   ATOMIC_CMP_SWAP,
   ATOMIC_INC,
   ATOMIC_DEC,

Modified: vendor/llvm/dist-release_80/lib/Target/AMDGPU/AMDGPUSearchableTables.td
==============================================================================
--- vendor/llvm/dist-release_80/lib/Target/AMDGPU/AMDGPUSearchableTables.td	Sat Jan 19 16:04:26 2019	(r343190)
+++ vendor/llvm/dist-release_80/lib/Target/AMDGPU/AMDGPUSearchableTables.td	Sat Jan 19 18:44:22 2019	(r343191)
@@ -72,6 +72,8 @@ def : SourceOfDivergence
 def : SourceOfDivergence;
 def : SourceOfDivergence;
 def : SourceOfDivergence;
+def : SourceOfDivergence<int_amdgcn_ds_ordered_add>;
+def : SourceOfDivergence<int_amdgcn_ds_ordered_swap>;
 
 foreach intr = AMDGPUImageDimAtomicIntrinsics in
 def : SourceOfDivergence<intr>;
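The new ds_ordered_add/ds_ordered_swap intrinsics declared in the IntrinsicsAMDGPU.td hunk above are lowered (in the SIISelLowering.cpp hunk further down) by packing the ordered-count index, wave_release, wave_done, shader type, and instruction selector into a single 16-bit DS offset field. A standalone sketch of that packing, with the shift constants copied from the lowering code and a helper name of our own:

```cpp
#include <cassert>
#include <cstdint>

// Mirrors the Offset0/Offset1 computation in
// SITargetLowering::LowerINTRINSIC_W_CHAIN for amdgcn_ds_ordered_*;
// the function name is ours, for illustration only.
uint16_t packDsOrderedOffset(unsigned OrderedCountIndex, unsigned WaveRelease,
                             unsigned WaveDone, unsigned ShaderType,
                             unsigned Instruction) {
  unsigned Offset0 = OrderedCountIndex << 2;          // low byte
  unsigned Offset1 = WaveRelease | (WaveDone << 1) |  // high byte
                     (ShaderType << 2) | (Instruction << 4);
  return uint16_t(Offset0 | (Offset1 << 8));
}
```

Here Instruction is 0 for ds_ordered_add and 1 for ds_ordered_swap, and ShaderType follows the calling-convention switch in the same hunk (0 for CS/kernel, 1 for PS, 2 for VS, 3 for GS).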
Modified: vendor/llvm/dist-release_80/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp
==============================================================================
--- vendor/llvm/dist-release_80/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp	Sat Jan 19 16:04:26 2019	(r343190)
+++ vendor/llvm/dist-release_80/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp	Sat Jan 19 18:44:22 2019	(r343191)
@@ -308,6 +308,8 @@ bool GCNTTIImpl::getTgtMemIntrinsic(IntrinsicInst *Ins
   switch (Inst->getIntrinsicID()) {
   case Intrinsic::amdgcn_atomic_inc:
   case Intrinsic::amdgcn_atomic_dec:
+  case Intrinsic::amdgcn_ds_ordered_add:
+  case Intrinsic::amdgcn_ds_ordered_swap:
   case Intrinsic::amdgcn_ds_fadd:
   case Intrinsic::amdgcn_ds_fmin:
   case Intrinsic::amdgcn_ds_fmax: {

Modified: vendor/llvm/dist-release_80/lib/Target/AMDGPU/DSInstructions.td
==============================================================================
--- vendor/llvm/dist-release_80/lib/Target/AMDGPU/DSInstructions.td	Sat Jan 19 16:04:26 2019	(r343190)
+++ vendor/llvm/dist-release_80/lib/Target/AMDGPU/DSInstructions.td	Sat Jan 19 18:44:22 2019	(r343191)
@@ -817,6 +817,11 @@ defm : DSAtomicRetPat_mc;
 
+def : Pat <
+  (SIds_ordered_count i32:$value, i16:$offset),
+  (DS_ORDERED_COUNT $value, (as_i16imm $offset))
+>;
+
 //===----------------------------------------------------------------------===//
 // Real instructions
 //===----------------------------------------------------------------------===//

Modified: vendor/llvm/dist-release_80/lib/Target/AMDGPU/GCNHazardRecognizer.cpp
==============================================================================
--- vendor/llvm/dist-release_80/lib/Target/AMDGPU/GCNHazardRecognizer.cpp	Sat Jan 19 16:04:26 2019	(r343190)
+++ vendor/llvm/dist-release_80/lib/Target/AMDGPU/GCNHazardRecognizer.cpp	Sat Jan 19 18:44:22 2019	(r343191)
@@ -88,14 +88,28 @@ static bool isSMovRel(unsigned Opcode) {
   }
 }
 
-static bool isSendMsgTraceDataOrGDS(const MachineInstr &MI) {
+static bool
+isSendMsgTraceDataOrGDS(const SIInstrInfo &TII,
+                        const MachineInstr &MI) {
+  if (TII.isAlwaysGDS(MI.getOpcode()))
+    return true;
+
   switch (MI.getOpcode()) {
   case AMDGPU::S_SENDMSG:
   case AMDGPU::S_SENDMSGHALT:
   case AMDGPU::S_TTRACEDATA:
     return true;
+  // These DS opcodes don't support GDS.
+  case AMDGPU::DS_NOP:
+  case AMDGPU::DS_PERMUTE_B32:
+  case AMDGPU::DS_BPERMUTE_B32:
+    return false;
   default:
-    // TODO: GDS
+    if (TII.isDS(MI.getOpcode())) {
+      int GDS = AMDGPU::getNamedOperandIdx(MI.getOpcode(),
+                                           AMDGPU::OpName::gds);
+      if (MI.getOperand(GDS).getImm())
+        return true;
+    }
     return false;
   }
 }
@@ -145,7 +159,7 @@ GCNHazardRecognizer::getHazardType(SUnit *SU, int Stal
       checkReadM0Hazards(MI) > 0)
     return NoopHazard;
 
-  if (ST.hasReadM0SendMsgHazard() && isSendMsgTraceDataOrGDS(*MI) &&
+  if (ST.hasReadM0SendMsgHazard() && isSendMsgTraceDataOrGDS(TII, *MI) &&
       checkReadM0Hazards(MI) > 0)
     return NoopHazard;
 
@@ -199,7 +213,7 @@ unsigned GCNHazardRecognizer::PreEmitNoops(MachineInst
       isSMovRel(MI->getOpcode())))
     return std::max(WaitStates, checkReadM0Hazards(MI));
 
-  if (ST.hasReadM0SendMsgHazard() && isSendMsgTraceDataOrGDS(*MI))
+  if (ST.hasReadM0SendMsgHazard() && isSendMsgTraceDataOrGDS(TII, *MI))
     return std::max(WaitStates, checkReadM0Hazards(MI));
 
   return WaitStates;

Modified: vendor/llvm/dist-release_80/lib/Target/AMDGPU/SIISelLowering.cpp
==============================================================================
--- vendor/llvm/dist-release_80/lib/Target/AMDGPU/SIISelLowering.cpp	Sat Jan 19 16:04:26 2019	(r343190)
+++ vendor/llvm/dist-release_80/lib/Target/AMDGPU/SIISelLowering.cpp	Sat Jan 19 18:44:22 2019	(r343191)
@@ -910,6 +910,8 @@ bool SITargetLowering::getTgtMemIntrinsic(IntrinsicInf
   switch (IntrID) {
   case Intrinsic::amdgcn_atomic_inc:
   case Intrinsic::amdgcn_atomic_dec:
+  case Intrinsic::amdgcn_ds_ordered_add:
+  case Intrinsic::amdgcn_ds_ordered_swap:
   case Intrinsic::amdgcn_ds_fadd:
   case Intrinsic::amdgcn_ds_fmin:
   case Intrinsic::amdgcn_ds_fmax: {
@@ -937,6 +939,8 @@ bool SITargetLowering::getAddrModeArguments(IntrinsicI
   switch (II->getIntrinsicID()) {
   case Intrinsic::amdgcn_atomic_inc:
   case Intrinsic::amdgcn_atomic_dec:
+  case Intrinsic::amdgcn_ds_ordered_add:
+  case Intrinsic::amdgcn_ds_ordered_swap:
   case Intrinsic::amdgcn_ds_fadd:
   case Intrinsic::amdgcn_ds_fmin:
   case Intrinsic::amdgcn_ds_fmax: {
@@ -5438,6 +5442,63 @@ SDValue SITargetLowering::LowerINTRINSIC_W_CHAIN(SDVal
   SDLoc DL(Op);
 
   switch (IntrID) {
+  case Intrinsic::amdgcn_ds_ordered_add:
+  case Intrinsic::amdgcn_ds_ordered_swap: {
+    MemSDNode *M = cast<MemSDNode>(Op);
+    SDValue Chain = M->getOperand(0);
+    SDValue M0 = M->getOperand(2);
+    SDValue Value = M->getOperand(3);
+    unsigned OrderedCountIndex = M->getConstantOperandVal(7);
+    unsigned WaveRelease = M->getConstantOperandVal(8);
+    unsigned WaveDone = M->getConstantOperandVal(9);
+    unsigned ShaderType;
+    unsigned Instruction;
+
+    switch (IntrID) {
+    case Intrinsic::amdgcn_ds_ordered_add:
+      Instruction = 0;
+      break;
+    case Intrinsic::amdgcn_ds_ordered_swap:
+      Instruction = 1;
+      break;
+    }
+
+    if (WaveDone && !WaveRelease)
+      report_fatal_error("ds_ordered_count: wave_done requires wave_release");
+
+    switch (DAG.getMachineFunction().getFunction().getCallingConv()) {
+    case CallingConv::AMDGPU_CS:
+    case CallingConv::AMDGPU_KERNEL:
+      ShaderType = 0;
+      break;
+    case CallingConv::AMDGPU_PS:
+      ShaderType = 1;
+      break;
+    case CallingConv::AMDGPU_VS:
+      ShaderType = 2;
+      break;
+    case CallingConv::AMDGPU_GS:
+      ShaderType = 3;
+      break;
+    default:
+      report_fatal_error("ds_ordered_count unsupported for this calling conv");
+    }
+
+    unsigned Offset0 = OrderedCountIndex << 2;
+    unsigned Offset1 = WaveRelease | (WaveDone << 1) | (ShaderType << 2) |
+                       (Instruction << 4);
+    unsigned Offset = Offset0 | (Offset1 << 8);
+
+    SDValue Ops[] = {
+      Chain,
+      Value,
+      DAG.getTargetConstant(Offset, DL, MVT::i16),
+      copyToM0(DAG, Chain, DL, M0).getValue(1), // Glue
+    };
+    return DAG.getMemIntrinsicNode(AMDGPUISD::DS_ORDERED_COUNT, DL,
+                                   M->getVTList(), Ops, M->getMemoryVT(),
+                                   M->getMemOperand());
+  }
   case Intrinsic::amdgcn_atomic_inc:
   case Intrinsic::amdgcn_atomic_dec:
   case Intrinsic::amdgcn_ds_fadd:

Modified: vendor/llvm/dist-release_80/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
==============================================================================
--- vendor/llvm/dist-release_80/lib/Target/AMDGPU/SIInsertWaitcnts.cpp	Sat Jan 19 16:04:26 2019	(r343190)
+++ vendor/llvm/dist-release_80/lib/Target/AMDGPU/SIInsertWaitcnts.cpp	Sat Jan 19 18:44:22 2019	(r343191)
@@ -536,11 +536,14 @@ void WaitcntBrackets::updateByEvent(const SIInstrInfo
                     CurrScore);
     }
     if (Inst.mayStore()) {
-      setExpScore(
-          &Inst, TII, TRI, MRI,
-          AMDGPU::getNamedOperandIdx(Inst.getOpcode(), AMDGPU::OpName::data0),
-          CurrScore);
       if (AMDGPU::getNamedOperandIdx(Inst.getOpcode(),
+                                     AMDGPU::OpName::data0) != -1) {
+        setExpScore(
+            &Inst, TII, TRI, MRI,
+            AMDGPU::getNamedOperandIdx(Inst.getOpcode(), AMDGPU::OpName::data0),
+            CurrScore);
+      }
+      if (AMDGPU::getNamedOperandIdx(Inst.getOpcode(),
                                      AMDGPU::OpName::data1) != -1) {
         setExpScore(&Inst, TII, TRI, MRI,
                     AMDGPU::getNamedOperandIdx(Inst.getOpcode(),
@@ -1093,7 +1096,8 @@ void SIInsertWaitcnts::updateEventWaitcntAfter(Machine
   // bracket and the destination operand scores.
   // TODO: Use the (TSFlags & SIInstrFlags::LGKM_CNT) property everywhere.
   if (TII->isDS(Inst) && TII->usesLGKM_CNT(Inst)) {
-    if (TII->hasModifiersSet(Inst, AMDGPU::OpName::gds)) {
+    if (TII->isAlwaysGDS(Inst.getOpcode()) ||
+        TII->hasModifiersSet(Inst, AMDGPU::OpName::gds)) {
       ScoreBrackets->updateByEvent(TII, TRI, MRI, GDS_ACCESS, Inst);
       ScoreBrackets->updateByEvent(TII, TRI, MRI, GDS_GPR_LOCK, Inst);
     } else {

Modified: vendor/llvm/dist-release_80/lib/Target/AMDGPU/SIInstrInfo.cpp
==============================================================================
--- vendor/llvm/dist-release_80/lib/Target/AMDGPU/SIInstrInfo.cpp	Sat Jan 19 16:04:26 2019	(r343190)
+++ vendor/llvm/dist-release_80/lib/Target/AMDGPU/SIInstrInfo.cpp	Sat Jan 19 18:44:22 2019	(r343191)
@@ -2390,6 +2390,16 @@ bool SIInstrInfo::isSchedulingBoundary(const MachineIn
          changesVGPRIndexingMode(MI);
 }
 
+bool SIInstrInfo::isAlwaysGDS(uint16_t Opcode) const {
+  return Opcode == AMDGPU::DS_ORDERED_COUNT ||
+         Opcode == AMDGPU::DS_GWS_INIT ||
+         Opcode == AMDGPU::DS_GWS_SEMA_V ||
+         Opcode == AMDGPU::DS_GWS_SEMA_BR ||
+         Opcode == AMDGPU::DS_GWS_SEMA_P ||
+         Opcode == AMDGPU::DS_GWS_SEMA_RELEASE_ALL ||
+         Opcode == AMDGPU::DS_GWS_BARRIER;
+}
+
 bool SIInstrInfo::hasUnwantedEffectsWhenEXECEmpty(const MachineInstr &MI) const {
   unsigned Opcode = MI.getOpcode();
 
@@ -2403,7 +2413,8 @@ bool SIInstrInfo::hasUnwantedEffectsWhenEXECEmpty(cons
   // EXEC = 0, but checking for that case here seems not worth it
   // given the typical code patterns.
   if (Opcode == AMDGPU::S_SENDMSG || Opcode == AMDGPU::S_SENDMSGHALT ||
-      Opcode == AMDGPU::EXP || Opcode == AMDGPU::EXP_DONE)
+      Opcode == AMDGPU::EXP || Opcode == AMDGPU::EXP_DONE ||
+      Opcode == AMDGPU::DS_ORDERED_COUNT)
     return true;
 
   if (MI.isInlineAsm())

Modified: vendor/llvm/dist-release_80/lib/Target/AMDGPU/SIInstrInfo.h
==============================================================================
--- vendor/llvm/dist-release_80/lib/Target/AMDGPU/SIInstrInfo.h	Sat Jan 19 16:04:26 2019	(r343190)
+++ vendor/llvm/dist-release_80/lib/Target/AMDGPU/SIInstrInfo.h	Sat Jan 19 18:44:22 2019	(r343191)
@@ -450,6 +450,8 @@ class SIInstrInfo final : public AMDGPUGenInstrInfo {
     return get(Opcode).TSFlags & SIInstrFlags::DS;
   }
 
+  bool isAlwaysGDS(uint16_t Opcode) const;
+
   static bool isMIMG(const MachineInstr &MI) {
     return MI.getDesc().TSFlags & SIInstrFlags::MIMG;
   }

Modified: vendor/llvm/dist-release_80/lib/Target/AMDGPU/SIInstrInfo.td
==============================================================================
--- vendor/llvm/dist-release_80/lib/Target/AMDGPU/SIInstrInfo.td	Sat Jan 19 16:04:26 2019	(r343190)
+++ vendor/llvm/dist-release_80/lib/Target/AMDGPU/SIInstrInfo.td	Sat Jan 19 18:44:22 2019	(r343191)
@@ -45,6 +45,11 @@ def SIsbuffer_load : SDNode<"AMDGPUISD::SBUFFER_LOAD",
   [SDNPMayLoad, SDNPMemOperand]
 >;
 
+def SIds_ordered_count : SDNode<"AMDGPUISD::DS_ORDERED_COUNT",
+  SDTypeProfile<1, 2, [SDTCisVT<0, i32>, SDTCisVT<1, i32>, SDTCisVT<2, i16>]>,
+  [SDNPMayLoad, SDNPMayStore, SDNPMemOperand, SDNPHasChain, SDNPInGlue]
+>;
+
 def SIatomic_inc : SDNode<"AMDGPUISD::ATOMIC_INC", SDTAtomic2,
   [SDNPMayLoad, SDNPMayStore, SDNPMemOperand, SDNPHasChain]
 >;

Modified: vendor/llvm/dist-release_80/lib/Target/MSP430/MSP430AsmPrinter.cpp
==============================================================================
--- vendor/llvm/dist-release_80/lib/Target/MSP430/MSP430AsmPrinter.cpp	Sat Jan 19 16:04:26 2019	(r343190)
+++ vendor/llvm/dist-release_80/lib/Target/MSP430/MSP430AsmPrinter.cpp	Sat Jan 19 18:44:22 2019	(r343191)
@@ -17,6 +17,7 @@
 #include "MSP430InstrInfo.h"
 #include "MSP430MCInstLower.h"
 #include "MSP430TargetMachine.h"
+#include "llvm/BinaryFormat/ELF.h"
 #include "llvm/CodeGen/AsmPrinter.h"
 #include "llvm/CodeGen/MachineConstantPool.h"
 #include "llvm/CodeGen/MachineFunctionPass.h"
@@ -28,6 +29,7 @@
 #include "llvm/IR/Module.h"
 #include "llvm/MC/MCAsmInfo.h"
 #include "llvm/MC/MCInst.h"
+#include "llvm/MC/MCSectionELF.h"
 #include "llvm/MC/MCStreamer.h"
 #include "llvm/MC/MCSymbol.h"
 #include "llvm/Support/TargetRegistry.h"
@@ -44,6 +46,8 @@ namespace {
     StringRef getPassName() const override { return "MSP430 Assembly Printer"; }
 
+    bool runOnMachineFunction(MachineFunction &MF) override;
+
     void printOperand(const MachineInstr *MI, int OpNum,
                       raw_ostream &O, const char* Modifier = nullptr);
     void printSrcMemOperand(const MachineInstr *MI, int OpNum,
@@ -55,6 +59,8 @@ namespace {
                                unsigned OpNo, unsigned AsmVariant,
                                const char *ExtraCode, raw_ostream &O) override;
     void EmitInstruction(const MachineInstr *MI) override;
+
+    void EmitInterruptVectorSection(MachineFunction &ISR);
   };
 } // end of anonymous namespace

@@ -151,6 +157,32 @@ void MSP430AsmPrinter::EmitInstruction(const MachineIn
   MCInst TmpInst;
   MCInstLowering.Lower(MI, TmpInst);
   EmitToStreamer(*OutStreamer, TmpInst);
+}
+
+void MSP430AsmPrinter::EmitInterruptVectorSection(MachineFunction &ISR) {
+  MCSection *Cur = OutStreamer->getCurrentSectionOnly();
+  const auto *F = &ISR.getFunction();
+  assert(F->hasFnAttribute("interrupt") &&
+         "Functions with MSP430_INTR CC should have 'interrupt' attribute");
+  StringRef IVIdx = F->getFnAttribute("interrupt").getValueAsString();
+  MCSection *IV = OutStreamer->getContext().getELFSection(
+      "__interrupt_vector_" + IVIdx,
+      ELF::SHT_PROGBITS, ELF::SHF_ALLOC | ELF::SHF_EXECINSTR);
+  OutStreamer->SwitchSection(IV);
+
+  const MCSymbol *FunctionSymbol = getSymbol(F);
OutStreamer->EmitSymbolValue(FunctionSymbol, TM.getProgramPointerSize()); + OutStreamer->SwitchSection(Cur); +} + +bool MSP430AsmPrinter::runOnMachineFunction(MachineFunction &MF) { + // Emit separate section for an interrupt vector if ISR + if (MF.getFunction().getCallingConv() == CallingConv::MSP430_INTR) + EmitInterruptVectorSection(MF); + + SetupMachineFunction(MF); + EmitFunctionBody(); + return false; } // Force static initialization. Modified: vendor/llvm/dist-release_80/lib/Target/X86/X86ISelLowering.cpp ============================================================================== --- vendor/llvm/dist-release_80/lib/Target/X86/X86ISelLowering.cpp Sat Jan 19 16:04:26 2019 (r343190) +++ vendor/llvm/dist-release_80/lib/Target/X86/X86ISelLowering.cpp Sat Jan 19 18:44:22 2019 (r343191) @@ -27202,6 +27202,8 @@ const char *X86TargetLowering::getTargetNodeName(unsig case X86ISD::VSHLI: return "X86ISD::VSHLI"; case X86ISD::VSRLI: return "X86ISD::VSRLI"; case X86ISD::VSRAI: return "X86ISD::VSRAI"; + case X86ISD::VSHLV: return "X86ISD::VSHLV"; + case X86ISD::VSRLV: return "X86ISD::VSRLV"; case X86ISD::VSRAV: return "X86ISD::VSRAV"; case X86ISD::VROTLI: return "X86ISD::VROTLI"; case X86ISD::VROTRI: return "X86ISD::VROTRI"; Modified: vendor/llvm/dist-release_80/lib/Target/X86/X86ISelLowering.h ============================================================================== --- vendor/llvm/dist-release_80/lib/Target/X86/X86ISelLowering.h Sat Jan 19 16:04:26 2019 (r343190) +++ vendor/llvm/dist-release_80/lib/Target/X86/X86ISelLowering.h Sat Jan 19 18:44:22 2019 (r343191) @@ -315,10 +315,8 @@ namespace llvm { // Vector shift elements VSHL, VSRL, VSRA, - // Vector variable shift right arithmetic. - // Unlike ISD::SRA, in case shift count greater then element size - // use sign bit to fill destination data element. 
- VSRAV, + // Vector variable shift + VSHLV, VSRLV, VSRAV, // Vector shift elements by immediate VSHLI, VSRLI, VSRAI, Modified: vendor/llvm/dist-release_80/lib/Target/X86/X86InstrAVX512.td ============================================================================== --- vendor/llvm/dist-release_80/lib/Target/X86/X86InstrAVX512.td Sat Jan 19 16:04:26 2019 (r343190) +++ vendor/llvm/dist-release_80/lib/Target/X86/X86InstrAVX512.td Sat Jan 19 18:44:22 2019 (r343191) @@ -6445,52 +6445,53 @@ defm : avx512_var_shift_lowering; // Special handing for handling VPSRAV intrinsics. -multiclass avx512_var_shift_int_lowering p> { +multiclass avx512_var_shift_int_lowering p> { let Predicates = p in { - def : Pat<(_.VT (X86vsrav _.RC:$src1, _.RC:$src2)), + def : Pat<(_.VT (OpNode _.RC:$src1, _.RC:$src2)), (!cast(InstrStr#_.ZSuffix#rr) _.RC:$src1, _.RC:$src2)>; - def : Pat<(_.VT (X86vsrav _.RC:$src1, (_.LdFrag addr:$src2))), + def : Pat<(_.VT (OpNode _.RC:$src1, (_.LdFrag addr:$src2))), (!cast(InstrStr#_.ZSuffix##rm) _.RC:$src1, addr:$src2)>; def : Pat<(_.VT (vselect _.KRCWM:$mask, - (X86vsrav _.RC:$src1, _.RC:$src2), _.RC:$src0)), + (OpNode _.RC:$src1, _.RC:$src2), _.RC:$src0)), (!cast(InstrStr#_.ZSuffix#rrk) _.RC:$src0, _.KRC:$mask, _.RC:$src1, _.RC:$src2)>; def : Pat<(_.VT (vselect _.KRCWM:$mask, - (X86vsrav _.RC:$src1, (_.LdFrag addr:$src2)), + (OpNode _.RC:$src1, (_.LdFrag addr:$src2)), _.RC:$src0)), (!cast(InstrStr#_.ZSuffix##rmk) _.RC:$src0, _.KRC:$mask, _.RC:$src1, addr:$src2)>; def : Pat<(_.VT (vselect _.KRCWM:$mask, - (X86vsrav _.RC:$src1, _.RC:$src2), _.ImmAllZerosV)), + (OpNode _.RC:$src1, _.RC:$src2), _.ImmAllZerosV)), (!cast(InstrStr#_.ZSuffix#rrkz) _.KRC:$mask, _.RC:$src1, _.RC:$src2)>; def : Pat<(_.VT (vselect _.KRCWM:$mask, - (X86vsrav _.RC:$src1, (_.LdFrag addr:$src2)), + (OpNode _.RC:$src1, (_.LdFrag addr:$src2)), _.ImmAllZerosV)), (!cast(InstrStr#_.ZSuffix##rmkz) _.KRC:$mask, _.RC:$src1, addr:$src2)>; } } -multiclass avx512_var_shift_int_lowering_mb p> : - 
avx512_var_shift_int_lowering { +multiclass avx512_var_shift_int_lowering_mb p> : + avx512_var_shift_int_lowering { let Predicates = p in { - def : Pat<(_.VT (X86vsrav _.RC:$src1, + def : Pat<(_.VT (OpNode _.RC:$src1, (X86VBroadcast (_.ScalarLdFrag addr:$src2)))), (!cast(InstrStr#_.ZSuffix##rmb) _.RC:$src1, addr:$src2)>; def : Pat<(_.VT (vselect _.KRCWM:$mask, - (X86vsrav _.RC:$src1, + (OpNode _.RC:$src1, (X86VBroadcast (_.ScalarLdFrag addr:$src2))), _.RC:$src0)), (!cast(InstrStr#_.ZSuffix##rmbk) _.RC:$src0, _.KRC:$mask, _.RC:$src1, addr:$src2)>; def : Pat<(_.VT (vselect _.KRCWM:$mask, - (X86vsrav _.RC:$src1, + (OpNode _.RC:$src1, (X86VBroadcast (_.ScalarLdFrag addr:$src2))), _.ImmAllZerosV)), (!cast(InstrStr#_.ZSuffix##rmbkz) _.KRC:$mask, @@ -6498,15 +6499,47 @@ multiclass avx512_var_shift_int_lowering_mb; -defm : avx512_var_shift_int_lowering<"VPSRAVW", v16i16x_info, [HasVLX, HasBWI]>; -defm : avx512_var_shift_int_lowering<"VPSRAVW", v32i16_info, [HasBWI]>; -defm : avx512_var_shift_int_lowering_mb<"VPSRAVD", v4i32x_info, [HasVLX]>; -defm : avx512_var_shift_int_lowering_mb<"VPSRAVD", v8i32x_info, [HasVLX]>; -defm : avx512_var_shift_int_lowering_mb<"VPSRAVD", v16i32_info, [HasAVX512]>; -defm : avx512_var_shift_int_lowering_mb<"VPSRAVQ", v2i64x_info, [HasVLX]>; -defm : avx512_var_shift_int_lowering_mb<"VPSRAVQ", v4i64x_info, [HasVLX]>; -defm : avx512_var_shift_int_lowering_mb<"VPSRAVQ", v8i64_info, [HasAVX512]>; +multiclass avx512_var_shift_int_lowering_vl { + defm : avx512_var_shift_int_lowering; + defm : avx512_var_shift_int_lowering; + defm : avx512_var_shift_int_lowering; +} + +multiclass avx512_var_shift_int_lowering_mb_vl { + defm : avx512_var_shift_int_lowering_mb; + defm : avx512_var_shift_int_lowering_mb; + defm : avx512_var_shift_int_lowering_mb; +} + +defm : avx512_var_shift_int_lowering_vl<"VPSRAVW", X86vsrav, avx512vl_i16_info, + HasBWI>; +defm : avx512_var_shift_int_lowering_mb_vl<"VPSRAVD", X86vsrav, + avx512vl_i32_info, HasAVX512>; +defm : 
avx512_var_shift_int_lowering_mb_vl<"VPSRAVQ", X86vsrav, + avx512vl_i64_info, HasAVX512>; + +defm : avx512_var_shift_int_lowering_vl<"VPSRLVW", X86vsrlv, avx512vl_i16_info, + HasBWI>; +defm : avx512_var_shift_int_lowering_mb_vl<"VPSRLVD", X86vsrlv, + avx512vl_i32_info, HasAVX512>; +defm : avx512_var_shift_int_lowering_mb_vl<"VPSRLVQ", X86vsrlv, + avx512vl_i64_info, HasAVX512>; + +defm : avx512_var_shift_int_lowering_vl<"VPSLLVW", X86vshlv, avx512vl_i16_info, + HasBWI>; +defm : avx512_var_shift_int_lowering_mb_vl<"VPSLLVD", X86vshlv, + avx512vl_i32_info, HasAVX512>; +defm : avx512_var_shift_int_lowering_mb_vl<"VPSLLVQ", X86vshlv, + avx512vl_i64_info, HasAVX512>; + // Use 512bit VPROL/VPROLI version to implement v2i64/v4i64 + v4i32/v8i32 in case NoVLX. let Predicates = [HasAVX512, NoVLX] in { Modified: vendor/llvm/dist-release_80/lib/Target/X86/X86InstrFragmentsSIMD.td ============================================================================== --- vendor/llvm/dist-release_80/lib/Target/X86/X86InstrFragmentsSIMD.td Sat Jan 19 16:04:26 2019 (r343190) +++ vendor/llvm/dist-release_80/lib/Target/X86/X86InstrFragmentsSIMD.td Sat Jan 19 18:44:22 2019 (r343191) @@ -198,6 +198,8 @@ def X86vsra : SDNode<"X86ISD::VSRA", X86vshiftunifo def X86vshiftvariable : SDTypeProfile<1, 2, [SDTCisVec<0>, SDTCisSameAs<0,1>, SDTCisSameAs<0,2>, SDTCisInt<0>]>; +def X86vshlv : SDNode<"X86ISD::VSHLV", X86vshiftvariable>; +def X86vsrlv : SDNode<"X86ISD::VSRLV", X86vshiftvariable>; def X86vsrav : SDNode<"X86ISD::VSRAV", X86vshiftvariable>; def X86vshli : SDNode<"X86ISD::VSHLI", X86vshiftimm>; Modified: vendor/llvm/dist-release_80/lib/Target/X86/X86InstrSSE.td ============================================================================== --- vendor/llvm/dist-release_80/lib/Target/X86/X86InstrSSE.td Sat Jan 19 16:04:26 2019 (r343190) +++ vendor/llvm/dist-release_80/lib/Target/X86/X86InstrSSE.td Sat Jan 19 18:44:22 2019 (r343191) @@ -8318,7 +8318,7 @@ def : Pat<(v32i8 (X86SubVBroadcast (v16i8 
VR128:$src)) // Variable Bit Shifts // multiclass avx2_var_shift opc, string OpcodeStr, SDNode OpNode, - ValueType vt128, ValueType vt256> { + SDNode IntrinNode, ValueType vt128, ValueType vt256> { def rr : AVX28I opc, string OpcodeSt (vt256 (load addr:$src2)))))]>, VEX_4V, VEX_L, Sched<[SchedWriteVarVecShift.YMM.Folded, SchedWriteVarVecShift.YMM.ReadAfterFold]>; + + def : Pat<(vt128 (IntrinNode VR128:$src1, VR128:$src2)), + (!cast(NAME#"rr") VR128:$src1, VR128:$src2)>; + def : Pat<(vt128 (IntrinNode VR128:$src1, (load addr:$src2))), + (!cast(NAME#"rm") VR128:$src1, addr:$src2)>; + def : Pat<(vt256 (IntrinNode VR256:$src1, VR256:$src2)), + (!cast(NAME#"Yrr") VR256:$src1, VR256:$src2)>; + def : Pat<(vt256 (IntrinNode VR256:$src1, (load addr:$src2))), + (!cast(NAME#"Yrm") VR256:$src1, addr:$src2)>; } let Predicates = [HasAVX2, NoVLX] in { - defm VPSLLVD : avx2_var_shift<0x47, "vpsllvd", shl, v4i32, v8i32>; - defm VPSLLVQ : avx2_var_shift<0x47, "vpsllvq", shl, v2i64, v4i64>, VEX_W; - defm VPSRLVD : avx2_var_shift<0x45, "vpsrlvd", srl, v4i32, v8i32>; - defm VPSRLVQ : avx2_var_shift<0x45, "vpsrlvq", srl, v2i64, v4i64>, VEX_W; - defm VPSRAVD : avx2_var_shift<0x46, "vpsravd", sra, v4i32, v8i32>; - - def : Pat<(v4i32 (X86vsrav VR128:$src1, VR128:$src2)), - (VPSRAVDrr VR128:$src1, VR128:$src2)>; - def : Pat<(v4i32 (X86vsrav VR128:$src1, (load addr:$src2))), - (VPSRAVDrm VR128:$src1, addr:$src2)>; - def : Pat<(v8i32 (X86vsrav VR256:$src1, VR256:$src2)), - (VPSRAVDYrr VR256:$src1, VR256:$src2)>; - def : Pat<(v8i32 (X86vsrav VR256:$src1, (load addr:$src2))), - (VPSRAVDYrm VR256:$src1, addr:$src2)>; + defm VPSLLVD : avx2_var_shift<0x47, "vpsllvd", shl, X86vshlv, v4i32, v8i32>; + defm VPSLLVQ : avx2_var_shift<0x47, "vpsllvq", shl, X86vshlv, v2i64, v4i64>, VEX_W; + defm VPSRLVD : avx2_var_shift<0x45, "vpsrlvd", srl, X86vsrlv, v4i32, v8i32>; + defm VPSRLVQ : avx2_var_shift<0x45, "vpsrlvq", srl, X86vsrlv, v2i64, v4i64>, VEX_W; + defm VPSRAVD : avx2_var_shift<0x46, "vpsravd", sra, 
X86vsrav, v4i32, v8i32>; } //===----------------------------------------------------------------------===// Modified: vendor/llvm/dist-release_80/lib/Target/X86/X86IntrinsicsInfo.h ============================================================================== --- vendor/llvm/dist-release_80/lib/Target/X86/X86IntrinsicsInfo.h Sat Jan 19 16:04:26 2019 (r343190) +++ vendor/llvm/dist-release_80/lib/Target/X86/X86IntrinsicsInfo.h Sat Jan 19 18:44:22 2019 (r343191) @@ -389,10 +389,10 @@ static const IntrinsicData IntrinsicsWithoutChain[] = X86_INTRINSIC_DATA(avx2_pslli_d, VSHIFT, X86ISD::VSHLI, 0), X86_INTRINSIC_DATA(avx2_pslli_q, VSHIFT, X86ISD::VSHLI, 0), X86_INTRINSIC_DATA(avx2_pslli_w, VSHIFT, X86ISD::VSHLI, 0), - X86_INTRINSIC_DATA(avx2_psllv_d, INTR_TYPE_2OP, ISD::SHL, 0), - X86_INTRINSIC_DATA(avx2_psllv_d_256, INTR_TYPE_2OP, ISD::SHL, 0), - X86_INTRINSIC_DATA(avx2_psllv_q, INTR_TYPE_2OP, ISD::SHL, 0), - X86_INTRINSIC_DATA(avx2_psllv_q_256, INTR_TYPE_2OP, ISD::SHL, 0), + X86_INTRINSIC_DATA(avx2_psllv_d, INTR_TYPE_2OP, X86ISD::VSHLV, 0), + X86_INTRINSIC_DATA(avx2_psllv_d_256, INTR_TYPE_2OP, X86ISD::VSHLV, 0), + X86_INTRINSIC_DATA(avx2_psllv_q, INTR_TYPE_2OP, X86ISD::VSHLV, 0), + X86_INTRINSIC_DATA(avx2_psllv_q_256, INTR_TYPE_2OP, X86ISD::VSHLV, 0), X86_INTRINSIC_DATA(avx2_psra_d, INTR_TYPE_2OP, X86ISD::VSRA, 0), X86_INTRINSIC_DATA(avx2_psra_w, INTR_TYPE_2OP, X86ISD::VSRA, 0), X86_INTRINSIC_DATA(avx2_psrai_d, VSHIFT, X86ISD::VSRAI, 0), @@ -405,10 +405,10 @@ static const IntrinsicData IntrinsicsWithoutChain[] = X86_INTRINSIC_DATA(avx2_psrli_d, VSHIFT, X86ISD::VSRLI, 0), X86_INTRINSIC_DATA(avx2_psrli_q, VSHIFT, X86ISD::VSRLI, 0), X86_INTRINSIC_DATA(avx2_psrli_w, VSHIFT, X86ISD::VSRLI, 0), - X86_INTRINSIC_DATA(avx2_psrlv_d, INTR_TYPE_2OP, ISD::SRL, 0), - X86_INTRINSIC_DATA(avx2_psrlv_d_256, INTR_TYPE_2OP, ISD::SRL, 0), - X86_INTRINSIC_DATA(avx2_psrlv_q, INTR_TYPE_2OP, ISD::SRL, 0), - X86_INTRINSIC_DATA(avx2_psrlv_q_256, INTR_TYPE_2OP, ISD::SRL, 0), + 
X86_INTRINSIC_DATA(avx2_psrlv_d, INTR_TYPE_2OP, X86ISD::VSRLV, 0), + X86_INTRINSIC_DATA(avx2_psrlv_d_256, INTR_TYPE_2OP, X86ISD::VSRLV, 0), + X86_INTRINSIC_DATA(avx2_psrlv_q, INTR_TYPE_2OP, X86ISD::VSRLV, 0), + X86_INTRINSIC_DATA(avx2_psrlv_q_256, INTR_TYPE_2OP, X86ISD::VSRLV, 0), X86_INTRINSIC_DATA(avx512_add_pd_512, INTR_TYPE_2OP, ISD::FADD, X86ISD::FADD_RND), X86_INTRINSIC_DATA(avx512_add_ps_512, INTR_TYPE_2OP, ISD::FADD, X86ISD::FADD_RND), X86_INTRINSIC_DATA(avx512_cmp_pd_128, CMP_MASK_CC, X86ISD::CMPM, 0), @@ -943,11 +943,11 @@ static const IntrinsicData IntrinsicsWithoutChain[] = X86_INTRINSIC_DATA(avx512_pslli_d_512, VSHIFT, X86ISD::VSHLI, 0), X86_INTRINSIC_DATA(avx512_pslli_q_512, VSHIFT, X86ISD::VSHLI, 0), X86_INTRINSIC_DATA(avx512_pslli_w_512, VSHIFT, X86ISD::VSHLI, 0), - X86_INTRINSIC_DATA(avx512_psllv_d_512, INTR_TYPE_2OP, ISD::SHL, 0), - X86_INTRINSIC_DATA(avx512_psllv_q_512, INTR_TYPE_2OP, ISD::SHL, 0), - X86_INTRINSIC_DATA(avx512_psllv_w_128, INTR_TYPE_2OP, ISD::SHL, 0), - X86_INTRINSIC_DATA(avx512_psllv_w_256, INTR_TYPE_2OP, ISD::SHL, 0), - X86_INTRINSIC_DATA(avx512_psllv_w_512, INTR_TYPE_2OP, ISD::SHL, 0), + X86_INTRINSIC_DATA(avx512_psllv_d_512, INTR_TYPE_2OP, X86ISD::VSHLV, 0), + X86_INTRINSIC_DATA(avx512_psllv_q_512, INTR_TYPE_2OP, X86ISD::VSHLV, 0), + X86_INTRINSIC_DATA(avx512_psllv_w_128, INTR_TYPE_2OP, X86ISD::VSHLV, 0), + X86_INTRINSIC_DATA(avx512_psllv_w_256, INTR_TYPE_2OP, X86ISD::VSHLV, 0), + X86_INTRINSIC_DATA(avx512_psllv_w_512, INTR_TYPE_2OP, X86ISD::VSHLV, 0), X86_INTRINSIC_DATA(avx512_psra_d_512, INTR_TYPE_2OP, X86ISD::VSRA, 0), X86_INTRINSIC_DATA(avx512_psra_q_128, INTR_TYPE_2OP, X86ISD::VSRA, 0), X86_INTRINSIC_DATA(avx512_psra_q_256, INTR_TYPE_2OP, X86ISD::VSRA, 0), @@ -971,11 +971,11 @@ static const IntrinsicData IntrinsicsWithoutChain[] = X86_INTRINSIC_DATA(avx512_psrli_d_512, VSHIFT, X86ISD::VSRLI, 0), X86_INTRINSIC_DATA(avx512_psrli_q_512, VSHIFT, X86ISD::VSRLI, 0), X86_INTRINSIC_DATA(avx512_psrli_w_512, VSHIFT, X86ISD::VSRLI, 
0), - X86_INTRINSIC_DATA(avx512_psrlv_d_512, INTR_TYPE_2OP, ISD::SRL, 0), - X86_INTRINSIC_DATA(avx512_psrlv_q_512, INTR_TYPE_2OP, ISD::SRL, 0), - X86_INTRINSIC_DATA(avx512_psrlv_w_128, INTR_TYPE_2OP, ISD::SRL, 0), - X86_INTRINSIC_DATA(avx512_psrlv_w_256, INTR_TYPE_2OP, ISD::SRL, 0), - X86_INTRINSIC_DATA(avx512_psrlv_w_512, INTR_TYPE_2OP, ISD::SRL, 0), + X86_INTRINSIC_DATA(avx512_psrlv_d_512, INTR_TYPE_2OP, X86ISD::VSRLV, 0), + X86_INTRINSIC_DATA(avx512_psrlv_q_512, INTR_TYPE_2OP, X86ISD::VSRLV, 0), + X86_INTRINSIC_DATA(avx512_psrlv_w_128, INTR_TYPE_2OP, X86ISD::VSRLV, 0), + X86_INTRINSIC_DATA(avx512_psrlv_w_256, INTR_TYPE_2OP, X86ISD::VSRLV, 0), + X86_INTRINSIC_DATA(avx512_psrlv_w_512, INTR_TYPE_2OP, X86ISD::VSRLV, 0), X86_INTRINSIC_DATA(avx512_pternlog_d_128, INTR_TYPE_4OP, X86ISD::VPTERNLOG, 0), X86_INTRINSIC_DATA(avx512_pternlog_d_256, INTR_TYPE_4OP, X86ISD::VPTERNLOG, 0), X86_INTRINSIC_DATA(avx512_pternlog_d_512, INTR_TYPE_4OP, X86ISD::VPTERNLOG, 0), Modified: vendor/llvm/dist-release_80/lib/Transforms/InstCombine/InstructionCombining.cpp ============================================================================== --- vendor/llvm/dist-release_80/lib/Transforms/InstCombine/InstructionCombining.cpp Sat Jan 19 16:04:26 2019 (r343190) +++ vendor/llvm/dist-release_80/lib/Transforms/InstCombine/InstructionCombining.cpp Sat Jan 19 18:44:22 2019 (r343191) @@ -3065,9 +3065,11 @@ static bool TryToSinkInstruction(Instruction *I, Basic I->isTerminator()) return false; - // Do not sink alloca instructions out of the entry block. - if (isa(I) && I->getParent() == - &DestBlock->getParent()->getEntryBlock()) + // Do not sink static or dynamic alloca instructions. Static allocas must + // remain in the entry block, and dynamic allocas must not be sunk in between + // a stacksave / stackrestore pair, which would incorrectly shorten its + // lifetime. + if (isa(I)) return false; // Do not sink into catchswitch blocks. 
Modified: vendor/llvm/dist-release_80/lib/Transforms/Scalar/SROA.cpp ============================================================================== --- vendor/llvm/dist-release_80/lib/Transforms/Scalar/SROA.cpp Sat Jan 19 16:04:26 2019 (r343190) +++ vendor/llvm/dist-release_80/lib/Transforms/Scalar/SROA.cpp Sat Jan 19 18:44:22 2019 (r343191) @@ -3031,7 +3031,10 @@ class llvm::sroa::AllocaSliceRewriter (private) ConstantInt *Size = ConstantInt::get(cast(II.getArgOperand(0)->getType()), NewEndOffset - NewBeginOffset); - Value *Ptr = getNewAllocaSlicePtr(IRB, OldPtr->getType()); + // Lifetime intrinsics always expect an i8* so directly get such a pointer + // for the new alloca slice. + Type *PointerTy = IRB.getInt8PtrTy(OldPtr->getType()->getPointerAddressSpace()); + Value *Ptr = getNewAllocaSlicePtr(IRB, PointerTy); Value *New; if (II.getIntrinsicID() == Intrinsic::lifetime_start) New = IRB.CreateLifetimeStart(Ptr, Size); Modified: vendor/llvm/dist-release_80/lib/Transforms/Vectorize/SLPVectorizer.cpp ============================================================================== --- vendor/llvm/dist-release_80/lib/Transforms/Vectorize/SLPVectorizer.cpp Sat Jan 19 16:04:26 2019 (r343190) +++ vendor/llvm/dist-release_80/lib/Transforms/Vectorize/SLPVectorizer.cpp Sat Jan 19 18:44:22 2019 (r343191) @@ -1468,8 +1468,9 @@ void BoUpSLP::buildTree_rec(ArrayRef VL, unsi // If any of the scalars is marked as a value that needs to stay scalar, then // we need to gather the scalars. + // The reduction nodes (stored in UserIgnoreList) also should stay scalar. 
for (unsigned i = 0, e = VL.size(); i != e; ++i) { - if (MustGather.count(VL[i])) { + if (MustGather.count(VL[i]) || is_contained(UserIgnoreList, VL[i])) { LLVM_DEBUG(dbgs() << "SLP: Gathering due to gathered scalar.\n"); newTreeEntry(VL, false, UserTreeIdx); return; Added: vendor/llvm/dist-release_80/test/CodeGen/AArch64/seh-finally.ll ============================================================================== --- /dev/null 00:00:00 1970 (empty, because file is newly added) +++ vendor/llvm/dist-release_80/test/CodeGen/AArch64/seh-finally.ll Sat Jan 19 18:44:22 2019 (r343191) @@ -0,0 +1,67 @@ +; RUN: llc -mtriple arm64-windows -o - %s | FileCheck %s + +; Function Attrs: noinline optnone uwtable +define dso_local i32 @foo() { +entry: +; CHECK-LABEL: foo +; CHECK: orr w8, wzr, #0x1 +; CHECK: mov w0, wzr +; CHECK: mov x1, x29 +; CHECK: .set .Lfoo$frame_escape_0, -4 +; CHECK: stur w8, [x29, #-4] *** DIFF OUTPUT TRUNCATED AT 1000 LINES ***