Date:      Tue, 11 Sep 2018 10:09:45 +0000 (UTC)
From:      Dimitry Andric <dim@FreeBSD.org>
To:        src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-vendor@freebsd.org
Subject:   svn commit: r338575 - in vendor/llvm/dist-release_70: docs lib/MC/MCParser lib/Support/Unix lib/Target/AMDGPU lib/Target/ARM lib/Target/ARM/MCTargetDesc lib/Target/BPF/MCTargetDesc lib/Target/X86/A...
Message-ID:  <201809111009.w8BA9jUQ027257@repo.freebsd.org>

Author: dim
Date: Tue Sep 11 10:09:45 2018
New Revision: 338575
URL: https://svnweb.freebsd.org/changeset/base/338575

Log:
  Vendor import of llvm release_70 branch r341916:
  https://llvm.org/svn/llvm-project/llvm/branches/release_70@341916

Added:
  vendor/llvm/dist-release_70/test/CodeGen/ARM/ldrex-frame-size.ll
  vendor/llvm/dist-release_70/test/MC/AsmParser/directive_file-3.s   (contents, props changed)
  vendor/llvm/dist-release_70/test/MC/X86/pr38826.s   (contents, props changed)
  vendor/llvm/dist-release_70/test/Transforms/Inline/infinite-loop-two-predecessors.ll
  vendor/llvm/dist-release_70/test/Transforms/LICM/loopsink-pr38462.ll
Modified:
  vendor/llvm/dist-release_70/docs/ReleaseNotes.rst
  vendor/llvm/dist-release_70/docs/index.rst
  vendor/llvm/dist-release_70/lib/MC/MCParser/AsmParser.cpp
  vendor/llvm/dist-release_70/lib/Support/Unix/Path.inc
  vendor/llvm/dist-release_70/lib/Support/Unix/Process.inc
  vendor/llvm/dist-release_70/lib/Target/AMDGPU/AMDGPU.h
  vendor/llvm/dist-release_70/lib/Target/AMDGPU/AMDGPUAliasAnalysis.cpp
  vendor/llvm/dist-release_70/lib/Target/AMDGPU/AMDGPUAliasAnalysis.h
  vendor/llvm/dist-release_70/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp
  vendor/llvm/dist-release_70/lib/Target/ARM/ARMFrameLowering.cpp
  vendor/llvm/dist-release_70/lib/Target/ARM/ARMInstrFormats.td
  vendor/llvm/dist-release_70/lib/Target/ARM/ARMInstrThumb2.td
  vendor/llvm/dist-release_70/lib/Target/ARM/MCTargetDesc/ARMBaseInfo.h
  vendor/llvm/dist-release_70/lib/Target/ARM/Thumb2InstrInfo.cpp
  vendor/llvm/dist-release_70/lib/Target/BPF/MCTargetDesc/BPFAsmBackend.cpp
  vendor/llvm/dist-release_70/lib/Target/X86/AsmParser/X86AsmParser.cpp
  vendor/llvm/dist-release_70/lib/Transforms/Scalar/LoopSink.cpp
  vendor/llvm/dist-release_70/lib/Transforms/Scalar/SROA.cpp
  vendor/llvm/dist-release_70/lib/Transforms/Utils/CloneFunction.cpp
  vendor/llvm/dist-release_70/lib/Transforms/Vectorize/LoopVectorize.cpp
  vendor/llvm/dist-release_70/test/CodeGen/AMDGPU/amdgpu-alias-analysis.ll
  vendor/llvm/dist-release_70/test/CodeGen/AMDGPU/constant-address-space-32bit.ll
  vendor/llvm/dist-release_70/test/CodeGen/ARM/ldstrex.ll
  vendor/llvm/dist-release_70/test/CodeGen/X86/eip-addressing-i386.ll
  vendor/llvm/dist-release_70/test/MC/X86/x86_errors.s
  vendor/llvm/dist-release_70/test/Transforms/LoopVectorize/X86/uniform-phi.ll
  vendor/llvm/dist-release_70/test/Transforms/SROA/phi-and-select.ll
  vendor/llvm/dist-release_70/utils/lit/lit/TestRunner.py

Modified: vendor/llvm/dist-release_70/docs/ReleaseNotes.rst
==============================================================================
--- vendor/llvm/dist-release_70/docs/ReleaseNotes.rst	Mon Sep 10 22:48:26 2018	(r338574)
+++ vendor/llvm/dist-release_70/docs/ReleaseNotes.rst	Tue Sep 11 10:09:45 2018	(r338575)
@@ -5,12 +5,7 @@ LLVM 7.0.0 Release Notes
 .. contents::
     :local:
 
-.. warning::
-   These are in-progress notes for the upcoming LLVM 7 release.
-   Release notes for previous releases can be found on
-   `the Download Page <http://releases.llvm.org/download.html>`_.
 
-
 Introduction
 ============
 
@@ -18,38 +13,27 @@ This document contains the release notes for the LLVM 
 release 7.0.0.  Here we describe the status of LLVM, including major improvements
 from the previous release, improvements in various subprojects of LLVM, and
 some of the current users of the code.  All LLVM releases may be downloaded
-from the `LLVM releases web site <http://llvm.org/releases/>`_.
+from the `LLVM releases web site <https://llvm.org/releases/>`_.
 
 For more information about LLVM, including information about the latest
-release, please check out the `main LLVM web site <http://llvm.org/>`_.  If you
+release, please check out the `main LLVM web site <https://llvm.org/>`_.  If you
 have questions or comments, the `LLVM Developer's Mailing List
-<http://lists.llvm.org/mailman/listinfo/llvm-dev>`_ is a good place to send
+<https://lists.llvm.org/mailman/listinfo/llvm-dev>`_ is a good place to send
 them.
 
-Note that if you are reading this file from a Subversion checkout or the main
-LLVM web page, this document applies to the *next* release, not the current
-one.  To see the release notes for a specific release, please see the `releases
-page <http://llvm.org/releases/>`_.
-
 Non-comprehensive list of changes in this release
 =================================================
-.. NOTE
-   For small 1-3 sentence descriptions, just add an entry at the end of
-   this list. If your description won't fit comfortably in one bullet
-   point (e.g. maybe you would like to give an example of the
-   functionality, or simply have a lot to talk about), see the `NOTE` below
-   for adding a new subsection.
 
 * The Windows installer no longer includes a Visual Studio integration.
   Instead, a new
-  `LLVM Compiler Toolchain Visual Studio extension <https://marketplace.visualstudio.com/items?itemName=LLVMExtensions.llvm-toolchain>`
-  is available on the Visual Studio Marketplace. The new integration includes
-  support for Visual Studio 2017.
+  `LLVM Compiler Toolchain Visual Studio extension <https://marketplace.visualstudio.com/items?itemName=LLVMExtensions.llvm-toolchain>`_
+  is available on the Visual Studio Marketplace. The new integration
+  supports Visual Studio 2017.
 
 * Libraries have been renamed from 7.0 to 7. This change also impacts
   downstream libraries like lldb.
 
-* The LoopInstSimplify pass (-loop-instsimplify) has been removed.
+* The LoopInstSimplify pass (``-loop-instsimplify``) has been removed.
 
 * Symbols starting with ``?`` are no longer mangled by LLVM when using the
   Windows ``x`` or ``w`` IR mangling schemes.
@@ -64,16 +48,13 @@ Non-comprehensive list of changes in this release
   information available in LLVM to statically predict the performance of
   machine code for a specific CPU.
 
-* The optimization flag to merge constants (-fmerge-all-constants) is no longer
-  applied by default.
-
 * Optimization of floating-point casts is improved. This may cause surprising
-  results for code that is relying on the undefined behavior of overflowing 
+  results for code that is relying on the undefined behavior of overflowing
   casts. The optimization can be disabled by specifying a function attribute:
-  "strict-float-cast-overflow"="false". This attribute may be created by the
+  ``"strict-float-cast-overflow"="false"``. This attribute may be created by the
   clang option ``-fno-strict-float-cast-overflow``.
-  Code sanitizers can be used to detect affected patterns. The option for
-  detecting this problem alone is "-fsanitize=float-cast-overflow":
+  Code sanitizers can be used to detect affected patterns. The clang option for
+  detecting this problem alone is ``-fsanitize=float-cast-overflow``:
 
 .. code-block:: c
 
@@ -86,7 +67,7 @@ Non-comprehensive list of changes in this release
 
 .. code-block:: bash
 
-    clang -O1 ftrunc.c -fsanitize=float-cast-overflow ; ./a.out 
+    clang -O1 ftrunc.c -fsanitize=float-cast-overflow ; ./a.out
     ftrunc.c:5:15: runtime error: 4.29497e+09 is outside the range of representable values of type 'int'
     junk in the ftrunc: 0.000000
 
@@ -104,19 +85,20 @@ Non-comprehensive list of changes in this release
     git grep -l 'DEBUG' | xargs perl -pi -e 's/\bDEBUG\s?\(/LLVM_DEBUG(/g'
     git diff -U0 master | ../clang/tools/clang-format/clang-format-diff.py -i -p1 -style LLVM
 
-* Early support for UBsan, X-Ray instrumentation and libFuzzer (x86 and x86_64) for OpenBSD. Support for MSan
-  (x86_64), X-Ray instrumentation and libFuzzer (x86 and x86_64) for FreeBSD.
+* Early support for UBsan, X-Ray instrumentation and libFuzzer (x86 and x86_64)
+  for OpenBSD. Support for MSan (x86_64), X-Ray instrumentation and libFuzzer
+  (x86 and x86_64) for FreeBSD.
 
 * ``SmallVector<T, 0>`` shrank from ``sizeof(void*) * 4 + sizeof(T)`` to
   ``sizeof(void*) + sizeof(unsigned) * 2``, smaller than ``std::vector<T>`` on
   64-bit platforms.  The maximum capacity is now restricted to ``UINT32_MAX``.
   Since SmallVector doesn't have the exception-safety pessimizations some
-  implementations saddle std::vector with and is better at using ``realloc``,
-  it's now a better choice even on the heap (although when TinyPtrVector works,
-  it's even smaller).
+  implementations saddle ``std::vector`` with and is better at using ``realloc``,
+  it's now a better choice even on the heap (although when ``TinyPtrVector`` works,
+  that's even smaller).
 
 * Preliminary/experimental support for DWARF v5 debugging information,
-  including the new .debug_names accelerator table. DWARF emitted at ``-O0``
+  including the new ``.debug_names`` accelerator table. DWARF emitted at ``-O0``
   should be fully DWARF v5 compliant. Type units and split DWARF are known
   not to be compliant, and higher optimization levels will still emit some
   information in v4 format.
@@ -129,30 +111,24 @@ Non-comprehensive list of changes in this release
   but it can now handle leftover C declarations in preprocessor output, if
   given output from a preprocessor run externally.)
 
-* CodeView debug info can now be emitted MinGW configurations, if requested.
+* CodeView debug info can now be emitted for MinGW configurations, if requested.
 
-* Note..
+* The :program:`opt` tool now supports the ``-load-pass-plugin`` option for
+  loading pass plugins for the new PassManager.
 
-.. NOTE
-   If you would like to document a larger change, then you can add a
-   subsection about it right here. You can copy the following boilerplate
-   and un-indent it (the indentation causes it to be inside this comment).
+* Support for profiling JITed code with perf.
 
-   Special New Feature
-   -------------------
 
-   Makes programs 10x faster by doing Special New Thing.
-
 Changes to the LLVM IR
 ----------------------
 
-* The signatures for the builtins @llvm.memcpy, @llvm.memmove, and @llvm.memset
-  have changed. Alignment is no longer an argument, and are instead conveyed as
-  parameter attributes.
+* The signatures for the builtins ``@llvm.memcpy``, ``@llvm.memmove``, and
+  ``@llvm.memset`` have changed. Alignment is no longer an argument, and are
+  instead conveyed as parameter attributes.
 
-* invariant.group.barrier has been renamed to launder.invariant.group.
+* ``invariant.group.barrier`` has been renamed to ``launder.invariant.group``.
 
-* invariant.group metadata can now refer only empty metadata nodes.
+* ``invariant.group`` metadata can now refer only to empty metadata nodes.
 
 Changes to the AArch64 Target
 -----------------------------
@@ -160,10 +136,13 @@ Changes to the AArch64 Target
 * The ``.inst`` assembler directive is now usable on both COFF and Mach-O
   targets, in addition to ELF.
 
-* Support for most remaining COFF relocations have been added.
+* Support for most remaining COFF relocations has been added.
 
 * Support for TLS on Windows has been added.
 
+* Assembler and disassembler support for the ARM Scalable Vector Extension has
+  been added.
+
 Changes to the ARM Target
 -------------------------
 
@@ -187,14 +166,75 @@ Changes to the Hexagon Target
 Changes to the MIPS Target
 --------------------------
 
- During this release ...
+During this release the MIPS target has:
 
+* Added support for Virtualization, Global INValidate ASE,
+  and CRC ASE instructions.
 
+* Introduced definitions of ``[d]rem``, ``[d]remu``,
+  and microMIPSR6 ``ll/sc`` instructions.
+
+* Shrink-wrapping is now supported and enabled by default (except for ``-O0``).
+
+* Extended size reduction pass by the LWP and SWP instructions.
+
+* Gained initial support of GlobalISel instruction selection framework.
+
+* Updated the P5600 scheduler model not to use instruction itineraries.
+
+* Added disassembly support for comparison and fused (negative) multiply
+  ``add/sub`` instructions.
+
+* Improved the selection of multiple instructions.
+
+* Load/store ``lb``, ``sb``, ``ld``, ``sd``, ``lld``, ... instructions
+  now support 32/64-bit offsets.
+
+* Added support for ``y``, ``M``, and ``L`` inline assembler operand codes.
+
+* Extended list of relocations supported by the ``.reloc`` directive
+
+* Fixed using a wrong register class for creating an emergency
+  spill slot for mips3 / n64 ABI.
+
+* MIPS relocation types were generated for microMIPS code.
+
+* Corrected definitions of multiple instructions (``lwp``, ``swp``, ``ctc2``,
+  ``cfc2``, ``sync``, ``synci``, ``cvt.d.w``, ...).
+
+* Fixed atomic operations at ``-O0`` level.
+
+* Fixed local dynamic TLS with Sym64
+
 Changes to the PowerPC Target
 -----------------------------
 
- During this release ...
+During this release the PowerPC target has:
 
+* Replaced the list scheduler for post register allocation with the machine scheduler.
+
+* Added support for ``coldcc`` calling convention.
+
+* Added support for ``symbol@high`` and ``symbol@higha`` symbol modifiers.
+
+* Added support for quad-precision floating point type (``__float128``) under the llvm option ``-enable-ppc-quad-precision``.
+
+* Added dump function to ``LatencyPriorityQueue``.
+
+* Completed the Power9 scheduler model.
+
+* Optimized TLS code generation.
+
+* Improved MachineLICM for hoisting constant stores.
+
+* Improved code generation to reduce register use by using more register + immediate instructions.
+
+* Improved code generation to better exploit rotate-and-mask instructions.
+
+* Fixed the bug in dynamic loader for JIT which crashed NNVM.
+
+* Numerous bug fixes and code cleanups.
+
 Changes to the SystemZ Target
 -----------------------------
 
@@ -226,57 +266,61 @@ Changes to the X86 Target
   environments - in MSVC environments, long doubles are the same size as
   normal doubles.)
 
-Changes to the AMDGPU Target
------------------------------
-
- During this release ...
-
-Changes to the AVR Target
------------------------------
-
- During this release ...
-
 Changes to the OCaml bindings
 -----------------------------
 
-* Remove ``add_bb_vectorize``.
+* Removed ``add_bb_vectorize``.
 
 
 Changes to the C API
 --------------------
 
-* Remove ``LLVMAddBBVectorizePass``. The implementation was removed and the C
+* Removed ``LLVMAddBBVectorizePass``. The implementation was removed and the C
   interface was made a deprecated no-op in LLVM 5. Use
   ``LLVMAddSLPVectorizePass`` instead to get the supported SLP vectorizer.
 
+* Expanded the OrcJIT APIs so they can register event listeners like debuggers
+  and profilers.
+
 Changes to the DAG infrastructure
 ---------------------------------
-* ADDC/ADDE/SUBC/SUBE are now deprecated and will default to expand. Backends
-  that wish to continue to use these opcodes should explicitely request so
+* ``ADDC``/``ADDE``/``SUBC``/``SUBE`` are now deprecated and will default to expand. Backends
+  that wish to continue to use these opcodes should explicitely request to do so
   using ``setOperationAction`` in their ``TargetLowering``. New backends
-  should use UADDO/ADDCARRY/USUBO/SUBCARRY instead of the deprecated opcodes.
+  should use ``UADDO``/``ADDCARRY``/``USUBO``/``SUBCARRY`` instead of the deprecated opcodes.
 
-* The SETCCE opcode has now been removed in favor of SETCCCARRY.
+* The ``SETCCE`` opcode has now been removed in favor of ``SETCCCARRY``.
 
-* TableGen now supports multi-alternative pattern fragments via the PatFrags
-  class.  PatFrag is now derived from PatFrags, which may require minor
-  changes to backends that directly access PatFrag members.
+* TableGen now supports multi-alternative pattern fragments via the ``PatFrags``
+  class.  ``PatFrag`` is now derived from ``PatFrags``, which may require minor
+  changes to backends that directly access ``PatFrag`` members.
 
+
 External Open Source Projects Using LLVM 7
 ==========================================
 
-* A project...
+Zig Programming Language
+------------------------
 
+`Zig <https://ziglang.org>`_  is an open-source programming language designed
+for robustness, optimality, and clarity. Zig is an alternative to C, providing
+high level features such as generics, compile time function execution, partial
+evaluation, and LLVM-based coroutines, while exposing low level LLVM IR
+features such as aliases and intrinsics. Zig uses Clang to provide automatic
+import of .h symbols - even inline functions and macros. Zig uses LLD combined
+with lazily building compiler-rt to provide out-of-the-box cross-compiling for
+all supported targets.
 
+
 Additional Information
 ======================
 
 A wide variety of additional information is available on the `LLVM web page
-<http://llvm.org/>`_, in particular in the `documentation
-<http://llvm.org/docs/>`_ section.  The web page also contains versions of the
+<https://llvm.org/>`_, in particular in the `documentation
+<https://llvm.org/docs/>`_ section.  The web page also contains versions of the
 API documentation which is up-to-date with the Subversion version of the source
 code.  You can access versions of these documents specific to this release by
 going into the ``llvm/docs/`` directory in the LLVM tree.
 
 If you have any questions or comments about LLVM, please feel free to contact
-us via the `mailing lists <http://llvm.org/docs/#maillist>`_.
+us via the `mailing lists <https://llvm.org/docs/#mailing-lists>`_.

Modified: vendor/llvm/dist-release_70/docs/index.rst
==============================================================================
--- vendor/llvm/dist-release_70/docs/index.rst	Mon Sep 10 22:48:26 2018	(r338574)
+++ vendor/llvm/dist-release_70/docs/index.rst	Tue Sep 11 10:09:45 2018	(r338575)
@@ -1,11 +1,6 @@
 Overview
 ========
 
-.. warning::
-
-   If you are using a released version of LLVM, see `the download page
-   <http://llvm.org/releases/>`_ to find your documentation.
-
 The LLVM compiler infrastructure supports a wide range of projects, from
 industrial strength compilers to specialized JIT applications to small
 research projects.

Modified: vendor/llvm/dist-release_70/lib/MC/MCParser/AsmParser.cpp
==============================================================================
--- vendor/llvm/dist-release_70/lib/MC/MCParser/AsmParser.cpp	Mon Sep 10 22:48:26 2018	(r338574)
+++ vendor/llvm/dist-release_70/lib/MC/MCParser/AsmParser.cpp	Tue Sep 11 10:09:45 2018	(r338575)
@@ -3348,17 +3348,17 @@ bool AsmParser::parseDirectiveFile(SMLoc DirectiveLoc)
     }
   }
 
-  // In case there is a -g option as well as debug info from directive .file,
-  // we turn off the -g option, directly use the existing debug info instead.
-  // Also reset any implicit ".file 0" for the assembler source.
-  if (Ctx.getGenDwarfForAssembly()) {
-    Ctx.getMCDwarfLineTable(0).resetRootFile();
-    Ctx.setGenDwarfForAssembly(false);
-  }
-
   if (FileNumber == -1)
     getStreamer().EmitFileDirective(Filename);
   else {
+    // In case there is a -g option as well as debug info from directive .file,
+    // we turn off the -g option, directly use the existing debug info instead.
+    // Also reset any implicit ".file 0" for the assembler source.
+    if (Ctx.getGenDwarfForAssembly()) {
+      Ctx.getMCDwarfLineTable(0).resetRootFile();
+      Ctx.setGenDwarfForAssembly(false);
+    }
+
     MD5::MD5Result *CKMem = nullptr;
     if (HasMD5) {
       CKMem = (MD5::MD5Result *)Ctx.allocate(sizeof(MD5::MD5Result), 1);

Modified: vendor/llvm/dist-release_70/lib/Support/Unix/Path.inc
==============================================================================
--- vendor/llvm/dist-release_70/lib/Support/Unix/Path.inc	Mon Sep 10 22:48:26 2018	(r338574)
+++ vendor/llvm/dist-release_70/lib/Support/Unix/Path.inc	Tue Sep 11 10:09:45 2018	(r338575)
@@ -769,8 +769,10 @@ std::error_code openFile(const Twine &Name, int &Resul
 
   SmallString<128> Storage;
   StringRef P = Name.toNullTerminatedStringRef(Storage);
-  if ((ResultFD = sys::RetryAfterSignal(-1, ::open, P.begin(), OpenFlags, Mode)) <
-      0)
+  // Call ::open in a lambda to avoid overload resolution in RetryAfterSignal
+  // when open is overloaded, such as in Bionic.
+  auto Open = [&]() { return ::open(P.begin(), OpenFlags, Mode); };
+  if ((ResultFD = sys::RetryAfterSignal(-1, Open)) < 0)
     return std::error_code(errno, std::generic_category());
 #ifndef O_CLOEXEC
   if (!(Flags & OF_ChildInherit)) {

Modified: vendor/llvm/dist-release_70/lib/Support/Unix/Process.inc
==============================================================================
--- vendor/llvm/dist-release_70/lib/Support/Unix/Process.inc	Mon Sep 10 22:48:26 2018	(r338574)
+++ vendor/llvm/dist-release_70/lib/Support/Unix/Process.inc	Tue Sep 11 10:09:45 2018	(r338575)
@@ -211,7 +211,10 @@ std::error_code Process::FixupStandardFileDescriptors(
     assert(errno == EBADF && "expected errno to have EBADF at this point!");
 
     if (NullFD < 0) {
-      if ((NullFD = RetryAfterSignal(-1, ::open, "/dev/null", O_RDWR)) < 0)
+      // Call ::open in a lambda to avoid overload resolution in
+      // RetryAfterSignal when open is overloaded, such as in Bionic.
+      auto Open = [&]() { return ::open("/dev/null", O_RDWR); };
+      if ((NullFD = RetryAfterSignal(-1, Open)) < 0)
         return std::error_code(errno, std::generic_category());
     }
 

Modified: vendor/llvm/dist-release_70/lib/Target/AMDGPU/AMDGPU.h
==============================================================================
--- vendor/llvm/dist-release_70/lib/Target/AMDGPU/AMDGPU.h	Mon Sep 10 22:48:26 2018	(r338574)
+++ vendor/llvm/dist-release_70/lib/Target/AMDGPU/AMDGPU.h	Tue Sep 11 10:09:45 2018	(r338575)
@@ -229,7 +229,7 @@ struct AMDGPUAS {
 
   enum : unsigned {
     // The maximum value for flat, generic, local, private, constant and region.
-    MAX_COMMON_ADDRESS = 5,
+    MAX_AMDGPU_ADDRESS = 6,
 
     GLOBAL_ADDRESS = 1,   ///< Address space for global memory (RAT0, VTX0).
     CONSTANT_ADDRESS = 4, ///< Address space for constant memory (VTX2)

Modified: vendor/llvm/dist-release_70/lib/Target/AMDGPU/AMDGPUAliasAnalysis.cpp
==============================================================================
--- vendor/llvm/dist-release_70/lib/Target/AMDGPU/AMDGPUAliasAnalysis.cpp	Mon Sep 10 22:48:26 2018	(r338574)
+++ vendor/llvm/dist-release_70/lib/Target/AMDGPU/AMDGPUAliasAnalysis.cpp	Tue Sep 11 10:09:45 2018	(r338575)
@@ -50,47 +50,51 @@ void AMDGPUAAWrapperPass::getAnalysisUsage(AnalysisUsa
 AMDGPUAAResult::ASAliasRulesTy::ASAliasRulesTy(AMDGPUAS AS_, Triple::ArchType Arch_)
   : Arch(Arch_), AS(AS_) {
   // These arrarys are indexed by address space value
-  // enum elements 0 ... to 5
-  static const AliasResult ASAliasRulesPrivIsZero[6][6] = {
-  /*             Private    Global    Constant  Group     Flat      Region*/
-  /* Private  */ {MayAlias, NoAlias , NoAlias , NoAlias , MayAlias, NoAlias},
-  /* Global   */ {NoAlias , MayAlias, NoAlias , NoAlias , MayAlias, NoAlias},
-  /* Constant */ {NoAlias , NoAlias , MayAlias, NoAlias , MayAlias, NoAlias},
-  /* Group    */ {NoAlias , NoAlias , NoAlias , MayAlias, MayAlias, NoAlias},
-  /* Flat     */ {MayAlias, MayAlias, MayAlias, MayAlias, MayAlias, MayAlias},
-  /* Region   */ {NoAlias , NoAlias , NoAlias , NoAlias , MayAlias, MayAlias}
+  // enum elements 0 ... to 6
+  static const AliasResult ASAliasRulesPrivIsZero[7][7] = {
+  /*                    Private    Global    Constant  Group     Flat      Region    Constant 32-bit */
+  /* Private  */        {MayAlias, NoAlias , NoAlias , NoAlias , MayAlias, NoAlias , NoAlias},
+  /* Global   */        {NoAlias , MayAlias, MayAlias, NoAlias , MayAlias, NoAlias , MayAlias},
+  /* Constant */        {NoAlias , MayAlias, MayAlias, NoAlias , MayAlias, NoAlias , MayAlias},
+  /* Group    */        {NoAlias , NoAlias , NoAlias , MayAlias, MayAlias, NoAlias , NoAlias},
+  /* Flat     */        {MayAlias, MayAlias, MayAlias, MayAlias, MayAlias, MayAlias, MayAlias},
+  /* Region   */        {NoAlias , NoAlias , NoAlias , NoAlias , MayAlias, MayAlias, NoAlias},
+  /* Constant 32-bit */ {NoAlias , MayAlias, MayAlias, NoAlias , MayAlias, NoAlias , MayAlias}
   };
-  static const AliasResult ASAliasRulesGenIsZero[6][6] = {
-  /*             Flat       Global    Region    Group     Constant  Private */
-  /* Flat     */ {MayAlias, MayAlias, MayAlias, MayAlias, MayAlias, MayAlias},
-  /* Global   */ {MayAlias, MayAlias, NoAlias , NoAlias , NoAlias , NoAlias},
-  /* Constant */ {MayAlias, NoAlias , MayAlias, NoAlias , NoAlias,  NoAlias},
-  /* Group    */ {MayAlias, NoAlias , NoAlias , MayAlias, NoAlias , NoAlias},
-  /* Region   */ {MayAlias, NoAlias , NoAlias , NoAlias,  MayAlias, NoAlias},
-  /* Private  */ {MayAlias, NoAlias , NoAlias , NoAlias , NoAlias , MayAlias}
+  static const AliasResult ASAliasRulesGenIsZero[7][7] = {
+  /*                    Flat       Global    Region    Group     Constant  Private   Constant 32-bit */
+  /* Flat     */        {MayAlias, MayAlias, MayAlias, MayAlias, MayAlias, MayAlias, MayAlias},
+  /* Global   */        {MayAlias, MayAlias, NoAlias , NoAlias , MayAlias, NoAlias , MayAlias},
+  /* Region   */        {MayAlias, NoAlias , NoAlias , NoAlias,  MayAlias, NoAlias , MayAlias},
+  /* Group    */        {MayAlias, NoAlias , NoAlias , MayAlias, NoAlias , NoAlias , NoAlias},
+  /* Constant */        {MayAlias, MayAlias, MayAlias, NoAlias , NoAlias,  NoAlias , MayAlias},
+  /* Private  */        {MayAlias, NoAlias , NoAlias , NoAlias , NoAlias , MayAlias, NoAlias},
+  /* Constant 32-bit */ {MayAlias, MayAlias, MayAlias, NoAlias , MayAlias, NoAlias , NoAlias}
   };
-  assert(AS.MAX_COMMON_ADDRESS <= 5);
+  static_assert(AMDGPUAS::MAX_AMDGPU_ADDRESS <= 6, "Addr space out of range");
   if (AS.FLAT_ADDRESS == 0) {
-    assert(AS.GLOBAL_ADDRESS   == 1 &&
-           AS.REGION_ADDRESS   == 2 &&
-           AS.LOCAL_ADDRESS    == 3 &&
-           AS.CONSTANT_ADDRESS == 4 &&
-           AS.PRIVATE_ADDRESS  == 5);
+    assert(AS.GLOBAL_ADDRESS         == 1 &&
+           AS.REGION_ADDRESS         == 2 &&
+           AS.LOCAL_ADDRESS          == 3 &&
+           AS.CONSTANT_ADDRESS       == 4 &&
+           AS.PRIVATE_ADDRESS        == 5 &&
+           AS.CONSTANT_ADDRESS_32BIT == 6);
     ASAliasRules = &ASAliasRulesGenIsZero;
   } else {
-    assert(AS.PRIVATE_ADDRESS  == 0 &&
-           AS.GLOBAL_ADDRESS   == 1 &&
-           AS.CONSTANT_ADDRESS == 2 &&
-           AS.LOCAL_ADDRESS    == 3 &&
-           AS.FLAT_ADDRESS     == 4 &&
-           AS.REGION_ADDRESS   == 5);
+    assert(AS.PRIVATE_ADDRESS        == 0 &&
+           AS.GLOBAL_ADDRESS         == 1 &&
+           AS.CONSTANT_ADDRESS       == 2 &&
+           AS.LOCAL_ADDRESS          == 3 &&
+           AS.FLAT_ADDRESS           == 4 &&
+           AS.REGION_ADDRESS         == 5 &&
+           AS.CONSTANT_ADDRESS_32BIT == 6);
     ASAliasRules = &ASAliasRulesPrivIsZero;
   }
 }
 
 AliasResult AMDGPUAAResult::ASAliasRulesTy::getAliasResult(unsigned AS1,
     unsigned AS2) const {
-  if (AS1 > AS.MAX_COMMON_ADDRESS || AS2 > AS.MAX_COMMON_ADDRESS) {
+  if (AS1 > AS.MAX_AMDGPU_ADDRESS || AS2 > AS.MAX_AMDGPU_ADDRESS) {
     if (Arch == Triple::amdgcn)
       report_fatal_error("Pointer address space out of range");
     return AS1 == AS2 ? MayAlias : NoAlias;

Modified: vendor/llvm/dist-release_70/lib/Target/AMDGPU/AMDGPUAliasAnalysis.h
==============================================================================
--- vendor/llvm/dist-release_70/lib/Target/AMDGPU/AMDGPUAliasAnalysis.h	Mon Sep 10 22:48:26 2018	(r338574)
+++ vendor/llvm/dist-release_70/lib/Target/AMDGPU/AMDGPUAliasAnalysis.h	Tue Sep 11 10:09:45 2018	(r338575)
@@ -63,7 +63,7 @@ class AMDGPUAAResult : public AAResultBase<AMDGPUAARes
   private:
     Triple::ArchType Arch;
     AMDGPUAS AS;
-    const AliasResult (*ASAliasRules)[6][6];
+    const AliasResult (*ASAliasRules)[7][7];
   } ASAliasRules;
 };
 

Modified: vendor/llvm/dist-release_70/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp
==============================================================================
--- vendor/llvm/dist-release_70/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp	Mon Sep 10 22:48:26 2018	(r338574)
+++ vendor/llvm/dist-release_70/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp	Tue Sep 11 10:09:45 2018	(r338575)
@@ -1451,7 +1451,11 @@ bool AMDGPUDAGToDAGISel::SelectSMRD(SDValue Addr, SDVa
                                      SDValue &Offset, bool &Imm) const {
   SDLoc SL(Addr);
 
-  if (CurDAG->isBaseWithConstantOffset(Addr)) {
+  // A 32-bit (address + offset) should not cause unsigned 32-bit integer
+  // wraparound, because s_load instructions perform the addition in 64 bits.
+  if ((Addr.getValueType() != MVT::i32 ||
+       Addr->getFlags().hasNoUnsignedWrap()) &&
+      CurDAG->isBaseWithConstantOffset(Addr)) {
     SDValue N0 = Addr.getOperand(0);
     SDValue N1 = Addr.getOperand(1);
 

Modified: vendor/llvm/dist-release_70/lib/Target/ARM/ARMFrameLowering.cpp
==============================================================================
--- vendor/llvm/dist-release_70/lib/Target/ARM/ARMFrameLowering.cpp	Mon Sep 10 22:48:26 2018	(r338574)
+++ vendor/llvm/dist-release_70/lib/Target/ARM/ARMFrameLowering.cpp	Tue Sep 11 10:09:45 2018	(r338575)
@@ -1514,6 +1514,7 @@ static unsigned estimateRSStackSizeLimit(MachineFuncti
           break;
         case ARMII::AddrMode5:
         case ARMII::AddrModeT2_i8s4:
+        case ARMII::AddrModeT2_ldrex:
           Limit = std::min(Limit, ((1U << 8) - 1) * 4);
           break;
         case ARMII::AddrModeT2_i12:

Modified: vendor/llvm/dist-release_70/lib/Target/ARM/ARMInstrFormats.td
==============================================================================
--- vendor/llvm/dist-release_70/lib/Target/ARM/ARMInstrFormats.td	Mon Sep 10 22:48:26 2018	(r338574)
+++ vendor/llvm/dist-release_70/lib/Target/ARM/ARMInstrFormats.td	Tue Sep 11 10:09:45 2018	(r338575)
@@ -109,6 +109,7 @@ def AddrModeT2_pc   : AddrMode<14>;
 def AddrModeT2_i8s4 : AddrMode<15>;
 def AddrMode_i12    : AddrMode<16>;
 def AddrMode5FP16   : AddrMode<17>;
+def AddrModeT2_ldrex : AddrMode<18>;
 
 // Load / store index mode.
 class IndexMode<bits<2> val> {

Modified: vendor/llvm/dist-release_70/lib/Target/ARM/ARMInstrThumb2.td
==============================================================================
--- vendor/llvm/dist-release_70/lib/Target/ARM/ARMInstrThumb2.td	Mon Sep 10 22:48:26 2018	(r338574)
+++ vendor/llvm/dist-release_70/lib/Target/ARM/ARMInstrThumb2.td	Tue Sep 11 10:09:45 2018	(r338575)
@@ -3267,7 +3267,7 @@ def t2LDREXH : T2I_ldrex<0b0101, (outs rGPR:$Rt), (ins
                          [(set rGPR:$Rt, (ldrex_2 addr_offset_none:$addr))]>,
                Requires<[IsThumb, HasV8MBaseline]>;
 def t2LDREX  : Thumb2I<(outs rGPR:$Rt), (ins t2addrmode_imm0_1020s4:$addr),
-                       AddrModeNone, 4, NoItinerary,
+                       AddrModeT2_ldrex, 4, NoItinerary,
                        "ldrex", "\t$Rt, $addr", "",
                      [(set rGPR:$Rt, (ldrex_4 t2addrmode_imm0_1020s4:$addr))]>,
                Requires<[IsThumb, HasV8MBaseline]> {
@@ -3346,7 +3346,7 @@ def t2STREXH : T2I_strex<0b0101, (outs rGPR:$Rd),
 
 def t2STREX  : Thumb2I<(outs rGPR:$Rd), (ins rGPR:$Rt,
                              t2addrmode_imm0_1020s4:$addr),
-                  AddrModeNone, 4, NoItinerary,
+                  AddrModeT2_ldrex, 4, NoItinerary,
                   "strex", "\t$Rd, $Rt, $addr", "",
                   [(set rGPR:$Rd,
                         (strex_4 rGPR:$Rt, t2addrmode_imm0_1020s4:$addr))]>,

Modified: vendor/llvm/dist-release_70/lib/Target/ARM/MCTargetDesc/ARMBaseInfo.h
==============================================================================
--- vendor/llvm/dist-release_70/lib/Target/ARM/MCTargetDesc/ARMBaseInfo.h	Mon Sep 10 22:48:26 2018	(r338574)
+++ vendor/llvm/dist-release_70/lib/Target/ARM/MCTargetDesc/ARMBaseInfo.h	Tue Sep 11 10:09:45 2018	(r338575)
@@ -201,7 +201,8 @@ namespace ARMII {
     AddrModeT2_pc   = 14, // +/- i12 for pc relative data
     AddrModeT2_i8s4 = 15, // i8 * 4
     AddrMode_i12    = 16,
-    AddrMode5FP16   = 17  // i8 * 2
+    AddrMode5FP16   = 17,  // i8 * 2
+    AddrModeT2_ldrex = 18, // i8 * 4, with unscaled offset in MCInst
   };
 
   inline static const char *AddrModeToString(AddrMode addrmode) {
@@ -224,6 +225,7 @@ namespace ARMII {
     case AddrModeT2_pc:   return "AddrModeT2_pc";
     case AddrModeT2_i8s4: return "AddrModeT2_i8s4";
     case AddrMode_i12:    return "AddrMode_i12";
+    case AddrModeT2_ldrex:return "AddrModeT2_ldrex";
     }
   }
 

Modified: vendor/llvm/dist-release_70/lib/Target/ARM/Thumb2InstrInfo.cpp
==============================================================================
--- vendor/llvm/dist-release_70/lib/Target/ARM/Thumb2InstrInfo.cpp	Mon Sep 10 22:48:26 2018	(r338574)
+++ vendor/llvm/dist-release_70/lib/Target/ARM/Thumb2InstrInfo.cpp	Tue Sep 11 10:09:45 2018	(r338575)
@@ -621,6 +621,11 @@ bool llvm::rewriteT2FrameIndex(MachineInstr &MI, unsig
       // MCInst operand expects already scaled value.
       Scale = 1;
       assert((Offset & 3) == 0 && "Can't encode this offset!");
+    } else if (AddrMode == ARMII::AddrModeT2_ldrex) {
+      Offset += MI.getOperand(FrameRegIdx + 1).getImm() * 4;
+      NumBits = 8; // 8 bits scaled by 4
+      Scale = 4;
+      assert((Offset & 3) == 0 && "Can't encode this offset!");
     } else {
       llvm_unreachable("Unsupported addressing mode!");
     }
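The hunk above teaches the frame-index rewriter about the new AddrModeT2_ldrex mode: an 8-bit immediate scaled by 4, i.e. word-aligned offsets from 0 to 1020. A minimal standalone sketch of the encodability rule that the NumBits/Scale pair implies (illustrative names, not LLVM's actual helper):

```cpp
#include <cassert>

// Hypothetical sketch of the offset-encodability check implied by
// AddrModeT2_ldrex above: an 8-bit field scaled by 4, so only word-aligned
// offsets in [0, 1020] fit. Not LLVM's real API; names are illustrative.
bool isEncodableLdrexOffset(int Offset) {
  const int NumBits = 8; // 8 bits scaled by 4
  const int Scale = 4;
  if (Offset < 0 || (Offset & (Scale - 1)) != 0)
    return false; // negative or not word-aligned
  return (Offset / Scale) < (1 << NumBits);
}
```

This is why the assert in the hunk checks `(Offset & 3) == 0` before encoding: the scaled field cannot represent misaligned offsets at all.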

Modified: vendor/llvm/dist-release_70/lib/Target/BPF/MCTargetDesc/BPFAsmBackend.cpp
==============================================================================
--- vendor/llvm/dist-release_70/lib/Target/BPF/MCTargetDesc/BPFAsmBackend.cpp	Mon Sep 10 22:48:26 2018	(r338574)
+++ vendor/llvm/dist-release_70/lib/Target/BPF/MCTargetDesc/BPFAsmBackend.cpp	Tue Sep 11 10:09:45 2018	(r338575)
@@ -10,6 +10,8 @@
 #include "MCTargetDesc/BPFMCTargetDesc.h"
 #include "llvm/ADT/StringRef.h"
 #include "llvm/MC/MCAsmBackend.h"
+#include "llvm/MC/MCAssembler.h"
+#include "llvm/MC/MCContext.h"
 #include "llvm/MC/MCFixup.h"
 #include "llvm/MC/MCObjectWriter.h"
 #include "llvm/Support/EndianStream.h"
@@ -71,7 +73,12 @@ void BPFAsmBackend::applyFixup(const MCAssembler &Asm,
                                bool IsResolved,
                                const MCSubtargetInfo *STI) const {
   if (Fixup.getKind() == FK_SecRel_4 || Fixup.getKind() == FK_SecRel_8) {
-    assert(Value == 0);
+    if (Value) {
+      MCContext &Ctx = Asm.getContext();
+      Ctx.reportError(Fixup.getLoc(),
+                      "Unsupported relocation: try to compile with -O2 or above, "
+                      "or check your static variable usage");
+    }
   } else if (Fixup.getKind() == FK_Data_4) {
     support::endian::write<uint32_t>(&Data[Fixup.getOffset()], Value, Endian);
   } else if (Fixup.getKind() == FK_Data_8) {
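The BPF change above replaces a hard `assert(Value == 0)` with a diagnostic reported through the context, so a bad input produces a compile error instead of aborting the compiler. A hedged model of that pattern (stand-in types, not the MC API):

```cpp
#include <cassert>
#include <string>
#include <vector>

// Illustrative sketch of turning an assert into a recoverable diagnostic,
// as the BPF backend hunk above does for nonzero FK_SecRel fixup values.
// DiagContext stands in for MCContext; this is not LLVM's real interface.
struct DiagContext {
  std::vector<std::string> Errors;
  void reportError(const std::string &Msg) { Errors.push_back(Msg); }
};

bool applySecRelFixup(DiagContext &Ctx, unsigned long Value) {
  if (Value != 0) {
    Ctx.reportError("unsupported relocation");
    return false; // keep going so later fixups still get diagnosed
  }
  return true;
}
```

The design point is that the error surfaces at the fixup's source location with actionable advice, while compilation continues far enough to report any further problems.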

Modified: vendor/llvm/dist-release_70/lib/Target/X86/AsmParser/X86AsmParser.cpp
==============================================================================
--- vendor/llvm/dist-release_70/lib/Target/X86/AsmParser/X86AsmParser.cpp	Mon Sep 10 22:48:26 2018	(r338574)
+++ vendor/llvm/dist-release_70/lib/Target/X86/AsmParser/X86AsmParser.cpp	Tue Sep 11 10:09:45 2018	(r338575)
@@ -1054,7 +1054,7 @@ static bool CheckBaseRegAndIndexRegAndScale(unsigned B
   // RIP/EIP-relative addressing is only supported in 64-bit mode.
   if (!Is64BitMode && BaseReg != 0 &&
       (BaseReg == X86::RIP || BaseReg == X86::EIP)) {
-    ErrMsg = "RIP-relative addressing requires 64-bit mode";
+    ErrMsg = "IP-relative addressing requires 64-bit mode";
     return true;
   }
 
@@ -1099,7 +1099,7 @@ bool X86AsmParser::ParseRegister(unsigned &RegNo,
     // checked.
     // FIXME: Check AH, CH, DH, BH cannot be used in an instruction requiring a
     // REX prefix.
-    if (RegNo == X86::RIZ || RegNo == X86::RIP || RegNo == X86::EIP ||
+    if (RegNo == X86::RIZ || RegNo == X86::RIP ||
         X86MCRegisterClasses[X86::GR64RegClassID].contains(RegNo) ||
         X86II::isX86_64NonExtLowByteReg(RegNo) ||
         X86II::isX86_64ExtendedReg(RegNo))

Modified: vendor/llvm/dist-release_70/lib/Transforms/Scalar/LoopSink.cpp
==============================================================================
--- vendor/llvm/dist-release_70/lib/Transforms/Scalar/LoopSink.cpp	Mon Sep 10 22:48:26 2018	(r338574)
+++ vendor/llvm/dist-release_70/lib/Transforms/Scalar/LoopSink.cpp	Tue Sep 11 10:09:45 2018	(r338575)
@@ -152,6 +152,14 @@ findBBsToSinkInto(const Loop &L, const SmallPtrSetImpl
     }
   }
 
+  // Can't sink into blocks that have no valid insertion point.
+  for (BasicBlock *BB : BBsToSinkInto) {
+    if (BB->getFirstInsertionPt() == BB->end()) {
+      BBsToSinkInto.clear();
+      break;
+    }
+  }
+
   // If the total frequency of BBsToSinkInto is larger than preheader frequency,
   // do not sink.
   if (adjustedSumFreq(BBsToSinkInto, BFI) >
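The LoopSink guard added above bails out of sinking entirely if any candidate block lacks a valid insertion point (e.g. a block whose only contents are an EH pad and terminator). A small model of that all-or-nothing pruning, with a stand-in Block type:

```cpp
#include <cassert>
#include <set>

// Hedged sketch of the guard above: if any candidate block has no valid
// insertion point, give up on sinking altogether. Block is a stand-in type;
// HasInsertionPt models BB->getFirstInsertionPt() != BB->end().
struct Block { bool HasInsertionPt; };

void pruneIfUnsinkable(std::set<Block *> &BBsToSinkInto) {
  for (Block *BB : BBsToSinkInto) {
    if (!BB->HasInsertionPt) {
      BBsToSinkInto.clear(); // one invalid target makes the whole plan invalid
      break;
    }
  }
}
```

Clearing the whole set rather than dropping the one bad block is deliberate: the sinking decision is made over the full set of target blocks, so a partial set would change the profitability calculation that follows.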

Modified: vendor/llvm/dist-release_70/lib/Transforms/Scalar/SROA.cpp
==============================================================================
--- vendor/llvm/dist-release_70/lib/Transforms/Scalar/SROA.cpp	Mon Sep 10 22:48:26 2018	(r338574)
+++ vendor/llvm/dist-release_70/lib/Transforms/Scalar/SROA.cpp	Tue Sep 11 10:09:45 2018	(r338575)
@@ -3046,6 +3046,42 @@ class llvm::sroa::AllocaSliceRewriter (private)
     return true;
   }
 
+  void fixLoadStoreAlign(Instruction &Root) {
+    // This algorithm implements the same visitor loop as
+    // hasUnsafePHIOrSelectUse, and fixes the alignment of each load
+    // or store found.
+    SmallPtrSet<Instruction *, 4> Visited;
+    SmallVector<Instruction *, 4> Uses;
+    Visited.insert(&Root);
+    Uses.push_back(&Root);
+    do {
+      Instruction *I = Uses.pop_back_val();
+
+      if (LoadInst *LI = dyn_cast<LoadInst>(I)) {
+        unsigned LoadAlign = LI->getAlignment();
+        if (!LoadAlign)
+          LoadAlign = DL.getABITypeAlignment(LI->getType());
+        LI->setAlignment(std::min(LoadAlign, getSliceAlign()));
+        continue;
+      }
+      if (StoreInst *SI = dyn_cast<StoreInst>(I)) {
+        unsigned StoreAlign = SI->getAlignment();
+        if (!StoreAlign) {
+          Value *Op = SI->getOperand(0);
+          StoreAlign = DL.getABITypeAlignment(Op->getType());
+        }
+        SI->setAlignment(std::min(StoreAlign, getSliceAlign()));
+        continue;
+      }
+
+      assert(isa<BitCastInst>(I) || isa<PHINode>(I) ||
+             isa<SelectInst>(I) || isa<GetElementPtrInst>(I));
+      for (User *U : I->users())
+        if (Visited.insert(cast<Instruction>(U)).second)
+          Uses.push_back(cast<Instruction>(U));
+    } while (!Uses.empty());
+  }
+
   bool visitPHINode(PHINode &PN) {
     LLVM_DEBUG(dbgs() << "    original: " << PN << "\n");
     assert(BeginOffset >= NewAllocaBeginOffset && "PHIs are unsplittable");
@@ -3069,6 +3105,9 @@ class llvm::sroa::AllocaSliceRewriter (private)
     LLVM_DEBUG(dbgs() << "          to: " << PN << "\n");
     deleteIfTriviallyDead(OldPtr);
 
+    // Fix the alignment of any loads or stores using this PHI node.
+    fixLoadStoreAlign(PN);
+
     // PHIs can't be promoted on their own, but often can be speculated. We
     // check the speculation outside of the rewriter so that we see the
     // fully-rewritten alloca.
@@ -3092,6 +3131,9 @@ class llvm::sroa::AllocaSliceRewriter (private)
 
     LLVM_DEBUG(dbgs() << "          to: " << SI << "\n");
     deleteIfTriviallyDead(OldPtr);
+
+    // Fix the alignment of any loads or stores using this select.
+    fixLoadStoreAlign(SI);
 
     // Selects can't be promoted on their own, but often can be speculated. We
     // check the speculation outside of the rewriter so that we see the
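The new `fixLoadStoreAlign` above is a classic worklist traversal: walk the transitive users of a rewritten PHI/select through bitcasts, PHIs, selects, and GEPs, and clamp the alignment of every load or store reached. A simplified, self-contained model of that walk (this omits SROA's "alignment 0 means ABI alignment" rule and uses a toy node graph instead of LLVM's Instruction hierarchy):

```cpp
#include <algorithm>
#include <cassert>
#include <set>
#include <vector>

// Simplified model of fixLoadStoreAlign: breadth over transitive users with
// a visited set, clamping each memory op's alignment to the slice alignment.
// Node stands in for llvm::Instruction; this is a sketch, not the real code.
struct Node {
  bool IsMemOp = false;       // models LoadInst/StoreInst
  unsigned Align = 0;         // current alignment of the mem op
  std::vector<Node *> Users;  // models I->users()
};

void fixAlign(Node &Root, unsigned SliceAlign) {
  std::set<Node *> Visited{&Root};
  std::vector<Node *> Uses{&Root};
  while (!Uses.empty()) {
    Node *N = Uses.back();
    Uses.pop_back();
    if (N->IsMemOp) {
      N->Align = std::min(N->Align, SliceAlign); // never raise alignment
      continue; // loads/stores are leaves of this walk
    }
    for (Node *U : N->Users)
      if (Visited.insert(U).second) // visit each user once
        Uses.push_back(U);
  }
}
```

The visited set is what makes this safe on cyclic use graphs (PHIs can feed each other), mirroring the `Visited.insert(...).second` idiom in the hunk.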

Modified: vendor/llvm/dist-release_70/lib/Transforms/Utils/CloneFunction.cpp
==============================================================================
--- vendor/llvm/dist-release_70/lib/Transforms/Utils/CloneFunction.cpp	Mon Sep 10 22:48:26 2018	(r338574)
+++ vendor/llvm/dist-release_70/lib/Transforms/Utils/CloneFunction.cpp	Tue Sep 11 10:09:45 2018	(r338575)
@@ -636,6 +636,22 @@ void llvm::CloneAndPruneIntoFromInst(Function *NewFunc
   Function::iterator Begin = cast<BasicBlock>(VMap[StartingBB])->getIterator();
   Function::iterator I = Begin;
   while (I != NewFunc->end()) {
+    // We need to simplify conditional branches and switches with a constant
+    // operand. We try to prune these out when cloning, but if the
+    // simplification required looking through PHI nodes, those are only
+    // available after forming the full basic block. That may leave some here,
+    // and we still want to prune the dead code as early as possible.
+    //
+    // Do the folding before we check if the block is dead since we want code
+    // like
+    //  bb:
+    //    br i1 undef, label %bb, label %bb
+    // to be simplified to
+    //  bb:
+    //    br label %bb
+    // before we call I->getSinglePredecessor().
+    ConstantFoldTerminator(&*I);
+
     // Check if this block has become dead during inlining or other
     // simplifications. Note that the first block will appear dead, as it has
     // not yet been wired up properly.
@@ -645,13 +661,6 @@ void llvm::CloneAndPruneIntoFromInst(Function *NewFunc
       DeleteDeadBlock(DeadBB);
       continue;
     }
-
-    // We need to simplify conditional branches and switches with a constant
-    // operand. We try to prune these out when cloning, but if the
-    // simplification required looking through PHI nodes, those are only
-    // available after forming the full basic block. That may leave some here,
-    // and we still want to prune the dead code as early as possible.
-    ConstantFoldTerminator(&*I);
 
     BranchInst *BI = dyn_cast<BranchInst>(I->getTerminator());
     if (!BI || BI->isConditional()) { ++I; continue; }
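The reordering above matters because `br i1 undef, label %bb, label %bb` contributes two predecessor edges to `%bb`; only after folding to an unconditional branch does the "single predecessor is itself" deadness test fire. A toy CFG model of that interaction (illustrative types, not LLVM's CFG API):

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Illustrative model of why ConstantFoldTerminator must run before the
// dead-block check in the hunk above: a self-looping block reached via both
// arms of a conditional has two predecessor edges from itself; folding the
// duplicate-target conditional leaves exactly one.
struct BB { std::vector<BB *> Succs; };

void constantFoldTerminator(BB &B) {
  // "br i1 undef, label %x, label %x"  ->  "br label %x"
  if (B.Succs.size() == 2 && B.Succs[0] == B.Succs[1])
    B.Succs.pop_back();
}

int predEdgesFrom(const BB &From, const BB &To) {
  return static_cast<int>(
      std::count(From.Succs.begin(), From.Succs.end(), &To));
}
```

With only one self-edge left, `I->getSinglePredecessor()` can return the block itself and the dead-block deletion in the unchanged code below proceeds.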

Modified: vendor/llvm/dist-release_70/lib/Transforms/Vectorize/LoopVectorize.cpp
==============================================================================
--- vendor/llvm/dist-release_70/lib/Transforms/Vectorize/LoopVectorize.cpp	Mon Sep 10 22:48:26 2018	(r338574)
+++ vendor/llvm/dist-release_70/lib/Transforms/Vectorize/LoopVectorize.cpp	Tue Sep 11 10:09:45 2018	(r338575)
@@ -4510,6 +4510,13 @@ void LoopVectorizationCostModel::collectLoopUniforms(u
     for (auto OV : I->operand_values()) {
       if (isOutOfScope(OV))
         continue;
+      // First order recurrence Phi's should typically be considered
+      // non-uniform.
+      auto *OP = dyn_cast<PHINode>(OV);
+      if (OP && Legal->isFirstOrderRecurrence(OP))
+        continue;
+      // If all the users of the operand are uniform, then add the
+      // operand into the uniform worklist.
       auto *OI = cast<Instruction>(OV);
       if (llvm::all_of(OI->users(), [&](User *U) -> bool {
             auto *J = cast<Instruction>(U);

Modified: vendor/llvm/dist-release_70/test/CodeGen/AMDGPU/amdgpu-alias-analysis.ll
==============================================================================
--- vendor/llvm/dist-release_70/test/CodeGen/AMDGPU/amdgpu-alias-analysis.ll	Mon Sep 10 22:48:26 2018	(r338574)
+++ vendor/llvm/dist-release_70/test/CodeGen/AMDGPU/amdgpu-alias-analysis.ll	Tue Sep 11 10:09:45 2018	(r338575)
@@ -7,3 +7,27 @@ define void @test(i8 addrspace(5)* %p, i8 addrspace(1)
   ret void
 }
 
+; CHECK: MayAlias:      i8 addrspace(1)* %p1, i8 addrspace(4)* %p
+
+define void @test_constant_vs_global(i8 addrspace(4)* %p, i8 addrspace(1)* %p1) {
+  ret void
+}
+
+; CHECK: MayAlias:      i8 addrspace(1)* %p, i8 addrspace(4)* %p1
+
+define void @test_global_vs_constant(i8 addrspace(1)* %p, i8 addrspace(4)* %p1) {
+  ret void
+}
+
+; CHECK: MayAlias:      i8 addrspace(1)* %p1, i8 addrspace(6)* %p
+
+define void @test_constant_32bit_vs_global(i8 addrspace(6)* %p, i8 addrspace(1)* %p1) {
+  ret void
+}
+
+; CHECK: MayAlias:      i8 addrspace(4)* %p1, i8 addrspace(6)* %p
+
+define void @test_constant_32bit_vs_constant(i8 addrspace(6)* %p, i8 addrspace(4)* %p1) {
+  ret void
+}
+

Modified: vendor/llvm/dist-release_70/test/CodeGen/AMDGPU/constant-address-space-32bit.ll
==============================================================================
--- vendor/llvm/dist-release_70/test/CodeGen/AMDGPU/constant-address-space-32bit.ll	Mon Sep 10 22:48:26 2018	(r338574)
+++ vendor/llvm/dist-release_70/test/CodeGen/AMDGPU/constant-address-space-32bit.ll	Tue Sep 11 10:09:45 2018	(r338575)
@@ -12,7 +12,7 @@
 ; VIGFX9-DAG: s_load_dword s{{[0-9]}}, s[0:1], 0x0
 ; VIGFX9-DAG: s_load_dword s{{[0-9]}}, s[2:3], 0x8
 define amdgpu_vs float @load_i32(i32 addrspace(6)* inreg %p0, i32 addrspace(6)* inreg %p1) #0 {
-  %gep1 = getelementptr i32, i32 addrspace(6)* %p1, i64 2
+  %gep1 = getelementptr inbounds i32, i32 addrspace(6)* %p1, i32 2
   %r0 = load i32, i32 addrspace(6)* %p0
   %r1 = load i32, i32 addrspace(6)* %gep1
   %r = add i32 %r0, %r1
@@ -29,7 +29,7 @@ define amdgpu_vs float @load_i32(i32 addrspace(6)* inr
 ; VIGFX9-DAG: s_load_dwordx2 s[{{.*}}], s[0:1], 0x0
 ; VIGFX9-DAG: s_load_dwordx2 s[{{.*}}], s[2:3], 0x10
 define amdgpu_vs <2 x float> @load_v2i32(<2 x i32> addrspace(6)* inreg %p0, <2 x i32> addrspace(6)* inreg %p1) #0 {
-  %gep1 = getelementptr <2 x i32>, <2 x i32> addrspace(6)* %p1, i64 2
+  %gep1 = getelementptr inbounds <2 x i32>, <2 x i32> addrspace(6)* %p1, i32 2
   %r0 = load <2 x i32>, <2 x i32> addrspace(6)* %p0
   %r1 = load <2 x i32>, <2 x i32> addrspace(6)* %gep1
   %r = add <2 x i32> %r0, %r1
@@ -46,7 +46,7 @@ define amdgpu_vs <2 x float> @load_v2i32(<2 x i32> add
 ; VIGFX9-DAG: s_load_dwordx4 s[{{.*}}], s[0:1], 0x0
 ; VIGFX9-DAG: s_load_dwordx4 s[{{.*}}], s[2:3], 0x20
 define amdgpu_vs <4 x float> @load_v4i32(<4 x i32> addrspace(6)* inreg %p0, <4 x i32> addrspace(6)* inreg %p1) #0 {
-  %gep1 = getelementptr <4 x i32>, <4 x i32> addrspace(6)* %p1, i64 2
+  %gep1 = getelementptr inbounds <4 x i32>, <4 x i32> addrspace(6)* %p1, i32 2
   %r0 = load <4 x i32>, <4 x i32> addrspace(6)* %p0
   %r1 = load <4 x i32>, <4 x i32> addrspace(6)* %gep1
   %r = add <4 x i32> %r0, %r1
@@ -63,7 +63,7 @@ define amdgpu_vs <4 x float> @load_v4i32(<4 x i32> add
 ; VIGFX9-DAG: s_load_dwordx8 s[{{.*}}], s[0:1], 0x0
 ; VIGFX9-DAG: s_load_dwordx8 s[{{.*}}], s[2:3], 0x40
 define amdgpu_vs <8 x float> @load_v8i32(<8 x i32> addrspace(6)* inreg %p0, <8 x i32> addrspace(6)* inreg %p1) #0 {
-  %gep1 = getelementptr <8 x i32>, <8 x i32> addrspace(6)* %p1, i64 2
+  %gep1 = getelementptr inbounds <8 x i32>, <8 x i32> addrspace(6)* %p1, i32 2
   %r0 = load <8 x i32>, <8 x i32> addrspace(6)* %p0
   %r1 = load <8 x i32>, <8 x i32> addrspace(6)* %gep1
   %r = add <8 x i32> %r0, %r1
@@ -80,7 +80,7 @@ define amdgpu_vs <8 x float> @load_v8i32(<8 x i32> add
 ; VIGFX9-DAG: s_load_dwordx16 s[{{.*}}], s[0:1], 0x0
 ; VIGFX9-DAG: s_load_dwordx16 s[{{.*}}], s[2:3], 0x80
 define amdgpu_vs <16 x float> @load_v16i32(<16 x i32> addrspace(6)* inreg %p0, <16 x i32> addrspace(6)* inreg %p1) #0 {
-  %gep1 = getelementptr <16 x i32>, <16 x i32> addrspace(6)* %p1, i64 2
+  %gep1 = getelementptr inbounds <16 x i32>, <16 x i32> addrspace(6)* %p1, i32 2
   %r0 = load <16 x i32>, <16 x i32> addrspace(6)* %p0
   %r1 = load <16 x i32>, <16 x i32> addrspace(6)* %gep1
   %r = add <16 x i32> %r0, %r1
@@ -97,7 +97,7 @@ define amdgpu_vs <16 x float> @load_v16i32(<16 x i32> 
 ; VIGFX9-DAG: s_load_dword s{{[0-9]}}, s[0:1], 0x0
 ; VIGFX9-DAG: s_load_dword s{{[0-9]}}, s[2:3], 0x8
 define amdgpu_vs float @load_float(float addrspace(6)* inreg %p0, float addrspace(6)* inreg %p1) #0 {
-  %gep1 = getelementptr float, float addrspace(6)* %p1, i64 2
+  %gep1 = getelementptr inbounds float, float addrspace(6)* %p1, i32 2
   %r0 = load float, float addrspace(6)* %p0
   %r1 = load float, float addrspace(6)* %gep1
   %r = fadd float %r0, %r1
@@ -113,7 +113,7 @@ define amdgpu_vs float @load_float(float addrspace(6)*
 ; VIGFX9-DAG: s_load_dwordx2 s[{{.*}}], s[0:1], 0x0
 ; VIGFX9-DAG: s_load_dwordx2 s[{{.*}}], s[2:3], 0x10
 define amdgpu_vs <2 x float> @load_v2float(<2 x float> addrspace(6)* inreg %p0, <2 x float> addrspace(6)* inreg %p1) #0 {
-  %gep1 = getelementptr <2 x float>, <2 x float> addrspace(6)* %p1, i64 2
+  %gep1 = getelementptr inbounds <2 x float>, <2 x float> addrspace(6)* %p1, i32 2
   %r0 = load <2 x float>, <2 x float> addrspace(6)* %p0
   %r1 = load <2 x float>, <2 x float> addrspace(6)* %gep1
   %r = fadd <2 x float> %r0, %r1
@@ -129,7 +129,7 @@ define amdgpu_vs <2 x float> @load_v2float(<2 x float>
 ; VIGFX9-DAG: s_load_dwordx4 s[{{.*}}], s[0:1], 0x0
 ; VIGFX9-DAG: s_load_dwordx4 s[{{.*}}], s[2:3], 0x20
 define amdgpu_vs <4 x float> @load_v4float(<4 x float> addrspace(6)* inreg %p0, <4 x float> addrspace(6)* inreg %p1) #0 {
-  %gep1 = getelementptr <4 x float>, <4 x float> addrspace(6)* %p1, i64 2
+  %gep1 = getelementptr inbounds <4 x float>, <4 x float> addrspace(6)* %p1, i32 2
   %r0 = load <4 x float>, <4 x float> addrspace(6)* %p0
   %r1 = load <4 x float>, <4 x float> addrspace(6)* %gep1
   %r = fadd <4 x float> %r0, %r1
@@ -145,7 +145,7 @@ define amdgpu_vs <4 x float> @load_v4float(<4 x float>
 ; VIGFX9-DAG: s_load_dwordx8 s[{{.*}}], s[0:1], 0x0
 ; VIGFX9-DAG: s_load_dwordx8 s[{{.*}}], s[2:3], 0x40
 define amdgpu_vs <8 x float> @load_v8float(<8 x float> addrspace(6)* inreg %p0, <8 x float> addrspace(6)* inreg %p1) #0 {
-  %gep1 = getelementptr <8 x float>, <8 x float> addrspace(6)* %p1, i64 2
+  %gep1 = getelementptr inbounds <8 x float>, <8 x float> addrspace(6)* %p1, i32 2
   %r0 = load <8 x float>, <8 x float> addrspace(6)* %p0
   %r1 = load <8 x float>, <8 x float> addrspace(6)* %gep1
   %r = fadd <8 x float> %r0, %r1
@@ -161,7 +161,7 @@ define amdgpu_vs <8 x float> @load_v8float(<8 x float>
 ; VIGFX9-DAG: s_load_dwordx16 s[{{.*}}], s[0:1], 0x0
 ; VIGFX9-DAG: s_load_dwordx16 s[{{.*}}], s[2:3], 0x80
 define amdgpu_vs <16 x float> @load_v16float(<16 x float> addrspace(6)* inreg %p0, <16 x float> addrspace(6)* inreg %p1) #0 {
-  %gep1 = getelementptr <16 x float>, <16 x float> addrspace(6)* %p1, i64 2
+  %gep1 = getelementptr inbounds <16 x float>, <16 x float> addrspace(6)* %p1, i32 2
   %r0 = load <16 x float>, <16 x float> addrspace(6)* %p0
   %r1 = load <16 x float>, <16 x float> addrspace(6)* %gep1
   %r = fadd <16 x float> %r0, %r1
@@ -212,12 +212,12 @@ main_body:
   %22 = call nsz float @llvm.amdgcn.interp.mov(i32 2, i32 0, i32 0, i32 %5) #8
   %23 = bitcast float %22 to i32
   %24 = shl i32 %23, 1
-  %25 = getelementptr [0 x <8 x i32>], [0 x <8 x i32>] addrspace(6)* %1, i32 0, i32 %24, !amdgpu.uniform !0
+  %25 = getelementptr inbounds [0 x <8 x i32>], [0 x <8 x i32>] addrspace(6)* %1, i32 0, i32 %24, !amdgpu.uniform !0
   %26 = load <8 x i32>, <8 x i32> addrspace(6)* %25, align 32, !invariant.load !0
   %27 = shl i32 %23, 2
   %28 = or i32 %27, 3
   %29 = bitcast [0 x <8 x i32>] addrspace(6)* %1 to [0 x <4 x i32>] addrspace(6)*
-  %30 = getelementptr [0 x <4 x i32>], [0 x <4 x i32>] addrspace(6)* %29, i32 0, i32 %28, !amdgpu.uniform !0
+  %30 = getelementptr inbounds [0 x <4 x i32>], [0 x <4 x i32>] addrspace(6)* %29, i32 0, i32 %28, !amdgpu.uniform !0
   %31 = load <4 x i32>, <4 x i32> addrspace(6)* %30, align 16, !invariant.load !0
   %32 = call nsz <4 x float> @llvm.amdgcn.image.sample.1d.v4f32.f32(i32 15, float 0.0, <8 x i32> %26, <4 x i32> %31, i1 0, i32 0, i32 0) #8
   %33 = extractelement <4 x float> %32, i32 0
@@ -246,12 +246,12 @@ main_body:
   %22 = call nsz float @llvm.amdgcn.interp.mov(i32 2, i32 0, i32 0, i32 %5) #8
   %23 = bitcast float %22 to i32
   %24 = shl i32 %23, 1
-  %25 = getelementptr [0 x <8 x i32>], [0 x <8 x i32>] addrspace(6)* %1, i32 0, i32 %24
+  %25 = getelementptr inbounds [0 x <8 x i32>], [0 x <8 x i32>] addrspace(6)* %1, i32 0, i32 %24
   %26 = load <8 x i32>, <8 x i32> addrspace(6)* %25, align 32, !invariant.load !0
   %27 = shl i32 %23, 2
   %28 = or i32 %27, 3
   %29 = bitcast [0 x <8 x i32>] addrspace(6)* %1 to [0 x <4 x i32>] addrspace(6)*
-  %30 = getelementptr [0 x <4 x i32>], [0 x <4 x i32>] addrspace(6)* %29, i32 0, i32 %28
+  %30 = getelementptr inbounds [0 x <4 x i32>], [0 x <4 x i32>] addrspace(6)* %29, i32 0, i32 %28
   %31 = load <4 x i32>, <4 x i32> addrspace(6)* %30, align 16, !invariant.load !0
   %32 = call nsz <4 x float> @llvm.amdgcn.image.sample.1d.v4f32.f32(i32 15, float 0.0, <8 x i32> %26, <4 x i32> %31, i1 0, i32 0, i32 0) #8
   %33 = extractelement <4 x float> %32, i32 0
@@ -266,6 +266,17 @@ main_body:
   %42 = insertvalue <{ i32, i32, i32, i32, i32, float, float, float, float, float, float, float, float, float, float, float, float, float, float, float }> %41, float %36, 8
   %43 = insertvalue <{ i32, i32, i32, i32, i32, float, float, float, float, float, float, float, float, float, float, float, float, float, float, float }> %42, float %20, 19
   ret <{ i32, i32, i32, i32, i32, float, float, float, float, float, float, float, float, float, float, float, float, float, float, float }> %43
+}
+
+; GCN-LABEL: {{^}}load_addr_no_fold:
+; GCN-DAG: s_add_i32 s0, s0, 4
+; GCN-DAG: s_mov_b32 s1, 0
+; GCN: s_load_dword s{{[0-9]}}, s[0:1], 0x0
+define amdgpu_vs float @load_addr_no_fold(i32 addrspace(6)* inreg noalias %p0) #0 {
+  %gep1 = getelementptr i32, i32 addrspace(6)* %p0, i32 1
+  %r1 = load i32, i32 addrspace(6)* %gep1
+  %r2 = bitcast i32 %r1 to float
+  ret float %r2
 }
 
 ; Function Attrs: nounwind readnone speculatable

Added: vendor/llvm/dist-release_70/test/CodeGen/ARM/ldrex-frame-size.ll
==============================================================================
--- /dev/null	00:00:00 1970	(empty, because file is newly added)
+++ vendor/llvm/dist-release_70/test/CodeGen/ARM/ldrex-frame-size.ll	Tue Sep 11 10:09:45 2018	(r338575)
@@ -0,0 +1,36 @@
+; RUN: llc -mtriple=thumbv7-linux-gnueabi -o - %s | FileCheck %s
+
+; This alloca is just large enough that FrameLowering decides it needs a frame
+; to guarantee access, based on the range of ldrex.
+
+; The actual alloca size is a bit of black magic, unfortunately: the real
+; maximum accessible offset is 1020, but FrameLowering adds 16 bytes to its
+; estimated stack size as a safety margin, so the alloca is not actually what
+; the limit gets compared to. The important point is that we don't go up to
+; ~4096, which is the default when no range-limited instructions are involved.
+define void @test_large_frame() {
+; CHECK-LABEL: test_large_frame:
+; CHECK: push
+; CHECK: sub.w sp, sp, #1004
+
+  %ptr = alloca i32, i32 251
+
+  %addr = getelementptr i32, i32* %ptr, i32 1
+  call i32 @llvm.arm.ldrex.p0i32(i32* %addr)
+  ret void
+}
+
+; This alloca is just the other side of the limit, so no frame is needed.
+define void @test_small_frame() {
+; CHECK-LABEL: test_small_frame:
+; CHECK-NOT: push
+; CHECK: sub.w sp, sp, #1000
+
+  %ptr = alloca i32, i32 250
+
+  %addr = getelementptr i32, i32* %ptr, i32 1
+  call i32 @llvm.arm.ldrex.p0i32(i32* %addr)
+  ret void
+}
+
+declare i32 @llvm.arm.ldrex.p0i32(i32*)

Modified: vendor/llvm/dist-release_70/test/CodeGen/ARM/ldstrex.ll
==============================================================================
--- vendor/llvm/dist-release_70/test/CodeGen/ARM/ldstrex.ll	Mon Sep 10 22:48:26 2018	(r338574)
+++ vendor/llvm/dist-release_70/test/CodeGen/ARM/ldstrex.ll	Tue Sep 11 10:09:45 2018	(r338575)
@@ -142,6 +142,91 @@ define void @excl_addrmode() {
   ret void
 }
 
+define void @test_excl_addrmode_folded() {

*** DIFF OUTPUT TRUNCATED AT 1000 LINES ***


