From owner-svn-src-all@freebsd.org Wed Jul 29 19:27:58 2015 Return-Path: Delivered-To: svn-src-all@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 49E1C9AEE6A; Wed, 29 Jul 2015 19:27:58 +0000 (UTC) (envelope-from dim@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2001:1900:2254:2068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 38FC91F4D; Wed, 29 Jul 2015 19:27:58 +0000 (UTC) (envelope-from dim@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.70]) by repo.freebsd.org (8.14.9/8.14.9) with ESMTP id t6TJRwLD098371; Wed, 29 Jul 2015 19:27:58 GMT (envelope-from dim@FreeBSD.org) Received: (from dim@localhost) by repo.freebsd.org (8.14.9/8.14.9/Submit) id t6TJRwcH098369; Wed, 29 Jul 2015 19:27:58 GMT (envelope-from dim@FreeBSD.org) Message-Id: <201507291927.t6TJRwcH098369@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: dim set sender to dim@FreeBSD.org using -f From: Dimitry Andric Date: Wed, 29 Jul 2015 19:27:58 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-10@freebsd.org Subject: svn commit: r286034 - stable/10/contrib/llvm/patches X-SVN-Group: stable-10 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-all@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: "SVN commit messages for the entire src tree \(except for " user" and " projects" \)" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 29 Jul 2015 19:27:58 -0000 Author: dim Date: Wed Jul 29 19:27:57 2015 New Revision: 286034 URL: https://svnweb.freebsd.org/changeset/base/286034 Log: Add updated llvm patch corresponding to r286033. Added: stable/10/contrib/llvm/patches/patch-r286033-llvm-r219009-x86-codegen-crash.diff Added: stable/10/contrib/llvm/patches/patch-r286033-llvm-r219009-x86-codegen-crash.diff ============================================================================== --- /dev/null 00:00:00 1970 (empty, because file is newly added) +++ stable/10/contrib/llvm/patches/patch-r286033-llvm-r219009-x86-codegen-crash.diff Wed Jul 29 19:27:57 2015 (r286034) @@ -0,0 +1,217 @@ +Pull in r219009 from upstream llvm trunk (by Adam Nemet): + + [ISel] Keep matching state consistent when folding during X86 address match + + In the X86 backend, matching an address is initiated by the 'addr' complex + pattern and its friends. During this process we may reassociate and-of-shift + into shift-of-and (FoldMaskedShiftToScaledMask) to allow folding of the + shift into the scale of the address. + + However as demonstrated by the testcase, this can trigger CSE of not only the + shift and the AND which the code is prepared for but also the underlying load + node. In the testcase this node is sitting in the RecordedNode and MatchScope + data structures of the matcher and becomes a deleted node upon CSE. Returning + from the complex pattern function, we try to access it again hitting an assert + because the node is no longer a load even though this was checked before. + + Now obviously changing the DAG this late is bending the rules but I think it + makes sense somewhat. Outside of addresses we prefer and-of-shift because it + may lead to smaller immediates (FoldMaskAndShiftToScale is an even better + example because it create a non-canonical node). We currently don't recognize + addresses during DAGCombiner where arguably this canonicalization should be + performed. On the other hand, having this in the matcher allows us to cover + all the cases where an address can be used in an instruction. + + I've also talked a little bit to Dan Gohman on llvm-dev who added the RAUW for + the new shift node in FoldMaskedShiftToScaledMask. This RAUW is responsible + for initiating the recursive CSE on users + (http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-September/076903.html) but it + is not strictly necessary since the shift is hooked into the visited user. Of + course it's safer to keep the DAG consistent at all times (e.g. for accurate + number of uses, etc.). + + So rather than changing the fundamentals, I've decided to continue along the + previous patches and detect the CSE. This patch installs a very targeted + DAGUpdateListener for the duration of a complex-pattern match and updates the + matching state accordingly. (Previous patches used HandleSDNode to detect the + CSE but that's not practical here). The listener is only installed on X86. + + I tested that there is no measurable overhead due to this while running + through the spec2k BC files with llc. The only thing we pay for is the + creation of the listener. The callback never ever triggers in spec2k since + this is a corner case. + + Fixes rdar://problem/18206171 + +This fixes a possible crash in x86 code generation when compiling recent +llvm/clang trunk sources. + +Introduced here: http://svnweb.freebsd.org/changeset/base/286033 + +Index: include/llvm/CodeGen/SelectionDAGISel.h +=================================================================== +--- include/llvm/CodeGen/SelectionDAGISel.h ++++ include/llvm/CodeGen/SelectionDAGISel.h +@@ -238,6 +238,12 @@ class SelectionDAGISel : public MachineFunctionPas + const unsigned char *MatcherTable, + unsigned TableSize); + ++ /// \brief Return true if complex patterns for this target can mutate the ++ /// DAG. ++ virtual bool ComplexPatternFuncMutatesDAG() const { ++ return false; ++ } ++ + private: + + // Calls to these functions are generated by tblgen. +Index: lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp +=================================================================== +--- lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp ++++ lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp +@@ -2345,6 +2345,45 @@ struct MatchScope { + bool HasChainNodesMatched, HasGlueResultNodesMatched; + }; + ++/// \\brief A DAG update listener to keep the matching state ++/// (i.e. RecordedNodes and MatchScope) uptodate if the target is allowed to ++/// change the DAG while matching. X86 addressing mode matcher is an example ++/// for this. ++class MatchStateUpdater : public SelectionDAG::DAGUpdateListener ++{ ++ SmallVectorImpl > &RecordedNodes; ++ SmallVectorImpl &MatchScopes; ++public: ++ MatchStateUpdater(SelectionDAG &DAG, ++ SmallVectorImpl > &RN, ++ SmallVectorImpl &MS) : ++ SelectionDAG::DAGUpdateListener(DAG), ++ RecordedNodes(RN), MatchScopes(MS) { } ++ ++ void NodeDeleted(SDNode *N, SDNode *E) { ++ // Some early-returns here to avoid the search if we deleted the node or ++ // if the update comes from MorphNodeTo (MorphNodeTo is the last thing we ++ // do, so it's unnecessary to update matching state at that point). ++ // Neither of these can occur currently because we only install this ++ // update listener during matching a complex patterns. ++ if (!E || E->isMachineOpcode()) ++ return; ++ // Performing linear search here does not matter because we almost never ++ // run this code. You'd have to have a CSE during complex pattern ++ // matching. ++ for (SmallVectorImpl >::iterator I = ++ RecordedNodes.begin(), IE = RecordedNodes.end(); I != IE; ++I) ++ if (I->first.getNode() == N) ++ I->first.setNode(E); ++ ++ for (SmallVectorImpl::iterator I = MatchScopes.begin(), ++ IE = MatchScopes.end(); I != IE; ++I) ++ for (SmallVector::iterator J = I->NodeStack.begin(), ++ JE = I->NodeStack.end(); J != JE; ++J) ++ if (J->getNode() == N) ++ J->setNode(E); ++ } ++}; + } + + SDNode *SelectionDAGISel:: +@@ -2599,6 +2638,14 @@ SelectCodeCommon(SDNode *NodeToMatch, const unsign + unsigned CPNum = MatcherTable[MatcherIndex++]; + unsigned RecNo = MatcherTable[MatcherIndex++]; + assert(RecNo < RecordedNodes.size() && "Invalid CheckComplexPat"); ++ ++ // If target can modify DAG during matching, keep the matching state ++ // consistent. ++ OwningPtr MSU; ++ if (ComplexPatternFuncMutatesDAG()) ++ MSU.reset(new MatchStateUpdater(*CurDAG, RecordedNodes, ++ MatchScopes)); ++ + if (!CheckComplexPattern(NodeToMatch, RecordedNodes[RecNo].second, + RecordedNodes[RecNo].first, CPNum, + RecordedNodes)) +Index: lib/Target/X86/X86ISelDAGToDAG.cpp +=================================================================== +--- lib/Target/X86/X86ISelDAGToDAG.cpp ++++ lib/Target/X86/X86ISelDAGToDAG.cpp +@@ -290,6 +290,13 @@ namespace { + const X86InstrInfo *getInstrInfo() const { + return getTargetMachine().getInstrInfo(); + } ++ ++ /// \brief Address-mode matching performs shift-of-and to and-of-shift ++ /// reassociation in order to expose more scaled addressing ++ /// opportunities. ++ bool ComplexPatternFuncMutatesDAG() const { ++ return true; ++ } + }; + } + +Index: test/CodeGen/X86/addr-mode-matcher.ll +=================================================================== +--- test/CodeGen/X86/addr-mode-matcher.ll ++++ test/CodeGen/X86/addr-mode-matcher.ll +@@ -0,0 +1,62 @@ ++; RUN: llc < %s | FileCheck %s ++ ++; This testcase used to hit an assert during ISel. For details, see the big ++; comment inside the function. ++ ++; CHECK-LABEL: foo: ++; The AND should be turned into a subreg access. ++; CHECK-NOT: and ++; The shift (leal) should be folded into the scale of the address in the load. ++; CHECK-NOT: leal ++; CHECK: movl {{.*}},4), ++ ++target datalayout = "e-m:o-p:32:32-f64:32:64-f80:128-n8:16:32-S128" ++target triple = "i386-apple-macosx10.6.0" ++ ++define void @foo(i32 %a) { ++bb: ++ br label %bb1692 ++ ++bb1692: ++ %tmp1694 = phi i32 [ 0, %bb ], [ %tmp1745, %bb1692 ] ++ %xor = xor i32 0, %tmp1694 ++ ++; %load1 = (load (and (shl %xor, 2), 1020)) ++ %tmp1701 = shl i32 %xor, 2 ++ %tmp1702 = and i32 %tmp1701, 1020 ++ %tmp1703 = getelementptr inbounds [1028 x i8]* null, i32 0, i32 %tmp1702 ++ %tmp1704 = bitcast i8* %tmp1703 to i32* ++ %load1 = load i32* %tmp1704, align 4 ++ ++; %load2 = (load (shl (and %xor, 255), 2)) ++ %tmp1698 = and i32 %xor, 255 ++ %tmp1706 = shl i32 %tmp1698, 2 ++ %tmp1707 = getelementptr inbounds [1028 x i8]* null, i32 0, i32 %tmp1706 ++ %tmp1708 = bitcast i8* %tmp1707 to i32* ++ %load2 = load i32* %tmp1708, align 4 ++ ++ %tmp1710 = or i32 %load2, %a ++ ++; While matching xor we address-match %load1. The and-of-shift reassocication ++; in address matching transform this into into a shift-of-and and the resuting ++; node becomes identical to %load2. CSE replaces %load1 which leaves its ++; references in MatchScope and RecordedNodes stale. ++ %tmp1711 = xor i32 %load1, %tmp1710 ++ ++ %tmp1744 = getelementptr inbounds [256 x i32]* null, i32 0, i32 %tmp1711 ++ store i32 0, i32* %tmp1744, align 4 ++ %tmp1745 = add i32 %tmp1694, 1 ++ indirectbr i8* undef, [label %bb1756, label %bb1692] ++ ++bb1756: ++ br label %bb2705 ++ ++bb2705: ++ indirectbr i8* undef, [label %bb5721, label %bb5736] ++ ++bb5721: ++ br label %bb2705 ++ ++bb5736: ++ ret void ++}