Subject: Re: svn commit: r301198 - head/sys/dev/xen/netfront
To: Roger Pau Monné, src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org
References: <201606021116.u52BGajD047287@repo.freebsd.org>
From: Colin Percival
Message-ID: <0100015bccba63ab-edf2debb-6781-4d23-b21a-fa25b2a11803-000000@email.amazonses.com>
Date: Wed, 3 May 2017 05:13:39 +0000
In-Reply-To: <201606021116.u52BGajD047287@repo.freebsd.org>

On 06/02/16 04:16, Roger Pau Monné wrote:
> Author: royger
> Date: Thu Jun 2 11:16:35 2016
> New Revision: 301198
> URL: https://svnweb.freebsd.org/changeset/base/301198

I think this commit is responsible for panics I'm seeing in EC2 on T2 family
instances. Every time a DHCP request is made, we call into xn_ifinit_locked
(not sure why -- something to do with making the interface promiscuous?) and
hit this code

> @@ -1760,7 +1715,7 @@ xn_ifinit_locked(struct netfront_info *n
>  		xn_alloc_rx_buffers(rxq);
>  		rxq->ring.sring->rsp_event = rxq->ring.rsp_cons + 1;
>  		if (RING_HAS_UNCONSUMED_RESPONSES(&rxq->ring))
> -			taskqueue_enqueue(rxq->tq, &rxq->intrtask);
> +			xn_rxeof(rxq);
>  		XN_RX_UNLOCK(rxq);
>  	}

but under high traffic volumes I think a separate thread can already be
running in xn_rxeof, having dropped the RX lock while it passes a packet up
the stack. This would result in two different threads trying to process the
same set of responses from the ring, with (unsurprisingly) bad results.

I'm not 100% sure that this is what's causing the panic, but it's definitely
happening under high traffic conditions immediately after xn_ifinit_locked is
called, so I think my speculation is well-founded.
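
The suspected interleaving can be sketched as a toy model. This is only an illustration of the hazard described above, not netfront code: the structure and every function name below (toy_ring, consumer_peek, consumer_deliver, consumer_advance, simulate_overlap) are hypothetical, the two "threads" are replayed deterministically in a single thread, and the sketch assumes the consumer index is only advanced after the lock has been retaken, which has not been confirmed against the driver.

```c
#include <assert.h>

#define RING_SIZE 4

/* Toy stand-in for a shared RX ring; NOT the real netfront structures. */
struct toy_ring {
	unsigned rsp_prod;             /* responses produced by the backend */
	unsigned rsp_cons;             /* responses consumed by the frontend */
	int      delivered[RING_SIZE]; /* times each slot was passed up */
};

/* Step 1 (lock held): pick the next unconsumed response. */
static unsigned
consumer_peek(const struct toy_ring *r)
{
	return (r->rsp_cons);
}

/* Step 2 (lock dropped): pass the packet up the stack. */
static void
consumer_deliver(struct toy_ring *r, unsigned idx)
{
	assert(idx < r->rsp_prod);     /* the response must exist */
	r->delivered[idx % RING_SIZE]++;
}

/* Step 3 (lock retaken): advance the consumer index past the response. */
static void
consumer_advance(struct toy_ring *r, unsigned idx)
{
	r->rsp_cons = idx + 1;
}

/*
 * Replay two overlapping consumers, A and B, and return the number of
 * ring slots that were delivered more than once.
 */
static int
simulate_overlap(void)
{
	struct toy_ring r = { .rsp_prod = 2, .rsp_cons = 0, .delivered = { 0 } };
	unsigned a, b;
	int dup = 0;

	a = consumer_peek(&r);     /* A picks slot 0, then drops the lock */
	b = consumer_peek(&r);     /* B enters and still sees slot 0 */
	consumer_deliver(&r, a);
	consumer_deliver(&r, b);   /* same response processed a second time */
	consumer_advance(&r, a);
	consumer_advance(&r, b);

	for (int i = 0; i < RING_SIZE; i++)
		if (r.delivered[i] > 1)
			dup++;
	return (dup);
}
```

In this replay simulate_overlap() returns 1: slot 0 is handed up twice because the second consumer reads the consumer index before the first has advanced it. Whether the real panic follows this exact sequence is an open question.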

There are a few things I don't understand here:
1. Why DHCP requests are resulting in calls into xn_ifinit_locked.
2. Why the calls into xn_ifinit_locked are only happening on T2 instances
and not on any of the other EC2 instances I've tried.
3. Why xn_ifinit_locked is consuming ring responses.

so I'm not sure what the solution is, but hopefully someone who knows this
code better will be able to help...

-- 
Colin Percival
Security Officer Emeritus, FreeBSD | The power to serve
Founder, Tarsnap | www.tarsnap.com | Online backups for the truly paranoid