From: Julien Charbon
Subject: Re: mlx4en, timer irq @100%... (11.0 stuck on high network load ???)
To: Ben RUBSON
Cc: Hans Petter Selasky, FreeBSD Net, hiren, Slawa Olhovchenkov, FreeBSD Stable
Date: Tue, 15 Aug 2017 23:33:36 +0200
Message-ID: <645f2ee3-3eaa-660e-2a64-37d53e88322f@freebsd.org>

Hi Ben,

On 8/11/17 11:32 AM, Ben RUBSON wrote:
>> On 08 Aug 2017, at 13:33, Julien Charbon wrote:
>>
>> On 8/8/17 10:31 AM, Hans Petter Selasky wrote:
>>>
>>> Suggested fix attached.
>>
>> I agree we your conclusion.  Just for the record, more precisely this
>> regression seems to have been introduced with:
>> (...)
>> Thus good catch, and your patch looks good.  I am going to just verify
>> the other in_pcbrele_wlocked() calls in TCP stack.
>
> Julien, do you plan to make this fix reach 11.0-p12 ?

I am checking if your issue is another flavor of the issue fixed by:

https://svnweb.freebsd.org/base?view=revision&revision=307551
https://reviews.freebsd.org/D8211

This fix is not in 11.0 but is in 11.1.  So far I have not found how an
inp in INP_TIMEWAIT state can have been INP_FREED without its tw already
being set to NULL, except through the issue fixed by r307551.  Thus could
you try to apply this patch:

https://github.com/freebsd/freebsd/commit/acb5bfda99b753d9ead3529d04f20087c5f7d0a0.patch

and see if you can still reproduce this issue?

And in the spirit of the r307551 fix, and based on Hans's patch, I will
also propose adding a kernel log message describing the issue instead of
starting an infinite loop when INVARIANTS is not set.

--
Julien
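
For illustration only, here is a small user-space sketch of the "log
instead of looping" idea described in this mail.  It is not the kernel
patch and does not use the real inpcb definitions: the sk_* names, the
flag values and the SK_INVARIANTS switch are all hypothetical stand-ins,
and fprintf() merely stands in for a kernel log call.  The assumption is
that the unexpected state is an inp flagged both time-wait and freed
while its tw pointer is still set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins, NOT the real FreeBSD flags or structures. */
#define SK_TIMEWAIT 0x1u
#define SK_FREED    0x2u

struct sk_tw {
	int unused;
};

struct sk_inp {
	unsigned int flags;
	struct sk_tw *tw;	/* expected to be NULL once the inp is freed */
};

/* True for the unexpected state: time-wait, freed, tw still set. */
static bool
sk_inp_inconsistent(const struct sk_inp *inp)
{
	return ((inp->flags & SK_TIMEWAIT) != 0 &&
	    (inp->flags & SK_FREED) != 0 && inp->tw != NULL);
}

/*
 * Returns true when the caller may keep processing this inp, false
 * when it should give up on it rather than retry forever.
 */
static bool
sk_check_inp(struct sk_inp *inp)
{
	if (!sk_inp_inconsistent(inp))
		return (true);
#ifdef SK_INVARIANTS
	/* Debug-style build (assumption): stop right at the bad state. */
	assert(!"inp freed in time-wait state with tw still set");
#endif
	/* Otherwise: describe the issue once and let the caller move on. */
	fprintf(stderr, "sk_check_inp: inp %p freed with tw %p set, "
	    "flags 0x%x\n", (void *)inp, (void *)inp->tw, inp->flags);
	return (false);
}
```

A caller scanning time-wait entries would test sk_check_inp() and skip
the entry when it returns false, instead of retrying the same entry.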