From owner-freebsd-net@FreeBSD.ORG  Sun Mar  3 10:14:47 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id 88236A45
 for <freebsd-net@freebsd.org>; Sun,  3 Mar 2013 10:14:47 +0000 (UTC)
 (envelope-from sepherosa@gmail.com)
Received: from mail-we0-x22a.google.com (mail-we0-x22a.google.com
 [IPv6:2a00:1450:400c:c03::22a])
 by mx1.freebsd.org (Postfix) with ESMTP id 178D0DC9
 for <freebsd-net@freebsd.org>; Sun,  3 Mar 2013 10:14:46 +0000 (UTC)
Received: by mail-we0-f170.google.com with SMTP id z53so3839489wey.1
 for <freebsd-net@freebsd.org>; Sun, 03 Mar 2013 02:14:45 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:x-received:in-reply-to:references:date:message-id
 :subject:from:to:cc:content-type;
 bh=ZcSKMS+LANeREY4ze6TFG9fc0w7myZ9+Bf8VLNFyTo4=;
 b=NjmCEWcUIirzZ+sNJDI0KOynYIkJ2KygO40CLbIYckqZPIn1nNyjj5wQ4cXWqPdO/M
 jOgWnGwvhoip+AO3Myre8yblzLK4LKXEGtnrX366B77x9Jl7gJyLa1SaMHgpQUwiT35p
 iKuo44GqQFrOasZ8ZyxMDIrZA13P3f8Wq0aIHSPf8rh3ikhhES8Mi/gOWS2cTz+uOFYg
 vffFzNa6rTyTV6F+xaPxOZ+3EH209xpYeHqTUWRBD/x5PT3+MxEG+sayuv9r8LKx9TWL
 /wTYwLekWFL6RGHOIEAIszcY8PQFgOx/IB8Q/D16wmi5moYAL6Uv959KDjFe7gFJJ2Ww
 E06A==
MIME-Version: 1.0
X-Received: by 10.180.185.44 with SMTP id ez12mr5516119wic.33.1362305685002;
 Sun, 03 Mar 2013 02:14:45 -0800 (PST)
Received: by 10.194.89.170 with HTTP; Sun, 3 Mar 2013 02:14:44 -0800 (PST)
In-Reply-To: <CAKOb=YYRu94CRC8Fd1TrWezHig6Od_uNpO2f+tCBQTBNQVjtog@mail.gmail.com>
References: <512BAA60.3060703@biostat.wisc.edu>
 <CAFOYbckDFJKRip+e=a+_JPHhk+HbAikRBK0dHEBDDEgdsZT6sw@mail.gmail.com>
 <512BAF8D.7080308@biostat.wisc.edu>
 <CAFOYbcnEN=Pzd9k4hvR+wqP3_HJj3-QRQSwocfHDSehUH5YPXA@mail.gmail.com>
 <CAKOb=YYyJZyKzpEBT+o-Vmn7dedRfVW+wVh1KVM7oaWT63+qBg@mail.gmail.com>
 <CAKOb=YYRu94CRC8Fd1TrWezHig6Od_uNpO2f+tCBQTBNQVjtog@mail.gmail.com>
Date: Sun, 3 Mar 2013 18:14:44 +0800
Message-ID: <CAMOc5cz+knVK=skEz1z=WNAjd5mL3DeOVBasHnJ6ggsNtiQdbA@mail.gmail.com>
Subject: Re: igb network lockups
From: Sepherosa Ziehau <sepherosa@gmail.com>
To: Nick Rogers <ncrogers@gmail.com>
Content-Type: text/plain; charset=ISO-8859-1
Cc: "freebsd-net@freebsd.org" <freebsd-net@freebsd.org>,
 Jack Vogel <jfvogel@gmail.com>,
 "Christopher D. Harrison" <harrison@biostat.wisc.edu>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 03 Mar 2013 10:14:47 -0000

On Sat, Mar 2, 2013 at 12:18 AM, Nick Rogers <ncrogers@gmail.com> wrote:
> On Fri, Mar 1, 2013 at 8:04 AM, Nick Rogers <ncrogers@gmail.com> wrote:
>> FWIW I have been experiencing a similar issue on a number of systems
>> using the em(4) driver under 9.1-RELEASE. This is after upgrading from
>> a snapshot of 8.3-STABLE. My systems use PF+ALTQ as well. The symptoms
>> are: interface stops passing traffic until the system is rebooted. I
>> have not yet been able to gain access to the systems to dig around
>> (after they have crashed), however my kernel/network settings are
>> properly tuned (high mbuf limit, hw.em.rxd/txd=4096, etc). It seems to
>> happen about once a day on systems with around a sustained 50Mb/s of
>> traffic.
>>
>> I realize this is not much to go on but perhaps it helps. I am
>> debating trying the e1000 driver in the latest CURRENT on top of
>> 9.1-RELEASE. I noticed the Intel shared code was updated about a week
>> ago. Would this change or perhaps another change to e1000 since
>> 9.1-RELEASE possibly affect stability in a positive way?
>>
>> Thanks.
>
> Here's the relevant pciconf output:
>
> em0@pci0:1:0:0: class=0x020000 card=0x10d315d9 chip=0x10d38086 rev=0x00 hdr=0x00
>     vendor     = 'Intel Corporation'
>     device     = '82574L Gigabit Network Connection'
>     class      = network
>     subclass   = ethernet
>     cap 01[c8] = powerspec 2  supports D0 D3  current D0
>     cap 05[d0] = MSI supports 1 message, 64 bit
>     cap 10[e0] = PCI-Express 1 endpoint max data 128(256) link x1(x1)
>     cap 11[a0] = MSI-X supports 5 messages in map 0x1c enabled
> ecap 0001[100] = AER 1 0 fatal 0 non-fatal 1 corrected

For the 82574L, i.e. the part supported by em(4), MSI-X must _not_ be
enabled; it is simply broken (you can check the 82574 errata on Intel's
website to confirm this).

For the 82575, i.e. the part supported by igb(4), MSI-X must _not_ be
enabled either; it is simply broken (you can check the 82575 errata on
Intel's website to confirm this).
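
As a rough sketch of how one might check which devices currently report
MSI-X as enabled, the shell function below filters `pciconf -lc` output
of the kind quoted above. The function name is made up for illustration,
and the loader tunables named in the comments (hw.em.enable_msix and
hw.igb.enable_msix) are an assumption about the relevant knobs; verify
them against the driver source or man page for your release before
relying on them.

```shell
#!/bin/sh
# Hypothetical helper: print the name of every device whose MSI-X
# capability line ends up flagged "enabled" in `pciconf -lc` output.
# Reads the pciconf text on stdin.
msix_enabled_devices() {
    awk '
        # Device header, e.g. "em0@pci0:1:0:0: class=0x020000 ..."
        /^[a-z]+[0-9]+@pci/ { split($1, a, "@"); dev = a[1] }
        # Capability line, e.g. "cap 11[a0] = MSI-X supports 5 messages ... enabled"
        /MSI-X/ && /enabled/ { print dev }
    '
}

# On a live system (assumes pciconf is available and run as root):
#   pciconf -lc | msix_enabled_devices
#
# Assumed /boot/loader.conf settings to turn MSI-X off (unverified here,
# takes effect after a reboot):
#   hw.em.enable_msix=0
#   hw.igb.enable_msix=0
```

Run against the pciconf output quoted above, this would list em0, since
its MSI-X capability line ends in "enabled".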

Best Regards,
sephe

> em1@pci0:2:0:0: class=0x020000 card=0x10d315d9 chip=0x10d38086 rev=0x00 hdr=0x00
>     vendor     = 'Intel Corporation'
>     device     = '82574L Gigabit Network Connection'
>     class      = network
>     subclass   = ethernet
>     cap 01[c8] = powerspec 2  supports D0 D3  current D0
>     cap 05[d0] = MSI supports 1 message, 64 bit
>     cap 10[e0] = PCI-Express 1 endpoint max data 128(256) link x1(x1)
>     cap 11[a0] = MSI-X supports 5 messages in map 0x1c enabled
> ecap 0001[100] = AER 1 0 fatal 0 non-fatal 1 corrected
> em2@pci0:7:0:0: class=0x020000 card=0x10d315d9 chip=0x10d38086 rev=0x00 hdr=0x00
>     vendor     = 'Intel Corporation'
>     device     = '82574L Gigabit Network Connection'
>     class      = network
>     subclass   = ethernet
>     cap 01[c8] = powerspec 2  supports D0 D3  current D0
>     cap 05[d0] = MSI supports 1 message, 64 bit
>     cap 10[e0] = PCI-Express 1 endpoint max data 128(256) link x1(x1)
>     cap 11[a0] = MSI-X supports 5 messages in map 0x1c enabled
> ecap 0001[100] = AER 1 0 fatal 0 non-fatal 1 corrected
> em3@pci0:8:0:0: class=0x020000 card=0x10d315d9 chip=0x10d38086 rev=0x00 hdr=0x00
>     vendor     = 'Intel Corporation'
>     device     = '82574L Gigabit Network Connection'
>     class      = network
>     subclass   = ethernet
>     cap 01[c8] = powerspec 2  supports D0 D3  current D0
>     cap 05[d0] = MSI supports 1 message, 64 bit
>     cap 10[e0] = PCI-Express 1 endpoint max data 128(256) link x1(x1)
>     cap 11[a0] = MSI-X supports 5 messages in map 0x1c enabled
> ecap 0001[100] = AER 1 0 fatal 0 non-fatal 1 corrected
>
>
>>
>> On Mon, Feb 25, 2013 at 10:45 AM, Jack Vogel <jfvogel@gmail.com> wrote:
>>> Have you done any poking around, looking at stats to determine why the
>>> hangs? For instance, might your mbuf pool be depleted? Some other
>>> network resource perhaps?
>>>
>>> Jack
>>>
>>>
>>> On Mon, Feb 25, 2013 at 10:38 AM, Christopher D. Harrison <
>>> harrison@biostat.wisc.edu> wrote:
>>>
>>>>  Sure,
>>>> The problem appears on both systems running with ALTQ and vanilla.
>>>>     -C
>>>>
>>>> On 02/25/13 12:29, Jack Vogel wrote:
>>>>
>>>> I've not heard of this problem, but I think most users do not use ALTQ,
>>>> and we (Intel) do not test using it. Can it be eliminated from the
>>>> equation?
>>>>
>>>> Jack
>>>>
>>>>
>>>> On Mon, Feb 25, 2013 at 10:16 AM, Christopher D. Harrison <
>>>> harrison@biostat.wisc.edu> wrote:
>>>>
>>>>> I recently have been experiencing network "freezes" and network "lockups"
>>>>> on our FreeBSD 9.1 systems, which are running ZFS and NFS file servers.
>>>>> I upgraded from 9.0 to 9.1 about 2 months ago and we have been having
>>>>> issues almost bi-monthly.   The issue manifests as the system becoming
>>>>> unresponsive to any/all NFS clients.   The system is not resource bound, as
>>>>> our disk I/O is low and our network is usually in the 20mbit/40mbit
>>>>> range.   We do notice a correlation between temporary I/O spikes and
>>>>> network freezes, but not enough to send our system into "lockup" mode for
>>>>> the next 5min.   Currently we have 4 igb NICs in 2 aggrs with 8 queues
>>>>> per NIC and our dev.igb reports:
>>>>>
>>>>> dev.igb.3.%desc: Intel(R) PRO/1000 Network Connection version - 2.3.4
>>>>>
>>>>> I am almost certain the problem is with the igb driver as a friend is
>>>>> also experiencing the same problem with the same Intel igb NIC.   He has
>>>>> addressed the issue by restarting the network using netif on his systems.
>>>>> According to my friend, once the network interfaces get cleared, everything
>>>>> comes back and starts working as expected.
>>>>>
>>>>> I have noticed an issue with the igb driver and I was looking for
>>>>> thoughts on how to help address this problem.
>>>>>
>>>>> http://freebsd.1045724.n5.nabble.com/em-igb-if-transmit-drbr-and-ALTQ-td5760338.html
>>>>>
>>>>> Thoughts/Ideas are greatly appreciated!!!
>>>>>
>>>>>     -C
>>>>>
>>>>> _______________________________________________
>>>>> freebsd-net@freebsd.org mailing list
>>>>> http://lists.freebsd.org/mailman/listinfo/freebsd-net
>>>>> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
>>>>>
>>>>
>>>>
>>>>


--
Tomorrow Will Never Die

From owner-freebsd-net@FreeBSD.ORG  Sun Mar  3 15:20:53 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id D040EA0B
 for <freebsd-net@freebsd.org>; Sun,  3 Mar 2013 15:20:53 +0000 (UTC)
 (envelope-from pawel.worach@gmail.com)
Received: from mail-la0-x229.google.com (mail-la0-x229.google.com
 [IPv6:2a00:1450:4010:c03::229])
 by mx1.freebsd.org (Postfix) with ESMTP id 4F5A6B61
 for <freebsd-net@freebsd.org>; Sun,  3 Mar 2013 15:20:53 +0000 (UTC)
Received: by mail-la0-f41.google.com with SMTP id fo12so4281115lab.14
 for <freebsd-net@freebsd.org>; Sun, 03 Mar 2013 07:20:51 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=x-received:message-id:date:from:user-agent:mime-version:to:subject
 :content-type:content-transfer-encoding;
 bh=hpLalTKDGz/GLk4+w9mlrfqY4aJchhfOG0teBmUnrwE=;
 b=rjHmshf9bIF7OjShNmweU0OgAcCJL38ZhJTbt3KwPduGq39NgBzs6Iy5eeg7kwzfjH
 U6XakKlPDSsxFACdYuR9eVDdUi6D8qe5GTNeycKKRUH3mYTl1y3CJUtWstvL3PJ+tKC5
 kblDDABmAcu995viuuTePHklN42vd5hHtD5tZwf3ne2jKm9kv+Mrt4zl0/XkncA7NzRk
 s5Rf+4jXdwkdYTTffmyPUGIdcrCzL3RpRHs7M6SWEo6c5TQrMx3OSX+jiYC+QnC0+Qd+
 LVlRbzyOLHYD8+ZpVV1+DgzDdMgpIdfDGC0jwMpbja0DPXTtvVR1CF5nZr+3WlPG9Pd+
 EJGQ==
X-Received: by 10.152.131.233 with SMTP id op9mr15142771lab.3.1362323696254;
 Sun, 03 Mar 2013 07:14:56 -0800 (PST)
Received: from one.local ([2001:16d8:ffce:0:5586:8f94:1e64:77f6])
 by mx.google.com with ESMTPS id j2sm6282510lbd.16.2013.03.03.07.14.54
 (version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128);
 Sun, 03 Mar 2013 07:14:55 -0800 (PST)
Message-ID: <513368EE.9090802@gmail.com>
Date: Sun, 03 Mar 2013 16:14:54 +0100
From: Pawel Worach <pawel.worach@gmail.com>
User-Agent: Mozilla/5.0 (X11; FreeBSD amd64;
 rv:17.0) Gecko/20130222 Thunderbird/17.0.3
MIME-Version: 1.0
To: freebsd-net@freebsd.org
Subject: ipfw NAT, keepalive from wrong source
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 03 Mar 2013 15:20:53 -0000

Hi,

In the scenario below ipfw seems to be sending the keep-alive packets
from the wrong source address if the traffic is NATed: on the external
interface the packet is sent to the server with the original source. Did
I configure my ipfw rules incorrectly? I'm using in-kernel NAT on
FreeBSD 9-STABLE r247666 with r247626 merged from head (that patch did
not change the behavior).

Internal client (172.16.0.31) connects to an external ssh server 
(192.0.2.100) with hide-nat behind a.b.c.d.

tcpdump on outside interface (the second packet is likely the keepalive
ACK the client sent as a result of the keepalive the ipfw gateway sent on
the inside, which got forwarded on to the server; is that intentional?):
15:36:28.075529 IP 172.16.0.31.41731 > 192.0.2.100.22: Flags [.], ack 2804620200, win 0, length 0
15:36:28.076823 IP a.b.c.d.41731 > 192.0.2.100.22: Flags [.], ack 2625, win 1040, options [nop,nop,TS val 151519866 ecr 3275697134], length 0
15:36:33.075499 IP 172.16.0.31.41731 > 192.0.2.100.22: Flags [.], ack 1, win 0, length 0
15:36:38.075497 IP 172.16.0.31.41731 > 192.0.2.100.22: Flags [.], ack 1, win 0, length 0
15:36:43.075519 IP 172.16.0.31.41731 > 192.0.2.100.22: Flags [.], ack 1, win 0, length 0

tcpdump on inside interface:
15:36:28.078015 IP 192.0.2.100.22 > 172.16.0.31.41731: Flags [.], ack 517940233, win 0, length 0
15:36:28.078040 IP 172.16.0.31.41731 > 192.0.2.100.22: Flags [.], ack 1, win 1040, options [nop,nop,TS val 151519866 ecr 3275697134], length 0

State table (the keepalives were sent at about 20-19 seconds before
expiration):
03600     27      7867 (22s) STATE tcp 172.16.0.31 41731 <-> 192.0.2.100 22
03600     27      7867 (21s) STATE tcp 172.16.0.31 41731 <-> 192.0.2.100 22
03600     27      7867 (20s) STATE tcp 172.16.0.31 41731 <-> 192.0.2.100 22
03600     27      7867 (19s) STATE tcp 172.16.0.31 41731 <-> 192.0.2.100 22
03600     28      7919 (18s) STATE tcp 172.16.0.31 41731 <-> 192.0.2.100 22
03600     28      7919 (17s) STATE tcp 172.16.0.31 41731 <-> 192.0.2.100 22
03600     28      7919 (16s) STATE tcp 172.16.0.31 41731 <-> 192.0.2.100 22
03600     28      7919 (15s) STATE tcp 172.16.0.31 41731 <-> 192.0.2.100 22
03600     28      7919 (14s) STATE tcp 172.16.0.31 41731 <-> 192.0.2.100 22
.. continues to 1 and disappears ..

Rules (em0 is the external interface):
${fwcmd} nat 10 config if em0 log same_ports unreg_only
${fwcmd} add nat 10 all from 172.16.0.0/12 to any via em0
${fwcmd} add nat 10 all from not 172.16.0.0/12 any to me via em0
${fwcmd} add allow tcp from 172.16.0.0/12 to any established
${fwcmd} add allow tcp from 172.16.0.0/12 to any setup keep-state # (this is rule 03600)
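
One quick way to spot packets leaving the external interface without
being translated is to filter the outside capture for private source
addresses. The sketch below does that on saved tcpdump text output; the
function name leaked_sources is hypothetical, and the 172.16.0.0/12
check is hard-coded to match the rules above.

```shell
#!/bin/sh
# Hypothetical helper: print tcpdump text lines whose IPv4 source
# address falls inside 172.16.0.0/12 (172.16.x.x through 172.31.x.x).
# Reads tcpdump output on stdin, one packet per line.
leaked_sources() {
    awk '$2 == "IP" {
        # $3 looks like "172.16.0.31.41731" (address plus source port)
        split($3, o, ".")
        if (o[1] == 172 && o[2] >= 16 && o[2] <= 31) print
    }'
}

# Example use on a saved capture (file name is illustrative):
#   leaked_sources < outside.txt
```

On a live system the capture filter `tcpdump -ni em0 src net
172.16.0.0/12` should show the same packets directly.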

Regards
Pawel

From owner-freebsd-net@FreeBSD.ORG  Mon Mar  4 11:06:46 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id 6BFFFEB6
 for <freebsd-net@FreeBSD.org>; Mon,  4 Mar 2013 11:06:46 +0000 (UTC)
 (envelope-from owner-bugmaster@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org
 [IPv6:2001:1900:2254:206c::16:87])
 by mx1.freebsd.org (Postfix) with ESMTP id 5D2D9E55
 for <freebsd-net@FreeBSD.org>; Mon,  4 Mar 2013 11:06:46 +0000 (UTC)
Received: from freefall.freebsd.org (localhost [127.0.0.1])
 by freefall.freebsd.org (8.14.6/8.14.6) with ESMTP id r24B6kiU038832
 for <freebsd-net@FreeBSD.org>; Mon, 4 Mar 2013 11:06:46 GMT
 (envelope-from owner-bugmaster@FreeBSD.org)
Received: (from gnats@localhost)
 by freefall.freebsd.org (8.14.6/8.14.6/Submit) id r24B6k00038830
 for freebsd-net@FreeBSD.org; Mon, 4 Mar 2013 11:06:46 GMT
 (envelope-from owner-bugmaster@FreeBSD.org)
Date: Mon, 4 Mar 2013 11:06:46 GMT
Message-Id: <201303041106.r24B6k00038830@freefall.freebsd.org>
X-Authentication-Warning: freefall.freebsd.org: gnats set sender to
 owner-bugmaster@FreeBSD.org using -f
From: FreeBSD bugmaster <bugmaster@freebsd.org>
To: freebsd-net@FreeBSD.org
Subject: Current problem reports assigned to freebsd-net@FreeBSD.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 04 Mar 2013 11:06:46 -0000

Note: to view an individual PR, use:
  http://www.freebsd.org/cgi/query-pr.cgi?pr=(number).
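
As a small illustration of the URL form above, the following shell
function builds the link for a tracker entry from the listing below. The
function name pr_url is made up for this sketch, and stripping the
category prefix (such as "kern/") is an assumption based on the
"(number)" placeholder in the note.

```shell
#!/bin/sh
# Hypothetical helper: turn a tracker entry such as "kern/176596"
# (or a bare number) into the query-pr URL given in the note above.
pr_url() {
    # ${1##*/} drops everything up to and including the last "/".
    printf 'http://www.freebsd.org/cgi/query-pr.cgi?pr=%s\n' "${1##*/}"
}

pr_url kern/176596
```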

The following is a listing of current problems submitted by FreeBSD users.
These represent problem reports covering all versions including
experimental development code and obsolete releases.


S Tracker      Resp.      Description
--------------------------------------------------------------------------------
o kern/176596  net        [firewire] [ip6] Crash with IPv6 and Firewire
o kern/176510  net        [udp] [panic] Kernel Panic in udp_input @ offset 0x475
o kern/176446  net        [netinet] [patch] Concurrency in ixgbe driving out-of-
o kern/176420  net        [kernel] [patch] incorrect errno for LOCAL_PEERCRED
o kern/176419  net        [kernel] [patch] socketpair support for LOCAL_PEERCRED
o kern/176401  net        [netgraph] page fault  in netgraph
o kern/176167  net        [ipsec][lagg] using lagg and ipsec causes immediate pa
o kern/176097  net        [lagg] [patch] lagg/lacp broken when aggregated interf
o kern/176027  net        [em] [patch] flow control systcl consistency for em dr
o kern/176026  net        [tcp] [patch] TCP wrappers caused quite a lot of warni
o bin/175974   net        ppp(8): logic issue
o kern/175864  net        [re] Intel MB D510MO, onboard ethernet not working aft
o kern/175852  net        [amd64] [patch] in_cksum_hdr() behaves differently on 
o kern/175734  net        no ethernet detected on system with EG20T PCH chipset 
o kern/175267  net        [pf] [tap] pf + tap keep state problem
o kern/175236  net        [epair] [gif] epair and gif Devices On Bridge
o kern/175182  net        [panic] kernel panic on RADIX_MPATH when deleting rout
o kern/175153  net        [tcp] will there miss a FIN when do TSO?
o kern/174959  net        [net] [patch] rnh_walktree_from visits spurious nodes
o kern/174958  net        [net] [patch] rnh_walktree_from makes unreasonable ass
o kern/174897  net        [route] Interface routes are broken
o kern/174851  net        [bxe] [patch] UDP checksum offload is wrong in bxe dri
o kern/174850  net        [bxe] [patch] bxe driver does not receive multicasts
o kern/174849  net        [bxe] [patch] bxe driver can hang kernel when reset
o kern/174822  net        [tcp] Page fault in tcp_discardcb under high traffic
o kern/174602  net        [gif] [ipsec] traceroute issue on gif tunnel with ipse
o kern/174535  net        [tcp] TCP fast retransmit feature works strange
o kern/173475  net        [tun] tun(4) stays opened by PID after process is term
o kern/173201  net        [ixgbe] [patch] Missing / broken ixgbe sysctl's and tu
o kern/173137  net        [em] em(4) unable to run at gigabit with 9.1-RC2
o kern/173002  net        [patch] data type size problem in if_spppsubr.c
o kern/172985  net        [patch] [ip6] lltable leak when adding and removing IP
o kern/172895  net        [ixgb] [ixgbe] do not properly determine link-state
o kern/172683  net        [ip6] Duplicate IPv6 Link Local Addresses
o kern/172675  net        [netinet] [patch] sysctl_tcp_hc_list (net.inet.tcp.hos
o kern/172113  net        [panic] [e1000] [patch] 9.1-RC1/amd64 panices in igb(4
o kern/171840  net        [ip6] IPv6 packets transmitting only on queue 0
o kern/171739  net        [bce] [panic] bce related kernel panic
o kern/171711  net        [dummynet] [panic] Kernel panic in dummynet
o kern/171532  net        [ndis] ndis(4) driver includes 'pccard'-specific code,
o kern/171531  net        [ndis] undocumented dependency for ndis(4)
o kern/171524  net        [ipmi] ipmi driver crashes kernel by reboot or shutdow
s kern/171508  net        [epair] [request] Add the ability to name epair device
o kern/171228  net        [re] [patch] if_re - eeprom write issues
o kern/170701  net        [ppp] killl ppp or reboot with active ppp connection c
o kern/170267  net        [ixgbe] IXGBE_LE32_TO_CPUS is probably an unintentiona
o kern/170081  net        [fxp] pf/nat/jails not working if checksum offloading 
o kern/169898  net        ifconfig(8) fails to set MTU on multiple interfaces.
o kern/169676  net        [bge] [hang] system hangs, fully or partially after re
o kern/169664  net        [bgp] Wrongful replacement of interface connected net 
o kern/169620  net        [ng] [pf] ng_l2tp incoming packet bypass pf firewall
o kern/169459  net        [ppp] umodem/ppp/3g stopped working after update from 
o kern/169438  net        [ipsec] ipv4-in-ipv6 tunnel mode IPsec does not work
p kern/168294  net        [ixgbe] [patch] ixgbe driver compiled in kernel has no
o kern/168246  net        [em] Multiple em(4) not working with qemu
o kern/168245  net        [arp] [regression] Permanent ARP entry not deleted on 
o kern/168244  net        [arp] [regression] Unable to manually remove permanent
o kern/168183  net        [bce] bce driver hang system
o kern/167947  net        [setfib] [patch] arpresolve checks only the default FI
o kern/167603  net        [ip] IP fragment reassembly's broken: file transfer ov
o kern/167500  net        [em] [panic] Kernel panics in em driver
o kern/167325  net        [netinet] [patch] sosend sometimes return EINVAL with 
o kern/167202  net        [igmp]: Sending multiple IGMP packets crashes kernel
o kern/167059  net        [tcp] [panic] System does panic in in_pcbbind() and ha
o kern/166940  net        [ipfilter] [panic] Double fault in kern 8.2
o kern/166462  net        [gre] gre(4) when using a tunnel source address from c
o kern/166372  net        [patch] ipfilter drops UDP packets with zero checksum 
o kern/166285  net        [arp] FreeBSD v8.1 REL p8 arp: unknown hardware addres
o kern/166255  net        [net] [patch] It should be possible to disable "promis
o kern/165963  net        [panic] [ipf] ipfilter/nat NULL pointer deference
o kern/165903  net        mbuf leak
o kern/165643  net        [net] [patch] Missing vnet restores in net/if_ethersub
o kern/165622  net        [ndis][panic][patch] Unregistered use of FPU in kernel
s kern/165562  net        [request] add support for Intel i350 in FreeBSD 7.4
o kern/165526  net        [bxe] UDP packets checksum calculation whithin if_bxe 
o kern/165488  net        [ppp] [panic] Fatal trap 12 jails and ppp , kernel wit
o kern/165305  net        [ip6] [request] Feature parity between IP_TOS and IPV6
o kern/165296  net        [vlan] [patch] Fix EVL_APPLY_VLID, update EVL_APPLY_PR
o kern/165181  net        [igb] igb freezes after about 2 weeks of uptime
o kern/165174  net        [patch] [tap] allow tap(4) to keep its address on clos
o kern/165152  net        [ip6] Does not work through the issue of ipv6 addresse
o kern/164495  net        [igb] connect double head igb to switch cause system t
o kern/164490  net        [pfil] Incorrect IP checksum on pfil pass from ip_outp
o kern/164475  net        [gre] gre misses RUNNING flag after a reboot
o kern/164265  net        [netinet] [patch] tcp_lro_rx computes wrong checksum i
o kern/163903  net        [igb] "igb0:tx(0)","bpf interface lock" v2.2.5 9-STABL
o kern/163481  net        freebsd do not add itself to ping route packet
o kern/162927  net        [tun] Modem-PPP error ppp[1538]: tun0: Phase: Clearing
o kern/162926  net        [ipfilter] Infinite loop in ipfilter with fragmented I
o kern/162558  net        [dummynet] [panic] seldom dummynet panics
o kern/162153  net        [em] intel em driver 7.2.4 don't compile
o kern/162110  net        [igb] [panic] RELENG_9 panics on boot in IGB driver - 
o kern/162028  net        [ixgbe] [patch] misplaced #endif in ixgbe.c
o kern/161277  net        [em] [patch] BMC cannot receive IPMI traffic after loa
o kern/160873  net        [igb] igb(4) from HEAD fails to build on 7-STABLE
o kern/160750  net        Intel PRO/1000 connection breaks under load until rebo
o kern/160693  net        [gif] [em] Multicast packet are not passed from GIF0 t
o kern/160293  net        [ieee80211] ppanic] kernel panic during network setup 
o kern/160206  net        [gif] gifX stops working after a while (IPv6 tunnel)
o kern/159817  net        [udp] write UDPv4: No buffer space available (code=55)
o kern/159629  net        [ipsec] [panic] kernel panic with IPsec in transport m
o kern/159621  net        [tcp] [panic] panic: soabort: so_count
o kern/159603  net        [netinet] [patch] in_ifscrubprefix() - network route c
o kern/159601  net        [netinet] [patch] in_scrubprefix() - loopback route re
o kern/159294  net        [em] em watchdog timeouts
o kern/159203  net        [wpi] Intel 3945ABG Wireless LAN not support IBSS
o kern/158930  net        [bpf] BPF element leak in ifp->bpf_if->bif_dlist
o kern/158726  net        [ip6] [patch] ICMPv6 Router Announcement flooding limi
o kern/158694  net        [ix] [lagg] ix0 is not working within lagg(4)
o kern/158665  net        [ip6] [panic] kernel pagefault in in6_setscope()
o kern/158635  net        [em] TSO breaks BPF packet captures with em driver
f kern/157802  net        [dummynet] [panic] kernel panic in dummynet
o kern/157785  net        amd64 + jail + ipfw + natd = very slow outbound traffi
o kern/157418  net        [em] em driver lockup during boot on Supermicro X9SCM-
o kern/157410  net        [ip6] IPv6 Router Advertisements Cause Excessive CPU U
o kern/157287  net        [re] [panic] INVARIANTS panic (Memory modified after f
o kern/157209  net        [ip6] [patch] locking error in rip6_input() (sys/netin
o kern/157200  net        [network.subr] [patch] stf(4) can not communicate betw
o kern/157182  net        [lagg] lagg interface not working together with epair 
o kern/156877  net        [dummynet] [panic] dummynet move_pkt() null ptr derefe
o kern/156667  net        [em] em0 fails to init on CURRENT after March 17
o kern/156408  net        [vlan] Routing failure when using VLANs vs. Physical e
o kern/156328  net        [icmp]: host can ping other subnet but no have IP from
o kern/156317  net        [ip6] Wrong order of IPv6 NS DAD/MLD Report
o kern/156283  net        [ip6] [patch] nd6_ns_input - rtalloc_mpath does not re
o kern/156279  net        [if_bridge][divert][ipfw] unable to correctly re-injec
o kern/156226  net        [lagg]: failover does not announce the failover to swi
o kern/156030  net        [ip6] [panic] Crash in nd6_dad_start() due to null ptr
o kern/155772  net        ifconfig(8): ioctl (SIOCAIFADDR): File exists on direc
o kern/155680  net        [multicast] problems with multicast
s kern/155642  net        [new driver] [request] Add driver for Realtek RTL8191S
o kern/155597  net        [panic] Kernel panics with "sbdrop" message
o kern/155420  net        [vlan] adding vlan break existent vlan
o kern/155177  net        [route] [panic] Panic when inject routes in kernel
p kern/155030  net        [igb] igb(4) DEVICE_POLLING does not work with carp(4)
o kern/155010  net        [msk] ntfs-3g via iscsi using msk driver cause kernel 
o kern/154943  net        [gif] ifconfig gifX create on existing gifX clears IP
s kern/154851  net        [new driver] [request]: Port brcm80211 driver from Lin
o kern/154850  net        [netgraph] [patch] ng_ether fails to name nodes when t
o kern/154679  net        [em] Fatal trap 12: "em1 taskq" only at startup (8.1-R
o kern/154600  net        [tcp] [panic] Random kernel panics on tcp_output
o kern/154557  net        [tcp] Freeze tcp-session of the clients, if in the gat
o kern/154443  net        [if_bridge] Kernel module bridgestp.ko missing after u
o kern/154286  net        [netgraph] [panic] 8.2-PRERELEASE panic in netgraph
o kern/154255  net        [nfs] NFS not responding
o kern/154214  net        [stf] [panic] Panic when creating stf interface
o kern/154185  net        race condition in mb_dupcl
o kern/154169  net        [multicast] [ip6] Node Information Query multicast add
o kern/154134  net        [ip6] stuck kernel state in LISTEN on ipv6 daemon whic
o kern/154091  net        [netgraph] [panic] netgraph, unaligned mbuf?
o conf/154062  net        [vlan] [patch] change to way of auto-generatation of v
o kern/153937  net        [ral] ralink panics the system (amd64 freeBSDD 8.X) wh
o kern/153936  net        [ixgbe] [patch] MPRC workaround incorrectly applied to
o kern/153816  net        [ixgbe] ixgbe doesn't work properly with the Intel 10g
o kern/153772  net        [ixgbe] [patch] sysctls reference wrong XON/XOFF varia
o kern/153497  net        [netgraph] netgraph panic due to race conditions
o kern/153454  net        [patch] [wlan] [urtw] Support ad-hoc and hostap modes 
o kern/153308  net        [em] em interface use 100% cpu
o kern/153244  net        [em] em(4) fails to send UDP to port 0xffff
o kern/152893  net        [netgraph] [panic] 8.2-PRERELEASE panic in netgraph
o kern/152853  net        [em] tftpd (and likely other udp traffic) fails over e
o kern/152828  net        [em] poor performance on 8.1, 8.2-PRE
o kern/152569  net        [net]: Multiple ppp connections and routing table prob
o kern/152235  net        [arp] Permanent local ARP entries are not properly upd
o kern/152141  net        [vlan] [patch] encapsulate vlan in ng_ether before out
o kern/152036  net        [libc] getifaddrs(3) returns truncated sockaddrs for n
o kern/151690  net        [ep] network connectivity won't work until dhclient is
o kern/151681  net        [nfs] NFS mount via IPv6 leads to hang on client with 
o kern/151593  net        [igb] [panic] Kernel panic when bringing up igb networ
o kern/150920  net        [ixgbe][igb] Panic when packets are dropped with heade
o kern/150557  net        [igb] igb0: Watchdog timeout -- resetting
o kern/150251  net        [patch] [ixgbe] Late cable insertion broken
o kern/150249  net        [ixgbe] Media type detection broken
o bin/150224   net        ppp(8) does not reassign static IP after kill -KILL co
f kern/149969  net        [wlan] [ral] ralink rt2661 fails to maintain connectio
o kern/149937  net        [ipfilter] [patch] kernel panic in ipfilter IP fragmen
o kern/149643  net        [rum] device not sending proper beacon frames in ap mo
o kern/149609  net        [panic] reboot after adding second default route
o kern/149117  net        [inet] [patch] in_pcbbind: redundant test
o kern/149086  net        [multicast] Generic multicast join failure in 8.1
o kern/148018  net        [flowtable] flowtable crashes on ia64
o kern/147912  net        [boot] FreeBSD 8 Beta won't boot on Thinkpad i1300  11
o kern/147894  net        [ipsec] IPv6-in-IPv4 does not work inside an ESP-only 
o kern/147155  net        [ip6] setfb not work with ipv6
o kern/146845  net        [libc] close(2) returns error 54 (connection reset by 
f kern/146792  net        [flowtable] flowcleaner 100% cpu's core load
o kern/146719  net        [pf] [panic] PF or dumynet kernel panic
o kern/146534  net        [icmp6] wrong source address in echo reply
o kern/146427  net        [mwl] Additional virtual access points don't work on m
f kern/146394  net        [vlan] IP source address for outgoing connections
o bin/146377   net        [ppp] [tun] Interface doesn't clear addresses when PPP
o kern/146358  net        [vlan] wrong destination MAC address
o kern/146165  net        [wlan] [panic] Setting bssid in adhoc mode causes pani
o kern/146082  net        [ng_l2tp] a false invaliant check was performed in ng_
o kern/146037  net        [panic] mpd + CoA = kernel panic
o kern/145825  net        [panic] panic: soabort: so_count
o kern/145728  net        [lagg] Stops working lagg between two servers.
p kern/145600  net        TCP/ECN behaves different to CE/CWR than ns2 reference
f kern/144917  net        [flowtable] [panic] flowtable crashes system [regressi
o kern/144882  net        MacBookPro =>4.1 does not connect to BSD in hostap wit
o kern/144874  net        [if_bridge] [patch] if_bridge frees mbuf after pfil ho
o conf/144700  net        [rc.d] async dhclient breaks stuff for too many people
o kern/144616  net        [nat] [panic] ip_nat panic FreeBSD 7.2
f kern/144315  net        [ipfw] [panic] freebsd 8-stable reboot after add ipfw 
o kern/144231  net        bind/connect/sendto too strict about sockaddr length
o kern/143846  net        [gif] bringing gif3 tunnel down causes gif0 tunnel to 
s kern/143673  net        [stf] [request] there should be a way to support multi
s kern/143666  net        [ip6] [request] PMTU black hole detection not implemen
o kern/143622  net        [pfil] [patch] unlock pfil lock while calling firewall
o kern/143593  net        [ipsec] When using IPSec, tcpdump doesn't show outgoin
o kern/143591  net        [ral] RT2561C-based DLink card (DWL-510) fails to work
o kern/143208  net        [ipsec] [gif] IPSec over gif interface not working
o kern/143034  net        [panic] system reboots itself in tcp code [regression]
o kern/142877  net        [hang] network-related repeatable 8.0-STABLE hard hang
o kern/142774  net        Problem with outgoing connections on interface with mu
o kern/142772  net        [libc] lla_lookup: new lle malloc failed
f kern/142518  net        [em] [lagg] Problem on 8.0-STABLE with em and lagg
o kern/142018  net        [iwi] [patch] Possibly wrong interpretation of beacon-
o kern/141861  net        [wi] data garbled with WEP and wi(4) with Prism 2.5
f kern/141741  net        Etherlink III NIC won't work after upgrade to FBSD 8, 
o kern/140742  net        rum(4) Two asus-WL167G adapters cannot talk to each ot
o kern/140682  net        [netgraph] [panic] random panic in netgraph
f kern/140634  net        [vlan] destroying if_lagg interface with if_vlan membe
o kern/140619  net        [ifnet] [patch] refine obsolete if_var.h comments desc
o kern/140346  net        [wlan] High bandwidth use causes loss of wlan connecti
o kern/140142  net        [ip6] [panic] FreeBSD 7.2-amd64 panic w/IPv6
o kern/140066  net        [bwi] install report for 8.0 RC 2 (multiple problems)
o kern/139565  net        [ipfilter] ipfilter ioctl SIOCDELST broken
o kern/139387  net        [ipsec] Wrong lenth of PF_KEY messages in promiscuous 
o bin/139346   net        [patch] arp(8) add option to remove static entries lis
o kern/139268  net        [if_bridge] [patch] allow if_bridge to forward just VL
p kern/139204  net        [arp] DHCP server replies rejected, ARP entry lost bef
o kern/139117  net        [lagg] + wlan boot timing (EBUSY)
o kern/139058  net        [ipfilter] mbuf cluster leak on FreeBSD 7.2
o kern/138850  net        [dummynet] dummynet doesn't work correctly on a bridge
o kern/138782  net        [panic] sbflush_internal: cc 0 || mb 0xffffff004127b00
o kern/138688  net        [rum] possibly broken on 8 Beta 4 amd64: able to wpa a
o kern/138678  net        [lo] FreeBSD does not assign linklocal address to loop
o kern/138407  net        [gre] gre(4) interface does not come up after reboot
o kern/138332  net        [tun] [lor] ifconfig tun0 destroy causes LOR if_adata/
o kern/138266  net        [panic] kernel panic when udp benchmark test used as r
o kern/138177  net        [ipfilter] FreeBSD crashing repeatedly in ip_nat.c:257
f kern/138029  net        [bpf] [panic] periodically kernel panic and reboot
o kern/137881  net        [netgraph] [panic] ng_pppoe fatal trap 12
p bin/137841   net        [patch] wpa_supplicant(8) cannot verify SHA256 signed 
p kern/137776  net        [rum] panic in rum(4) driver on 8.0-BETA2
o bin/137641   net        ifconfig(8): various problems with "vlan_device.vlan_i
o kern/137392  net        [ip] [panic] crash in ip_nat.c line 2577
o kern/137372  net        [ral] FreeBSD doesn't support wireless interface from 
o kern/137089  net        [lagg] lagg falsely triggers IPv6 duplicate address de
o bin/136994   net        [patch] ifconfig(8) print carp mac address
o kern/136911  net        [netgraph] [panic] system panic on kldload ng_bpf.ko t
o kern/136618  net        [pf][stf] panic on cloning interface without unit numb
o kern/135502  net        [periodic] Warning message raised by rtfree function i
o kern/134583  net        [hang] Machine with jail freezes after random amount o
o kern/134531  net        [route] [panic] kernel crash related to routes/zebra
o kern/134157  net        [dummynet] dummynet loads cpu for 100% and make a syst
o kern/133969  net        [dummynet] [panic] Fatal trap 12: page fault while in 
o kern/133968  net        [dummynet] [panic] dummynet kernel panic
o kern/133736  net        [udp] ip_id not protected ...
o kern/133595  net        [panic] Kernel Panic at pcpu.h:195
o kern/133572  net        [ppp] [hang] incoming PPTP connection hangs the system
o kern/133490  net        [bpf] [panic] 'kmem_map too small' panic on Dell r900 
o kern/133235  net        [netinet] [patch] Process SIOCDLIFADDR command incorre
f kern/133213  net        arp and sshd errors on 7.1-PRERELEASE
o kern/133060  net        [ipsec] [pfsync] [panic] Kernel panic with ipsec + pfs
o kern/132889  net        [ndis] [panic] NDIS kernel crash on load BCM4321 AGN d
o conf/132851  net        [patch] rc.conf(5): allow to setfib(1) for service run
o kern/132734  net        [ifmib] [panic] panic in net/if_mib.c
o kern/132705  net        [libwrap] [patch] libwrap - infinite loop if hosts.all
o kern/132672  net        [ndis] [panic] ndis with rt2860.sys causes kernel pani
o kern/132554  net        [ipl] There is no ippool start script/ipfilter magic t
o kern/132354  net        [nat] Getting some packages to ipnat(8) causes crash
o kern/132277  net        [crypto] [ipsec] poor performance using cryptodevice f
o kern/131781  net        [ndis] ndis keeps dropping the link
o kern/131776  net        [wi] driver fails to init
o kern/131753  net        [altq] [panic] kernel panic in hfsc_dequeue
o kern/131601  net        [ipfilter] [panic] 7-STABLE panic in nat_finalise (tcp
o bin/131365   net        route(8): route add changes interpretation of network 
f kern/130820  net        [ndis] wpa_supplicant(8) returns 'no space on device'
o kern/130628  net        [nfs] NFS / rpc.lockd deadlock on 7.1-R
o conf/130555  net        [rc.d] [patch] No good way to set ipfilter variables a
o kern/130525  net        [ndis] [panic] 64 bit ar5008 ndisgen-erated driver cau
o kern/130311  net        [wlan_xauth] [panic] hostapd restart causing kernel pa
o kern/130109  net        [ipfw] Can not set fib for packets originated from loc
f kern/130059  net        [panic] Leaking 50k mbufs/hour
f kern/129719  net        [nfs] [panic] Panic during shutdown, tcp_ctloutput: in
o kern/129517  net        [ipsec] [panic] double fault / stack overflow
f kern/129508  net        [carp] [panic] Kernel panic with EtherIP (may be relat
o kern/129219  net        [ppp] Kernel panic when using kernel mode ppp
o kern/129197  net        [panic] 7.0 IP stack related panic
o bin/128954   net        ifconfig(8) deletes valid routes
o bin/128602   net        [an] wpa_supplicant(8) crashes with an(4)
o kern/128448  net        [nfs] 6.4-RC1 Boot Fails if NFS Hostname cannot be res
o bin/128295   net        [patch] ifconfig(8) does not print TOE4 or TOE6 capabi
o bin/128001   net        wpa_supplicant(8), wlan(4), and wi(4) issues
o kern/127826  net        [iwi] iwi0 driver has reduced performance and connecti
o kern/127815  net        [gif] [patch] if_gif does not set vlan attributes from
o kern/127724  net        [rtalloc] rtfree: 0xc5a8f870 has 1 refs
f bin/127719   net        [arp] arp: Segmentation fault (core dumped)
f kern/127528  net        [icmp]: icmp socket receives icmp replies not owned by
p kern/127360  net        [socket] TOE socket options missing from sosetopt()
o bin/127192   net        routed(8) removes the secondary alias IP of interface 
f kern/127145  net        [wi]: prism (wi) driver crash at bigger traffic
o kern/126895  net        [patch] [ral] Add antenna selection (marked as TBD)
o kern/126874  net        [vlan]: Zebra problem if ifconfig vlanX destroy
o kern/126695  net        rtfree messages and network disruption upon use of if_
o kern/126339  net        [ipw] ipw driver drops the connection
o kern/126075  net        [inet] [patch] internet control accesses beyond end of
o bin/125922   net        [patch] Deadlock in arp(8)
o kern/125920  net        [arp] Kernel Routing Table loses Ethernet Link status 
o kern/125845  net        [netinet] [patch] tcp_lro_rx() should make use of hard
o kern/125258  net        [socket] socket's SO_REUSEADDR option does not work
o kern/125239  net        [gre] kernel crash when using gre
o kern/124341  net        [ral] promiscuous mode for wireless device ral0 looses
o kern/124225  net        [ndis] [patch] ndis network driver sometimes loses net
o kern/124160  net        [libc] connect(2) function loops indefinitely
o kern/124021  net        [ip6] [panic] page fault in nd6_output()
o kern/123968  net        [rum] [panic] rum driver causes kernel panic with WPA.
o kern/123892  net        [tap] [patch] No buffer space available
o kern/123890  net        [ppp] [panic] crash & reboot on work with PPP low-spee
o kern/123858  net        [stf] [patch] stf not usable behind a NAT
o kern/123796  net        [ipf] FreeBSD 6.1+VPN+ipnat+ipf: port mapping does not
o kern/123758  net        [panic] panic while restarting net/freenet6
o bin/123633   net        ifconfig(8) doesn't set inet and ether address in one 
o kern/123559  net        [iwi] iwi periodically disassociates/associates [regre
o bin/123465   net        [ip6] route(8): route add -inet6 <ipv6_addr> -interfac
o kern/123463  net        [ipsec] [panic] repeatable crash related to ipsec-tool
o conf/123330  net        [nsswitch.conf] Enabling samba wins in nsswitch.conf c
o kern/123160  net        [ip] Panic and reboot at sysctl kern.polling.enable=0
o kern/122989  net        [swi] [panic] 6.3 kernel panic in swi1: net
o kern/122954  net        [lagg] IPv6 EUI64 incorrectly chosen for lagg devices
f kern/122780  net        [lagg] tcpdump on lagg interface during high pps wedge
o kern/122685  net        It is not visible passing packets in tcpdump(1)
o kern/122319  net        [wi] imposible to enable ad-hoc demo mode with Orinoco
o kern/122290  net        [netgraph] [panic] Netgraph related "kmem_map too smal
o kern/122252  net        [ipmi] [bge] IPMI problem with BCM5704 (does not work 
o kern/122033  net        [ral] [lor] Lock order reversal in ral0 at bootup ieee
o bin/121895   net        [patch] rtsol(8)/rtsold(8) doesn't handle managed netw
s kern/121774  net        [swi] [panic] 6.3 kernel panic in swi1: net
o kern/121555  net        [panic] Fatal trap 12: current process = 12 (swi1: net
o kern/121443  net        [gif] [lor] icmp6_input/nd6_lookup
o kern/121437  net        [vlan] Routing to layer-2 address does not work on VLA
o bin/121359   net        [patch] [security] ppp(8): fix local stack overflow in
o kern/121257  net        [tcp] TSO + natd  -> slow outgoing tcp traffic
o kern/121181  net        [panic] Fatal trap 3: breakpoint instruction fault whi
o kern/120966  net        [rum] kernel panic with if_rum and WPA encryption
o kern/120566  net        [request]: ifconfig(8) make order of arguments more fr
o kern/120304  net        [netgraph] [patch] netgraph source assumes 32-bit time
o kern/120266  net        [udp] [panic] gnugk causes kernel panic when closing U
o bin/120060   net        routed(8) deletes link-level routes in the presence of
o kern/119945  net        [rum] [panic] rum device in hostap mode, cause kernel 
o kern/119791  net        [nfs] UDP NFS mount of aliased IP addresses from a Sol
o kern/119617  net        [nfs] nfs error on wpa network when reseting/shutdown
f kern/119516  net        [ip6] [panic] _mtx_lock_sleep: recursed on non-recursi
o kern/119432  net        [arp] route add -host <host> -iface <nic> causes arp e
o kern/119225  net        [wi] 7.0-RC1 no carrier with Prism 2.5 wifi card [regr
o kern/118727  net        [netgraph] [patch] [request] add new ng_pf module
o kern/117423  net        [vlan] Duplicate IP on different interfaces
o bin/117339   net        [patch] route(8): loading routing management commands 
o bin/116643   net        [patch] [request] fstat(1): add INET/INET6 socket deta
o kern/116185  net        [iwi] if_iwi driver leads system to reboot
o kern/115239  net        [ipnat] panic with 'kmem_map too small' using ipnat
o kern/115019  net        [netgraph] ng_ether upper hook packet flow stops on ad
o kern/115002  net        [wi] if_wi timeout. failed allocation (busy bit). ifco
o kern/114915  net        [patch] [pcn] pcn (sys/pci/if_pcn.c) ethernet driver f
o kern/113432  net        [ucom] WARNING: attempt to net_add_domain(netgraph) af
o kern/112722  net        [ipsec] [udp] IP v4 udp fragmented packet reject
o kern/112686  net        [patm] patm driver freezes System (FreeBSD 6.2-p4) i38
o bin/112557   net        [patch] ppp(8) lock file should not use symlink name
o kern/112528  net        [nfs] NFS over TCP under load hangs with "impossible p
o kern/111537  net        [inet6] [patch] ip6_input() treats mbuf cluster wrong
o kern/111457  net        [ral] ral(4) freeze
o kern/110284  net        [if_ethersubr] Invalid Assumption in SIOCSIFADDR in et
o kern/110249  net        [kernel] [regression] [patch] setsockopt() error regre
o kern/109470  net        [wi] Orinoco Classic Gold PC Card Can't Channel Hop
o bin/108895   net        pppd(8): PPPoE dead connections on 6.2 [regression]
o kern/107944  net        [wi] [patch] Forget to unlock mutex-locks
o conf/107035  net        [patch] bridge(8): bridge interface given in rc.conf n
o kern/106444  net        [netgraph] [panic] Kernel Panic on Binding to an ip to
o kern/106316  net        [dummynet] dummynet with multipass ipfw drops packets 
o kern/105945  net        Address can disappear from network interface
s kern/105943  net        Network stack may modify read-only mbuf chain copies
o bin/105925   net        problems with ifconfig(8) and vlan(4) [regression]
o kern/104851  net        [inet6] [patch] On link routes not configured when usi
o kern/104751  net        [netgraph] kernel panic, when getting info about my tr
o kern/103191  net        Unpredictable reboot
o kern/103135  net        [ipsec] ipsec with ipfw divert (not NAT) encodes a pac
o kern/102540  net        [netgraph] [patch] supporting vlan(4) by ng_fec(4)
o conf/102502  net        [netgraph] [patch] ifconfig name does't rename netgrap
o kern/102035  net        [plip] plip networking disables parallel port printing
o kern/101948  net        [ipf] [panic] Kernel Panic Trap No 12 Page Fault - cau
o kern/100709  net        [libc] getaddrinfo(3) should return TTL info
o kern/100519  net        [netisr] suggestion to fix suboptimal network polling
o kern/98978   net        [ipf] [patch] ipfilter drops OOW packets under 6.1-Rel
o kern/98597   net        [inet6] Bug in FreeBSD 6.1 IPv6 link-local DAD procedu
o bin/98218    net        wpa_supplicant(8) blacklist not working
o kern/97306   net        [netgraph] NG_L2TP locks after connection with failed 
o conf/97014   net        [gif] gifconfig_gif? in rc.conf does not recognize IPv
f kern/96268   net        [socket] TCP socket performance drops by 3000% if pack
o kern/95519   net        [ral] ral0 could not map mbuf
o kern/95288   net        [pppd] [tty] [panic] if_ppp panic in sys/kern/tty_subr
o kern/95277   net        [netinet] [patch] IP Encapsulation mask_match() return
o kern/95267   net        packet drops periodically appear
f kern/93378   net        [tcp] Slow data transfer in Postfix and Cyrus IMAP (wo
o kern/93019   net        [ppp] ppp and tunX problems: no traffic after restarti
o kern/92880   net        [libc] [patch] almost rewritten inet_network(3) functi
s kern/92279   net        [dc] Core faults everytime I reboot, possible NIC issu
o kern/91859   net        [ndis] if_ndis does not work with Asus WL-138
s kern/91777   net        [ipf] [patch] wrong behaviour with skip rule inside an
o kern/91364   net        [ral] [wep] WF-511 RT2500 Card PCI and WEP
o kern/91311   net        [aue] aue interface hanging
o kern/87521   net        [ipf] [panic] using ipfilter "auth" keyword leads to k
o kern/87421   net        [netgraph] [panic]: ng_ether + ng_eiface + if_bridge
o kern/86871   net        [tcp] [patch] allocation logic for PCBs in TIME_WAIT s
o kern/86427   net        [lor] Deadlock with FASTIPSEC and nat
o kern/86103   net        [ipf] Illegal NAT Traversal in IPFilter
o kern/85780   net        'panic: bogus refcnt 0' in routing/ipv6
o bin/85445    net        ifconfig(8): deprecated keyword to ifconfig inoperativ
p kern/85320   net        [gre] [patch] possible depletion of kernel stack in ip
o bin/82975    net        route change does not parse classfull network as given
o kern/82881   net        [netgraph] [panic] ng_fec(4) causes kernel panic after
o kern/82468   net        Using 64MB tcp send/recv buffers, trafficflow stops, i
o bin/82185    net        [patch] ndp(8) can delete the incorrect entry
o kern/81095   net        IPsec connection stops working if associated network i
o kern/78968   net        FreeBSD freezes on mbufs exhaustion (network interface
o kern/78090   net        [ipf] ipf filtering on bridged packets doesn't work if
o kern/77341   net        [ip6] problems with IPV6 implementation
s kern/77195   net        [ipf] [patch] ipfilter ioctl SIOCGNATL does not match 
o kern/75873   net        Usability problem with non-RFC-compliant IP spoof prot
s kern/75407   net        [an] an(4): no carrier after short time
a kern/71474   net        [route] route lookup does not skip interfaces marked d
o kern/71469   net        default route to internet magically disappears with mu
o kern/70904   net        [ipf] ipfilter ipnat problem with h323 proxy support
o kern/68889   net        [panic] m_copym, length > size of mbuf chain
o kern/66225   net        [netgraph] [patch] extend ng_eiface(4) control message
o kern/65616   net        IPSEC can't detunnel GRE packets after real ESP encryp
s kern/60293   net        [patch] FreeBSD arp poison patch
a kern/56233   net        IPsec tunnel (ESP) over IPv6: MTU computation is wrong
s bin/41647    net        ifconfig(8) doesn't accept lladdr along with inet addr
o kern/39937   net        ipstealth issue
a kern/38554   net        [patch] changing interface ipaddress doesn't seem to w
o kern/34665   net        [ipf] [hang] ipfilter rcmd proxy "hangs".
o kern/31940   net        ip queue length too short for >500kpps
o kern/31647   net        [libc] socket calls can return undocumented EINVAL
o kern/30186   net        [libc] getaddrinfo(3) does not handle incorrect servna
o kern/27474   net        [ipf] [ppp] Interactive use of user PPP and ipfilter c
f kern/24959   net        [patch] proper TCP_NOPUSH/TCP_CORK compatibility
o conf/23063   net        [arp] [patch] for static ARP tables in rc.network
o kern/21998   net        [socket] [patch] ident only for outgoing connections
o kern/5877    net        [socket] sb_cc counts control data as well as data dat

451 problems total.


From owner-freebsd-net@FreeBSD.ORG  Mon Mar  4 16:35:44 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id EA51BE19
 for <freebsd-net@freebsd.org>; Mon,  4 Mar 2013 16:35:44 +0000 (UTC)
 (envelope-from andre@freebsd.org)
Received: from c00l3r.networx.ch (c00l3r.networx.ch [62.48.2.2])
 by mx1.freebsd.org (Postfix) with ESMTP id 43B29721
 for <freebsd-net@freebsd.org>; Mon,  4 Mar 2013 16:35:43 +0000 (UTC)
Received: (qmail 34361 invoked from network); 4 Mar 2013 17:49:43 -0000
Received: from c00l3r.networx.ch (HELO [127.0.0.1]) ([62.48.2.2])
 (envelope-sender <andre@freebsd.org>)
 by c00l3r.networx.ch (qmail-ldap-1.03) with SMTP
 for <lstewart@freebsd.org>; 4 Mar 2013 17:49:43 -0000
Message-ID: <5134CD5D.6090107@freebsd.org>
Date: Mon, 04 Mar 2013 17:35:41 +0100
From: Andre Oppermann <andre@freebsd.org>
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64;
 rv:17.0) Gecko/20130107 Thunderbird/17.0.2
MIME-Version: 1.0
To: Lawrence Stewart <lstewart@freebsd.org>
Subject: Re: Bug in sbsndptr()
References: <512CBADB.3050004@freebsd.org>
In-Reply-To: <512CBADB.3050004@freebsd.org>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: "freebsd-net@freebsd.org" <freebsd-net@freebsd.org>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 04 Mar 2013 16:35:45 -0000

On 26.02.2013 14:38, Lawrence Stewart wrote:
> Hi Andre,

Hi Lawrence, :-)

> A colleague and I spent a very frustrating day tracing an accounting bug
> in the multipath TCP patch we're working on at CAIA to a bug in
> sbsndptr(). I haven't tested it with regular TCP yet, but I believe the
> following patch fixes the bug (proposed commit log message is at the top
> of the patch):
>
> http://people.freebsd.org/~lstewart/patches/misctcp/sbsndptr_mnext_10.x.r247314.diff
>
> The patch should have no tangible effect to operation other than to
> ensure the function delivers on the promise to return the closest mbuf
> in the chain for the given offset.

I agree that the description of sbsndptr() can be misleading as it refers
to the point in time when the pointer was updated last.  Relative to now
the real offset may be at the beginning of the next mbuf.

As you note in the proposed commit message, by the time the send pointer
is calculated we may have reached the end of the chain and must avoid
storing a NULL pointer.  The mbuf copy routines simply skip over the
additional mbuf in the chain using the returned offset.

I wonder how this has caused trouble with your multipath patch.  You'd
have to copy the sockbuf contents as well and unless you're using custom
sockbuf and mbuf chain functions this shouldn't be a problem.  Using
custom functions on a socket buffer is a delicate approach.  For a sockbuf
consumer, being able to handle valid offsets into an mbuf chain is a core
feature and a must-have part of the functionality.
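
To make "handling valid offsets into an mbuf chain" concrete, here is a
minimal userland sketch of forward seeking.  It is an illustration only:
struct fake_mbuf and seek_forward() are made-up names, not kernel APIs,
and the structure carries just the two fields the walk needs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userland stand-in for the kernel's struct mbuf. */
struct fake_mbuf {
	struct fake_mbuf *m_next;	/* next mbuf in the chain */
	unsigned int m_len;		/* bytes of data in this mbuf */
};

/*
 * Walk forward from m until *moff falls inside the returned mbuf.
 * Zero-length mbufs are stepped over because *moff >= 0 always holds.
 * The walk stops at the last mbuf instead of returning NULL, so an
 * offset past the end of the chain is left in *moff for the caller.
 */
static struct fake_mbuf *
seek_forward(struct fake_mbuf *m, unsigned int *moff)
{
	assert(m != NULL);
	while (*moff >= m->m_len && m->m_next != NULL) {
		*moff -= m->m_len;
		m = m->m_next;
	}
	return (m);
}
```

Given a chain of 50, 0 and 100 bytes, an offset of 50 resolves to the third
mbuf with a remaining offset of 0; the empty mbuf in the middle is skipped.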

> I would appreciate a review and any thoughts.

I think you have found a valid (micro-)optimization.  However, you're
still making a dangerous assumption in that the next mbuf is indeed
the one you want.  This may not be true in subtle ways when the chain
contains m_len=0 mbufs.  I'm not aware of it actually happening,
but it can't be ruled out either if custom sockbuf manipulation functions
are in use.

I'd recommend the following:
have your custom sockbuf function handle forward seeking like the other
m_copy() functions; and/or apply a patch along the lines of the (untested)
example below.

Cheers
-- 
Andre

Index: uipc_sockbuf.c
===================================================================
--- uipc_sockbuf.c      (revision 247775)
+++ uipc_sockbuf.c      (working copy)
@@ -936,10 +936,17 @@
                 return (sb->sb_mb);
         }

-       /* Return closest mbuf in chain for current offset. */
+       /* Return closest known mbuf in chain for current offset. */
         *moff = off - sb->sb_sndptroff;
         m = ret = sb->sb_sndptr ? sb->sb_sndptr : sb->sb_mb;

+       /* Possibly seek forward to return the closest mbuf to the offset. */
+       while (*moff >= ret->m_len && ret->m_next != NULL) {
+               *moff -= ret->m_len;
+               ret = ret->m_next;
+       }
+       KASSERT(ret != NULL, ("%s: ret is NULL", __func__));
+
         /* Advance by len to be as close as possible for the next transmit. */
         for (off = off - sb->sb_sndptroff + len - 1;
              off > 0 && m != NULL && off >= m->m_len;
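
For anyone who wants to poke at the idea outside the kernel, below is a
rough userland model of the function with a forward seek on the returned
pointer.  Treat it as a sketch under assumptions: the model_* names and
structure layouts are invented for illustration, assert() stands in for
KASSERT/panic, and the body of the advance loop is filled in by assumption
since the hunk above does not show it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userland stand-ins; only the fields the model needs. */
struct model_mbuf {
	struct model_mbuf *m_next;
	unsigned int m_len;
};

struct model_sockbuf {
	struct model_mbuf *sb_mb;	/* head of the chain */
	struct model_mbuf *sb_sndptr;	/* cached send pointer */
	unsigned int sb_sndptroff;	/* chain offset of sb_sndptr */
	unsigned int sb_cc;		/* total bytes in the chain */
};

static struct model_mbuf *
model_sbsndptr(struct model_sockbuf *sb, unsigned int off, unsigned int len,
    unsigned int *moff)
{
	struct model_mbuf *m, *ret;

	assert(sb->sb_mb != NULL);
	assert(off + len <= sb->sb_cc);

	/* Offset below the cached one (retransmit): start from the head. */
	if (sb->sb_sndptroff > off) {
		*moff = off;
		return (sb->sb_mb);
	}

	/* Closest known mbuf for the current offset. */
	*moff = off - sb->sb_sndptroff;
	m = ret = sb->sb_sndptr ? sb->sb_sndptr : sb->sb_mb;

	/* Seek the returned pointer forward; never step onto NULL. */
	while (*moff >= ret->m_len && ret->m_next != NULL) {
		*moff -= ret->m_len;
		ret = ret->m_next;
	}

	/* Advance the cached pointer by len (loop body is an assumption). */
	for (off = off - sb->sb_sndptroff + len - 1;
	    off > 0 && m != NULL && off >= m->m_len;
	    m = m->m_next) {
		sb->sb_sndptroff += m->m_len;
		off -= m->m_len;
	}
	assert(!(off > 0 && m == NULL));
	sb->sb_sndptr = m;

	return (ret);
}
```

With three 100-byte mbufs and the cached pointer still on the first one, a
request for offset 100 comes back as the second mbuf with *moff == 0,
rather than the first mbuf with *moff == 100.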

From owner-freebsd-net@FreeBSD.ORG  Mon Mar  4 16:41:57 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id E9FBA199
 for <freebsd-net@freebsd.org>; Mon,  4 Mar 2013 16:41:57 +0000 (UTC)
 (envelope-from ncrogers@gmail.com)
Received: from mail-vc0-f182.google.com (mail-vc0-f182.google.com
 [209.85.220.182]) by mx1.freebsd.org (Postfix) with ESMTP id A6CA276F
 for <freebsd-net@freebsd.org>; Mon,  4 Mar 2013 16:41:57 +0000 (UTC)
Received: by mail-vc0-f182.google.com with SMTP id fl17so3540197vcb.27
 for <freebsd-net@freebsd.org>; Mon, 04 Mar 2013 08:41:51 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:x-received:in-reply-to:references:date:message-id
 :subject:from:to:cc:content-type;
 bh=mLMCDuQOvB/E4N3J5JAoWAtZESzDmQ0eMdCy8LDaTpk=;
 b=Ai+dsVLGM4sadpofEti2R+umDi6eRcRvbOLpcpZ3r2ZAbR6PX5NeRfoTdiNql6XQtS
 7qt6aX+isQwXsSl5J8d0ZGx9jfChAZ0nAkllsUHQlHWoHBcPaZfkxjQ1JFGx6xyjCdI3
 9L0IY7K9ayEwkrcP5BzGp7Z9BVKZ5RcMcAMooIhmb7U2yzZxourWzLHqXfCC4z8KfImK
 U/Vca9h7NVTDIfbypayo/k68tVVyl8/vGcPylZnoFhyYZ8xhMWb3spe3CAwdJAqrI/SO
 sTGPe/rc3BjAt4OKkvXio921IxTn6XzIhHvZXqMD4am2d1RJLpEofc6+JToJQNhdqyns
 uXVg==
MIME-Version: 1.0
X-Received: by 10.220.219.73 with SMTP id ht9mr7873390vcb.47.1362415311147;
 Mon, 04 Mar 2013 08:41:51 -0800 (PST)
Received: by 10.52.176.131 with HTTP; Mon, 4 Mar 2013 08:41:51 -0800 (PST)
In-Reply-To: <CAMOc5cz+knVK=skEz1z=WNAjd5mL3DeOVBasHnJ6ggsNtiQdbA@mail.gmail.com>
References: <512BAA60.3060703@biostat.wisc.edu>
 <CAFOYbckDFJKRip+e=a+_JPHhk+HbAikRBK0dHEBDDEgdsZT6sw@mail.gmail.com>
 <512BAF8D.7080308@biostat.wisc.edu>
 <CAFOYbcnEN=Pzd9k4hvR+wqP3_HJj3-QRQSwocfHDSehUH5YPXA@mail.gmail.com>
 <CAKOb=YYyJZyKzpEBT+o-Vmn7dedRfVW+wVh1KVM7oaWT63+qBg@mail.gmail.com>
 <CAKOb=YYRu94CRC8Fd1TrWezHig6Od_uNpO2f+tCBQTBNQVjtog@mail.gmail.com>
 <CAMOc5cz+knVK=skEz1z=WNAjd5mL3DeOVBasHnJ6ggsNtiQdbA@mail.gmail.com>
Date: Mon, 4 Mar 2013 08:41:51 -0800
Message-ID: <CAKOb=YYxEo2O09t3Fq9hw3hoLegDgDbouF6XwasKS-yGRbPQEQ@mail.gmail.com>
Subject: Re: igb network lockups
From: Nick Rogers <ncrogers@gmail.com>
To: Sepherosa Ziehau <sepherosa@gmail.com>
Content-Type: text/plain; charset=ISO-8859-1
Cc: "freebsd-net@freebsd.org" <freebsd-net@freebsd.org>,
 Jack Vogel <jfvogel@gmail.com>,
 "Christopher D. Harrison" <harrison@biostat.wisc.edu>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 04 Mar 2013 16:41:58 -0000

On Sun, Mar 3, 2013 at 2:14 AM, Sepherosa Ziehau <sepherosa@gmail.com> wrote:
> On Sat, Mar 2, 2013 at 12:18 AM, Nick Rogers <ncrogers@gmail.com> wrote:
>> On Fri, Mar 1, 2013 at 8:04 AM, Nick Rogers <ncrogers@gmail.com> wrote:
>>> FWIW I have been experiencing a similar issue on a number of systems
>>> using the em(4) driver under 9.1-RELEASE. This is after upgrading from
>>> a snapshot of 8.3-STABLE. My systems use PF+ALTQ as well. The symptoms
>>> are: interface stops passing traffic until the system is rebooted. I
>>> have not yet been able to gain access to the systems to dig around
>>> (after they have crashed), however my kernel/network settings are
>>> properly tuned (high mbuf limit, hw.em.rxd/txd=4096, etc). It seems to
>>> happen about once a day on systems with around a sustained 50Mb/s of
>>> traffic.
>>>
>>> I realize this is not much to go on but perhaps it helps. I am
>>> debating trying the e1000 driver in the latest CURRENT on top of
>>> 9.1-RELEASE. I noticed the Intel shared code was updated about a week
>>> ago. Would this change or perhaps another change to e1000 since
>>> 9.1-RELEASE possibly affect stability in a positive way?
>>>
>>> Thanks.
>>
>> Here's the relevant pciconf output:
>>
>> em0@pci0:1:0:0: class=0x020000 card=0x10d315d9 chip=0x10d38086 rev=0x00 hdr=0x00
>>     vendor     = 'Intel Corporation'
>>     device     = '82574L Gigabit Network Connection'
>>     class      = network
>>     subclass   = ethernet
>>     cap 01[c8] = powerspec 2  supports D0 D3  current D0
>>     cap 05[d0] = MSI supports 1 message, 64 bit
>>     cap 10[e0] = PCI-Express 1 endpoint max data 128(256) link x1(x1)
>>     cap 11[a0] = MSI-X supports 5 messages in map 0x1c enabled
>> ecap 0001[100] = AER 1 0 fatal 0 non-fatal 1 corrected
>
> For 82574L, i.e. supported by em(4), MSI-X must _not_ be enabled; it
> is simply broken (you could check 82574 errata on Intel's website to
> confirm what I have said here).

Thanks. So on FreeBSD 9.1-RELEASE it is advisable to set
hw.em.enable_msix=0 for 82574L? Are there other em(x) NICs where this
is advisable?

>
> For 82575, i.e. supported by igb(4), MSI-X must _not_ be enabled; it
> is simply broken (you could check 82575 errata on Intel's website to
> confirm what I have said here).
>
> Best Regards,
> sephe
>
> --
> Tomorrow Will Never Die
>
> On Sat, Mar 2, 2013 at 12:18 AM, Nick Rogers <ncrogers@gmail.com> wrote:
>> On Fri, Mar 1, 2013 at 8:04 AM, Nick Rogers <ncrogers@gmail.com> wrote:
>>> FWIW I have been experiencing a similar issue on a number of systems
>>> using the em(4) driver under 9.1-RELEASE. This is after upgrading from
>>> a snapshot of 8.3-STABLE. My systems use PF+ALTQ as well. The symptoms
>>> are: interface stops passing traffic until the system is rebooted. I
>>> have not yet been able to gain access to the systems to dig around
>>> (after they have crashed), however my kernel/network settings are
>>> properly tuned (high mbuf limit, hw.em.rxd/txd=4096, etc). It seems to
>>> happen about once a day on systems with around a sustained 50Mb/s of
>>> traffic.
>>>
>>> I realize this is not much to go on but perhaps it helps. I am
>>> debating trying the e1000 driver in the latest CURRENT on top of
>>> 9.1-RELEASE. I noticed the Intel shared code was updated about a week
>>> ago. Would this change or perhaps another change to e1000 since
>>> 9.1-RELEASE possibly affect stability in a positive way?
>>>
>>> Thanks.
>>
>> Heres relevant pciconf output:
>>
>> em0@pci0:1:0:0: class=0x020000 card=0x10d315d9 chip=0x10d38086 rev=0x00 hdr=0x00
>>     vendor     = 'Intel Corporation'
>>     device     = '82574L Gigabit Network Connection'
>>     class      = network
>>     subclass   = ethernet
>>     cap 01[c8] = powerspec 2  supports D0 D3  current D0
>>     cap 05[d0] = MSI supports 1 message, 64 bit
>>     cap 10[e0] = PCI-Express 1 endpoint max data 128(256) link x1(x1)
>>     cap 11[a0] = MSI-X supports 5 messages in map 0x1c enabled
>> ecap 0001[100] = AER 1 0 fatal 0 non-fatal 1 corrected
>> em1@pci0:2:0:0: class=0x020000 card=0x10d315d9 chip=0x10d38086 rev=0x00 hdr=0x00
>>     vendor     = 'Intel Corporation'
>>     device     = '82574L Gigabit Network Connection'
>>     class      = network
>>     subclass   = ethernet
>>     cap 01[c8] = powerspec 2  supports D0 D3  current D0
>>     cap 05[d0] = MSI supports 1 message, 64 bit
>>     cap 10[e0] = PCI-Express 1 endpoint max data 128(256) link x1(x1)
>>     cap 11[a0] = MSI-X supports 5 messages in map 0x1c enabled
>> ecap 0001[100] = AER 1 0 fatal 0 non-fatal 1 corrected
>> em2@pci0:7:0:0: class=0x020000 card=0x10d315d9 chip=0x10d38086 rev=0x00 hdr=0x00
>>     vendor     = 'Intel Corporation'
>>     device     = '82574L Gigabit Network Connection'
>>     class      = network
>>     subclass   = ethernet
>>     cap 01[c8] = powerspec 2  supports D0 D3  current D0
>>     cap 05[d0] = MSI supports 1 message, 64 bit
>>     cap 10[e0] = PCI-Express 1 endpoint max data 128(256) link x1(x1)
>>     cap 11[a0] = MSI-X supports 5 messages in map 0x1c enabled
>> ecap 0001[100] = AER 1 0 fatal 0 non-fatal 1 corrected
>> em3@pci0:8:0:0: class=0x020000 card=0x10d315d9 chip=0x10d38086 rev=0x00 hdr=0x00
>>     vendor     = 'Intel Corporation'
>>     device     = '82574L Gigabit Network Connection'
>>     class      = network
>>     subclass   = ethernet
>>     cap 01[c8] = powerspec 2  supports D0 D3  current D0
>>     cap 05[d0] = MSI supports 1 message, 64 bit
>>     cap 10[e0] = PCI-Express 1 endpoint max data 128(256) link x1(x1)
>>     cap 11[a0] = MSI-X supports 5 messages in map 0x1c enabled
>> ecap 0001[100] = AER 1 0 fatal 0 non-fatal 1 corrected
>>
>>
>>>
>>> On Mon, Feb 25, 2013 at 10:45 AM, Jack Vogel <jfvogel@gmail.com> wrote:
>>>> Have you done any poking around, looking at stats to determine why the
>>>> hangs? For instance,
>>>> might your mbuf pool be depleted? Some other network resource perhaps?
>>>>
>>>> Jack
>>>>
>>>>
>>>> On Mon, Feb 25, 2013 at 10:38 AM, Christopher D. Harrison <
>>>> harrison@biostat.wisc.edu> wrote:
>>>>
>>>>>  Sure,
>>>>> The problem appears on both systems running with ALTQ and vanilla.
>>>>>     -C
>>>>>
>>>>> On 02/25/13 12:29, Jack Vogel wrote:
>>>>>
>>>>> I've not heard of this problem, but I think most users do not use ALTQ,
>>>>> and we (Intel) do not
>>>>> test using it. Can it be eliminated from the equation?
>>>>>
>>>>> Jack
>>>>>
>>>>>
>>>>> On Mon, Feb 25, 2013 at 10:16 AM, Christopher D. Harrison <
>>>>> harrison@biostat.wisc.edu> wrote:
>>>>>
>>>>>> I recently have been experiencing network "freezes" and network "lockups"
>>>>>> on our Freebsd 9.1 systems which are running zfs and nfs file servers.
>>>>>> I upgraded from 9.0 to 9.1 about 2 months ago and we have been having
>>>>>> issues with almost bi-monthly.   The issue manifests in the system becomes
>>>>>> unresponsive to any/all nfs clients.   The system is not resource bound as
>>>>>> our I/O is low to disk and our network is usually in the 20mbit/40mbit
>>>>>> range.   We do notice a correlation between temporary i/o spikes and
>>>>>> network freezes but not enough to send our system in to "lockup" mode for
>>>>>> the next 5min.   Currently we have 4 igb nics in 2 aggr's with 8 queue's
>>>>>> per nic and our dev.igb reports:
>>>>>>
>>>>>> dev.igb.3.%desc: Intel(R) PRO/1000 Network Connection version - 2.3.4
>>>>>>
>>>>>> I am almost certain the problem is with the ibg driver as a friend is
>>>>>> also experiencing the same problem with the same intel igb nic.   He has
>>>>>> addressed the issue by restarting the network using netif on his systems.
>>>>>> According to my friend, once the network interfaces get cleared, everything
>>>>>> comes back and starts working as expected.
>>>>>>
>>>>>> I have noticed an issue with the igb driver and I was looking for
>>>>>> thoughts on how to help address this problem.
>>>>>>
>>>>>> http://freebsd.1045724.n5.nabble.com/em-igb-if-transmit-drbr-and-ALTQ-td5760338.html
>>>>>>
>>>>>> Thoughts/Ideas are greatly appreciated!!!
>>>>>>
>>>>>>     -C
>>>>>>
>>>>>> _______________________________________________
>>>>>> freebsd-net@freebsd.org mailing list
>>>>>> http://lists.freebsd.org/mailman/listinfo/freebsd-net
>>>>>> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
>>>>>>
>>>>>
>>>>>
>>>>>
>>>> _______________________________________________
>>>> freebsd-net@freebsd.org mailing list
>>>> http://lists.freebsd.org/mailman/listinfo/freebsd-net
>>>> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
>> _______________________________________________
>> freebsd-net@freebsd.org mailing list
>> http://lists.freebsd.org/mailman/listinfo/freebsd-net
>> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
>
>
>
> --
> Tomorrow Will Never Die

From owner-freebsd-net@FreeBSD.ORG  Mon Mar  4 18:16:19 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id 47D9665D
 for <freebsd-net@freebsd.org>; Mon,  4 Mar 2013 18:16:19 +0000 (UTC)
 (envelope-from jfvogel@gmail.com)
Received: from mail-vc0-f177.google.com (mail-vc0-f177.google.com
 [209.85.220.177]) by mx1.freebsd.org (Postfix) with ESMTP id D939ECC0
 for <freebsd-net@freebsd.org>; Mon,  4 Mar 2013 18:16:18 +0000 (UTC)
Received: by mail-vc0-f177.google.com with SMTP id m18so3541785vcm.22
 for <freebsd-net@freebsd.org>; Mon, 04 Mar 2013 10:16:18 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:x-received:in-reply-to:references:date:message-id
 :subject:from:to:cc:content-type;
 bh=CM8tmangjINKI2bTYR+D4LqV85uQDGow7zqBrVisZxw=;
 b=wL2eXN3K8hto9pQ40gwmt6IlX87odAVQAjIGI2/y/QEFjWb/nHQxwNa8B5Y/Q48Z5K
 c4V/PnZ7+dcs+Uqp5HA4po+t/2QXoT4F772I94MB30qBpzIsFvhAH/wFAZVMwZ0OfAYd
 j07UlMLrhq1enF92DOJ8OC20AVV8T04sLVMo/zjERe8u7Tf+CV9chvmEaFtADyOx/zd1
 ea7gSK2ttlB50WTC3Ugk3HO3C7uCF2Dzq42GwksQ4laQfASqrCGHGoaqXSW/EhUBoHHF
 nHZwEGoH0438uUAAk/osv2l8Qx/vLRP8i52yq9XdMEtfAL3UwnMNsl4HNMGKesEVKpPj
 qmyQ==
MIME-Version: 1.0
X-Received: by 10.58.214.231 with SMTP id od7mr8381659vec.44.1362420978031;
 Mon, 04 Mar 2013 10:16:18 -0800 (PST)
Received: by 10.220.191.132 with HTTP; Mon, 4 Mar 2013 10:16:17 -0800 (PST)
In-Reply-To: <CAMOc5cz+knVK=skEz1z=WNAjd5mL3DeOVBasHnJ6ggsNtiQdbA@mail.gmail.com>
References: <512BAA60.3060703@biostat.wisc.edu>
 <CAFOYbckDFJKRip+e=a+_JPHhk+HbAikRBK0dHEBDDEgdsZT6sw@mail.gmail.com>
 <512BAF8D.7080308@biostat.wisc.edu>
 <CAFOYbcnEN=Pzd9k4hvR+wqP3_HJj3-QRQSwocfHDSehUH5YPXA@mail.gmail.com>
 <CAKOb=YYyJZyKzpEBT+o-Vmn7dedRfVW+wVh1KVM7oaWT63+qBg@mail.gmail.com>
 <CAKOb=YYRu94CRC8Fd1TrWezHig6Od_uNpO2f+tCBQTBNQVjtog@mail.gmail.com>
 <CAMOc5cz+knVK=skEz1z=WNAjd5mL3DeOVBasHnJ6ggsNtiQdbA@mail.gmail.com>
Date: Mon, 4 Mar 2013 10:16:17 -0800
Message-ID: <CAFOYbc=j6AU+78O3VZotRJVozyRATYkqB97gCXHfb6AkJr70uQ@mail.gmail.com>
Subject: Re: igb network lockups
From: Jack Vogel <jfvogel@gmail.com>
To: Sepherosa Ziehau <sepherosa@gmail.com>
Content-Type: text/plain; charset=ISO-8859-1
X-Content-Filtered-By: Mailman/MimeDel 2.1.14
Cc: Nick Rogers <ncrogers@gmail.com>,
 "freebsd-net@freebsd.org" <freebsd-net@freebsd.org>,
 "Christopher D. Harrison" <harrison@biostat.wisc.edu>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 04 Mar 2013 18:16:19 -0000

On Sun, Mar 3, 2013 at 2:14 AM, Sepherosa Ziehau <sepherosa@gmail.com> wrote:
...

>
>
> For 82574L, i.e. supported by em(4), MSI-X must _not_ be enabled; it
> is simply broken (you could check 82574 errata on Intel's website to
> confirm what I have said here).
>

If you actually check the errata you will find that it's not "simply
broken". Furthermore, it most certainly SHOULD be enabled; it is by default
in the Linux driver as well as mine. The issue is not with the 82574, it's
with some system designs that have upstream PCIe problems.

If you experience problems with a particular system, then we recommend
disabling MSI-X to determine if this hardware issue may be behind it. In most
cases MSI-X works just fine.


>
> For 82575, i.e. supported by igb(4), MSI-X must _not_ be enabled; it
> is simply broken (you could check 82575 errata on Intel's website to
> confirm what I have said here).
>
>
The same applies to the 82575: it's a system issue, and we have
tested the part on our reference systems under prolonged stress without any
problem. So the same recommendation as above applies.

Jack Vogel
Intel Network Division

From owner-freebsd-net@FreeBSD.ORG  Mon Mar  4 18:22:40 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id 0A785857
 for <freebsd-net@freebsd.org>; Mon,  4 Mar 2013 18:22:40 +0000 (UTC)
 (envelope-from jfvogel@gmail.com)
Received: from mail-vc0-f170.google.com (mail-vc0-f170.google.com
 [209.85.220.170]) by mx1.freebsd.org (Postfix) with ESMTP id B5A58D14
 for <freebsd-net@freebsd.org>; Mon,  4 Mar 2013 18:22:39 +0000 (UTC)
Received: by mail-vc0-f170.google.com with SMTP id p16so3652890vcq.29
 for <freebsd-net@freebsd.org>; Mon, 04 Mar 2013 10:22:33 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:x-received:in-reply-to:references:date:message-id
 :subject:from:to:cc:content-type;
 bh=Fib5MBKum0uhZMc7N0ASsM6cESlZHIDHixTm8hRSZ78=;
 b=sAcHMPMjpB/xcqCVTw1tu5i0lstURAtTTyQwGtu8VnQ7ees82yJ4/Fgy+8inbOX3gH
 oH5K4TbYJEdhq7kJcAHcKO3OUA+rjHlxTvESgH3mcO2sGQVdSLtRq9RpLTJO6pda2pp2
 iEnX3iNrV7Yw1hO4eIYnMaLo273VWBsAqmFxb0gahwkFryqmgQP6QDx4c+yFzDaaS9Z8
 QwdH24opM2UnH19zlBZuxxW68m1/bYis4S29yfuMOaSEoeSQcUIJvGn+gPM2luBfcuKY
 FJpWjR4HecKQaFI4zFNirFXMvtZT6qdsfntPdLOTr6D2rXC0O9sfVb/xxb40dtYvVqB4
 /0jA==
MIME-Version: 1.0
X-Received: by 10.220.153.143 with SMTP id k15mr8111012vcw.33.1362421353127;
 Mon, 04 Mar 2013 10:22:33 -0800 (PST)
Received: by 10.220.191.132 with HTTP; Mon, 4 Mar 2013 10:22:32 -0800 (PST)
In-Reply-To: <CAKOb=YYxEo2O09t3Fq9hw3hoLegDgDbouF6XwasKS-yGRbPQEQ@mail.gmail.com>
References: <512BAA60.3060703@biostat.wisc.edu>
 <CAFOYbckDFJKRip+e=a+_JPHhk+HbAikRBK0dHEBDDEgdsZT6sw@mail.gmail.com>
 <512BAF8D.7080308@biostat.wisc.edu>
 <CAFOYbcnEN=Pzd9k4hvR+wqP3_HJj3-QRQSwocfHDSehUH5YPXA@mail.gmail.com>
 <CAKOb=YYyJZyKzpEBT+o-Vmn7dedRfVW+wVh1KVM7oaWT63+qBg@mail.gmail.com>
 <CAKOb=YYRu94CRC8Fd1TrWezHig6Od_uNpO2f+tCBQTBNQVjtog@mail.gmail.com>
 <CAMOc5cz+knVK=skEz1z=WNAjd5mL3DeOVBasHnJ6ggsNtiQdbA@mail.gmail.com>
 <CAKOb=YYxEo2O09t3Fq9hw3hoLegDgDbouF6XwasKS-yGRbPQEQ@mail.gmail.com>
Date: Mon, 4 Mar 2013 10:22:32 -0800
Message-ID: <CAFOYbcnJtS5W+w2Vob16QVGFimn=iWyUXXbTfaEhr_a5KXbFWg@mail.gmail.com>
Subject: Re: igb network lockups
From: Jack Vogel <jfvogel@gmail.com>
To: Nick Rogers <ncrogers@gmail.com>
Content-Type: text/plain; charset=ISO-8859-1
X-Content-Filtered-By: Mailman/MimeDel 2.1.14
Cc: Sepherosa Ziehau <sepherosa@gmail.com>,
 "freebsd-net@freebsd.org" <freebsd-net@freebsd.org>,
 "Christopher D. Harrison" <harrison@biostat.wisc.edu>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 04 Mar 2013 18:22:40 -0000

>
> Thanks. So on FreeBSD 9.1-RELEASE it is advisable to set
> hw.em.enable_msix=0 for 82574L? Are there other em(x) NICs where this
> is advisable?
>
>
As I explained in a previous email, this is not advisable unless you are
experiencing problems (like hangs). If you are, then it's one possible
cause, so try falling back to MSI to see if it eliminates your problem.

Also, the 82574 is the only device the em driver supports at present that is
capable of MSI-X; all other MSI-X-capable devices use the igb driver.
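
As a sketch of that fallback test: hw.em.enable_msix is the tunable named
earlier in this thread. The assumption here is that it is set as a boot-time
loader tunable in /boot/loader.conf and takes effect after a reboot; remove
the line again if the hangs persist, since MSI-X was then not the cause.

```
# /boot/loader.conf (assumed location for boot-time tunables)
# Make em(4) fall back from MSI-X to MSI, for troubleshooting only.
hw.em.enable_msix=0
```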

Regards,

Jack

From owner-freebsd-net@FreeBSD.ORG  Mon Mar  4 18:58:35 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id C469C20E
 for <freebsd-net@freebsd.org>; Mon,  4 Mar 2013 18:58:35 +0000 (UTC)
 (envelope-from zbeeble@gmail.com)
Received: from mail-ve0-f178.google.com (mail-ve0-f178.google.com
 [209.85.128.178]) by mx1.freebsd.org (Postfix) with ESMTP id 7865BE5C
 for <freebsd-net@freebsd.org>; Mon,  4 Mar 2013 18:58:35 +0000 (UTC)
Received: by mail-ve0-f178.google.com with SMTP id db10so4987775veb.37
 for <freebsd-net@freebsd.org>; Mon, 04 Mar 2013 10:58:34 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:x-received:in-reply-to:references:date:message-id
 :subject:from:to:cc:content-type;
 bh=hOzfVdUa4NIuj0zHKnleUyMv1C2irzEdfjdkArFukwI=;
 b=QBje+CyYniNWJIH3IF9aDu4EBQpoSuwtBe8rWNGrXMyTWRUOzaC88VwWkUFXtmmg3B
 gxyGyEifNTWbxAUXzp+OlHFuP00L+DRW4KAo5/+H1H8TselRvNxVc2euLlvjcnOdw0WU
 3MCgWb/I/IcySCRsbZDum2lDl5WkemL5wpzBHaWflo8cjzHiPO5NEcsutz+s/fXkA3Pc
 vXjaENqmevDSceo8AYYkavka9fqof5ewamZJvZtPd6kvRkVXkKQbAOxLwmC1837LEXZH
 4kDyzqROO2Eso5Qpy+fAOfcBO7tCxb8j+qwnCvjAJhosgtLcxIS8EOAbxJOZLkC6/gnM
 Rr0Q==
MIME-Version: 1.0
X-Received: by 10.220.107.210 with SMTP id c18mr8280287vcp.5.1362423514653;
 Mon, 04 Mar 2013 10:58:34 -0800 (PST)
Received: by 10.220.232.6 with HTTP; Mon, 4 Mar 2013 10:58:34 -0800 (PST)
In-Reply-To: <CAFOYbcnJtS5W+w2Vob16QVGFimn=iWyUXXbTfaEhr_a5KXbFWg@mail.gmail.com>
References: <512BAA60.3060703@biostat.wisc.edu>
 <CAFOYbckDFJKRip+e=a+_JPHhk+HbAikRBK0dHEBDDEgdsZT6sw@mail.gmail.com>
 <512BAF8D.7080308@biostat.wisc.edu>
 <CAFOYbcnEN=Pzd9k4hvR+wqP3_HJj3-QRQSwocfHDSehUH5YPXA@mail.gmail.com>
 <CAKOb=YYyJZyKzpEBT+o-Vmn7dedRfVW+wVh1KVM7oaWT63+qBg@mail.gmail.com>
 <CAKOb=YYRu94CRC8Fd1TrWezHig6Od_uNpO2f+tCBQTBNQVjtog@mail.gmail.com>
 <CAMOc5cz+knVK=skEz1z=WNAjd5mL3DeOVBasHnJ6ggsNtiQdbA@mail.gmail.com>
 <CAKOb=YYxEo2O09t3Fq9hw3hoLegDgDbouF6XwasKS-yGRbPQEQ@mail.gmail.com>
 <CAFOYbcnJtS5W+w2Vob16QVGFimn=iWyUXXbTfaEhr_a5KXbFWg@mail.gmail.com>
Date: Mon, 4 Mar 2013 13:58:34 -0500
Message-ID: <CACpH0Mfi2VKuCtr=7ErYT1yVYUMA5Pfg6SRO2wYo_OF5CExgQQ@mail.gmail.com>
Subject: Re: igb network lockups
From: Zaphod Beeblebrox <zbeeble@gmail.com>
To: Jack Vogel <jfvogel@gmail.com>
Content-Type: text/plain; charset=ISO-8859-1
X-Content-Filtered-By: Mailman/MimeDel 2.1.14
Cc: Nick Rogers <ncrogers@gmail.com>, Sepherosa Ziehau <sepherosa@gmail.com>,
 "Christopher D. Harrison" <harrison@biostat.wisc.edu>,
 "freebsd-net@freebsd.org" <freebsd-net@freebsd.org>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 04 Mar 2013 18:58:35 -0000

For everyone having lockup problems with igb, I'd like to ask whether you
could try disabling hyperthreading. This worked for me on one system but has
been unnecessary on others.

From owner-freebsd-net@FreeBSD.ORG  Mon Mar  4 20:13:55 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id 1B19DB44
 for <freebsd-net@freebsd.org>; Mon,  4 Mar 2013 20:13:55 +0000 (UTC)
 (envelope-from ncrogers@gmail.com)
Received: from mail-ve0-f181.google.com (mail-ve0-f181.google.com
 [209.85.128.181])
 by mx1.freebsd.org (Postfix) with ESMTP id AD4551190
 for <freebsd-net@freebsd.org>; Mon,  4 Mar 2013 20:13:54 +0000 (UTC)
Received: by mail-ve0-f181.google.com with SMTP id d10so5026517vea.40
 for <freebsd-net@freebsd.org>; Mon, 04 Mar 2013 12:13:48 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:x-received:in-reply-to:references:date:message-id
 :subject:from:to:cc:content-type;
 bh=S3Ck/6jUPhI7MikRgsek4u7CKU45PgCwVlbwCZbg6+A=;
 b=V75Eag5ZkSsaoz7FFPnrF93rOz0Gswj5oDLWy/fouoBHBmfCGChhwyljARq564Y9hA
 STjAhJCmcok9WQoF8rsENIHatGGTIrXFSaBBXd5c3JToxJPxkp3j/YP92D/PqNAa1cqi
 GOHNJxy0aFHbVf2y3RF7ZFvkQweWW+5X5hcK50N8ZwzUQQ6uhgLT1fOnIJ+2fO3qAecm
 o7U5r39M28d5Q/Rsau7Fv/jdA5wCVaH9Qr20KY2krZoMbybjaAv5IkROj0sZtPOJ5hIp
 KbTTTqa1lYMXlnjNxTqmn6ZZk/RJVlGQaFhakvog6yV+oRKB+p06GQ/aw7m2SJReBSBA
 sdsg==
MIME-Version: 1.0
X-Received: by 10.52.16.40 with SMTP id c8mr7263867vdd.99.1362428028434; Mon,
 04 Mar 2013 12:13:48 -0800 (PST)
Received: by 10.52.176.131 with HTTP; Mon, 4 Mar 2013 12:13:48 -0800 (PST)
In-Reply-To: <CAFOYbcnJtS5W+w2Vob16QVGFimn=iWyUXXbTfaEhr_a5KXbFWg@mail.gmail.com>
References: <512BAA60.3060703@biostat.wisc.edu>
 <CAFOYbckDFJKRip+e=a+_JPHhk+HbAikRBK0dHEBDDEgdsZT6sw@mail.gmail.com>
 <512BAF8D.7080308@biostat.wisc.edu>
 <CAFOYbcnEN=Pzd9k4hvR+wqP3_HJj3-QRQSwocfHDSehUH5YPXA@mail.gmail.com>
 <CAKOb=YYyJZyKzpEBT+o-Vmn7dedRfVW+wVh1KVM7oaWT63+qBg@mail.gmail.com>
 <CAKOb=YYRu94CRC8Fd1TrWezHig6Od_uNpO2f+tCBQTBNQVjtog@mail.gmail.com>
 <CAMOc5cz+knVK=skEz1z=WNAjd5mL3DeOVBasHnJ6ggsNtiQdbA@mail.gmail.com>
 <CAKOb=YYxEo2O09t3Fq9hw3hoLegDgDbouF6XwasKS-yGRbPQEQ@mail.gmail.com>
 <CAFOYbcnJtS5W+w2Vob16QVGFimn=iWyUXXbTfaEhr_a5KXbFWg@mail.gmail.com>
Date: Mon, 4 Mar 2013 12:13:48 -0800
Message-ID: <CAKOb=YZhj9wx7NLSKFGmFhX2fa57u99W2o3La-vvbuKLDRMCkQ@mail.gmail.com>
Subject: Re: igb network lockups
From: Nick Rogers <ncrogers@gmail.com>
To: Jack Vogel <jfvogel@gmail.com>
Content-Type: text/plain; charset=ISO-8859-1
Cc: Sepherosa Ziehau <sepherosa@gmail.com>,
 "freebsd-net@freebsd.org" <freebsd-net@freebsd.org>,
 "Christopher D. Harrison" <harrison@biostat.wisc.edu>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 04 Mar 2013 20:13:55 -0000

On Mon, Mar 4, 2013 at 10:22 AM, Jack Vogel <jfvogel@gmail.com> wrote:
>
>>
>> Thanks. So on FreeBSD 9.1-RELEASE it is advisable to set
>> hw.em.enable_msix=0 for 82574L? Are there other em(x) NICs where this
>> is advisable?
>>
>
> As I explained in a previous email, this is not advisable unless you are
> experiencing problems (like hangs), if you are then it's one possible
> cause, so try falling back to MSI to see if it eliminates your problem.
>
> And, 82574 is the only device the em driver supports at present that is
> capable of MSIX, all others use the igb driver.
>

Jack, thanks for clarifying. It's much appreciated.

> Regards,
>
> Jack
>

From owner-freebsd-net@FreeBSD.ORG  Tue Mar  5 03:21:28 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id 313B7D60;
 Tue,  5 Mar 2013 03:21:28 +0000 (UTC)
 (envelope-from lstewart@freebsd.org)
Received: from lauren.room52.net (lauren.room52.net [210.50.193.198])
 by mx1.freebsd.org (Postfix) with ESMTP id BB199B81;
 Tue,  5 Mar 2013 03:21:27 +0000 (UTC)
Received: from lstewart.caia.swin.edu.au (lstewart.caia.swin.edu.au
 [136.186.229.95])
 by lauren.room52.net (Postfix) with ESMTPSA id 3210E7E820;
 Tue,  5 Mar 2013 14:21:18 +1100 (EST)
Message-ID: <513564AD.7000006@freebsd.org>
Date: Tue, 05 Mar 2013 14:21:17 +1100
From: Lawrence Stewart <lstewart@freebsd.org>
User-Agent: Mozilla/5.0 (X11; FreeBSD amd64;
 rv:17.0) Gecko/20130213 Thunderbird/17.0.2
MIME-Version: 1.0
To: Andre Oppermann <andre@freebsd.org>
Subject: Re: Bug in sbsndptr()
References: <512CBADB.3050004@freebsd.org> <5134CD5D.6090107@freebsd.org>
In-Reply-To: <5134CD5D.6090107@freebsd.org>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
X-Spam-Status: No, score=0.0 required=5.0 tests=UNPARSEABLE_RELAY
 autolearn=unavailable version=3.3.2
X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on lauren.room52.net
Cc: "freebsd-net@freebsd.org" <freebsd-net@freebsd.org>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 05 Mar 2013 03:21:28 -0000

On 03/05/13 03:35, Andre Oppermann wrote:
> On 26.02.2013 14:38, Lawrence Stewart wrote:
>> Hi Andre,
> 
> Hi Lawrence, :-)
> 
>> A colleague and I spent a very frustrating day tracing an accounting bug
>> in the multipath TCP patch we're working on at CAIA to a bug in
>> sbsndptr(). I haven't tested it with regular TCP yet, but I believe the
>> following patch fixes the bug (proposed commit log message is at the top
>> of the patch):
>>
>> http://people.freebsd.org/~lstewart/patches/misctcp/sbsndptr_mnext_10.x.r247314.diff
>>
>>
>> The patch should have no tangible effect to operation other than to
>> ensure the function delivers on the promise to return the closest mbuf
>> in the chain for the given offset.
> 
> I agree that the description of sbsndptr() can be misleading as it refers
> to the point in time when the pointer was updated last.  Relative to now
> the real offset may be at the beginning of the next mbuf.

Right, and we ran into the issue because we made an assumption based on
the use of the present tense in the comment:

    "Return closest mbuf in chain for current offset."

> As you note in the proposed commit message by the time the send pointer
> is calculated we may have reached the end of the chain and must avoid
> storing a NULL pointer.  The mbuf copy routines simply skip over the
> additional mbuf in the chain using the returned offset.
> 
> I wonder how this has caused trouble with your multipath patch.  You'd
> have to copy the sockbuf contents as well and unless you're using custom
> sockbuf and mbuf chain functions this shouldn't be a problem.  Using
> custom functions on a socket buffer is a delicate approach.  For a sockbuf
> consumer being able to handle valid offsets into an mbuf chain is a core
> feature and must-have part of the functionality.

No custom sockbuf or mbuf routines are in use. We've implemented a
mapping shim between subflows and the socket buffer. When a subflow asks
the multipath layer for some data to send, the multipath layer returns a
mapping onto the socket buffer, which will remain valid until such time
as the subflow has marked the mapped data as acknowledged.

Part of the map accounting is tracking the pointer of the first mbuf in
the sockbuf where the map's data begins. Our accounting assumed the mbuf
+ the offset returned by sbsndptr had data available, which is how we
triggered the problem. We could have accounted for the issue in our new
map accounting code, but that would add additional complexity to some
already complex code and the better solution is to make sbsndptr DTRT.

>> I would appreciate a review and any thoughts.
> 
> I think you have found a valid (micro-)optimization.  However you're
> still making a dangerous assumption in that the next mbuf is indeed
> the one you want.  This may not be true in subtle ways when the chain
> contains m_len=0 mbufs in it.  I'm not aware of it actually happening
> but it can't be ruled out either if custom sockbuf manipulation functions
> are in use.

True, though I'm struggling to think why there would be m_len=0 mbufs
interspersed with m_len > 0 mbufs in a socket send buffer mbuf chain.

> I'd recommend the following:
> have your custom sockbuf function handle forward seeking like the other
> m_copy() functions; and/or apply a patch along the (untested) example
> below.

If you believe it is both correct and possible for m_len=0 mbufs to
exist in a socket buffer chain, then I agree that we should amend my
proposed patch to loop and skip over m_len=0 mbufs as you've suggested.

However, I'm more inclined to suspect it is undesirable and potentially
buggy behaviour to end up with m_len=0 mbufs in a socket buffer chain on
which sbsndptr is being used, and would instead suggest a
"KASSERT(ret->m_len > 0, (...));" be added to the end of my proposed if
block.
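
To make the two options concrete, here is a minimal userland sketch. It is
not the kernel's sbsndptr() and not the patch linked above: struct toy_mbuf,
toy_seek_skip() and toy_seek_strict() are hypothetical names invented for
illustration, only the m_next and m_len fields are modelled, and a plain
assert() stands in for KASSERT.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for struct mbuf: only the two fields this sketch needs. */
struct toy_mbuf {
	struct toy_mbuf *m_next;
	int m_len;
};

/*
 * Option 1 (loop): while the offset points at or past the end of m and
 * a next mbuf exists, move forward.  This also steps over m_len == 0
 * mbufs.  The last mbuf in the chain is never stepped past, so the
 * result is never NULL.
 */
static struct toy_mbuf *
toy_seek_skip(struct toy_mbuf *m, int *off)
{
	while (*off >= m->m_len && m->m_next != NULL) {
		*off -= m->m_len;
		m = m->m_next;
	}
	return (m);
}

/*
 * Option 2 (assert): step forward at most once, and treat an empty
 * mbuf at the landing spot as a bug, in the spirit of a KASSERT.
 */
static struct toy_mbuf *
toy_seek_strict(struct toy_mbuf *m, int *off)
{
	if (*off == m->m_len && m->m_next != NULL) {
		m = m->m_next;
		*off = 0;
		assert(m->m_len > 0);
	}
	return (m);
}
```

With a chain of 100, 0 and 50 bytes and an offset of 100 into the first
mbuf, toy_seek_skip() lands on the third mbuf at offset 0, whereas
toy_seek_strict() trips its assertion on the empty one. Which behaviour is
preferable depends on whether empty mbufs in the send buffer are considered
legal.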

Thoughts?

Cheers,
Lawrence

From owner-freebsd-net@FreeBSD.ORG  Tue Mar  5 09:36:49 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@smarthost.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id 27969227;
 Tue,  5 Mar 2013 09:36:49 +0000 (UTC)
 (envelope-from glebius@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org
 [IPv6:2001:1900:2254:206c::16:87])
 by mx1.freebsd.org (Postfix) with ESMTP id ED9ADE07;
 Tue,  5 Mar 2013 09:36:48 +0000 (UTC)
Received: from freefall.freebsd.org (localhost [127.0.0.1])
 by freefall.freebsd.org (8.14.6/8.14.6) with ESMTP id r259amPH057833;
 Tue, 5 Mar 2013 09:36:48 GMT
 (envelope-from glebius@freefall.freebsd.org)
Received: (from glebius@localhost)
 by freefall.freebsd.org (8.14.6/8.14.6/Submit) id r259ampR057832;
 Tue, 5 Mar 2013 09:36:48 GMT (envelope-from glebius)
Date: Tue, 5 Mar 2013 09:36:48 GMT
Message-Id: <201303050936.r259ampR057832@freefall.freebsd.org>
To: asa@cs.txstate.edu, glebius@FreeBSD.org, freebsd-net@FreeBSD.org,
 glebius@FreeBSD.org
From: glebius@FreeBSD.org
Subject: Re: kern/176510: [udp] [panic] Kernel Panic in udp_input @ offset
 0x475
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 05 Mar 2013 09:36:49 -0000

Synopsis: [udp] [panic] Kernel Panic in udp_input @ offset 0x475

State-Changed-From-To: open->closed
State-Changed-By: glebius
State-Changed-When: Tue Mar 5 09:36:16 UTC 2013
State-Changed-Why: 
Fixed in stable/9 in r241435.


Responsible-Changed-From-To: freebsd-net->glebius
Responsible-Changed-By: glebius
Responsible-Changed-When: Tue Mar 5 09:36:16 UTC 2013
Responsible-Changed-Why: 
Fixed in stable/9 in r241435.

http://www.freebsd.org/cgi/query-pr.cgi?pr=176510

From owner-freebsd-net@FreeBSD.ORG  Tue Mar  5 13:54:56 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id 512B3616
 for <freebsd-net@freebsd.org>; Tue,  5 Mar 2013 13:54:56 +0000 (UTC)
 (envelope-from s.khanchi@gmail.com)
Received: from mail-we0-x229.google.com (mail-we0-x229.google.com
 [IPv6:2a00:1450:400c:c03::229])
 by mx1.freebsd.org (Postfix) with ESMTP id D8DB5FD1
 for <freebsd-net@freebsd.org>; Tue,  5 Mar 2013 13:54:55 +0000 (UTC)
Received: by mail-we0-f169.google.com with SMTP id t11so6486599wey.28
 for <freebsd-net@freebsd.org>; Tue, 05 Mar 2013 05:54:55 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=x-received:mime-version:sender:from:date:x-google-sender-auth
 :message-id:subject:to:content-type;
 bh=X2eJpBnhlRRhOEaQq7XHdAt0DaO55myJly0uxs/yZgc=;
 b=trTFzGsxRRuUd0RK+tQ96x+81932bV/C26KtHSRfbmSKDp+/7/nYpnsT8MQwbgPlF8
 UUPFfS85Z2+24fSybHu8Z+IpsIGRIpCGb6XebHhFZnwdK6uovh8izfA2MHsD/3cWshVr
 gOKUMr0oVQbZkuewhvQ62L1UW6KN1i6sUhbI7sS0cqvftG6LjkFvOtVj7Jmn0V09jaUe
 84LM6Pv1xex8grIwtZj4GxFJiZ8Wj4mFBzNAzUku4VAyV8gzlCeP97ZOOFjnDz1Ftjxg
 Qdb9zV66+cWAW6zRJGM04UyjzGhgMFt3h4KZYyZ6sTsDisGh5Rq8dpexPwQzafQAhLVj
 0U5A==
X-Received: by 10.180.81.164 with SMTP id b4mr18908535wiy.34.1362491689049;
 Tue, 05 Mar 2013 05:54:49 -0800 (PST)
MIME-Version: 1.0
Sender: s.khanchi@gmail.com
Received: by 10.194.121.104 with HTTP; Tue, 5 Mar 2013 05:54:29 -0800 (PST)
From: h bagade <bagadeh@gmail.com>
Date: Tue, 5 Mar 2013 17:24:29 +0330
X-Google-Sender-Auth: Qb6d16XEcq-m2MOhHnUmU3zsT30
Message-ID: <CAARSjE3h87y00_JeurzPzmkDaU5C58v=iLB-etwJ0RdtLh5f+g@mail.gmail.com>
Subject: how to get mac address info in kernel code?
To: freebsd-net@freebsd.org
Content-Type: text/plain; charset=ISO-8859-1
X-Content-Filtered-By: Mailman/MimeDel 2.1.14
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 05 Mar 2013 13:54:56 -0000

Hi all,

I need to get interface MAC address within the kernel code and I couldn't
use "getifaddrs" because it's user-mode. How can I have the MAC address
information within kernel code?

Any hints or comments are really appreciated.

From owner-freebsd-net@FreeBSD.ORG  Tue Mar  5 14:03:57 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id 0075E801
 for <freebsd-net@freebsd.org>; Tue,  5 Mar 2013 14:03:56 +0000 (UTC)
 (envelope-from andre@freebsd.org)
Received: from c00l3r.networx.ch (c00l3r.networx.ch [62.48.2.2])
 by mx1.freebsd.org (Postfix) with ESMTP id 77294120
 for <freebsd-net@freebsd.org>; Tue,  5 Mar 2013 14:03:56 +0000 (UTC)
Received: (qmail 41221 invoked from network); 5 Mar 2013 15:17:45 -0000
Received: from c00l3r.networx.ch (HELO [127.0.0.1]) ([62.48.2.2])
 (envelope-sender <andre@freebsd.org>)
 by c00l3r.networx.ch (qmail-ldap-1.03) with SMTP
 for <lstewart@freebsd.org>; 5 Mar 2013 15:17:45 -0000
Message-ID: <5135FB48.1000809@freebsd.org>
Date: Tue, 05 Mar 2013 15:03:52 +0100
From: Andre Oppermann <andre@freebsd.org>
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64;
 rv:17.0) Gecko/20130107 Thunderbird/17.0.2
MIME-Version: 1.0
To: Lawrence Stewart <lstewart@freebsd.org>
Subject: Re: Bug in sbsndptr()
References: <512CBADB.3050004@freebsd.org> <5134CD5D.6090107@freebsd.org>
 <513564AD.7000006@freebsd.org>
In-Reply-To: <513564AD.7000006@freebsd.org>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: "freebsd-net@freebsd.org" <freebsd-net@freebsd.org>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 05 Mar 2013 14:03:57 -0000

On 05.03.2013 04:21, Lawrence Stewart wrote:
> On 03/05/13 03:35, Andre Oppermann wrote:
>> On 26.02.2013 14:38, Lawrence Stewart wrote:
>>> Hi Andre,
>>
>> Hi Lawrence, :-)
>>
>>> A colleague and I spent a very frustrating day tracing an accounting bug
>>> in the multipath TCP patch we're working on at CAIA to a bug in
>>> sbsndptr(). I haven't tested it with regular TCP yet, but I believe the
>>> following patch fixes the bug (proposed commit log message is at the top
>>> of the patch):
>>>
>>> http://people.freebsd.org/~lstewart/patches/misctcp/sbsndptr_mnext_10.x.r247314.diff
>>>
>>>
>>> The patch should have no tangible effect to operation other than to
>>> ensure the function delivers on the promise to return the closest mbuf
>>> in the chain for the given offset.
>>
>> I agree that the description of sbsndptr() can be misleading as it refers
>> to the point in time when the pointer was updated last.  Relative to now
>> the real offset may be at the beginning of the next mbuf.
>
> Right, and we ran into the issue because we made an assumption based on
> the use of the present tense in the comment:
>
>      "Return closest mbuf in chain for current offset."

I apologize for the incorrect and misleading description. :-)

>> As you note in the proposed commit message by the time the send pointer
>> is calculated we may have reached the end of the chain and must avoid
>> storing a NULL pointer.  The mbuf copy routines simply skip over the
>> additional mbuf in the chain using the returned offset.
>>
>> I wonder how this has caused trouble with your multipath patch.  You'd
>> have to copy the sockbuf contents as well and unless you're using custom
>> sockbuf and mbuf chain functions this shouldn't be a problem.  Using
>> custom functions on a socket buffer is a delicate approach.  For a sockbuf
>> consumer being able to handle valid offsets into an mbuf chain is a core
>> feature and must-have part of the functionality.
>
> No custom sockbuf or mbuf routines are in use. We've implemented a
> mapping shim between subflows and the socket buffer. When a subflow asks
> the multipath layer for some data to send, the multipath layer returns a
> mapping onto the socket buffer, which will remain valid until such time
> as the subflow has marked the mapped data as acknowledged.
>
> Part of the map accounting is tracking the pointer of the first mbuf in
> the sockbuf where the map's data begins. Our accounting assumed the mbuf
> + the offset returned by sbsndptr had data available, which is how we
> triggered the problem. We could have accounted for the issue in our new
> map accounting code, but that would add additional complexity to some
> already complex code and the better solution is to make sbsndptr DTRT.

So effectively you run a separate sbsndptr for each subflow using the
real sbsndptr to track the head of the queue?

/me fears the day a mptcp import comes up.  tcp-complexity^^3. :-o

>>> I would appreciate a review and any thoughts.
>>
>> I think you have found a valid (micro-)optimization.  However you're
>> still making a dangerous assumption in that the next mbuf is indeed
>> the one you want.  This may not be true in subtle ways when the chain
>> contains m_len=0 mbufs in it.  I'm not aware of it actually happening
>> but it can't be ruled out either if custom sockbuf manipulation functions
>> are in use.
>
> True, though I'm struggling to think why there would be m_len=0 mbufs
> interspersed with m_len > 0 mbufs in a socket send buffer mbuf chain.

sbcompress() doesn't allow for m_len=0 mbufs.  This holds true as long
as the sbappend functions are used.  If not, we may get anything there.
As long as nobody is using custom sockbuf appends we're safe.  I first
assumed from your description that some custom sockbuf munging was
involved, in which case the guarantee wouldn't have been there anymore.

>> I'd recommend the following:
>> have your custom sockbuf function handle forward seeking like the other
>> m_copy() functions; and/or apply a patch along the (untested) example
>> below.
>
> If you believe it is both correct and possible for m_len=0 mbufs to
> exist in a socket buffer chain, then I agree that we should amend my
> proposed patch to loop and skip over m_len=0 mbufs as you've suggested.

No.  So far it is neither possible nor correct.

> However, I'm more inclined to suspect it is undesirable and potentially
> buggy behaviour to end up with m_len=0 mbufs in a socket buffer chain on
> which sbsndptr is being used, and would instead suggest a
> "KASSERT(ret->m_len > 0, (...));" be added to the end of my proposed if
> block.

Agreed.

-- 
Andre
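
For readers following the thread, here is a minimal sketch of the
"forward seeking" idea discussed above: given a cached (mbuf, offset)
pair, walk forward until the offset lands inside an mbuf that actually
holds data.  This is not the proposed patch.  It is a userland mock,
and struct mock_mbuf and mock_seek() are hypothetical stand-ins for
struct mbuf and the real sockbuf code.  The loop also steps over
m_len=0 mbufs, which is the behaviour the KASSERT alternative would
instead treat as a bug.

```c
#include <stddef.h>

/* Simplified stand-in for struct mbuf: only the fields the walk needs. */
struct mock_mbuf {
	struct mock_mbuf *m_next; /* next buffer in the chain */
	int               m_len;  /* bytes of data held in this buffer */
};

/*
 * Move forward from (m, *offp) until the offset falls inside an mbuf
 * that holds data.  On return *offp is relative to the returned mbuf.
 * Returns NULL when the offset is at or past the end of the chain, so
 * a caller can avoid caching a pointer that has run off the end.
 * Assumes *offp is non-negative.
 */
struct mock_mbuf *
mock_seek(struct mock_mbuf *m, int *offp)
{
	int off = *offp;

	/* An offset equal to m_len belongs to the next mbuf, not this one. */
	while (m != NULL && off >= m->m_len) {
		off -= m->m_len;
		m = m->m_next;
	}
	*offp = off;
	return (m);
}
```

With a chain of 100, 0 and 50 bytes, an offset of 100 resolves to the
third mbuf at offset 0 rather than to the end of the first one.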


From owner-freebsd-net@FreeBSD.ORG  Tue Mar  5 15:53:43 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id 46BA846B
 for <freebsd-net@freebsd.org>; Tue,  5 Mar 2013 15:53:43 +0000 (UTC)
 (envelope-from gnn@neville-neil.com)
Received: from vps.hungerhost.com (vps.hungerhost.com [216.38.53.176])
 by mx1.freebsd.org (Postfix) with ESMTP id 0A1059F6
 for <freebsd-net@freebsd.org>; Tue,  5 Mar 2013 15:53:42 +0000 (UTC)
Received: from [209.249.190.124] (port=53951
 helo=dhcp-10-2-210-24.hudson-trading.com)
 by vps.hungerhost.com with esmtpsa (TLSv1:AES128-SHA:128) (Exim 4.80)
 (envelope-from <gnn@neville-neil.com>)
 id 1UCuBS-0005rG-Cu; Tue, 05 Mar 2013 10:53:42 -0500
Content-Type: text/plain; charset=us-ascii
Mime-Version: 1.0 (Mac OS X Mail 6.2 \(1499\))
Subject: Re: how to get mac address info in kernel code?
From: George Neville-Neil <gnn@neville-neil.com>
In-Reply-To: <CAARSjE3h87y00_JeurzPzmkDaU5C58v=iLB-etwJ0RdtLh5f+g@mail.gmail.com>
Date: Tue, 5 Mar 2013 10:53:42 -0500
Content-Transfer-Encoding: quoted-printable
Message-Id: <8EB66934-D33C-425E-A076-66E31B618DCA@neville-neil.com>
References: <CAARSjE3h87y00_JeurzPzmkDaU5C58v=iLB-etwJ0RdtLh5f+g@mail.gmail.com>
To: h bagade <bagadeh@gmail.com>
X-Mailer: Apple Mail (2.1499)
X-AntiAbuse: This header was added to track abuse,
 please include it with any abuse report
X-AntiAbuse: Primary Hostname - vps.hungerhost.com
X-AntiAbuse: Original Domain - freebsd.org
X-AntiAbuse: Originator/Caller UID/GID - [47 12] / [47 12]
X-AntiAbuse: Sender Address Domain - neville-neil.com
X-Get-Message-Sender-Via: vps.hungerhost.com: authenticated_id:
 gnn@neville-neil.com
Cc: freebsd-net@freebsd.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 05 Mar 2013 15:53:43 -0000


On Mar 5, 2013, at 08:54 , h bagade <bagadeh@gmail.com> wrote:

> Hi all,
>
> I need to get interface MAC address within the kernel code and I couldn't
> use "getifaddrs" because it's user-mode. How can I have the MAC address
> information within kernel code?
>
> Any hints or comments are really appreciated.

If you have access to the struct ifnet you can look at the if_addr member,
which points to a struct ifaddr, defined in if_var.h.  Its ifa_addr field
points to a struct sockaddr_dl that holds the link-level (MAC) address.

Best,
George
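
A minimal sketch of the lookup described above, for illustration only.
In the kernel the address bytes are normally reached with the LLADDR()
macro from <net/if_dl.h>, or IF_LLADDR(ifp) from if_var.h.  The code
below is a userland mock so it can be compiled outside the kernel: the
mock_ structures and mock_get_mac() are hypothetical, cut-down
stand-ins for struct ifnet, struct ifaddr and struct sockaddr_dl, and
the real ifa_addr is a struct sockaddr pointer that has to be cast.

```c
#include <string.h>

#define ETHER_ADDR_LEN 6

/* Cut-down stand-in for struct sockaddr_dl (see <net/if_dl.h>). */
struct mock_sockaddr_dl {
	unsigned char sdl_nlen;     /* length of the interface name */
	unsigned char sdl_alen;     /* length of the link-level address */
	char          sdl_data[46]; /* name first, then the address bytes */
};

/* Same idea as the kernel's LLADDR(): the address follows the name. */
#define MOCK_LLADDR(s) \
	((const unsigned char *)((s)->sdl_data + (s)->sdl_nlen))

struct mock_ifaddr { struct mock_sockaddr_dl *ifa_addr; };
struct mock_ifnet  { struct mock_ifaddr *if_addr; };

/* Copy the MAC into mac[]; return 0 on success, -1 if there is none. */
int
mock_get_mac(const struct mock_ifnet *ifp, unsigned char mac[ETHER_ADDR_LEN])
{
	const struct mock_sockaddr_dl *sdl;

	if (ifp == NULL || ifp->if_addr == NULL ||
	    ifp->if_addr->ifa_addr == NULL)
		return (-1);
	sdl = ifp->if_addr->ifa_addr;
	/* Only accept Ethernet-sized addresses in this sketch. */
	if (sdl->sdl_alen != ETHER_ADDR_LEN)
		return (-1);
	memcpy(mac, MOCK_LLADDR(sdl), ETHER_ADDR_LEN);
	return (0);
}
```

Real kernel code would also need to hold whatever reference or lock
keeps the ifnet alive while the address is being read.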


From owner-freebsd-net@FreeBSD.ORG  Tue Mar  5 17:39:40 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id E953118D
 for <freebsd-net@freebsd.org>; Tue,  5 Mar 2013 17:39:40 +0000 (UTC)
 (envelope-from ncrogers@gmail.com)
Received: from mail-vb0-x231.google.com (mail-vb0-x231.google.com
 [IPv6:2607:f8b0:400c:c02::231])
 by mx1.freebsd.org (Postfix) with ESMTP id A22A81D3
 for <freebsd-net@freebsd.org>; Tue,  5 Mar 2013 17:39:40 +0000 (UTC)
Received: by mail-vb0-f49.google.com with SMTP id s24so1348745vbi.22
 for <freebsd-net@freebsd.org>; Tue, 05 Mar 2013 09:39:40 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:x-received:date:message-id:subject:from:to
 :content-type; bh=7F9ZNFFu2MhjeuJJobpDMVVbM3mN86LBeufB7a0f1Jo=;
 b=0U0dbviakVyoj1lH4prDPDIkTry/sBF8Isqij9cQqC12SKIu5eAtP9NaYb0KbwiFb0
 4Ram6q07P082eBYcKF6TRAkIU89J0iWnVcQ0A3/zhWTtLpaImErjDO2W9TFOhwE6SnwL
 GhadL0nD/PcbSytEsUo6lUAGS1OtOJKv8jNw4ITfx69MAuroMq1BStSYZzx5OfBfIpID
 m8uzKxYbX3zl/c5LqLVKP1raqCzb2Aus/icS8u1CwC4AFFyRzPxgi2yYcd18HzO6CpJj
 5/4/oOMn+3HPmTbWxhs/T80nvuviS2Ja4//MsCTT1fvsxGzs8nFxMvrLkqeaCXdqPEDI
 lxGQ==
MIME-Version: 1.0
X-Received: by 10.220.227.131 with SMTP id ja3mr8434935vcb.54.1362505180089;
 Tue, 05 Mar 2013 09:39:40 -0800 (PST)
Received: by 10.52.176.131 with HTTP; Tue, 5 Mar 2013 09:39:39 -0800 (PST)
Date: Tue, 5 Mar 2013 09:39:39 -0800
Message-ID: <CAKOb=YYGu6mr-3nyydBi9K-FHPnEx-fKSZ2=r_uDVeY9pvrqtQ@mail.gmail.com>
Subject: Default route changes unexpectedly
From: Nick Rogers <ncrogers@gmail.com>
To: "freebsd-net@freebsd.org" <freebsd-net@freebsd.org>
Content-Type: text/plain; charset=ISO-8859-1
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 05 Mar 2013 17:39:41 -0000

Hello,

I am attempting to create awareness of a serious issue affecting users
of FreeBSD 9.x and PF. There appears to be a bug that allows the
kernel's routing table to be corrupted by traffic routing through the
system. Under heavy traffic load, the default route can seemingly
randomly change to an IP address that is not directly connected to the
network (i.e., is not configured anywhere). Dhclient is not in the
mix, nor is routed, bgpd, etc. Running `route monitor` shows no
evidence of the change in the default route. The one commonality
between all the systems experiencing this problem seems to be the use
of PF.

Obviously this is a serious problem as it causes all Internet-bound
traffic to stop routing until the default route is corrected. Some
users, including myself, are working around this problem by installing
a script that runs multiple times a second to check if the default
route is incorrect and fixing it if necessary, which mitigates the
amount of downtime caused by the bug.

Please refer to these past posts for more examples and evidence of
other users experiencing this problem:

http://forums.freebsd.org/showthread.php?p=211610#post211610

http://freebsd.1045724.n5.nabble.com/Default-route-quot-random-quot-gateway-modification-bug-td5750820.html

http://lists.freebsd.org/pipermail/freebsd-net/2012-March/031879.html

http://lists.freebsd.org/pipermail/freebsd-ipfw/2010-September/004361.html

There is also a PR that was incorrectly labeled as an IPFW issue.
Others and I believe this issue is not restricted to the use of
IPFW and that the PR should be relabeled. I am inclined to think it is
strictly a PF issue since I am not using IPFW. However, there is
evidence of the default route changing for people using IPFW on past
versions of FreeBSD (7.x/8.x), so perhaps this is related.

http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/174749

Another PR covers the same problem but is specific to IPFW and 8.2-RELEASE:

http://www.freebsd.org/cgi/query-pr.cgi?pr=157796

I am hoping someone reading this can give the problem the attention it
deserves. Thank you.

-Nick
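
The workaround mentioned above (poll the default route and repair it)
could be sketched roughly as follows.  This is not the script from the
message, whose contents are not shown.  It assumes the wrong gateway
is visible in the output of `route -n get default`, which prints a
line of the form "    gateway: <address>".  parse_gateway() and
check_default_route() are hypothetical helper names, and the repair
command is only built as a string, not executed.

```c
#include <stdio.h>
#include <string.h>

/*
 * Extract the gateway from `route -n get default` output.
 * Returns 0 on success, -1 if no usable "gateway:" line is found.
 */
int
parse_gateway(const char *out, char *gw, size_t gwlen)
{
	const char *p = strstr(out, "gateway:");
	size_t n;

	if (p == NULL)
		return (-1);
	p += strlen("gateway:");
	while (*p == ' ' || *p == '\t')
		p++;
	n = strcspn(p, " \t\r\n");	/* length of the address token */
	if (n == 0 || n >= gwlen)
		return (-1);
	memcpy(gw, p, n);
	gw[n] = '\0';
	return (0);
}

/*
 * Compare the current gateway with the expected one.  Returns 0 if
 * the route is fine, -1 if the output could not be parsed, and 1 if a
 * fix is needed, in which case cmd holds the repair command.
 */
int
check_default_route(const char *out, const char *expected,
    char *cmd, size_t cmdlen)
{
	char gw[64];

	if (parse_gateway(out, gw, sizeof(gw)) != 0)
		return (-1);
	if (strcmp(gw, expected) == 0)
		return (0);
	snprintf(cmd, cmdlen, "route change default %s", expected);
	return (1);
}
```

A polling loop would feed this the command output a few times per
second and run the returned command when the result is 1.  That only
shortens the outage; it does nothing about the underlying bug.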

From owner-freebsd-net@FreeBSD.ORG  Tue Mar  5 21:18:05 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id 4DC7A836
 for <freebsd-net@freebsd.org>; Tue,  5 Mar 2013 21:18:05 +0000 (UTC)
 (envelope-from barney_cordoba@yahoo.com)
Received: from nm5-vm1.bullet.mail.ne1.yahoo.com
 (nm5-vm1.bullet.mail.ne1.yahoo.com [98.138.91.32])
 by mx1.freebsd.org (Postfix) with ESMTP id D3E90DDA
 for <freebsd-net@freebsd.org>; Tue,  5 Mar 2013 21:18:04 +0000 (UTC)
Received: from [98.138.226.176] by nm5.bullet.mail.ne1.yahoo.com with NNFMP;
 05 Mar 2013 21:17:58 -0000
Received: from [98.138.226.166] by tm11.bullet.mail.ne1.yahoo.com with NNFMP;
 05 Mar 2013 21:17:58 -0000
Received: from [127.0.0.1] by omp1067.mail.ne1.yahoo.com with NNFMP;
 05 Mar 2013 21:17:57 -0000
X-Yahoo-Newman-Property: ymail-3
X-Yahoo-Newman-Id: 988950.57047.bm@omp1067.mail.ne1.yahoo.com
Received: (qmail 11860 invoked by uid 60001); 5 Mar 2013 21:17:57 -0000
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yahoo.com; s=s1024;
 t=1362518277; bh=QeBARCoW0PPYZndkrrLFXYjsSVbP6hHpuGGFJLWRtK4=;
 h=X-YMail-OSG:Received:X-Rocket-MIMEInfo:X-Mailer:Message-ID:Date:From:Subject:To:Cc:In-Reply-To:MIME-Version:Content-Type;
 b=zy4PMMWzSpm6IJcF154tLUSpo9zqhAJwyC61qKpM70RNLrgP1vI/5T10Rt9x0pUr+jf+wI3dMwAyMh+ZudDyjkWEeJq8Vj50oXMvgvRTHeZrpgDRR8aIO6Gd6aQoZ5HmCBPqVtoU59lTBfYy4jJX3XJaWVnWI1crUlem/fIlArg=
DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=s1024; d=yahoo.com;
 h=X-YMail-OSG:Received:X-Rocket-MIMEInfo:X-Mailer:Message-ID:Date:From:Subject:To:Cc:In-Reply-To:MIME-Version:Content-Type;
 b=nJjw7CAoNhZKXaqIzpIpRn3PHFAqbKKRfCicQsm5HyVc9rLygjHBc/5tmr+r9VmsyWCRF94ffdDRl94iGMXkixSC++B4Y/taHzHfWDuSOuBkRe5WFaNj3TMn2MsC4ze3Kaw57GbDmypSDWCeTz1392ToIGyrGYB5+3+8qaxY8lI=;
X-YMail-OSG: .hcdfk0VM1lj8vyuBrDtNWgP0GPkC7pqIC3Far0gBQNFLab
 05UeLHBSwQHREiEB1D2jKsMgOq0t.1TlNW9Fetgv8oo13M4zDVpBof.plS69
 zg434GOIPDqb.2dTRLev89aD8apfKTOXM8SaqJfmAK_wBTzcPKYzvyi7UcsZ
 ymLVZLvhcuscrEnO8xXzge4ITQ0_2Y26kUnzBw5HSIy4D9Xoc2p2.cKQA1tG
 VXMhJ9gXq5XBUKIXGV5mPj7z35sDzkKi1UWoPzanpSxSM.my_DothvH_2HQ3
 SUL6ydwKcy5vgQXuHDpeGlFliuDqA6jyyopuXpBLVLpVu8NOwspOQsJanY9t
 BOu43kMDLDomaR07fUfS5iIbcQd6U09xX.eKVwSMzEnZhqHwQwYx.JtIxYpq
 WigNNfWQRsGeW86zfjDeDrnKmySEuCyySukRaKjDn05u._E2N1Wa63iGnMKW
 yGaH17mn2.0GEmZW5W0WghljPeyul98cUFQ5p9UmNP38sCDP1ZXToJNaKpiZ
 DYwKno4vh_p2dgiVdZt6Gnm1TleFn
Received: from [174.48.128.27] by web121603.mail.ne1.yahoo.com via HTTP;
 Tue, 05 Mar 2013 13:17:57 PST
X-Rocket-MIMEInfo: 001.001,
 DQoNCi0tLSBPbiBNb24sIDMvNC8xMywgWmFwaG9kIEJlZWJsZWJyb3ggPHpiZWVibGVAZ21haWwuY29tPiB3cm90ZToNCg0KPiBGcm9tOiBaYXBob2QgQmVlYmxlYnJveCA8emJlZWJsZUBnbWFpbC5jb20.DQo.IFN1YmplY3Q6IFJlOiBpZ2IgbmV0d29yayBsb2NrdXBzDQo.IFRvOiAiSmFjayBWb2dlbCIgPGpmdm9nZWxAZ21haWwuY29tPg0KPiBDYzogIk5pY2sgUm9nZXJzIiA8bmNyb2dlcnNAZ21haWwuY29tPiwgIlNlcGhlcm9zYSBaaWVoYXUiIDxzZXBoZXJvc2FAZ21haWwuY29tPiwgIkNocmlzdG9waGUBMAEBAQE-
X-Mailer: YahooMailClassic/15.1.4 YahooMailWebService/0.8.135.514
Message-ID: <1362518277.2420.YahooMailClassic@web121603.mail.ne1.yahoo.com>
Date: Tue, 5 Mar 2013 13:17:57 -0800 (PST)
From: Barney Cordoba <barney_cordoba@yahoo.com>
Subject: Re: igb network lockups
To: Jack Vogel <jfvogel@gmail.com>, Zaphod Beeblebrox <zbeeble@gmail.com>
In-Reply-To: <CACpH0Mfi2VKuCtr=7ErYT1yVYUMA5Pfg6SRO2wYo_OF5CExgQQ@mail.gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: Nick Rogers <ncrogers@gmail.com>, Sepherosa Ziehau <sepherosa@gmail.com>,
 "Christopher D. Harrison" <harrison@biostat.wisc.edu>,
 "freebsd-net@freebsd.org" <freebsd-net@freebsd.org>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 05 Mar 2013 21:18:05 -0000


--- On Mon, 3/4/13, Zaphod Beeblebrox <zbeeble@gmail.com> wrote:

> From: Zaphod Beeblebrox <zbeeble@gmail.com>
> Subject: Re: igb network lockups
> To: "Jack Vogel" <jfvogel@gmail.com>
> Cc: "Nick Rogers" <ncrogers@gmail.com>, "Sepherosa Ziehau" <sepherosa@gmail.com>, "Christopher D. Harrison" <harrison@biostat.wisc.edu>, "freebsd-net@freebsd.org" <freebsd-net@freebsd.org>
> Date: Monday, March 4, 2013, 1:58 PM
> For everyone having lockup problems with IGB, I'd like to ask if they could
> try disabling hyperthreads --- this worked for me on one system but has
> been unnecessary on others.

Gee, maybe binding an interrupt to a virtual cpu isn't a good idea?

BC

From owner-freebsd-net@FreeBSD.ORG  Wed Mar  6 01:54:18 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id C3E81CF3;
 Wed,  6 Mar 2013 01:54:18 +0000 (UTC)
 (envelope-from lstewart@freebsd.org)
Received: from lauren.room52.net (lauren.room52.net [210.50.193.198])
 by mx1.freebsd.org (Postfix) with ESMTP id 482419A6;
 Wed,  6 Mar 2013 01:54:17 +0000 (UTC)
Received: from lstewart.caia.swin.edu.au (lstewart.caia.swin.edu.au
 [136.186.229.95])
 by lauren.room52.net (Postfix) with ESMTPSA id 414EC7E84A;
 Wed,  6 Mar 2013 12:54:15 +1100 (EST)
Message-ID: <5136A1C6.4000406@freebsd.org>
Date: Wed, 06 Mar 2013 12:54:14 +1100
From: Lawrence Stewart <lstewart@freebsd.org>
User-Agent: Mozilla/5.0 (X11; FreeBSD amd64;
 rv:17.0) Gecko/20130213 Thunderbird/17.0.2
MIME-Version: 1.0
To: Andre Oppermann <andre@freebsd.org>
Subject: Re: Bug in sbsndptr()
References: <512CBADB.3050004@freebsd.org> <5134CD5D.6090107@freebsd.org>
 <513564AD.7000006@freebsd.org> <5135FB48.1000809@freebsd.org>
In-Reply-To: <5135FB48.1000809@freebsd.org>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
X-Spam-Status: No, score=0.0 required=5.0 tests=UNPARSEABLE_RELAY
 autolearn=unavailable version=3.3.2
X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on lauren.room52.net
Cc: "freebsd-net@freebsd.org" <freebsd-net@freebsd.org>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 06 Mar 2013 01:54:18 -0000

On 03/06/13 01:03, Andre Oppermann wrote:
> On 05.03.2013 04:21, Lawrence Stewart wrote:
>> On 03/05/13 03:35, Andre Oppermann wrote:
>>> On 26.02.2013 14:38, Lawrence Stewart wrote:
>>>> Hi Andre,
>>>
>>> Hi Lawrence, :-)
>>>
>>>> A colleague and I spent a very frustrating day tracing an accounting
>>>> bug in the multipath TCP patch we're working on at CAIA to a bug in
>>>> sbsndptr(). I haven't tested it with regular TCP yet, but I believe the
>>>> following patch fixes the bug (proposed commit log message is at the
>>>> top of the patch):
>>>>
>>>> http://people.freebsd.org/~lstewart/patches/misctcp/sbsndptr_mnext_10.x.r247314.diff
>>>>
>>>>
>>>>
>>>> The patch should have no tangible effect to operation other than to
>>>> ensure the function delivers on the promise to return the closest mbuf
>>>> in the chain for the given offset.
>>>
>>> I agree that the description of sbsndptr() can be misleading as it
>>> refers to the point in time when the pointer was updated last.
>>> Relative to now the real offset may be at the beginning of the next
>>> mbuf.
>>
>> Right, and we ran into the issue because we made an assumption based on
>> the use of the present tense in the comment:
>>
>>      "Return closest mbuf in chain for current offset."
> 
> I apologize for the incorrect and misleading description. :-)

No drama, just explaining the crux of the problem from our perspective
so it's clear why we ran into this.

>>> As you note in the proposed commit message by the time the send pointer
>>> is calculated we may have reached the end of the chain and must avoid
>>> storing a NULL pointer.  The mbuf copy routines simply skip over the
>>> additional mbuf in the chain using the returned offset.
>>>
>>> I wonder how this has caused trouble with your multipath patch.  You'd
>>> have to copy the sockbuf contents as well and unless you're using custom
>>> sockbuf and mbuf chain functions this shouldn't be a problem.  Using
>>> custom functions on a socket buffer is a delicate approach.  For a
>>> sockbuf consumer being able to handle valid offsets into an mbuf chain
>>> is a core feature and must-have part of the functionality.
>>
>> No custom sockbuf or mbuf routines are in use. We've implemented a
>> mapping shim between subflows and the socket buffer. When a subflow asks
>> the multipath layer for some data to send, the multipath layer returns a
>> mapping onto the socket buffer, which will remain valid until such time
>> as the subflow has marked the mapped data as acknowledged.
>>
>> Part of the map accounting is tracking the pointer of the first mbuf in
>> the sockbuf where the map's data begins. Our accounting assumed the mbuf
>> + the offset returned by sbsndptr had data available, which is how we
>> triggered the problem. We could have accounted for the issue in our new
>> map accounting code, but that would add additional complexity to some
>> already complex code and the better solution is to make sbsndptr DTRT.
> 
> So effectively you run a separate sbsndptr for each subflow using the
> real sbsndptr to track the head of the queue?

Yes, it essentially works as you describe. The initial goal/design was to
make multi-stream support a first class citizen inside the socket
buffer, but we ran out of time to do this. The design we've come up with
is a reasonable interim to get to an alpha patch release, which should
be happening later this week if you're interested to take a look. We'll
make an announcement when it's up on the website.

> /me fears the day a mptcp import comes up.  tcp-complexity^^3. :-o

Yeah it's pretty invasive but does bring some useful features too. There
is a lot more work to do before I'd consider proposing we import it into
the stack and even then, we'll want to have a robust discussion about
when and how to do it.

Given that this is being done as part of a research project, we've also
taken the opportunity to experiment with changing some ideas and
idiosyncrasies in the existing stack code and will be doing a lot of
experimental research with the stack and iteratively refining things as
we go.

>>>> I would appreciate a review and any thoughts.
>>>
>>> I think you have found a valid (micro-)optimization.  However you're
>>> still making a dangerous assumption in that the next mbuf is indeed
>>> the one you want.  This may not be true in subtle ways when the chain
>>> contains m_len=0 mbufs.  I'm not aware of it actually happening but it
>>> can't be ruled out either if custom sockbuf manipulation functions
>>> are in use.
>>
>> True, though I'm struggling to think why there would be m_len=0 mbufs
>> interspersed with m_len > 0 mbufs in a socket send buffer mbuf chain.
> 
> sbcompress() doesn't allow for m_len=0 mbufs.  This holds true as long
> as the sbappend functions are used.  If not, we may get anything there.
> As long as nobody is using custom sockbuf appends we're safe.  I first
> assumed from your description that some custom sockbuf munging was
> involved, in which case the guarantee wouldn't have been there anymore.

Ok cool.

>>> I'd recommend the following:
>>> have your custom sockbuf function handle forward seeking like the other
>>> m_copy() functions; and/or apply a patch along the (untested) example
>>> below.
>>
>> If you believe it is both correct and possible for m_len=0 mbufs to
>> exist in a socket buffer chain, then I agree that we should amend my
>> proposed patch to loop and skip over m_len=0 mbufs as you've suggested.
> 
> No.  So far it is neither possible nor correct.
> 
>> However, I'm more inclined to suspect it is undesirable and potentially
>> buggy behaviour to end up with m_len=0 mbufs in a socket buffer chain on
>> which sbsndptr is being used, and would instead suggest a
>> "KASSERT(ret->m_len > 0, (...));" be added to the end of my proposed if
>> block.
> 
> Agreed.

How does this look?

http://people.freebsd.org/~lstewart/patches/misctcp/sbsndptr_mnext_10.x.r247314_v2.diff

Sockbuf code is tricky so I'll test this for a while and commit after it
has had a reasonable run and not shown any side effects.
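
For readers following along, here is a minimal user-space sketch of the
idea discussed above. It is not the patch at the URL; the struct and
function names (toy_mbuf, toy_seek_forward) are hypothetical stand-ins,
and only the m_next and m_len field names mirror the real mbuf. It
assumes the behaviour described in this thread: when the offset lands
exactly at the end of an mbuf, step to the next one (unless that would
return NULL) and assert that the next one actually holds data.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct mbuf: only the fields the sketch needs. */
struct toy_mbuf {
	struct toy_mbuf	*m_next;	/* next buffer in the chain */
	unsigned int	 m_len;		/* bytes of data held in this buffer */
};

/*
 * Given a buffer and an offset into it, step to the next buffer when the
 * offset sits exactly at the end of the current one, so the returned
 * (buffer, offset) pair points at available data whenever any follows.
 * The last buffer in the chain is returned unchanged to avoid returning NULL.
 */
static struct toy_mbuf *
toy_seek_forward(struct toy_mbuf *m, unsigned int *moff)
{
	assert(m != NULL && *moff <= m->m_len);
	if (*moff == m->m_len && m->m_next != NULL) {
		m = m->m_next;
		*moff = 0;
		/* Zero-length buffers are treated as a bug, not skipped. */
		assert(m->m_len > 0);
	}
	return (m);
}
```

With a two-buffer chain of 100 and 50 bytes, an offset of 100 into the
first buffer comes back as offset 0 in the second, while an offset of 50
into the second (the end of the chain) is returned unchanged.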

Cheers,
Lawrence

From owner-freebsd-net@FreeBSD.ORG  Wed Mar  6 05:48:17 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id 4B720D20
 for <freebsd-net@freebsd.org>; Wed,  6 Mar 2013 05:48:17 +0000 (UTC)
 (envelope-from emz@norma.perm.ru)
Received: from elf.hq.norma.perm.ru (unknown [IPv6:2001:470:1f09:14c0::2])
 by mx1.freebsd.org (Postfix) with ESMTP id F357D234
 for <freebsd-net@freebsd.org>; Wed,  6 Mar 2013 05:48:16 +0000 (UTC)
Received: from bsdrookie.norma.com. ([IPv6:fd00::726])
 by elf.hq.norma.perm.ru (8.14.5/8.14.5) with ESMTP id r265mDQA002265
 (version=TLSv1/SSLv3 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO)
 for <freebsd-net@freebsd.org>; Wed, 6 Mar 2013 11:48:14 +0600 (YEKT)
 (envelope-from emz@norma.perm.ru)
Message-ID: <5136D89D.4000902@norma.perm.ru>
Date: Wed, 06 Mar 2013 11:48:13 +0600
From: "Eugene M. Zheganin" <emz@norma.perm.ru>
User-Agent: Mozilla/5.0 (X11; FreeBSD amd64;
 rv:17.0) Gecko/17.0 Thunderbird/17.0
MIME-Version: 1.0
To: freebsd-net@freebsd.org
Subject: Re: FreeBSD 9.1-RELEASE + bge0 == watchdog timeout
References: <F02BE044-1C4F-43EB-8091-BC62362C2E5F@sd63.bc.ca>
 <D557DE29-DED8-4B89-9D1C-171FC17D435E@hub.org>
 <201302241106.42477.vegeta@tuxpowered.net>
 <20130225082042.GB1426@michelle.cdnetworks.com>
 <512CF97B.8030805@norma.perm.ru>
 <20130227020123.GA3581@michelle.cdnetworks.com> <512DE968.4020409@quip.cz>
 <20130228053558.GA1474@michelle.cdnetworks.com>
In-Reply-To: <20130228053558.GA1474@michelle.cdnetworks.com>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.2.7
 (elf.hq.norma.perm.ru [IPv6:fd00::30a]);
 Wed, 06 Mar 2013 11:48:14 +0600 (YEKT)
X-Spam-Status: No hits=-97.8 bayes=0.5 testhits RDNS_NONE=1.274,
 SPF_SOFTFAIL=0.972,USER_IN_WHITELIST=-100 autolearn=no version=3.3.2
X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on elf.hq.norma.perm.ru
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 06 Mar 2013 05:48:17 -0000

Hi.

On 28.02.2013 11:35, YongHyeon PYUN wrote:
> The reporter said the machine was Sun Fire X2200 M2 so I guess you
> may see the same issue on both stable/9 and stable/8. Ideally the
> loader tunable hw.bge.allow_asf should not be there and driver
> should take care of it by checking the existence of ASF/IPMI
> firmware.
>
>
Unfortunately, I just had the 'bge0 - watchdog timeout - resetting' on a
recent 8.3-STABLE and a 'Broadcom NetXtreme BCM5722 Gigabit (94309)'
(according to pciconf -lv) controller. I haven't seen this in a year
or two (I guess), the machine was running 8.2-STABLE. So, in order to
fight this (the machine freezes during these messages), what should I do?
Is upgrading to 10.0-CURRENT an option? hw.bge.allow_asf is 0 already.

Thanks.
Eugene.

From owner-freebsd-net@FreeBSD.ORG  Wed Mar  6 06:27:06 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id 57F587E3
 for <freebsd-net@freebsd.org>; Wed,  6 Mar 2013 06:27:06 +0000 (UTC)
 (envelope-from pyunyh@gmail.com)
Received: from mail-pb0-f52.google.com (mail-pb0-f52.google.com
 [209.85.160.52]) by mx1.freebsd.org (Postfix) with ESMTP id F276C368
 for <freebsd-net@freebsd.org>; Wed,  6 Mar 2013 06:27:05 +0000 (UTC)
Received: by mail-pb0-f52.google.com with SMTP id ma3so5532032pbc.25
 for <freebsd-net@freebsd.org>; Tue, 05 Mar 2013 22:27:05 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=x-received:from:date:to:cc:subject:message-id:reply-to:references
 :mime-version:content-type:content-disposition:in-reply-to
 :user-agent; bh=5pksi+SItNBqsH0a7bRloNG3e9weeUdPPTblC0GN32M=;
 b=v+e2oe/P80/6agI6D3qmtB8ovtJVYCsYHtru50/x/+ft+BCiR/62Yy58bsd8hA7eny
 X8IQJ+DFnOyEdCTOGhr+PN4sFm01kjqDTTf0uDdU/nbiNsq2PIUWO4jzCBbvw57CkqrZ
 EwIdr69CBwEq2JZQLRK7ONK9enu75RSPQDf38dALo7pao9LsBIdzvyZVeC4i7v994lvc
 YeUAkobRSCdH6bwJGK/Mc62gICRu/UWDsg5vVRIv0pUT7f9aqO+VHVa5+yebxoIzwlqN
 8bfvpCgrYdDTiDUoKGlalvCXxUhKQLrH1bpTIJWXa3GeKVxnLtSIlkq5LZIEHysIUYRh
 jsxw==
X-Received: by 10.68.138.225 with SMTP id qt1mr42815544pbb.82.1362551225280;
 Tue, 05 Mar 2013 22:27:05 -0800 (PST)
Received: from pyunyh@gmail.com (lpe4.p59-icn.cdngp.net. [114.111.62.249])
 by mx.google.com with ESMTPS id zm1sm29929930pbc.26.2013.03.05.22.27.01
 (version=TLSv1 cipher=RC4-SHA bits=128/128);
 Tue, 05 Mar 2013 22:27:04 -0800 (PST)
Received: by pyunyh@gmail.com (sSMTP sendmail emulation);
 Wed, 06 Mar 2013 15:26:58 +0900
From: YongHyeon PYUN <pyunyh@gmail.com>
Date: Wed, 6 Mar 2013 15:26:58 +0900
To: "Eugene M. Zheganin" <emz@norma.perm.ru>
Subject: Re: FreeBSD 9.1-RELEASE + bge0 == watchdog timeout
Message-ID: <20130306062658.GC1483@michelle.cdnetworks.com>
References: <F02BE044-1C4F-43EB-8091-BC62362C2E5F@sd63.bc.ca>
 <D557DE29-DED8-4B89-9D1C-171FC17D435E@hub.org>
 <201302241106.42477.vegeta@tuxpowered.net>
 <20130225082042.GB1426@michelle.cdnetworks.com>
 <512CF97B.8030805@norma.perm.ru>
 <20130227020123.GA3581@michelle.cdnetworks.com> <512DE968.4020409@quip.cz>
 <20130228053558.GA1474@michelle.cdnetworks.com>
 <5136D89D.4000902@norma.perm.ru>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <5136D89D.4000902@norma.perm.ru>
User-Agent: Mutt/1.4.2.3i
Cc: freebsd-net@freebsd.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
Reply-To: pyunyh@gmail.com
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 06 Mar 2013 06:27:06 -0000

On Wed, Mar 06, 2013 at 11:48:13AM +0600, Eugene M. Zheganin wrote:
> Hi.
> 
> On 28.02.2013 11:35, YongHyeon PYUN wrote:
> > The reporter said the machine was Sun Fire X2200 M2 so I guess you
> > may see the same issue on both stable/9 and stable/8. Ideally the
> > loader tunable hw.bge.allow_asf should not be there and driver
> > should take care of it by checking the existence of ASF/IPMI
> > firmware.
> >
> >
> Unfortunately, I just had the 'bge0 - watchdog timeout - resetting' on a
> recent 8.3-STABLE and a 'Broadcom NetXtreme BCM5722 Gigabit (94309)'
> (according to the pciconf -lv) controller. I haven't seen this in a year
> or two (I guess), the machine was running 8.2-STABLE. So, in order to
> fight this (machine is freezing during these messages) what should I do?
> Is upgrading to 10.0-CURRENT an option? hw.bge.allow_asf is 0 already.

If you are using the latest stable/8, the result would be the same on
CURRENT.
How frequently do you see the watchdog timeouts? Is there a way to
reproduce it?
Would you show me the output of dmesg (bge(4) and brgphy(4) only)
and "pciconf -lcbv"?

From owner-freebsd-net@FreeBSD.ORG  Wed Mar  6 06:30:57 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id EEDE8B3F
 for <freebsd-net@freebsd.org>; Wed,  6 Mar 2013 06:30:57 +0000 (UTC)
 (envelope-from sodynet1@gmail.com)
Received: from mail-pb0-f53.google.com (mail-pb0-f53.google.com
 [209.85.160.53]) by mx1.freebsd.org (Postfix) with ESMTP id CD38B390
 for <freebsd-net@freebsd.org>; Wed,  6 Mar 2013 06:30:57 +0000 (UTC)
Received: by mail-pb0-f53.google.com with SMTP id un1so5499463pbc.40
 for <freebsd-net@freebsd.org>; Tue, 05 Mar 2013 22:30:57 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:x-received:in-reply-to:references:date:message-id
 :subject:from:to:cc:content-type;
 bh=T6iM0YcmDpDtHfIJIQgDrMTmWTOR5/cSTNpaa1aLJnY=;
 b=jik4d+bvvP9bFkrtc5kxFh7InZf8gvhIVvy5roXu/yQWEbJBe7MWW7lqzV1eSsu4tS
 5pCSwPDH3nhMrgO9U5RzMxG1MU46avkjGu51UivY2HkyTK/wFRxorLP4yQjaYrGoPPo8
 7hOeKenl/HivmEO6fHJYsW4saRPrbYZgJMBZjkknsRmR3oJLubXsY1nQoK322ZaVl3ih
 y4W2j1ts0ekfo9iNc0nRkHVvH6RLMRG2hb9wmX+bYaswwY1IgrPgujzx1mthqdUd5u0v
 6plfEAT84bPm077KSxf/I4ddQf6imVWPXIyupMV3s1DrsSJh7C0Fyhxn80huscUUTTLN
 cHPg==
MIME-Version: 1.0
X-Received: by 10.68.196.225 with SMTP id ip1mr43465231pbc.72.1362551456896;
 Tue, 05 Mar 2013 22:30:56 -0800 (PST)
Received: by 10.70.34.103 with HTTP; Tue, 5 Mar 2013 22:30:56 -0800 (PST)
Received: by 10.70.34.103 with HTTP; Tue, 5 Mar 2013 22:30:56 -0800 (PST)
In-Reply-To: <CAKOb=YYGu6mr-3nyydBi9K-FHPnEx-fKSZ2=r_uDVeY9pvrqtQ@mail.gmail.com>
References: <CAKOb=YYGu6mr-3nyydBi9K-FHPnEx-fKSZ2=r_uDVeY9pvrqtQ@mail.gmail.com>
Date: Wed, 6 Mar 2013 08:30:56 +0200
Message-ID: <CAEW+ogY5VM7ENbWYyCNfGnNojPVYX=SU5d-Y7_AY1fkQm=zozQ@mail.gmail.com>
Subject: Re: Default route changes unexpectedly
From: Sami Halabi <sodynet1@gmail.com>
To: Nick Rogers <ncrogers@gmail.com>
Content-Type: text/plain; charset=ISO-8859-1
X-Content-Filtered-By: Mailman/MimeDel 2.1.14
Cc: freebsd-net@freebsd.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 06 Mar 2013 06:30:58 -0000

Hi,
I can say I also faced this problem in 9.1-preRelease, and I'm not using
pf; I usually use ipfw. But I haven't seen this happen for a while...

Sami
On Mar 5, 2013 7:39 PM, "Nick Rogers" <ncrogers@gmail.com> wrote:

> Hello,
>
> I am attempting to create awareness of a serious issue affecting users
> of FreeBSD 9.x and PF. There appears to be a bug that allows the
> kernel's routing table to be corrupted by traffic routing through the
> system. Under heavy traffic load, the default route can seemingly
> randomly change to an IP address that is not directly connected to the
> network (i.e., is not configured anywhere). Dhclient is not in the
> mix, nor is routed, bgpd, etc. Running `route monitor` shows no
> evidence of the change in the default route. The one commonality
> between all the systems experiencing this problem seems to be the use
> of PF.
>
> Obviously this is a serious problem as it causes all Internet-bound
> traffic to stop routing until the default route is corrected. Some
> users, including myself, are working around this problem by installing
> a script that runs multiple times a second to check if the default
> route is incorrect and fixing it if necessary, which mitigates the
> amount of downtime caused by the bug.
>
> Please refer to these past posts for more examples and evidence of
> other users experiencing this problem:
>
> http://forums.freebsd.org/showthread.php?p=211610#post211610
>
>
> http://freebsd.1045724.n5.nabble.com/Default-route-quot-random-quot-gateway-modification-bug-td5750820.html
>
> http://lists.freebsd.org/pipermail/freebsd-net/2012-March/031879.html
>
> http://lists.freebsd.org/pipermail/freebsd-ipfw/2010-September/004361.html
>
> There is also a PR that was incorrectly labeled as an IPFW issue.
> Myself and others believe this issue is not restricted to the use of
> IPFW and that the PR should be relabeled. I am inclined to think it is
> strictly a PF issue since I am not using IPFW, however there is
> evidence of the default route changing on people using IPFW for past
> versions of FreeBSD (7.x/8.x), so perhaps this is related.
>
> http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/174749
>
> Another PR for the same problem but specific to IPFW and 8.2-RELEASE
>
> http://www.freebsd.org/cgi/query-pr.cgi?pr=157796
>
> I am hoping someone reading this can give the problem the attention it
> deserves. Thank you.
>
> -Nick
> _______________________________________________
> freebsd-net@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
>

From owner-freebsd-net@FreeBSD.ORG  Wed Mar  6 06:39:57 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@smarthost.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id 5913FD3D;
 Wed,  6 Mar 2013 06:39:57 +0000 (UTC)
 (envelope-from linimon@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org
 [IPv6:2001:1900:2254:206c::16:87])
 by mx1.freebsd.org (Postfix) with ESMTP id 1CDA33E3;
 Wed,  6 Mar 2013 06:39:57 +0000 (UTC)
Received: from freefall.freebsd.org (localhost [127.0.0.1])
 by freefall.freebsd.org (8.14.6/8.14.6) with ESMTP id r266du6g007177;
 Wed, 6 Mar 2013 06:39:56 GMT
 (envelope-from linimon@freefall.freebsd.org)
Received: (from linimon@localhost)
 by freefall.freebsd.org (8.14.6/8.14.6/Submit) id r266duAQ007176;
 Wed, 6 Mar 2013 06:39:56 GMT (envelope-from linimon)
Date: Wed, 6 Mar 2013 06:39:56 GMT
Message-Id: <201303060639.r266duAQ007176@freefall.freebsd.org>
To: linimon@FreeBSD.org, freebsd-bugs@FreeBSD.org, freebsd-net@FreeBSD.org
From: linimon@FreeBSD.org
Subject: Re: kern/176671: [epair] MAC address for epair device not unique
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 06 Mar 2013 06:39:57 -0000

Old Synopsis: MAC address for epair device not unique
New Synopsis: [epair] MAC address for epair device not unique

Responsible-Changed-From-To: freebsd-bugs->freebsd-net
Responsible-Changed-By: linimon
Responsible-Changed-When: Wed Mar 6 06:39:09 UTC 2013
Responsible-Changed-Why: 
Over to maintainer(s).

http://www.freebsd.org/cgi/query-pr.cgi?pr=176671

From owner-freebsd-net@FreeBSD.ORG  Wed Mar  6 06:44:02 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@smarthost.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id 06D16EA5;
 Wed,  6 Mar 2013 06:44:02 +0000 (UTC)
 (envelope-from linimon@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org
 [IPv6:2001:1900:2254:206c::16:87])
 by mx1.freebsd.org (Postfix) with ESMTP id CBB1E5EC;
 Wed,  6 Mar 2013 06:44:01 +0000 (UTC)
Received: from freefall.freebsd.org (localhost [127.0.0.1])
 by freefall.freebsd.org (8.14.6/8.14.6) with ESMTP id r266i1jc008797;
 Wed, 6 Mar 2013 06:44:01 GMT
 (envelope-from linimon@freefall.freebsd.org)
Received: (from linimon@localhost)
 by freefall.freebsd.org (8.14.6/8.14.6/Submit) id r266i1Dh008796;
 Wed, 6 Mar 2013 06:44:01 GMT (envelope-from linimon)
Date: Wed, 6 Mar 2013 06:44:01 GMT
Message-Id: <201303060644.r266i1Dh008796@freefall.freebsd.org>
To: linimon@FreeBSD.org, freebsd-bugs@FreeBSD.org, freebsd-net@FreeBSD.org
From: linimon@FreeBSD.org
Subject: Re: kern/176667: [libalias] [patch] libalias locks on uninitalized
 data
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 06 Mar 2013 06:44:02 -0000

Old Synopsis: libalias locks on uninitalized data
New Synopsis: [libalias] [patch] libalias locks on uninitalized data

Responsible-Changed-From-To: freebsd-bugs->freebsd-net
Responsible-Changed-By: linimon
Responsible-Changed-When: Wed Mar 6 06:43:36 UTC 2013
Responsible-Changed-Why: 
Over to maintainer(s).

http://www.freebsd.org/cgi/query-pr.cgi?pr=176667

From owner-freebsd-net@FreeBSD.ORG  Wed Mar  6 07:01:40 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id F0EDC767
 for <freebsd-net@freebsd.org>; Wed,  6 Mar 2013 07:01:40 +0000 (UTC)
 (envelope-from VenkatKumar.Duvvuru@Emulex.Com)
Received: from CMEXEDGE1.ext.emulex.com (cmexedge1.ext.emulex.com
 [138.239.224.99]) by mx1.freebsd.org (Postfix) with ESMTP id A22B16A7
 for <freebsd-net@freebsd.org>; Wed,  6 Mar 2013 07:01:40 +0000 (UTC)
Received: from CMEXHTCAS2.ad.emulex.com (138.239.115.218) by
 CMEXEDGE1.ext.emulex.com (138.239.224.99) with Microsoft SMTP Server (TLS) id
 14.2.318.4; Tue, 5 Mar 2013 23:02:53 -0800
Received: from CMEXMB1.ad.emulex.com ([169.254.1.137]) by
 CMEXHTCAS2.ad.emulex.com ([2002:8aef:73da::8aef:73da]) with mapi id
 14.02.0318.004; Tue, 5 Mar 2013 23:01:31 -0800
From: "Duvvuru,Venkat Kumar" <VenkatKumar.Duvvuru@Emulex.Com>
To: Josh Paetzel <josh@tcbug.org>
Subject: RE: OCE driver patches
Thread-Topic: OCE driver patches
Thread-Index: Ac4FKZvQ2m3Cu1QRR3u/QVUeigTouwB8AcmAA9n7JiAAEvolAADaXNqw
Date: Wed, 6 Mar 2013 07:01:31 +0000
Message-ID: <BF3270C86E8B1349A26C34E4EC1C44CB215D9B59@CMEXMB1.ad.emulex.com>
References: <BF3270C86E8B1349A26C34E4EC1C44CB215A0220@CMEXMB1.ad.emulex.com>
 <B3A73C45-3299-4EDD-BE38-9D027E9D548A@tcbug.org>
 <BF3270C86E8B1349A26C34E4EC1C44CB215D84F9@CMEXMB1.ad.emulex.com>
 <E59FC35E-9111-4667-B0D7-0D716C2F4DF8@tcbug.org>
In-Reply-To: <E59FC35E-9111-4667-B0D7-0D716C2F4DF8@tcbug.org>
Accept-Language: en-US
Content-Language: en-US
X-MS-Has-Attach: yes
X-MS-TNEF-Correlator: 
x-originating-ip: [138.239.140.229]
Content-Type: multipart/mixed;
 boundary="_002_BF3270C86E8B1349A26C34E4EC1C44CB215D9B59CMEXMB1ademulex_"
MIME-Version: 1.0
X-Content-Filtered-By: Mailman/MimeDel 2.1.14
Cc: "freebsd-net@freebsd.org" <freebsd-net@freebsd.org>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 06 Mar 2013 07:01:41 -0000

--_002_BF3270C86E8B1349A26C34E4EC1C44CB215D9B59CMEXMB1ademulex_
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable

Hi Josh,
I'm attaching a .tgz file of the patches (oce0.patch to oce24.patch; please
make sure that you apply them in that order) that I told you about for
Emulex's OCE driver.
I had opened a PR for the same. However, .tgz files are not allowed as
attachments to the problem report, so I just renamed the .tgz file to .txt
for uploading.
However, I have not gotten any email notification on that problem report,
possibly because of the attachment problem.
Please let me know if I could open a PR without any attachment and send you
that PR number.
Also, there is a notification about the FreeBSD 8.4 code freeze by March 8.
It would be nice if we could get these patches in before the code freeze on
8.4 as well.

Pls suggest.

Thanks,
Venkat.

-----Original Message-----
From: Josh Paetzel [mailto:josh@tcbug.org]
Sent: Friday, March 01, 2013 8:08 PM
To: Duvvuru,Venkat Kumar
Cc: freebsd-net@freebsd.org
Subject: Re: OCE driver patches

On Mar 1, 2013, at 5:36 AM, "Duvvuru,Venkat Kumar"
<VenkatKumar.Duvvuru@Emulex.Com> wrote:

> Hi Josh,
> I have a bunch of patches (~25 in number) to submit. Please let me know
> the process to submit them.
> Do I just attach them in a single email or open PRs for each of them?
> Pls suggest.
>
> /Venkat
>

Venkat,

I think it depends on how you want them committed to FreeBSD.

If the patches are atomic changes that should be kept atomic in the FreeBSD
source tree then I'll commit them separately. This is a tad time consuming
as I test them atomically before committing them.  If they can be committed
in one go then I can just apply them all, test the end result, and commit
that.

One PR with the patches attached and a note saying these can all go in in
one go is appropriate in the latter case; the former would be best served by
separate PRs.

Thanks,

Josh Paetzel

--_002_BF3270C86E8B1349A26C34E4EC1C44CB215D9B59CMEXMB1ademulex_--

From owner-freebsd-net@FreeBSD.ORG  Wed Mar  6 07:32:42 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id 9EB14F2C
 for <freebsd-net@freebsd.org>; Wed,  6 Mar 2013 07:32:42 +0000 (UTC)
 (envelope-from adrian.chadd@gmail.com)
Received: from mail-wg0-f46.google.com (mail-wg0-f46.google.com [74.125.82.46])
 by mx1.freebsd.org (Postfix) with ESMTP id 133B67A1
 for <freebsd-net@freebsd.org>; Wed,  6 Mar 2013 07:32:41 +0000 (UTC)
Received: by mail-wg0-f46.google.com with SMTP id fg15so6853704wgb.13
 for <freebsd-net@freebsd.org>; Tue, 05 Mar 2013 23:32:40 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:x-received:sender:in-reply-to:references:date
 :x-google-sender-auth:message-id:subject:from:to:cc:content-type;
 bh=Tynyg9fXL6DfIoA1/e9slZogS8k9CTqbtZDi5floLgk=;
 b=CTF+x27Ya5LjbXh32BYYzSe+bDbdk55VVecTJKiXzkox3XKd46wlgds1rMbbjnDpgA
 RjpVsbOxjxOH0SAQyLdLMp86HxQX4nbnfHXACkbGbWxnRFC6FJQI196yM3mIB/oi6Gni
 t1abd7GwwuHMskkzWEy6sL76S9r/kOr7sirRJ+1RjYraxq12guiE/i4feeIjVt5EquHC
 N0iUvfWnSmvjGZe2LFb4SkHkBpIGnRsGxwzo4BlUDfZkqAkr8PF61mrBndUD/i42d6uH
 xBRvgwRGnFpTL3/BJeEJa7oAxnzMVi44dATkPlPhkQ0iS7CXdckBXfalb/G+ZxyH7nV0
 A6+Q==
MIME-Version: 1.0
X-Received: by 10.180.108.3 with SMTP id hg3mr23594376wib.33.1362555160454;
 Tue, 05 Mar 2013 23:32:40 -0800 (PST)
Sender: adrian.chadd@gmail.com
Received: by 10.216.114.201 with HTTP; Tue, 5 Mar 2013 23:32:40 -0800 (PST)
In-Reply-To: <CAKOb=YYGu6mr-3nyydBi9K-FHPnEx-fKSZ2=r_uDVeY9pvrqtQ@mail.gmail.com>
References: <CAKOb=YYGu6mr-3nyydBi9K-FHPnEx-fKSZ2=r_uDVeY9pvrqtQ@mail.gmail.com>
Date: Tue, 5 Mar 2013 23:32:40 -0800
X-Google-Sender-Auth: O_oVYtcpGYq0nhueZ0odFTAQD1w
Message-ID: <CAJ-VmomjjH1ETnHg5kBd+M8qDo6JQaM5ebPem7xHoZ7O2-J9iA@mail.gmail.com>
Subject: Re: Default route changes unexpectedly
From: Adrian Chadd <adrian@freebsd.org>
To: Nick Rogers <ncrogers@gmail.com>
Content-Type: text/plain; charset=ISO-8859-1
Cc: "freebsd-net@freebsd.org" <freebsd-net@freebsd.org>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 06 Mar 2013 07:32:42 -0000

It's a known problem; it just seems that it doesn't overlap/intersect
the day-to-day activities of any network-focused FreeBSD developers.

If you guys want it fixed then you may have to find a developer to
hire on contract to fix it, or find some kind of ruleset/traffic
generation setup that reliably triggers the bug.


Adrian

On 5 March 2013 09:39, Nick Rogers <ncrogers@gmail.com> wrote:
> Hello,
>
> I am attempting to create awareness of a serious issue affecting users
> of FreeBSD 9.x and PF. There appears to be a bug that allows the
> kernel's routing table to be corrupted by traffic routing through the
> system. Under heavy traffic load, the default route can seemingly
> randomly change to an IP address that is not directly connected to the
> network (i.e., is not configured anywhere). Dhclient is not in the
> mix, nor is routed, bgpd, etc. Running `route monitor` shows no
> evidence of the change in the default route. The one commonality
> between all the systems experiencing this problem seems to be the use
> of PF.
>
> Obviously this is a serious problem as it causes all Internet-bound
> traffic to stop routing until the default route is corrected. Some
> users, including myself, are working around this problem by installing
> a script that runs multiple times a second to check if the default
> route is incorrect and fixing it if necessary, which mitigates the
> amount of downtime caused by the bug.
>
> Please refer to these past posts for more examples and evidence of
> other users experiencing this problem:
>
> http://forums.freebsd.org/showthread.php?p=211610#post211610
>
> http://freebsd.1045724.n5.nabble.com/Default-route-quot-random-quot-gateway-modification-bug-td5750820.html
>
> http://lists.freebsd.org/pipermail/freebsd-net/2012-March/031879.html
>
> http://lists.freebsd.org/pipermail/freebsd-ipfw/2010-September/004361.html
>
> There is also a PR that was incorrectly labeled as an IPFW issue.
> Myself and others believe this issue is not restricted to the use of
> IPFW and that the PR should be relabeled. I am inclined to think it is
> strictly a PF issue since I am not using IPFW, however there is
> evidence of the default route changing on people using IPFW for past
> versions of FreeBSD (7.x/8.x), so perhaps this is related.
>
> http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/174749
>
> Another PR for the same problem but specific to IPFW and 8.2-RELEASE
>
> http://www.freebsd.org/cgi/query-pr.cgi?pr=157796
>
> I am hoping someone reading this can give the problem the attention it
> deserves. Thank you.
>
> -Nick
> _______________________________________________
> freebsd-net@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"

From owner-freebsd-net@FreeBSD.ORG  Wed Mar  6 08:25:26 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id B0BD7B73
 for <freebsd-net@freebsd.org>; Wed,  6 Mar 2013 08:25:26 +0000 (UTC)
 (envelope-from andre@freebsd.org)
Received: from c00l3r.networx.ch (c00l3r.networx.ch [62.48.2.2])
 by mx1.freebsd.org (Postfix) with ESMTP id 11C3598F
 for <freebsd-net@freebsd.org>; Wed,  6 Mar 2013 08:25:25 +0000 (UTC)
Received: (qmail 52640 invoked from network); 6 Mar 2013 09:39:06 -0000
Received: from c00l3r.networx.ch (HELO [127.0.0.1]) ([62.48.2.2])
 (envelope-sender <andre@freebsd.org>)
 by c00l3r.networx.ch (qmail-ldap-1.03) with SMTP
 for <ncrogers@gmail.com>; 6 Mar 2013 09:39:06 -0000
Message-ID: <5136FD71.6000408@freebsd.org>
Date: Wed, 06 Mar 2013 09:25:21 +0100
From: Andre Oppermann <andre@freebsd.org>
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64;
 rv:17.0) Gecko/20130107 Thunderbird/17.0.2
MIME-Version: 1.0
To: Nick Rogers <ncrogers@gmail.com>
Subject: Re: Default route changes unexpectedly
References: <CAKOb=YYGu6mr-3nyydBi9K-FHPnEx-fKSZ2=r_uDVeY9pvrqtQ@mail.gmail.com>
In-Reply-To: <CAKOb=YYGu6mr-3nyydBi9K-FHPnEx-fKSZ2=r_uDVeY9pvrqtQ@mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: "freebsd-net@freebsd.org" <freebsd-net@freebsd.org>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 06 Mar 2013 08:25:26 -0000

On 05.03.2013 18:39, Nick Rogers wrote:
> Hello,
>
> I am attempting to create awareness of a serious issue affecting users
> of FreeBSD 9.x and PF. There appears to be a bug that allows the
> kernel's routing table to be corrupted by traffic routing through the
> system. Under heavy traffic load, the default route can seemingly
> randomly change to an IP address that is not directly connected to the
> network (i.e., is not configured anywhere). Dhclient is not in the
> mix, nor is routed, bgpd, etc. Running `route monitor` shows no
> evidence of the change in the default route. The one commonality
> between all the systems experiencing this problem seems to be the use
> of PF.
>
> Obviously this is a serious problem as it causes all Internet-bound
> traffic to stop routing until the default route is corrected. Some
> users, including myself, are working around this problem by installing
> a script that runs multiple times a second to check if the default
> route is incorrect and fixing it if necessary, which mitigates the
> amount of downtime caused by the bug.

Can you describe your traffic forwarding setup in more detail?
Is it only pf, or do you run netgraph, or other things as well?
Do you use flow routing?

How frequently does this happen?

I'm trying to create a stack graph to see which parts of the network
stack are involved in handling your packet.

-- 
Andre

> Please refer to these past posts for more examples and evidence of
> other users experiencing this problem:
>
> http://forums.freebsd.org/showthread.php?p=211610#post211610
>
> http://freebsd.1045724.n5.nabble.com/Default-route-quot-random-quot-gateway-modification-bug-td5750820.html
>
> http://lists.freebsd.org/pipermail/freebsd-net/2012-March/031879.html
>
> http://lists.freebsd.org/pipermail/freebsd-ipfw/2010-September/004361.html
>
> There is also a PR that was incorrectly labeled as an IPFW issue.
> Myself and others believe this issue is not restricted to the use of
> IPFW and that the PR should be relabeled. I am inclined to think it is
> strictly a PF issue since I am not using IPFW, however there is
> evidence of the default route changing on people using IPFW for past
> versions of FreeBSD (7.x/8.x), so perhaps this is related.
>
> http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/174749
>
> Another PR for the same problem but specific to IPFW and 8.2-RELEASE
>
> http://www.freebsd.org/cgi/query-pr.cgi?pr=157796
>
> I am hoping someone reading this can give the problem the attention it
> deserves. Thank you.
>
> -Nick
> _______________________________________________
> freebsd-net@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
>
>


From owner-freebsd-net@FreeBSD.ORG  Wed Mar  6 08:45:27 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id 5C7D311C;
 Wed,  6 Mar 2013 08:45:27 +0000 (UTC)
 (envelope-from krzysiek@airnet.opole.pl)
Received: from base.airnet.opole.pl (ns2.airmax.pl [176.111.128.3])
 by mx1.freebsd.org (Postfix) with ESMTP id 1AE26A53;
 Wed,  6 Mar 2013 08:45:25 +0000 (UTC)
Received: from localhost (localhost [127.0.0.1])
 by base.airnet.opole.pl (Postfix) with ESMTP id 036B87FF059;
 Wed,  6 Mar 2013 09:38:47 +0100 (CET)
Received: from base.airnet.opole.pl ([127.0.0.1])
 by localhost (mail.airnet.opole.pl [127.0.0.1]) (maiad, port 10024)
 with ESMTP id 66913-06; Wed,  6 Mar 2013 09:38:46 +0100 (CET)
Received: from [10.10.11.223] (unknown [176.111.138.12])
 (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits))
 (No client certificate requested)
 (Authenticated sender: krzysiek@airnet.opole.pl)
 by base.airnet.opole.pl (Postfix) with ESMTPSA id 1C8E87FF04D;
 Wed,  6 Mar 2013 09:38:44 +0100 (CET)
Message-ID: <51370093.40009@airnet.opole.pl>
Date: Wed, 06 Mar 2013 09:38:43 +0100
From: Krzysztof Barcikowski <krzysiek@airnet.opole.pl>
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64;
 rv:17.0) Gecko/20130215 Thunderbird/17.0.3
MIME-Version: 1.0
To: Andre Oppermann <andre@freebsd.org>
Subject: Re: Default route changes unexpectedly
References: <CAKOb=YYGu6mr-3nyydBi9K-FHPnEx-fKSZ2=r_uDVeY9pvrqtQ@mail.gmail.com>
 <5136FD71.6000408@freebsd.org>
In-Reply-To: <5136FD71.6000408@freebsd.org>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: "freebsd-net@freebsd.org" <freebsd-net@freebsd.org>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 06 Mar 2013 08:45:27 -0000

On 2013-03-06 09:25, Andre Oppermann wrote:
> Can you describe your traffic forwarding setup in more detail?
> Is it only pf, or do you run netgraph, or other things as well?
> Do you use flow routing?
>
> How frequently does this happen?
>
> I'm trying to create a stack graph to see which parts of the network
> stack are involved in handling your packet.
>

Hi,
In my case, I do use PF for filtering and NAT (without routing options 
like 'route-to' or 'reply-to') together with ALTQ (PRIQ).
I also use IPFW+Dummynet combo for shaping.

net.inet.ip.sourceroute: 0
net.inet.ip.accept_sourceroute: 0

Router traffic peaks at about 300 Mb/s.

Frequency:
Wed Oct 3 14:19:15 CEST 2012
Thu Dec 13 04:39:43 CET 2012
Thu Dec 13 04:39:46 CET 2012
Thu Dec 13 04:39:47 CET 2012
Thu Dec 13 04:39:50 CET 2012
Thu Dec 13 04:39:53 CET 2012
Thu Dec 13 04:39:59 CET 2012
Thu Dec 13 04:40:11 CET 2012
Fri Jan 4 07:47:00 CET 2013
Mon Jan 28 18:35:43 CET 2013
Sat Feb 2 22:43:01 CET 2013

I only monitor default route changes, but this bug also affects static
routes (i.e. I have one static route and it changes more frequently than
the default route).
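
A monitor of this kind can be sketched in a few lines of sh. This is only
an illustration, not the script actually used by anyone in this thread.
The gateway 192.0.2.1 is a placeholder, the function names are made up,
and the parsing assumes the usual "Destination Gateway Flags ..." column
layout of `netstat -rn -f inet`:

```shell
#!/bin/sh
# Hypothetical value: the gateway the default route is supposed to use.
EXPECTED_GW="${EXPECTED_GW:-192.0.2.1}"

# Print the gateway of the default route, reading `netstat -rn -f inet`
# style output from stdin (first column destination, second gateway).
default_gw() {
    awk '$1 == "default" { print $2; exit }'
}

# Compare a gateway with the expected one. On mismatch, print a
# timestamped line and return 1 so the caller can decide how to react.
check_gw() {
    current="$1"
    if [ "$current" != "$EXPECTED_GW" ]; then
        echo "$(date) default route is '$current', expected '$EXPECTED_GW'"
        return 1
    fi
    return 0
}

# Monitoring loop; it only runs when the script is invoked with "run".
if [ "${1:-}" = "run" ]; then
    while :; do
        gw=$(netstat -rn -f inet | default_gw)
        # Log the change and put the expected gateway back.
        check_gw "$gw" || route change default "$EXPECTED_GW"
        sleep 1
    done
fi
```

Logging the timestamp before repairing the route gives a list of events
like the one above; the one-second interval is arbitrary and can be
shortened.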

Please let me know if I can provide any more feedback.

Krzysiek


From owner-freebsd-net@FreeBSD.ORG  Wed Mar  6 09:12:13 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id B9448686;
 Wed,  6 Mar 2013 09:12:13 +0000 (UTC)
 (envelope-from dhartmei@insomnia.benzedrine.cx)
Received: from insomnia.benzedrine.cx
 (cust.static.213-3-30-106.swisscomdata.ch [213.3.30.106])
 by mx1.freebsd.org (Postfix) with ESMTP id D3995BE4;
 Wed,  6 Mar 2013 09:12:12 +0000 (UTC)
Received: from insomnia.benzedrine.cx (localhost [127.0.0.1])
 by insomnia.benzedrine.cx (8.14.5/8.14.5) with ESMTP id r268rCxp018755
 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
 Wed, 6 Mar 2013 09:53:12 +0100 (MET)
Received: (from dhartmei@localhost)
 by insomnia.benzedrine.cx (8.14.5/8.14.5/Submit) id r268rBUr023680;
 Wed, 6 Mar 2013 09:53:11 +0100 (MET)
Date: Wed, 6 Mar 2013 09:53:11 +0100
From: Daniel Hartmeier <daniel@benzedrine.cx>
To: Andre Oppermann <andre@freebsd.org>
Subject: Re: Default route changes unexpectedly
Message-ID: <20130306085311.GA12382@insomnia.benzedrine.cx>
References: <CAKOb=YYGu6mr-3nyydBi9K-FHPnEx-fKSZ2=r_uDVeY9pvrqtQ@mail.gmail.com>
 <5136FD71.6000408@freebsd.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <5136FD71.6000408@freebsd.org>
User-Agent: Mutt/1.5.21 (2010-09-15)
Cc: Nick Rogers <ncrogers@gmail.com>,
 "freebsd-net@freebsd.org" <freebsd-net@freebsd.org>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 06 Mar 2013 09:12:13 -0000

On Wed, Mar 06, 2013 at 09:25:21AM +0100, Andre Oppermann wrote:

> I'm trying to create a stack graph to see which parts of the network
> stack are involved in handling your packet.

Ask people if they're using multiple pfil hooks (even just having
ipfilter loaded counts, for instance).

If that's a common factor, see
http://marc.info/?l=freebsd-net&m=133888532814565&w=2

Daniel

From owner-freebsd-net@FreeBSD.ORG  Wed Mar  6 09:13:50 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id 213FE88F;
 Wed,  6 Mar 2013 09:13:50 +0000 (UTC)
 (envelope-from ermal.luci@gmail.com)
Received: from mail-qe0-f49.google.com (mail-qe0-f49.google.com
 [209.85.128.49]) by mx1.freebsd.org (Postfix) with ESMTP id B7EB5BFC;
 Wed,  6 Mar 2013 09:13:49 +0000 (UTC)
Received: by mail-qe0-f49.google.com with SMTP id 1so5145504qec.8
 for <multiple recipients>; Wed, 06 Mar 2013 01:13:43 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:x-received:sender:in-reply-to:references:date
 :x-google-sender-auth:message-id:subject:from:to:cc:content-type;
 bh=WPUlVVDT8iKgrMxHdE8jqje/PV2KmIfkfRsKR3Hd4xg=;
 b=y6dLyPqK9SCtBHCTa5h1mD9ikmogMNJZQrvqlTtBFGsxSTOlKCNQZN5oqcUQ/nyHSw
 bnVOUemuaFSoFK147ylsavhBAv7qqXbfq/081RggUvx8cENh1vsCs8k5ba939zxBXZbd
 PyTQ3tkgQei+n/NwDMr1hdRnFCaGN2RIpzx4MvFo7hXJjK16YqU4+dJrDK+81V7Uh1Bk
 uT/XGDwnKVvF7zWT1bSzzTfBKcf3a/k5Tsl9+1XOsj4By7qtUXs76sxsU26ynjuA2X91
 xbnL89FLJFtsqAOVKkmchsgEHKU64s6mDnl8R9tbrUFnsGFpJblshbW+TwP7LzlVO8fq
 Jghg==
MIME-Version: 1.0
X-Received: by 10.49.30.70 with SMTP id q6mr46092639qeh.28.1362561223158; Wed,
 06 Mar 2013 01:13:43 -0800 (PST)
Sender: ermal.luci@gmail.com
Received: by 10.49.27.197 with HTTP; Wed, 6 Mar 2013 01:13:43 -0800 (PST)
In-Reply-To: <51370093.40009@airnet.opole.pl>
References: <CAKOb=YYGu6mr-3nyydBi9K-FHPnEx-fKSZ2=r_uDVeY9pvrqtQ@mail.gmail.com>
 <5136FD71.6000408@freebsd.org> <51370093.40009@airnet.opole.pl>
Date: Wed, 6 Mar 2013 10:13:43 +0100
X-Google-Sender-Auth: -lDIT0I12nZHgWq6L3qEAhFF5O0
Message-ID: <CAPBZQG1ZeuaPqg6tF72ziZ5-yCDtrVtM4OWO2qO9k8P+osrqDQ@mail.gmail.com>
Subject: Re: Default route changes unexpectedly
From: =?ISO-8859-1?Q?Ermal_Lu=E7i?= <eri@freebsd.org>
To: Krzysztof Barcikowski <krzysiek@airnet.opole.pl>
Content-Type: text/plain; charset=ISO-8859-1
X-Content-Filtered-By: Mailman/MimeDel 2.1.14
Cc: "freebsd-net@freebsd.org" <freebsd-net@freebsd.org>,
 Andre Oppermann <andre@freebsd.org>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 06 Mar 2013 09:13:50 -0000

On Wed, Mar 6, 2013 at 9:38 AM, Krzysztof Barcikowski <
krzysiek@airnet.opole.pl> wrote:

> On 2013-03-06 09:25, Andre Oppermann wrote:
>
>> Can you describe your traffic forwarding setup in more detail?
>> Is it only pf, or do you run netgraph, or other things as well?
>> Do you use flow routing?
>>
>> How frequently does this happen?
>>
>> I'm trying to create a stack graph to see which parts of the network
>> stack are involved in handling your packet.
>>
>>
> Hi,
> In my case, I do use PF for filtering and NAT (without routing options
> like 'route-to' or 'reply-to') together with ALTQ (PRIQ).
> I also use IPFW+Dummynet combo for shaping.
>
> net.inet.ip.sourceroute: 0
> net.inet.ip.accept_sourceroute: 0
>
> Router traffic is about 300Mb/s in peak.
>
> Frequency:
> Wed Oct 3 14:19:15 CEST 2012
> Thu Dec 13 04:39:43 CET 2012
> Thu Dec 13 04:39:46 CET 2012
> Thu Dec 13 04:39:47 CET 2012
> Thu Dec 13 04:39:50 CET 2012
> Thu Dec 13 04:39:53 CET 2012
> Thu Dec 13 04:39:59 CET 2012
> Thu Dec 13 04:40:11 CET 2012
> Fri Jan 4 07:47:00 CET 2013
> Mon Jan 28 18:35:43 CET 2013
> Sat Feb 2 22:43:01 CET 2013
>
> I do only monitor default route change, but this bug also affects static
> routes (i.e. I have one static route and it changes more frequently than
> the default route).
>
> Please let me know if I can provide any more feedback.
>
> Krzysiek
>
>
>
>
Do you have flowtable support in your kernel?
Can you try without it enabled?


>
>
> _______________________________________________
> freebsd-net@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
>


-- 
Ermal

From owner-freebsd-net@FreeBSD.ORG  Wed Mar  6 09:29:03 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id 2FF88EB
 for <freebsd-net@freebsd.org>; Wed,  6 Mar 2013 09:29:03 +0000 (UTC)
 (envelope-from krzysiek@airnet.opole.pl)
Received: from base.airnet.opole.pl (ns2.airmax.pl [176.111.128.3])
 by mx1.freebsd.org (Postfix) with ESMTP id D801DD34
 for <freebsd-net@freebsd.org>; Wed,  6 Mar 2013 09:29:02 +0000 (UTC)
Received: from localhost (localhost [127.0.0.1])
 by base.airnet.opole.pl (Postfix) with ESMTP id 0453D7FF055
 for <freebsd-net@freebsd.org>; Wed,  6 Mar 2013 10:29:00 +0100 (CET)
Received: from base.airnet.opole.pl ([127.0.0.1])
 by localhost (mail.airnet.opole.pl [127.0.0.1]) (maiad, port 10024)
 with ESMTP id 54708-04 for <freebsd-net@freebsd.org>;
 Wed,  6 Mar 2013 10:28:59 +0100 (CET)
Received: from [10.10.11.223] (unknown [176.111.138.12])
 (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits))
 (No client certificate requested)
 (Authenticated sender: krzysiek@airnet.opole.pl)
 by base.airnet.opole.pl (Postfix) with ESMTPSA id C93937FF051
 for <freebsd-net@freebsd.org>; Wed,  6 Mar 2013 10:28:59 +0100 (CET)
Message-ID: <51370C5A.1080701@airnet.opole.pl>
Date: Wed, 06 Mar 2013 10:28:58 +0100
From: Krzysztof Barcikowski <krzysiek@airnet.opole.pl>
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64;
 rv:17.0) Gecko/20130215 Thunderbird/17.0.3
MIME-Version: 1.0
To: freebsd-net@freebsd.org
Subject: Re: Default route changes unexpectedly
References: <CAKOb=YYGu6mr-3nyydBi9K-FHPnEx-fKSZ2=r_uDVeY9pvrqtQ@mail.gmail.com>
 <5136FD71.6000408@freebsd.org> <51370093.40009@airnet.opole.pl>
 <CAPBZQG1ZeuaPqg6tF72ziZ5-yCDtrVtM4OWO2qO9k8P+osrqDQ@mail.gmail.com>
In-Reply-To: <CAPBZQG1ZeuaPqg6tF72ziZ5-yCDtrVtM4OWO2qO9k8P+osrqDQ@mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 8bit
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 06 Mar 2013 09:29:03 -0000

I believe I don't have flowtable support in the kernel (no FLOWTABLE
option), and there are no sysctls related to flowtable.

How can I check whether I'm using multiple pfil hooks?

Best regards!
Krzysiek

On 2013-03-06 10:13, Ermal Luçi wrote:
> On Wed, Mar 6, 2013 at 9:38 AM, Krzysztof Barcikowski <
> krzysiek@airnet.opole.pl> wrote:
>
>> On 2013-03-06 09:25, Andre Oppermann wrote:
>>
>>> Can you describe your traffic forwarding setup in more detail?
>>> Is it only pf, or do you run netgraph, or other things as well?
>>> Do you use flow routing?
>>>
>>> How frequently does this happen?
>>>
>>> I'm trying to create a stack graph to see which parts of the network
>>> stack are involved in handling your packet.
>>>
>>>
>> Hi,
>> In my case, I do use PF for filtering and NAT (without routing options
>> like 'route-to' or 'reply-to') together with ALTQ (PRIQ).
>> I also use IPFW+Dummynet combo for shaping.
>>
>> net.inet.ip.sourceroute: 0
>> net.inet.ip.accept_sourceroute: 0
>>
>> Router traffic is about 300Mb/s in peak.
>>
>> Frequency:
>> Wed Oct 3 14:19:15 CEST 2012
>> Thu Dec 13 04:39:43 CET 2012
>> Thu Dec 13 04:39:46 CET 2012
>> Thu Dec 13 04:39:47 CET 2012
>> Thu Dec 13 04:39:50 CET 2012
>> Thu Dec 13 04:39:53 CET 2012
>> Thu Dec 13 04:39:59 CET 2012
>> Thu Dec 13 04:40:11 CET 2012
>> Fri Jan 4 07:47:00 CET 2013
>> Mon Jan 28 18:35:43 CET 2013
>> Sat Feb 2 22:43:01 CET 2013
>>
>> I do only monitor default route change, but this bug also affects static
>> routes (i.e. I have one static route and it changes more frequently than
>> the default route).
>>
>> Please let me know if I can provide any more feedback.
>>
>> Krzysiek
>>
>>
>>
>>
> Do you have flowtable support in your kernel?
> Can you try without it enabled?
>
>
>>
>> _______________________________________________
>> freebsd-net@freebsd.org mailing list
>> http://lists.freebsd.org/mailman/listinfo/freebsd-net
>> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
>>
>
>


From owner-freebsd-net@FreeBSD.ORG  Wed Mar  6 10:00:38 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id 0E850C2A
 for <freebsd-net@freebsd.org>; Wed,  6 Mar 2013 10:00:38 +0000 (UTC)
 (envelope-from emz@norma.perm.ru)
Received: from elf.hq.norma.perm.ru (unknown [IPv6:2001:470:1f09:14c0::2])
 by mx1.freebsd.org (Postfix) with ESMTP id 575CFE5D
 for <freebsd-net@freebsd.org>; Wed,  6 Mar 2013 10:00:37 +0000 (UTC)
Received: from bsdrookie.norma.com. ([IPv6:fd00::726])
 by elf.hq.norma.perm.ru (8.14.5/8.14.5) with ESMTP id r26A0YB7029546
 (version=TLSv1/SSLv3 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO)
 for <freebsd-net@freebsd.org>; Wed, 6 Mar 2013 16:00:35 +0600 (YEKT)
 (envelope-from emz@norma.perm.ru)
Message-ID: <513713C2.1000007@norma.perm.ru>
Date: Wed, 06 Mar 2013 16:00:34 +0600
From: "Eugene M. Zheganin" <emz@norma.perm.ru>
User-Agent: Mozilla/5.0 (X11; FreeBSD amd64;
 rv:17.0) Gecko/17.0 Thunderbird/17.0
MIME-Version: 1.0
To: freebsd-net@freebsd.org
Subject: Re: FreeBSD 9.1-RELEASE + bge0 == watchdog timeout
References: <F02BE044-1C4F-43EB-8091-BC62362C2E5F@sd63.bc.ca>
 <D557DE29-DED8-4B89-9D1C-171FC17D435E@hub.org>
 <201302241106.42477.vegeta@tuxpowered.net>
 <20130225082042.GB1426@michelle.cdnetworks.com>
 <512CF97B.8030805@norma.perm.ru>
 <20130227020123.GA3581@michelle.cdnetworks.com> <512DE968.4020409@quip.cz>
 <20130228053558.GA1474@michelle.cdnetworks.com>
 <5136D89D.4000902@norma.perm.ru>
 <20130306062658.GC1483@michelle.cdnetworks.com>
In-Reply-To: <20130306062658.GC1483@michelle.cdnetworks.com>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.2.7
 (elf.hq.norma.perm.ru [IPv6:fd00::30a]);
 Wed, 06 Mar 2013 16:00:35 +0600 (YEKT)
X-Spam-Status: No hits=-97.8 bayes=0.5 testhits RDNS_NONE=1.274,
 SPF_SOFTFAIL=0.972,USER_IN_WHITELIST=-100 autolearn=no version=3.3.2
X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on elf.hq.norma.perm.ru
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 06 Mar 2013 10:00:38 -0000

Hi.

On 06.03.2013 12:26, YongHyeon PYUN wrote:
> If you were using latest stable/8, the result would be same on
> CURRENT.
> How frequently do you see the watchdog timeouts? Is there way to
> reproduce it?
> Would you show me the output of dmesg (bge(4) and brgphy(4) only)
> and "pciconf -lcbv"?
I upgraded one of my routers to 8.3-STABLE two days ago, and today it
froze. Uptime was less than a day.
I have dozens of these IBM System x3250 machines, all of them running
various 8.2-STABLE builds, which is why I am so worried. I don't know
whether this is triggered by something I do. These routers run gre/ipsec,
different routing daemons (quagga, bird), proxies and pf. In 2011/early
2012 I saw similar watchdog issues on these machines, and I disabled TSO
on them. I don't know whether that was a coincidence or whether it really
helped, but after that I didn't see these watchdog issues until today.
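
For reference, TSO can be checked and switched off per interface with
ifconfig(8). A minimal sketch, assuming the interface is bge0 as in this
report; the helper names are made up for illustration, and the check
simply looks for TSO4 in the options line that ifconfig prints:

```shell
#!/bin/sh
# Succeed if the `ifconfig <interface>` output on stdin lists TSO4
# among the enabled options.
tso_enabled() {
    grep -q 'options=.*TSO4'
}

# Disable TSO on the named interface, but only if it is currently on.
disable_tso() {
    ifname="$1"
    if ifconfig "$ifname" | tso_enabled; then
        ifconfig "$ifname" -tso
    fi
}

# Example (needs root on a real system):
#   disable_tso bge0
```

To keep the setting across reboots, -tso can be appended to the
ifconfig_bge0 line in /etc/rc.conf.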

I've also discovered that this particular server is running old BIOS and
firmware versions, and that it is missing some NetXtreme updates
available from IBM. Would applying these updates resolve the problem?

I am OK with the fact that I cannot run IPMI/SOL on these machines, but
it would be nice if this watchdog issue could be resolved somehow.
Furthermore, I have some spare machines that I can provide full access
to, including IP KVM. Since the machine freezes only partially, I cannot
even rely on ichwd and watchdogd to reboot it.

pciconf (there are two controllers in this server; I use only the first,
but both are shown):

bge0@pci0:2:0:0:        class=0x020000 card=0x03781014 chip=0x165a14e4 rev=0x00 hdr=0x00
    vendor     = 'Broadcom Corporation'
    device     = 'Broadcom NetXtreme BCM5722 Gigabit (94309)'
    class      = network
    subclass   = ethernet
    bar   [10] = type Memory, range 64, base 0xe8200000, size 65536, enabled
    cap 01[48] = powerspec 3  supports D0 D3  current D0
    cap 03[50] = VPD
    cap 09[58] = vendor (length 120)
    cap 05[e8] = MSI supports 1 message, 64 bit enabled with 1 message
    cap 10[d0] = PCI-Express 1 endpoint max data 128(128) link x1(x1)
                 speed 2.5(2.5)
ecap 0001[100] = AER 1 0 fatal 0 non-fatal 2 corrected
ecap 0002[13c] = VC 1 max VC0
ecap 0003[160] = Serial 1 001a64fffe21962d
ecap 0004[16c] = Power Budgeting 1
bge1@pci0:3:1:0:        class=0x020000 card=0x026f1014 chip=0x16c714e4 rev=0x10 hdr=0x00
    vendor     = 'Broadcom Corporation'
    device     = 'BCM5703A3 NetXtreme Gigabit Ethernet'
    class      = network
    subclass   = ethernet
    bar   [10] = type Memory, range 64, base 0xe8400000, size 65536, enabled
    cap 07[40] = PCI-X 64-bit supports 133MHz, 2048 burst read, 1 split transaction
    cap 01[48] = powerspec 2  supports D0 D3  current D0
    cap 03[50] = VPD
    cap 05[58] = MSI supports 8 messages, 64 bit

dmesg:

bge0: <Broadcom NetXtreme Gigabit Ethernet Controller, ASIC rev. 0x00a200> mem 0xe8200000-0xe820ffff irq 16 at device 0.0 on pci2
bge0: CHIP ID 0x0000a200; ASIC REV 0x0a; CHIP REV 0xa2; PCI-E
miibus0: <MII bus> on bge0
bge0: Ethernet address: 00:1a:64:21:96:2d
bge0: [FILTER]
bge1: <Broadcom NetXtreme Gigabit Ethernet Controller, ASIC rev. 0x001100> mem 0xe8400000-0xe840ffff irq 21 at device 1.0 on pci3
bge1: CHIP ID 0x00001100; ASIC REV 0x01; CHIP REV 0x11; PCI on PCI-X 33 MHz; 32bit
miibus1: <MII bus> on bge1
bge1: Ethernet address: 00:1a:64:21:96:2e
bge1: [ITHREAD]
[emz@omega:~]# cat /var/run/dmesg.boot | egrep 'bge|brg'
bge0: <Broadcom NetXtreme Gigabit Ethernet Controller, ASIC rev. 0x00a200> mem 0xe8200000-0xe820ffff irq 16 at device 0.0 on pci2
bge0: CHIP ID 0x0000a200; ASIC REV 0x0a; CHIP REV 0xa2; PCI-E
miibus0: <MII bus> on bge0
brgphy0: <BCM5722 10/100/1000baseTX PHY> PHY 1 on miibus0
brgphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow
bge0: Ethernet address: 00:1a:64:21:96:2d
bge0: [FILTER]
bge1: <Broadcom NetXtreme Gigabit Ethernet Controller, ASIC rev. 0x001100> mem 0xe8400000-0xe840ffff irq 21 at device 1.0 on pci3
bge1: CHIP ID 0x00001100; ASIC REV 0x01; CHIP REV 0x11; PCI on PCI-X 33 MHz; 32bit
miibus1: <MII bus> on bge1
brgphy1: <BCM5703 10/100/1000baseTX PHY> PHY 1 on miibus1
brgphy1:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow
bge1: Ethernet address: 00:1a:64:21:96:2e
bge1: [ITHREAD]


Thanks.
Eugene.

From owner-freebsd-net@FreeBSD.ORG  Wed Mar  6 10:12:11 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id 4CC711CF
 for <freebsd-net@freebsd.org>; Wed,  6 Mar 2013 10:12:11 +0000 (UTC)
 (envelope-from s.khanchi@gmail.com)
Received: from mail-wg0-f49.google.com (mail-wg0-f49.google.com [74.125.82.49])
 by mx1.freebsd.org (Postfix) with ESMTP id E2AFDEF1
 for <freebsd-net@freebsd.org>; Wed,  6 Mar 2013 10:12:10 +0000 (UTC)
Received: by mail-wg0-f49.google.com with SMTP id 15so7048352wgd.28
 for <freebsd-net@freebsd.org>; Wed, 06 Mar 2013 02:12:10 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=x-received:mime-version:sender:in-reply-to:references:from:date
 :x-google-sender-auth:message-id:subject:to:cc:content-type;
 bh=9lMR4V2uS0cOekjpDYrJNsbRjYW9wEFwVn9BHGebNcs=;
 b=l2NaisGQprDMO3rJF2qNkCZidkDHC31iZkXN+ziBdLzcuHog1SBNHk4+s1cBv9ZSM6
 TvvIYVLnoC/4Pmdp3NE7ujTAz+sXF7T5/sZDc93QVH1sFpqJ3cy+UkwRev6GqJshXQRQ
 lMqs1nekO3J2R7D/V0DbXnu2eZaH+CTNmh8P+pm0BTDM/AGqbTEDprEC8Cm+HYr9bBpj
 z7BkV1CorlKMLWbvZ3adWnB2yf6xSmIv7OSDhV6hKSWtS8m1myVw4z3MR3uh7XXQPFzL
 y2+oxZ4uvuM/Z4/09TWcn1uqenJtPixBhUSyhtqv49pgw3mvDeXaoXt1Ee/tiJ/LmQJp
 HQfQ==
X-Received: by 10.194.170.165 with SMTP id an5mr45512266wjc.41.1362564730098; 
 Wed, 06 Mar 2013 02:12:10 -0800 (PST)
MIME-Version: 1.0
Sender: s.khanchi@gmail.com
Received: by 10.194.121.104 with HTTP; Wed, 6 Mar 2013 02:11:47 -0800 (PST)
In-Reply-To: <8EB66934-D33C-425E-A076-66E31B618DCA@neville-neil.com>
References: <CAARSjE3h87y00_JeurzPzmkDaU5C58v=iLB-etwJ0RdtLh5f+g@mail.gmail.com>
 <8EB66934-D33C-425E-A076-66E31B618DCA@neville-neil.com>
From: h bagade <bagadeh@gmail.com>
Date: Wed, 6 Mar 2013 13:41:47 +0330
X-Google-Sender-Auth: 6_LdrbOOVlHX6mo_X5ws-iu2EzI
Message-ID: <CAARSjE3muSWfq26k9QvbbtkaZoFJPhZOGMB3bjPRq-UpfTUOqA@mail.gmail.com>
Subject: Re: how to get mac address info in kernel code?
To: George Neville-Neil <gnn@neville-neil.com>
Content-Type: text/plain; charset=ISO-8859-1
X-Content-Filtered-By: Mailman/MimeDel 2.1.14
Cc: freebsd-net@freebsd.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 06 Mar 2013 10:12:11 -0000

On Tue, Mar 5, 2013 at 7:23 PM, George Neville-Neil <gnn@neville-neil.com> wrote:

>
> On Mar 5, 2013, at 08:54 , h bagade <bagadeh@gmail.com> wrote:
>
> > Hi all,
> >
> > I need to get interface MAC address within the kernel code and I couldn't
> > use "getifaddrs" because it's user-mode. How can I have the MAC address
> > information within kernel code?
> >
> > Any hints or comments are really appreciated.
>
> If you have access to the struct ifnet you can look at the if_addr member,
> which is a struct ifaddr, defined in if_var.h.
>
> Best,
> George
>

Thanks for your suggestion. I will give it a try.

From owner-freebsd-net@FreeBSD.ORG  Wed Mar  6 11:56:33 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id 82CEA686
 for <freebsd-net@freebsd.org>; Wed,  6 Mar 2013 11:56:33 +0000 (UTC)
 (envelope-from emz@norma.perm.ru)
Received: from elf.hq.norma.perm.ru (unknown [IPv6:2001:470:1f09:14c0::2])
 by mx1.freebsd.org (Postfix) with ESMTP id 53010601
 for <freebsd-net@freebsd.org>; Wed,  6 Mar 2013 11:56:32 +0000 (UTC)
Received: from bsdrookie.norma.com. ([IPv6:fd00::726])
 by elf.hq.norma.perm.ru (8.14.5/8.14.5) with ESMTP id r26BuTII049900
 (version=TLSv1/SSLv3 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO)
 for <freebsd-net@freebsd.org>; Wed, 6 Mar 2013 17:56:30 +0600 (YEKT)
 (envelope-from emz@norma.perm.ru)
Message-ID: <51372EED.7080803@norma.perm.ru>
Date: Wed, 06 Mar 2013 17:56:29 +0600
From: "Eugene M. Zheganin" <emz@norma.perm.ru>
User-Agent: Mozilla/5.0 (X11; FreeBSD amd64;
 rv:17.0) Gecko/17.0 Thunderbird/17.0
MIME-Version: 1.0
CC: freebsd-net@freebsd.org
Subject: Re: FreeBSD 9.1-RELEASE + bge0 == watchdog timeout
References: <F02BE044-1C4F-43EB-8091-BC62362C2E5F@sd63.bc.ca>
 <D557DE29-DED8-4B89-9D1C-171FC17D435E@hub.org>
 <201302241106.42477.vegeta@tuxpowered.net>
 <20130225082042.GB1426@michelle.cdnetworks.com>
 <512CF97B.8030805@norma.perm.ru>
 <20130227020123.GA3581@michelle.cdnetworks.com> <512DE968.4020409@quip.cz>
 <20130228053558.GA1474@michelle.cdnetworks.com>
 <5136D89D.4000902@norma.perm.ru>
 <20130306062658.GC1483@michelle.cdnetworks.com>
In-Reply-To: <20130306062658.GC1483@michelle.cdnetworks.com>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.2.7
 (elf.hq.norma.perm.ru [IPv6:fd00::30a]);
 Wed, 06 Mar 2013 17:56:30 +0600 (YEKT)
X-Spam-Status: No hits=-96.5 bayes=0.5 testhits MISSING_HEADERS=1.207,
 RDNS_NONE=1.274,SPF_SOFTFAIL=0.972,USER_IN_WHITELIST=-100 autolearn=no
 version=3.3.2
X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on elf.hq.norma.perm.ru
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 06 Mar 2013 11:56:33 -0000

Hi.

On 06.03.2013 12:26, YongHyeon PYUN wrote:
> If you were using latest stable/8, the result would be same on
> CURRENT.
> How frequently do you see the watchdog timeouts? Is there way to
> reproduce it?
> Would you show me the output of dmesg (bge(4) and brgphy(4) only)
> and "pciconf -lcbv"?
A thought just occurred to me: I have never seen a watchdog timeout on i386.
Like, never (on the same x3250 systems with the same controllers; these
servers are from the same batch). However, all of my i386 machines run less
recent versions of FreeBSD.
Does this make sense? I mean amd64 and related stuff.

Thanks
Eugene.

From owner-freebsd-net@FreeBSD.ORG  Wed Mar  6 15:21:09 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id A5AFEE7B
 for <freebsd-net@freebsd.org>; Wed,  6 Mar 2013 15:21:09 +0000 (UTC)
 (envelope-from ncrogers@gmail.com)
Received: from sam.nabble.com (sam.nabble.com [216.139.236.26])
 by mx1.freebsd.org (Postfix) with ESMTP id 775A8E5
 for <freebsd-net@freebsd.org>; Wed,  6 Mar 2013 15:21:08 +0000 (UTC)
Received: from [192.168.236.26] (helo=sam.nabble.com)
 by sam.nabble.com with esmtp (Exim 4.72)
 (envelope-from <ncrogers@gmail.com>) id 1UDG9O-0004DD-Ax
 for freebsd-net@freebsd.org; Wed, 06 Mar 2013 07:21:02 -0800
Date: Wed, 6 Mar 2013 07:21:02 -0800 (PST)
From: Courtland <ncrogers@gmail.com>
To: freebsd-net@freebsd.org
Message-ID: <1362583262334-5793139.post@n5.nabble.com>
In-Reply-To: <2DE61B0869B7484997BCA012845482C7EBE62DDD88@WIN2008.Domnt.abi.ca>
References: <2DE61B0869B7484997BCA012845482C7EBE62DDD88@WIN2008.Domnt.abi.ca>
Subject: Re: kernel: arpresolve: can't allocate llinfo for 65.59.233.102
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 06 Mar 2013 15:21:09 -0000

Has there been any progress on resolving this problem? Does anyone have a
better idea as to where it is breaking down?

I am experiencing the same problem under FreeBSD 9.1-RELEASE. I use PF for
NAT, ALTQ, and RDR/filter rules. I'm not using PPPoE or dhclient. The
default gateway changes to an IP that is not on my network when under heavy
network load.

The last time this happened I had a stream of arpresolve messages in the
kernel for the IP that the default route was changed to.
Mar  5 19:12:53  kernel: arpresolve: can't allocate llinfo for
50.142.201.101
The default route was changed to 50.142.201.101 after these messages.
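
For anyone trying to correlate these messages with the route change, here is a
minimal sketch (not a fix) that pulls the addresses out of such log lines so
they can be compared against the current default route. The helper name
`arpresolve_targets` and the sample text are assumptions for illustration; the
regex lets the address wrap onto the next line, as it does in the message
above.

```python
import re
from collections import Counter

# Matches the kernel message quoted above; \s+ lets the address sit on the
# following line, as happens when the log line is wrapped in mail.
ARPRESOLVE_RE = re.compile(
    r"arpresolve: can't allocate llinfo for\s+(\d{1,3}(?:\.\d{1,3}){3})"
)

def arpresolve_targets(log_text):
    """Count how often each IPv4 address appears in arpresolve failures."""
    return Counter(ARPRESOLVE_RE.findall(log_text))

# Hypothetical input: the single log line from this message.
sample = (
    "Mar  5 19:12:53  kernel: arpresolve: can't allocate llinfo for\n"
    "50.142.201.101\n"
)
print(arpresolve_targets(sample))
```

The most frequent address can then be checked by hand against the gateway
shown in the routing table at the time of the failure.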


--
View this message in context: http://freebsd.1045724.n5.nabble.com/kernel-arpresolve-can-t-allocate-llinfo-for-65-59-233-102-tp5742320p5793139.html
Sent from the freebsd-net mailing list archive at Nabble.com.

From owner-freebsd-net@FreeBSD.ORG  Wed Mar  6 16:16:28 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id 3EFB9EE2
 for <freebsd-net@freebsd.org>; Wed,  6 Mar 2013 16:16:28 +0000 (UTC)
 (envelope-from adrian.chadd@gmail.com)
Received: from mail-we0-x22f.google.com (mail-we0-x22f.google.com
 [IPv6:2a00:1450:400c:c03::22f])
 by mx1.freebsd.org (Postfix) with ESMTP id D88855F2
 for <freebsd-net@freebsd.org>; Wed,  6 Mar 2013 16:16:27 +0000 (UTC)
Received: by mail-we0-f175.google.com with SMTP id x8so8338885wey.20
 for <freebsd-net@freebsd.org>; Wed, 06 Mar 2013 08:16:27 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:x-received:sender:date:x-google-sender-auth:message-id
 :subject:from:to:cc:content-type;
 bh=4qEZV/i3KKzSEHVRGpNufiVbrhmx3176fBsOpgkhyPY=;
 b=uxgywnmT5phLvZ5mEVfjL2AH4lHLBxQ8qVuRl6hFv6bO3npOOjmMz7tmcUsxPkgJuy
 FbelbZPuxDvj/dTPcmoQbqYueGP+S2tZhlZVmqTWbrVmZ+PLmYyU65JOL5iSRcjeR2Yb
 rCkEyoI7UclFg/s/vdSvDfj8SKd0+AyDEDJPCGBNjsarLP42numuufbkyWpFGpsqW9AY
 riBCbk0HkGIwK2jtmynQ/j8uOyX5to5yB5Wz28QWhjmg2ElZJDiD1O/fJAbM5JlbNmSU
 1+cL+AskzCTBhO7LZqtYH2w8TSL63WqTfSfXzn3OxDJNO4K4/LYi3oK6f4IH+S1JVIjf
 gpSA==
MIME-Version: 1.0
X-Received: by 10.180.87.170 with SMTP id az10mr27682576wib.3.1362586584029;
 Wed, 06 Mar 2013 08:16:24 -0800 (PST)
Sender: adrian.chadd@gmail.com
Received: by 10.217.51.2 with HTTP; Wed, 6 Mar 2013 08:16:23 -0800 (PST)
Date: Wed, 6 Mar 2013 08:16:23 -0800
X-Google-Sender-Auth: XyQlxVmqtxkTvTsrIfDMhQHFid4
Message-ID: <CAJ-VmokO85PaW6DksfqsBuSdMPUEGv9f_2qEMAD7WTv4QK0vcw@mail.gmail.com>
Subject: Default route changes unexpectedly #2 (was Re: kernel: arpresolve:
 can't allocate llinfo for 65.59.233.102)
From: Adrian Chadd <adrian@freebsd.org>
To: Courtland <ncrogers@gmail.com>
Content-Type: text/plain; charset=ISO-8859-1
Cc: freebsd-net@freebsd.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 06 Mar 2013 16:16:28 -0000

Another instance of it..


Adrian


On 6 March 2013 07:21, Courtland <ncrogers@gmail.com> wrote:
> Has there been any progress on resolving this problem. Does anyone have a
> better idea as to where it is breaking down?
>
> I am experiencing the same problem under FreeBSD 9.1-RELEASE. I use PF for
> NAT, ALTQ, and RDR/filter rules. I'm not using PPPoE or dhclient. The
> default gateway changes to an IP that is not on my network when under heavy
> network load.
>
> The last time this happened I had a stream of arpresolve messages in the
> kernel for the IP that the default route was changed to.
> Mar  5 19:12:53  kernel: arpresolve: can't allocate llinfo for
> 50.142.201.101
> The default route was changed to 50.142.201.101 after these messages.
>
>
>
>
> --
> View this message in context: http://freebsd.1045724.n5.nabble.com/kernel-arpresolve-can-t-allocate-llinfo-for-65-59-233-102-tp5742320p5793139.html
> Sent from the freebsd-net mailing list archive at Nabble.com.
> _______________________________________________
> freebsd-net@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"

From owner-freebsd-net@FreeBSD.ORG  Wed Mar  6 18:27:48 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id 745B14DB
 for <freebsd-net@freebsd.org>; Wed,  6 Mar 2013 18:27:48 +0000 (UTC)
 (envelope-from andre@freebsd.org)
Received: from c00l3r.networx.ch (c00l3r.networx.ch [62.48.2.2])
 by mx1.freebsd.org (Postfix) with ESMTP id D2C13DBB
 for <freebsd-net@freebsd.org>; Wed,  6 Mar 2013 18:27:47 +0000 (UTC)
Received: (qmail 67564 invoked from network); 6 Mar 2013 19:41:23 -0000
Received: from c00l3r.networx.ch (HELO [127.0.0.1]) ([62.48.2.2])
 (envelope-sender <andre@freebsd.org>)
 by c00l3r.networx.ch (qmail-ldap-1.03) with SMTP
 for <ncrogers@gmail.com>; 6 Mar 2013 19:41:23 -0000
Message-ID: <51378A9D.6080306@freebsd.org>
Date: Wed, 06 Mar 2013 19:27:41 +0100
From: Andre Oppermann <andre@freebsd.org>
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64;
 rv:17.0) Gecko/20130107 Thunderbird/17.0.2
MIME-Version: 1.0
To: Courtland <ncrogers@gmail.com>
Subject: Re: Default route changes unexpectedly #2 (was Re: kernel: arpresolve:
 can't allocate llinfo for 65.59.233.102)
References: <CAJ-VmokO85PaW6DksfqsBuSdMPUEGv9f_2qEMAD7WTv4QK0vcw@mail.gmail.com>
In-Reply-To: <CAJ-VmokO85PaW6DksfqsBuSdMPUEGv9f_2qEMAD7WTv4QK0vcw@mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: freebsd-net@freebsd.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 06 Mar 2013 18:27:48 -0000

Courtland,

the arpresolve observation is very important.  Do you have flowtable
enabled in your kernel?

-- 
Andre

On 06.03.2013 17:16, Adrian Chadd wrote:
> Another instance of it..
> Adrian
> On 6 March 2013 07:21, Courtland <ncrogers@gmail.com> wrote:
>> Has there been any progress on resolving this problem. Does anyone have a
>> better idea as to where it is breaking down?
>>
>> I am experiencing the same problem under FreeBSD 9.1-RELEASE. I use PF for
>> NAT, ALTQ, and RDR/filter rules. I'm not using PPPoE or dhclient. The
>> default gateway changes to an IP that is not on my network when under heavy
>> network load.
>>
>> The last time this happened I had a stream of arpresolve messages in the
>> kernel for the IP that the default route was changed to.
>> Mar  5 19:12:53  kernel: arpresolve: can't allocate llinfo for
>> 50.142.201.101
>> The default route was changed to 50.142.201.101 after these messages.
>>
>>
>>
>>
>> --
>> View this message in context: http://freebsd.1045724.n5.nabble.com/kernel-arpresolve-can-t-allocate-llinfo-for-65-59-233-102-tp5742320p5793139.html
>> Sent from the freebsd-net mailing list archive at Nabble.com.


From owner-freebsd-net@FreeBSD.ORG  Wed Mar  6 21:02:37 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id 008BC8E8
 for <freebsd-net@freebsd.org>; Wed,  6 Mar 2013 21:02:36 +0000 (UTC)
 (envelope-from fbsdmail@dnswatch.com)
Received: from udns.ultimateDNS.NET (ultimatedns.net [209.180.214.225])
 by mx1.freebsd.org (Postfix) with ESMTP id B7ABA836
 for <freebsd-net@freebsd.org>; Wed,  6 Mar 2013 21:02:35 +0000 (UTC)
Received: from udns.ultimateDNS.NET (localhost [127.0.0.1])
 by udns.ultimateDNS.NET (8.14.5/8.14.5) with ESMTP id r26L2TUN033493
 for <freebsd-net@freebsd.org>; Wed, 6 Mar 2013 13:02:35 -0800 (PST)
 (envelope-from fbsdmail@dnswatch.com)
Received: (from www@localhost)
 by udns.ultimateDNS.NET (8.14.5/8.14.5/Submit) id r26L2OPa033492;
 Wed, 6 Mar 2013 13:02:24 -0800 (PST)
 (envelope-from fbsdmail@dnswatch.com)
Received: from udns.ultimatedns.net ([209.180.214.225])
 (UDNSMS authenticated user chrish) by ultimatedns.net with HTTP;
 Wed, 6 Mar 2013 13:02:24 -0800 (PST)
Message-ID: <b77c4b60019d745d151be9ba3e5446cc.authenticated@ultimatedns.net>
Date: Wed, 6 Mar 2013 13:02:24 -0800 (PST)
Subject: Implementing IP6 in 8.3
From: "freebsd-net" <fbsdmail@dnswatch.com>
To: "freebsd-net" <freebsd-net@freebsd.org>
User-Agent: UDNSMS/2.0.3
MIME-Version: 1.0
Content-Type: text/plain;charset=utf-8
Content-Transfer-Encoding: 8bit
X-Priority: 3 (Normal)
Importance: Normal
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 06 Mar 2013 21:02:37 -0000

Greetings,
 I'm evaluating an ISP for the sake of building BSD operating systems on hardware
that they use (DSL modems, in this case). When I had my old NEC server, I had a
MIPS environment to develop in. I managed a 28k kernel. In any case, I'm back at
it for use in a lot of hardware I have lying around. In my current situation, I'm
using a ZYXEL Q1000Z modem to connect to their service. While it's a relatively
new modem, it doesn't support IP6. It is my hope to replace the OS with one that
does. :)
I leased a /48 of IP4's from them, which /also/ came with as many IP6's.
So, not having implemented IP6 on any of my boxes (except by way of tunnel brokers),
I'm wondering 2 things:
If my underlying OS (FreeBSD-8.3) can support IP6, will it still function, even tho
my gateway (modem) doesn't?
Am I /correctly/ attempting to use it?
I'm answering authoritatively for the many domains I own. They have all functioned
well for many years via IP4. I have added the requisite AAAA records in all the zones,
as well as the associated RR's.
While the gateway (modem) /does/ have an IP6 address, I can't "speak" for it out of
DNS, because it would be an "out of zone" record. Even tho I'm the RP for the /48.
So it's up to the modem to answer accordingly.
BUT, I'm not sure I'm initiating any of this correctly via rc(8). Or more specifically,
via rc.conf(5). While I've read as much as I can find on the topic related to BSD,
boot messages indicate at least -- "IP6 gateway unreachable".
I'm currently using:
rc.conf(5):
ipv6_ifconfig_re0="2602:00d1:b4d6:e100:0000:0000:0000:0000"
ipv6_defaultrouter="2602:00d1:b4d6:e600:0000:0000:0000:0000"
I also have the corresponding host IP in hosts(5).
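
A quick sanity check on the two addresses above, as a sketch only. It assumes
a /64 prefix length on re0 (the rc.conf lines give none) and uses Python's
standard `ipaddress` module; the helper name `router_on_link` is made up for
illustration. The two addresses differ in the fourth group (e100 vs e600), so
under the /64 assumption the default router would not be on the same link as
the interface address. That would be consistent with a "gateway unreachable"
message, though it does not prove that this is the cause.

```python
import ipaddress

def router_on_link(host, router, prefixlen):
    """Return True if router lies inside host's subnet of the given length."""
    subnet = ipaddress.IPv6Network(f"{host}/{prefixlen}", strict=False)
    return ipaddress.IPv6Address(router) in subnet

# Addresses copied from the rc.conf lines above.
HOST = "2602:00d1:b4d6:e100:0000:0000:0000:0000"
ROUTER = "2602:00d1:b4d6:e600:0000:0000:0000:0000"

# ASSUMPTION: /64 on re0; the config above states no prefix length.
print("on-link with /64:", router_on_link(HOST, ROUTER, 64))
# For comparison, the whole /48 mentioned earlier in this message.
print("on-link with /48:", router_on_link(HOST, ROUTER, 48))
```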

Any help, pointers, guidance, answers /greatly/ appreciated.

Thank you for all your time, and consideration.

--Chris


From owner-freebsd-net@FreeBSD.ORG  Thu Mar  7 02:25:00 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id EA82ADD2
 for <freebsd-net@freebsd.org>; Thu,  7 Mar 2013 02:25:00 +0000 (UTC)
 (envelope-from pyunyh@gmail.com)
Received: from mail-pb0-f47.google.com (mail-pb0-f47.google.com
 [209.85.160.47]) by mx1.freebsd.org (Postfix) with ESMTP id 94D46926
 for <freebsd-net@freebsd.org>; Thu,  7 Mar 2013 02:25:00 +0000 (UTC)
Received: by mail-pb0-f47.google.com with SMTP id rp2so6899271pbb.20
 for <freebsd-net@freebsd.org>; Wed, 06 Mar 2013 18:24:54 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=x-received:from:date:to:cc:subject:message-id:reply-to:references
 :mime-version:content-type:content-disposition:in-reply-to
 :user-agent; bh=YPZ8COZutJIEvqb0+oF9RXA1P51EFK3FBcE9iwJVvH8=;
 b=r5pS8R6NfJMjusQLxzlaVWwwEro8drxWKzLr4BR0VE8bEI8nBhPIP/NI1/agAYyp+J
 V0wm6n/1NuQco0paym2Rnvt6P3sJW3ZMEVzWhLOmn1AeDpbffT0tQZtjfKzHSHTYZfiH
 LOoOEcPVVmzIsAr3EW5PmOYeSQMHmS2TLlcng8VTJfcjaP00xRSvZyTaE88VXkw97nme
 lO0uHstQy7XMzUljgvn6/LD/ZZL/wUyRUeAD9T3uo+eBYrbRYWlXTttI+mPzxIKRCqDf
 LFrlIr1USSYbS+ydvD7ghzOZyPWAFIqa4Wbcjv9W++e7kjQht0lwpUuWQgIkrBaUgzw2
 YsZw==
X-Received: by 10.68.33.98 with SMTP id q2mr51351215pbi.135.1362623094627;
 Wed, 06 Mar 2013 18:24:54 -0800 (PST)
Received: from pyunyh@gmail.com (lpe4.p59-icn.cdngp.net. [114.111.62.249])
 by mx.google.com with ESMTPS id av14sm243052pac.18.2013.03.06.18.24.51
 (version=TLSv1 cipher=RC4-SHA bits=128/128);
 Wed, 06 Mar 2013 18:24:53 -0800 (PST)
Received: by pyunyh@gmail.com (sSMTP sendmail emulation);
 Thu, 07 Mar 2013 11:24:46 +0900
From: YongHyeon PYUN <pyunyh@gmail.com>
Date: Thu, 7 Mar 2013 11:24:46 +0900
To: "Eugene M. Zheganin" <emz@norma.perm.ru>
Subject: Re: FreeBSD 9.1-RELEASE + bge0 == watchdog timeout
Message-ID: <20130307022446.GB3108@michelle.cdnetworks.com>
References: <D557DE29-DED8-4B89-9D1C-171FC17D435E@hub.org>
 <201302241106.42477.vegeta@tuxpowered.net>
 <20130225082042.GB1426@michelle.cdnetworks.com>
 <512CF97B.8030805@norma.perm.ru>
 <20130227020123.GA3581@michelle.cdnetworks.com> <512DE968.4020409@quip.cz>
 <20130228053558.GA1474@michelle.cdnetworks.com>
 <5136D89D.4000902@norma.perm.ru>
 <20130306062658.GC1483@michelle.cdnetworks.com>
 <513713C2.1000007@norma.perm.ru>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <513713C2.1000007@norma.perm.ru>
User-Agent: Mutt/1.4.2.3i
Cc: freebsd-net@freebsd.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
Reply-To: pyunyh@gmail.com
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 07 Mar 2013 02:25:01 -0000

On Wed, Mar 06, 2013 at 04:00:34PM +0600, Eugene M. Zheganin wrote:
> Hi.
> 
> On 06.03.2013 12:26, YongHyeon PYUN wrote:
> > If you were using latest stable/8, the result would be same on
> > CURRENT.
> > How frequently do you see the watchdog timeouts? Is there way to
> > reproduce it?
> > Would you show me the output of dmesg (bge(4) and brgphy(4) only)
> > and "pciconf -lcbv"?
> I upgraded one of my routers 2 days ago to 8.3-STABLE, and today it
> froze. Uptime was less than a day.
> I have like dozens of these IBM system x3250, all of them run various
> 8.2-STABLE's, that's why I worry that much. I don't know if this is

What was previous SVN revision number on that machine?
The support for 5718/5719/5720 was merged to stable/8 about 3
months ago.

> triggered by some of my actions. These routers run gre/ipsec, different
> routing stuff (quagga, bird), proxies and pf. In 2011/early 2012 I saw
> similar watchdog issues on these machines, and I disabled the tso on
> them. I don't know whether this is a coincidence or it really helps, but
> after that I didn't see these watchdog issues until today.

I'm not aware of any TSO issue on your controller. pf(4) had a TSO issue
but I guess it was fixed a long time ago.

> 
> I've also discovered that this particular server is running some old
> bioses/firmwares including the fact that it misses some NetXtreme
> updates available from IBM. Would applying such updates resolve the
> situation ?
> 

Updating Ethernet controller firmware is always a good idea. But I'm
not sure whether this addresses the issue.

> I am ok with that fact that I cannot run ipmi/sol on these machines, but
> it would be nice if this watchdog issue could be somehow resolved.

Actually this is the first report after the merge which seems to
break bge(4).

> Furthermore, I have some spare machines that I can provide full access
> to, including ipkvm stuff. Since the machine is only partially freezing,
> I cannot even rely on the ichwd and watchdogd to reboot it.

Sorry no clue yet.

> 
> pciconf (there's two controllers in this server, I use the first, but
> anyway):

Thanks for the info.

[...]

From owner-freebsd-net@FreeBSD.ORG  Thu Mar  7 05:09:00 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id BCC0B624
 for <freebsd-net@freebsd.org>; Thu,  7 Mar 2013 05:09:00 +0000 (UTC)
 (envelope-from emz@norma.perm.ru)
Received: from elf.hq.norma.perm.ru (unknown [IPv6:2001:470:1f09:14c0::2])
 by mx1.freebsd.org (Postfix) with ESMTP id 4C780F16
 for <freebsd-net@freebsd.org>; Thu,  7 Mar 2013 05:09:00 +0000 (UTC)
Received: from [192.168.248.33] ([192.168.248.33])
 by elf.hq.norma.perm.ru (8.14.5/8.14.5) with ESMTP id r2758vVt080708
 (version=TLSv1/SSLv3 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO)
 for <freebsd-net@freebsd.org>; Thu, 7 Mar 2013 11:08:57 +0600 (YEKT)
 (envelope-from emz@norma.perm.ru)
Message-ID: <513820E2.806@norma.perm.ru>
Date: Thu, 07 Mar 2013 11:08:50 +0600
From: "Eugene M. Zheganin" <emz@norma.perm.ru>
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64;
 rv:17.0) Gecko/20130215 Thunderbird/17.0.3
MIME-Version: 1.0
To: freebsd-net@freebsd.org
Subject: Re: FreeBSD 9.1-RELEASE + bge0 == watchdog timeout
References: <D557DE29-DED8-4B89-9D1C-171FC17D435E@hub.org>
 <201302241106.42477.vegeta@tuxpowered.net>
 <20130225082042.GB1426@michelle.cdnetworks.com>
 <512CF97B.8030805@norma.perm.ru>
 <20130227020123.GA3581@michelle.cdnetworks.com> <512DE968.4020409@quip.cz>
 <20130228053558.GA1474@michelle.cdnetworks.com>
 <5136D89D.4000902@norma.perm.ru>
 <20130306062658.GC1483@michelle.cdnetworks.com>
 <513713C2.1000007@norma.perm.ru>
 <20130307022446.GB3108@michelle.cdnetworks.com>
In-Reply-To: <20130307022446.GB3108@michelle.cdnetworks.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.2.7
 (elf.hq.norma.perm.ru [192.168.3.10]); Thu, 07 Mar 2013 11:08:57 +0600 (YEKT)
X-Spam-Status: No hits=-101.0 bayes=0.5 testhits ALL_TRUSTED=-1,
 USER_IN_WHITELIST=-100 autolearn=unavailable version=3.3.2
X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on elf.hq.norma.perm.ru
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 07 Mar 2013 05:09:00 -0000

Hi.

On 07.03.2013 8:24, YongHyeon PYUN wrote:
> What was previous SVN revision number on that machine?
> The support for 5718/5719/5720 was merged to stable/8 about 3
> months ago.
>
It was definitely older than "months". It was running something similar 
to  "FreeBSD 8.2-STABLE #0: Mon Sep 19 08:10:00 YEKST 2011", this is the 
uname from a neighbor machine.

I have, as I said, identical servers running FreeBSD. Here are some of 
the unames that I don't see timeouts on:

8.3-STABLE #2: Wed Aug 29 13:00:02 YEKT 2012 (up 187 days)
8.3-PRERELEASE #1: Thu Mar 29 16:14:11 MSK 2012 (up 15 days, previous 
uptime around 180 days)
8.2-STABLE #0: Wed Dec 14 16:56:11 YEKT 2011 (up 99 days)

One more question:  could it be a zfs-related issue ? Some kernel-level 
locking ? All of those run zfs also (no ufs at all).

Thanks.
Eugene.


From owner-freebsd-net@FreeBSD.ORG  Thu Mar  7 06:23:45 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id 122D6EFB
 for <freebsd-net@freebsd.org>; Thu,  7 Mar 2013 06:23:45 +0000 (UTC)
 (envelope-from pyunyh@gmail.com)
Received: from mail-pb0-f54.google.com (mail-pb0-f54.google.com
 [209.85.160.54]) by mx1.freebsd.org (Postfix) with ESMTP id AE742196
 for <freebsd-net@freebsd.org>; Thu,  7 Mar 2013 06:23:44 +0000 (UTC)
Received: by mail-pb0-f54.google.com with SMTP id rr4so120094pbb.41
 for <freebsd-net@freebsd.org>; Wed, 06 Mar 2013 22:23:44 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=x-received:from:date:to:cc:subject:message-id:reply-to:references
 :mime-version:content-type:content-disposition:in-reply-to
 :user-agent; bh=JDLNkJAP0h5DdluY+tYGUCTMWl05ObqPfXOZ945CAYM=;
 b=FtFTWEbTe6mjYy83ylJsxMS6HOYUhOxrtGfORWWcA3WoK4axgH82WABYEtOrkbYr1O
 RG1d7mYWAQVJd5DGGJrcF7fEdHdr83UtgM39jwkoV8gdU9UEXYXqocCt5a0dROI7ZZjs
 Llk9RkB5FDbNQVdOXwcUom8nHngxO7GxRC0/EPRcGEUryAWAmsXJrKZhtccfAEpdZ3hE
 LdhQF/k1pJcxc8BxCnbRs6OkBRab3kYtOq8J3e0YlPx59adUl/Z4NLCyrtk/u2FkTz17
 QcKGyHV9eQxu/oEZE/fLyXdFy0Em0d4Y5iNGaFkMYyrRuV7AI8ywOkzL/FKPtQwQE0Wa
 cu9A==
X-Received: by 10.68.116.169 with SMTP id jx9mr51243706pbb.94.1362637424182;
 Wed, 06 Mar 2013 22:23:44 -0800 (PST)
Received: from pyunyh@gmail.com (lpe4.p59-icn.cdngp.net. [114.111.62.249])
 by mx.google.com with ESMTPS id f10sm1014220paf.17.2013.03.06.22.23.40
 (version=TLSv1 cipher=RC4-SHA bits=128/128);
 Wed, 06 Mar 2013 22:23:42 -0800 (PST)
Received: by pyunyh@gmail.com (sSMTP sendmail emulation);
 Thu, 07 Mar 2013 15:23:35 +0900
From: YongHyeon PYUN <pyunyh@gmail.com>
Date: Thu, 7 Mar 2013 15:23:35 +0900
To: "Eugene M. Zheganin" <emz@norma.perm.ru>
Subject: Re: FreeBSD 9.1-RELEASE + bge0 == watchdog timeout
Message-ID: <20130307062335.GB1478@michelle.cdnetworks.com>
References: <20130225082042.GB1426@michelle.cdnetworks.com>
 <512CF97B.8030805@norma.perm.ru>
 <20130227020123.GA3581@michelle.cdnetworks.com> <512DE968.4020409@quip.cz>
 <20130228053558.GA1474@michelle.cdnetworks.com>
 <5136D89D.4000902@norma.perm.ru>
 <20130306062658.GC1483@michelle.cdnetworks.com>
 <513713C2.1000007@norma.perm.ru>
 <20130307022446.GB3108@michelle.cdnetworks.com> <513820E2.806@norma.perm.ru>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <513820E2.806@norma.perm.ru>
User-Agent: Mutt/1.4.2.3i
Cc: freebsd-net@freebsd.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
Reply-To: pyunyh@gmail.com
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 07 Mar 2013 06:23:45 -0000

On Thu, Mar 07, 2013 at 11:08:50AM +0600, Eugene M. Zheganin wrote:
> Hi.
> 
> On 07.03.2013 8:24, YongHyeon PYUN wrote:
> >What was previous SVN revision number on that machine?
> >The support for 5718/5719/5720 was merged to stable/8 about 3
> >months ago.
> >
> It was definitely older than "months". It was running something similar 
> to  "FreeBSD 8.2-STABLE #0: Mon Sep 19 08:10:00 YEKST 2011", this is the 
> uname from a neighbor machine.
> 
> I have, as I said, identical servers running FreeBSD. Here are some of 
> the unames that I don't see timeouts on:
> 
> 8.3-STABLE #2: Wed Aug 29 13:00:02 YEKT 2012 (up 187 days)
> 8.3-PRERELEASE #1: Thu Mar 29 16:14:11 MSK 2012 (up 15 days, previous 
> uptime around 180 days)

These servers do not have 5718/5719/5720 changes.

> 8.2-STABLE #0: Wed Dec 14 16:56:11 YEKT 2011 (up 99 days)

This server has the bge(4) change but it didn't trigger watchdog
timeouts.  Does this server use the same controller? If yes, the
issue didn't come from bge(4) change.
> 
> One more question:  could it be a zfs-related issue ? Some kernel-level 
> locking ? All of those run zfs also (no ufs at all).

Sorry I have no idea on ZFS.

From owner-freebsd-net@FreeBSD.ORG  Thu Mar  7 06:24:57 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id 311C3F94
 for <freebsd-net@freebsd.org>; Thu,  7 Mar 2013 06:24:57 +0000 (UTC)
 (envelope-from zeus@ibs.dn.ua)
Received: from relay.ibs.dn.ua (relay.ibs.dn.ua [91.216.196.25])
 by mx1.freebsd.org (Postfix) with ESMTP id 7ED5F1A6
 for <freebsd-net@freebsd.org>; Thu,  7 Mar 2013 06:24:56 +0000 (UTC)
Received: from ibs.dn.ua (relay.ibs.dn.ua [91.216.196.25]) 
 by relay.ibs.dn.ua with ESMTP id r276MpqC085463
 for <freebsd-net@freebsd.org>; Thu, 7 Mar 2013 08:22:52 +0200 (EET)
Message-ID: <20130307082251.85461@relay.ibs.dn.ua>
Date: Thu, 07 Mar 2013 08:22:51 +0300
From: Zeus Panchenko <zeus@ibs.dn.ua>
To: <freebsd-net@freebsd.org>
Subject: Re: FreeBSD 9.1-RELEASE + bge0 == watchdog timeout
In-reply-to: Your message of Thu, 7 Mar 2013 11:24:46 +0900
 <20130307022446.GB3108@michelle.cdnetworks.com>
References: <D557DE29-DED8-4B89-9D1C-171FC17D435E@hub.org>
 <201302241106.42477.vegeta@tuxpowered.net>
 <20130225082042.GB1426@michelle.cdnetworks.com>
 <512CF97B.8030805@norma.perm.ru>
 <20130227020123.GA3581@michelle.cdnetworks.com> <512DE968.4020409@quip.cz>
 <20130228053558.GA1474@michelle.cdnetworks.com>
 <5136D89D.4000902@norma.perm.ru>
 <20130306062658.GC1483@michelle.cdnetworks.com>
 <513713C2.1000007@norma.perm.ru>
 <20130307022446.GB3108@michelle.cdnetworks.com>
Organization: I.B.S. LLC
X-Mailer: MH-E 8.3.1; GNU Mailutils 2.99.97; GNU Emacs 24.0.93
X-Face: &sReWXo3Iwtqql1[My(t1Gkx;
 y?KF@KF`4X+'9Cs@PtK^y%}^.>Mtbpyz6U=,Op:KPOT.uG
 )Nvx`=er!l?WASh7KeaGhga"1[&yz$_7ir'cVp7o%CGbJ/V)j/=]vzvvcqcZkf; JDurQG6wTg+?/xA
 go`}1.Ze//K; Fk&/&OoHd'[b7iGt2UO>o(YskCT[_D)kh4!yY'<&:yt+zM=A`@`~9U+P[qS:f;
 #9z~ Or/Bo#N-'S'!'[3Wog'ADkyMqmGDvga?WW)qd=?)`Y&k=o}>!ST\
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
Reply-To: Zeus Panchenko <zeus@ibs.dn.ua>
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 07 Mar 2013 06:24:57 -0000

Hi,

here is my situation, which looks much like this issue

On 06.03.2013 12:26, YongHyeon PYUN wrote:
> If you were using latest stable/8, the result would be same on
> CURRENT.

I use FreeBSD 9.1-RELEASE #0 r243825: amd64 + ZFS
on HP ProLiant DL360e Gen8

the box has two four-port cards, igb(4) I350 and bge(4) NetXtreme BCM5719,
according to the pciconf data

> How frequently do you see the watchdog timeouts? Is there way to
> reproduce it?

I noticed that after activation, bge(4) stops responding and the interface
becomes useless, while igb(4) works fine after some sysctl tuning

for now I'm forced not to use bge(4) at all :(

> Would you show me the output of dmesg (bge(4) and brgphy(4) only)
> and "pciconf -lcbv"?

> grep "bge\|brgphy" dmesg.boot
bge0: <Broadcom unknown BCM5719, ASIC rev. 0x5719001> mem 0xfa3f0000-0xfa3fffff,0xfa3e0000-0xfa3effff,0xfa3d0000-0xfa3dffff irq 40 at device 0.0 on pci6
bge0: CHIP ID 0x05719001; ASIC REV 0x5719; CHIP REV 0x57190; PCI-E
miibus0: <MII bus> on bge0
bge0: Ethernet address: ac:16:2d:83:ec:2c
bge1: <Broadcom unknown BCM5719, ASIC rev. 0x5719001> mem 0xfa3c0000-0xfa3cffff,0xfa3b0000-0xfa3bffff,0xfa3a0000-0xfa3affff irq 44 at device 0.1 on pci6
bge1: CHIP ID 0x05719001; ASIC REV 0x5719; CHIP REV 0x57190; PCI-E
miibus1: <MII bus> on bge1
bge1: Ethernet address: ac:16:2d:83:ec:2d
bge2: <Broadcom unknown BCM5719, ASIC rev. 0x5719001> mem 0xfa390000-0xfa39ffff,0xfa380000-0xfa38ffff,0xfa370000-0xfa37ffff irq 40 at device 0.2 on pci6
bge2: CHIP ID 0x05719001; ASIC REV 0x5719; CHIP REV 0x57190; PCI-E
miibus2: <MII bus> on bge2
bge2: Ethernet address: ac:16:2d:83:ec:2e
bge3: <Broadcom unknown BCM5719, ASIC rev. 0x5719001> mem 0xfa360000-0xfa36ffff,0xfa350000-0xfa35ffff,0xfa340000-0xfa34ffff irq 44 at device 0.3 on pci6
bge3: CHIP ID 0x05719001; ASIC REV 0x5719; CHIP REV 0x57190; PCI-E
miibus3: <MII bus> on bge3
bge3: Ethernet address: ac:16:2d:83:ec:2f

brgphy0: <BCM5719C 1000BASE-T media interface> PHY 1 on miibus0
brgphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow
brgphy1: <BCM5719C 1000BASE-T media interface> PHY 2 on miibus1
brgphy1:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow
brgphy2: <BCM5719C 1000BASE-T media interface> PHY 3 on miibus2
brgphy2:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow
brgphy3: <BCM5719C 1000BASE-T media interface> PHY 4 on miibus3
brgphy3:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow


> pciconf -lcbv
hostb0@pci0:0:0:0:	class=0x060000 card=0x18a8103c chip=0x3c008086 rev=0x07 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'Sandy Bridge DMI2'
    class      = bridge
    subclass   = HOST-PCI
    cap 10[90] = PCI-Express 2 root port max data 128(128) link x0(x4)
    cap 01[e0] = powerspec 3  supports D0 D3  current D0
ecap 000b[100] = unknown 1
ecap 000b[144] = unknown 1
ecap 000b[1d0] = unknown 1
ecap 000b[280] = unknown 1
pcib1@pci0:0:1:0:	class=0x060400 card=0x18a8103c chip=0x3c028086 rev=0x07 hdr=0x01
    vendor     = 'Intel Corporation'
    device     = 'Sandy Bridge IIO PCI Express Root Port 1a'
    class      = bridge
    subclass   = PCI-PCI
    cap 0d[40] = PCI Bridge card=0x18a8103c
    cap 05[60] = MSI supports 2 messages, vector masks 
    cap 10[90] = PCI-Express 2 root port max data 256(256) link x0(x4)
    cap 01[e0] = powerspec 3  supports D0 D3  current D0
ecap 000b[100] = unknown 1
ecap 000d[110] = unknown 1
ecap 0001[148] = AER 1 0 fatal 0 non-fatal 0 corrected
ecap 000b[1d0] = unknown 1
ecap 0019[250] = unknown 1
ecap 000b[280] = unknown 1
pcib2@pci0:0:1:1:	class=0x060400 card=0x18a8103c chip=0x3c038086 rev=0x07 hdr=0x01
    vendor     = 'Intel Corporation'
    device     = 'Sandy Bridge IIO PCI Express Root Port 1b'
    class      = bridge
    subclass   = PCI-PCI
    cap 0d[40] = PCI Bridge card=0x18a8103c
    cap 05[60] = MSI supports 2 messages, vector masks 
    cap 10[90] = PCI-Express 2 root port max data 256(256) link x0(x4)
    cap 01[e0] = powerspec 3  supports D0 D3  current D0
ecap 000b[100] = unknown 1
ecap 000d[110] = unknown 1
ecap 0001[148] = AER 1 0 fatal 0 non-fatal 0 corrected
ecap 000b[1d0] = unknown 1
ecap 0019[250] = unknown 1
ecap 000b[280] = unknown 1
pcib3@pci0:0:3:0:	class=0x060400 card=0x18a8103c chip=0x3c088086 rev=0x07 hdr=0x01
    vendor     = 'Intel Corporation'
    device     = 'Sandy Bridge IIO PCI Express Root Port 3a in PCI Express Mode'
    class      = bridge
    subclass   = PCI-PCI
    cap 0d[40] = PCI Bridge card=0x18a8103c
    cap 05[60] = MSI supports 2 messages, vector masks 
    cap 10[90] = PCI-Express 2 root port max data 256(256) link x4(x16)
    cap 01[e0] = powerspec 3  supports D0 D3  current D0
ecap 000b[100] = unknown 1
ecap 000d[110] = unknown 1
ecap 0001[148] = AER 1 0 fatal 0 non-fatal 0 corrected
ecap 000b[1d0] = unknown 1
ecap 0019[250] = unknown 1
ecap 000b[280] = unknown 1
pcib4@pci0:0:3:1:	class=0x060400 card=0x18a8103c chip=0x3c098086 rev=0x07 hdr=0x01
    vendor     = 'Intel Corporation'
    device     = 'Sandy Bridge IIO PCI Express Root Port 3b'
    class      = bridge
    subclass   = PCI-PCI
    cap 0d[40] = PCI Bridge card=0x18a8103c
    cap 05[60] = MSI supports 2 messages, vector masks 
    cap 10[90] = PCI-Express 2 root port max data 256(256) link x0(x4)
    cap 01[e0] = powerspec 3  supports D0 D3  current D0
ecap 000b[100] = unknown 1
ecap 000d[110] = unknown 1
ecap 0001[148] = AER 1 0 fatal 0 non-fatal 0 corrected
ecap 000b[1d0] = unknown 1
ecap 0019[250] = unknown 1
ecap 000b[280] = unknown 1
pcib5@pci0:0:3:2:	class=0x060400 card=0x18a8103c chip=0x3c0a8086 rev=0x07 hdr=0x01
    vendor     = 'Intel Corporation'
    device     = 'Sandy Bridge IIO PCI Express Root Port 3c'
    class      = bridge
    subclass   = PCI-PCI
    cap 0d[40] = PCI Bridge card=0x18a8103c
    cap 05[60] = MSI supports 2 messages, vector masks 
    cap 10[90] = PCI-Express 2 root port max data 256(256) link x0(x4)
    cap 01[e0] = powerspec 3  supports D0 D3  current D0
ecap 000b[100] = unknown 1
ecap 000d[110] = unknown 1
ecap 0001[148] = AER 1 0 fatal 0 non-fatal 0 corrected
ecap 000b[1d0] = unknown 1
ecap 0019[250] = unknown 1
ecap 000b[280] = unknown 1
pcib6@pci0:0:3:3:	class=0x060400 card=0x18a8103c chip=0x3c0b8086 rev=0x07 hdr=0x01
    vendor     = 'Intel Corporation'
    device     = 'Sandy Bridge IIO PCI Express Root Port 3d'
    class      = bridge
    subclass   = PCI-PCI
    cap 0d[40] = PCI Bridge card=0x18a8103c
    cap 05[60] = MSI supports 2 messages, vector masks 
    cap 10[90] = PCI-Express 2 root port max data 256(256) link x0(x4)
    cap 01[e0] = powerspec 3  supports D0 D3  current D0
ecap 000b[100] = unknown 1
ecap 000d[110] = unknown 1
ecap 0001[148] = AER 1 0 fatal 0 non-fatal 0 corrected
ecap 000b[1d0] = unknown 1
ecap 0019[250] = unknown 1
ecap 000b[280] = unknown 1
none0@pci0:0:4:0:	class=0x088000 card=0x18a8103c chip=0x3c208086 rev=0x07 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'Sandy Bridge DMA Channel 0'
    class      = base peripheral
    bar   [10] = type Memory, range 64, base 0xfa4f0000, size 16384, enabled
    cap 11[80] = MSI-X supports 1 message in map 0x10
    cap 10[90] = PCI-Express 2 root endpoint max data 128(128) link x0(x0)
    cap 01[e0] = powerspec 3  supports D0 D3  current D0
none1@pci0:0:4:1:	class=0x088000 card=0x18a8103c chip=0x3c218086 rev=0x07 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'Sandy Bridge DMA Channel 1'
    class      = base peripheral
    bar   [10] = type Memory, range 64, base 0xfa4e0000, size 16384, enabled
    cap 11[80] = MSI-X supports 1 message in map 0x10
    cap 10[90] = PCI-Express 2 root endpoint max data 128(128) link x0(x0)
    cap 01[e0] = powerspec 3  supports D0 D3  current D0
none2@pci0:0:4:2:	class=0x088000 card=0x18a8103c chip=0x3c228086 rev=0x07 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'Sandy Bridge DMA Channel 2'
    class      = base peripheral
    bar   [10] = type Memory, range 64, base 0xfa4d0000, size 16384, enabled
    cap 11[80] = MSI-X supports 1 message in map 0x10
    cap 10[90] = PCI-Express 2 root endpoint max data 128(128) link x0(x0)
    cap 01[e0] = powerspec 3  supports D0 D3  current D0
none3@pci0:0:4:3:	class=0x088000 card=0x18a8103c chip=0x3c238086 rev=0x07 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'Sandy Bridge DMA Channel 3'
    class      = base peripheral
    bar   [10] = type Memory, range 64, base 0xfa4c0000, size 16384, enabled
    cap 11[80] = MSI-X supports 1 message in map 0x10
    cap 10[90] = PCI-Express 2 root endpoint max data 128(128) link x0(x0)
    cap 01[e0] = powerspec 3  supports D0 D3  current D0
none4@pci0:0:4:4:	class=0x088000 card=0x18a8103c chip=0x3c248086 rev=0x07 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'Sandy Bridge DMA Channel 4'
    class      = base peripheral
    bar   [10] = type Memory, range 64, base 0xfa4b0000, size 16384, enabled
    cap 11[80] = MSI-X supports 1 message in map 0x10
    cap 10[90] = PCI-Express 2 root endpoint max data 128(128) link x0(x0)
    cap 01[e0] = powerspec 3  supports D0 D3  current D0
none5@pci0:0:4:5:	class=0x088000 card=0x18a8103c chip=0x3c258086 rev=0x07 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'Sandy Bridge DMA Channel 5'
    class      = base peripheral
    bar   [10] = type Memory, range 64, base 0xfa4a0000, size 16384, enabled
    cap 11[80] = MSI-X supports 1 message in map 0x10
    cap 10[90] = PCI-Express 2 root endpoint max data 128(128) link x0(x0)
    cap 01[e0] = powerspec 3  supports D0 D3  current D0
none6@pci0:0:4:6:	class=0x088000 card=0x18a8103c chip=0x3c268086 rev=0x07 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'Sandy Bridge DMA Channel 6'
    class      = base peripheral
    bar   [10] = type Memory, range 64, base 0xfa490000, size 16384, enabled
    cap 11[80] = MSI-X supports 1 message in map 0x10
    cap 10[90] = PCI-Express 2 root endpoint max data 128(128) link x0(x0)
    cap 01[e0] = powerspec 3  supports D0 D3  current D0
none7@pci0:0:4:7:	class=0x088000 card=0x18a8103c chip=0x3c278086 rev=0x07 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'Sandy Bridge DMA Channel 7'
    class      = base peripheral
    bar   [10] = type Memory, range 64, base 0xfa480000, size 16384, enabled
    cap 11[80] = MSI-X supports 1 message in map 0x10
    cap 10[90] = PCI-Express 2 root endpoint max data 128(128) link x0(x0)
    cap 01[e0] = powerspec 3  supports D0 D3  current D0
none8@pci0:0:5:0:	class=0x088000 card=0x18a8103c chip=0x3c288086 rev=0x07 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'Sandy Bridge Address Map, VTd_Misc, System Management'
    class      = base peripheral
    cap 10[40] = PCI-Express 2 root endpoint max data 128(128) link x0(x0)
none9@pci0:0:5:2:	class=0x088000 card=0x18a8103c chip=0x3c2a8086 rev=0x07 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'Sandy Bridge Control Status and Global Errors'
    class      = base peripheral
    cap 10[40] = PCI-Express 2 root endpoint max data 128(128) link x0(x0)
ioapic0@pci0:0:5:4:	class=0x080020 card=0x18a8103c chip=0x3c2c8086 rev=0x07 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'Sandy Bridge I/O APIC'
    class      = base peripheral
    subclass   = interrupt controller
    bar   [10] = type Memory, range 32, base 0xfa470000, size 4096, enabled
    cap 01[6c] = powerspec 3  supports D0 D3  current D0
pcib7@pci0:0:17:0:	class=0x060400 card=0x18a9103c chip=0x1d3e8086 rev=0x05 hdr=0x01
    vendor     = 'Intel Corporation'
    device     = 'Patsburg PCI Express Virtual Root Port'
    class      = bridge
    subclass   = PCI-PCI
    cap 10[40] = PCI-Express 2 root port max data 128(128) link x1(x1)
    cap 01[80] = powerspec 3  supports D0 D3  current D0
    cap 0d[88] = PCI Bridge card=0x18a9103c
    cap 05[90] = MSI supports 1 message 
ecap 0001[100] = AER 1 0 fatal 0 non-fatal 0 corrected
ecap 000d[138] = unknown 1
ehci0@pci0:0:26:0:	class=0x0c0320 card=0x18a9103c chip=0x1d2d8086 rev=0x05 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'Patsburg USB2 Enhanced Host Controller'
    class      = serial bus
    subclass   = USB
    bar   [10] = type Memory, range 32, base 0xfa460000, size 1024, enabled
    cap 01[50] = powerspec 2  supports D0 D3  current D0
    cap 0a[58] = EHCI Debug Port at offset 0xa0 in map 0x14
    cap 13[98] = PCI Advanced Features: FLR TP
pcib8@pci0:0:28:0:	class=0x060400 card=0x18a9103c chip=0x1d108086 rev=0xb5 hdr=0x01
    vendor     = 'Intel Corporation'
    device     = 'Patsburg PCI Express Root Port 1'
    class      = bridge
    subclass   = PCI-PCI
    cap 10[40] = PCI-Express 2 root port max data 128(128) link x0(x4)
    cap 05[80] = MSI supports 1 message 
    cap 0d[90] = PCI Bridge card=0x18a9103c
    cap 01[a0] = powerspec 2  supports D0 D3  current D0
ecap 0001[100] = AER 1 0 fatal 0 non-fatal 0 corrected
pcib9@pci0:0:28:4:	class=0x060400 card=0x18a9103c chip=0x1d188086 rev=0xb5 hdr=0x01
    vendor     = 'Intel Corporation'
    device     = 'Patsburg PCI Express Root Port 5'
    class      = bridge
    subclass   = PCI-PCI
    cap 10[40] = PCI-Express 2 root port max data 128(128) link x2(x2)
    cap 05[80] = MSI supports 1 message 
    cap 0d[90] = PCI Bridge card=0x18a9103c
    cap 01[a0] = powerspec 2  supports D0 D3  current D0
ecap 0001[100] = AER 1 0 fatal 0 non-fatal 0 corrected
pcib10@pci0:0:28:7:	class=0x060400 card=0x18a9103c chip=0x1d1e8086 rev=0xb5 hdr=0x01
    vendor     = 'Intel Corporation'
    device     = 'Patsburg PCI Express Root Port 8'
    class      = bridge
    subclass   = PCI-PCI
    cap 10[40] = PCI-Express 2 root port max data 128(128) link x1(x1)
    cap 05[80] = MSI supports 1 message 
    cap 0d[90] = PCI Bridge card=0x18a9103c
    cap 01[a0] = powerspec 2  supports D0 D3  current D0
ecap 0001[100] = AER 1 0 fatal 0 non-fatal 0 corrected
ehci1@pci0:0:29:0:	class=0x0c0320 card=0x18a9103c chip=0x1d268086 rev=0x05 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'Patsburg USB2 Enhanced Host Controller'
    class      = serial bus
    subclass   = USB
    bar   [10] = type Memory, range 32, base 0xfa450000, size 1024, enabled
    cap 01[50] = powerspec 2  supports D0 D3  current D0
    cap 0a[58] = EHCI Debug Port at offset 0xa0 in map 0x14
    cap 13[98] = PCI Advanced Features: FLR TP
pcib11@pci0:0:30:0:	class=0x060401 card=0x18a9103c chip=0x244e8086 rev=0xa5 hdr=0x01
    vendor     = 'Intel Corporation'
    device     = '82801 PCI Bridge'
    class      = bridge
    subclass   = PCI-PCI
    cap 0d[50] = PCI Bridge card=0x18a9103c
isab0@pci0:0:31:0:	class=0x060100 card=0x00000000 chip=0x1d418086 rev=0x05 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'Patsburg LPC Controller'
    class      = bridge
    subclass   = PCI-ISA
    cap 09[e0] = vendor (length 12) Intel cap 1 version 0
		 features: AMT, 4 PCI-e x1 slots
ahci0@pci0:0:31:2:	class=0x010601 card=0x18a9103c chip=0x1d028086 rev=0x05 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'Patsburg 6-Port SATA AHCI Controller'
    class      = mass storage
    subclass   = SATA
    bar   [10] = type I/O Port, range 32, base 0x4000, size  8, enabled
    bar   [14] = type I/O Port, range 32, base 0x4008, size  4, enabled
    bar   [18] = type I/O Port, range 32, base 0x4010, size  8, enabled
    bar   [1c] = type I/O Port, range 32, base 0x4018, size  4, enabled
    bar   [20] = type I/O Port, range 32, base 0x4020, size 32, enabled
    bar   [24] = type Memory, range 32, base 0xfa440000, size 2048, enabled
    cap 05[80] = MSI supports 1 message enabled with 1 message
    cap 01[70] = powerspec 3  supports D0 D3  current D0
    cap 12[a8] = SATA Index-Data Pair
    cap 13[b0] = PCI Advanced Features: FLR TP
bge0@pci0:6:0:0:	class=0x020000 card=0x3383103c chip=0x165714e4 rev=0x01 hdr=0x00
    vendor     = 'Broadcom Corporation'
    device     = 'NetXtreme BCM5719 Gigabit Ethernet PCIe'
    class      = network
    subclass   = ethernet
    bar   [10] = type Prefetchable Memory, range 64, base 0xfa3f0000, size 65536, enabled
    bar   [18] = type Prefetchable Memory, range 64, base 0xfa3e0000, size 65536, enabled
    bar   [20] = type Prefetchable Memory, range 64, base 0xfa3d0000, size 65536, enabled
    cap 01[48] = powerspec 3  supports D0 D3  current D0
    cap 03[50] = VPD
    cap 05[58] = MSI supports 8 messages, 64 bit enabled with 1 message
    cap 11[a0] = MSI-X supports 17 messages in map 0x20
    cap 10[ac] = PCI-Express 2 endpoint max data 256(256) link x4(x4)
ecap 0001[100] = AER 1 0 fatal 1 non-fatal 1 corrected
ecap 0003[13c] = Serial 1 0000ac162d83ec2c
ecap 0004[150] = unknown 1
ecap 0002[160] = VC 1 max VC0
ecap 0017[230] = unknown 1
bge1@pci0:6:0:1:	class=0x020000 card=0x3383103c chip=0x165714e4 rev=0x01 hdr=0x00
    vendor     = 'Broadcom Corporation'
    device     = 'NetXtreme BCM5719 Gigabit Ethernet PCIe'
    class      = network
    subclass   = ethernet
    bar   [10] = type Prefetchable Memory, range 64, base 0xfa3c0000, size 65536, enabled
    bar   [18] = type Prefetchable Memory, range 64, base 0xfa3b0000, size 65536, enabled
    bar   [20] = type Prefetchable Memory, range 64, base 0xfa3a0000, size 65536, enabled
    cap 01[48] = powerspec 3  supports D0 D3  current D0
    cap 03[50] = VPD
    cap 05[58] = MSI supports 8 messages, 64 bit enabled with 1 message
    cap 11[a0] = MSI-X supports 17 messages in map 0x20
    cap 10[ac] = PCI-Express 2 endpoint max data 256(256) link x4(x4)
ecap 0001[100] = AER 1 0 fatal 1 non-fatal 1 corrected
ecap 0003[13c] = Serial 1 0000ac162d83ec2d
ecap 0004[150] = unknown 1
ecap 0002[160] = VC 1 max VC0
ecap 0017[230] = unknown 1
bge2@pci0:6:0:2:	class=0x020000 card=0x3383103c chip=0x165714e4 rev=0x01 hdr=0x00
    vendor     = 'Broadcom Corporation'
    device     = 'NetXtreme BCM5719 Gigabit Ethernet PCIe'
    class      = network
    subclass   = ethernet
    bar   [10] = type Prefetchable Memory, range 64, base 0xfa390000, size 65536, enabled
    bar   [18] = type Prefetchable Memory, range 64, base 0xfa380000, size 65536, enabled
    bar   [20] = type Prefetchable Memory, range 64, base 0xfa370000, size 65536, enabled
    cap 01[48] = powerspec 3  supports D0 D3  current D0
    cap 03[50] = VPD
    cap 05[58] = MSI supports 8 messages, 64 bit enabled with 1 message
    cap 11[a0] = MSI-X supports 17 messages in map 0x20
    cap 10[ac] = PCI-Express 2 endpoint max data 256(256) link x4(x4)
ecap 0001[100] = AER 1 0 fatal 1 non-fatal 1 corrected
ecap 0003[13c] = Serial 1 0000ac162d83ec2e
ecap 0004[150] = unknown 1
ecap 0002[160] = VC 1 max VC0
ecap 0017[230] = unknown 1
bge3@pci0:6:0:3:	class=0x020000 card=0x3383103c chip=0x165714e4 rev=0x01 hdr=0x00
    vendor     = 'Broadcom Corporation'
    device     = 'NetXtreme BCM5719 Gigabit Ethernet PCIe'
    class      = network
    subclass   = ethernet
    bar   [10] = type Prefetchable Memory, range 64, base 0xfa360000, size 65536, enabled
    bar   [18] = type Prefetchable Memory, range 64, base 0xfa350000, size 65536, enabled
    bar   [20] = type Prefetchable Memory, range 64, base 0xfa340000, size 65536, enabled
    cap 01[48] = powerspec 3  supports D0 D3  current D0
    cap 03[50] = VPD
    cap 05[58] = MSI supports 8 messages, 64 bit enabled with 1 message
    cap 11[a0] = MSI-X supports 17 messages in map 0x20
    cap 10[ac] = PCI-Express 2 endpoint max data 256(256) link x4(x4)
ecap 0001[100] = AER 1 0 fatal 1 non-fatal 1 corrected
ecap 0003[13c] = Serial 1 0000ac162d83ec2f
ecap 0004[150] = unknown 1
ecap 0002[160] = VC 1 max VC0
ecap 0017[230] = unknown 1
igb0@pci0:2:0:0:	class=0x020000 card=0x3380103c chip=0x15218086 rev=0x01 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'I350 Gigabit Network Connection'
    class      = network
    subclass   = ethernet
    bar   [10] = type Memory, range 32, base 0xfbf00000, size 1048576, enabled
    bar   [18] = type I/O Port, range 32, base 0x5000, size 32, enabled
    bar   [1c] = type Memory, range 32, base 0xfbef0000, size 16384, enabled
    cap 01[40] = powerspec 3  supports D0 D3  current D0
    cap 05[50] = MSI supports 1 message, 64 bit, vector masks 
    cap 11[70] = MSI-X supports 10 messages in map 0x1c enabled
    cap 10[a0] = PCI-Express 2 endpoint max data 128(512) link x2(x4)
ecap 0001[100] = AER 2
ecap 0003[140] = Serial 1 6c3be5ffffb2dba0
ecap 000e[150] = unknown 1
ecap 0010[160] = unknown 1
ecap 0017[1a0] = unknown 1
ecap 0018[1c0] = unknown 1
ecap 000d[1d0] = unknown 1
igb1@pci0:2:0:1:	class=0x020000 card=0x3380103c chip=0x15218086 rev=0x01 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'I350 Gigabit Network Connection'
    class      = network
    subclass   = ethernet
    bar   [10] = type Memory, range 32, base 0xfbd00000, size 1048576, enabled
    bar   [18] = type I/O Port, range 32, base 0x5020, size 32, enabled
    bar   [1c] = type Memory, range 32, base 0xfbcf0000, size 16384, enabled
    cap 01[40] = powerspec 3  supports D0 D3  current D0
    cap 05[50] = MSI supports 1 message, 64 bit, vector masks 
    cap 11[70] = MSI-X supports 10 messages in map 0x1c enabled
    cap 10[a0] = PCI-Express 2 endpoint max data 128(512) link x2(x4)
ecap 0001[100] = AER 2
ecap 0003[140] = Serial 1 6c3be5ffffb2dba0
ecap 000e[150] = unknown 1
ecap 0010[160] = unknown 1
ecap 0017[1a0] = unknown 1
ecap 000d[1d0] = unknown 1
igb2@pci0:2:0:2:	class=0x020000 card=0x3380103c chip=0x15218086 rev=0x01 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'I350 Gigabit Network Connection'
    class      = network
    subclass   = ethernet
    bar   [10] = type Memory, range 32, base 0xfbb00000, size 1048576, enabled
    bar   [18] = type I/O Port, range 32, base 0x5040, size 32, enabled
    bar   [1c] = type Memory, range 32, base 0xfbaf0000, size 16384, enabled
    cap 01[40] = powerspec 3  supports D0 D3  current D0
    cap 05[50] = MSI supports 1 message, 64 bit, vector masks 
    cap 11[70] = MSI-X supports 10 messages in map 0x1c enabled
    cap 10[a0] = PCI-Express 2 endpoint max data 128(512) link x2(x4)
ecap 0001[100] = AER 2
ecap 0003[140] = Serial 1 6c3be5ffffb2dba0
ecap 000e[150] = unknown 1
ecap 0010[160] = unknown 1
ecap 0017[1a0] = unknown 1
ecap 000d[1d0] = unknown 1
igb3@pci0:2:0:3:	class=0x020000 card=0x3380103c chip=0x15218086 rev=0x01 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'I350 Gigabit Network Connection'
    class      = network
    subclass   = ethernet
    bar   [10] = type Memory, range 32, base 0xfb900000, size 1048576, enabled
    bar   [18] = type I/O Port, range 32, base 0x5060, size 32, enabled
    bar   [1c] = type Memory, range 32, base 0xfb8f0000, size 16384, enabled
    cap 01[40] = powerspec 3  supports D0 D3  current D0
    cap 05[50] = MSI supports 1 message, 64 bit, vector masks 
    cap 11[70] = MSI-X supports 10 messages in map 0x1c enabled
    cap 10[a0] = PCI-Express 2 endpoint max data 128(512) link x2(x4)
ecap 0001[100] = AER 2
ecap 0003[140] = Serial 1 6c3be5ffffb2dba0
ecap 000e[150] = unknown 1
ecap 0010[160] = unknown 1
ecap 0017[1a0] = unknown 1
ecap 000d[1d0] = unknown 1
none10@pci0:1:0:0:	class=0x088000 card=0x3381103c chip=0x3306103c rev=0x05 hdr=0x00
    vendor     = 'Hewlett-Packard Company'
    device     = 'Integrated Lights-Out Standard Slave Instrumentation & System Support'
    class      = base peripheral
    bar   [10] = type I/O Port, range 32, base 0x3000, size 256, enabled
    bar   [14] = type Memory, range 32, base 0xfb7f0000, size 512, enabled
    bar   [18] = type I/O Port, range 32, base 0x3400, size 256, enabled
    cap 01[78] = powerspec 3  supports D0 D3  current D0
    cap 05[b0] = MSI supports 1 message, 64 bit 
    cap 10[c0] = PCI-Express 1 legacy endpoint max data 128(128) link x1(x1)
vgapci0@pci0:1:0:1:	class=0x030000 card=0x3381103c chip=0x0533102b rev=0x00 hdr=0x00
    vendor     = 'Matrox Graphics, Inc.'
    device     = 'MGA G200EH'
    class      = display
    subclass   = VGA
    bar   [10] = type Prefetchable Memory, range 32, base 0xf9000000, size 16777216, enabled
    bar   [14] = type Memory, range 32, base 0xfb7e0000, size 16384, enabled
    bar   [18] = type Memory, range 32, base 0xfa800000, size 8388608, enabled
    cap 01[a8] = powerspec 3  supports D0 D3  current D0
    cap 05[b0] = MSI supports 1 message, 64 bit 
    cap 10[c0] = PCI-Express 1 legacy endpoint max data 128(128) link x1(x1)
none11@pci0:1:0:2:	class=0x088000 card=0x3381103c chip=0x3307103c rev=0x05 hdr=0x00
    vendor     = 'Hewlett-Packard Company'
    device     = 'Integrated Lights-Out Standard Management Processor Support and Messaging'
    class      = base peripheral
    bar   [10] = type I/O Port, range 32, base 0x3800, size 256, enabled
    bar   [14] = type Memory, range 32, base 0xfa7f0000, size 256, enabled
    bar   [18] = type Memory, range 32, base 0xfa600000, size 1048576, enabled
    bar   [1c] = type Memory, range 32, base 0xfa580000, size 524288, enabled
    bar   [20] = type Memory, range 32, base 0xfa570000, size 32768, enabled
    bar   [24] = type Memory, range 32, base 0xfa560000, size 32768, enabled
    cap 01[78] = powerspec 3  supports D0 D3  current D0
    cap 05[b0] = MSI supports 1 message, 64 bit 
    cap 10[c0] = PCI-Express 1 legacy endpoint max data 128(128) link x1(x1)
uhci0@pci0:1:0:4:	class=0x0c0300 card=0x3381103c chip=0x3300103c rev=0x02 hdr=0x00
    vendor     = 'Hewlett-Packard Company'
    device     = 'Integrated Lights-Out Standard Virtual USB Controller'
    class      = serial bus
    subclass   = USB
    bar   [20] = type I/O Port, range 32, base 0x3c00, size 32, enabled
    cap 05[70] = MSI supports 1 message, 64 bit 
    cap 10[80] = PCI-Express 1 legacy endpoint max data 128(128) link x1(x1)
    cap 01[f0] = powerspec 3  supports D0 D3  current D0


-- 
Zeus V. Panchenko				jid:zeus@im.ibs.dn.ua
IT Dpt., I.B.S. LLC					  GMT+2 (EET)

From owner-freebsd-net@FreeBSD.ORG  Thu Mar  7 06:34:37 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id 60890175;
 Thu,  7 Mar 2013 06:34:37 +0000 (UTC)
 (envelope-from melifaro@FreeBSD.org)
Received: from mail.ipfw.ru (unknown [IPv6:2a01:4f8:120:6141::2])
 by mx1.freebsd.org (Postfix) with ESMTP id 0522C1E9;
 Thu,  7 Mar 2013 06:34:37 +0000 (UTC)
Received: from v6.mpls.in ([2a02:978:2::5] helo=ws.su29.net)
 by mail.ipfw.ru with esmtpsa (TLSv1:CAMELLIA256-SHA:256)
 (Exim 4.76 (FreeBSD)) (envelope-from <melifaro@FreeBSD.org>)
 id 1UDUSq-0008ZT-Pk; Thu, 07 Mar 2013 10:38:04 +0400
Message-ID: <513834E4.7050203@FreeBSD.org>
Date: Thu, 07 Mar 2013 10:34:12 +0400
From: "Alexander V. Chernikov" <melifaro@FreeBSD.org>
User-Agent: Mozilla/5.0 (X11; FreeBSD amd64;
 rv:9.0) Gecko/20120121 Thunderbird/9.0
MIME-Version: 1.0
To: net@freebsd.org
Subject: [patch] interface routes
Content-Type: multipart/mixed; boundary="------------070403050505050004040202"
Cc: Andre Oppermann <andre@freebsd.org>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 07 Mar 2013 06:34:37 -0000

This is a multi-part message in MIME format.
--------------070403050505050004040202
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit

Hello list!

There is a known long-lived issue with interface routes addition/deletion:

ifconfig iface inet 1.2.3.4/24 can fail if the given prefix is already in
the kernel routing table (for example, advertised by an IGP like OSPF).

An interface route can be deleted via route(8) or by any route socket user
(this sometimes happens with popular open-source daemons like bird/quagga).

The problem is reported at least in kern/106722 and kern/155772.

This can be fixed in the following way:
The immutable route flag (RTF_PINNED, added in 19995 with a 'for future use'
comment) is used to mark a route as immutable.
rtrequest1_fib refuses to delete routes carrying this flag unless
RTF_PINNED is also set in rti_flags.
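
To illustrate the delete rule, here is a minimal userspace sketch. It is not
kernel code: struct toy_route, toy_route_delete() and toy_demo() are
hypothetical names made up for this example, and only the RTF_PINNED value
and the EPERM result are taken from the attached diff. It assumes a route is
reduced to a flags word and a "present" marker.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define RTF_PINNED 0x100000	/* same value as in sys/net/route.h */

/* Hypothetical, simplified stand-in for struct rtentry. */
struct toy_route {
	int flags;	/* RTF_* flags stored on the route */
	int present;	/* non-zero while the route is in the table */
};

/*
 * Model of the proposed check: deleting a pinned route fails with EPERM
 * unless the caller passes RTF_PINNED in its own request flags.
 */
int
toy_route_delete(struct toy_route *rt, int req_flags)
{
	if (rt == NULL || !rt->present)
		return (ESRCH);
	if ((req_flags & RTF_PINNED) == 0 && (rt->flags & RTF_PINNED) != 0)
		return (EPERM);
	rt->present = 0;
	return (0);
}

/* Exercises both kinds of caller: a route socket user and interface code. */
void
toy_demo(void)
{
	struct toy_route rt = { RTF_PINNED, 1 };

	/* A routing daemon does not pass RTF_PINNED, so it is refused. */
	assert(toy_route_delete(&rt, 0) == EPERM);
	assert(rt.present == 1);
	/* Interface code passes RTF_PINNED and may remove the route. */
	assert(toy_route_delete(&rt, RTF_PINNED) == 0);
	assert(rt.present == 0);
}
```

Calling toy_demo() from a small main() should run without tripping an
assertion.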

Every interface address manipulation is done via rtinit[1], so
rtinit1() sets this flag (and behavior does not change here).

Adding an interface address is handled by atomically deleting the old prefix
and adding the interface one.
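
The replace-on-EEXIST step can be sketched in userspace as follows. This is a
rough model under stated assumptions, not the kernel implementation: the
tbl_* names, the fixed-size table and string prefixes are all hypothetical,
and the radix head lock that the patch holds across the delete and re-add is
only mentioned in a comment.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define RTF_PINNED 0x100000	/* same value as in sys/net/route.h */
#define TBL_MAXROUTES 8		/* hypothetical table size */

/* Hypothetical, simplified stand-in for a routing table entry. */
struct tbl_route {
	char prefix[32];
	int flags;
	int used;
};

static struct tbl_route tbl[TBL_MAXROUTES];

struct tbl_route *
tbl_lookup(const char *prefix)
{
	for (int i = 0; i < TBL_MAXROUTES; i++)
		if (tbl[i].used && strcmp(tbl[i].prefix, prefix) == 0)
			return (&tbl[i]);
	return (NULL);
}

int
tbl_add(const char *prefix, int flags)
{
	if (tbl_lookup(prefix) != NULL)
		return (EEXIST);
	for (int i = 0; i < TBL_MAXROUTES; i++) {
		if (!tbl[i].used) {
			strncpy(tbl[i].prefix, prefix, sizeof(tbl[i].prefix) - 1);
			tbl[i].prefix[sizeof(tbl[i].prefix) - 1] = '\0';
			tbl[i].flags = flags;
			tbl[i].used = 1;
			return (0);
		}
	}
	return (ENOMEM);
}

int
tbl_delete(const char *prefix, int req_flags)
{
	struct tbl_route *rt = tbl_lookup(prefix);

	if (rt == NULL)
		return (ESRCH);
	if ((req_flags & RTF_PINNED) == 0 && (rt->flags & RTF_PINNED) != 0)
		return (EPERM);
	rt->used = 0;
	return (0);
}

/* Models adding an interface route that may collide with a learned one. */
int
tbl_add_ifroute(const char *prefix)
{
	int error = tbl_add(prefix, RTF_PINNED);

	if (error == EEXIST) {
		/* The patch holds the radix head lock across both steps. */
		error = tbl_delete(prefix, 0);	/* fails if already pinned */
		if (error == 0)
			error = tbl_add(prefix, RTF_PINNED);
	}
	return (error);
}

void
tbl_demo(void)
{
	/* A daemon-learned route is replaced by the interface route. */
	assert(tbl_add("1.2.3.0/24", 0) == 0);
	assert(tbl_add_ifroute("1.2.3.0/24") == 0);
	/* A second interface route for the same prefix is refused. */
	assert(tbl_add_ifroute("1.2.3.0/24") == EPERM);
}
```

In this model, the old prefix is deleted without RTF_PINNED in the request
flags, so an existing pinned (interface) route is not silently replaced.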

--------------070403050505050004040202
Content-Type: text/plain;
 name="iface_routes.diff"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment;
 filename="iface_routes.diff"

Index: sys/net/if.c
===================================================================
--- sys/net/if.c	(revision 247623)
+++ sys/net/if.c	(working copy)
@@ -1357,7 +1357,8 @@ if_rtdel(struct radix_node *rn, void *arg)
 			return (0);
 
 		err = rtrequest_fib(RTM_DELETE, rt_key(rt), rt->rt_gateway,
-				rt_mask(rt), rt->rt_flags|RTF_RNH_LOCKED,
+				rt_mask(rt),
+				rt->rt_flags|RTF_RNH_LOCKED|RTF_PINNED,
 				(struct rtentry **) NULL, rt->rt_fibnum);
 		if (err) {
 			log(LOG_WARNING, "if_rtdel: error %d\n", err);
Index: sys/net/route.c
===================================================================
--- sys/net/route.c	(revision 247842)
+++ sys/net/route.c	(working copy)
@@ -1112,6 +1112,16 @@ rtrequest1_fib(int req, struct rt_addrinfo *info,
 			error = 0;
 		}
 #endif
+		if ((flags & RTF_PINNED) == 0) {
+			/*
+			 * Check if can delete target route.
+			 */
+			rt = (struct rtentry *)rnh->rnh_lookup(dst,
+			    netmask, rnh);
+			if ((rt != NULL) && (rt->rt_flags & RTF_PINNED))
+				senderr(EPERM);
+		}
+
 		/*
 		 * Remove the item from the tree and return it.
 		 * Complain if it is not there and do no more processing.
@@ -1430,6 +1440,7 @@ rtinit1(struct ifaddr *ifa, int cmd, int flags, in
 	int didwork = 0;
 	int a_failure = 0;
 	static struct sockaddr_dl null_sdl = {sizeof(null_sdl), AF_LINK};
+	struct radix_node_head *rnh;
 
 	if (flags & RTF_HOST) {
 		dst = ifa->ifa_dstaddr;
@@ -1488,7 +1499,6 @@ rtinit1(struct ifaddr *ifa, int cmd, int flags, in
 	 */
 	for ( fibnum = startfib; fibnum <= endfib; fibnum++) {
 		if (cmd == RTM_DELETE) {
-			struct radix_node_head *rnh;
 			struct radix_node *rn;
 			/*
 			 * Look up an rtentry that is in the routing tree and
@@ -1538,7 +1548,8 @@ rtinit1(struct ifaddr *ifa, int cmd, int flags, in
 		 */
 		bzero((caddr_t)&info, sizeof(info));
 		info.rti_ifa = ifa;
-		info.rti_flags = flags | (ifa->ifa_flags & ~IFA_RTSELF);
+		info.rti_flags = flags |
+		    (ifa->ifa_flags & ~IFA_RTSELF) | RTF_PINNED;
 		info.rti_info[RTAX_DST] = dst;
 		/* 
 		 * doing this for compatibility reasons
@@ -1550,6 +1561,32 @@ rtinit1(struct ifaddr *ifa, int cmd, int flags, in
 			info.rti_info[RTAX_GATEWAY] = ifa->ifa_addr;
 		info.rti_info[RTAX_NETMASK] = netmask;
 		error = rtrequest1_fib(cmd, &info, &rt, fibnum);
+
+		if ((error == EEXIST) && (cmd == RTM_ADD)) {
+			/*
+			 * Interface route addition failed.
+			 * Note we probably already checked
+			 * other interface addresses if given prefix exists.
+			 * Atomically delete current prefix generating
+			 * RTM_DELETE message, and retry adding
+			 * interface address.
+			 */
+			rnh = rt_tables_get_rnh(fibnum, dst->sa_family);
+			RADIX_NODE_HEAD_LOCK(rnh);
+			/* Delete old prefix */
+			info.rti_ifa = NULL;
+			info.rti_flags = RTF_RNH_LOCKED;
+			error = rtrequest1_fib(RTM_DELETE, &info, &rt, fibnum);
+			if (error == 0) {
+				info.rti_ifa = ifa;
+				info.rti_flags = flags | RTF_RNH_LOCKED |
+				    (ifa->ifa_flags & ~IFA_RTSELF) | RTF_PINNED;
+				error = rtrequest1_fib(cmd, &info, &rt, fibnum);
+			}
+			RADIX_NODE_HEAD_UNLOCK(rnh);
+		}
+
+
 		if (error == 0 && rt != NULL) {
 			/*
 			 * notify any listening routing agents of the change
Index: sys/net/route.h
===================================================================
--- sys/net/route.h	(revision 247623)
+++ sys/net/route.h	(working copy)
@@ -176,7 +176,7 @@ struct ortentry {
 /*			0x20000		   unused, was RTF_WASCLONED */
 #define RTF_PROTO3	0x40000		/* protocol specific routing flag */
 /*			0x80000		   unused */
-#define RTF_PINNED	0x100000	/* future use */
+#define RTF_PINNED	0x100000	/* route is immutable */
 #define	RTF_LOCAL	0x200000 	/* route represents a local address */
 #define	RTF_BROADCAST	0x400000	/* route represents a bcast address */
 #define	RTF_MULTICAST	0x800000	/* route represents a mcast address */

--------------070403050505050004040202--

From owner-freebsd-net@FreeBSD.ORG  Thu Mar  7 06:45:14 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id DF787391
 for <freebsd-net@freebsd.org>; Thu,  7 Mar 2013 06:45:14 +0000 (UTC)
 (envelope-from pyunyh@gmail.com)
Received: from mail-pa0-f52.google.com (mail-pa0-f52.google.com
 [209.85.220.52]) by mx1.freebsd.org (Postfix) with ESMTP id 913E1247
 for <freebsd-net@freebsd.org>; Thu,  7 Mar 2013 06:45:14 +0000 (UTC)
Received: by mail-pa0-f52.google.com with SMTP id fb1so223808pad.11
 for <freebsd-net@freebsd.org>; Wed, 06 Mar 2013 22:45:08 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=x-received:from:date:to:cc:subject:message-id:reply-to:references
 :mime-version:content-type:content-disposition:in-reply-to
 :user-agent; bh=a/VwAV22ylSHJzEXo/ipQmpRMMpJ3lxrlYGPfXm+0q0=;
 b=L3Pf7WekeiRVK+zuHw9dUSKiMhno3/wEIXg9WIWF0LBEFOWjfYHqwwYlkMTJYPfyMa
 z/UOpJiaE8kTrP2q9DsK3fbBhFdpfRW0ETAf/dqVOHVL566s9rJNIVFdSsoA9DoEXr7q
 sWizznJshUtQuXtk1AIGZbk2j9eEdMdAZ6Lx7sEKwOmdv22/lbR6oj/qZdB+dy7WJ8xg
 t/Z1EavDWK1O+OLRKA0QO8h8lJy9XhW6p42QppSFgFxgETD5TuuRuIKuMyMWWx6856wR
 4AttT7G5wQFb+8MaW+f90OTVEeGLvIQt/dELTu9iopsXmGAsPzkmf9/v8aSmvKaUZuGm
 zdyQ==
X-Received: by 10.66.51.198 with SMTP id m6mr1321535pao.215.1362638707969;
 Wed, 06 Mar 2013 22:45:07 -0800 (PST)
Received: from pyunyh@gmail.com (lpe4.p59-icn.cdngp.net. [114.111.62.249])
 by mx.google.com with ESMTPS id c8sm619022pbq.10.2013.03.06.22.45.04
 (version=TLSv1 cipher=RC4-SHA bits=128/128);
 Wed, 06 Mar 2013 22:45:06 -0800 (PST)
Received: by pyunyh@gmail.com (sSMTP sendmail emulation);
 Thu, 07 Mar 2013 15:45:00 +0900
From: YongHyeon PYUN <pyunyh@gmail.com>
Date: Thu, 7 Mar 2013 15:45:00 +0900
To: Zeus Panchenko <zeus@ibs.dn.ua>
Subject: Re: FreeBSD 9.1-RELEASE + bge0 == watchdog timeout
Message-ID: <20130307064500.GC1478@michelle.cdnetworks.com>
References: <20130225082042.GB1426@michelle.cdnetworks.com>
 <512CF97B.8030805@norma.perm.ru>
 <20130227020123.GA3581@michelle.cdnetworks.com> <512DE968.4020409@quip.cz>
 <20130228053558.GA1474@michelle.cdnetworks.com>
 <5136D89D.4000902@norma.perm.ru>
 <20130306062658.GC1483@michelle.cdnetworks.com>
 <513713C2.1000007@norma.perm.ru>
 <20130307022446.GB3108@michelle.cdnetworks.com>
 <20130307082251.85461@relay.ibs.dn.ua>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20130307082251.85461@relay.ibs.dn.ua>
User-Agent: Mutt/1.4.2.3i
Cc: freebsd-net@freebsd.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
Reply-To: pyunyh@gmail.com
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 07 Mar 2013 06:45:14 -0000

On Thu, Mar 07, 2013 at 08:22:51AM +0300, Zeus Panchenko wrote:
> Hi,
> 
> here is my situation, much like the issue
> 

No, your issue is a completely different one.

> On 06.03.2013 12:26, YongHyeon PYUN wrote:
> > If you were using latest stable/8, the result would be same on
> > CURRENT.
> 
> I use FreeBSD 9.1-RELEASE #0 r243825: amd64 + ZFS
> on HP ProLiant DL360e Gen8 
> 
> the box has two 4-port cards, igb(4) I350 and bge(4) NetXtreme BCM5719,
> according to the pciconf data
> 
> > How frequently do you see the watchdog timeouts? Is there way to
> > reproduce it?
> 
> I noticed that after activation, bge(4) stops responding and the interface
> becomes useless, while igb(4) works fine after some sysctl-ing
> 
> for now I'm forced not to use bge(4) at all :(

9.1-RELEASE does not have the code required to support your controller.
Use stable/9.

From owner-freebsd-net@FreeBSD.ORG  Thu Mar  7 07:14:06 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id DFE0DA4A
 for <freebsd-net@freebsd.org>; Thu,  7 Mar 2013 07:14:06 +0000 (UTC)
 (envelope-from emz@norma.perm.ru)
Received: from elf.hq.norma.perm.ru (unknown [IPv6:2001:470:1f09:14c0::2])
 by mx1.freebsd.org (Postfix) with ESMTP id 8141E35A
 for <freebsd-net@freebsd.org>; Thu,  7 Mar 2013 07:14:06 +0000 (UTC)
Received: from bsdrookie.norma.com. ([IPv6:fd00::726])
 by elf.hq.norma.perm.ru (8.14.5/8.14.5) with ESMTP id r277E3eK006676
 (version=TLSv1/SSLv3 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO)
 for <freebsd-net@freebsd.org>; Thu, 7 Mar 2013 13:14:04 +0600 (YEKT)
 (envelope-from emz@norma.perm.ru)
Message-ID: <51383E3B.5030007@norma.perm.ru>
Date: Thu, 07 Mar 2013 13:14:03 +0600
From: "Eugene M. Zheganin" <emz@norma.perm.ru>
User-Agent: Mozilla/5.0 (X11; FreeBSD amd64;
 rv:17.0) Gecko/17.0 Thunderbird/17.0
MIME-Version: 1.0
To: freebsd-net@freebsd.org
Subject: Re: FreeBSD 9.1-RELEASE + bge0 == watchdog timeout
References: <20130225082042.GB1426@michelle.cdnetworks.com>
 <512CF97B.8030805@norma.perm.ru>
 <20130227020123.GA3581@michelle.cdnetworks.com> <512DE968.4020409@quip.cz>
 <20130228053558.GA1474@michelle.cdnetworks.com>
 <5136D89D.4000902@norma.perm.ru>
 <20130306062658.GC1483@michelle.cdnetworks.com>
 <513713C2.1000007@norma.perm.ru>
 <20130307022446.GB3108@michelle.cdnetworks.com> <513820E2.806@norma.perm.ru>
 <20130307062335.GB1478@michelle.cdnetworks.com>
In-Reply-To: <20130307062335.GB1478@michelle.cdnetworks.com>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.2.7
 (elf.hq.norma.perm.ru [IPv6:fd00::30a]);
 Thu, 07 Mar 2013 13:14:04 +0600 (YEKT)
X-Spam-Status: No hits=-97.8 bayes=0.5 testhits RDNS_NONE=1.274,
 SPF_SOFTFAIL=0.972,USER_IN_WHITELIST=-100 autolearn=no version=3.3.2
X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on elf.hq.norma.perm.ru
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 07 Mar 2013 07:14:06 -0000

Hi.

On 07.03.2013 12:23, YongHyeon PYUN wrote:
> On Thu, Mar 07, 2013 at 11:08:50AM +0600, Eugene M. Zheganin wrote:
>> It was definitely older than "months". It was running something similar 
>> to  "FreeBSD 8.2-STABLE #0: Mon Sep 19 08:10:00 YEKST 2011", this is the 
>> uname from a neighbor machine.
>>
>> I have, as I said, identical servers running FreeBSD. Here are some of 
>> the unames that I don't see timeouts on:
>>
>> 8.3-STABLE #2: Wed Aug 29 13:00:02 YEKT 2012 (up 187 days)
>> 8.3-PRERELEASE #1: Thu Mar 29 16:14:11 MSK 2012 (up 15 days, previous 
>> uptime around 180 days)
> These servers do not have 5718/5719/5720 changes.
>
>> 8.2-STABLE #0: Wed Dec 14 16:56:11 YEKT 2011 (up 99 days)
> This server has the bge(4) change but it didn't trigger watchdog
> timeouts.  Does this server use the same controller? If yes, the
> issue didn't come from bge(4) change.
>
How's that? It's running an even older version than the previous two. I guess
you misread the year.

Eugene.

From owner-freebsd-net@FreeBSD.ORG  Thu Mar  7 07:39:54 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: net@freebsd.org
Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id E2FE191
 for <net@freebsd.org>; Thu,  7 Mar 2013 07:39:54 +0000 (UTC)
 (envelope-from andre@freebsd.org)
Received: from c00l3r.networx.ch (c00l3r.networx.ch [62.48.2.2])
 by mx1.freebsd.org (Postfix) with ESMTP id 5B5DE61A
 for <net@freebsd.org>; Thu,  7 Mar 2013 07:39:54 +0000 (UTC)
Received: (qmail 80793 invoked from network); 7 Mar 2013 08:53:24 -0000
Received: from c00l3r.networx.ch (HELO [127.0.0.1]) ([62.48.2.2])
 (envelope-sender <andre@freebsd.org>)
 by c00l3r.networx.ch (qmail-ldap-1.03) with SMTP
 for <melifaro@FreeBSD.org>; 7 Mar 2013 08:53:24 -0000
Message-ID: <51384443.5070209@freebsd.org>
Date: Thu, 07 Mar 2013 08:39:47 +0100
From: Andre Oppermann <andre@freebsd.org>
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64;
 rv:17.0) Gecko/20130107 Thunderbird/17.0.2
MIME-Version: 1.0
To: "Alexander V. Chernikov" <melifaro@FreeBSD.org>
Subject: Re: [patch] interface routes
References: <513834E4.7050203@FreeBSD.org>
In-Reply-To: <513834E4.7050203@FreeBSD.org>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: net@freebsd.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 07 Mar 2013 07:39:54 -0000

On 07.03.2013 07:34, Alexander V. Chernikov wrote:
> Hello list!
>
> There is a known long-lived issue with interface routes addition/deletion:
>
> ifconfig iface inet 1.2.3.4/24 can fail if given prefix is already in kernel route table (for
> example, advertised by IGP like OSPF).
>
> Interface route can be deleted via route(8) or any route socket user (sometimes this happens with
> popular opensource daemons like bird/quagga).
>
> Problem is reported at least in kern/106722 and kern/155772.

Your patch is a welcome addition.

> This can be fixed the following way:
> The immutable route flag (RTF_PINNED, added in 19995 with 'for future use' comment) is used to mark
> a route 'immutable'.
> rtrequest1_fib refuses to delete routes with the given flag unless RTF_PINNED is set in rti_flags.

How do the routing daemons react to being unable to change/delete
such a route?

EADDRINUSE would likely be a more descriptive error instead of EPERM?

> Every interface address manipulation is done via rtinit[1], so
> rtinit1() sets this flag (and behavior does not change here).
>
> Adding an interface address is handled by atomically deleting the old prefix and adding the interface one.

This brings up a long standing sore point of our routing code
which this patch makes more pronounced.  When an interface link
state is down I don't want the route to it to persist but to
become inactive so another path can be chosen.  This is the very
point of running a routing daemon.  So on the link-down event
the installed interface routes should be removed from the routing
table.  The configured addresses though should persist and the
interface routes re-installed on a link-up event.  What's your
opinion on it?

Other than these points I think your code is fine and can go
into the tree.

-- 
Andre


From owner-freebsd-net@FreeBSD.ORG  Thu Mar  7 07:59:55 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id C98A1505
 for <net@freebsd.org>; Thu,  7 Mar 2013 07:59:55 +0000 (UTC)
 (envelope-from sthaug@nethelp.no)
Received: from bizet.nethelp.no (bizet.nethelp.no [195.1.209.33])
 by mx1.freebsd.org (Postfix) with SMTP id 109576CB
 for <net@freebsd.org>; Thu,  7 Mar 2013 07:59:54 +0000 (UTC)
Received: (qmail 75461 invoked from network); 7 Mar 2013 07:53:12 -0000
Received: from bizet.nethelp.no (HELO localhost) (195.1.209.33)
 by bizet.nethelp.no with SMTP; 7 Mar 2013 07:53:12 -0000
Date: Thu, 07 Mar 2013 08:53:12 +0100 (CET)
Message-Id: <20130307.085312.41695129.sthaug@nethelp.no>
To: andre@freebsd.org
Subject: Re: [patch] interface routes
From: sthaug@nethelp.no
In-Reply-To: <51384443.5070209@freebsd.org>
References: <513834E4.7050203@FreeBSD.org>
	<51384443.5070209@freebsd.org>
X-Mailer: Mew version 3.3 on Emacs 21.3 / Mule 5.0 (SAKAKI)
Mime-Version: 1.0
Content-Type: Text/Plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Cc: melifaro@FreeBSD.org, net@freebsd.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 07 Mar 2013 07:59:55 -0000

> This brings up a long standing sore point of our routing code
> which this patch makes more pronounced.  When an interface link
> state is down I don't want the route to it to persist but to
> become inactive so another path can be chosen.  This is the very
> point of running a routing daemon.  So on the link-down event
> the installed interface routes should be removed from the routing
> table.  The configured addresses though should persist and the
> interface routes re-installed on a link-up event.  What's your
> opinion on it?

Yes please! This is what I take for granted on my routers.

Steinar Haug, Nethelp consulting, sthaug@nethelp.no

From owner-freebsd-net@FreeBSD.ORG  Thu Mar  7 08:16:02 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id 7B24C915
 for <freebsd-net@freebsd.org>; Thu,  7 Mar 2013 08:16:02 +0000 (UTC)
 (envelope-from pyunyh@gmail.com)
Received: from mail-pa0-f41.google.com (mail-pa0-f41.google.com
 [209.85.220.41]) by mx1.freebsd.org (Postfix) with ESMTP id 4A7A5767
 for <freebsd-net@freebsd.org>; Thu,  7 Mar 2013 08:16:02 +0000 (UTC)
Received: by mail-pa0-f41.google.com with SMTP id fb11so286102pad.0
 for <freebsd-net@freebsd.org>; Thu, 07 Mar 2013 00:15:56 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=x-received:from:date:to:cc:subject:message-id:reply-to:references
 :mime-version:content-type:content-disposition:in-reply-to
 :user-agent; bh=+aDdCEe8i1j6ruNA+ZjB/s+ACQLuzBWpoIyrLdzBB7Q=;
 b=cKQ68uYCPEXPIf+WUxtdb7qD/kD2nAaht42Z2/NmThaVMA36b4NPtO1cDY32n2d9l/
 953EARcRvETYRtJkNpOMIyq+RJ9eKtxQKaIr14evDdBje6S18ICxY1mfQs2+lEW6G6MT
 7qTDvxlIFoi/5vanmq8B9saU4bvwTO8ORs70aSQXXb7Cvp2Hcr6sh7JqJakaoO4R5DXo
 v7h414Ila586o+Wg3BfhRqbqB4aoi3xU/gg/aVN7Yr/c1hQHV3KW6Ai0EPYVhAmtQSkC
 qMrvcvR+90zQx/JlYIPP8In2eoVe+bkVEyTh7lkNIMlUuUrRZ70Q/wD/etCMimB2QAxY
 Ngmw==
X-Received: by 10.68.195.33 with SMTP id ib1mr52532958pbc.105.1362644156666;
 Thu, 07 Mar 2013 00:15:56 -0800 (PST)
Received: from pyunyh@gmail.com (lpe4.p59-icn.cdngp.net. [114.111.62.249])
 by mx.google.com with ESMTPS id eg1sm871866pbb.33.2013.03.07.00.15.53
 (version=TLSv1 cipher=RC4-SHA bits=128/128);
 Thu, 07 Mar 2013 00:15:55 -0800 (PST)
Received: by pyunyh@gmail.com (sSMTP sendmail emulation);
 Thu, 07 Mar 2013 17:15:48 +0900
From: YongHyeon PYUN <pyunyh@gmail.com>
Date: Thu, 7 Mar 2013 17:15:48 +0900
To: "Eugene M. Zheganin" <emz@norma.perm.ru>
Subject: Re: FreeBSD 9.1-RELEASE + bge0 == watchdog timeout
Message-ID: <20130307081548.GD1478@michelle.cdnetworks.com>
References: <20130227020123.GA3581@michelle.cdnetworks.com>
 <512DE968.4020409@quip.cz> <20130228053558.GA1474@michelle.cdnetworks.com>
 <5136D89D.4000902@norma.perm.ru>
 <20130306062658.GC1483@michelle.cdnetworks.com>
 <513713C2.1000007@norma.perm.ru>
 <20130307022446.GB3108@michelle.cdnetworks.com> <513820E2.806@norma.perm.ru>
 <20130307062335.GB1478@michelle.cdnetworks.com>
 <51383E3B.5030007@norma.perm.ru>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <51383E3B.5030007@norma.perm.ru>
User-Agent: Mutt/1.4.2.3i
Cc: freebsd-net@freebsd.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
Reply-To: pyunyh@gmail.com
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 07 Mar 2013 08:16:02 -0000

On Thu, Mar 07, 2013 at 01:14:03PM +0600, Eugene M. Zheganin wrote:
> Hi.
> 
> On 07.03.2013 12:23, YongHyeon PYUN wrote:
> > On Thu, Mar 07, 2013 at 11:08:50AM +0600, Eugene M. Zheganin wrote:
> >> It was definitely older than "months". It was running something similar 
> >> to  "FreeBSD 8.2-STABLE #0: Mon Sep 19 08:10:00 YEKST 2011", this is the 
> >> uname from a neighbor machine.
> >>
> >> I have, as I said, identical servers running FreeBSD. Here are some of 
> >> the unames that I don't see timeouts on:
> >>
> >> 8.3-STABLE #2: Wed Aug 29 13:00:02 YEKT 2012 (up 187 days)
> >> 8.3-PRERELEASE #1: Thu Mar 29 16:14:11 MSK 2012 (up 15 days, previous 
> >> uptime around 180 days)
> > These servers do not have 5718/5719/5720 changes.
> >
> >> 8.2-STABLE #0: Wed Dec 14 16:56:11 YEKT 2011 (up 99 days)
> > This server has the bge(4) change but it didn't trigger watchdog
> > timeouts.  Does this server use the same controller? If yes, the
> > issue didn't come from bge(4) change.
> >
> How's that ? It's running even older version than previous two. I guess
> you misread the year.

Oops, you're right. 

From owner-freebsd-net@FreeBSD.ORG  Thu Mar  7 10:20:01 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@smarthost.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id B32E36E8
 for <freebsd-net@smarthost.ysv.freebsd.org>;
 Thu,  7 Mar 2013 10:20:01 +0000 (UTC)
 (envelope-from gnats@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org
 [IPv6:2001:1900:2254:206c::16:87])
 by mx1.freebsd.org (Postfix) with ESMTP id 8A409F5D
 for <freebsd-net@smarthost.ysv.freebsd.org>;
 Thu,  7 Mar 2013 10:20:01 +0000 (UTC)
Received: from freefall.freebsd.org (localhost [127.0.0.1])
 by freefall.freebsd.org (8.14.6/8.14.6) with ESMTP id r27AK1ID040962
 for <freebsd-net@freefall.freebsd.org>; Thu, 7 Mar 2013 10:20:01 GMT
 (envelope-from gnats@freefall.freebsd.org)
Received: (from gnats@localhost)
 by freefall.freebsd.org (8.14.6/8.14.6/Submit) id r27AK1k7040961;
 Thu, 7 Mar 2013 10:20:01 GMT (envelope-from gnats)
Date: Thu, 7 Mar 2013 10:20:01 GMT
Message-Id: <201303071020.r27AK1k7040961@freefall.freebsd.org>
To: freebsd-net@FreeBSD.org
Cc: 
From: "Charbon, Julien" <jcharbon@verisign.com>
Subject: Re: kern/176446: [netinet] [patch] Concurrency in ixgbe driving
 out-of-order packet process and spurious RST
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
Reply-To: "Charbon, Julien" <jcharbon@verisign.com>
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 07 Mar 2013 10:20:01 -0000

The following reply was made to PR kern/176446; it has been noted by GNATS.

From: "Charbon, Julien" <jcharbon@verisign.com>
To: John Baldwin <jhb@freebsd.org>
Cc: bug-followup@freebsd.org,
        "De La Gueronniere, Marc" <mdelagueronniere@verisign.com>
Subject: Re: kern/176446: [netinet] [patch] Concurrency in ixgbe driving out-of-order
 packet process and spurious RST
Date: Thu, 07 Mar 2013 11:11:25 +0100

 On 2/28/13 8:10 PM, Charbon, Julien wrote:
 > On 2/28/13 4:57 PM, John Baldwin wrote:
 >> Can you try the fixes from http://svnweb.freebsd.org/base?view=revision&revision=240968?
 >
 >    Actually, Marc (I CC'ed him) did find the r240968 fix for concurrency
 > between ixgbe_msix_que() and ixgbe_handle_que(), and made a backport for
 > release-8.3.0 (see patch [1] below).  However, the issue was still
 > reproducible, then Marc found another place for concurrency from
 > ixgbe_local_timer() and fixed it (see patch [2]).  But it was still not
 > enough, and he found a last place for concurrency due to
 > ixgbe_rearm_queues() call (see patch [3]).  With all these patches
 > applied, we were not able to reproduce this issue.
 
   Just for the record:  As expected this issue is reproducible on 
 9.1-RELEASE:
 
 # uname -a
 FreeBSD atlas 9.1-RELEASE FreeBSD 9.1-RELEASE #1 r247851M: Wed Mar  6 
 11:17:43 UTC 2013 
 jcharbon@atlas:/usr/obj/app/jcharbon/9.1.0/sys/GENERIC  amd64
 
   Enable TCP debug log:
 
 # sysctl net.inet.tcp.log_debug=1
 
   Put enough load on a TCP service and, due to ixgbe race conditions between 
 ixgbe_msix_que() and ixgbe_handle_que(), you will get:
 
 Mar  7 10:01:04 atlas kernel: TCP: [192.168.100.21]:12918 to 
 [192.168.100.152]:8080; syncache_socket: in_pcbconnect failed with error 48
 Mar  7 10:01:04 atlas kernel: TCP: [192.168.100.21]:12918 to 
 [192.168.100.152]:8080 tcpflags 0x10<ACK>; tcp_input: Listen socket: 
 Socket allocation failed due to limits or memory shortage, sending RST
 Mar  7 10:01:04 atlas kernel: TCP: [192.168.100.21]:12918 to 
 [192.168.100.152]:8080 tcpflags 0x4<RST>; syncache_chkrst: Spurious RST 
 without matching syncache entry (possibly syncookie only), segment ignored
 
   We will provide our current fix patch for 9.1-RELEASE.
 
 --
 Julien

From owner-freebsd-net@FreeBSD.ORG  Thu Mar  7 11:44:16 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id 39D33C23;
 Thu,  7 Mar 2013 11:44:16 +0000 (UTC)
 (envelope-from melifaro@FreeBSD.org)
Received: from mail.ipfw.ru (unknown [IPv6:2a01:4f8:120:6141::2])
 by mx1.freebsd.org (Postfix) with ESMTP id EE6C4315;
 Thu,  7 Mar 2013 11:44:15 +0000 (UTC)
Received: from [2a02:6b8:0:401:222:4dff:fe50:cd2f]
 (helo=dhcp170-36-red.yandex.net)
 by mail.ipfw.ru with esmtpsa (TLSv1:CAMELLIA256-SHA:256)
 (Exim 4.76 (FreeBSD)) (envelope-from <melifaro@FreeBSD.org>)
 id 1UDZIV-000Aup-L8; Thu, 07 Mar 2013 15:47:43 +0400
Message-ID: <51387D4A.9030408@FreeBSD.org>
Date: Thu, 07 Mar 2013 15:43:06 +0400
From: "Alexander V. Chernikov" <melifaro@FreeBSD.org>
User-Agent: Mozilla/5.0 (X11; FreeBSD amd64;
 rv:17.0) Gecko/17.0 Thunderbird/17.0
MIME-Version: 1.0
To: Andre Oppermann <andre@freebsd.org>
Subject: Re: [patch] interface routes
References: <513834E4.7050203@FreeBSD.org> <51384443.5070209@freebsd.org>
In-Reply-To: <51384443.5070209@freebsd.org>
X-Enigmail-Version: 1.4.6
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Cc: net@freebsd.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 07 Mar 2013 11:44:16 -0000

On 07.03.2013 11:39, Andre Oppermann wrote:
> On 07.03.2013 07:34, Alexander V. Chernikov wrote:
>> Hello list!
>>
>> There is a known long-lived issue with interface routes
>> addition/deletion:
>>
>> ifconfig iface inet 1.2.3.4/24 can fail if given prefix is already in
>> kernel route table (for
>> example, advertised by IGP like OSPF).
>>
>> Interface route can be deleted via route(8) or any route socket user
>> (sometimes this happens with
>> popular opensource daemons like bird/quagga).
>>
>> Problem is reported at least in kern/106722 and kern/155772.
> 
> Your patch is a welcome addition.
> 
>> This can be fixed the following way:
>> The immutable route flag (RTF_PINNED, added in 19995 with 'for future use'
>> comment) is used to mark
>> a route 'immutable'.
>> rtrequest1_fib refuses to delete routes with the given flag unless
>> RTF_PINNED is set in rti_flags.
> 
> How do the routing daemons react to being unable to change/delete
> such a route?
Routing daemons have long lived with the fact that their route socket
commands can fail (and there is the route(8) utility, which can do
anything), so typically bird/quagga yells something like
'bird: KRT: Error sending route 11.0.0.0/24 to kernel: File exists'
and marks the given route as not installed in its internal RIB. Additionally,
the daemon will probably retry inserting such routes on every periodic KRT
rescan (tens of minutes).

Given that such situations usually last for a very short time (e.g.
physical link flaps), everything should return to a normal state quickly.
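
The retry-on-rescan behaviour described above can be modelled in a few
lines of C. This is only a rough sketch, not code from bird or quagga:
struct rib_entry, rib_rescan() and the fake_kernel_add() stub are all
hypothetical names made up for illustration, and the stub merely
imitates a kernel that answers EEXIST ("File exists") for one prefix.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical RIB entry as a routing daemon might keep it. */
struct rib_entry {
	unsigned int prefix;	/* network prefix, host byte order */
	bool installed;		/* true once the kernel accepted it */
};

/* Hypothetical callback that pushes one route to the kernel. */
typedef int (*kernel_add_fn)(unsigned int prefix);

/* Stand-in for the kernel: rejects one prefix while it is "blocked". */
static unsigned int fake_blocked_prefix;
static bool fake_blocked;

static int
fake_kernel_add(unsigned int prefix)
{
	if (fake_blocked && prefix == fake_blocked_prefix)
		return (EEXIST);	/* "File exists" */
	return (0);
}

/*
 * One periodic rescan: retry every route that is not yet installed.
 * A failure only leaves the entry marked as not installed, so the
 * next rescan tries again.  Returns the number still pending.
 */
static size_t
rib_rescan(struct rib_entry *rib, size_t n, kernel_add_fn add)
{
	size_t i, pending = 0;

	for (i = 0; i < n; i++) {
		if (rib[i].installed)
			continue;
		if (add(rib[i].prefix) == 0)
			rib[i].installed = true;
		else
			pending++;
	}
	return (pending);
}
```

Once the conflicting kernel route is gone, the next rib_rescan() call
installs the route without any further action by the operator.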

> 
> EADDRINUSE would likely be a more descriptive error instead of EPERM?
Well, not sure if EADDRINUSE is very descriptive for _deleting_ route.
"Yes, I know that it is in use so that's the reason I'm trying to delete
it".

> 
>> Every interface address manipulation is done via rtinit[1], so
>> rtinit1() sets this flag (and behavior does not change here).
>>
>> Adding interface address is handled via atomically deleting old prefix
>> and adding interface one.
> 
> This brings up a long standing sore point of our routing code
> which this patch makes more pronounced.  When an interface link
> state is down I don't want the route to it to persist but to
> become inactive so another path can be chosen.  This is the very
> point of running a routing daemon.  So on the link-down event
> the installed interface routes should be removed from the routing
> table.  The configured addresses though should persist and the
> interface routes re-installed on a link-up event.  What's your
> opinion on it?
This is exactly what is done in current code for IPv4:
if_down calls if_unroute(), which calls pfctlinput() for every interface
address, and a domain-dependent function like rip_ctlinput calls
in_ifscrub() to remove the given interface route.
However, the address route (/32) still remains (but routing daemons, at least
bird, tend to ignore it since it is not listed as a valid interface
address/mask).

This is not done for IPv6 and we should probably do the same.
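
As a rough illustration of the link-state behaviour discussed here, the
following toy model keeps the configured address and its host (/32)
route across a link-down event and drops only the interface (prefix)
route, then restores it on link-up as Andre describes. It is a sketch
under those assumptions, not kernel code: struct toy_ifaddr,
toy_if_down() and toy_if_up() are hypothetical names.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of one configured interface address. */
struct toy_ifaddr {
	unsigned int addr;	/* configured address, host byte order */
	unsigned int prefixlen;	/* e.g. 24 for a /24 */
	bool prefix_route;	/* interface (network) route installed */
	bool host_route;	/* address (/32) route installed */
};

/* Link down: remove the interface routes, keep addresses and /32s. */
static void
toy_if_down(struct toy_ifaddr *ifa, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		ifa[i].prefix_route = false;
}

/* Link up: re-install the interface route for every address. */
static void
toy_if_up(struct toy_ifaddr *ifa, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		ifa[i].prefix_route = true;
}
```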

> 
> Other than these points I think your code is fine and can go
> into the tree.
> 


-- 
WBR, Alexander

From owner-freebsd-net@FreeBSD.ORG  Thu Mar  7 11:56:00 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id 68849149
 for <net@freebsd.org>; Thu,  7 Mar 2013 11:56:00 +0000 (UTC)
 (envelope-from andre@freebsd.org)
Received: from c00l3r.networx.ch (c00l3r.networx.ch [62.48.2.2])
 by mx1.freebsd.org (Postfix) with ESMTP id DE9543A6
 for <net@freebsd.org>; Thu,  7 Mar 2013 11:55:59 +0000 (UTC)
Received: (qmail 92765 invoked from network); 7 Mar 2013 13:09:25 -0000
Received: from c00l3r.networx.ch (HELO [127.0.0.1]) ([62.48.2.2])
 (envelope-sender <andre@freebsd.org>)
 by c00l3r.networx.ch (qmail-ldap-1.03) with SMTP
 for <melifaro@FreeBSD.org>; 7 Mar 2013 13:09:25 -0000
Message-ID: <51388046.7040408@freebsd.org>
Date: Thu, 07 Mar 2013 12:55:50 +0100
From: Andre Oppermann <andre@freebsd.org>
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64;
 rv:17.0) Gecko/20130107 Thunderbird/17.0.2
MIME-Version: 1.0
To: "Alexander V. Chernikov" <melifaro@FreeBSD.org>
Subject: Re: [patch] interface routes
References: <513834E4.7050203@FreeBSD.org> <51384443.5070209@freebsd.org>
 <51387D4A.9030408@FreeBSD.org>
In-Reply-To: <51387D4A.9030408@FreeBSD.org>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: net@freebsd.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 07 Mar 2013 11:56:00 -0000

On 07.03.2013 12:43, Alexander V. Chernikov wrote:
> On 07.03.2013 11:39, Andre Oppermann wrote:
>> On 07.03.2013 07:34, Alexander V. Chernikov wrote:
>>> Hello list!
>>>
>>> There is a known long-lived issue with interface routes
>>> addition/deletion:
>>>
>>> ifconfig iface inet 1.2.3.4/24 can fail if given prefix is already in
>>> kernel route table (for
>>> example, advertised by IGP like OSPF).
>>>
>>> Interface route can be deleted via route(8) or any route socket user
>>> (sometimes this happens with
>>> popular opensource daemons like bird/quagga).
>>>
>>> Problem is reported at least in kern/106722 and kern/155772.
>>
>> Your patch is a welcome addition.
>>
>>> This can be fixed the following way:
>>> Immutable route flag (RTM_PINNED, added in 19995 with 'for future use'
>>> comment) is utilised to mark
>>> route 'immutable'.
>>> rtrequest1_fib refuses to delete routes with given flag unless
>>> RTM_PINNED is set in rti_flags.
>>
>> How do the routing daemons react to being unable to change/delete
>> such a route?
> routing daemons have long lived with the fact that their route socket cmds can
> fail (and there is the route(8) utility, which can do anything), so typically
> bird/quagga yells like
> 'bird: KRT: Error sending route 11.0.0.0/24 to kernel: File exists'
> and marks given route as not installed in internal RIB. Additionally,
> daemon will probably re-try to insert such routes on every periodic KRT
> rescan (tens of minutes).

OK. No problem then.
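
For illustration, the daemon behaviour quoted above (mark the route as not
installed, retry on the next periodic scan) can be sketched in user space as
below.  All names (toy_rib_route, toy_krt_rescan, the stub callbacks) are
hypothetical; this is not bird or quagga code, and the callback only stands
in for a route-socket RTM_ADD.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* One entry of a daemon's internal RIB (hypothetical layout). */
struct toy_rib_route {
    const char *prefix;
    bool installed;      /* false until the kernel accepts the route */
};

/* Stand-in for a route-socket add; returns 0 or an errno value. */
typedef int (*toy_kernel_add_fn)(const char *prefix);

/* Stub "kernels" for the sketch: one always refuses, one always accepts. */
int toy_kernel_busy(const char *prefix) { (void)prefix; return EEXIST; }
int toy_kernel_free(const char *prefix) { (void)prefix; return 0; }

/*
 * One periodic rescan: retry every route that is not installed yet.
 * Returns the number of routes still pending after this pass.
 */
size_t toy_krt_rescan(struct toy_rib_route *rib, size_t n, toy_kernel_add_fn add)
{
    assert(add != NULL && (rib != NULL || n == 0));
    size_t pending = 0;

    for (size_t i = 0; i < n; i++) {
        if (rib[i].installed)
            continue;                    /* nothing to do */
        if (add(rib[i].prefix) == 0)
            rib[i].installed = true;     /* kernel accepted it this time */
        else
            pending++;                   /* e.g. EEXIST: try again later */
    }
    return pending;
}
```

A failed add is not fatal in this model; the route simply stays pending until
a later rescan succeeds.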

> Given that such situations usually happen for a very short time (e.g.
> physical link flaps) everything should return to a normal state quickly.
>
>>
>> EADDRINUSE would likely be a more descriptive error instead of EPERM?
> Well, not sure if EADDRINUSE is very descriptive for _deleting_ route.
> "Yes, I know that it is in use so that's the reason I'm trying to delete
> it".

I'm thinking of distinguishing a permission denial caused by
insufficient rights (jail or something like that) from an explicitly
pinned route.  With EPERM you may look for the problem in the wrong
place.  E*INUSE is a common error for something that can't be removed
because it is still being used by or for something else.  That is the
case here, so it may be more appropriate.
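
The distinction between the two error cases could look roughly like the
sketch below.  It is a user-space model, not the code from the patch:
toy_route, TOY_RTF_PINNED and toy_rtdelete_check are made-up names, and
the flag value is arbitrary.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical flag value for this sketch; not the kernel's definition. */
#define TOY_RTF_PINNED 0x1

struct toy_route {
    const char *prefix;
    int flags;
};

/*
 * Decide whether a delete request may proceed.
 * Returns 0 if allowed, EPERM if the caller lacks privilege, and
 * EADDRINUSE if the route is pinned and the request does not carry
 * the pinned flag itself.
 */
int toy_rtdelete_check(const struct toy_route *rt, int req_flags, int privileged)
{
    assert(rt != NULL);

    if (!privileged)
        return EPERM;            /* rights problem (e.g. inside a jail) */
    if ((rt->flags & TOY_RTF_PINNED) && !(req_flags & TOY_RTF_PINNED))
        return EADDRINUSE;       /* route is pinned by an interface address */
    return 0;
}
```

With separate error values, a caller that gets EADDRINUSE knows to look at
the interface configuration rather than at its own privileges.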

>>> Every interface address manipulation is done via rtinit[1], so
>>> rtinit1() sets this flag (and behavior does not change here).
>>>
>>> Adding interface address is handled via atomically deleting old prefix
>>> and adding interface one.
>>
>> This brings up a long standing sore point of our routing code
>> which this patch makes more pronounced.  When an interface link
>> state is down I don't want the route to it to persist but to
>> become inactive so another path can be chosen.  This is the very
>> point of running a routing daemon.  So on the link-down event
>> the installed interface routes should be removed from the routing
>> table.  The configured addresses though should persist and the
>> interface routes re-installed on a link-up event.  What's your
>> opinion on it?
>
> This is exactly what is done in current code for IPv4:
> if_down calls if_unroute(), it calls prctlinput() for every interface
> address, and domain-dependent function like rip_ctlinput calls
> in_ifscrub() cleaning given interface route.
> However, address route (/32) still remains (but route daemons, at least
> bird, tends to ignore it since it is not listed as valid interface
> address/mask).

IF_DOWN and link state down are not the same thing.  When the cable
is unplugged the link state goes down but not the interface.

> This is not done for IPv6 and we should probably do the same.

Yes, they should be synchronized.

>> Other than these points I think your code is fine and can go
>> into the tree.

-- 
Andre


From owner-freebsd-net@FreeBSD.ORG  Thu Mar  7 12:39:21 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id C5D2EF9B
 for <freebsd-net@freebsd.org>; Thu,  7 Mar 2013 12:39:21 +0000 (UTC)
 (envelope-from araujobsdport@gmail.com)
Received: from mail-wi0-x22d.google.com (mail-wi0-x22d.google.com
 [IPv6:2a00:1450:400c:c05::22d])
 by mx1.freebsd.org (Postfix) with ESMTP id 69D6E7A0
 for <freebsd-net@freebsd.org>; Thu,  7 Mar 2013 12:39:21 +0000 (UTC)
Received: by mail-wi0-f173.google.com with SMTP id hq4so752131wib.12
 for <freebsd-net@freebsd.org>; Thu, 07 Mar 2013 04:39:20 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:x-received:reply-to:date:message-id:subject:from:to
 :content-type; bh=N09z2CTOmQFdQJz+y3+QxWIj5EQ0G5wVtCHnQcUhrXE=;
 b=rSprFeEk36ojNn+lzSTGNTaZ2bWngnElmfFMUvhfMPIlQDmcNxFfC9PvGnujEXy8i5
 sCN4zNQ0lvWNAMe+yql8hwdQZUHYorHvk8FrQAfGuGPRoSsoUtDW6XHjQFrsGUZqGHK0
 /JcDFZ+4t9TeU1hAmXsMxDCjjPqiARf9/9gGEXxDqHTuqOvdagVT5PG7/owLH5bS3HMz
 6TXUTXYp+AcZoAgUvrlRpD4FBn/IAcVEziXL1vOSMbXwyqQmVULarixRPjOxVvEDDFQB
 50nkD0T3YczoM8PG7u4kKfac3DwQ8PO26WnCsN3q9bC8j/VK4ce0YyxFuvdydyFXc4sJ
 2A7w==
MIME-Version: 1.0
X-Received: by 10.194.21.233 with SMTP id y9mr47215955wje.47.1362659960628;
 Thu, 07 Mar 2013 04:39:20 -0800 (PST)
Received: by 10.180.212.51 with HTTP; Thu, 7 Mar 2013 04:39:20 -0800 (PST)
Date: Thu, 7 Mar 2013 20:39:20 +0800
Message-ID: <CAOfEmZhBiH_dvvAGbOwy3GK=WZqaGpLaZP7pvFR0MZHkEexMhg@mail.gmail.com>
Subject: dhclient issue.
From: Marcelo Araujo <araujobsdport@gmail.com>
To: freebsd-net@freebsd.org
Content-Type: text/plain; charset=ISO-8859-1
X-Content-Filtered-By: Mailman/MimeDel 2.1.14
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
Reply-To: araujo@FreeBSD.org
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 07 Mar 2013 12:39:21 -0000

Hello Guys,

I ran into a problem with dhclient this week on 9.1-RELEASE.
The log is below:

[root@home ~]# uname -a
FreeBSD HOME 9.1-RELEASE FreeBSD 9.1-RELEASE #10: Tue Mar  5 18:57:14 CST
2013     root@home:/usr/src/sys/HOME.amd64  amd64


[root@home ~]# dhclient ix0
PID = 3276, PPID = 3274
fibnum = 0
fibcmd = setfib 0
interface = ix0
ifconfig: ioctl (SIOCAIFADDR): File exists
ix0: not found
exiting.

[root@home ~]# tail /var/log/messages
Mar 17 14:53:52 ESSD46B70 dhclient[3244]: exiting.
Mar 17 14:54:01 ESSD46B70 login: ROOT LOGIN (root) ON ttyv0
Mar 17 14:54:15 ESSD46B70 dhclient[3257]: ix0: not found
Mar 17 14:54:15 ESSD46B70 dhclient[3257]: exiting.
Mar 17 14:54:15 ESSD46B70 dhclient[3258]: connection closed
Mar 17 14:54:15 ESSD46B70 dhclient[3258]: exiting.
Mar 17 14:54:57 ESSD46B70 dhclient[3274]: ix0: not found

[root@home ~]# ifconfig ix0
ix0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=403bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,VLAN_HWTSO>
ether 00:08:9b:d4:6b:71
nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
media: Ethernet autoselect (10Gbase-T <full-duplex>)
status: active


I have another interface, em0, and it works properly there.
Any idea what is going on?

Best Regards,
-- 
Marcelo Araujo
araujo@FreeBSD.org

From owner-freebsd-net@FreeBSD.ORG  Thu Mar  7 13:38:21 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: net@freebsd.org
Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id 096565A9;
 Thu,  7 Mar 2013 13:38:21 +0000 (UTC)
 (envelope-from ermal.luci@gmail.com)
Received: from mail-qc0-x231.google.com (mail-qc0-x231.google.com
 [IPv6:2607:f8b0:400d:c01::231])
 by mx1.freebsd.org (Postfix) with ESMTP id 9D089A4D;
 Thu,  7 Mar 2013 13:38:20 +0000 (UTC)
Received: by mail-qc0-f177.google.com with SMTP id u28so142582qcs.36
 for <multiple recipients>; Thu, 07 Mar 2013 05:38:20 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:x-received:sender:in-reply-to:references:date
 :x-google-sender-auth:message-id:subject:from:to:cc:content-type;
 bh=NKMLvr9xiGvggkPzktzt8+mN1aKo8yIewmNrtHWScr0=;
 b=gBHB+impy4qYtLRSPCgHZKBtQA0x0+jBuy7pw/ijW24Zp3f6hy4HNnGDkbsnbWqeo7
 YCAC2abc/bsFKdlYpgtmd9+vhlYBxRHY8p+wTa1CoEGrnoqnSpdkTSBnCxPzhOtWCgQs
 NZUrZHmNonujaWueZMTtYLWU6AhMgdqv82Or8hMYW12kLGeE39vNi0NqO+Xs/OSWwyzG
 N1R8cvuo6PsAbBv4htXl5irLlkhXwUtQsaTXxZMywu4rRP4HXDwV6w6+4lEpj0gY3Lnw
 jAU7FD8X+5mFykswxPZXZQTCNR6994pN7py44HiCsnc1TZgWuOFR4H353ZYHGGUBznI8
 bDPA==
MIME-Version: 1.0
X-Received: by 10.49.120.225 with SMTP id lf1mr54103536qeb.14.1362663500045;
 Thu, 07 Mar 2013 05:38:20 -0800 (PST)
Sender: ermal.luci@gmail.com
Received: by 10.49.27.197 with HTTP; Thu, 7 Mar 2013 05:38:19 -0800 (PST)
In-Reply-To: <51388046.7040408@freebsd.org>
References: <513834E4.7050203@FreeBSD.org> <51384443.5070209@freebsd.org>
 <51387D4A.9030408@FreeBSD.org> <51388046.7040408@freebsd.org>
Date: Thu, 7 Mar 2013 14:38:19 +0100
X-Google-Sender-Auth: LaUBMKo0jFb4HxaoZviAR3UNSs0
Message-ID: <CAPBZQG3Of173MoyB-sPy=9RoivtKMRA7LZ0DYyEOSrsyk9_10A@mail.gmail.com>
Subject: Re: [patch] interface routes
From: =?ISO-8859-1?Q?Ermal_Lu=E7i?= <eri@freebsd.org>
To: Andre Oppermann <andre@freebsd.org>
Content-Type: text/plain; charset=ISO-8859-1
X-Content-Filtered-By: Mailman/MimeDel 2.1.14
Cc: "Alexander V. Chernikov" <melifaro@freebsd.org>, net@freebsd.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 07 Mar 2013 13:38:21 -0000

On Thu, Mar 7, 2013 at 12:55 PM, Andre Oppermann <andre@freebsd.org> wrote:

> On 07.03.2013 12:43, Alexander V. Chernikov wrote:
>
>> On 07.03.2013 11:39, Andre Oppermann wrote:
>>
>>> On 07.03.2013 07:34, Alexander V. Chernikov wrote:
>>>
>>>> Hello list!
>>>>
>>>> There is a known long-lived issue with interface routes
>>>> addition/deletion:
>>>>
>>>> ifconfig iface inet 1.2.3.4/24 can fail if given prefix is already in
>>>> kernel route table (for
>>>> example, advertised by IGP like OSPF).
>>>>
>>>> Interface route can be deleted via route(8) or any route socket user
>>>> (sometimes this happens with
>>>> popular opensource daemons like bird/quagga).
>>>>
>>>> Problem is reported at least in kern/106722 and kern/155772.
>>>>
>>>
>>> Your patch is a welcome addition.
>>>
>>>  This can be fixed the following way:
>>>> Immutable route flag (RTM_PINNED, added in 19995 with 'for future use'
>>>> comment) is utilised to mark
>>>> route 'immutable'.
>>>> rtrequest1_fib refuses to delete routes with given flag unless
>>>> RTM_PINNED is set in rti_flags.
>>>>
>>>
>>> How do the routing daemons react to being unable to change/delete
>>> such a route?
>>>
>> routing daemons have long lived with the fact that their route socket cmds can
>> fail (and there is the route(8) utility, which can do anything), so typically
>> bird/quagga yells like
>> 'bird: KRT: Error sending route 11.0.0.0/24 to kernel: File exists'
>> and marks given route as not installed in internal RIB. Additionally,
>> daemon will probably re-try to insert such routes on every periodic KRT
>> rescan (tens of minutes).
>>
>
>
Isn't it better to teach the routing code about metrics?
Routing daemons cope better this way and can handle this themselves.
Then the policy for this behaviour can be controlled by the administrator
rather than by code!
With metrics you can add interface routes with a higher metric and routes
from routing daemons with a lower one.

This could possibly also mitigate the problem of interfaces configured
with the same subnet.
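
As a rough sketch of the idea (user-space C, hypothetical names, and
assuming the convention that a lower metric is preferred), selection among
several routes for the same prefix could look like this:

```c
#include <assert.h>
#include <stddef.h>

/* One candidate route for a given prefix (hypothetical layout). */
struct toy_rt_entry {
    const char *source;   /* e.g. "interface" or "ospf" */
    unsigned metric;      /* lower value is preferred in this sketch */
};

/*
 * Return the candidate with the lowest metric, or NULL if there are
 * no candidates.  On a tie the earlier entry wins.
 */
const struct toy_rt_entry *toy_rt_best(const struct toy_rt_entry *cands, size_t n)
{
    assert(cands != NULL || n == 0);
    const struct toy_rt_entry *best = NULL;

    for (size_t i = 0; i < n; i++) {
        if (best == NULL || cands[i].metric < best->metric)
            best = &cands[i];
    }
    return best;
}
```

Both routes stay in the table; which one is used is decided by the metrics
the administrator assigns, and the other takes over if the preferred one is
withdrawn.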


> OK. No problem then.
>
>
>> Given that such situations usually happen for a very short time (e.g.
>> physical link flaps) everything should return to a normal state quickly.
>>
>>
>>> EADDRINUSE would likely be a more descriptive error instead of EPERM?
>>>
>> Well, not sure if EADDRINUSE is very descriptive for _deleting_ route.
>> "Yes, I know that it is in use so that's the reason I'm trying to delete
>> it".
>>
>
> I'm thinking of distinguishing it from a permission denial, because of
> insufficient rights (jail or something like that) vs. an explicitly
> pinned route.  With EPERM you may look for the problem in the wrong
> place.  E*INUSE is a common error for something can't be removed due
> to it still being used by or for something else.  Which is the case
> here and may be more appropriate.
>
>
>>>> Every interface address manipulation is done via rtinit[1], so
>>>> rtinit1() sets this flag (and behavior does not change here).
>>>>
>>>> Adding interface address is handled via atomically deleting old prefix
>>>> and adding interface one.
>>>>
>>>
>>> This brings up a long standing sore point of our routing code
>>> which this patch makes more pronounced.  When an interface link
>>> state is down I don't want the route to it to persist but to
>>> become inactive so another path can be chosen.  This is the very
>>> point of running a routing daemon.  So on the link-down event
>>> the installed interface routes should be removed from the routing
>>> table.  The configured addresses though should persist and the
>>> interface routes re-installed on a link-up event.  What's your
>>> opinion on it?
>>>
>> >
>
>> This is exactly what is done in current code for IPv4:
>> if_down calls if_unroute(), it calls prctlinput() for every interface
>> address, and domain-dependent function like rip_ctlinput calls
>> in_ifscrub() cleaning given interface route.
>> However, address route (/32) still remains (but route daemons, at least
>> bird, tends to ignore it since it is not listed as valid interface
>> address/mask).
>>
>
> IF_DOWN and link state down are not the same thing.  When the cable
> is unplugged the link state goes down but not the interface.
>
>
>  This is not done for IPv6 and we should probably do the same.
>>
>
> Yes, they should be synchronized.
>
>
>  Other than these points I think your code is fine and can go
>>> into the tree.
>>>
>>
> --
> Andre
>
>


-- 
Ermal

From owner-freebsd-net@FreeBSD.ORG  Thu Mar  7 13:51:14 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id 73EA88EE
 for <net@freebsd.org>; Thu,  7 Mar 2013 13:51:14 +0000 (UTC)
 (envelope-from andre@freebsd.org)
Received: from c00l3r.networx.ch (c00l3r.networx.ch [62.48.2.2])
 by mx1.freebsd.org (Postfix) with ESMTP id C5FC0AF6
 for <net@freebsd.org>; Thu,  7 Mar 2013 13:51:13 +0000 (UTC)
Received: (qmail 98301 invoked from network); 7 Mar 2013 15:04:41 -0000
Received: from c00l3r.networx.ch (HELO [127.0.0.1]) ([62.48.2.2])
 (envelope-sender <andre@freebsd.org>)
 by c00l3r.networx.ch (qmail-ldap-1.03) with SMTP
 for <eri@freebsd.org>; 7 Mar 2013 15:04:41 -0000
Message-ID: <51389B4B.1060003@freebsd.org>
Date: Thu, 07 Mar 2013 14:51:07 +0100
From: Andre Oppermann <andre@freebsd.org>
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64;
 rv:17.0) Gecko/20130107 Thunderbird/17.0.2
MIME-Version: 1.0
To: =?ISO-8859-1?Q?Ermal_Lu=E7i?= <eri@freebsd.org>
Subject: Re: [patch] interface routes
References: <513834E4.7050203@FreeBSD.org> <51384443.5070209@freebsd.org>
 <51387D4A.9030408@FreeBSD.org> <51388046.7040408@freebsd.org>
 <CAPBZQG3Of173MoyB-sPy=9RoivtKMRA7LZ0DYyEOSrsyk9_10A@mail.gmail.com>
In-Reply-To: <CAPBZQG3Of173MoyB-sPy=9RoivtKMRA7LZ0DYyEOSrsyk9_10A@mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 8bit
Cc: "Alexander V. Chernikov" <melifaro@freebsd.org>, net@freebsd.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 07 Mar 2013 13:51:14 -0000

On 07.03.2013 14:38, Ermal Luçi wrote:
> On Thu, Mar 7, 2013 at 12:55 PM, Andre Oppermann <andre@freebsd.org> wrote:
>
>     On 07.03.2013 12:43, Alexander V. Chernikov wrote:
>
>         On 07.03.2013 11:39, Andre Oppermann wrote:
>
>             On 07.03.2013 07:34, Alexander V. Chernikov wrote:
>
>                 Hello list!
>
>                 There is a known long-lived issue with interface routes
>                 addition/deletion:
>
>                 ifconfig iface inet 1.2.3.4/24 can fail if given prefix is
>                 already in
>                 kernel route table (for
>                 example, advertised by IGP like OSPF).
>
>                 Interface route can be deleted via route(8) or any route socket user
>                 (sometimes this happens with
>                 popular opensource daemons like bird/quagga).
>
>                 Problem is reported at least in kern/106722 and kern/155772.
>
>
>             Your patch is a welcome addition.
>
>                 This can be fixed the following way:
>                 Immutable route flag (RTM_PINNED, added in 19995 with 'for future use'
>                 comment) is utilised to mark
>                 route 'immutable'.
>                 rtrequest1_fib refuses to delete routes with given flag unless
>                 RTM_PINNED is set in rti_flags.
>
>
>             How do the routing daemons react to being unable to change/delete
>             such a route?
>
>         routing daemons have long lived with the fact that their route socket cmds can
>         fail (and there is the route(8) utility, which can do anything), so typically
>         bird/quagga yells like
>         'bird: KRT: Error sending route 11.0.0.0/24 to kernel: File exists'
>         and marks given route as not installed in internal RIB. Additionally,
>         daemon will probably re-try to insert such routes on every periodic KRT
>         rescan (tens of minutes).
>
>
>
> Isn't it better to teach the routing code about metrics?
> Routing daemons cope better this way and they can handle this.
> So the policy of this behaviour can be controlled by administrator rather than by code!
> With metrics you can add routes with bigger metric for interfaces and lower from routing daemons.
> This also can mitigate somehow on interfaces with the same subnet configured possibly.

Generally I agree with you that this would be the ideal outcome.
However we're still quite a bit away from reaching that goal.
To make this really work we have to make mpath plus metrics a first
class citizen in the routing code and also update the routing
daemons' kernel interfaces to know about this.  I hope we get there
in the not too distant future.

As a first step I think it is important that Alexander's patch goes
in to fix a long standing and very annoying problem with the code
we have.  Also the link-down route withdrawal should be added asap.
Then we can take the next steps towards the ultimate goal you describe.

I hope you do not object to Alexander's patch?

-- 
Andre


From owner-freebsd-net@FreeBSD.ORG  Thu Mar  7 13:55:58 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: net@freebsd.org
Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id E354DAE3;
 Thu,  7 Mar 2013 13:55:58 +0000 (UTC)
 (envelope-from melifaro@FreeBSD.org)
Received: from mail.ipfw.ru (unknown [IPv6:2a01:4f8:120:6141::2])
 by mx1.freebsd.org (Postfix) with ESMTP id 8808AB3D;
 Thu,  7 Mar 2013 13:55:58 +0000 (UTC)
Received: from [2a02:6b8:0:401:222:4dff:fe50:cd2f]
 (helo=dhcp170-36-red.yandex.net)
 by mail.ipfw.ru with esmtpsa (TLSv1:CAMELLIA256-SHA:256)
 (Exim 4.76 (FreeBSD)) (envelope-from <melifaro@FreeBSD.org>)
 id 1UDbLy-000CJ5-BD; Thu, 07 Mar 2013 17:59:26 +0400
Message-ID: <51389C29.8000407@FreeBSD.org>
Date: Thu, 07 Mar 2013 17:54:49 +0400
From: "Alexander V. Chernikov" <melifaro@FreeBSD.org>
User-Agent: Mozilla/5.0 (X11; FreeBSD amd64;
 rv:17.0) Gecko/17.0 Thunderbird/17.0
MIME-Version: 1.0
To: Andre Oppermann <andre@freebsd.org>
Subject: Re: [patch] interface routes
References: <513834E4.7050203@FreeBSD.org> <51384443.5070209@freebsd.org>
 <51387D4A.9030408@FreeBSD.org> <51388046.7040408@freebsd.org>
In-Reply-To: <51388046.7040408@freebsd.org>
X-Enigmail-Version: 1.4.6
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Cc: net@freebsd.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 07 Mar 2013 13:55:59 -0000

On 07.03.2013 15:55, Andre Oppermann wrote:
> On 07.03.2013 12:43, Alexander V. Chernikov wrote:
>> On 07.03.2013 11:39, Andre Oppermann wrote:
>>> On 07.03.2013 07:34, Alexander V. Chernikov wrote:
>>>> Hello list!
>>>>
>>>> There is a known long-lived issue with interface routes
>>>> addition/deletion:
>>>>
>>>> ifconfig iface inet 1.2.3.4/24 can fail if given prefix is already in
>>>> kernel route table (for
>>>> example, advertised by IGP like OSPF).
>>>>
>>>> Interface route can be deleted via route(8) or any route socket user
>>>> (sometimes this happens with
>>>> popular opensource daemons like bird/quagga).
>>>>
>>>> Problem is reported at least in kern/106722 and kern/155772.
>>>
>>> Your patch is a welcome addition.
>>>
>>>> This can be fixed the following way:
>>>> Immutable route flag (RTM_PINNED, added in 19995 with 'for future use'
>>>> comment) is utilised to mark
>>>> route 'immutable'.
>>>> rtrequest1_fib refuses to delete routes with given flag unless
>>>> RTM_PINNED is set in rti_flags.
>>>
>>> How do the routing daemons react to being unable to change/delete
>>> such a route?
>> routing daemons have long lived with the fact that their route socket cmds can
>> fail (and there is the route(8) utility, which can do anything), so typically
>> bird/quagga yells like
>> 'bird: KRT: Error sending route 11.0.0.0/24 to kernel: File exists'
>> and marks given route as not installed in internal RIB. Additionally,
>> daemon will probably re-try to insert such routes on every periodic KRT
>> rescan (tens of minutes).
> 
> OK. No problem then.
> 
>> Given that such situations usually happen for a very short time (e.g.
>> physical link flaps) everything should return to a normal state quickly.
>>
>>>
>>> EADDRINUSE would likely be a more descriptive error instead of EPERM?
>> Well, not sure if EADDRINUSE is very descriptive for _deleting_ route.
>> "Yes, I know that it is in use so that's the reason I'm trying to delete
>> it".
OK.
> 
> I'm thinking of distinguishing it from a permission denial, because of
> insufficient rights (jail or something like that) vs. an explicitly
> pinned route.  With EPERM you may look for the problem in the wrong
> place.  E*INUSE is a common error for something can't be removed due
> to it still being used by or for something else.  Which is the case
> here and may be more appropriate.
> 
>>>> Every interface address manipulation is done via rtinit[1], so
>>>> rtinit1() sets this flag (and behavior does not change here).
>>>>
>>>> Adding interface address is handled via atomically deleting old prefix
>>>> and adding interface one.
>>>
>>> This brings up a long standing sore point of our routing code
>>> which this patch makes more pronounced.  When an interface link
>>> state is down I don't want the route to it to persist but to
>>> become inactive so another path can be chosen.  This is the very
>>> point of running a routing daemon.  So on the link-down event
>>> the installed interface routes should be removed from the routing
>>> table.  The configured addresses though should persist and the
>>> interface routes re-installed on a link-up event.  What's your
>>> opinion on it?
>>
>> This is exactly what is done in current code for IPv4:
>> if_down calls if_unroute(), it calls prctlinput() for every interface
>> address, and domain-dependent function like rip_ctlinput calls
>> in_ifscrub() cleaning given interface route.
>> However, address route (/32) still remains (but route daemons, at least
>> bird, tends to ignore it since it is not listed as valid interface
>> address/mask).
> 
> IF_DOWN and link state down are not the same thing.  When the cable
> is unplugged the link state goes down but not the interface.
Oops, I missed the 'link' keyword.
IMHO 'operational down' should behave exactly the same as 'admin down',
i.e. delete interface routes from the route table.
It should not be very hard to do.
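
A minimal user-space model of that behaviour, assuming interface routes
are installed only while the interface is both administratively up and
has link (all names here are hypothetical, this is not kernel code):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAXADDRS 4

struct toy_ifaddr {
    const char *prefix;      /* configured prefix, kept across link flaps */
    bool route_installed;    /* is the interface route in the table? */
};

struct toy_ifnet {
    bool admin_up;           /* 'ifconfig up/down' state */
    bool link_up;            /* operational (cable) state */
    size_t naddrs;
    struct toy_ifaddr addrs[TOY_MAXADDRS];
};

/*
 * Install or withdraw interface routes so they match the current state.
 * Addresses are never removed.  Returns the number of installed routes.
 */
size_t toy_if_sync_routes(struct toy_ifnet *ifp)
{
    assert(ifp != NULL && ifp->naddrs <= TOY_MAXADDRS);
    bool usable = ifp->admin_up && ifp->link_up;
    size_t installed = 0;

    for (size_t i = 0; i < ifp->naddrs; i++) {
        ifp->addrs[i].route_installed = usable;
        if (usable)
            installed++;
    }
    return installed;
}

/* Link state event: record the new state and resynchronise the routes. */
size_t toy_if_link_state_change(struct toy_ifnet *ifp, bool link_up)
{
    assert(ifp != NULL);
    ifp->link_up = link_up;
    return toy_if_sync_routes(ifp);
}
```

Because the configured addresses stay on the interface, a link-up event can
re-install the same interface routes without any help from user space.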

> 
>> This is not done for IPv6 and we should probably do the same.
> 
> Yes, they should be synchronized.
> 
>>> Other than these points I think your code is fine and can go
>>> into the tree.
> 


-- 
WBR, Alexander

From owner-freebsd-net@FreeBSD.ORG  Thu Mar  7 14:03:47 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id 5A8E1DE9
 for <net@freebsd.org>; Thu,  7 Mar 2013 14:03:47 +0000 (UTC)
 (envelope-from andre@freebsd.org)
Received: from c00l3r.networx.ch (c00l3r.networx.ch [62.48.2.2])
 by mx1.freebsd.org (Postfix) with ESMTP id C63D8BF7
 for <net@freebsd.org>; Thu,  7 Mar 2013 14:03:46 +0000 (UTC)
Received: (qmail 98964 invoked from network); 7 Mar 2013 15:17:08 -0000
Received: from c00l3r.networx.ch (HELO [127.0.0.1]) ([62.48.2.2])
 (envelope-sender <andre@freebsd.org>)
 by c00l3r.networx.ch (qmail-ldap-1.03) with SMTP
 for <melifaro@FreeBSD.org>; 7 Mar 2013 15:17:08 -0000
Message-ID: <51389E36.3020104@freebsd.org>
Date: Thu, 07 Mar 2013 15:03:34 +0100
From: Andre Oppermann <andre@freebsd.org>
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64;
 rv:17.0) Gecko/20130107 Thunderbird/17.0.2
MIME-Version: 1.0
To: "Alexander V. Chernikov" <melifaro@FreeBSD.org>
Subject: Re: [patch] interface routes
References: <513834E4.7050203@FreeBSD.org> <51384443.5070209@freebsd.org>
 <51387D4A.9030408@FreeBSD.org> <51388046.7040408@freebsd.org>
 <51389C29.8000407@FreeBSD.org>
In-Reply-To: <51389C29.8000407@FreeBSD.org>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: net@freebsd.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 07 Mar 2013 14:03:47 -0000

On 07.03.2013 14:54, Alexander V. Chernikov wrote:
> On 07.03.2013 15:55, Andre Oppermann wrote:
>> On 07.03.2013 12:43, Alexander V. Chernikov wrote:
>>> On 07.03.2013 11:39, Andre Oppermann wrote:
>>>> This brings up a long standing sore point of our routing code
>>>> which this patch makes more pronounced.  When an interface link
>>>> state is down I don't want the route to it to persist but to
>>>> become inactive so another path can be chosen.  This is the very
>>>> point of running a routing daemon.  So on the link-down event
>>>> the installed interface routes should be removed from the routing
>>>> table.  The configured addresses though should persist and the
>>>> interface routes re-installed on a link-up event.  What's your
>>>> opinion on it?
>>>
>>> This is exactly what is done in current code for IPv4:
>>> if_down calls if_unroute(), it calls prctlinput() for every interface
>>> address, and domain-dependent function like rip_ctlinput calls
>>> in_ifscrub() cleaning given interface route.
>>> However, address route (/32) still remains (but route daemons, at least
>>> bird, tends to ignore it since it is not listed as valid interface
>>> address/mask).
>>
>> IF_DOWN and link state down are not the same thing.  When the cable
>> is unplugged the link state goes down but not the interface.
>
> Ups. I've missed 'link' keyword.
> Imho 'operational down' should behave exactly the same as 'admin down'
> e.g. delete interface routes from route table.
> It should be not very hard to do.

Are you going to implement it after the pinning patch? ;-)

-- 
Andre


From owner-freebsd-net@FreeBSD.ORG  Thu Mar  7 14:10:59 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id 9FDF9164;
 Thu,  7 Mar 2013 14:10:59 +0000 (UTC)
 (envelope-from melifaro@ipfw.ru)
Received: from mail.ipfw.ru (unknown [IPv6:2a01:4f8:120:6141::2])
 by mx1.freebsd.org (Postfix) with ESMTP id 66FA3CD0;
 Thu,  7 Mar 2013 14:10:59 +0000 (UTC)
Received: from [213.87.139.85] (helo=[10.231.93.102])
 by mail.ipfw.ru with esmtpsa (TLSv1:AES128-SHA:128)
 (Exim 4.76 (FreeBSD)) (envelope-from <melifaro@ipfw.ru>)
 id 1UDbaO-000CPP-16; Thu, 07 Mar 2013 18:14:27 +0400
References: <513834E4.7050203@FreeBSD.org> <51384443.5070209@freebsd.org>
 <51387D4A.9030408@FreeBSD.org> <51388046.7040408@freebsd.org>
 <51389C29.8000407@FreeBSD.org> <51389E36.3020104@freebsd.org>
Mime-Version: 1.0 (1.0)
In-Reply-To: <51389E36.3020104@freebsd.org>
Content-Type: text/plain;
	charset=us-ascii
Content-Transfer-Encoding: 7bit
Message-Id: <0B3FF217-306D-485D-A332-C57B1D7D2F4F@ipfw.ru>
X-Mailer: iPhone Mail (10B146)
From: "Alexander V. Chernikov" <melifaro@ipfw.ru>
Subject: Re: [patch] interface routes
Date: Thu, 7 Mar 2013 18:11:44 +0400
To: Andre Oppermann <andre@freebsd.org>
Cc: "Alexander V. Chernikov" <melifaro@FreeBSD.org>,
 "net@freebsd.org" <net@freebsd.org>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 07 Mar 2013 14:10:59 -0000

It seems I have no choice :)


WBR, Alexander

On 07.03.2013, at 18:03, Andre Oppermann <andre@freebsd.org> wrote:

> On 07.03.2013 14:54, Alexander V. Chernikov wrote:
>> On 07.03.2013 15:55, Andre Oppermann wrote:
>>> On 07.03.2013 12:43, Alexander V. Chernikov wrote:
>>>> On 07.03.2013 11:39, Andre Oppermann wrote:
>>>>> This brings up a long standing sore point of our routing code
>>>>> which this patch makes more pronounced.  When an interface link
>>>>> state is down I don't want the route to it to persist but to
>>>>> become inactive so another path can be chosen.  This is the very
>>>>> point of running a routing daemon.  So on the link-down event
>>>>> the installed interface routes should be removed from the routing
>>>>> table.  The configured addresses though should persist and the
>>>>> interface routes re-installed on a link-up event.  What's your
>>>>> opinion on it?
>>>> 
>>>> This is exactly what is done in current code for IPv4:
>>>> if_down calls if_unroute(), which calls prctlinput() for every interface
>>>> address, and a domain-dependent function like rip_ctlinput calls
>>>> in_ifscrub(), removing the given interface route.
>>>> However, the address route (/32) still remains (but routing daemons, at
>>>> least bird, tend to ignore it since it is not listed as a valid interface
>>>> address/mask).
>>> 
>>> IF_DOWN and link state down are not the same thing.  When the cable
>>> is unplugged the link state goes down but not the interface.
>>
>> Oops. I missed the 'link' keyword.
>> IMHO 'operational down' should behave exactly the same as 'admin down',
>> i.e. delete interface routes from the route table.
>> It should not be very hard to do.
> 
> Are you going to implement it after the pinning patch? ;-)
> 
> -- 
> Andre
> 
> 

From owner-freebsd-net@FreeBSD.ORG  Thu Mar  7 15:22:21 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id 87AFAB8D
 for <freebsd-net@freebsd.org>; Thu,  7 Mar 2013 15:22:21 +0000 (UTC)
 (envelope-from milu@dat.pl)
Received: from jab.dat.pl (dat.pl [80.51.155.34])
 by mx1.freebsd.org (Postfix) with ESMTP id F075EE7
 for <freebsd-net@freebsd.org>; Thu,  7 Mar 2013 15:22:20 +0000 (UTC)
Received: from jab.dat.pl (jsrv.dat.pl [127.0.0.1])
 by jab.dat.pl (Postfix) with ESMTP id 46F9C123;
 Thu,  7 Mar 2013 16:13:49 +0100 (CET)
X-Virus-Scanned: amavisd-new at dat.pl
Received: from jab.dat.pl ([127.0.0.1])
 by jab.dat.pl (jab.dat.pl [127.0.0.1]) (amavisd-new, port 10024)
 with LMTP id cGJcUAJLhwyx; Thu,  7 Mar 2013 16:13:44 +0100 (CET)
Received: from [10.0.6.80] (unknown [212.69.68.42])
 (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits))
 (No client certificate requested)
 by jab.dat.pl (Postfix) with ESMTPSA id D73A22D;
 Thu,  7 Mar 2013 16:13:43 +0100 (CET)
Message-ID: <5138AED9.1020801@dat.pl>
Date: Thu, 07 Mar 2013 16:14:33 +0100
From: Maciej Milewski <milu@dat.pl>
User-Agent: Mozilla/5.0 (X11; Linux x86_64;
 rv:17.0) Gecko/20130221 Thunderbird/17.0.3
MIME-Version: 1.0
To: freebsd-net <fbsdmail@dnswatch.com>
Subject: Re: Implementing IP6 in 8.3
References: <b77c4b60019d745d151be9ba3e5446cc.authenticated@ultimatedns.net>
In-Reply-To: <b77c4b60019d745d151be9ba3e5446cc.authenticated@ultimatedns.net>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-Content-Filtered-By: Mailman/MimeDel 2.1.14
Cc: freebsd-net <freebsd-net@freebsd.org>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 07 Mar 2013 15:22:21 -0000

On 06.03.2013 22:02, freebsd-net wrote:
> Greetings,
>   I'm evaluating an ISP for the sake of building BSD operating systems on hardware
> that they use (DSL modems, in this case). When I had my old NEC server, I had a
> MIPS environment to develop in. I managed a 28k kernel. In any case, I'm back at
> it for use in a lot of hardware I have lying around. In my current situation, I'm
> using a ZYXEL Q1000Z modem to connect to their service. While it's a relatively
> new modem, it doesn't support IP6. It is my hope to replace the OS with one that
> does. :)
If it doesn't support IPv6 you can always try to use it in Transparent 
Bridging (RFC1483) mode. 
<http://qwest.centurylink.com/internethelp/modem-q1000z-setup-bridge.html> 
You can then put other router/computer that does IPv6 routing just after 
that modem.
> I leased a /48 of IP4's from them, which /also/ came with as many IP6's.
> So, not having implemented IP6 on any of my boxes (except by way of tunnel brokers),
> I'm wondering 2 things:
> If my underlying OS (FreeBSD-8.3) can support IP6, will it still function, even tho
> my gateway (modem) doesn't?
> Am I /correctly/ attempting to use it?
> I'm answering authoritatively for the many domains I own. They have all functioned
> well for many years via IP4. I have added the requisite AAAA records in all the zones,
> as well as the associated RR's.
> While the gateway (modem) /does/ have an IP6 address, I can't "speak" for it out of
> DNS, because it would be an "out of zone" record. Even tho I'm the RP for the /48.
> So it's up to the modem to answer accordingly.
> BUT, I'm not sure I'm initiating any of this correctly via rc(8). Or more specifically,
> via rc.conf(5). While I've read as much as I can find on the topic related to BSD,
> boot messages indicate at least -- "IP6 gateway unreachable".
> I'm currently using:
> rc.conf(5):
> ipv6_ifconfig_re0="2602:00d1:b4d6:e100:0000:0000:0000:0000"
> ipv6_defaultrouter="2602:00d1:b4d6:e600:0000:0000:0000:0000"
> I also have the corresponding host IP in hosts(5).
>
> Any help, pointers, guidance, answers /greatly/ appreciated.
>
> Thank you for all your time, and consideration.
>
> --Chris
>
> _______________________________________________
> freebsd-net@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"

-- 
Pozdrawiam,
Maciej Milewski
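
One possible reading of the "IP6 gateway unreachable" message quoted above is
that the configured default router is not on the same link prefix as the
interface address. The C sketch below only compares prefixes; it assumes a
/64 on-link prefix, which the original post does not state, and the function
name same_prefix6 is made up for illustration.

```c
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <string.h>

/*
 * Return 1 if a and b share the same first prefixlen bits, 0 if they
 * do not, and -1 if either string is not a valid IPv6 address or the
 * prefix length is out of range.
 */
int
same_prefix6(const char *a, const char *b, unsigned prefixlen)
{
	struct in6_addr x, y;
	unsigned full = prefixlen / 8;	/* whole bytes covered by the prefix */
	unsigned rest = prefixlen % 8;	/* leftover bits in the next byte */

	if (prefixlen > 128 ||
	    inet_pton(AF_INET6, a, &x) != 1 ||
	    inet_pton(AF_INET6, b, &y) != 1)
		return (-1);
	if (memcmp(x.s6_addr, y.s6_addr, full) != 0)
		return (0);
	if (rest != 0) {
		/* Compare only the high "rest" bits of the partial byte. */
		unsigned char mask = (unsigned char)(0xff << (8 - rest));

		if ((x.s6_addr[full] & mask) != (y.s6_addr[full] & mask))
			return (0);
	}
	return (1);
}
```

Called with the two addresses from the quoted rc.conf(5) lines, the check
matches at prefix length 48 but not at 64, so under the /64 assumption the
router would not be directly reachable from re0. Whether that is the actual
cause here is not confirmed by the thread.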


From owner-freebsd-net@FreeBSD.ORG  Thu Mar  7 15:35:49 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: net@freebsd.org
Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id 49182F75;
 Thu,  7 Mar 2013 15:35:49 +0000 (UTC)
 (envelope-from melifaro@FreeBSD.org)
Received: from mail.ipfw.ru (unknown [IPv6:2a01:4f8:120:6141::2])
 by mx1.freebsd.org (Postfix) with ESMTP id E1A5F160;
 Thu,  7 Mar 2013 15:35:48 +0000 (UTC)
Received: from dhcp170-36-red.yandex.net ([95.108.170.36])
 by mail.ipfw.ru with esmtpsa (TLSv1:CAMELLIA256-SHA:256)
 (Exim 4.76 (FreeBSD)) (envelope-from <melifaro@FreeBSD.org>)
 id 1UDcub-000Cxl-8q; Thu, 07 Mar 2013 19:39:17 +0400
Message-ID: <5138B390.2080806@FreeBSD.org>
Date: Thu, 07 Mar 2013 19:34:40 +0400
From: "Alexander V. Chernikov" <melifaro@FreeBSD.org>
User-Agent: Mozilla/5.0 (X11; FreeBSD amd64;
 rv:17.0) Gecko/17.0 Thunderbird/17.0
MIME-Version: 1.0
To: Andre Oppermann <andre@freebsd.org>
Subject: Re: [patch] interface routes
References: <513834E4.7050203@FreeBSD.org> <51384443.5070209@freebsd.org>
 <51387D4A.9030408@FreeBSD.org> <51388046.7040408@freebsd.org>
 <CAPBZQG3Of173MoyB-sPy=9RoivtKMRA7LZ0DYyEOSrsyk9_10A@mail.gmail.com>
 <51389B4B.1060003@freebsd.org>
In-Reply-To: <51389B4B.1060003@freebsd.org>
X-Enigmail-Version: 1.4.6
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 8bit
Cc: =?ISO-8859-1?Q?Ermal_Lu=E7i?= <eri@freebsd.org>, net@freebsd.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 07 Mar 2013 15:35:49 -0000

On 07.03.2013 17:51, Andre Oppermann wrote:
> On 07.03.2013 14:38, Ermal Luçi wrote:
>> On Thu, Mar 7, 2013 at 12:55 PM, Andre Oppermann <andre@freebsd.org
>> <mailto:andre@freebsd.org>> wrote:
>>
>>     On 07.03.2013 12:43, Alexander V. Chernikov wrote:
>>
>>         On 07.03.2013 11:39, Andre Oppermann wrote:
>>
>>             On 07.03.2013 07:34, Alexander V. Chernikov wrote:
>>
>>                 Hello list!
>>
>>                 There is a known long-lived issue with interface routes
>>                 addition/deletion:
>>
>>                 ifconfig iface inet 1.2.3.4/24 can fail if the given
>>                 prefix is already in the kernel route table (for
>>                 example, advertised by an IGP like OSPF).
>>
>>                 Interface route can be deleted via route(8) or any
>> route socket user
>>                 (sometimes this happens with
>>                 popular opensource daemons like bird/quagga).
>>
>>                 Problem is reported at least in kern/106722 and
>> kern/155772.
>>
>>
>>             Your patch is a welcome addition.
>>
>>                 This can be fixed the following way:
>>                 The immutable route flag (RTF_PINNED, added in 1995 with
>>                 a 'for future use' comment) is used to mark a route
>>                 'immutable'.
>>                 rtrequest1_fib refuses to delete routes with this flag
>>                 unless RTF_PINNED is set in rti_flags.
>>
>>
>>             How do the routing daemons react to being unable to
>> change/delete
>>             such a route?
>>
>>         Routing daemons have long lived with the fact that their route
>>         socket commands can fail (and there is the route(8) utility,
>>         which can do anything), so typically bird/quagga complains like
>>         'bird: KRT: Error sending route 11.0.0.0/24 to kernel: File exists'
>>         and marks the given route as not installed in its internal RIB.
>>         Additionally, the daemon will probably retry inserting such
>>         routes on every periodic KRT rescan (tens of minutes).
>>
>>
>>
>> Isn't it better to teach the routing code about metrics?
>> Routing daemons cope better this way and they can handle this.
>> So the policy of this behaviour can be controlled by the administrator
>> rather than by code!
>> With metrics you can add routes with a bigger metric for interfaces and
>> a lower one from routing daemons.
>> This could possibly also help with interfaces that have the same
>> subnet configured.
> 
> Generally I agree with you that this would be the ideal outcome.
> However we're still quite a bit away from reaching that goal.
> To make this really work we have to make mpath plus metrics a first
> class citizen in the routing code and also update the routing
> daemons' kernel interfaces to know about this.  I hope we get there
> in the not too distant future.
Radix is already over-bloated. Typically in performance-oriented
solutions (hardware/software routers from vendors) there is a clear
separation between the RIB (where route protocol attributes, best candidate
routes, and routes with different priorities exist) and the FIB, which is
typically some kind of radix with the minimum needed info, e.g.:
prefix, nexthops, their interfaces, optional L2 data to prepend.

Our radix stands somewhere between RIB and FIB (since we have to support
route(8) and upper layer protocols): it serves badly as a RIB (too little
functionality) and as a FIB (too much overhead and inefficient, overly
general code).

For example, sizeof(rt_nodes[2]) (first element of rte) is 96 bytes on
amd64.

Additionally, rte refcount approach is totally broken.

I'm currently thinking of adding some kind of hooks to current
route/radix code to permit building efficient trie (or other structure)
for given address family and to use it for forwarding purposes only.

For example, I don't need trie while doing MPLS label switching:
assuming control plane allocates contiguous label space, I can use label
array for efficient lookup.
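
A minimal C sketch of that label-array idea, assuming the control plane
hands out one contiguous label range. The struct and function names here
(nhop, label_table, label_lookup) are hypothetical and not taken from any
existing FreeBSD code.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical next-hop record; the fields are illustrative only. */
struct nhop {
	uint32_t out_label;	/* label to swap in */
	int	 out_ifindex;	/* outgoing interface index */
	int	 valid;		/* nonzero if the slot is in use */
};

/* Hypothetical table covering labels [base, base + count). */
struct label_table {
	uint32_t     base;
	uint32_t     count;
	struct nhop *slots;
};

/* O(1) lookup: index the array directly, with no trie walk. */
const struct nhop *
label_lookup(const struct label_table *t, uint32_t label)
{
	uint32_t idx;

	if (label < t->base || label - t->base >= t->count)
		return (NULL);		/* outside the allocated range */
	idx = label - t->base;
	return (t->slots[idx].valid ? &t->slots[idx] : NULL);
}
```

The lookup cost is one bounds check and one array index, which is the
contrast with a general radix walk being drawn here.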


> 
> As a first step I think it is important that Alexander's patch goes
> in to fix a long standing and very annoying problem with the code
> we have.  Also the link down route withdraw should be added asap.
> Then we can take the next steps towards the ultimate goal you describe.
> 
> I hope you do not object to Alexander's patch?
> 


-- 
WBR, Alexander
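
For readers following the pinning proposal quoted earlier in this message,
the delete rule it describes can be sketched roughly as below. This is not
the patch itself: the types, the SK_PINNED constant and the EPERM return
value are all assumptions made for illustration.

```c
#include <errno.h>

#define SK_PINNED 0x1	/* hypothetical stand-in for the pinned flag */

struct sk_route   { int flags; };	/* an installed route */
struct sk_request { int flags; };	/* a delete request (rti_flags analogue) */

/*
 * Return 0 if the delete may proceed, or an error if the route is
 * pinned and the caller did not pass the pinned flag itself.
 */
int
sk_may_delete(const struct sk_route *rt, const struct sk_request *req)
{
	if ((rt->flags & SK_PINNED) != 0 && (req->flags & SK_PINNED) == 0)
		return (EPERM);	/* assumed error code, not from the patch */
	return (0);
}
```

The point of the rule is that only code which knows about the flag, such as
the interface address removal path, can remove an interface route, while an
ordinary route socket delete from a daemon fails.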

From owner-freebsd-net@FreeBSD.ORG  Thu Mar  7 16:40:01 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@smarthost.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id 5C20CE43
 for <freebsd-net@smarthost.ysv.freebsd.org>;
 Thu,  7 Mar 2013 16:40:01 +0000 (UTC)
 (envelope-from gnats@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org
 [IPv6:2001:1900:2254:206c::16:87])
 by mx1.freebsd.org (Postfix) with ESMTP id 3E21363B
 for <freebsd-net@smarthost.ysv.freebsd.org>;
 Thu,  7 Mar 2013 16:40:01 +0000 (UTC)
Received: from freefall.freebsd.org (localhost [127.0.0.1])
 by freefall.freebsd.org (8.14.6/8.14.6) with ESMTP id r27Ge1C5014140
 for <freebsd-net@freefall.freebsd.org>; Thu, 7 Mar 2013 16:40:01 GMT
 (envelope-from gnats@freefall.freebsd.org)
Received: (from gnats@localhost)
 by freefall.freebsd.org (8.14.6/8.14.6/Submit) id r27Ge14V014139;
 Thu, 7 Mar 2013 16:40:01 GMT (envelope-from gnats)
Date: Thu, 7 Mar 2013 16:40:01 GMT
Message-Id: <201303071640.r27Ge14V014139@freefall.freebsd.org>
To: freebsd-net@FreeBSD.org
Cc: 
From: Gleb Smirnoff <glebius@FreeBSD.org>
Subject: Re: kern/176667: libalias locks on uninitalized data
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
Reply-To: Gleb Smirnoff <glebius@FreeBSD.org>
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 07 Mar 2013 16:40:01 -0000

The following reply was made to PR kern/176667; it has been noted by GNATS.

From: Gleb Smirnoff <glebius@FreeBSD.org>
To: Lutz Donnerhacke <lutz@iks-service.de>
Cc: freebsd-gnats-submit@FreeBSD.org
Subject: Re: kern/176667: libalias locks on uninitalized data
Date: Thu, 7 Mar 2013 20:30:26 +0400

 On Tue, Mar 05, 2013 at 03:54:50PM +0000, Lutz Donnerhacke wrote:
 L> 
 L> >Number:         176667
 L> >Category:       kern
 L> >Synopsis:       libalias locks on uninitalized data
 L> >Confidential:   no
 L> >Severity:       non-critical
 L> >Priority:       low
 L> >Responsible:    freebsd-bugs
 L> >State:          open
 L> >Quarter:        
 L> >Keywords:       
 L> >Date-Required:
 L> >Class:          sw-bug
 L> >Submitter-Id:   current-users
 L> >Arrival-Date:   Tue Mar 05 16:00:00 UTC 2013
 L> >Closed-Date:
 L> >Last-Modified:
 L> >Originator:     Lutz Donnerhacke
 L> >Release:        FreeBSD 8.3-RELEASE (GENERIC)
 L> >Organization:
 L> IKS Service GmbH
 L> >Environment:
 L> FreeBSD server7.net.encoline.de 8.3-RELEASE FreeBSD 8.3-RELEASE #0: Mon Apr  9 21:23:18 UTC 2012     root@mason.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC  amd64
 L> 
 L> >Description:
 L> While testing terminating a huge number of PPPoX clients the kernel panics while doing in-kernel NAT.
 L> 
 L> #4 0xffffffff808e8775 at calltrap+0x8
 L> #5 0xffffffff80fa0f01 at HouseKeeping+0xa1
 L> #6 0xffffffff80f9e6ab at LibAliasOutLocked+0x3b
 L> 
 L> Please note, that the stack trace is incomplete. There are calls to IncrementalCleanup() and DeleteLink(), which are not reported in the stack trace.
 L> 
 L> The problem seems to come from incorrect locking, so the contents of the libalias database get corrupted.
 L> 
 L> This patch might be not the full solution, but is an obvious fix for an obvious bug.
 L> >How-To-Repeat:
 L> Set up ipfw nat, add more than 9000 clients using mpd5.6, generate traffic
 L> >Fix:
 L> --- sys/netinet/libalias/alias_db.c.ORIG        2013-03-05 16:49:13.000000000 +0100
 L> +++ sys/netinet/libalias/alias_db.c     2013-03-05 16:50:09.000000000 +0100
 L> @@ -2767,8 +2767,8 @@
 L>         struct ip_fw rule;      /* On-the-fly built rule */
 L>         int fwhole;             /* Where to punch hole */
 L> 
 L> -       LIBALIAS_LOCK_ASSERT(la);
 L>         la = lnk->la;
 L> +       LIBALIAS_LOCK_ASSERT(la);
 L> 
 L>  /* Don't do anything unless we are asked to */
 L>         if (!(la->packetAliasMode & PKT_ALIAS_PUNCH_FW) ||
 L> @@ -2841,8 +2841,8 @@
 L>  {
 L>         struct libalias *la;
 L> 
 L> -       LIBALIAS_LOCK_ASSERT(la);
 L>         la = lnk->la;
 L> +       LIBALIAS_LOCK_ASSERT(la);
 L>         if (lnk->link_type == LINK_TCP) {
 L>                 int fwhole = lnk->data.tcp->fwhole;     /* Where is the firewall
 L>                                                          * hole? */
 
 Neither the code being edited nor the patch is correct.
 
 Firewall punching isn't supported when libalias is compiled into the kernel.
 
 The LIBALIAS_LOCK_ASSERT(la) on an uninitialized variable wouldn't even
 pass the compiler if the entire firewall punching code were enabled.
 
 So these lines just need to be removed for sanity. Unfortunately this isn't
 related to the panic you are hitting.
 
 Do you have cores of that panic?
 
 -- 
 Totus tuus, Glebius.
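
The ordering problem discussed in this PR can be shown in isolation with a
small C sketch. It uses made-up types and a made-up assert macro
(sk_alias, sk_link, SK_LOCK_ASSERT), not the libalias ones, and it only
illustrates why an assertion must come after the pointer is assigned; it
says nothing about the panic itself.

```c
#include <assert.h>
#include <stddef.h>

struct sk_alias { int locked; };		/* stand-in for the instance */
struct sk_link  { struct sk_alias *la; };	/* stand-in for a link entry */

/* Hypothetical lock assertion: the instance must exist and be locked. */
#define SK_LOCK_ASSERT(la) assert((la) != NULL && (la)->locked)

int
sk_clear_hole(struct sk_link *lnk)
{
	struct sk_alias *la;

	/*
	 * Asserting before this assignment would read an indeterminate
	 * pointer, which is undefined behaviour in C.
	 */
	la = lnk->la;
	SK_LOCK_ASSERT(la);
	return (la->locked);
}
```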

From owner-freebsd-net@FreeBSD.ORG  Thu Mar  7 16:54:35 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id EFFE553F;
 Thu,  7 Mar 2013 16:54:34 +0000 (UTC)
 (envelope-from ncrogers@gmail.com)
Received: from mail-wg0-x22a.google.com (mail-wg0-x22a.google.com
 [IPv6:2a00:1450:400c:c00::22a])
 by mx1.freebsd.org (Postfix) with ESMTP id 6AED76F9;
 Thu,  7 Mar 2013 16:54:34 +0000 (UTC)
Received: by mail-wg0-f42.google.com with SMTP id 12so7028656wgh.5
 for <multiple recipients>; Thu, 07 Mar 2013 08:54:33 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:x-received:in-reply-to:references:date:message-id
 :subject:from:to:cc:content-type;
 bh=3nlPI5HskgFaltsThx3r8qsuhy086Tj8i7ftUYOgtDc=;
 b=MJCls8w9vapM/LQkmwU/F2WCPahpdpx79TBxZhsa82yVHS4+De+UYPTprQQRC3gc/O
 IciommG+Xefjgryf9ffkY03e7EmsLTxwIXcNbxO9+du/bThXXDhs9Cx5GZr4ruBsvNp+
 tBNHMECL1ZMlbUh5RpzGxKdR+xuug1pabyMFKUm9udMQKDJzb2ji4ZDGSqfWgcZjSlpN
 uanQwVC3sN81d1HV8HP1oJkro48SHV5GKOXoyImeJvGDAx2e5YYdzF39kdWXNuyIBYtd
 b18armTxasrbraE++gSMAgcE4dFGrYdRSeLMNZes4c1F2w/3lvdwWASQOfXSYyfuoBNn
 L+yg==
MIME-Version: 1.0
X-Received: by 10.180.81.164 with SMTP id b4mr34955782wiy.34.1362675273327;
 Thu, 07 Mar 2013 08:54:33 -0800 (PST)
Received: by 10.194.110.195 with HTTP; Thu, 7 Mar 2013 08:54:33 -0800 (PST)
In-Reply-To: <51378A9D.6080306@freebsd.org>
References: <CAJ-VmokO85PaW6DksfqsBuSdMPUEGv9f_2qEMAD7WTv4QK0vcw@mail.gmail.com>
 <51378A9D.6080306@freebsd.org>
Date: Thu, 7 Mar 2013 08:54:33 -0800
Message-ID: <CAKOb=YYPzSfUmcmmvE0fy-2QhzUK5hZUVUEdJs3uho2a6iGz+g@mail.gmail.com>
Subject: Re: Default route changes unexpectedly #2 (was Re: kernel:
 arpresolve: can't allocate llinfo for 65.59.233.102)
From: Nick Rogers <ncrogers@gmail.com>
To: Andre Oppermann <andre@freebsd.org>
Content-Type: text/plain; charset=ISO-8859-1
Cc: "freebsd-net@freebsd.org" <freebsd-net@freebsd.org>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 07 Mar 2013 16:54:35 -0000

I'm not sure. I have not explicitly enabled/disabled it. I am using
the GENERIC kernel from 9.1 plus PF+ALTQ.

# sysctl net.inet.flowtable.enable
sysctl: unknown oid 'net.inet.flowtable.enable'
# sysctl -a | grep flow
kern.sigqueue.overflow: 0
net.inet.tcp.reass.overflows: 0
net.inet6.ip6.auto_flowlabel: 1

uname -v
FreeBSD 9.1-RELEASE #0 r245436M: Mon Jan 14 16:34:21 EST 2013
root@fbsd_91:/usr/obj/usr/src/sys/CUSTOM

8.0 release notes say flowtable is enabled by default on amd64/i386.
So I presume it is enabled? I can't seem to find much information
about this for FreeBSD 9.x

On Wed, Mar 6, 2013 at 10:27 AM, Andre Oppermann <andre@freebsd.org> wrote:
> Courtland,
>
> the arpresolve observation is very important.  Do you have flowtable
> enabled in your kernel?
>
> --
> Andre
>
>
> On 06.03.2013 17:16, Adrian Chadd wrote:
>>
>> Another instance of it..
>> Adrian
>> On 6 March 2013 07:21, Courtland <ncrogers@gmail.com> wrote:
>>>
>>> Has there been any progress on resolving this problem? Does anyone have a
>>> better idea as to where it is breaking down?
>>>
>>> I am experiencing the same problem under FreeBSD 9.1-RELEASE. I use PF
>>> for
>>> NAT, ALTQ, and RDR/filter rules. I'm not using PPPoE or dhclient. The
>>> default gateway changes to an IP that is not on my network when under
>>> heavy
>>> network load.
>>>
>>> The last time this happened I had a stream of arpresolve messages in the
>>> kernel for the IP that the default route was changed to.
>>> Mar  5 19:12:53  kernel: arpresolve: can't allocate llinfo for
>>> 50.142.201.101
>>> The default route was changed to 50.142.201.101 after these messages.
>>>
>>>
>>>
>>>
>>> --
>>> View this message in context:
>>> http://freebsd.1045724.n5.nabble.com/kernel-arpresolve-can-t-allocate-llinfo-for-65-59-233-102-tp5742320p5793139.html
>>> Sent from the freebsd-net mailing list archive at Nabble.com.
>>> _______________________________________________
>>> freebsd-net@freebsd.org mailing list
>>> http://lists.freebsd.org/mailman/listinfo/freebsd-net
>>> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
>>
>> _______________________________________________
>> freebsd-net@freebsd.org mailing list
>> http://lists.freebsd.org/mailman/listinfo/freebsd-net
>> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
>>
>>
>

From owner-freebsd-net@FreeBSD.ORG  Thu Mar  7 16:56:08 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id F09115ED
 for <freebsd-net@freebsd.org>; Thu,  7 Mar 2013 16:56:08 +0000 (UTC)
 (envelope-from fbsdmail@dnswatch.com)
Received: from udns.ultimateDNS.NET (ultimatedns.net [209.180.214.225])
 by mx1.freebsd.org (Postfix) with ESMTP id B2A01713
 for <freebsd-net@freebsd.org>; Thu,  7 Mar 2013 16:56:08 +0000 (UTC)
Received: from udns.ultimateDNS.NET (localhost [127.0.0.1])
 by udns.ultimateDNS.NET (8.14.5/8.14.5) with ESMTP id r27Gu1vO002076;
 Thu, 7 Mar 2013 08:56:07 -0800 (PST)
 (envelope-from fbsdmail@dnswatch.com)
Received: (from www@localhost)
 by udns.ultimateDNS.NET (8.14.5/8.14.5/Submit) id r27Gttj8002070;
 Thu, 7 Mar 2013 08:55:55 -0800 (PST)
 (envelope-from fbsdmail@dnswatch.com)
Received: from udns.ultimatedns.net ([209.180.214.225])
 (UDNSMS authenticated user chrish) by ultimatedns.net with HTTP;
 Thu, 7 Mar 2013 08:55:55 -0800 (PST)
Message-ID: <eaa244ab49a30180aa7c88f45f3b38dc.authenticated@ultimatedns.net>
In-Reply-To: <5138AED9.1020801@dat.pl>
References: <b77c4b60019d745d151be9ba3e5446cc.authenticated@ultimatedns.net>
 <5138AED9.1020801@dat.pl>
Date: Thu, 7 Mar 2013 08:55:55 -0800 (PST)
Subject: Re: Implementing IP6 in 8.3
From: "freebsd-net" <fbsdmail@dnswatch.com>
To: "Maciej Milewski" <milu@dat.pl>
User-Agent: UDNSMS/2.0.3
MIME-Version: 1.0
Content-Type: text/plain;charset=utf-8
Content-Transfer-Encoding: 8bit
X-Priority: 3 (Normal)
Importance: Normal
Cc: freebsd-net <freebsd-net@freebsd.org>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 07 Mar 2013 16:56:09 -0000

Greetings Maciej Milewski, and thank you for your thoughtful reply.
> On 06.03.2013 22:02, freebsd-net wrote:
>> Greetings,
>>   I'm evaluating an ISP for the sake of building BSD operating systems on hardware
>> that they use (DSL modems, in this case). When I had my old NEC server, I had a
>> MIPS environment to develop in. I managed a 28k kernel. In any case, I'm back at
>> it for use in a lot of hardware I have lying around. In my current situation, I'm
>> using a ZYXEL Q1000Z modem to connect to their service. While it's a relatively
>> new modem, it doesn't support IP6. It is my hope to replace the OS with one that
>> does. :)
> If it doesn't support IPv6 you can always try to use it in Transparent
> Bridging (RFC1483) mode.
> <http://qwest.centurylink.com/internethelp/modem-q1000z-setup-bridge.html>
> You can then put other router/computer that does IPv6 routing just after
> that modem.
Thank you for the links. I was aware of that, but it requires that every client
connecting directly to the modem send the PPPoE credentials to it. While it's
simple enough to connect a router/switch between the modem and the clients, that
adds an additional hop. I think I'll be better served building a (Free)BSD kernel
and drivers for the modem -- assuming that because the modem doesn't support IP6,
it's not possible to route IP6 traffic directly, unless through a "tunnel broker".
>> I leased a /48 of IP4's from them, which /also/ came with as many IP6's.
>> So, not having implemented IP6 on any of my boxes (except by way of tunnel brokers),
>> I'm wondering 2 things:
>> If my underlying OS (FreeBSD-8.3) can support IP6, will it still function, even tho
>> my gateway (modem) doesn't?
>> Am I /correctly/ attempting to use it?
>> I'm answering authoritatively for the many domains I own. They have all functioned
>> well for many years via IP4. I have added the requisite AAAA records in all the zones,
>> as well as the associated RR's.
>> While the gateway (modem) /does/ have an IP6 address, I can't "speak" for it out of
>> DNS, because it would be an "out of zone" record. Even tho I'm the RP for the /48.
>> So it's up to the modem to answer accordingly.
>> BUT, I'm not sure I'm initiating any of this correctly via rc(8). Or more specifically,
>> via rc.conf(5). While I've read as much as I can find on the topic related to BSD,
>> boot messages indicate at least -- "IP6 gateway unreachable".
>> I'm currently using:
>> rc.conf(5):
>> ipv6_ifconfig_re0="2602:00d1:b4d6:e100:0000:0000:0000:0000"
>> ipv6_defaultrouter="2602:00d1:b4d6:e600:0000:0000:0000:0000"
>> I also have the corresponding host IP in hosts(5).
>>
>> Any help, pointers, guidance, answers /greatly/ appreciated.
>>
>> Thank you for all your time, and consideration.
>>
>> --Chris
>>
>> _______________________________________________
>> freebsd-net@freebsd.org mailing list
>> http://lists.freebsd.org/mailman/listinfo/freebsd-net
>> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
>
> --
> Pozdrawiam,
> Maciej Milewski
Thanks again, for taking the time to respond.

--Chris

>
> _______________________________________________
> freebsd-net@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
>


From owner-freebsd-net@FreeBSD.ORG  Thu Mar  7 17:07:51 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id 429C78F5;
 Thu,  7 Mar 2013 17:07:51 +0000 (UTC)
 (envelope-from ncrogers@gmail.com)
Received: from mail-ee0-f49.google.com (mail-ee0-f49.google.com [74.125.83.49])
 by mx1.freebsd.org (Postfix) with ESMTP id 8CC1B7A3;
 Thu,  7 Mar 2013 17:07:49 +0000 (UTC)
Received: by mail-ee0-f49.google.com with SMTP id d41so529474eek.36
 for <multiple recipients>; Thu, 07 Mar 2013 09:07:43 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:x-received:in-reply-to:references:date:message-id
 :subject:from:to:cc:content-type;
 bh=t/FA6a5iw9i48jJqavMhUnLC7Qp6NR1n6q27Ku9YD+U=;
 b=TIfQPfStDVHYmIx8wi9sebwkCrKJgTWTP8I0H+G7sUjiwkVFO35OVewTU5y1MCuEZC
 yT1HaL3y1e/Cr2AaOCtGv91rz63hAmgJzAdamCg8N4XvSQ8mZC/hiRurIl1ZVxgCeP5J
 qX6kLsTiBRmo5U3qRwfmXlPHMV2DV8Sy/U4TnBDcX1CgkCJ3htHBi/XmLLz8w4M49grS
 5LQKmZDU+iL1I1LBEVy+7BjKNS8J/nQHmS2RrmECAwBOEWHzychjr0bpjWumJQsxDFrp
 OoG0WvCAKXWQmBmCxeowJkQl/sscP2eaXm4rFtQemcLpZfBqDvvkQOZkNTBlfksctsXq
 gxFQ==
MIME-Version: 1.0
X-Received: by 10.195.12.133 with SMTP id eq5mr55608632wjd.52.1362676063291;
 Thu, 07 Mar 2013 09:07:43 -0800 (PST)
Received: by 10.194.110.195 with HTTP; Thu, 7 Mar 2013 09:07:43 -0800 (PST)
In-Reply-To: <5136FD71.6000408@freebsd.org>
References: <CAKOb=YYGu6mr-3nyydBi9K-FHPnEx-fKSZ2=r_uDVeY9pvrqtQ@mail.gmail.com>
 <5136FD71.6000408@freebsd.org>
Date: Thu, 7 Mar 2013 09:07:43 -0800
Message-ID: <CAKOb=YaX+yopoofwRbfN7ZXc_yG0uxoKkr3aXsVcXEdLqQ=AXQ@mail.gmail.com>
Subject: Re: Default route changes unexpectedly
From: Nick Rogers <ncrogers@gmail.com>
To: Andre Oppermann <andre@freebsd.org>
Content-Type: text/plain; charset=ISO-8859-1
Cc: "freebsd-net@freebsd.org" <freebsd-net@freebsd.org>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 07 Mar 2013 17:07:51 -0000

On Wed, Mar 6, 2013 at 12:25 AM, Andre Oppermann <andre@freebsd.org> wrote:
> On 05.03.2013 18:39, Nick Rogers wrote:
>>
>> Hello,
>>
>> I am attempting to create awareness of a serious issue affecting users
>> of FreeBSD 9.x and PF. There appears to be a bug that allows the
>> kernel's routing table to be corrupted by traffic routing through the
>> system. Under heavy traffic load, the default route can seemingly
>> randomly change to an IP address that is not directly connected to the
>> network (i.e., is not configured anywhere). Dhclient is not in the
>> mix, nor is routed, bgpd, etc. Running `route monitor` shows no
>> evidence of the change in the default route. The one commonality
>> between all the systems experiencing this problem seems to be the use
>> of PF.
>>
>> Obviously this is a serious problem as it causes all Internet-bound
>> traffic to stop routing until the default route is corrected. Some
>> users, including myself, are working around this problem by installing
>> a script that runs multiple times a second to check if the default
>> route is incorrect and fixing it if necessary, which mitigates the
>> amount of downtime caused by the bug.
>
>
> Can you describe your traffic forwarding setup in more detail?
> Is it only pf, or do you run netgraph, or other things as well?
> Do you use flow routing?

I use PF for NAT, filtering, and rdr rules. ALTQ for bandwidth
management. I do not use netgraph. I use vlans. PF redirects to squid
as a transproxy. I'm not familiar with flow routing, so unless it's
enabled in 9.1 by default I do not use it.

>
> How frequently does this happen?
Every other day during periods of heavier Internet-bound traffic.

>
> I'm trying to create a stack graph to see which parts of the network
> stack are involved in handling your packet.
>
> --
> Andre
>
>> Please refer to these past posts for more examples and evidence of
>> other users experiencing this problem:
>>
>> http://forums.freebsd.org/showthread.php?p=211610#post211610
>>
>>
>> http://freebsd.1045724.n5.nabble.com/Default-route-quot-random-quot-gateway-modification-bug-td5750820.html
>>
>> http://lists.freebsd.org/pipermail/freebsd-net/2012-March/031879.html
>>
>> http://lists.freebsd.org/pipermail/freebsd-ipfw/2010-September/004361.html
>>
>> There is also a PR that was incorrectly labeled as an IPFW issue.
>> Myself and others believe this issue is not restricted to the use of
>> IPFW and that the PR should be relabeled. I am inclined to think it is
>> strictly a PF issue since I am not using IPFW, however there is
>> evidence of the default route changing on people using IPFW for past
>> versions of FreeBSD (7.x/8.x), so perhaps this is related.
>>
>> http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/174749
>>
>> Another PR for the same problem but specific to IPFW and 8.2-RELEASE
>>
>> http://www.freebsd.org/cgi/query-pr.cgi?pr=157796
>>
>> I am hoping someone reading this can give the problem the attention it
>> deserves. Thank you.
>>
>> -Nick
>> _______________________________________________
>> freebsd-net@freebsd.org mailing list
>> http://lists.freebsd.org/mailman/listinfo/freebsd-net
>> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
>>
>>
>

From owner-freebsd-net@FreeBSD.ORG  Thu Mar  7 17:09:35 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id B9E64A8A
 for <freebsd-net@freebsd.org>; Thu,  7 Mar 2013 17:09:35 +0000 (UTC)
 (envelope-from andre@freebsd.org)
Received: from c00l3r.networx.ch (c00l3r.networx.ch [62.48.2.2])
 by mx1.freebsd.org (Postfix) with ESMTP id 1760D7C1
 for <freebsd-net@freebsd.org>; Thu,  7 Mar 2013 17:09:34 +0000 (UTC)
Received: (qmail 8318 invoked from network); 7 Mar 2013 18:23:01 -0000
Received: from c00l3r.networx.ch (HELO [127.0.0.1]) ([62.48.2.2])
 (envelope-sender <andre@freebsd.org>)
 by c00l3r.networx.ch (qmail-ldap-1.03) with SMTP
 for <ncrogers@gmail.com>; 7 Mar 2013 18:23:01 -0000
Message-ID: <5138C9C8.6030809@freebsd.org>
Date: Thu, 07 Mar 2013 18:09:28 +0100
From: Andre Oppermann <andre@freebsd.org>
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64;
 rv:17.0) Gecko/20130107 Thunderbird/17.0.2
MIME-Version: 1.0
To: Nick Rogers <ncrogers@gmail.com>
Subject: Re: Default route changes unexpectedly #2 (was Re: kernel: arpresolve:
 can't allocate llinfo for 65.59.233.102)
References: <CAJ-VmokO85PaW6DksfqsBuSdMPUEGv9f_2qEMAD7WTv4QK0vcw@mail.gmail.com>
 <51378A9D.6080306@freebsd.org>
 <CAKOb=YYPzSfUmcmmvE0fy-2QhzUK5hZUVUEdJs3uho2a6iGz+g@mail.gmail.com>
In-Reply-To: <CAKOb=YYPzSfUmcmmvE0fy-2QhzUK5hZUVUEdJs3uho2a6iGz+g@mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: "freebsd-net@freebsd.org" <freebsd-net@freebsd.org>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 07 Mar 2013 17:09:35 -0000

On 07.03.2013 17:54, Nick Rogers wrote:
> I'm not sure. I have not explicitly enabled/disabled it. I am using
> the GENERIC kernel from 9.1 plus PF+ALTQ.
>
> # sysctl net.inet.flowtable.enable
> sysctl: unknown oid 'net.inet.flowtable.enable'
> # sysctl -a | grep flow
> kern.sigqueue.overflow: 0
> net.inet.tcp.reass.overflows: 0
> net.inet6.ip6.auto_flowlabel: 1
>
> uname -v
> FreeBSD 9.1-RELEASE #0 r245436M: Mon Jan 14 16:34:21 EST 2013
> root@fbsd_91:/usr/obj/usr/src/sys/CUSTOM
>
> 8.0 release notes say flowtable is enabled by default on amd64/i386.
> So I presume it is enabled? I can't seem to find much information
> about this for FreeBSD 9.x

It's not compiled into GENERIC on 9.x because it had/has some stability
issues.  I just wanted to make sure that the problem really comes out
of the arpresolve area before digging into it.

-- 
Andre

> On Wed, Mar 6, 2013 at 10:27 AM, Andre Oppermann <andre@freebsd.org> wrote:
>> Courtland,
>>
>> the arpresolve observation is very important.  Do you have flowtable
>> enabled in your kernel?
>>
>> --
>> Andre
>>
>>
>> On 06.03.2013 17:16, Adrian Chadd wrote:
>>>
>>> Another instance of it..
>>> Adrian
>>> On 6 March 2013 07:21, Courtland <ncrogers@gmail.com> wrote:
>>>>
>>>> Has there been any progress on resolving this problem? Does anyone have a
>>>> better idea as to where it is breaking down?
>>>>
>>>> I am experiencing the same problem under FreeBSD 9.1-RELEASE. I use PF for
>>>> NAT, ALTQ, and RDR/filter rules. I'm not using PPPoE or dhclient. The
>>>> default gateway changes to an IP that is not on my network when under heavy
>>>> network load.
>>>>
>>>> The last time this happened I had a stream of arpresolve messages in the
>>>> kernel for the IP that the default route was changed to.
>>>> Mar  5 19:12:53  kernel: arpresolve: can't allocate llinfo for 50.142.201.101
>>>> The default route was changed to 50.142.201.101 after these messages.
>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> View this message in context:
>>>> http://freebsd.1045724.n5.nabble.com/kernel-arpresolve-can-t-allocate-llinfo-for-65-59-233-102-tp5742320p5793139.html
>>>> Sent from the freebsd-net mailing list archive at Nabble.com.


From owner-freebsd-net@FreeBSD.ORG  Thu Mar  7 17:26:34 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id D6DABD2C;
 Thu,  7 Mar 2013 17:26:34 +0000 (UTC)
 (envelope-from fbsdmail@dnswatch.com)
Received: from udns.ultimateDNS.NET (ultimatedns.net [209.180.214.225])
 by mx1.freebsd.org (Postfix) with ESMTP id 8E6D1879;
 Thu,  7 Mar 2013 17:26:34 +0000 (UTC)
Received: from udns.ultimateDNS.NET (localhost [127.0.0.1])
 by udns.ultimateDNS.NET (8.14.5/8.14.5) with ESMTP id r27HQRaU003640;
 Thu, 7 Mar 2013 09:26:33 -0800 (PST)
 (envelope-from fbsdmail@dnswatch.com)
Received: (from www@localhost)
 by udns.ultimateDNS.NET (8.14.5/8.14.5/Submit) id r27HQMHF003634;
 Thu, 7 Mar 2013 09:26:22 -0800 (PST)
 (envelope-from fbsdmail@dnswatch.com)
Received: from udns.ultimatedns.net ([209.180.214.225])
 (UDNSMS authenticated user chrish) by ultimatedns.net with HTTP;
 Thu, 7 Mar 2013 09:26:22 -0800 (PST)
Message-ID: <b5781ec3a57cac2509edd989be785e78.authenticated@ultimatedns.net>
In-Reply-To: <CAOfEmZhBiH_dvvAGbOwy3GK=WZqaGpLaZP7pvFR0MZHkEexMhg@mail.gmail.com>
References: <CAOfEmZhBiH_dvvAGbOwy3GK=WZqaGpLaZP7pvFR0MZHkEexMhg@mail.gmail.com>
Date: Thu, 7 Mar 2013 09:26:22 -0800 (PST)
Subject: Re: dhclient issue.
From: "freebsd-net" <fbsdmail@dnswatch.com>
To: araujo@freebsd.org
User-Agent: UDNSMS/2.0.3
MIME-Version: 1.0
Content-Type: text/plain;charset=utf-8
Content-Transfer-Encoding: 8bit
X-Priority: 3 (Normal)
Importance: Normal
Cc: freebsd-net@freebsd.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 07 Mar 2013 17:26:34 -0000

> Hello Guys,
>
> I ran into a problem with dhclient this week on 9.1-RELEASE.
> The log is below:
>
> [root@home ~]# uname -a
> FreeBSD HOME 9.1-RELEASE FreeBSD 9.1-RELEASE #10: Tue Mar  5 18:57:14 CST
> 2013     root@home:/usr/src/sys/HOME.amd64  amd64
>
>
> [root@home ~]# dhclient ix0
> PID = 3276, PPID = 3274
> fibnum = 0
> fibcmd = setfib 0
> interface = ix0
> ifconfig: ioctl (SIOCAIFADDR): File exists
> ix0: not found
> exiting.
>
> [root@home ~]# tail /var/log/messages
> Mar 17 14:53:52 ESSD46B70 dhclient[3244]: exiting.
> Mar 17 14:54:01 ESSD46B70 login: ROOT LOGIN (root) ON ttyv0
> Mar 17 14:54:15 ESSD46B70 dhclient[3257]: ix0: not found
> Mar 17 14:54:15 ESSD46B70 dhclient[3257]: exiting.
> Mar 17 14:54:15 ESSD46B70 dhclient[3258]: connection closed
> Mar 17 14:54:15 ESSD46B70 dhclient[3258]: exiting.
> Mar 17 14:54:57 ESSD46B70 dhclient[3274]: ix0: not found
>
> [root@home ~]# ifconfig ix0
> ix0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
> options=403bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,VLAN_HWTSO>
> ether 00:08:9b:d4:6b:71
> nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
> media: Ethernet autoselect (10Gbase-T <full-duplex>)
> status: active
>
>
> I have another interface, em0, and it works properly there.
> Any idea what is going on?
Is there anything in rc.conf(5) that might conflict with your attempt to
bring this interface up via dhclient(8)?
For example, if you already have ifconfig_ix0="DHCP" and that failed during
boot (init), then dhclient(8) may still be attempting to configure ix0,
and a second, manual invocation will fail.

--Chris

>
> Best Regards,
> --
> Marcelo Araujo
> araujo@FreeBSD.org


From owner-freebsd-net@FreeBSD.ORG  Thu Mar  7 19:27:44 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id 4FEB68EA;
 Thu,  7 Mar 2013 19:27:44 +0000 (UTC)
 (envelope-from krzysiek@airnet.opole.pl)
Received: from base.airnet.opole.pl (ns2.airmax.pl [176.111.128.3])
 by mx1.freebsd.org (Postfix) with ESMTP id 0C659E50;
 Thu,  7 Mar 2013 19:27:43 +0000 (UTC)
Received: from localhost (localhost [127.0.0.1])
 by base.airnet.opole.pl (Postfix) with ESMTP id EA0727FF031;
 Thu,  7 Mar 2013 20:27:37 +0100 (CET)
Received: from base.airnet.opole.pl ([127.0.0.1])
 by localhost (mail.airnet.opole.pl [127.0.0.1]) (maiad, port 10024)
 with ESMTP id 70250-06; Thu,  7 Mar 2013 20:27:37 +0100 (CET)
Received: from [10.10.11.223] (unknown [176.111.138.12])
 (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits))
 (No client certificate requested)
 (Authenticated sender: krzysiek@airnet.opole.pl)
 by base.airnet.opole.pl (Postfix) with ESMTPSA id B6B417FF02D;
 Thu,  7 Mar 2013 20:27:37 +0100 (CET)
Message-ID: <5138EA26.4000403@airnet.opole.pl>
Date: Thu, 07 Mar 2013 20:27:34 +0100
From: Krzysztof Barcikowski <krzysiek@airnet.opole.pl>
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64;
 rv:17.0) Gecko/20130215 Thunderbird/17.0.3
MIME-Version: 1.0
To: Andre Oppermann <andre@freebsd.org>
Subject: Re: Default route changes unexpectedly #2 (was Re: kernel: arpresolve:
 can't allocate llinfo for 65.59.233.102)
References: <CAJ-VmokO85PaW6DksfqsBuSdMPUEGv9f_2qEMAD7WTv4QK0vcw@mail.gmail.com>
 <51378A9D.6080306@freebsd.org>
 <CAKOb=YYPzSfUmcmmvE0fy-2QhzUK5hZUVUEdJs3uho2a6iGz+g@mail.gmail.com>
 <5138C9C8.6030809@freebsd.org>
In-Reply-To: <5138C9C8.6030809@freebsd.org>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: Nick Rogers <ncrogers@gmail.com>,
 "freebsd-net@freebsd.org" <freebsd-net@freebsd.org>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 07 Mar 2013 19:27:44 -0000

On 2013-03-07 18:09, Andre Oppermann wrote:
> On 07.03.2013 17:54, Nick Rogers wrote:
>> I'm not sure. I have not explicitly enabled/disabled it. I am using
>> the GENERIC kernel from 9.1 plus PF+ALTQ.
>>
>> # sysctl net.inet.flowtable.enable
>> sysctl: unknown oid 'net.inet.flowtable.enable'
>> # sysctl -a | grep flow
>> kern.sigqueue.overflow: 0
>> net.inet.tcp.reass.overflows: 0
>> net.inet6.ip6.auto_flowlabel: 1
>>
>> uname -v
>> FreeBSD 9.1-RELEASE #0 r245436M: Mon Jan 14 16:34:21 EST 2013
>> root@fbsd_91:/usr/obj/usr/src/sys/CUSTOM
>>
>> 8.0 release notes say flowtable is enabled by default on amd64/i386.
>> So I presume it is enabled? I can't seem to find much information
>> about this for FreeBSD 9.x
>
> It's not compiled in GENERIC on 9.x because it had/has some stability
> issues.  I just wanted to make sure that the problem really comes out
> of the arpresolve area before digging into it.
>

I can confirm I get these messages as well:

Mar  7 19:40:25 opole kernel: arpresolve: can't allocate llinfo for 86.58.122.125
Mar  7 19:40:25 opole kernel: arpresolve: can't allocate llinfo for 86.58.122.125

The IP 86.58.122.125 is not from any IP pool that I use.

Krzysiek

From owner-freebsd-net@FreeBSD.ORG  Thu Mar  7 20:26:29 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id 0273B679
 for <freebsd-net@freebsd.org>; Thu,  7 Mar 2013 20:26:29 +0000 (UTC)
 (envelope-from andre@freebsd.org)
Received: from c00l3r.networx.ch (c00l3r.networx.ch [62.48.2.2])
 by mx1.freebsd.org (Postfix) with ESMTP id 5CBCF124
 for <freebsd-net@freebsd.org>; Thu,  7 Mar 2013 20:26:27 +0000 (UTC)
Received: (qmail 17974 invoked from network); 7 Mar 2013 21:39:52 -0000
Received: from c00l3r.networx.ch (HELO [127.0.0.1]) ([62.48.2.2])
 (envelope-sender <andre@freebsd.org>)
 by c00l3r.networx.ch (qmail-ldap-1.03) with SMTP
 for <krzysiek@airnet.opole.pl>; 7 Mar 2013 21:39:52 -0000
Message-ID: <5138F7EC.30804@freebsd.org>
Date: Thu, 07 Mar 2013 21:26:20 +0100
From: Andre Oppermann <andre@freebsd.org>
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64;
 rv:17.0) Gecko/20130107 Thunderbird/17.0.2
MIME-Version: 1.0
To: Krzysztof Barcikowski <krzysiek@airnet.opole.pl>
Subject: Re: Default route changes unexpectedly #2 (was Re: kernel: arpresolve:
 can't allocate llinfo for 65.59.233.102)
References: <CAJ-VmokO85PaW6DksfqsBuSdMPUEGv9f_2qEMAD7WTv4QK0vcw@mail.gmail.com>
 <51378A9D.6080306@freebsd.org>
 <CAKOb=YYPzSfUmcmmvE0fy-2QhzUK5hZUVUEdJs3uho2a6iGz+g@mail.gmail.com>
 <5138C9C8.6030809@freebsd.org> <5138EA26.4000403@airnet.opole.pl>
In-Reply-To: <5138EA26.4000403@airnet.opole.pl>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: Nick Rogers <ncrogers@gmail.com>,
 "freebsd-net@freebsd.org" <freebsd-net@freebsd.org>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 07 Mar 2013 20:26:29 -0000

On 07.03.2013 20:27, Krzysztof Barcikowski wrote:
> On 2013-03-07 18:09, Andre Oppermann wrote:
>> On 07.03.2013 17:54, Nick Rogers wrote:
>>> I'm not sure. I have not explicitly enabled/disabled it. I am using
>>> the GENERIC kernel from 9.1 plus PF+ALTQ.
>>>
>>> # sysctl net.inet.flowtable.enable
>>> sysctl: unknown oid 'net.inet.flowtable.enable'
>>> # sysctl -a | grep flow
>>> kern.sigqueue.overflow: 0
>>> net.inet.tcp.reass.overflows: 0
>>> net.inet6.ip6.auto_flowlabel: 1
>>>
>>> uname -v
>>> FreeBSD 9.1-RELEASE #0 r245436M: Mon Jan 14 16:34:21 EST 2013
>>> root@fbsd_91:/usr/obj/usr/src/sys/CUSTOM
>>>
>>> 8.0 release notes say flowtable is enabled by default on amd64/i386.
>>> So I presume it is enabled? I can't seem to find much information
>>> about this for FreeBSD 9.x
>>
>> It's not compiled in GENERIC on 9.x because it had/has some stability
>> issues.  I just wanted to make sure that the problem really comes out
>> of the arpresolve area before digging into it.
>>
>
> I can confirm I get these messages as well:
>
> Mar  7 19:40:25 opole kernel: arpresolve: can't allocate llinfo for 86.58.122.125
> Mar  7 19:40:25 opole kernel: arpresolve: can't allocate llinfo for 86.58.122.125

OK.  Then this is the common factor.

> IP 86.58.122.125 is not from IP pool used by me.

You mean it's not from one of the subnets on your interfaces?

-- 
Andre


From owner-freebsd-net@FreeBSD.ORG  Thu Mar  7 20:53:40 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id 5192F895
 for <net@freebsd.org>; Thu,  7 Mar 2013 20:53:40 +0000 (UTC)
 (envelope-from andre@freebsd.org)
Received: from c00l3r.networx.ch (c00l3r.networx.ch [62.48.2.2])
 by mx1.freebsd.org (Postfix) with ESMTP id C909C24D
 for <net@freebsd.org>; Thu,  7 Mar 2013 20:53:39 +0000 (UTC)
Received: (qmail 19311 invoked from network); 7 Mar 2013 22:07:04 -0000
Received: from c00l3r.networx.ch (HELO [127.0.0.1]) ([62.48.2.2])
 (envelope-sender <andre@freebsd.org>)
 by c00l3r.networx.ch (qmail-ldap-1.03) with SMTP
 for <melifaro@FreeBSD.org>; 7 Mar 2013 22:07:04 -0000
Message-ID: <5138FE4C.5030307@freebsd.org>
Date: Thu, 07 Mar 2013 21:53:32 +0100
From: Andre Oppermann <andre@freebsd.org>
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64;
 rv:17.0) Gecko/20130107 Thunderbird/17.0.2
MIME-Version: 1.0
To: "Alexander V. Chernikov" <melifaro@FreeBSD.org>
Subject: Re: [patch] interface routes
References: <513834E4.7050203@FreeBSD.org> <51384443.5070209@freebsd.org>
 <51387D4A.9030408@FreeBSD.org> <51388046.7040408@freebsd.org>
 <CAPBZQG3Of173MoyB-sPy=9RoivtKMRA7LZ0DYyEOSrsyk9_10A@mail.gmail.com>
 <51389B4B.1060003@freebsd.org> <5138B390.2080806@FreeBSD.org>
In-Reply-To: <5138B390.2080806@FreeBSD.org>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 8bit
Cc: =?ISO-8859-1?Q?Ermal_Lu=E7i?= <eri@freebsd.org>, net@freebsd.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 07 Mar 2013 20:53:40 -0000

On 07.03.2013 16:34, Alexander V. Chernikov wrote:
> On 07.03.2013 17:51, Andre Oppermann wrote:
>> On 07.03.2013 14:38, Ermal Luçi wrote:
>>> Isn't it better to teach the routing code about metrics?
>>> Routing daemons cope better this way and they can handle this.
>>> So the policy of this behaviour can be controlled by the administrator
>>> rather than by code!
>>> With metrics you can add routes with a bigger metric for interfaces and
>>> a lower one from routing daemons.
>>> This could also somewhat mitigate the problem of interfaces configured
>>> with the same subnet.
>>
>> Generally I agree with you that this would be the ideal outcome.
>> However we're still quite a bit away from reaching that goal.
>> To make this really work we have to make mpath plus metrics a first
>> class citizen in the routing code and also update the routing
>> daemons' kernel interfaces to know about this.  I hope we get there
>> in the not too distant future.
>
> Radix is already over-bloated. Typically in performance-oriented
> solutions (hardware/software routers from vendors) there is a clear
> separation between the RIB (where route protocol attributes, best candidate
> routes, and routes with different priorities exist) and the FIB, which is
> typically some kind of radix with the minimum needed info, e.g.:
> prefix, nexthops, their interfaces, optional L2 data to prepend.

ACK.  Though the bloat in itself is not the main problem, other than for
kernel memory consumption.  If you think of it in terms of cache line
misses, everything more than 128 bytes away is potentially a cache miss.
The additional distance due to a large or small structure makes no
difference.  What makes an important difference is the internal layout
of the structure and whether the relevant variables are within the same
cache line.  This can be a problem in a large structure when some data
is at the beginning and other data is at the end, on a different cache
line.  Here, potentially twice the cache miss latency per trie element
hurts.

If we can manage to put everything for a trie search into the first
cache line we're quite good already.  The additional win for tighter
packing isn't that large anymore.
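
A minimal sketch of that layout argument, not taken from the kernel: the
node type and its fields are hypothetical, and a 64-byte cache line is
assumed.  The fields a lookup touches are grouped at the front, and a
small helper checks that they all fall within the first line of a
line-aligned node.

```c
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE 64   /* assumption: common amd64 cache line size */

/* Hypothetical trie node, NOT the kernel's struct radix_node. */
struct trie_node {
    /* hot: read on every step of a lookup */
    struct trie_node *child[2];
    uint32_t          key;
    uint32_t          mask;
    uint16_t          bit;        /* bit index tested at this node */
    /* cold: only touched by management code */
    uint64_t          stats[16];
    void             *owner;
};

/* 1 if the byte range [off, off + len) lies entirely within the first
 * cache line of an object that is itself cache-line aligned. */
static int in_first_line(size_t off, size_t len)
{
    return off + len <= CACHE_LINE;
}

/* 1 if every field used by the lookup path sits in the first line. */
static int hot_fields_share_line(void)
{
    return in_first_line(offsetof(struct trie_node, child),
                         sizeof(((struct trie_node *)0)->child))
        && in_first_line(offsetof(struct trie_node, key), sizeof(uint32_t))
        && in_first_line(offsetof(struct trie_node, mask), sizeof(uint32_t))
        && in_first_line(offsetof(struct trie_node, bit), sizeof(uint16_t));
}
```

The structure as a whole is larger than one line, but a lookup never
needs the second one; shrinking the cold part further would not change
the number of lines a search touches.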

> Our radix stands somewhere between RIB and FIB (since we have to support
> route(8) and upper layer protocols): it serves badly as RIB (little
> functionality) and as FIB: too much overhead and inefficient/too general
> code.

ACK.  There is a big philosophical question on the model.  Either make
it a RIB, so that independent but complementary routing daemons can add
routes concurrently and the kernel knows which have higher priority
or are equal cost for traffic balancing (as in bgpd+ospfd).  Or strip
it to a FIB and have an external program do the RIB and coordination
across routing daemons (as in the Quagga suite).
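
To make the two roles concrete, here is a rough sketch under my own
assumptions: the types, the `preference` field (lower wins) and the
function name are invented for illustration and do not correspond to any
existing kernel or daemon interface.  The RIB keeps every candidate
route; only the winner per prefix is copied into a minimal FIB entry.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical RIB candidate: one route as offered by one source. */
struct rib_route {
    uint32_t prefix;
    uint8_t  plen;
    uint32_t nexthop;
    uint8_t  preference;   /* assumption: lower value wins */
};

/* Hypothetical FIB entry: only what forwarding needs. */
struct fib_entry {
    uint32_t prefix;
    uint8_t  plen;
    uint32_t nexthop;
};

/* Pick the preferred candidate for prefix/plen and copy it to *out.
 * Returns 1 if a candidate was found, 0 otherwise.  Ties go to the
 * first candidate seen; equal-cost multipath is not modelled here. */
static int rib_select(const struct rib_route *rib, size_t n,
                      uint32_t prefix, uint8_t plen, struct fib_entry *out)
{
    const struct rib_route *best = NULL;

    for (size_t i = 0; i < n; i++) {
        if (rib[i].prefix != prefix || rib[i].plen != plen)
            continue;
        if (best == NULL || rib[i].preference < best->preference)
            best = &rib[i];
    }
    if (best == NULL)
        return 0;
    out->prefix = best->prefix;
    out->plen = best->plen;
    out->nexthop = best->nexthop;
    return 1;
}
```

Whether this selection runs in the kernel (RIB model) or in an external
coordinator (FIB model) is exactly the open question.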

> For example, sizeof(rt_nodes[2]) (first element of rte) is 96 bytes on
> amd64.

That is a problem if the trie traversal function accesses fields beyond
this cache line.  The main problem is that key and mask are pointers
and thus external to the radix_node, adding even more cache misses.
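
A small illustration of the pointer issue, assuming IPv4-only keys and
two made-up node types (neither is the real radix_node): the first
variant has to dereference key and mask, which may live on other cache
lines, while the second carries them inline.

```c
#include <stdint.h>

/* Hypothetical node with external key and mask, reached via pointers. */
struct node_indirect {
    const uint32_t *key;
    const uint32_t *mask;
};

/* Hypothetical node with key and mask embedded in the node itself. */
struct node_inline {
    uint32_t key;
    uint32_t mask;
};

/* Touches the node plus up to two more cache lines (key and mask). */
static int match_indirect(const struct node_indirect *n, uint32_t dst)
{
    return (dst & *n->mask) == *n->key;
}

/* Touches only the node's own cache line. */
static int match_inline(const struct node_inline *n, uint32_t dst)
{
    return (dst & n->mask) == n->key;
}
```

Both functions compute the same answer; the difference is only in how
many separate memory locations the comparison has to load.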

> Additionally, rte refcount approach is totally broken.

ACK.  Copy and out.  No references or external pointers into the table.

> I'm currently thinking of adding some kind of hooks to current
> route/radix code to permit building efficient trie (or other structure)
> for given address family and to use it for forwarding purposes only.

AFAIK Marko Zec and/or Luigi have done some work in this area as well.

> For example, I don't need trie while doing MPLS label switching:
> assuming control plane allocates contiguous label space, I can use label
> array for efficient lookup.

Nobody's forcing you to use a radix trie for MPLS.  In theory each
protocol can choose its own best method.
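
For the contiguous-label case, a direct-indexed table could look roughly
like this.  It is only a sketch: the table size, the entry fields and
the function names are hypothetical.  The base of 16 reflects that label
values 0-15 are reserved by RFC 3032.

```c
#include <stddef.h>
#include <stdint.h>

#define LABEL_BASE  16u    /* label values 0-15 are reserved (RFC 3032) */
#define LABEL_COUNT 1024u  /* hypothetical size of the contiguous range */

/* Hypothetical per-label forwarding state. */
struct label_entry {
    uint32_t out_label;
    uint16_t out_ifindex;
    uint8_t  in_use;
};

struct label_table {
    struct label_entry slot[LABEL_COUNT];
};

/* O(1) lookup: the label is the array index, offset by LABEL_BASE.
 * For label < LABEL_BASE the unsigned subtraction wraps to a large
 * value, so a single bounds check rejects both ends of the range. */
static const struct label_entry *
label_lookup(const struct label_table *t, uint32_t label)
{
    uint32_t idx = label - LABEL_BASE;

    if (idx >= LABEL_COUNT || !t->slot[idx].in_use)
        return NULL;
    return &t->slot[idx];
}

/* Install a swap entry; returns 1 on success, 0 if out of range. */
static int label_install(struct label_table *t, uint32_t label,
                         uint32_t out_label, uint16_t out_ifindex)
{
    uint32_t idx = label - LABEL_BASE;

    if (idx >= LABEL_COUNT)
        return 0;
    t->slot[idx].out_label = out_label;
    t->slot[idx].out_ifindex = out_ifindex;
    t->slot[idx].in_use = 1;
    return 1;
}
```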

-- 
Andre


From owner-freebsd-net@FreeBSD.ORG  Thu Mar  7 20:58:20 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id EA6E696A;
 Thu,  7 Mar 2013 20:58:20 +0000 (UTC)
 (envelope-from qing.li@bluecoat.com)
Received: from plsvl-mailgw-01.bluecoat.com (plsvl-mailgw-01.bluecoat.com
 [199.91.133.11]) by mx1.freebsd.org (Postfix) with ESMTP id BEA9F278;
 Thu,  7 Mar 2013 20:58:20 +0000 (UTC)
Received: from pwsvl-exchts-03.internal.cacheflow.com
 (pwsvl-exchts-03.bluecoat.com [10.2.2.160])
 by plsvl-mailgw-01.bluecoat.com (Postfix) with ESMTP id 852BE81A0BE;
 Thu,  7 Mar 2013 11:51:46 -0900 (AKST)
Received: from PWSVL-EXCMBX-04.internal.cacheflow.com
 ([fe80::c596:c77:dd67:b72d]) by pwsvl-exchts-03.internal.cacheflow.com
 ([fe80::a508:17dc:1550:e9f6%12]) with mapi id 14.01.0355.002; Thu, 7 Mar 2013
 12:51:45 -0800
From: "Li, Qing" <qing.li@bluecoat.com>
To: Krzysztof Barcikowski <krzysiek@airnet.opole.pl>, Andre Oppermann
 <andre@freebsd.org>
Subject: RE: Default route changes unexpectedly #2 (was Re: kernel:
 arpresolve: can't allocate llinfo for 65.59.233.102)
Thread-Topic: Default route changes unexpectedly #2 (was Re: kernel:
 arpresolve: can't allocate llinfo for 65.59.233.102)
Thread-Index: AQHOG2nqowWL/iUfmkKP6+iB7XtxKZias+mw
Date: Thu, 7 Mar 2013 20:51:45 +0000
Message-ID: <B143A8975061C446AD5E29742C5317231EA96FBE@pwsvl-excmbx-04.internal.cacheflow.com>
References: <CAJ-VmokO85PaW6DksfqsBuSdMPUEGv9f_2qEMAD7WTv4QK0vcw@mail.gmail.com>
 <51378A9D.6080306@freebsd.org>
 <CAKOb=YYPzSfUmcmmvE0fy-2QhzUK5hZUVUEdJs3uho2a6iGz+g@mail.gmail.com>
 <5138C9C8.6030809@freebsd.org> <5138EA26.4000403@airnet.opole.pl>
In-Reply-To: <5138EA26.4000403@airnet.opole.pl>
Accept-Language: en-US
Content-Language: en-US
X-MS-Has-Attach: 
X-MS-TNEF-Correlator: 
x-originating-ip: [10.2.2.106]
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Cc: Nick Rogers <ncrogers@gmail.com>,
 "freebsd-net@freebsd.org" <freebsd-net@freebsd.org>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 07 Mar 2013 20:58:21 -0000

Hi,

>
> I can confirm I get these messages as well:
>
> Mar  7 19:40:25 opole kernel: arpresolve: can't allocate llinfo for
> 86.58.122.125
> Mar  7 19:40:25 opole kernel: arpresolve: can't allocate llinfo for
> 86.58.122.125
>
> IP 86.58.122.125 is not from IP pool used by me.
>

This kernel message is merely a side effect of a bad route (with an
off-net IP address) being injected as a default route replacement.
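
As a reminder of what "off-net" means here, a small sketch of the
on-link test for IPv4.  The helper names are made up for this example,
and 192.0.2.0/24 below is just a documentation-range placeholder for a
configured subnet, not an address from anyone's report.

```c
#include <stdint.h>

/* Build a host-order IPv4 address from dotted-quad components. */
static uint32_t ipv4(unsigned a, unsigned b, unsigned c, unsigned d)
{
    return ((uint32_t)a << 24) | ((uint32_t)b << 16) |
           ((uint32_t)c << 8) | (uint32_t)d;
}

/* Netmask for a prefix length of 0..32 (a shift by 32 is avoided). */
static uint32_t prefix_mask(unsigned plen)
{
    return plen == 0 ? 0 : 0xffffffffu << (32 - plen);
}

/* 1 if addr belongs to net/plen, i.e. it can be resolved with ARP on
 * the interface that has this subnet configured; 0 if it is off-net. */
static int on_link(uint32_t addr, uint32_t net, unsigned plen)
{
    uint32_t m = prefix_mask(plen);

    return (addr & m) == (net & m);
}
```

With an interface on the placeholder subnet, on_link(ipv4(86,58,122,125),
ipv4(192,0,2,0), 24) is 0: a gateway like that cannot get an ARP entry,
which is where the arpresolve complaint comes from.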

--Qing


From owner-freebsd-net@FreeBSD.ORG  Thu Mar  7 21:26:37 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id 8EB13620;
 Thu,  7 Mar 2013 21:26:37 +0000 (UTC)
 (envelope-from melifaro@FreeBSD.org)
Received: from mail.ipfw.ru (unknown [IPv6:2a01:4f8:120:6141::2])
 by mx1.freebsd.org (Postfix) with ESMTP id 32DC538F;
 Thu,  7 Mar 2013 21:26:37 +0000 (UTC)
Received: from v6.mpls.in ([2a02:978:2::5] helo=ws.su29.net)
 by mail.ipfw.ru with esmtpsa (TLSv1:CAMELLIA256-SHA:256)
 (Exim 4.76 (FreeBSD)) (envelope-from <melifaro@FreeBSD.org>)
 id 1UDiO5-000Fet-By; Fri, 08 Mar 2013 01:30:05 +0400
Message-ID: <513905F2.1050409@FreeBSD.org>
Date: Fri, 08 Mar 2013 01:26:10 +0400
From: "Alexander V. Chernikov" <melifaro@FreeBSD.org>
User-Agent: Mozilla/5.0 (X11; FreeBSD amd64;
 rv:9.0) Gecko/20120121 Thunderbird/9.0
MIME-Version: 1.0
To: Andre Oppermann <andre@freebsd.org>
Subject: Re: [patch] interface routes
References: <513834E4.7050203@FreeBSD.org> <51384443.5070209@freebsd.org>
 <51387D4A.9030408@FreeBSD.org> <51388046.7040408@freebsd.org>
 <CAPBZQG3Of173MoyB-sPy=9RoivtKMRA7LZ0DYyEOSrsyk9_10A@mail.gmail.com>
 <51389B4B.1060003@freebsd.org> <5138B390.2080806@FreeBSD.org>
 <5138FE4C.5030307@freebsd.org>
In-Reply-To: <5138FE4C.5030307@freebsd.org>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 8bit
Cc: =?ISO-8859-1?Q?Ermal_Lu=E7i?= <eri@freebsd.org>, net@freebsd.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 07 Mar 2013 21:26:37 -0000

On 08.03.2013 00:53, Andre Oppermann wrote:
> On 07.03.2013 16:34, Alexander V. Chernikov wrote:
>> On 07.03.2013 17:51, Andre Oppermann wrote:
>>> On 07.03.2013 14:38, Ermal Luçi wrote:
>>>> Isn't it better to teach the routing code about metrics?
>>>> Routing daemons cope better this way and they can handle this.
>>>> So the policy of this behaviour can be controlled by the administrator
>>>> rather than by code!
>>>> With metrics you can add routes with a bigger metric for interfaces and
>>>> a lower one from routing daemons.
>>>> This could also somewhat mitigate the problem of interfaces configured
>>>> with the same subnet.
>>>
>>> Generally I agree with you that this would be the ideal outcome.
>>> However we're still quite a bit away from reaching that goal.
>>> To make this really work we have to make mpath plus metrics a first
>>> class citizen in the routing code and also update the routing
>>> daemons' kernel interfaces to know about this. I hope we get there
>>> in the not too distant future.
>>
>> Radix is already over-bloated. Typically in performance-oriented
>> solutions (hardware/software routers from vendors) there is a clear
>> separation between the RIB (where route protocol attributes, best candidate
>> routes and routes with different priorities exist) and the FIB, which is
>> typically some kind of radix with the minimum needed info, e.g.:
>> prefix, nexthops, their interfaces, optional L2 data to prepend.
>
> ACK. Though the bloat in itself is not the main problem other than kernel
> memory consumption. If you think of it in cache line misses everything
> more than 128 bytes away is potentially a cache miss. The additional
> distance due to a large or small structure makes no difference. What
> makes an important difference is the internal layout of the structure
> and whether the relevant variables are within the same cache line.
> This can be a problem in a large structure when some data is at the
> beginning and other data at the end on a different cache line. Here
> potentially twice the cache miss latency per trie element hurts.
Yup. I'm talking in cache line terms only.
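
To make the layout point concrete, here is a small sketch. The two
node layouts and the 64-byte line size are assumptions made up for
illustration; neither is the real struct radix_node. Given the fields
a lookup reads, it reports which cache lines those fields fall into.

```python
import ctypes

CACHE_LINE = 64  # assumed line size in bytes; adjust for the target CPU


class SpreadNode(ctypes.Structure):
    """Hypothetical node: lookup fields sit at both ends of the struct."""
    _fields_ = [
        ("bit_offset", ctypes.c_uint64),         # read during lookup
        ("bookkeeping", ctypes.c_uint8 * 112),   # not read during lookup
        ("left", ctypes.c_uint64),               # read during lookup
        ("right", ctypes.c_uint64),              # read during lookup
    ]


class PackedNode(ctypes.Structure):
    """Same fields and same size, lookup fields grouped at the front."""
    _fields_ = [
        ("bit_offset", ctypes.c_uint64),
        ("left", ctypes.c_uint64),
        ("right", ctypes.c_uint64),
        ("bookkeeping", ctypes.c_uint8 * 112),
    ]


HOT_FIELDS = ("bit_offset", "left", "right")


def lines_touched(node_type, fields=HOT_FIELDS, line=CACHE_LINE):
    """Return the set of cache-line indexes covered by the given fields,
    assuming the node itself starts on a cache-line boundary."""
    touched = set()
    for name in fields:
        desc = getattr(node_type, name)   # ctypes field descriptor
        first = desc.offset // line
        last = (desc.offset + desc.size - 1) // line
        touched.update(range(first, last + 1))
    return touched
```

With these made-up layouts, lines_touched(SpreadNode) covers three
lines while lines_touched(PackedNode) covers one, although both
structures have the same size.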
>
> If we can manage to put everything for a trie search into the first
> cache line we're quite good already. The additional win for tighter
> packing isn't that large anymore.
>
>> Our radix stands somewhere between RIB and FIB (since we have to support
>> route(8) and upper layer protocols): it serves badly as RIB (little
>> functionality) and as FIB: too much overhead and inefficient/too general
>> code.
>
> ACK. There is a big philosophical question on the model. Make it a
> RIB so that independent but complementary routing daemons can add
> routes concurrently and the kernel knows which have higher priority
> or are equal cost for traffic balancing (as in bgpd+ospfd). Or strip
> it to a FIB and have an external program do the RIB and coordination
> across routing daemons (as in Quagga suite).
>
>> For example, sizeof(rt_nodes[2]) (first element of rte) is 96 bytes on
>> amd64.
>
> That is a problem if the trie traversal function accesses fields beyond
> this cache line. The main problem is that key and mask are pointers
> and thus external to the radix_node adding even more cache misses.
Yes.
>
>> Additionally, rte refcount approach is totally broken.
>
> ACK. Copy and out. No references or external pointers into the table.
>
>> I'm currently thinking of adding some kind of hooks to current
>> route/radix code to permit building efficient trie (or other structure)
>> for given address family and to use it for forwarding purposes only.
>
> AFAIK Marko Zec and/or Luigi have done some work in this area as well.
>
>> For example, I don't need trie while doing MPLS label switching:
>> assuming control plane allocates contiguous label space, I can use label
>> array for efficient lookup.
>
> Nobody's forcing you to use a radix trie for MPLS. In theory each
> protocol can choose its own best method.
Well, actually this is not quite true, and that is the problem.

Userland has to manage kernel MPLS entries somehow, and the route socket
is bound to radix pretty heavily. Additionally, our route(8) abuses the
kvm(3) interface and simply walks through the in-kernel radix tree to
print routes and additional information like refcounts/use counts. There
is very old (but still working) code there that prints more or less the
same via the sysctl API, but the additional info is not propagated.
>
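
For reference, the label-array idea mentioned above could look roughly
like this. It is a sketch only: the class, its method names and the
next-hop values are invented for illustration and do not correspond to
any existing kernel interface. It assumes the control plane hands out
labels from one contiguous range.

```python
class LabelTable:
    """Direct-indexed table for a contiguous MPLS label range."""

    def __init__(self, base, count):
        self.base = base                 # first label in the range
        self.slots = [None] * count      # one slot per label

    def _index(self, label):
        idx = label - self.base
        if not 0 <= idx < len(self.slots):
            raise KeyError(label)
        return idx

    def install(self, label, nexthop):
        """Bind a label to a next hop; KeyError if outside the range."""
        self.slots[self._index(label)] = nexthop

    def remove(self, label):
        """Clear a label binding; KeyError if outside the range."""
        self.slots[self._index(label)] = None

    def lookup(self, label):
        """O(1) lookup: one subtraction, one bounds check, one index.
        Returns None for labels outside the range or not installed."""
        idx = label - self.base
        if 0 <= idx < len(self.slots):
            return self.slots[idx]
        return None
```

The forwarding path only ever calls lookup(); install() and remove()
are what a management interface would have to drive, which is exactly
the part that is tied to radix today.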


From owner-freebsd-net@FreeBSD.ORG  Thu Mar  7 21:42:12 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id 1611A532;
 Thu,  7 Mar 2013 21:42:12 +0000 (UTC)
 (envelope-from jmg@h2.funkthat.com)
Received: from h2.funkthat.com (gate2.funkthat.com [208.87.223.18])
 by mx1.freebsd.org (Postfix) with ESMTP id BE8DA645;
 Thu,  7 Mar 2013 21:42:11 +0000 (UTC)
Received: from h2.funkthat.com (localhost [127.0.0.1])
 by h2.funkthat.com (8.14.3/8.14.3) with ESMTP id r27Lg5SZ066094
 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
 Thu, 7 Mar 2013 13:42:05 -0800 (PST)
 (envelope-from jmg@h2.funkthat.com)
Received: (from jmg@localhost)
 by h2.funkthat.com (8.14.3/8.14.3/Submit) id r27Lg5oD066093;
 Thu, 7 Mar 2013 13:42:05 -0800 (PST) (envelope-from jmg)
Date: Thu, 7 Mar 2013 13:42:05 -0800
From: John-Mark Gurney <jmg@funkthat.com>
To: Andre Oppermann <andre@freebsd.org>
Subject: Re: [patch] interface routes
Message-ID: <20130307214205.GD50035@funkthat.com>
Mail-Followup-To: Andre Oppermann <andre@freebsd.org>,
 "Alexander V. Chernikov" <melifaro@freebsd.org>, net@freebsd.org
References: <513834E4.7050203@FreeBSD.org> <51384443.5070209@freebsd.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <51384443.5070209@freebsd.org>
User-Agent: Mutt/1.4.2.3i
X-Operating-System: FreeBSD 7.2-RELEASE i386
X-PGP-Fingerprint: 54BA 873B 6515 3F10 9E88  9322 9CB1 8F74 6D3F A396
X-Files: The truth is out there
X-URL: http://resnet.uoregon.edu/~gurney_j/
X-Resume: http://resnet.uoregon.edu/~gurney_j/resume.html
X-to-the-FBI-CIA-and-NSA: HI! HOW YA DOIN? can i haz chizburger?
X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.2.2
 (h2.funkthat.com [127.0.0.1]); Thu, 07 Mar 2013 13:42:05 -0800 (PST)
Cc: "Alexander V. Chernikov" <melifaro@freebsd.org>, net@freebsd.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 07 Mar 2013 21:42:12 -0000

Andre Oppermann wrote this message on Thu, Mar 07, 2013 at 08:39 +0100:
> >Adding interface address is handled via atomically deleting old prefix and 
> >adding interface one.
> 
> This brings up a long standing sore point of our routing code
> which this patch makes more pronounced.  When an interface link
> state is down I don't want the route to it to persist but to
> become inactive so another path can be chosen.  This is the very
> point of running a routing daemon.  So on the link-down event
> the installed interface routes should be removed from the routing
> table.  The configured addresses though should persist and the
> interface routes re-installed on a link-up event.  What's your
> opinion on it?
> 
> Other than these points I think your code is fine and can go
> into the tree.

The issue that I see with this is that if you bump your cable, all
your connections will be dropped, because as soon as they try to send
something, they'll get a no route to host, and this will break the
TCP connection...  If we keep the routes when the link goes down,
the packet will be queued or dropped (depending upon ethernet driver),
but the TCP connection will not break...

-- 
  John-Mark Gurney				Voice: +1 415 225 5579

     "All that I will do, has been done, All that I have, has not."

From owner-freebsd-net@FreeBSD.ORG  Fri Mar  8 00:32:53 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id 24F131FF
 for <freebsd-net@freebsd.org>; Fri,  8 Mar 2013 00:32:53 +0000 (UTC)
 (envelope-from fbsdmail@dnswatch.com)
Received: from udns.ultimateDNS.NET (ultimatedns.net [209.180.214.225])
 by mx1.freebsd.org (Postfix) with ESMTP id CA890D7C
 for <freebsd-net@freebsd.org>; Fri,  8 Mar 2013 00:32:51 +0000 (UTC)
Received: from udns.ultimateDNS.NET (localhost [127.0.0.1])
 by udns.ultimateDNS.NET (8.14.5/8.14.5) with ESMTP id r280WoB3029376
 for <freebsd-net@freebsd.org>; Thu, 7 Mar 2013 16:32:56 -0800 (PST)
 (envelope-from fbsdmail@dnswatch.com)
Received: (from www@localhost)
 by udns.ultimateDNS.NET (8.14.5/8.14.5/Submit) id r280Wjrp029370;
 Thu, 7 Mar 2013 16:32:45 -0800 (PST)
 (envelope-from fbsdmail@dnswatch.com)
Received: from udns.ultimatedns.net ([209.180.214.225])
 (UDNSMS authenticated user chrish) by ultimatedns.net with HTTP;
 Thu, 7 Mar 2013 16:32:45 -0800 (PST)
Message-ID: <3a292f3eabb7a27bd9f942f98d6b0e20.authenticated@ultimatedns.net>
In-Reply-To: <b77c4b60019d745d151be9ba3e5446cc.authenticated@ultimatedns.net>
References: <b77c4b60019d745d151be9ba3e5446cc.authenticated@ultimatedns.net>
Date: Thu, 7 Mar 2013 16:32:45 -0800 (PST)
Subject: Re: Implementing IP6 in 8.3
From: "freebsd-net" <fbsdmail@dnswatch.com>
To: freebsd-net@freebsd.org
User-Agent: UDNSMS/2.0.3
MIME-Version: 1.0
Content-Type: text/plain;charset=utf-8
Content-Transfer-Encoding: 8bit
X-Priority: 3 (Normal)
Importance: Normal
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 08 Mar 2013 00:32:53 -0000

> Greetings,
>  I'm evaluating an ISP for the sake of building BSD operating systems on hardware
> that they use (DSL modems, in this case). When I had my old NEC server, I had a
> MIPS environment to develop in. I managed a 28k kernel. In any case, I'm back at
> it for use in a lot of hardware I have lying around. In my current situation, I'm
> using a ZYXEL Q1000Z modem to connect to their service. While it's a relatively
> new modem, it doesn't support IP6. It is my hope to replace the OS with one that
> does. :)
> I leased a /48 of IP4's from them, which /also/ came with as many IP6's.
EDIT
The above line /should/ have read:
I leased a /29 of IP4's from them, which /also/ came with as many IP6's.
___________^^^
/EDIT

Sorry.

> So, not having implemented IP6 on any of my boxes (except by way of tunnel brokers),
> I'm wondering 2 things:
> If my underlying OS (FreeBSD-8.3) can support IP6, will it still function, even tho
> my gateway (modem) doesn't?
> Am I /correctly/ attempting to use it?
> I'm answering authoritatively for the many domains I own. They have all functioned
> well for many years via IP4. I have added the requisite AAAA records in all the zones,
> as well as the associated RR's.
> While the gateway (modem) /does/ have an IP6 address, I can't "speak" for it out of
> DNS, because it would be an "out of zone" record. Even tho I'm the RP for the /48.
> So it's up to the modem to answer accordingly.
> BUT, I'm not sure I'm initiating any of this correctly via rc(8). Or more specifically,
> via rc.conf(5). While I've read as much as I can find on the topic related to BSD,
> boot messages indicate at least -- "IP6 gateway unreachable".
> I'm currently using:
> rc.conf(5):
> ipv6_ifconfig_re0="2602:00d1:b4d6:e100:0000:0000:0000:0000"
> ipv6_defaultrouter="2602:00d1:b4d6:e600:0000:0000:0000:0000"
> I also have the corresponding host IP in hosts(5).
>
> Any help, pointers, guidance, answers /greatly/ appreciated.
>
> Thank you for all your time, and consideration.
>
> --Chris
>


From owner-freebsd-net@FreeBSD.ORG  Fri Mar  8 01:16:00 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id B8CE1E4C;
 Fri,  8 Mar 2013 01:16:00 +0000 (UTC)
 (envelope-from ncrogers@gmail.com)
Received: from mail-ve0-f170.google.com (mail-ve0-f170.google.com
 [209.85.128.170])
 by mx1.freebsd.org (Postfix) with ESMTP id 3CA93ECD;
 Fri,  8 Mar 2013 01:15:59 +0000 (UTC)
Received: by mail-ve0-f170.google.com with SMTP id 14so903928vea.29
 for <multiple recipients>; Thu, 07 Mar 2013 17:15:53 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:x-received:in-reply-to:references:date:message-id
 :subject:from:to:cc:content-type;
 bh=j2PSlh2PkxkO1W+jcjXmqz/cReoTUDnkgO6McayluWU=;
 b=CxXbG7yLi2dVO2Sks2wbCKZyEC6g1R3YBoP9NVxiZlTMguoj1vLojrK2JJgwT4nZMt
 m+e3ksRUmbYqupwtzuwk/WgIhqd7u7Gc5YUtUkrJCCKw4HGstjt5PQprsD3HQVxDcrot
 9KDEZF09irVye9wub7k4XHHSSBQYkwShkm+EDZchF8d6cUtlUZEd0KC6EtnTGq73aGjH
 un06NgNBSfcL5GObSWvIyuxDyUC5+DQBScyZFZsqltD1EjACUms/YiwWM3lz0fUT1iR/
 XcVTrL1Q2nUPHxUqRDtXE5rN3N3Ay3LeqwTuXDUqsOXhcCRLt6XDNxzRxiEka5Yw6TwJ
 zaaQ==
MIME-Version: 1.0
X-Received: by 10.220.151.144 with SMTP id c16mr182147vcw.18.1362705353455;
 Thu, 07 Mar 2013 17:15:53 -0800 (PST)
Received: by 10.52.176.131 with HTTP; Thu, 7 Mar 2013 17:15:53 -0800 (PST)
In-Reply-To: <B143A8975061C446AD5E29742C5317231EA96FBE@pwsvl-excmbx-04.internal.cacheflow.com>
References: <CAJ-VmokO85PaW6DksfqsBuSdMPUEGv9f_2qEMAD7WTv4QK0vcw@mail.gmail.com>
 <51378A9D.6080306@freebsd.org>
 <CAKOb=YYPzSfUmcmmvE0fy-2QhzUK5hZUVUEdJs3uho2a6iGz+g@mail.gmail.com>
 <5138C9C8.6030809@freebsd.org> <5138EA26.4000403@airnet.opole.pl>
 <B143A8975061C446AD5E29742C5317231EA96FBE@pwsvl-excmbx-04.internal.cacheflow.com>
Date: Thu, 7 Mar 2013 17:15:53 -0800
Message-ID: <CAKOb=YYnnANb69mmTfYxucuR1gYQweZ_-fzCa+gNh3TnphBFaQ@mail.gmail.com>
Subject: Re: Default route changes unexpectedly #2 (was Re: kernel:
 arpresolve: can't allocate llinfo for 65.59.233.102)
From: Nick Rogers <ncrogers@gmail.com>
To: "Li, Qing" <qing.li@bluecoat.com>
Content-Type: text/plain; charset=ISO-8859-1
Cc: Krzysztof Barcikowski <krzysiek@airnet.opole.pl>,
 "freebsd-net@freebsd.org" <freebsd-net@freebsd.org>,
 Andre Oppermann <andre@freebsd.org>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 08 Mar 2013 01:16:00 -0000

On Thu, Mar 7, 2013 at 12:51 PM, Li, Qing <qing.li@bluecoat.com> wrote:
> Hi,
>
>>
>> I can confirm I get these messages as well:
>>
>> Mar  7 19:40:25 opole kernel: arpresolve: can't allocate llinfo for
>> 86.58.122.125
>> Mar  7 19:40:25 opole kernel: arpresolve: can't allocate llinfo for
>> 86.58.122.125
>>
>> IP 86.58.122.125 is not from IP pool used by me.
>>
>
> This kernel message is merely a side effect of a bad route (with
> off-net IP address) being injected as a default route replacement.

I would normally agree; however, in the last case the arpresolve
messages started happening about two hours before the default route
was changed to the IP in the arpresolve message. At least, that's when
my script that runs every second detected a change in the route. The
script parses netstat -rn output to determine if the default route is
correct or not. There was not a 2 hour downtime.
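
A minimal sketch of that kind of check follows. It is not the actual
rxgd code; the function names are made up, and it assumes the usual
netstat -rn layout with a header line starting with "Destination" and
containing a "Netif" column.

```python
def find_default_route(netstat_output):
    """Return (gateway, netif) for the first 'default' entry, or None.

    Expects text in the layout printed by `netstat -rn -f inet`: a
    header line starting with 'Destination', then one route per line.
    """
    netif_col = None
    for line in netstat_output.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "Destination":
            # Locate the interface column from the header instead of
            # hard-coding it, since extra flags add columns.
            netif_col = fields.index("Netif") if "Netif" in fields else None
            continue
        if fields[0] == "default" and len(fields) > 1:
            netif = None
            if netif_col is not None and netif_col < len(fields):
                netif = fields[netif_col]
            return fields[1], netif
    return None


def default_route_ok(netstat_output, expected_gateway):
    """True only if a default route exists and points at the expected
    gateway address."""
    route = find_default_route(netstat_output)
    return route is not None and route[0] == expected_gateway
```

In a real check the text would come from running /usr/bin/netstat,
for example through subprocess.run(..., capture_output=True, text=True),
and a False result would trigger the delete-and-restore step.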

Here's the logging from my script, followed by the corresponding
/var/log/messages entries.

2013/03/05 21:13:02 rxgd[10816] DEBUG> Rxg::Route::parseRoutes -
/usr/bin/netstat -rnlW -f inet
2013/03/05 21:13:02 rxgd[10816] DEBUG> Rxg::Route::parseRoutes - done
parsing routes
2013/03/05 21:13:02 rxgd[10816] INFO> Rxg::Route::checkDefaultRoute -
deleting incorrect default route em0/50.142.201.101

Mar  5 19:12:48 westmar kernel: arpresolve: can't allocate llinfo for
50.142.201.101
Mar  5 19:12:48 westmar last message repeated 107 times
Mar  5 21:12:48 westmar named[10906]: internal_send: 66.187.177.13#53:
Invalid argument
Mar  5 19:12:48 westmar kernel: arpresolve: can't allocate llinfo for
50.142.201.101
Mar  5 19:12:48 westmar last message repeated 24 times

I don't understand the timestamps however. It is peculiar to have an
arpresolve message at 19:12 followed by bind logging from 21:12, then
another arpresolve at 19:12. Maybe it is just because of losing the
default route.

>
> --Qing
>
>

From owner-freebsd-net@FreeBSD.ORG  Fri Mar  8 07:10:43 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id 9108689A;
 Fri,  8 Mar 2013 07:10:43 +0000 (UTC)
 (envelope-from wollman@hergotha.csail.mit.edu)
Received: from hergotha.csail.mit.edu
 (wollman-1-pt.tunnel.tserv4.nyc4.ipv6.he.net [IPv6:2001:470:1f06:ccb::2])
 by mx1.freebsd.org (Postfix) with ESMTP id 24604CEF;
 Fri,  8 Mar 2013 07:10:42 +0000 (UTC)
Received: from hergotha.csail.mit.edu (localhost [127.0.0.1])
 by hergotha.csail.mit.edu (8.14.5/8.14.5) with ESMTP id r287AfnT054755;
 Fri, 8 Mar 2013 02:10:41 -0500 (EST)
 (envelope-from wollman@hergotha.csail.mit.edu)
Received: (from wollman@localhost)
 by hergotha.csail.mit.edu (8.14.5/8.14.4/Submit) id r287AfKg054752;
 Fri, 8 Mar 2013 02:10:41 -0500 (EST) (envelope-from wollman)
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Message-ID: <20793.36593.774795.720959@hergotha.csail.mit.edu>
Date: Fri, 8 Mar 2013 02:10:41 -0500
From: Garrett Wollman <wollman@freebsd.org>
To: freebsd-net@freebsd.org
Subject: Limits on jumbo mbuf cluster allocation
X-Mailer: VM 7.17 under 21.4 (patch 22) "Instant Classic" XEmacs Lucid
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.2.7
 (hergotha.csail.mit.edu [127.0.0.1]); Fri, 08 Mar 2013 02:10:42 -0500 (EST)
X-Spam-Status: No, score=-1.0 required=5.0 tests=ALL_TRUSTED
 autolearn=disabled version=3.3.2
X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on
 hergotha.csail.mit.edu
Cc: jfv@freebsd.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 08 Mar 2013 07:10:43 -0000

I have a machine (actually six of them) with an Intel dual-10G NIC on
the motherboard.  Two of them (so far) are connected to a network
using jumbo frames, with an MTU a little under 9k, so the ixgbe driver
allocates 32,000 9k clusters for its receive rings.  I have noticed,
on the machine that is an active NFS server, that it can get into a
state where allocating more 9k clusters fails (as reflected in the
mbuf failure counters) at a utilization far lower than the configured
limits -- in fact, quite close to the number allocated by the driver
for its rx ring.  Eventually, network traffic grinds completely to a
halt, and if one of the interfaces is administratively downed, it
cannot be brought back up again.  There's generally plenty of physical
memory free (at least two or three GB).

There are no console messages generated to indicate what is going on,
and overall UMA usage doesn't look extreme.  I'm guessing that this is
a result of kernel memory fragmentation, although I'm a little bit
unclear as to how this actually comes about.  I am assuming that this
hardware has only limited scatter-gather capability and can't receive
a single packet into multiple buffers of a smaller size, which would
reduce the requirement for two-and-a-quarter consecutive pages of KVA
for each packet.  In actual usage, most of our clients aren't on a
jumbo network, so most of the time, all the packets will fit into a
normal 2k cluster, and we've never observed this issue when the
*server* is on a non-jumbo network.

Does anyone have suggestions for dealing with this issue?  Will
increasing the amount of KVA (to, say, twice physical memory) help
things?  It seems to me like a bug that these large packets don't have
their own submap to ensure that allocation is always possible when
sufficient physical pages are available.

-GAWollman

From owner-freebsd-net@FreeBSD.ORG  Fri Mar  8 07:54:23 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id 61106F88
 for <freebsd-net@freebsd.org>; Fri,  8 Mar 2013 07:54:23 +0000 (UTC)
 (envelope-from andre@freebsd.org)
Received: from c00l3r.networx.ch (c00l3r.networx.ch [62.48.2.2])
 by mx1.freebsd.org (Postfix) with ESMTP id D7213EFC
 for <freebsd-net@freebsd.org>; Fri,  8 Mar 2013 07:54:22 +0000 (UTC)
Received: (qmail 52521 invoked from network); 8 Mar 2013 09:07:40 -0000
Received: from c00l3r.networx.ch (HELO [127.0.0.1]) ([62.48.2.2])
 (envelope-sender <andre@freebsd.org>)
 by c00l3r.networx.ch (qmail-ldap-1.03) with SMTP
 for <wollman@freebsd.org>; 8 Mar 2013 09:07:40 -0000
Message-ID: <51399926.6020201@freebsd.org>
Date: Fri, 08 Mar 2013 08:54:14 +0100
From: Andre Oppermann <andre@freebsd.org>
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64;
 rv:17.0) Gecko/20130107 Thunderbird/17.0.2
MIME-Version: 1.0
To: Garrett Wollman <wollman@freebsd.org>
Subject: Re: Limits on jumbo mbuf cluster allocation
References: <20793.36593.774795.720959@hergotha.csail.mit.edu>
In-Reply-To: <20793.36593.774795.720959@hergotha.csail.mit.edu>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: jfv@freebsd.org, freebsd-net@freebsd.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 08 Mar 2013 07:54:23 -0000

On 08.03.2013 08:10, Garrett Wollman wrote:
> I have a machine (actually six of them) with an Intel dual-10G NIC on
> the motherboard.  Two of them (so far) are connected to a network
> using jumbo frames, with an MTU a little under 9k, so the ixgbe driver
> allocates 32,000 9k clusters for its receive rings.  I have noticed,
> on the machine that is an active NFS server, that it can get into a
> state where allocating more 9k clusters fails (as reflected in the
> mbuf failure counters) at a utilization far lower than the configured
> limits -- in fact, quite close to the number allocated by the driver
> for its rx ring.  Eventually, network traffic grinds completely to a
> halt, and if one of the interfaces is administratively downed, it
> cannot be brought back up again.  There's generally plenty of physical
> memory free (at least two or three GB).

You have an amd64 kernel running HEAD or 9.x?

> There are no console messages generated to indicate what is going on,
> and overall UMA usage doesn't look extreme.  I'm guessing that this is
> a result of kernel memory fragmentation, although I'm a little bit
> unclear as to how this actually comes about.  I am assuming that this
> hardware has only limited scatter-gather capability and can't receive
> a single packet into multiple buffers of a smaller size, which would
> reduce the requirement for two-and-a-quarter consecutive pages of KVA
> for each packet.  In actual usage, most of our clients aren't on a
> jumbo network, so most of the time, all the packets will fit into a
> normal 2k cluster, and we've never observed this issue when the
> *server* is on a non-jumbo network.
>
> Does anyone have suggestions for dealing with this issue?  Will
> increasing the amount of KVA (to, say, twice physical memory) help
> things?  It seems to me like a bug that these large packets don't have
> their own submap to ensure that allocation is always possible when
> sufficient physical pages are available.

Jumbo pages come directly from the kernel_map which on amd64 is 512GB.
So KVA shouldn't be a problem.  Your problem indeed appears to come
from physical memory fragmentation in pmap.  There is a buddy memory
allocator at work but I fear it runs into serious trouble when it has
to allocate a large number of objects spanning more than 2 contiguous
pages.  Also since you're doing NFS serving almost all memory will be
in use for file caching.

Running a NIC with jumbo frames enabled gives some interesting trade-
offs.  Unfortunately most NIC's can't have multiple DMA buffer sizes
on the same receive queue and pick the best size for the incoming frame.
That means they need to use the largest jumbo mbuf for all receive traffic,
even a tiny 40 byte ACK.  The send side is not constrained in such a way
and tries to use PAGE_SIZE clusters for socket buffers whenever it can.

Many, but not all, NIC's are able to split a received jumbo frame into
multiple smaller DMA segments forming an mbuf chain.  The ixgbe hardware
is capable of doing this; the driver supports it but doesn't actively
make use of it.

Another issue with many drivers is their inability to deal with mbuf
allocation failure for their receive DMA ring.  They try to fill it
up to the maximal ring size and balk on failure.  Rings have become
very big and usually are a power of two.  The driver could function
with a partially filled RX ring too, maybe with some performance
impact when it gets really low.  On every rxeof it tries to refill
the ring, so when resources become available again it'd balance out.
NIC's with multiple receive queues/rings make this problem even more
acute.
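
As a rough illustration of the refill behaviour described above, here
is a toy model. The ring is just a list and alloc_cluster is a
hypothetical callback that returns None on allocation failure; real
descriptor rings also have head/tail ordering constraints that this
sketch ignores.

```python
def refill_ring(ring, alloc_cluster):
    """Fill empty (None) slots in `ring` using `alloc_cluster()`.

    `alloc_cluster` returns a buffer object, or None when allocation
    fails. On failure we stop and leave the remaining slots empty
    instead of treating it as fatal. Returns the number of slots filled.
    """
    filled = 0
    for i, slot in enumerate(ring):
        if slot is not None:
            continue
        buf = alloc_cluster()
        if buf is None:
            break          # out of clusters for now; retry on next call
        ring[i] = buf
        filled += 1
    return filled


def ring_usable(ring, min_buffers=1):
    """The ring can still receive as long as some buffers are posted."""
    return sum(slot is not None for slot in ring) >= min_buffers
```

Calling refill_ring() from every rxeof pass means a ring that came up
only partially filled tops itself up once clusters are available
again.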

A theoretical fix would be to dedicate an entire super page of 1GB
or so exclusively to the jumbo frame UMA zone as backing memory.  That
memory is gone for all other uses though, even if not actually used.
Allocating the superpage and determining its size would have to be
done manually by setting loader variables.  I don't see a reasonable
way to do this with autotuning because it requires advance knowledge
of the usage patterns.

IMHO the right fix is to strongly discourage use of jumbo clusters
larger than PAGE_SIZE when the hardware is capable of splitting the
frame into multiple clusters.  The allocation constraint then is only
available memory and no longer contiguous pages.  Also the waste
factor for small frames is much lower.  The performance impact is
minimal to non-existent.  In addition drivers shouldn't break down
when the RX ring can't be filled to the max.
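
The arithmetic behind the waste argument can be sketched as follows.
The sizes are assumptions for illustration: 4096-byte pages, a 9k
cluster of 9 * 1024 bytes, and the 32,000 clusters mentioned in the
original report.

```python
PAGE_SIZE = 4096          # assumed page size (amd64)
CLUSTER_9K = 9 * 1024     # assumed size of a 9k jumbo cluster
RING_CLUSTERS = 32000     # figure quoted in the original report


def clusters_needed(frame_len, cluster_size):
    """Clusters required to hold one frame (ceiling division)."""
    return -(-frame_len // cluster_size)


def wasted_bytes(frame_len, cluster_size):
    """Bytes allocated for one frame but not used by it."""
    return clusters_needed(frame_len, cluster_size) * cluster_size - frame_len


def pages_per_cluster(cluster_size, page_size=PAGE_SIZE):
    """How many pages one cluster spans (may be fractional)."""
    return cluster_size / page_size
```

With these assumed sizes a 9k cluster spans 2.25 pages, a 40-byte ACK
wastes 9176 bytes in a 9k cluster versus 4056 in a page-sized one, and
a 9000-byte frame needs three page-sized clusters. The 32,000 9k
clusters add up to 294,912,000 bytes (281.25 MiB), each of which has
to be found as contiguous pages.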

I recently got yelled at for suggesting to remove jumbo > PAGE_SIZE.
However your case proves that such jumbo frames are indeed their own
can of worms and should really only and exclusively be used for NIC's
that have to do jumbo *and* are incapable of RX scatter DMA.

-- 
Andre


From owner-freebsd-net@FreeBSD.ORG  Fri Mar  8 07:55:13 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id C51139D;
 Fri,  8 Mar 2013 07:55:13 +0000 (UTC)
 (envelope-from pyunyh@gmail.com)
Received: from mail-pb0-f52.google.com (mail-pb0-f52.google.com
 [209.85.160.52]) by mx1.freebsd.org (Postfix) with ESMTP id 825B2F0F;
 Fri,  8 Mar 2013 07:55:13 +0000 (UTC)
Received: by mail-pb0-f52.google.com with SMTP id ma3so1014403pbc.25
 for <multiple recipients>; Thu, 07 Mar 2013 23:55:07 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=x-received:from:date:to:cc:subject:message-id:reply-to:references
 :mime-version:content-type:content-disposition:in-reply-to
 :user-agent; bh=VcQqfCnSkyv44aLqzTPHNhhkPzmU9WWa0b9o1VF+2AM=;
 b=B+7qkk2cAAsDNNCFlYxomD8Wg2/AktnAfQ6PgeLEKkgZb6O//Kjvvbflh/najsTJnN
 xjysmxEnz181fZQOwXs2puMpimAsoV5cLC6/Ax5PkgoOeJLhuzwxh+sS3MeH1bYz0NbY
 YIlxCyXswFPzFDdjgpLVKsqKX6cjR02aW/GZsEFUoMHl0dDdU/+XFYZ5A+7Wm0Zivd6d
 pFEFvGvDwTCG9QkUjZQYN9JxfbxwvkiJ6/owPv93nofxkypwv4h5hwYXQy+HToePrhyq
 rfqXMTTaAhBoL/7Ytca1AVGX7TWq2W1X9FP4Jixpfvo9qxoFqUOyih4boYpkyhzsjI/Q
 ohtA==
X-Received: by 10.67.11.4 with SMTP id ee4mr2670500pad.107.1362729307415;
 Thu, 07 Mar 2013 23:55:07 -0800 (PST)
Received: from pyunyh@gmail.com (lpe4.p59-icn.cdngp.net. [114.111.62.249])
 by mx.google.com with ESMTPS id av14sm5355178pac.18.2013.03.07.23.55.03
 (version=TLSv1 cipher=RC4-SHA bits=128/128);
 Thu, 07 Mar 2013 23:55:06 -0800 (PST)
Received: by pyunyh@gmail.com (sSMTP sendmail emulation);
 Fri, 08 Mar 2013 16:54:58 +0900
From: YongHyeon PYUN <pyunyh@gmail.com>
Date: Fri, 8 Mar 2013 16:54:58 +0900
To: Garrett Wollman <wollman@freebsd.org>
Subject: Re: Limits on jumbo mbuf cluster allocation
Message-ID: <20130308075458.GA1442@michelle.cdnetworks.com>
References: <20793.36593.774795.720959@hergotha.csail.mit.edu>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20793.36593.774795.720959@hergotha.csail.mit.edu>
User-Agent: Mutt/1.4.2.3i
Cc: jfv@freebsd.org, freebsd-net@freebsd.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
Reply-To: pyunyh@gmail.com
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 08 Mar 2013 07:55:13 -0000

On Fri, Mar 08, 2013 at 02:10:41AM -0500, Garrett Wollman wrote:
> I have a machine (actually six of them) with an Intel dual-10G NIC on
> the motherboard.  Two of them (so far) are connected to a network
> using jumbo frames, with an MTU a little under 9k, so the ixgbe driver
> allocates 32,000 9k clusters for its receive rings.  I have noticed,
> on the machine that is an active NFS server, that it can get into a
> state where allocating more 9k clusters fails (as reflected in the
> mbuf failure counters) at a utilization far lower than the configured
> limits -- in fact, quite close to the number allocated by the driver
> for its rx ring.  Eventually, network traffic grinds completely to a
> halt, and if one of the interfaces is administratively downed, it
> cannot be brought back up again.  There's generally plenty of physical
> memory free (at least two or three GB).
> 
> There are no console messages generated to indicate what is going on,
> and overall UMA usage doesn't look extreme.  I'm guessing that this is
> a result of kernel memory fragmentation, although I'm a little bit
> unclear as to how this actually comes about.  I am assuming that this
> hardware has only limited scatter-gather capability and can't receive
> a single packet into multiple buffers of a smaller size, which would
> reduce the requirement for two-and-a-quarter consecutive pages of KVA
> for each packet.  In actual usage, most of our clients aren't on a
> jumbo network, so most of the time, all the packets will fit into a
> normal 2k cluster, and we've never observed this issue when the
> *server* is on a non-jumbo network.
> 

AFAIK all Intel controllers assemble a jumbo frame by concatenating
multiple mbufs on the RX side, so there is no physically contiguous
9KB allocation. My guess is that mbufs could be leaking when jumbo
frames are enabled. I would check how the driver handles an mbuf
shortage or frame errors while mbuf concatenation for a jumbo frame
is in progress.

> Does anyone have suggestions for dealing with this issue?  Will
> increasing the amount of KVA (to, say, twice physical memory) help
> things?  It seems to me like a bug that these large packets don't have
> their own submap to ensure that allocation is always possible when
> sufficient physical pages are available.
> 
> -GAWollman

From owner-freebsd-net@FreeBSD.ORG  Fri Mar  8 08:27:44 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id 22F4F981;
 Fri,  8 Mar 2013 08:27:44 +0000 (UTC)
 (envelope-from jfvogel@gmail.com)
Received: from mail-ve0-f180.google.com (mail-ve0-f180.google.com
 [209.85.128.180]) by mx1.freebsd.org (Postfix) with ESMTP id B7AFABF;
 Fri,  8 Mar 2013 08:27:43 +0000 (UTC)
Received: by mail-ve0-f180.google.com with SMTP id jx10so1040939veb.25
 for <multiple recipients>; Fri, 08 Mar 2013 00:27:37 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:x-received:in-reply-to:references:date:message-id
 :subject:from:to:cc:content-type;
 bh=gdU2vJzm6t7YLjU3UDorWz5FHnY+HPa9McEz1yMCyCE=;
 b=qZZNNrVVstQ0I7K74WY6rwNkgy8oTnegeQiPmHhfgQwDkw2I3fqLOAQA9MPVrtWgUH
 dU8DH0mp2XmCHtpVoOnVIstJJk+K4i/LLiE7T5P1manDYF/wagbMYl0RZjc/4H2WZAue
 lXCe+p6qG+7tyTkrA11Og9rjHzBoxAsq5ZOZOcC/TNmh00MhQTmt9S0Mnh4C9lUwNyfI
 dl5YLvOKi+eDkUPxxfXUOBEzkbGJ/8T8ja/pfetZuhAQmluDH6fO4ANvUFBls+4ivu7i
 wP9BywpSFZLOkzQb+hPnoFeW2uZVtbVxmzf8sJk4bxuHIp6NH54qD56G0kX4hvpdWcOc
 Ge5w==
MIME-Version: 1.0
X-Received: by 10.58.56.161 with SMTP id b1mr588244veq.42.1362731257517; Fri,
 08 Mar 2013 00:27:37 -0800 (PST)
Received: by 10.220.191.132 with HTTP; Fri, 8 Mar 2013 00:27:37 -0800 (PST)
In-Reply-To: <20130308075458.GA1442@michelle.cdnetworks.com>
References: <20793.36593.774795.720959@hergotha.csail.mit.edu>
 <20130308075458.GA1442@michelle.cdnetworks.com>
Date: Fri, 8 Mar 2013 00:27:37 -0800
Message-ID: <CAFOYbckHDeuwmcPZzhewqrAju3GZ8er6nnTVgkNeVhvH4k=ydQ@mail.gmail.com>
Subject: Re: Limits on jumbo mbuf cluster allocation
From: Jack Vogel <jfvogel@gmail.com>
To: pyunyh@gmail.com
Content-Type: text/plain; charset=ISO-8859-1
X-Content-Filtered-By: Mailman/MimeDel 2.1.14
Cc: jfv@freebsd.org, freebsd-net@freebsd.org,
 Garrett Wollman <wollman@freebsd.org>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 08 Mar 2013 08:27:44 -0000

On Thu, Mar 7, 2013 at 11:54 PM, YongHyeon PYUN <pyunyh@gmail.com> wrote:

> On Fri, Mar 08, 2013 at 02:10:41AM -0500, Garrett Wollman wrote:
> > I have a machine (actually six of them) with an Intel dual-10G NIC on
> > the motherboard.  Two of them (so far) are connected to a network
> > using jumbo frames, with an MTU a little under 9k, so the ixgbe driver
> > allocates 32,000 9k clusters for its receive rings.  I have noticed,
> > on the machine that is an active NFS server, that it can get into a
> > state where allocating more 9k clusters fails (as reflected in the
> > mbuf failure counters) at a utilization far lower than the configured
> > limits -- in fact, quite close to the number allocated by the driver
> > for its rx ring.  Eventually, network traffic grinds completely to a
> > halt, and if one of the interfaces is administratively downed, it
> > cannot be brought back up again.  There's generally plenty of physical
> > memory free (at least two or three GB).
> >
> > There are no console messages generated to indicate what is going on,
> > and overall UMA usage doesn't look extreme.  I'm guessing that this is
> > a result of kernel memory fragmentation, although I'm a little bit
> > unclear as to how this actually comes about.  I am assuming that this
> > hardware has only limited scatter-gather capability and can't receive
> > a single packet into multiple buffers of a smaller size, which would
> > reduce the requirement for two-and-a-quarter consecutive pages of KVA
> > for each packet.  In actual usage, most of our clients aren't on a
> > jumbo network, so most of the time, all the packets will fit into a
> > normal 2k cluster, and we've never observed this issue when the
> > *server* is on a non-jumbo network.
> >
>
> AFAIK all Intel controllers generate jumbo frame by concatenating
> multiple mbufs on RX side so there is no physically contiguous 9KB
> allocation. I vaguely guess there could be mbuf leakage when jumbo
> frame is enabled. I would check how driver handles mbuf shortage or
> frame errors while mbuf concatenation for jumbo frame is in
> progress.
>

No, this is not true. If using a 9K jumbo, it will actually use the larger
mbuf pool; the code has been this way for a little while now.

Jack


>
> > Does anyone have suggestions for dealing with this issue?  Will
> > increasing the amount of KVA (to, say, twice physical memory) help
> > things?  It seems to me like a bug that these large packets don't have
> > their own submap to ensure that allocation is always possible when
> > sufficient physical pages are available.
> >
> > -GAWollman
>

From owner-freebsd-net@FreeBSD.ORG  Fri Mar  8 08:31:19 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id 77458ABE;
 Fri,  8 Mar 2013 08:31:19 +0000 (UTC)
 (envelope-from jfvogel@gmail.com)
Received: from mail-vb0-x22d.google.com (mail-vb0-x22d.google.com
 [IPv6:2607:f8b0:400c:c02::22d])
 by mx1.freebsd.org (Postfix) with ESMTP id E3F15FB;
 Fri,  8 Mar 2013 08:31:18 +0000 (UTC)
Received: by mail-vb0-f45.google.com with SMTP id p1so541209vbi.18
 for <multiple recipients>; Fri, 08 Mar 2013 00:31:18 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:x-received:in-reply-to:references:date:message-id
 :subject:from:to:cc:content-type;
 bh=qPvTnpHswbOK3Z0nN+72XEjySWIXeQkiYSdWHWSEGo8=;
 b=sS/WOwlYJtiR62rUHNnTTy//dzLQwUQP1iWD+fsC0k8y+l6D2I6nn1GGNKUnS8sMaG
 2qQ+jD2vpgdzPS3siDedT34fGf+0m5HdgLCqQVIOfkqpVuiECDz7qQIIAnpXr+CLrPOX
 ghJayHkWLJsYHZImGsahYnj5yTrlyc4dFRA9A6vV1AtwD14ny3zeZhN+KmvsWR7bPkEi
 meJSLpzW7hB7jCMDhuTCOBrOAdBwOWQL7g/sJFrT16Li4Xwoi0HivqfHhdkRErh/umLt
 S2ollcl7IQUNV9U3j+k28w+UjPHwO+VU6jphh0RNtxexmwOqw7K6kRRgQxa+uXKuf8HQ
 /gOQ==
MIME-Version: 1.0
X-Received: by 10.52.19.239 with SMTP id i15mr520505vde.47.1362731478407; Fri,
 08 Mar 2013 00:31:18 -0800 (PST)
Received: by 10.220.191.132 with HTTP; Fri, 8 Mar 2013 00:31:18 -0800 (PST)
In-Reply-To: <51399926.6020201@freebsd.org>
References: <20793.36593.774795.720959@hergotha.csail.mit.edu>
 <51399926.6020201@freebsd.org>
Date: Fri, 8 Mar 2013 00:31:18 -0800
Message-ID: <CAFOYbc=x7U-s70KvcZJdrVP6v-On716qMi=HN1P2Kj+d_K972A@mail.gmail.com>
Subject: Re: Limits on jumbo mbuf cluster allocation
From: Jack Vogel <jfvogel@gmail.com>
To: Andre Oppermann <andre@freebsd.org>
Content-Type: text/plain; charset=ISO-8859-1
X-Content-Filtered-By: Mailman/MimeDel 2.1.14
Cc: jfv@freebsd.org, freebsd-net@freebsd.org,
 Garrett Wollman <wollman@freebsd.org>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 08 Mar 2013 08:31:19 -0000

On Thu, Mar 7, 2013 at 11:54 PM, Andre Oppermann <andre@freebsd.org> wrote:

> On 08.03.2013 08:10, Garrett Wollman wrote:
>
>> I have a machine (actually six of them) with an Intel dual-10G NIC on
>> the motherboard.  Two of them (so far) are connected to a network
>> using jumbo frames, with an MTU a little under 9k, so the ixgbe driver
>> allocates 32,000 9k clusters for its receive rings.  I have noticed,
>> on the machine that is an active NFS server, that it can get into a
>> state where allocating more 9k clusters fails (as reflected in the
>> mbuf failure counters) at a utilization far lower than the configured
>> limits -- in fact, quite close to the number allocated by the driver
>> for its rx ring.  Eventually, network traffic grinds completely to a
>> halt, and if one of the interfaces is administratively downed, it
>> cannot be brought back up again.  There's generally plenty of physical
>> memory free (at least two or three GB).
>>
>
> You have an amd64 kernel running HEAD or 9.x?
>
>
>> There are no console messages generated to indicate what is going on,
>> and overall UMA usage doesn't look extreme.  I'm guessing that this is
>> a result of kernel memory fragmentation, although I'm a little bit
>> unclear as to how this actually comes about.  I am assuming that this
>> hardware has only limited scatter-gather capability and can't receive
>> a single packet into multiple buffers of a smaller size, which would
>> reduce the requirement for two-and-a-quarter consecutive pages of KVA
>> for each packet.  In actual usage, most of our clients aren't on a
>> jumbo network, so most of the time, all the packets will fit into a
>> normal 2k cluster, and we've never observed this issue when the
>> *server* is on a non-jumbo network.
>>
>> Does anyone have suggestions for dealing with this issue?  Will
>> increasing the amount of KVA (to, say, twice physical memory) help
>> things?  It seems to me like a bug that these large packets don't have
>> their own submap to ensure that allocation is always possible when
>> sufficient physical pages are available.
>>
>
> Jumbo pages come directly from the kernel_map which on amd64 is 512GB.
> So KVA shouldn't be a problem.  Your problem indeed appears to come
> from physical memory fragmentation in pmap.  There is a buddy memory
> allocator at work but I fear it runs into serious trouble when it has
> to allocate a large number of objects spanning more than 2 contiguous
> pages.  Also since you're doing NFS serving almost all memory will be
> in use for file caching.
>
> Running a NIC with jumbo frames enabled gives some interesting
> trade-offs.  Unfortunately most NICs can't have multiple DMA buffer
> sizes on the same receive queue and pick the best size for the incoming
> frame.  That means they need to use the largest jumbo mbuf for all
> receive traffic, even a tiny 40 byte ACK.  The send side is not
> constrained in such a way and tries to use PAGE_SIZE clusters for
> socket buffers whenever it can.
>
> Many, but not all, NICs are able to split a received jumbo frame into
> multiple smaller DMA segments forming an mbuf chain.  The ixgbe hardware
> is capable of doing this; the driver supports it but doesn't actively
> make use of it.
>
> Another issue with many drivers is their inability to deal with mbuf
> allocation failure for their receive DMA ring.  They try to fill it
> up to the maximal ring size and balk on failure.  Rings have become
> very big and usually are a power of two.  The driver could function
> with a partially filled RX ring too, maybe with some performance
> impact when it gets really low.  On every rxeof it tries to refill
> the ring, so when resources become available again it'd balance out.
> NICs with multiple receive queues/rings make this problem even more
> acute.
>
> A theoretical fix would be to dedicate an entire super page of 1GB
> or so exclusively to the jumbo frame UMA zone as backing memory.  That
> memory is gone for all other uses though, even if not actually used.
> Allocating the superpage and determining its size would have to be
> done manually by setting loader variables.  I don't see a reasonable
> way to do this with autotuning because it requires advance knowledge
> of the usage patterns.
>
> IMHO the right fix is to strongly discourage use of jumbo clusters
> larger than PAGE_SIZE when the hardware is capable of splitting the
> frame into multiple clusters.  The allocation constraint then is only
> available memory and no longer contiguous pages.  Also the waste
> factor for small frames is much lower.  The performance impact is
> minimal to non-existent.  In addition drivers shouldn't break down
> when the RX ring can't be filled to the max.
>
> I recently got yelled at for suggesting to remove jumbo > PAGE_SIZE.
> However your case proves that such jumbo frames are indeed their own
> can of worms and should really only and exclusively be used for NICs
> that have to do jumbo *and* are incapable of RX scatter DMA.
>
>
I am not strongly opposed to trying the 4k mbuf pool for all larger sizes.
Garrett, maybe you could try that on your system and see if it helps you?
I could envision making this a tunable at some point, perhaps.
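
Using the 4k pool for all larger sizes amounts to capping the RX buffer
size at one page and letting bigger frames arrive as a chain. A rough,
stand-alone sketch of that choice follows. It is not the ixgbe code:
pick_rx_cluster_size(), buffers_per_frame() and the cap_at_page flag are
hypothetical names, and the SK_* constants only mirror the usual values of
MCLBYTES, MJUMPAGESIZE (assuming a 4k PAGE_SIZE) and MJUM9BYTES.

```c
#include <assert.h>
#include <stddef.h>

/* Values mirroring FreeBSD's cluster sizes, redefined here so the
 * sketch compiles anywhere (4096 assumes a 4k PAGE_SIZE). */
enum {
	SK_MCLBYTES     = 2048,	/* standard cluster */
	SK_MJUMPAGESIZE = 4096,	/* page-sized jumbo cluster */
	SK_MJUM9BYTES   = 9216	/* 9k jumbo cluster */
};

/*
 * Hypothetical helper: choose the RX buffer size for a given maximum
 * frame size.  With cap_at_page set, nothing larger than a page is
 * used and bigger frames are expected to span several buffers.
 * Frames above 9k are not handled separately in this sketch.
 */
static size_t
pick_rx_cluster_size(size_t max_frame, int cap_at_page)
{
	if (max_frame <= SK_MCLBYTES)
		return (SK_MCLBYTES);
	if (max_frame <= SK_MJUMPAGESIZE || cap_at_page)
		return (SK_MJUMPAGESIZE);
	return (SK_MJUM9BYTES);
}

/* Number of buffers one frame occupies when received as a chain. */
static size_t
buffers_per_frame(size_t frame_len, size_t buf_size)
{
	assert(buf_size > 0);
	return ((frame_len + buf_size - 1) / buf_size);
}
```

With the cap enabled, a 9018-byte frame would use three page-sized
buffers instead of one 9k cluster, so no allocation needs more than one
physically contiguous page.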

Thanks for the input Andre.
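
The point quoted above, that a driver could keep running with a partially
filled RX ring and top it up later, can be sketched as below. This is only
an illustration under assumed names: struct rx_ring, rx_ring_refill() and
budget_alloc() are hypothetical and do not correspond to any real driver
code, and the ring is kept tiny on purpose.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define RING_SLOTS 8	/* tiny ring for illustration only */

struct rx_ring {
	void	*buf[RING_SLOTS];	/* NULL means the slot is empty */
	size_t	 filled;		/* number of non-NULL slots */
};

/* Hypothetical allocator hook so the failure path can be exercised. */
typedef void *(*buf_alloc_fn)(size_t);

/* Demo allocator that succeeds only while a budget lasts. */
static size_t alloc_budget;

static void *
budget_alloc(size_t n)
{
	if (alloc_budget == 0)
		return (NULL);
	alloc_budget--;
	return (malloc(n));
}

/*
 * Try to fill every empty slot, but stop quietly on the first
 * allocation failure instead of treating it as fatal.  Returns the
 * number of slots filled by this call; the caller simply calls again
 * later (for example on the next receive completion).
 */
static size_t
rx_ring_refill(struct rx_ring *r, buf_alloc_fn alloc, size_t buf_size)
{
	size_t added = 0;

	assert(r != NULL && alloc != NULL);
	for (size_t i = 0; i < RING_SLOTS; i++) {
		if (r->buf[i] != NULL)
			continue;
		r->buf[i] = alloc(buf_size);
		if (r->buf[i] == NULL)
			break;	/* keep running with a partial ring */
		r->filled++;
		added++;
	}
	return (added);
}
```

A first call with a budget of five buffers leaves the ring five-eighths
full; a later call, once memory is available again, fills the remaining
three slots.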

Jack

From owner-freebsd-net@FreeBSD.ORG  Fri Mar  8 08:39:47 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id 727C9D61;
 Fri,  8 Mar 2013 08:39:47 +0000 (UTC)
 (envelope-from pyunyh@gmail.com)
Received: from mail-pa0-f54.google.com (mail-pa0-f54.google.com
 [209.85.220.54]) by mx1.freebsd.org (Postfix) with ESMTP id 437FB13D;
 Fri,  8 Mar 2013 08:39:47 +0000 (UTC)
Received: by mail-pa0-f54.google.com with SMTP id fa10so1149766pad.27
 for <multiple recipients>; Fri, 08 Mar 2013 00:39:41 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=x-received:from:date:to:cc:subject:message-id:reply-to:references
 :mime-version:content-type:content-disposition:in-reply-to
 :user-agent; bh=hcaP4Rxws/QUH/cTA6IdYsVtS4rQ7XmdHcWQlat+qzw=;
 b=FzHrlkExhdrQDeLCHCKg/dkyQJEJos++rKokAApLBYrcGE/qcBUvmcdPbIVKqq9X6F
 jpqO+K+dgSHroOwe4jYHRNibtLAr+3alP56zpI3n7emFODykzut6HbKY2aSl0RoSpLhV
 5aMC+ORTU5e7Ejrdos9R/xq4uE6yZTQoAfZ/wyXnwRvyT2tYEVfNE4rPbkrk+SfnDhQ2
 5WAn5fsFSsKVUBx2N88fumm6gRTCXaCKrF2gNCyaTCuZmj5TnGJFitmrSN3V6eHNEbW5
 qS7c8Tpl5BwnLXMwAdcQoI/wYcAOSylLnqpcVJJzImsNiCJSUTjUmeXLrHR5iw0zPtbB
 aufw==
X-Received: by 10.66.9.69 with SMTP id x5mr2713559paa.204.1362731981669;
 Fri, 08 Mar 2013 00:39:41 -0800 (PST)
Received: from pyunyh@gmail.com (lpe4.p59-icn.cdngp.net. [114.111.62.249])
 by mx.google.com with ESMTPS id ip8sm4822866pbc.39.2013.03.08.00.39.37
 (version=TLSv1 cipher=RC4-SHA bits=128/128);
 Fri, 08 Mar 2013 00:39:40 -0800 (PST)
Received: by pyunyh@gmail.com (sSMTP sendmail emulation);
 Fri, 08 Mar 2013 17:39:32 +0900
From: YongHyeon PYUN <pyunyh@gmail.com>
Date: Fri, 8 Mar 2013 17:39:32 +0900
To: Jack Vogel <jfvogel@gmail.com>
Subject: Re: Limits on jumbo mbuf cluster allocation
Message-ID: <20130308083932.GB1442@michelle.cdnetworks.com>
References: <20793.36593.774795.720959@hergotha.csail.mit.edu>
 <20130308075458.GA1442@michelle.cdnetworks.com>
 <CAFOYbckHDeuwmcPZzhewqrAju3GZ8er6nnTVgkNeVhvH4k=ydQ@mail.gmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <CAFOYbckHDeuwmcPZzhewqrAju3GZ8er6nnTVgkNeVhvH4k=ydQ@mail.gmail.com>
User-Agent: Mutt/1.4.2.3i
Cc: jfv@freebsd.org, freebsd-net@freebsd.org,
 Garrett Wollman <wollman@freebsd.org>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
Reply-To: pyunyh@gmail.com
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 08 Mar 2013 08:39:47 -0000

On Fri, Mar 08, 2013 at 12:27:37AM -0800, Jack Vogel wrote:
> On Thu, Mar 7, 2013 at 11:54 PM, YongHyeon PYUN <pyunyh@gmail.com> wrote:
> 
> > On Fri, Mar 08, 2013 at 02:10:41AM -0500, Garrett Wollman wrote:
> > > I have a machine (actually six of them) with an Intel dual-10G NIC on
> > > the motherboard.  Two of them (so far) are connected to a network
> > > using jumbo frames, with an MTU a little under 9k, so the ixgbe driver
> > > allocates 32,000 9k clusters for its receive rings.  I have noticed,
> > > on the machine that is an active NFS server, that it can get into a
> > > state where allocating more 9k clusters fails (as reflected in the
> > > mbuf failure counters) at a utilization far lower than the configured
> > > limits -- in fact, quite close to the number allocated by the driver
> > > for its rx ring.  Eventually, network traffic grinds completely to a
> > > halt, and if one of the interfaces is administratively downed, it
> > > cannot be brought back up again.  There's generally plenty of physical
> > > memory free (at least two or three GB).
> > >
> > > There are no console messages generated to indicate what is going on,
> > > and overall UMA usage doesn't look extreme.  I'm guessing that this is
> > > a result of kernel memory fragmentation, although I'm a little bit
> > > unclear as to how this actually comes about.  I am assuming that this
> > > hardware has only limited scatter-gather capability and can't receive
> > > a single packet into multiple buffers of a smaller size, which would
> > > reduce the requirement for two-and-a-quarter consecutive pages of KVA
> > > for each packet.  In actual usage, most of our clients aren't on a
> > > jumbo network, so most of the time, all the packets will fit into a
> > > normal 2k cluster, and we've never observed this issue when the
> > > *server* is on a non-jumbo network.
> > >
> >
> > AFAIK all Intel controllers generate jumbo frame by concatenating
> > multiple mbufs on RX side so there is no physically contiguous 9KB
> > allocation. I vaguely guess there could be mbuf leakage when jumbo
> > frame is enabled. I would check how driver handles mbuf shortage or
> > frame errors while mbuf concatenation for jumbo frame is in
> > progress.
> >
> 
> No, this is not true, if using a 9K jumbo it will actually use the larger
> mbuf pool, the code has been this way for a little while now.

Ah, thanks for correcting me. If the H/W is still able to support old
style chaining like em(4), wouldn't it be better to use that rather
than allocating a 9KB buffer? Allocating a 9KB buffer to handle a
pure TCP ACK segment looks inefficient.
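
To put rough numbers on that inefficiency, here is a small stand-alone
calculation. It is only a sketch: rx_waste() and pages_per_buffer() are
hypothetical helper names, the 40-byte figure is the ACK size mentioned
elsewhere in this thread, and a 4096-byte page is assumed.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Bytes of buffer space left unused when frame_len bytes are received
 * into a chain of fixed-size buffers of buf_size bytes each.
 */
static size_t
rx_waste(size_t frame_len, size_t buf_size)
{
	size_t nbufs;

	assert(frame_len > 0 && buf_size > 0);
	nbufs = (frame_len + buf_size - 1) / buf_size;
	return (nbufs * buf_size - frame_len);
}

/*
 * Pages covered by one buffer.  Anything above 1 means the buffer
 * needs that many contiguous pages.
 */
static size_t
pages_per_buffer(size_t buf_size, size_t page_size)
{
	assert(page_size > 0);
	return ((buf_size + page_size - 1) / page_size);
}
```

For a 40-byte ACK this gives 9176 unused bytes in a 9216-byte buffer
versus 2008 in a 2048-byte one, and a 9216-byte buffer covers three
4096-byte pages while a 2048-byte buffer fits in one.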

> 
> Jack
> 
> 
> >
> > > Does anyone have suggestions for dealing with this issue?  Will
> > > increasing the amount of KVA (to, say, twice physical memory) help
> > > things?  It seems to me like a bug that these large packets don't have
> > > their own submap to ensure that allocation is always possible when
> > > sufficient physical pages are available.
> > >
> > > -GAWollman
> >

From owner-freebsd-net@FreeBSD.ORG  Fri Mar  8 08:39:54 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id C902BDDF;
 Fri,  8 Mar 2013 08:39:54 +0000 (UTC)
 (envelope-from ermal.luci@gmail.com)
Received: from mail-qc0-x22b.google.com (mail-qc0-x22b.google.com
 [IPv6:2607:f8b0:400d:c01::22b])
 by mx1.freebsd.org (Postfix) with ESMTP id 6ABFD13E;
 Fri,  8 Mar 2013 08:39:54 +0000 (UTC)
Received: by mail-qc0-f171.google.com with SMTP id d1so475877qca.16
 for <multiple recipients>; Fri, 08 Mar 2013 00:39:53 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:x-received:sender:in-reply-to:references:date
 :x-google-sender-auth:message-id:subject:from:to:cc:content-type;
 bh=HWXKvWPnsDTPeX/WxELjL+Qn397KGxjDDvzqQ8wKgZs=;
 b=0LMkfwfWUSj/zxxqFlidIPaa4xajS2LOhDx01hf8VTYYGVv5/dgk8VLgk24AEqhQ2b
 r1BEzdjMyY0TQD13xaVirUqNbxZHEvGBvWdBBqEMbh6RanTVEkaFNomNRSvb8Pyh5+cY
 KoyfS1UVXoInWvcWhQ/9GzflVzvdciv1Cw2CToThsSPTixuBsMVJRD0K+LlPSwdviSVz
 mEPDpKNqzaAnP9hFL9GLuAq7h19TLsrOMYih2CJl54uoR+nGmaIzOIMDiS58eVp0p4tO
 GujewDFRBKODKp+YqYn+IhqU7GSjnvTCgQJZqGcn3IGhDuIt0M9sdrrvK4WtprF7/Jmj
 oEMg==
MIME-Version: 1.0
X-Received: by 10.229.69.24 with SMTP id x24mr450213qci.16.1362731993893; Fri,
 08 Mar 2013 00:39:53 -0800 (PST)
Sender: ermal.luci@gmail.com
Received: by 10.49.27.197 with HTTP; Fri, 8 Mar 2013 00:39:53 -0800 (PST)
In-Reply-To: <51389B4B.1060003@freebsd.org>
References: <513834E4.7050203@FreeBSD.org> <51384443.5070209@freebsd.org>
 <51387D4A.9030408@FreeBSD.org> <51388046.7040408@freebsd.org>
 <CAPBZQG3Of173MoyB-sPy=9RoivtKMRA7LZ0DYyEOSrsyk9_10A@mail.gmail.com>
 <51389B4B.1060003@freebsd.org>
Date: Fri, 8 Mar 2013 09:39:53 +0100
X-Google-Sender-Auth: 9vm07c8oiJGPEjF1irXoPF-BOpE
Message-ID: <CAPBZQG3BgicORvko4_b_5dVafqansQePGZL1nDPmMQbg8VDZrg@mail.gmail.com>
Subject: Re: [patch] interface routes
From: =?ISO-8859-1?Q?Ermal_Lu=E7i?= <eri@freebsd.org>
To: Andre Oppermann <andre@freebsd.org>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
X-Content-Filtered-By: Mailman/MimeDel 2.1.14
Cc: "Alexander V. Chernikov" <melifaro@freebsd.org>, net@freebsd.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 08 Mar 2013 08:39:54 -0000

On Thu, Mar 7, 2013 at 2:51 PM, Andre Oppermann <andre@freebsd.org> wrote:

> On 07.03.2013 14:38, Ermal Luçi wrote:
>
>> On Thu, Mar 7, 2013 at 12:55 PM, Andre Oppermann <andre@freebsd.org> wrote:
>>
>>     On 07.03.2013 12:43, Alexander V. Chernikov wrote:
>>
>>         On 07.03.2013 11:39, Andre Oppermann wrote:
>>
>>             On 07.03.2013 07:34, Alexander V. Chernikov wrote:
>>
>>                 Hello list!
>>
>>                 There is a known long-lived issue with interface routes
>>                 addition/deletion:
>>
>>                 ifconfig iface inet 1.2.3.4/24 can fail if the given
>>                 prefix is already in the kernel route table (for
>>                 example, advertised by an IGP like OSPF).
>>
>>                 An interface route can be deleted via route(8) or any
>>                 route socket user (sometimes this happens with popular
>>                 open source daemons like bird/quagga).
>>
>>                 The problem is reported at least in kern/106722 and
>>                 kern/155772.
>>
>>
>>             Your patch is a welcome addition.
>>
>>                 This can be fixed the following way:
>>                 The immutable route flag (RTM_PINNED, added in 19995 with
>>                 a 'for future use' comment) is utilised to mark a route
>>                 'immutable'.
>>                 rtrequest1_fib refuses to delete routes with the given
>>                 flag unless RTM_PINNED is set in rti_flags.
>>
>>
>>             How do the routing daemons react to being unable to
>>             change/delete such a route?
>>
>>         Routing daemons have long lived with the fact that their route
>>         socket cmds can fail (and there is the route(8) utility, which
>>         can do anything), so typically bird/quagga yells something like
>>         'bird: KRT: Error sending route 11.0.0.0/24 to kernel: File exists'
>>
>>         and marks the given route as not installed in its internal RIB.
>>         Additionally, the daemon will probably retry inserting such
>>         routes on every periodic KRT rescan (tens of minutes).
>>
>>
>>
>> Isn't it better to teach the routing code about metrics?
>> Routing daemons cope better this way and they can handle this.
>> So the policy of this behaviour can be controlled by the administrator
>> rather than by code!
>> With metrics you can add routes with a bigger metric for interfaces and
>> a lower one from routing daemons.
>> This could possibly also mitigate problems with interfaces configured
>> with the same subnet.
>>
>
> Generally I agree with you that this would be the ideal outcome.
> However we're still quite a bit away from reaching that goal.
> To make this really work we have to make mpath plus metrics a first
> class citizen in the routing code and also update the routing
> daemons' kernel interfaces to know about this.  I hope we get there
> in the not too distant future.
>
> As a first step I think it is important that Alexander's patch goes
> in to fix a long standing and very annoying problem with the code
> we have.  Also the link down route withdraw should be added asap.
> Then we can take the next steps towards the ultimate goal you describe.
>
> I hope you do not object to Alexander's patch?


No objection, just trying to put the focus where it needs to be.

Yeah, it's good to have options there. As always, though, the interface
route should not be scrubbed on an interface event, since sockets bound
to that interface will behave strangely.
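
The rule from the patch quoted above (a pinned interface route can only be
deleted by a request that itself carries the pinned flag) can be sketched
in a few lines. This is a user-space illustration only: struct sk_route,
SK_FLAG_PINNED and sk_route_delete() are hypothetical names rather than the
kernel's, and the EPERM return value is an assumption picked for the
example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define SK_FLAG_PINNED 0x1	/* hypothetical stand-in for the pinned flag */

struct sk_route {
	int flags;	/* flags stored on the installed route */
	int present;	/* 1 while the route is installed */
};

/*
 * Delete a route unless it is pinned and the request does not carry
 * the pinned flag as well.  Returns 0 on success or an errno value.
 */
static int
sk_route_delete(struct sk_route *rt, int req_flags)
{
	assert(rt != NULL);
	if (!rt->present)
		return (ESRCH);
	if ((rt->flags & SK_FLAG_PINNED) && !(req_flags & SK_FLAG_PINNED))
		return (EPERM);	/* illustrative error code only */
	rt->present = 0;
	return (0);
}
```

In this model a routing daemon's plain delete request fails and leaves
the interface route in place, while the interface code, passing the
flag, can still remove it.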


>
>
> --
> Andre
>
>


-- 
Ermal

From owner-freebsd-net@FreeBSD.ORG  Fri Mar  8 09:00:23 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id D869C1FF
 for <freebsd-net@freebsd.org>; Fri,  8 Mar 2013 09:00:23 +0000 (UTC)
 (envelope-from vpenkoff@gmail.com)
Received: from mail-la0-x22f.google.com (mail-la0-x22f.google.com
 [IPv6:2a00:1450:4010:c03::22f])
 by mx1.freebsd.org (Postfix) with ESMTP id 44D1A201
 for <freebsd-net@freebsd.org>; Fri,  8 Mar 2013 09:00:23 +0000 (UTC)
Received: by mail-la0-f47.google.com with SMTP id fj20so1449973lab.34
 for <freebsd-net@freebsd.org>; Fri, 08 Mar 2013 01:00:22 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:x-received:date:message-id:subject:from:to
 :content-type; bh=+bXbQQ2fs383BOb+yIayLEKGoTbSeLuz7zSRzDzVAE8=;
 b=MTBCh2kszzfrSRh1GhKQcZ628lhaakDYWk0g8KRnRirGsDcjI4lJcqq24/p045BenL
 CqWdwYtR7qVk/GOGB1U9ePEFfay/ZAxluobSobXg5O1Z42rf5zRSUMzNvloEfZAA5wXk
 XgOBpcQYillkbJCX59xJ3wZJyw9epdtE6H3OKmhTlfREDOvIfmZaQiiYx65IBVJ4mtuF
 xDQadyyDcxwQKxPr4EDLmsPHxFHlUI0wjTk2CuFrNB54AyhSK5JtecPIdfc4d0faaaD3
 eGEPDIW8UAFBiCP1Ax2itTsYnqrzSsccMcSmrCtTqR0bbRtzbuQ54sy7en7nJu0on5Rf
 Ckhw==
MIME-Version: 1.0
X-Received: by 10.112.16.199 with SMTP id i7mr757383lbd.65.1362733222135; Fri,
 08 Mar 2013 01:00:22 -0800 (PST)
Received: by 10.112.18.43 with HTTP; Fri, 8 Mar 2013 01:00:22 -0800 (PST)
Date: Fri, 8 Mar 2013 11:00:22 +0200
Message-ID: <CAE-a5gbis9RaNy6R3rOtGnvF_a0goy392O-_kmvLK2LqD_Bv+Q@mail.gmail.com>
Subject: BPF data representation
From: Viktor Penkoff <vpenkoff@gmail.com>
To: freebsd-net@freebsd.org
Content-Type: text/plain; charset=ISO-8859-1
X-Content-Filtered-By: Mailman/MimeDel 2.1.14
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 08 Mar 2013 09:00:23 -0000

Hi guys. I'm digging into some BPF stuff and I can't figure out why there
are 3 types of data representation: words, halfwords and bytes. How can I
know which one is the best to use in a given place? In a basic example,
e.g. for packet capturing, following BPF's manual, I use a halfword
representation for the ETHERTYPE in the Ethernet header, but a word
representation for an IP address.
Let's say we have some read instructions:

BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 12),
BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, ETHERTYPE_IP, X, Y),
....
BPF_STMT(BPF_LD+BPF_W+BPF_ABS, 26),
BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 0xABABABAB, X, Y)

Can someone explain?
Thanks!
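
One way to look at the question above: the load size follows the width of
the field being read. The EtherType is a 2-byte field at offset 12 of an
Ethernet header, so it is loaded as a halfword; an IPv4 source address is
a 4-byte field at offset 26 (14 bytes of Ethernet header plus 12 into the
IP header), so it is loaded as a word; a 1-byte field such as the IP
protocol at offset 23 would be loaded as a byte. Multi-byte loads are
interpreted in network byte order. The sketch below emulates those three
loads in plain C. It does not use <net/bpf.h>; ld_b(), ld_h(), ld_w() and
match_ipv4_src() are hypothetical helpers, and untagged Ethernet framing
(no VLAN tag) is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Emulates BPF_LD+BPF_B+BPF_ABS: one byte at offset k. */
static uint32_t
ld_b(const uint8_t *p, size_t k)
{
	return (p[k]);
}

/* Emulates BPF_LD+BPF_H+BPF_ABS: two bytes, network byte order. */
static uint32_t
ld_h(const uint8_t *p, size_t k)
{
	return (((uint32_t)p[k] << 8) | p[k + 1]);
}

/* Emulates BPF_LD+BPF_W+BPF_ABS: four bytes, network byte order. */
static uint32_t
ld_w(const uint8_t *p, size_t k)
{
	return (((uint32_t)p[k] << 24) | ((uint32_t)p[k + 1] << 16) |
	    ((uint32_t)p[k + 2] << 8) | p[k + 3]);
}

/*
 * Mirrors the two tests in the filter fragment: EtherType must be
 * IPv4 (0x0800) and the IPv4 source address must equal src.
 * Packets too short for the loads are rejected.
 */
static int
match_ipv4_src(const uint8_t *pkt, size_t len, uint32_t src)
{
	if (len < 30)
		return (0);
	if (ld_h(pkt, 12) != 0x0800)
		return (0);
	return (ld_w(pkt, 26) == src);
}
```

Loading the EtherType as a word would pull in two bytes of the following
header as well, and loading the address as a halfword would see only half
of it, which is why the size has to match the field.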

From owner-freebsd-net@FreeBSD.ORG  Fri Mar  8 09:02:21 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id CC3F22A9
 for <freebsd-net@freebsd.org>; Fri,  8 Mar 2013 09:02:21 +0000 (UTC)
 (envelope-from vpenkoff@gmail.com)
Received: from mail-la0-x229.google.com (mail-la0-x229.google.com
 [IPv6:2a00:1450:4010:c03::229])
 by mx1.freebsd.org (Postfix) with ESMTP id 54CAF21A
 for <freebsd-net@freebsd.org>; Fri,  8 Mar 2013 09:02:21 +0000 (UTC)
Received: by mail-la0-f41.google.com with SMTP id fo12so1453281lab.28
 for <freebsd-net@freebsd.org>; Fri, 08 Mar 2013 01:02:20 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:x-received:in-reply-to:references:date:message-id
 :subject:from:to:content-type;
 bh=2qgRNo+YYSnep3Ug6FJjMu+/oIAqofY4vYaeBrNNTKY=;
 b=ZgwpewYvPXtm6MNt9Ssu1ohJ3oXVLgs2mC/Yk+DcAW5RECYcqo0QzJJR0/lWfcG1Qk
 tPyLeqCNnx+0WIrcjCHdyiqZxkC5oQkbbXUuOqwztZsdr8zd5VtMV0nAzSio7pNeo2UK
 SoFzsNSFuUJRtdzvcS0WmgOAMk4ejtMtNTawX65qCxPMMQk+pLHgZ8sUWaGCTdp6fSEO
 fejWb74VFIqjppfRGGa8CHfFUcwGPojHPcGBUOkOS6AkO1NyV4pCm84n6IpdMscvmCIR
 KKjUvv//Wng5G+eqK2SLNh34t67YUC3vqLB7lvyahXAhNEIK9bblz9PETHLmwPy2fYIf
 mOMg==
MIME-Version: 1.0
X-Received: by 10.112.103.168 with SMTP id fx8mr778095lbb.32.1362733340257;
 Fri, 08 Mar 2013 01:02:20 -0800 (PST)
Received: by 10.112.18.43 with HTTP; Fri, 8 Mar 2013 01:02:20 -0800 (PST)
In-Reply-To: <CAE-a5gbis9RaNy6R3rOtGnvF_a0goy392O-_kmvLK2LqD_Bv+Q@mail.gmail.com>
References: <CAE-a5gbis9RaNy6R3rOtGnvF_a0goy392O-_kmvLK2LqD_Bv+Q@mail.gmail.com>
Date: Fri, 8 Mar 2013 11:02:20 +0200
Message-ID: <CAE-a5gZrmW174+cEJ2FTEK9e5vo2m22NDVdJMyN19TUTHPwHKw@mail.gmail.com>
Subject: BPF data representation
From: Viktor Penkoff <vpenkoff@gmail.com>
To: freebsd-net@freebsd.org
Content-Type: text/plain; charset=ISO-8859-1
X-Content-Filtered-By: Mailman/MimeDel 2.1.14
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 08 Mar 2013 09:02:21 -0000

Hi guys. I'm digging into some BPF stuff and I can't figure out why there are 3
types of data representation: words, halfwords and bytes. I mean, how can I
know which one is the right one to use in a given place? In some basic examples,
e.g. for packet capturing, following the BPF manual, I use a halfword load for
the ETHERTYPE in the Ethernet header, but a word load for an IP address.
Let's say we have some read instructions:

BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 12),
BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, ETHERTYPE_IP, X, Y),
....
BPF_STMT(BPF_LD+BPF_W+BPF_ABS, 26),
BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 0xABABABAB, X, Y)

Can someone explain?
Thanks!
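
One way to read the two loads above: the size in a BPF_LD instruction is
normally picked to match the width of the header field being tested. The
EtherType is a 2-byte field at offset 12, so it is fetched with BPF_H. An
IPv4 address is 4 bytes (the source address sits at offset 14 + 12 = 26 in
an untagged Ethernet frame), so it is fetched with BPF_W. A 1-byte field
such as the IP protocol at offset 23 would use BPF_B. The sketch below is
not the kernel's filter machine. It is a small userland emulation, with
hypothetical helper names (ld_b, ld_h, ld_w) and a made-up sample frame,
of what the three absolute loads would put into the accumulator. It
assumes multi-byte loads are assembled in network byte order, which is why
the constants in the BPF_JUMP comparisons can be written as plain numbers.

```c
#include <stddef.h>
#include <stdint.h>

/* Made-up sample: untagged Ethernet header followed by an IPv4 header. */
static const uint8_t sample_frame[34] = {
    0x00, 0x11, 0x22, 0x33, 0x44, 0x55,   /*  0: destination MAC */
    0x66, 0x77, 0x88, 0x99, 0xaa, 0xbb,   /*  6: source MAC */
    0x08, 0x00,                           /* 12: EtherType (2 bytes) */
    0x45, 0x00, 0x00, 0x14,               /* 14: version/IHL, TOS, length */
    0x00, 0x00, 0x00, 0x00,               /* 18: id, flags/fragment */
    0x40, 0x06, 0x00, 0x00,               /* 22: TTL, protocol at 23, checksum */
    0xab, 0xab, 0xab, 0xab,               /* 26: source address (4 bytes) */
    0x0a, 0x00, 0x00, 0x01                /* 30: destination address */
};

/* Emulates BPF_LD+BPF_B+BPF_ABS: one byte at offset k, -1 if out of range. */
static int64_t ld_b(const uint8_t *pkt, size_t len, size_t k)
{
    if (k >= len)
        return -1;
    return pkt[k];
}

/* Emulates BPF_LD+BPF_H+BPF_ABS: two bytes at offset k, big-endian. */
static int64_t ld_h(const uint8_t *pkt, size_t len, size_t k)
{
    if (len < 2 || k > len - 2)
        return -1;
    return ((uint32_t)pkt[k] << 8) | pkt[k + 1];
}

/* Emulates BPF_LD+BPF_W+BPF_ABS: four bytes at offset k, big-endian. */
static int64_t ld_w(const uint8_t *pkt, size_t len, size_t k)
{
    if (len < 4 || k > len - 4)
        return -1;
    return ((uint32_t)pkt[k] << 24) | ((uint32_t)pkt[k + 1] << 16) |
           ((uint32_t)pkt[k + 2] << 8) | pkt[k + 3];
}
```

On this sample frame, ld_h(sample_frame, 34, 12) yields 0x0800 and
ld_w(sample_frame, 34, 26) yields 0xABABABAB, matching the two comparisons
in the filter. Using BPF_W at offset 12 would instead pull in the first
two bytes of the IP header along with the EtherType.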

From owner-freebsd-net@FreeBSD.ORG  Fri Mar  8 11:56:21 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id F0364C3B
 for <freebsd-net@freebsd.org>; Fri,  8 Mar 2013 11:56:21 +0000 (UTC)
 (envelope-from milu@dat.pl)
Received: from jab.dat.pl (dat.pl [80.51.155.34])
 by mx1.freebsd.org (Postfix) with ESMTP id A9FB3B6D
 for <freebsd-net@freebsd.org>; Fri,  8 Mar 2013 11:56:20 +0000 (UTC)
Received: from jab.dat.pl (jsrv.dat.pl [127.0.0.1])
 by jab.dat.pl (Postfix) with ESMTP id BDB32F8;
 Fri,  8 Mar 2013 12:56:18 +0100 (CET)
X-Virus-Scanned: amavisd-new at dat.pl
Received: from jab.dat.pl ([127.0.0.1])
 by jab.dat.pl (jab.dat.pl [127.0.0.1]) (amavisd-new, port 10024)
 with LMTP id WZ8NBY1bV2Wy; Fri,  8 Mar 2013 12:56:12 +0100 (CET)
Received: from [10.0.6.80] (unknown [212.69.68.42])
 (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits))
 (No client certificate requested)
 by jab.dat.pl (Postfix) with ESMTPSA id 8A98C4C;
 Fri,  8 Mar 2013 12:56:12 +0100 (CET)
Message-ID: <5139D20F.4050901@dat.pl>
Date: Fri, 08 Mar 2013 12:57:03 +0100
From: Maciej Milewski <milu@dat.pl>
User-Agent: Mozilla/5.0 (X11; Linux x86_64;
 rv:17.0) Gecko/20130221 Thunderbird/17.0.3
MIME-Version: 1.0
To: freebsd-net <fbsdmail@dnswatch.com>
Subject: Re: Implementing IP6 in 8.3
References: <b77c4b60019d745d151be9ba3e5446cc.authenticated@ultimatedns.net>
 <5138AED9.1020801@dat.pl>
 <eaa244ab49a30180aa7c88f45f3b38dc.authenticated@ultimatedns.net>
In-Reply-To: <eaa244ab49a30180aa7c88f45f3b38dc.authenticated@ultimatedns.net>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: freebsd-net <freebsd-net@freebsd.org>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 08 Mar 2013 11:56:22 -0000

On 07.03.2013 17:55, freebsd-net wrote:
> Greetings Maciej Milewski, and thank you for your thoughtful reply.
>> On 06.03.2013 22:02, freebsd-net wrote:
>>> Greetings,
>>>    I'm evaluating an ISP for the sake of building BSD operating systems on hardware
>>> that they use (DSL modems, in this case). When I had my old NEC server, I had a
>>> MIPS environment to develop in. I managed a 28k kernel. In any case, I'm back at
>>> it for use in alot of hardware I have laying around. In my current situation, I'm
>>> using a ZYXEL Q1000Z modem to connect to their service. While it's a relatively
>>> new modem, it doesn't support IP6. It is my hope to replace the OS with one that
>>> does. :)
>> If it doesn't support IPv6 you can always try to use it in Transparent
>> Bridging (RFC1483) mode.
>> <http://qwest.centurylink.com/internethelp/modem-q1000z-setup-bridge.html>
>> You can then put other router/computer that does IPv6 routing just after
>> that modem.
>> <http://qwest.centurylink.com/internethelp/modem-q1000z-setup-bridge.html>
> Thank you for the links. I was aware of that, but requires that every connection
> directly to the modem, send the PPPoE creds to the modem. While it's simple enough
> to connect a router/switch between the modem, and clients, it adds an additional
> hop. I think I'll be better served building a (free)BSD kernel, and drivers for
> the modem -- assuming that because the modem doesn't IP6, it's not possible to
> route IP6 traffic directly, unless through a "tunnel broker".
If you are sure that you can build a kernel for that modem then try
it. In my experience it's rather hard, mainly because today's hardware is
too cheap to have real hardware interfaces (like a DSL modem), so it's all
done in software.
The shortest and fastest way would be to set this modem up as a transparent
bridge and then add your own IPv6-capable router/gateway. The router's
WAN side connects to your ISP through PPPoE, and its LAN/WLAN side
connects to your switch or to your computers directly. It will be an
additional device between you and your ISP, but in many cases that's much
better than an all-in-one box that can't do IPv6. I'd go that way.

> Thanks again, for taking the time to respond.
>
> --Chris

I hope that sheds more light on what you are trying to do.
-- 
Pozdrawiam,
Maciej Milewski


From owner-freebsd-net@FreeBSD.ORG  Fri Mar  8 13:19:21 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id 4D879BD9
 for <freebsd-net@freebsd.org>; Fri,  8 Mar 2013 13:19:21 +0000 (UTC)
 (envelope-from vegeta@tuxpowered.net)
Received: from mail-bk0-x22a.google.com (mail-bk0-x22a.google.com
 [IPv6:2a00:1450:4008:c01::22a])
 by mx1.freebsd.org (Postfix) with ESMTP id D6F22CD
 for <freebsd-net@freebsd.org>; Fri,  8 Mar 2013 13:19:20 +0000 (UTC)
Received: by mail-bk0-f42.google.com with SMTP id jk7so688542bkc.15
 for <freebsd-net@freebsd.org>; Fri, 08 Mar 2013 05:19:19 -0800 (PST)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=google.com; s=20120113;
 h=x-received:from:to:subject:date:user-agent:mime-version:x-uid
 :content-type:content-transfer-encoding:message-id
 :x-gm-message-state;
 bh=VplLr68ahVUntemVm74O8Aql7acd7eYNjzo7VLVGfH4=;
 b=aqS8CbZGI+ay9ovXIfSMdKJ0lq3RfX2RHyf24DJpQbrSb/gBn/e955rRvK5Z69OX5D
 7Q4MdKcRxxP4rwZ1p0rjjrfmLLKkqhu1G+guWaKYhFm3Z5eNIzf6q0fxP1ZsOk1z+87M
 jLr/zGHqp8HV7ZXEk1hFwnfjJtk1p5hABWzXYsJ+TprOTwnmkHfPtsofuyDqQ17oA2Gw
 Dm1V11tLUT5ItXhY/6jJq1Mk4V77bcRrL7EAhVb5JJn6Gyf4GIw0qlJ3coipindq7oMq
 MKgSgrycMZ+DSfXrdzWgX4ENnF2pk1jLB5dMNCxei+MNBewHt/VGfhgv7VP3pMG61GdB
 xICQ==
X-Received: by 10.204.198.3 with SMTP id em3mr821253bkb.96.1362748759383;
 Fri, 08 Mar 2013 05:19:19 -0800 (PST)
Received: from zvezda.localnet ([212.48.107.10])
 by mx.google.com with ESMTPS id z6sm1676495bkv.11.2013.03.08.05.19.18
 (version=TLSv1 cipher=RC4-SHA bits=128/128);
 Fri, 08 Mar 2013 05:19:18 -0800 (PST)
From: Kajetan Staszkiewicz <vegeta@tuxpowered.net>
To: "freebsd-net@freebsd.org" <freebsd-net@freebsd.org>
Subject: [patch] Source entries removing is awfully slow.
Date: Fri, 8 Mar 2013 14:19:17 +0100
User-Agent: KMail/1.13.5 (Linux/3.6.6-vegeta.1; KDE/4.4.5; x86_64; ; )
MIME-Version: 1.0
X-UID: 1998
Content-Type: text/plain;
  charset="us-ascii"
Content-Transfer-Encoding: 7bit
Message-Id: <201303081419.17743.vegeta@tuxpowered.net>
X-Gm-Message-State: ALoCoQlhTnlJoQ5UWXk/k82qQf2EQF2TP65X5JmF+Bc8mNaNOzgeWbO0dwLIlC4JHkbJHQW5yG92
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 08 Mar 2013 13:19:21 -0000

Hello there!

In my environment, where I use FreeBSD machines as loadbalancers, after a server
is detected as dead, the loadbalancer removes the broken server from a table
used in a route-to pf rule and then removes the Source entries pointing clients
to that server, so clients previously assigned to the broken server are
re-loadbalanced to live servers.

Each loadbalancer has around 50k Source and 500k State entries. Under those
conditions, removing the Source entries from anywhere to a dead server with
`pfctl -K 0.0.0.0/0 -K internal.IP.of.server` freezes the machine for a few
seconds (or even up to a minute in another datacenter segment, where different
services are served, causing thousands instead of just a few hundred States to
be matched). Under a DDoS attack, when removing Sources to a server under
attack, the kernel freezes permanently (I gave up after 10 minutes of waiting
and restarted the machine).

A patch fixing the issue can be found here:

http://vegeta.tuxpowered.net/download/link-states-to-src_node.patch
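
The patch is only linked here, not quoted, so the following is just an
illustration of the idea its file name suggests (linking states to their
src_node), not its actual contents. The structures and function names
(struct state, struct src_node, kill_by_scan, kill_by_link, demo_visits)
are hypothetical and are not pf's real ones. The sketch contrasts removing
a Source entry by scanning every State with removing it by walking only
the States linked to that entry.

```c
#include <stddef.h>
#include <stdlib.h>

struct src_node;

struct state {
    struct src_node *src;   /* Source entry this state belongs to */
    struct state *next_all; /* next entry in the list of all states */
    struct state *next_src; /* next state of the same Source entry */
    int dead;               /* set once the state has been killed */
};

struct src_node {
    struct state *states;   /* head of the per-Source list of states */
};

/* Put a state on the global list and on its Source entry's own list. */
void state_attach(struct state **all, struct state *st, struct src_node *sn)
{
    st->src = sn;
    st->dead = 0;
    st->next_all = *all;
    *all = st;
    st->next_src = sn->states;
    sn->states = st;
}

/* Full scan: every state is visited, whether or not it matches. */
size_t kill_by_scan(struct state *all, struct src_node *sn)
{
    size_t visited = 0;

    for (struct state *st = all; st != NULL; st = st->next_all) {
        visited++;
        if (st->src == sn) {
            st->dead = 1;
            st->src = NULL;
            st->next_src = NULL;
        }
    }
    sn->states = NULL;
    return visited;
}

/* Linked walk: only the states hanging off this Source entry are visited. */
size_t kill_by_link(struct src_node *sn)
{
    size_t visited = 0;
    struct state *st = sn->states;

    while (st != NULL) {
        struct state *next = st->next_src;

        visited++;
        st->dead = 1;
        st->src = NULL;
        st->next_src = NULL;
        st = next;
    }
    sn->states = NULL;
    return visited;
}

/* Build `total` states, `matching` of them on the target; count visits. */
size_t demo_visits(size_t total, size_t matching, int use_links)
{
    struct state *pool = calloc(total, sizeof(*pool));
    struct state *all = NULL;
    struct src_node target = { NULL }, other = { NULL };
    size_t visited;

    if (pool == NULL)
        return 0;
    for (size_t i = 0; i < total; i++)
        state_attach(&all, &pool[i], i < matching ? &target : &other);
    visited = use_links ? kill_by_link(&target) : kill_by_scan(all, &target);
    free(pool);
    return visited;
}
```

With numbers like the ones above (about 500k States, a few hundred of
them matching), the scan touches all 500k entries per removed Source while
the linked walk touches only the few hundred that actually match.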

-- 
| pozdrawiam / greetings | powered by Debian, CentOS and FreeBSD |
|  Kajetan Staszkiewicz  | jabber,email: vegeta()tuxpowered net  |
|        Vegeta          | www: http://vegeta.tuxpowered.net     |
`------------------------^---------------------------------------'

From owner-freebsd-net@FreeBSD.ORG  Fri Mar  8 15:07:14 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id 66552B7F
 for <freebsd-net@freebsd.org>; Fri,  8 Mar 2013 15:07:14 +0000 (UTC)
 (envelope-from fbsdmail@dnswatch.com)
Received: from udns.ultimateDNS.NET (ultimatedns.net [209.180.214.225])
 by mx1.freebsd.org (Postfix) with ESMTP id D1997964
 for <freebsd-net@freebsd.org>; Fri,  8 Mar 2013 15:07:13 +0000 (UTC)
Received: from udns.ultimateDNS.NET (localhost [127.0.0.1])
 by udns.ultimateDNS.NET (8.14.5/8.14.5) with ESMTP id r28F70Xg023502;
 Fri, 8 Mar 2013 07:07:06 -0800 (PST)
 (envelope-from fbsdmail@dnswatch.com)
Received: (from www@localhost)
 by udns.ultimateDNS.NET (8.14.5/8.14.5/Submit) id r28F6r2Y023489;
 Fri, 8 Mar 2013 07:06:53 -0800 (PST)
 (envelope-from fbsdmail@dnswatch.com)
Received: from udns.ultimatedns.net ([209.180.214.225])
 (UDNSMS authenticated user chrish) by ultimatedns.net with HTTP;
 Fri, 8 Mar 2013 07:06:53 -0800 (PST)
Message-ID: <97d1f60d519956584c4927f72c43e97f.authenticated@ultimatedns.net>
In-Reply-To: <5139D20F.4050901@dat.pl>
References: <b77c4b60019d745d151be9ba3e5446cc.authenticated@ultimatedns.net>
 <5138AED9.1020801@dat.pl>
 <eaa244ab49a30180aa7c88f45f3b38dc.authenticated@ultimatedns.net>
 <5139D20F.4050901@dat.pl>
Date: Fri, 8 Mar 2013 07:06:53 -0800 (PST)
Subject: Re: Implementing IP6 in 8.3
From: "freebsd-net" <fbsdmail@dnswatch.com>
To: "Maciej Milewski" <milu@dat.pl>
User-Agent: UDNSMS/2.0.3
MIME-Version: 1.0
Content-Type: text/plain;charset=utf-8
Content-Transfer-Encoding: 8bit
X-Priority: 3 (Normal)
Importance: Normal
Cc: freebsd-net <freebsd-net@freebsd.org>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 08 Mar 2013 15:07:14 -0000

Greetings Maciej Milewski, and thank you for your reply.
> On 07.03.2013 17:55, freebsd-net wrote:
>> Greetings Maciej Milewski, and thank you for your thoughtful reply.
>>> On 06.03.2013 22:02, freebsd-net wrote:
>>>> Greetings,
>>>>    I'm evaluating an ISP for the sake of building BSD operating systems on hardware
>>>> that they use (DSL modems, in this case). When I had my old NEC server, I had a
>>>> MIPS environment to develop in. I managed a 28k kernel. In any case, I'm back at
>>>> it for use in alot of hardware I have laying around. In my current situation, I'm
>>>> using a ZYXEL Q1000Z modem to connect to their service. While it's a relatively
>>>> new modem, it doesn't support IP6. It is my hope to replace the OS with one that
>>>> does. :)
>>> If it doesn't support IPv6 you can always try to use it in Transparent
>>> Bridging (RFC1483) mode.
>>> <http://qwest.centurylink.com/internethelp/modem-q1000z-setup-bridge.html>
>>> You can then put other router/computer that does IPv6 routing just after
>>> that modem.
>>> <http://qwest.centurylink.com/internethelp/modem-q1000z-setup-bridge.html>
>> Thank you for the links. I was aware of that, but requires that every connection
>> directly to the modem, send the PPPoE creds to the modem. While it's simple enough
>> to connect a router/switch between the modem, and clients, it adds an additional
>> hop. I think I'll be better served building a (free)BSD kernel, and drivers for
>> the modem -- assuming that because the modem doesn't IP6, it's not possible to
>> route IP6 traffic directly, unless through a "tunnel broker".
> If you are sure that you can build kernel for that modem device then try
> it. From my experience it's rather hard. Mainly because today's hw is
> too cheap to have working hw interfaces(like DSL modem) and it's all
> done in software way.
> Shortest and fastest way would be setting this modem as transparent
> bridge. Then put your own router/gateway(which is IPv6 capable). Router
> on WAN side connects through PPPoE to your ISP and LAN/WLAN side
> connects to your switch or you computers directly. It will be additional
> device between you and your ISP but in many cases that's much better
> than having all-in-one(which can't do IPv6). I'd go that way.
>
>> Thanks again, for taking the time to respond.
>>
>> --Chris
>
> I hope that puts more light to what you try to do.
While I agree that inserting a router/switch between the modem and the
clients/servers would be the shortest/easiest solution, in the end I think the
investment in building a (free)bsd kernel and drivers for the modem would
provide the biggest reward. Truth be told, I have accumulated quite a mass of
this type of equipment over the years, and I'd like to take a stab at building a
(free)bsd kernel with associated drivers for them. They're all MIPS based, and
many of them have ~32Mb and ~64Mb of flash space and RAM, so resources aren't
too unreasonable. In the end, the benefit of having something /I/ have control
over makes these devices a great deal more valuable. It also empowers others
who are currently subject to the limitations their ISP imposes on them.

Thank you again for taking the time to respond.

--Chris
> --
> Pozdrawiam,
> Maciej Milewski
>
> _______________________________________________
> freebsd-net@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
>


From owner-freebsd-net@FreeBSD.ORG  Fri Mar  8 15:27:01 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id 24B2EFF6
 for <freebsd-net@freebsd.org>; Fri,  8 Mar 2013 15:27:01 +0000 (UTC)
 (envelope-from milu@dat.pl)
Received: from jab.dat.pl (dat.pl [80.51.155.34])
 by mx1.freebsd.org (Postfix) with ESMTP id D53D4ADE
 for <freebsd-net@freebsd.org>; Fri,  8 Mar 2013 15:27:00 +0000 (UTC)
Received: from jab.dat.pl (jsrv.dat.pl [127.0.0.1])
 by jab.dat.pl (Postfix) with ESMTP id C38D3113;
 Fri,  8 Mar 2013 16:26:52 +0100 (CET)
X-Virus-Scanned: amavisd-new at dat.pl
Received: from jab.dat.pl ([127.0.0.1])
 by jab.dat.pl (jab.dat.pl [127.0.0.1]) (amavisd-new, port 10024)
 with LMTP id uzxTyBqV-abg; Fri,  8 Mar 2013 16:26:48 +0100 (CET)
Received: from [10.0.6.80] (unknown [212.69.68.42])
 (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits))
 (No client certificate requested)
 by jab.dat.pl (Postfix) with ESMTPSA id 37FF090;
 Fri,  8 Mar 2013 16:26:48 +0100 (CET)
Message-ID: <513A036A.9040406@dat.pl>
Date: Fri, 08 Mar 2013 16:27:38 +0100
From: Maciej Milewski <milu@dat.pl>
User-Agent: Mozilla/5.0 (X11; Linux x86_64;
 rv:17.0) Gecko/20130221 Thunderbird/17.0.3
MIME-Version: 1.0
To: freebsd-net <fbsdmail@dnswatch.com>
Subject: Re: Implementing IP6 in 8.3
References: <b77c4b60019d745d151be9ba3e5446cc.authenticated@ultimatedns.net>
 <5138AED9.1020801@dat.pl>
 <eaa244ab49a30180aa7c88f45f3b38dc.authenticated@ultimatedns.net>
 <5139D20F.4050901@dat.pl>
 <97d1f60d519956584c4927f72c43e97f.authenticated@ultimatedns.net>
In-Reply-To: <97d1f60d519956584c4927f72c43e97f.authenticated@ultimatedns.net>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Cc: freebsd-net <freebsd-net@freebsd.org>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 08 Mar 2013 15:27:01 -0000

On 08.03.2013 16:06, freebsd-net wrote:
> While I agree, inserting a router/switch between the modem & the clients/servers
> would be the shortest/easiest solution. In the end, I think the investment in
> building a (free)bsd kernel && drivers for the modem would have/provide the
> biggest reward(s). Truth be told; I have accumulated quite a mass of this type
> of equipment over the years, and I'd like to take a stab at building a
> (free)bsd kernel with associated drivers for them. Their all MIPS based, and
> many of them have ~32Mb && ~64Mb flash space & RAM. So, resources aren't too
> unreasonable. In the end, the benefits of having something /I/ have control over,
> makes these devices a great more valuable. It also empowers others whom are
> currently subject to the limitations their ISP imposes on them.
>
> Thank you again for taking the time to respond.
>
> --Chris
That's all correct as long as all the pieces are there. For example, I've
heard of low-level problems with some of the chipsets. The WiFi chipsets
are the best known for this, and I think I've heard about xDSL chipsets
with similar problems. I wish you all the best with making your own
firmware for this hardware.

-- 
Pozdrawiam,
Maciej Milewski


From owner-freebsd-net@FreeBSD.ORG  Fri Mar  8 15:32:10 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: net@freebsd.org
Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id 5BA302BB;
 Fri,  8 Mar 2013 15:32:10 +0000 (UTC)
 (envelope-from melifaro@FreeBSD.org)
Received: from mail.ipfw.ru (unknown [IPv6:2a01:4f8:120:6141::2])
 by mx1.freebsd.org (Postfix) with ESMTP id F3D81B51;
 Fri,  8 Mar 2013 15:32:09 +0000 (UTC)
Received: from v6.mpls.in ([2a02:978:2::5] helo=ws.su29.net)
 by mail.ipfw.ru with esmtpsa (TLSv1:CAMELLIA256-SHA:256)
 (Exim 4.76 (FreeBSD)) (envelope-from <melifaro@FreeBSD.org>)
 id 1UDzKb-000NOA-KA; Fri, 08 Mar 2013 19:35:37 +0400
Message-ID: <513A0459.7010809@FreeBSD.org>
Date: Fri, 08 Mar 2013 19:31:37 +0400
From: "Alexander V. Chernikov" <melifaro@FreeBSD.org>
User-Agent: Mozilla/5.0 (X11; FreeBSD amd64;
 rv:9.0) Gecko/20120121 Thunderbird/9.0
MIME-Version: 1.0
To: jmg@funkthat.com, Andre Oppermann <andre@freebsd.org>, 
 net@freebsd.org
Subject: Re: [patch] interface routes
References: <513834E4.7050203@FreeBSD.org> <51384443.5070209@freebsd.org>
 <20130307214205.GD50035@funkthat.com>
In-Reply-To: <20130307214205.GD50035@funkthat.com>
Content-Type: multipart/mixed; boundary="------------050906070400020401060909"
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 08 Mar 2013 15:32:10 -0000

This is a multi-part message in MIME format.
--------------050906070400020401060909
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit

On 08.03.2013 01:42, John-Mark Gurney wrote:
> Andre Oppermann wrote this message on Thu, Mar 07, 2013 at 08:39 +0100:
>>> Adding interface address is handled via atomically deleting old prefix and
>>> adding interface one.
>>
>> This brings up a long standing sore point of our routing code
>> which this patch makes more pronounced.  When an interface link
>> state is down I don't want the route to it to persist but to
>> become inactive so another path can be chosen.  This the very
>> point of running a routing daemon.  So on the link-down event
>> the installed interface routes should be removed from the routing
>> table.  The configured addresses though should persist and the
>> interface routes re-installed on a link-up event.  What's your
>> opinion on it?
>>
>> Other than these points I think your code is fine and can go
>> into the tree.
>
> The issue that I see with this is that if you bump your cable, all
> your connections will be dropped, because as soon as they try to send
> something, they'll get a no route to host, and this will break the
> TCP connection...  If we keep the routes when the link goes down,
> the packet will be queued or dropped (depending upon ethernet driver),
> but the TCP connection will not break...
Yes. Older drivers that use if_start with the OS queue should queue the
traffic, while if_transmit-based ones will probably drop it.

So this behavior should be configurable depending on the role of the system.

Patch attached. Other issues (carp, IPv6 or similar) can arise, so
this definitely deserves wider discussion.
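
As a reading aid for the attached diff, the condition it ends up enforcing
can be written as a single predicate. This is a paraphrase of the patch,
not kernel code: the function name and its boolean parameters are made up
for illustration, with knob standing for the new remove_if_routes
variable, admin_up for IFF_UP and link_up for
if_link_state == LINK_STATE_UP. It describes the state after the relevant
event has fired and ignores transitions to LINK_STATE_UNKNOWN, which the
patch does not act on.

```c
#include <stdbool.h>

/*
 * Should the interface routes currently be announced (PRC_IFUP sent)
 * rather than withdrawn (PRC_IFDOWN sent)?
 */
bool iface_routes_wanted(bool knob, bool admin_up, bool link_up)
{
    if (!admin_up)
        return false;   /* if_down() always calls if_unroute() */
    if (!knob)
        return true;    /* knob off: link state is not consulted */
    return link_up;     /* knob on: routes follow the link state */
}
```

Going by the SYSCTL_VNET_INT declaration in the diff, the knob would show
up as net.link.remove_iface_routes_on_change and default to 0, which keeps
the existing behavior.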

>


--------------050906070400020401060909
Content-Type: text/plain;
 name="remove_iface_routes.diff"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment;
 filename="remove_iface_routes.diff"

Index: sys/net/if.c
===================================================================
--- sys/net/if.c	(revision 247623)
+++ sys/net/if.c	(working copy)
@@ -112,6 +112,12 @@ SYSCTL_INT(_net_link, OID_AUTO, log_link_state_cha
 	&log_link_state_change, 0,
 	"log interface link state change events");
 
+static VNET_DEFINE(int, remove_if_routes) = 0;
+#define	V_remove_if_routes	VNET(remove_if_routes)
+SYSCTL_VNET_INT(_net_link, OID_AUTO, remove_iface_routes_on_change, CTLFLAG_RW,
+    &VNET_NAME(remove_if_routes), 0,
+    "Remove iface routes on link state change");
+
 /* Interface description */
 static unsigned int ifdescr_maxlen = 1024;
 SYSCTL_UINT(_net, OID_AUTO, ifdescr_maxlen, CTLFLAG_RW,
@@ -161,10 +167,10 @@ static int	ifconf(u_long, caddr_t);
 static void	if_freemulti(struct ifmultiaddr *);
 static void	if_init(void *);
 static void	if_grow(void);
-static void	if_route(struct ifnet *, int flag, int fam);
+static void	if_route(struct ifnet *, int fam);
 static int	if_setflag(struct ifnet *, int, int, int *, int);
 static int	if_transmit(struct ifnet *ifp, struct mbuf *m);
-static void	if_unroute(struct ifnet *, int flag, int fam);
+static void	if_unroute(struct ifnet *, int fam);
 static void	link_rtrequest(int, struct rtentry *, struct rt_addrinfo *);
 static int	if_rtdel(struct radix_node *, void *);
 static int	ifhwioctl(u_long, struct ifnet *, caddr_t, struct thread *);
@@ -1834,22 +1841,13 @@ link_rtrequest(int cmd, struct rtentry *rt, struct
  * the transition.
  */
 static void
-if_unroute(struct ifnet *ifp, int flag, int fam)
+if_unroute(struct ifnet *ifp, int fam)
 {
 	struct ifaddr *ifa;
 
-	KASSERT(flag == IFF_UP, ("if_unroute: flag != IFF_UP"));
-
-	ifp->if_flags &= ~flag;
-	getmicrotime(&ifp->if_lastchange);
 	TAILQ_FOREACH(ifa, &ifp->if_addrhead, ifa_link)
 		if (fam == PF_UNSPEC || (fam == ifa->ifa_addr->sa_family))
 			pfctlinput(PRC_IFDOWN, ifa->ifa_addr);
-	ifp->if_qflush(ifp);
-
-	if (ifp->if_carp)
-		(*carp_linkstate_p)(ifp);
-	rt_ifmsg(ifp);
 }
 
 /*
@@ -1857,23 +1855,13 @@ static void
  * the transition.
  */
 static void
-if_route(struct ifnet *ifp, int flag, int fam)
+if_route(struct ifnet *ifp, int fam)
 {
 	struct ifaddr *ifa;
 
-	KASSERT(flag == IFF_UP, ("if_route: flag != IFF_UP"));
-
-	ifp->if_flags |= flag;
-	getmicrotime(&ifp->if_lastchange);
 	TAILQ_FOREACH(ifa, &ifp->if_addrhead, ifa_link)
 		if (fam == PF_UNSPEC || (fam == ifa->ifa_addr->sa_family))
 			pfctlinput(PRC_IFUP, ifa->ifa_addr);
-	if (ifp->if_carp)
-		(*carp_linkstate_p)(ifp);
-	rt_ifmsg(ifp);
-#ifdef INET6
-	in6_if_up(ifp);
-#endif
 }
 
 void	(*vlan_link_state_p)(struct ifnet *);	/* XXX: private from if_vlan */
@@ -1909,8 +1897,19 @@ do_link_state_change(void *arg, int pending)
 	int link_state = ifp->if_link_state;
 	CURVNET_SET(ifp->if_vnet);
 
+	/* Remove routes if link goes down */
+	if (V_remove_if_routes != 0 && link_state == LINK_STATE_DOWN &&
+	    (ifp->if_flags & IFF_UP))
+			if_unroute(ifp, PF_UNSPEC);
+
 	/* Notify that the link state has changed. */
 	rt_ifmsg(ifp);
+
+	/* Announce routes IFF Oper & Admin state is UP */
+	if (V_remove_if_routes != 0 && link_state == LINK_STATE_UP &&
+	    (ifp->if_flags & IFF_UP))
+			if_route(ifp, PF_UNSPEC);
+
 	if (ifp->if_vlantrunk != NULL)
 		(*vlan_link_state_p)(ifp);
 
@@ -1945,7 +1944,16 @@ void
 if_down(struct ifnet *ifp)
 {
 
-	if_unroute(ifp, IFF_UP, AF_UNSPEC);
+	ifp->if_flags &= ~IFF_UP;
+	getmicrotime(&ifp->if_lastchange);
+
+	if_unroute(ifp, AF_UNSPEC);
+
+	ifp->if_qflush(ifp);
+
+	if (ifp->if_carp)
+		(*carp_linkstate_p)(ifp);
+	rt_ifmsg(ifp);
 }
 
 /*
@@ -1956,7 +1964,18 @@ void
 if_up(struct ifnet *ifp)
 {
 
-	if_route(ifp, IFF_UP, AF_UNSPEC);
+	ifp->if_flags |= IFF_UP;
+	getmicrotime(&ifp->if_lastchange);
+
+	if (V_remove_if_routes == 0 || ifp->if_link_state == LINK_STATE_UP)
+		if_route(ifp, AF_UNSPEC);
+
+	if (ifp->if_carp)
+		(*carp_linkstate_p)(ifp);
+	rt_ifmsg(ifp);
+#ifdef INET6
+	in6_if_up(ifp);
+#endif
 }
 
 /*

--------------050906070400020401060909--

From owner-freebsd-net@FreeBSD.ORG  Fri Mar  8 17:04:37 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id B1251B97
 for <freebsd-net@freebsd.org>; Fri,  8 Mar 2013 17:04:37 +0000 (UTC)
 (envelope-from wollman@hergotha.csail.mit.edu)
Received: from hergotha.csail.mit.edu
 (wollman-1-pt.tunnel.tserv4.nyc4.ipv6.he.net [IPv6:2001:470:1f06:ccb::2])
 by mx1.freebsd.org (Postfix) with ESMTP id 6B70D377
 for <freebsd-net@freebsd.org>; Fri,  8 Mar 2013 17:04:37 +0000 (UTC)
Received: from hergotha.csail.mit.edu (localhost [127.0.0.1])
 by hergotha.csail.mit.edu (8.14.5/8.14.5) with ESMTP id r28H4a4N003421;
 Fri, 8 Mar 2013 12:04:36 -0500 (EST)
 (envelope-from wollman@hergotha.csail.mit.edu)
Received: (from wollman@localhost)
 by hergotha.csail.mit.edu (8.14.5/8.14.4/Submit) id r28H4aD4003418;
 Fri, 8 Mar 2013 12:04:36 -0500 (EST) (envelope-from wollman)
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Message-ID: <20794.6692.191898.682241@hergotha.csail.mit.edu>
Date: Fri, 8 Mar 2013 12:04:36 -0500
From: Garrett Wollman <wollman@freebsd.org>
To: Jack Vogel <jfvogel@gmail.com>
Subject: Re: Limits on jumbo mbuf cluster allocation
In-Reply-To: <CAFOYbc=x7U-s70KvcZJdrVP6v-On716qMi=HN1P2Kj+d_K972A@mail.gmail.com>
References: <20793.36593.774795.720959@hergotha.csail.mit.edu>
 <51399926.6020201@freebsd.org>
 <CAFOYbc=x7U-s70KvcZJdrVP6v-On716qMi=HN1P2Kj+d_K972A@mail.gmail.com>
X-Mailer: VM 7.17 under 21.4 (patch 22) "Instant Classic" XEmacs Lucid
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.2.7
 (hergotha.csail.mit.edu [127.0.0.1]); Fri, 08 Mar 2013 12:04:36 -0500 (EST)
X-Spam-Status: No, score=-1.0 required=5.0 tests=ALL_TRUSTED
 autolearn=disabled version=3.3.2
X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on
 hergotha.csail.mit.edu
Cc: freebsd-net@freebsd.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 08 Mar 2013 17:04:37 -0000

<<On Fri, 8 Mar 2013 00:31:18 -0800, Jack Vogel <jfvogel@gmail.com> said:

> I am not strongly opposed to trying the 4k mbuf pool for all larger sizes,
> Garrett maybe if you would try that on your system and see if that helps
> you, I could envision making this a tunable at some point perhaps?

If you can provide a patch I can certainly build it in to our kernel
and have it ready the next time the production server crashes.  I'd
like it to be at least a *little* tested by someone else beforehand,
though.
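
For a rough sense of what using the 4k pool for all larger sizes would mean
per received frame, the sketch below counts how many buffers a frame is
split across for a given cluster size, and how many physically contiguous
pages each such buffer needs. The constants are defined locally as
assumptions (4096-byte pages, a 9k cluster of 9 * 1024 bytes) rather than
taken from the kernel headers, and the struct and function names are made
up.

```c
#include <stddef.h>

#define SKETCH_PAGE_SIZE  4096u          /* assumed page size on amd64 */
#define SKETCH_9K_CLUSTER (9u * 1024u)   /* assumed 9k jumbo cluster size */

struct rx_cost {
    size_t buffers;           /* clusters chained to hold one frame */
    size_t contig_pages_each; /* contiguous pages each cluster needs */
};

/* Cost of holding one frame of frame_bytes in clusters of cluster_bytes. */
struct rx_cost cost_with_cluster(size_t frame_bytes, size_t cluster_bytes)
{
    struct rx_cost c = { 0, 0 };

    if (cluster_bytes == 0)
        return c;
    c.buffers = (frame_bytes + cluster_bytes - 1) / cluster_bytes;
    c.contig_pages_each =
        (cluster_bytes + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
    return c;
}
```

For a 9000-byte frame, cost_with_cluster(9000, SKETCH_9K_CLUSTER) gives one
buffer that needs 3 contiguous pages, while
cost_with_cluster(9000, SKETCH_PAGE_SIZE) gives 3 buffers that need one
page each. The second form uses more mbufs per frame but never needs
adjacent free pages.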

-GAWollman


From owner-freebsd-net@FreeBSD.ORG  Fri Mar  8 17:09:57 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id 5D674D4E;
 Fri,  8 Mar 2013 17:09:57 +0000 (UTC)
 (envelope-from wollman@hergotha.csail.mit.edu)
Received: from hergotha.csail.mit.edu
 (wollman-1-pt.tunnel.tserv4.nyc4.ipv6.he.net [IPv6:2001:470:1f06:ccb::2])
 by mx1.freebsd.org (Postfix) with ESMTP id 064B15EA;
 Fri,  8 Mar 2013 17:09:56 +0000 (UTC)
Received: from hergotha.csail.mit.edu (localhost [127.0.0.1])
 by hergotha.csail.mit.edu (8.14.5/8.14.5) with ESMTP id r28H9ugC003460;
 Fri, 8 Mar 2013 12:09:56 -0500 (EST)
 (envelope-from wollman@hergotha.csail.mit.edu)
Received: (from wollman@localhost)
 by hergotha.csail.mit.edu (8.14.5/8.14.4/Submit) id r28H9uOI003457;
 Fri, 8 Mar 2013 12:09:56 -0500 (EST) (envelope-from wollman)
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Message-ID: <20794.7012.265887.99878@hergotha.csail.mit.edu>
Date: Fri, 8 Mar 2013 12:09:56 -0500
From: Garrett Wollman <wollman@bimajority.org>
To: Andre Oppermann <andre@freebsd.org>
Subject: Re: Limits on jumbo mbuf cluster allocation
In-Reply-To: <51399926.6020201@freebsd.org>
References: <20793.36593.774795.720959@hergotha.csail.mit.edu>
 <51399926.6020201@freebsd.org>
X-Mailer: VM 7.17 under 21.4 (patch 22) "Instant Classic" XEmacs Lucid
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.2.7
 (hergotha.csail.mit.edu [127.0.0.1]); Fri, 08 Mar 2013 12:09:56 -0500 (EST)
X-Spam-Status: No, score=-1.0 required=5.0 tests=ALL_TRUSTED
 autolearn=disabled version=3.3.2
X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on
 hergotha.csail.mit.edu
Cc: jfv@freebsd.org, freebsd-net@freebsd.org,
 Garrett Wollman <wollman@freebsd.org>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 08 Mar 2013 17:09:57 -0000

<<On Fri, 08 Mar 2013 08:54:14 +0100, Andre Oppermann <andre@freebsd.org> said:

> [stuff I wrote deleted]
> You have an amd64 kernel running HEAD or 9.x?

Yes, these are 9.1 with some patches to reduce mutex contention on the
NFS server's replay "cache".

> Jumbo pages come directly from the kernel_map which on amd64 is 512GB.
> So KVA shouldn't be a problem.  Your problem indeed appears to come
> from physical memory fragmentation in pmap.

I hadn't realized that they were physically contiguous, but that makes
perfect sense.

> pages.  Also since you're doing NFS serving almost all memory will be
> in use for file caching.

I actually had the ZFS ARC tuned down to 64 GB (out of 96 GB physmem)
when I experienced this, but there are plenty of data structures in
the kernel that aren't subject to this limit and I could easily
imagine them checkerboarding physical memory to the point where no
contiguous three-page allocations were possible.
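
As a toy illustration of that checkerboarding (a user-space sketch, not
kernel code: the page map, its size, and the function names are invented
for the example), a 9k jumbo cluster needs three physically adjacent 4k
pages, and a map can have most of its pages free without containing a
single free run of three:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Count free pages in a page map (true = page in use). */
size_t free_pages(const bool *used, size_t npages)
{
    assert(used != NULL);
    size_t n = 0;
    for (size_t i = 0; i < npages; i++)
        if (!used[i])
            n++;
    return n;
}

/* Length of the longest run of adjacent free pages. */
size_t longest_free_run(const bool *used, size_t npages)
{
    assert(used != NULL);
    size_t best = 0, run = 0;
    for (size_t i = 0; i < npages; i++) {
        run = used[i] ? 0 : run + 1;
        if (run > best)
            best = run;
    }
    return best;
}

/* Pin every third page: the worst case for three-page requests. */
void checkerboard(bool *used, size_t npages)
{
    assert(used != NULL);
    for (size_t i = 0; i < npages; i++)
        used[i] = (i % 3 == 2);
}
```

With a 12-page map, checkerboard() leaves 8 pages free but
longest_free_run() reports 2, so a three-page request cannot be
satisfied even though two thirds of the map is free.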

-GAWollman


From owner-freebsd-net@FreeBSD.ORG  Fri Mar  8 18:06:13 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id 5437A2D0
 for <freebsd-net@freebsd.org>; Fri,  8 Mar 2013 18:06:13 +0000 (UTC)
 (envelope-from andre@freebsd.org)
Received: from c00l3r.networx.ch (c00l3r.networx.ch [62.48.2.2])
 by mx1.freebsd.org (Postfix) with ESMTP id B9162A97
 for <freebsd-net@freebsd.org>; Fri,  8 Mar 2013 18:06:12 +0000 (UTC)
Received: (qmail 93868 invoked from network); 8 Mar 2013 19:19:21 -0000
Received: from c00l3r.networx.ch (HELO [127.0.0.1]) ([62.48.2.2])
 (envelope-sender <andre@freebsd.org>)
 by c00l3r.networx.ch (qmail-ldap-1.03) with SMTP
 for <wollman@freebsd.org>; 8 Mar 2013 19:19:21 -0000
Message-ID: <513A2887.2010408@freebsd.org>
Date: Fri, 08 Mar 2013 19:05:59 +0100
From: Andre Oppermann <andre@freebsd.org>
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64;
 rv:17.0) Gecko/20130107 Thunderbird/17.0.2
MIME-Version: 1.0
To: Garrett Wollman <wollman@freebsd.org>
Subject: Re: Limits on jumbo mbuf cluster allocation
References: <20793.36593.774795.720959@hergotha.csail.mit.edu>
 <51399926.6020201@freebsd.org>
 <CAFOYbc=x7U-s70KvcZJdrVP6v-On716qMi=HN1P2Kj+d_K972A@mail.gmail.com>
 <20794.6692.191898.682241@hergotha.csail.mit.edu>
In-Reply-To: <20794.6692.191898.682241@hergotha.csail.mit.edu>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: freebsd-net@freebsd.org, Jack Vogel <jfvogel@gmail.com>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 08 Mar 2013 18:06:13 -0000

On 08.03.2013 18:04, Garrett Wollman wrote:
> <<On Fri, 8 Mar 2013 00:31:18 -0800, Jack Vogel <jfvogel@gmail.com> said:
>
>> I am not strongly opposed to trying the 4k mbuf pool for all larger sizes,
>> Garrett maybe if you would try that on your system and see if that helps
>> you, I could envision making this a tunable at some point perhaps?
>
> If you can provide a patch I can certainly build it in to our kernel
> and have it ready the next time the production server crashes.  I'd
> like it to be at least a *little* tested by someone else beforehand,
> though.

This should do the trick.

-- 
Andre

Index: dev/ixgbe/ixgbe.c
===================================================================
--- dev/ixgbe/ixgbe.c   (revision 247893)
+++ dev/ixgbe/ixgbe.c   (working copy)
@@ -1120,12 +1120,8 @@
         */
         if (adapter->max_frame_size <= 2048)
                 adapter->rx_mbuf_sz = MCLBYTES;
-       else if (adapter->max_frame_size <= 4096)
+       else
                 adapter->rx_mbuf_sz = MJUMPAGESIZE;
-       else if (adapter->max_frame_size <= 9216)
-               adapter->rx_mbuf_sz = MJUM9BYTES;
-       else
-               adapter->rx_mbuf_sz = MJUM16BYTES;

         /* Prepare receive descriptors and buffers */
         if (ixgbe_setup_receive_structures(adapter)) {

From owner-freebsd-net@FreeBSD.ORG  Fri Mar  8 19:19:30 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id B98161BA
 for <freebsd-net@freebsd.org>; Fri,  8 Mar 2013 19:19:30 +0000 (UTC)
 (envelope-from mxb@alumni.chalmers.se)
Received: from mail-lb0-f171.google.com (mail-lb0-f171.google.com
 [209.85.217.171]) by mx1.freebsd.org (Postfix) with ESMTP id 3B9B37DB
 for <freebsd-net@freebsd.org>; Fri,  8 Mar 2013 19:19:28 +0000 (UTC)
Received: by mail-lb0-f171.google.com with SMTP id gg13so1597740lbb.30
 for <freebsd-net@freebsd.org>; Fri, 08 Mar 2013 11:19:27 -0800 (PST)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=google.com; s=20120113;
 h=x-received:from:content-type:content-transfer-encoding:subject
 :message-id:date:to:mime-version:x-mailer:x-gm-message-state;
 bh=hpP2DfqP0i7Cbc+wFXdogReSXv9+5ysn6PhmsGsBvPg=;
 b=bpYNttKSr/EG12Mj+4/q0wS049grHQyxDJBdIQy9qvoNL39B0ejeQOY1oHVXM0C5PH
 5i71iW3sspPVssrk7YABSqRI4kG3ByztAFslixhiz+lucZrGAUiBBaKlj3l9bxFn0oMA
 PGFBD7RfbnAivLuRiIrbDinVxQTV1p3sdb0su67ClyaD2aS3T1nkNhHgrVMjfhzRBkoW
 jVp2cJm1n24F8LFCiRmiymK5eOCq3vWAM1/prr3tZ3tnfR8FTIMenWHDgDKnzhR6AjYU
 74nDdWVCfA5dEeDZ0QQTVv5huD3vSF8YT5C03iRIadhS2swdQXlP54GHzgPYQ5NKWpUR
 eQ3A==
X-Received: by 10.152.104.80 with SMTP id gc16mr2937538lab.49.1362770367656;
 Fri, 08 Mar 2013 11:19:27 -0800 (PST)
Received: from grey.home.unixconn.com (h-75-17.a183.priv.bahnhof.se.
 [46.59.75.17])
 by mx.google.com with ESMTPS id xw14sm3072505lab.6.2013.03.08.11.19.25
 (version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128);
 Fri, 08 Mar 2013 11:19:26 -0800 (PST)
From: mxb <mxb@alumni.chalmers.se>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: quoted-printable
Subject: 9.1-RELEASE-p1: em0: Could not setup receive structures
Message-Id: <5587F8D1-2242-4579-B992-357C75425A37@alumni.chalmers.se>
Date: Fri, 8 Mar 2013 20:19:24 +0100
To: freebsd-net@freebsd.org
Mime-Version: 1.0 (Mac OS X Mail 6.2 \(1499\))
X-Mailer: Apple Mail (2.1499)
X-Gm-Message-State: ALoCoQnplJjATAZxjQmJP+UGWD6UIBjCMX5FPWaXxwMH9eolz2KL0oPbdrG973j6mgaCcMlVJ4Pq
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 08 Mar 2013 19:19:30 -0000


Hello list@,

I'm mostly active on the OpenBSD side, but I have several machines
running FreeBSD with ZFS.

I upgraded today from 8.2-STABLE to 9.1-RELEASE because em(4) was failing
with the error in the subject line on 8.2-STABLE.
However, the problem has not gone away after the upgrade.

I serve VMWare images from this machine.

Configuration:

lagg0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 9000
	options=4019b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4,VLAN_HWTSO>
	ether 00:25:90:24:70:e8
	inet 172.16.0.243 netmask 0xfffff800 broadcast 172.16.7.255
	nd6 options=1<PERFORMNUD>
	media: Ethernet autoselect
	status: active
	laggproto lacp lagghash l2,l3,l4
	laggport: igb0 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>
	laggport: em0 flags=18<COLLECTING,DISTRIBUTING>
lagg1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 9000
	options=4019b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4,VLAN_HWTSO>
	ether 00:25:90:24:70:e9
	inet 10.11.11.1 netmask 0xffffff00 broadcast 10.11.11.255
	inet 10.11.11.11 netmask 0xffffff00 broadcast 10.11.11.255
	inet 10.11.11.12 netmask 0xffffff00 broadcast 10.11.11.255
	inet 10.11.11.13 netmask 0xffffff00 broadcast 10.11.11.255
	inet 10.11.11.14 netmask 0xffffff00 broadcast 10.11.11.255
	inet 10.11.11.15 netmask 0xffffff00 broadcast 10.11.11.255
	inet 10.11.11.16 netmask 0xffffff00 broadcast 10.11.11.255
	inet 10.11.11.17 netmask 0xffffff00 broadcast 10.11.11.255
	inet 10.11.11.18 netmask 0xffffff00 broadcast 10.11.11.255
	inet 10.11.11.19 netmask 0xffffff00 broadcast 10.11.11.255
	inet 10.11.11.20 netmask 0xffffff00 broadcast 10.11.11.255
	inet 10.11.11.21 netmask 0xffffff00 broadcast 10.11.11.255
	inet 10.11.11.22 netmask 0xffffff00 broadcast 10.11.11.255
	inet 10.11.11.23 netmask 0xffffff00 broadcast 10.11.11.255
	inet 10.11.11.24 netmask 0xffffff00 broadcast 10.11.11.255
	inet 10.11.11.25 netmask 0xffffff00 broadcast 10.11.11.255
	inet 10.11.11.26 netmask 0xffffff00 broadcast 10.11.11.255
	inet 10.11.11.27 netmask 0xffffff00 broadcast 10.11.11.255
	inet 10.11.11.28 netmask 0xffffff00 broadcast 10.11.11.255
	nd6 options=1<PERFORMNUD>
	media: Ethernet autoselect
	status: active
	laggproto lacp lagghash l2,l3,l4
	laggport: igb1 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>
	laggport: em1 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>

current sysctl.conf (not fixed after upgrade):
# Every socket is a file, so increase them
kern.maxfiles=204800
kern.maxfilesperproc=200000
kern.ipc.maxsockets=204800

# Increase max command-line length shown in `ps` (e.g for Tomcat/Java)
# Default is PAGE_SIZE / 16 or 256 on x86
# For more info see: http://www.freebsd.org/cgi/query-pr.cgi?pr=120749
kern.ps_arg_cache_limit=4096

# Security
#net.inet.udp.blackhole=1
#net.inet.tcp.blackhole=2

kern.ipc.maxsockbuf=16777216
kern.ipc.nmbclusters=65535
kern.ipc.somaxconn=32768
#kern.maxfiles=65535
kern.maxvnodes=800000

vfs.zfs.l2arc_noprefetch=0
vfs.zfs.l2arc_write_max=16777216
vfs.zfs.l2arc_write_boost=16777216

net.inet.tcp.sendspace=65535
net.inet.tcp.recvspace=131072
net.inet.tcp.mssdflt=1452
net.inet.tcp.sendbuf_max=16777216
net.inet.tcp.sendbuf_inc=524288
net.inet.tcp.recvbuf_max=16777216
net.inet.tcp.recvbuf_inc=524288
net.inet.udp.recvspace=65535
net.inet.udp.maxdgram=65535
net.local.stream.recvspace=65535
net.local.stream.sendspace=65535
net.inet.tcp.delayed_ack=0


Any clues?

Please mail directly to me or cc to sysop@prisjakt.nu

Regards
Maxim


From owner-freebsd-net@FreeBSD.ORG  Fri Mar  8 20:03:21 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id 8FFBF41A
 for <freebsd-net@freebsd.org>; Fri,  8 Mar 2013 20:03:21 +0000 (UTC)
 (envelope-from jfvogel@gmail.com)
Received: from mail-vc0-f179.google.com (mail-vc0-f179.google.com
 [209.85.220.179]) by mx1.freebsd.org (Postfix) with ESMTP id 537CD718
 for <freebsd-net@freebsd.org>; Fri,  8 Mar 2013 20:03:21 +0000 (UTC)
Received: by mail-vc0-f179.google.com with SMTP id k1so1095174vck.38
 for <freebsd-net@freebsd.org>; Fri, 08 Mar 2013 12:03:20 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:x-received:in-reply-to:references:date:message-id
 :subject:from:to:cc:content-type;
 bh=nedf9ngqEBAa4nY079vCkQ80qvaCJ3Sbsdtri/ES8t8=;
 b=X/arb8emGD9aaULCqa5ckKIQjQEDaVMlRTl6V89g68ot0mrjSxA54kWeIQz30Qvf39
 JoFT6jw3/w6O8Qyl2R6/iQzDZM0Y+t3JUj5++dl5Imdw+zcUnBHtyUfh5BPuz15Rxdd2
 DfcHH2BW6rsX/fd9yJp9lc+yzt6cJs4G8/4j9ECMTxa9wA6keY0OWhuEbdum7ghwq5jC
 0w7tP3631rGSnPU1HRq38V0GgB/5J1OXErK9Z0Q/HUWtAuZucV1ozc5st3tK9fEgSLgm
 W4gEwrEYrNUz5epIfSVCYShxePtrvqVgmxKoh1z6ezESCpT3EcZwdRDmkkLcpenahWee
 WPmA==
MIME-Version: 1.0
X-Received: by 10.52.30.48 with SMTP id p16mr1275596vdh.118.1362773000366;
 Fri, 08 Mar 2013 12:03:20 -0800 (PST)
Received: by 10.220.191.132 with HTTP; Fri, 8 Mar 2013 12:03:20 -0800 (PST)
In-Reply-To: <5587F8D1-2242-4579-B992-357C75425A37@alumni.chalmers.se>
References: <5587F8D1-2242-4579-B992-357C75425A37@alumni.chalmers.se>
Date: Fri, 8 Mar 2013 12:03:20 -0800
Message-ID: <CAFOYbcmr8LzhUt6VndDgvX+W-0r17X3W0xqt7JYgFJSUVUJF-A@mail.gmail.com>
Subject: Re: 9.1-RELEASE-p1: em0: Could not setup receive structures
From: Jack Vogel <jfvogel@gmail.com>
To: mxb <mxb@alumni.chalmers.se>
Content-Type: text/plain; charset=ISO-8859-1
X-Content-Filtered-By: Mailman/MimeDel 2.1.14
Cc: freebsd-net@freebsd.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 08 Mar 2013 20:03:21 -0000

The message occurs because you don't have enough mbufs to set up the RX
ring, so you need to look at nmbclusters. It may be that em is just the
victim, since you have igb interfaces as well from what I see.
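
A rough way to reason about this, as a sketch only: every RX descriptor
needs one cluster before the ring can be set up, so the demand is
ports x queues x descriptors, and whichever interface initializes last
fails if the pool's remaining headroom is smaller than its ring. The
function names and the counts used in the note below are placeholders
invented for the example, not values read from this machine.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Clusters needed to populate every RX descriptor on a set of ports. */
size_t rx_ring_demand(size_t ports, size_t queues_per_port,
    size_t descs_per_queue)
{
    return ports * queues_per_port * descs_per_queue;
}

/* True if the pool can still hand out `demand` clusters. */
bool pool_can_fill(size_t limit, size_t in_use, size_t demand)
{
    assert(in_use <= limit);
    return demand <= limit - in_use;
}
```

For example, two ports with four queues of 1024 descriptors each need
8192 clusters. Against the kern.ipc.nmbclusters=65535 limit from the
posted sysctl.conf, that fits on an idle pool but not once 60000
clusters are already in use; the in-use figure would come from
`netstat -m`.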

Jack


On Fri, Mar 8, 2013 at 11:19 AM, mxb <mxb@alumni.chalmers.se> wrote:

>
> Hello list@,
>
> I'm mostly active on OpenBSD-side, however I have several machines running
> fbsd with ZFS.
>
> I'v recently upgraded(today) from 8.2-stable to 9.1-rel because of em(4)
> with the problem  <subject> on 8.2-stable.
> However, my problem has not disappeared after mentioned upgrade.
>
> I serve VMWare images from this machine.
>
> Configuration:
>
> lagg0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 9000
>
> options=4019b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4,VLAN_HWTSO>
>         ether 00:25:90:24:70:e8
>         inet 172.16.0.243 netmask 0xfffff800 broadcast 172.16.7.255
>         nd6 options=1<PERFORMNUD>
>         media: Ethernet autoselect
>         status: active
>         laggproto lacp lagghash l2,l3,l4
>         laggport: igb0 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>
>         laggport: em0 flags=18<COLLECTING,DISTRIBUTING>
> lagg1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 9000
>
> options=4019b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4,VLAN_HWTSO>
>         ether 00:25:90:24:70:e9
>         inet 10.11.11.1 netmask 0xffffff00 broadcast 10.11.11.255
>         inet 10.11.11.11 netmask 0xffffff00 broadcast 10.11.11.255
>         inet 10.11.11.12 netmask 0xffffff00 broadcast 10.11.11.255
>         inet 10.11.11.13 netmask 0xffffff00 broadcast 10.11.11.255
>         inet 10.11.11.14 netmask 0xffffff00 broadcast 10.11.11.255
>         inet 10.11.11.15 netmask 0xffffff00 broadcast 10.11.11.255
>         inet 10.11.11.16 netmask 0xffffff00 broadcast 10.11.11.255
>         inet 10.11.11.17 netmask 0xffffff00 broadcast 10.11.11.255
>         inet 10.11.11.18 netmask 0xffffff00 broadcast 10.11.11.255
>         inet 10.11.11.19 netmask 0xffffff00 broadcast 10.11.11.255
>         inet 10.11.11.20 netmask 0xffffff00 broadcast 10.11.11.255
>         inet 10.11.11.21 netmask 0xffffff00 broadcast 10.11.11.255
>         inet 10.11.11.22 netmask 0xffffff00 broadcast 10.11.11.255
>         inet 10.11.11.23 netmask 0xffffff00 broadcast 10.11.11.255
>         inet 10.11.11.24 netmask 0xffffff00 broadcast 10.11.11.255
>         inet 10.11.11.25 netmask 0xffffff00 broadcast 10.11.11.255
>         inet 10.11.11.26 netmask 0xffffff00 broadcast 10.11.11.255
>         inet 10.11.11.27 netmask 0xffffff00 broadcast 10.11.11.255
>         inet 10.11.11.28 netmask 0xffffff00 broadcast 10.11.11.255
>         nd6 options=1<PERFORMNUD>
>         media: Ethernet autoselect
>         status: active
>         laggproto lacp lagghash l2,l3,l4
>         laggport: igb1 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>
>         laggport: em1 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>
>
> current sysctl.conf (not fixed after upgrade):
> # Every socket is a file, so increase them
> kern.maxfiles=204800
> kern.maxfilesperproc=200000
> kern.ipc.maxsockets=204800
>
> # Increase max command-line length showed in `ps` (e.g for Tomcat/Java)
> # Default is PAGE_SIZE / 16 or 256 on x86
> # For more info see: http://www.freebsd.org/cgi/query-pr.cgi?pr=120749
> kern.ps_arg_cache_limit=4096
>
> # Security
> #net.inet.udp.blackhole=1
> #net.inet.tcp.blackhole=2
>
> kern.ipc.maxsockbuf=16777216
> kern.ipc.nmbclusters=65535
> kern.ipc.somaxconn=32768
> #kern.maxfiles=65535
> kern.maxvnodes=800000
>
> vfs.zfs.l2arc_noprefetch=0
> vfs.zfs.l2arc_write_max=16777216
> vfs.zfs.l2arc_write_boost=16777216
>
> net.inet.tcp.sendspace=65535
> net.inet.tcp.recvspace=131072
> net.inet.tcp.mssdflt=1452
> net.inet.tcp.sendbuf_max=16777216
> net.inet.tcp.sendbuf_inc=524288
> net.inet.tcp.recvbuf_max=16777216
> net.inet.tcp.recvbuf_inc=524288
> net.inet.udp.recvspace=65535
> net.inet.udp.maxdgram=65535
> net.local.stream.recvspace=65535
> net.local.stream.sendspace=65535
> net.inet.tcp.delayed_ack=0
>
>
> Any clues?
>
> Please mail directly to me or cc to sysop@prisjakt.nu
>
> Regards
> Maxim
>
>

From owner-freebsd-net@FreeBSD.ORG  Fri Mar  8 20:11:50 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id 163E0881;
 Fri,  8 Mar 2013 20:11:50 +0000 (UTC)
 (envelope-from ermal.luci@gmail.com)
Received: from mail-qe0-f43.google.com (mail-qe0-f43.google.com
 [209.85.128.43]) by mx1.freebsd.org (Postfix) with ESMTP id A7D6E82C;
 Fri,  8 Mar 2013 20:11:49 +0000 (UTC)
Received: by mail-qe0-f43.google.com with SMTP id 1so1241902qee.2
 for <multiple recipients>; Fri, 08 Mar 2013 12:11:43 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:x-received:sender:in-reply-to:references:date
 :x-google-sender-auth:message-id:subject:from:to:cc:content-type;
 bh=6dfG5ak3dFdXeVLMvxfuj1OJZUZSzvkRA24n915cEJ0=;
 b=USSvbn+UoCu3EK8sPoJWRMDiTWtDb9brRRGWLHAZW9eVjBHSv7H5kqgidwdDyxmM/i
 p3ggkKyvVcorXyUt+4oE9F41QqWJtJj6wKbA5K6gqINOgK0dhnGFwEYfCQWmvMMSFTts
 R9vzlaMny4m3vBJdZPp7oRmXHgT3hfQvSsctmVj7u2qockJOcnBz96t87yt5sgr1topB
 SbdZNG2nfiUVuBq8eXQXnhSr6tkqPYkBEYnc/UiJK4itXxfm8moFaeRk9d7735p0yGv5
 v3fqzD/7Q33nQsiWveGxiI7znfhqNC92ggkNqg2aAUUgrE26KduAK8DrMik4STf/HDrp
 qGEg==
MIME-Version: 1.0
X-Received: by 10.224.184.130 with SMTP id ck2mr5848224qab.41.1362773503493;
 Fri, 08 Mar 2013 12:11:43 -0800 (PST)
Sender: ermal.luci@gmail.com
Received: by 10.49.27.197 with HTTP; Fri, 8 Mar 2013 12:11:43 -0800 (PST)
In-Reply-To: <201303081419.17743.vegeta@tuxpowered.net>
References: <201303081419.17743.vegeta@tuxpowered.net>
Date: Fri, 8 Mar 2013 21:11:43 +0100
X-Google-Sender-Auth: 9xXpcPwr1C64h_-MLHQWtFTBtYw
Message-ID: <CAPBZQG2bb2xzPB2UoPUDx-ifyBdmjac6b8kV76DTPBUzLCDmJw@mail.gmail.com>
Subject: Re: [patch] Source entries removing is awfully slow.
From: =?ISO-8859-1?Q?Ermal_Lu=E7i?= <eri@freebsd.org>
To: Kajetan Staszkiewicz <vegeta@tuxpowered.net>
Content-Type: text/plain; charset=ISO-8859-1
X-Content-Filtered-By: Mailman/MimeDel 2.1.14
Cc: "freebsd-net@freebsd.org" <freebsd-net@freebsd.org>,
 "freebsd-pf@freebsd.org" <freebsd-pf@freebsd.org>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 08 Mar 2013 20:11:50 -0000

Is this FreeBSD 9.x or HEAD?


On Fri, Mar 8, 2013 at 2:19 PM, Kajetan Staszkiewicz
<vegeta@tuxpowered.net> wrote:

> Hello there!
>
> In my environment, where I use FreeBSD machines as loadbalancers, after a
> server is detected as dead, the loadbalancer removes the broken server
> from a table used in a route-to pf rule and then removes the Source
> entries pointing clients to that server, so clients previously assigned
> to the broken server are re-loadbalanced to alive servers.
>
> Each loadbalancer has around 50k Source and 500k State entries. Under those
> conditions, removing a Source from anywhere to a dead server with
> `pfctl -K 0.0.0.0/0 -K internal.IP.of.server` freezes the machine for a
> few seconds (or even up to a minute in another datacenter segment, where
> different services are served, causing thousands instead of just a few
> hundred States to be matched). Under a DDoS attack, when removing Sources
> to a server under attack, the kernel freezes permanently (I gave up after
> 10 minutes of waiting and restarted the machine).
>
> A patch fixing the issue can be found here:
>
> http://vegeta.tuxpowered.net/download/link-states-to-src_node.patch
>
> --
> | pozdrawiam / greetings | powered by Debian, CentOS and FreeBSD |
> |  Kajetan Staszkiewicz  | jabber,email: vegeta()tuxpowered net  |
> |        Vegeta          | www: http://vegeta.tuxpowered.net     |
> `------------------------^---------------------------------------'


-- 
Ermal

From owner-freebsd-net@FreeBSD.ORG  Fri Mar  8 20:13:29 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id DE245969;
 Fri,  8 Mar 2013 20:13:29 +0000 (UTC)
 (envelope-from jfvogel@gmail.com)
Received: from mail-vb0-x22a.google.com (mail-vb0-x22a.google.com
 [IPv6:2607:f8b0:400c:c02::22a])
 by mx1.freebsd.org (Postfix) with ESMTP id 7BEF4844;
 Fri,  8 Mar 2013 20:13:29 +0000 (UTC)
Received: by mail-vb0-f42.google.com with SMTP id ff1so806973vbb.1
 for <multiple recipients>; Fri, 08 Mar 2013 12:13:29 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:x-received:in-reply-to:references:date:message-id
 :subject:from:to:cc:content-type;
 bh=WSoNqN97tyowqEdqZLWxrDvqQNDpD7DH6Yyp44TzIZo=;
 b=YKpA/eGUhk18+UYso/10hrMYdAzx6ZUj5x7P8mlFQC8Sr7KRgSB/xkMPhmG6EYC/EG
 xa6LbqI3Stls3gC6lboeXurnmx1rlcRFWi2xV/Tc+pk52B9Dzitj6zNYGNYVpRSWLtqb
 Ib7YphqtJdq2KwNPQmOiV6+w5zj2oz9bUR9GRENWmIIuNDnxM6gPAgP8NtB0NKrl4eI8
 MigSZH2mSBPG4V0ni9tVRToLrn5WJ01qutU1woqSgWhRm4wBX1rida4iMMfLLBm1N6Iw
 6RYTk7uUegtmickuNHP2Z8fN/TbP/HcHalfe2BTYE9Le9C9PhtV8M0gDP/PxeEi37TVi
 e01w==
MIME-Version: 1.0
X-Received: by 10.52.19.239 with SMTP id i15mr1292070vde.47.1362773608886;
 Fri, 08 Mar 2013 12:13:28 -0800 (PST)
Received: by 10.220.191.132 with HTTP; Fri, 8 Mar 2013 12:13:28 -0800 (PST)
In-Reply-To: <513A2887.2010408@freebsd.org>
References: <20793.36593.774795.720959@hergotha.csail.mit.edu>
 <51399926.6020201@freebsd.org>
 <CAFOYbc=x7U-s70KvcZJdrVP6v-On716qMi=HN1P2Kj+d_K972A@mail.gmail.com>
 <20794.6692.191898.682241@hergotha.csail.mit.edu>
 <513A2887.2010408@freebsd.org>
Date: Fri, 8 Mar 2013 12:13:28 -0800
Message-ID: <CAFOYbc=7iROKzUwnB0fMR=ix8VFo+ONfG=NX43jeF7jkp74JhQ@mail.gmail.com>
Subject: Re: Limits on jumbo mbuf cluster allocation
From: Jack Vogel <jfvogel@gmail.com>
To: Andre Oppermann <andre@freebsd.org>
Content-Type: text/plain; charset=ISO-8859-1
X-Content-Filtered-By: Mailman/MimeDel 2.1.14
Cc: freebsd-net@freebsd.org, Garrett Wollman <wollman@freebsd.org>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 08 Mar 2013 20:13:29 -0000

Yes, in the past the code was in this form. It should work fine, Garrett;
just make sure the 4K pool is large enough.
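
To put a number on "large enough", here is a small stand-alone sketch of
the size selection before and after Andre's patch. It assumes 4k pages,
so MJUMPAGESIZE is 4096; the EX_ constants only mirror the kernel's
cluster sizes, and the function names are invented for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for the kernel's cluster sizes (4k pages assumed). */
enum {
    EX_MCLBYTES     = 2048,
    EX_MJUMPAGESIZE = 4096,
    EX_MJUM9BYTES   = 9216,
    EX_MJUM16BYTES  = 16384
};

/* Selection as it stands before the patch. */
size_t rx_mbuf_sz_before(size_t max_frame_size)
{
    if (max_frame_size <= 2048)
        return EX_MCLBYTES;
    else if (max_frame_size <= 4096)
        return EX_MJUMPAGESIZE;
    else if (max_frame_size <= 9216)
        return EX_MJUM9BYTES;
    else
        return EX_MJUM16BYTES;
}

/* Selection with the patch: never larger than one page. */
size_t rx_mbuf_sz_after(size_t max_frame_size)
{
    if (max_frame_size <= 2048)
        return EX_MCLBYTES;
    return EX_MJUMPAGESIZE;
}

/* Clusters a single received frame occupies (ceiling division). */
size_t clusters_per_frame(size_t frame_len, size_t cluster_sz)
{
    assert(cluster_sz > 0);
    return (frame_len + cluster_sz - 1) / cluster_sz;
}
```

For a 9018-byte frame the old code picks one 9216-byte cluster, while
the patched code picks 4096-byte clusters and the frame is spread over
three of them, so the page-size pool has to cover roughly three clusters
per full-size jumbo frame in flight.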

I've actually been thinking about making the ring mbuf allocation sparse,
and what type of strategy could be used. Right now I'm thinking of
implementing a tunable threshold, and as long as I'm doing that, the 82599
hardware has an interrupt that can be set up for a low descriptor
condition. This could have some performance benefits: I could decouple the
mbuf refresh even further from rxeof, relying on the interrupt to initiate
the refresh rather than a count as it is now.
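
The threshold idea can be modelled in a few lines. This is a purely
hypothetical sketch, not driver code: the struct, its fields, and both
functions are invented for illustration, and the "event" stands in for
the low descriptor interrupt mentioned above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct rx_ring_model {
    size_t ndesc;       /* total descriptors in the ring */
    size_t avail;       /* descriptors that currently hold a fresh buffer */
    size_t low_thresh;  /* hypothetical tunable threshold */
};

/*
 * Hardware consumes n descriptors. Returns true when the number of
 * available descriptors has dropped below the threshold, i.e. when the
 * low-descriptor event would fire.
 */
bool rx_consume(struct rx_ring_model *r, size_t n)
{
    assert(r != NULL && n <= r->avail);
    r->avail -= n;
    return r->avail < r->low_thresh;
}

/* Refill the ring completely; returns how many buffers were added. */
size_t rx_refresh(struct rx_ring_model *r)
{
    assert(r != NULL && r->avail <= r->ndesc);
    size_t added = r->ndesc - r->avail;
    r->avail = r->ndesc;
    return added;
}
```

With 1024 descriptors and a threshold of 128, consuming 800 does not
trigger a refresh, consuming another 100 does, and the refresh then
replaces 900 buffers in one pass instead of refilling on a fixed count.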

It needs some experimentation/testing first, but I will look into this.

Jack


On Fri, Mar 8, 2013 at 10:05 AM, Andre Oppermann <andre@freebsd.org> wrote:

> On 08.03.2013 18:04, Garrett Wollman wrote:
>
>> <<On Fri, 8 Mar 2013 00:31:18 -0800, Jack Vogel <jfvogel@gmail.com> said:
>>
>>> I am not strongly opposed to trying the 4k mbuf pool for all larger sizes,
>>> Garrett maybe if you would try that on your system and see if that helps
>>> you, I could envision making this a tunable at some point perhaps?
>>
>> If you can provide a patch I can certainly build it in to our kernel
>> and have it ready the next time the production server crashes.  I'd
>> like it to be at least a *little* tested by someone else beforehand,
>> though.
>>
>
> This should do the trick.
>
> --
> Andre
>
> Index: dev/ixgbe/ixgbe.c
> ===================================================================
> --- dev/ixgbe/ixgbe.c   (revision 247893)
> +++ dev/ixgbe/ixgbe.c   (working copy)
> @@ -1120,12 +1120,8 @@
>         */
>         if (adapter->max_frame_size <= 2048)
>                 adapter->rx_mbuf_sz = MCLBYTES;
> -       else if (adapter->max_frame_size <= 4096)
> +       else
>                 adapter->rx_mbuf_sz = MJUMPAGESIZE;
> -       else if (adapter->max_frame_size <= 9216)
> -               adapter->rx_mbuf_sz = MJUM9BYTES;
> -       else
> -               adapter->rx_mbuf_sz = MJUM16BYTES;
>
>         /* Prepare receive descriptors and buffers */
>         if (ixgbe_setup_receive_structures(adapter)) {
>

From owner-freebsd-net@FreeBSD.ORG  Fri Mar  8 20:16:55 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id B1509A25
 for <freebsd-net@freebsd.org>; Fri,  8 Mar 2013 20:16:55 +0000 (UTC)
 (envelope-from mxb@alumni.chalmers.se)
Received: from mail-la0-x230.google.com (mail-la0-x230.google.com
 [IPv6:2a00:1450:4010:c03::230])
 by mx1.freebsd.org (Postfix) with ESMTP id 1073B865
 for <freebsd-net@freebsd.org>; Fri,  8 Mar 2013 20:16:54 +0000 (UTC)
Received: by mail-la0-f48.google.com with SMTP id fq13so2107035lab.35
 for <freebsd-net@freebsd.org>; Fri, 08 Mar 2013 12:16:54 -0800 (PST)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=google.com; s=20120113;
 h=x-received:content-type:mime-version:subject:from:in-reply-to:date
 :cc:content-transfer-encoding:message-id:references:to:x-mailer
 :x-gm-message-state;
 bh=ngCr5DDSFRRX6s9lrYxa2hy/4HXmTC0ze58H1zEkJUg=;
 b=ZAMkI5l1Gg6aKrJV+X7DdiyaXn9sblK/W73zpGyArr4az/tkMUXIOFcztO26n1Sljb
 r4BAXo4bgkUdAOBoYK6OCLWWp3sIGFRbACDI7KBoe45z/QqzvroHylm293rHAmEd2A6p
 kLXNSwRSE3NadcteipSIX8Y5BEtvLPH8Xgxj9+ehU+aHcx31WEjjTI4fKHJmUn4gpOqS
 z7MODRq3W3VolEUbUrP2iq/eHKfPDqFuPYJxOPqJIyOn1DcK5vhVmu75+njSCIMRRmC9
 dZwZ5m8ZYGTDaPEmttHAIcGuPLzFADN0T04ZWtKXkJIQnhgdbalPFnuEG71N5ubBjDv8
 g8Qg==
X-Received: by 10.112.104.103 with SMTP id gd7mr1538285lbb.54.1362773813868;
 Fri, 08 Mar 2013 12:16:53 -0800 (PST)
Received: from grey.home.unixconn.com (h-75-17.a183.priv.bahnhof.se.
 [46.59.75.17])
 by mx.google.com with ESMTPS id q9sm2011142lbz.3.2013.03.08.12.16.51
 (version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128);
 Fri, 08 Mar 2013 12:16:52 -0800 (PST)
Content-Type: text/plain; charset=us-ascii
Mime-Version: 1.0 (Mac OS X Mail 6.2 \(1499\))
Subject: Re: 9.1-RELEASE-p1: em0: Could not setup receive structures
From: mxb <mxb@alumni.chalmers.se>
In-Reply-To: <CAFOYbcmr8LzhUt6VndDgvX+W-0r17X3W0xqt7JYgFJSUVUJF-A@mail.gmail.com>
Date: Fri, 8 Mar 2013 21:16:50 +0100
Content-Transfer-Encoding: quoted-printable
Message-Id: <2E6BC0C8-D435-433A-ABF7-D0E4F649F262@alumni.chalmers.se>
References: <5587F8D1-2242-4579-B992-357C75425A37@alumni.chalmers.se>
 <CAFOYbcmr8LzhUt6VndDgvX+W-0r17X3W0xqt7JYgFJSUVUJF-A@mail.gmail.com>
To: Jack Vogel <jfvogel@gmail.com>
X-Mailer: Apple Mail (2.1499)
X-Gm-Message-State: ALoCoQnQs7wPieT+/P0jO+kHHaSP6N62Obz/c4+bNoOxC15/ncOI4u0wxGzsF83kTrR6HrZZ3Eeg
Cc: freebsd-net@freebsd.org, mxb <mxb@alumni.chalmers.se>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 08 Mar 2013 20:16:55 -0000


Any sysctl I should look out for?

//maxim


On 8 mar 2013, at 21:03, Jack Vogel <jfvogel@gmail.com> wrote:

> The message occurs because you don't have enough mbufs to set up the RX
> ring, so you need to look at nmbclusters. It may be that em is just the
> victim, since you have igb interfaces as well from what I see.
>
> Jack
>
>
> On Fri, Mar 8, 2013 at 11:19 AM, mxb <mxb@alumni.chalmers.se> wrote:
>
>>
>> Hello list@,
>>
>> I'm mostly active on the OpenBSD side; however, I have several machines
>> running FreeBSD with ZFS.
>>
>> I recently upgraded (today) from 8.2-stable to 9.1-rel because em(4)
>> showed the problem in <subject> on 8.2-stable.
>> However, my problem has not disappeared after the upgrade.
>>
>> I serve VMware images from this machine.
>>
>> Configuration:
>>
>> lagg0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 9000
>>        options=4019b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4,VLAN_HWTSO>
>>        ether 00:25:90:24:70:e8
>>        inet 172.16.0.243 netmask 0xfffff800 broadcast 172.16.7.255
>>        nd6 options=1<PERFORMNUD>
>>        media: Ethernet autoselect
>>        status: active
>>        laggproto lacp lagghash l2,l3,l4
>>        laggport: igb0 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>
>>        laggport: em0 flags=18<COLLECTING,DISTRIBUTING>
>> lagg1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 9000
>>        options=4019b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4,VLAN_HWTSO>
>>        ether 00:25:90:24:70:e9
>>        inet 10.11.11.1 netmask 0xffffff00 broadcast 10.11.11.255
>>        inet 10.11.11.11 netmask 0xffffff00 broadcast 10.11.11.255
>>        inet 10.11.11.12 netmask 0xffffff00 broadcast 10.11.11.255
>>        inet 10.11.11.13 netmask 0xffffff00 broadcast 10.11.11.255
>>        inet 10.11.11.14 netmask 0xffffff00 broadcast 10.11.11.255
>>        inet 10.11.11.15 netmask 0xffffff00 broadcast 10.11.11.255
>>        inet 10.11.11.16 netmask 0xffffff00 broadcast 10.11.11.255
>>        inet 10.11.11.17 netmask 0xffffff00 broadcast 10.11.11.255
>>        inet 10.11.11.18 netmask 0xffffff00 broadcast 10.11.11.255
>>        inet 10.11.11.19 netmask 0xffffff00 broadcast 10.11.11.255
>>        inet 10.11.11.20 netmask 0xffffff00 broadcast 10.11.11.255
>>        inet 10.11.11.21 netmask 0xffffff00 broadcast 10.11.11.255
>>        inet 10.11.11.22 netmask 0xffffff00 broadcast 10.11.11.255
>>        inet 10.11.11.23 netmask 0xffffff00 broadcast 10.11.11.255
>>        inet 10.11.11.24 netmask 0xffffff00 broadcast 10.11.11.255
>>        inet 10.11.11.25 netmask 0xffffff00 broadcast 10.11.11.255
>>        inet 10.11.11.26 netmask 0xffffff00 broadcast 10.11.11.255
>>        inet 10.11.11.27 netmask 0xffffff00 broadcast 10.11.11.255
>>        inet 10.11.11.28 netmask 0xffffff00 broadcast 10.11.11.255
>>        nd6 options=1<PERFORMNUD>
>>        media: Ethernet autoselect
>>        status: active
>>        laggproto lacp lagghash l2,l3,l4
>>        laggport: igb1 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>
>>        laggport: em1 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>
>>
>> current sysctl.conf (not fixed after upgrade):
>> # Every socket is a file, so increase them
>> kern.maxfiles=204800
>> kern.maxfilesperproc=200000
>> kern.ipc.maxsockets=204800
>>
>> # Increase max command-line length shown in `ps` (e.g. for Tomcat/Java)
>> # Default is PAGE_SIZE / 16 or 256 on x86
>> # For more info see: http://www.freebsd.org/cgi/query-pr.cgi?pr=120749
>> kern.ps_arg_cache_limit=4096
>>
>> # Security
>> #net.inet.udp.blackhole=1
>> #net.inet.tcp.blackhole=2
>>
>> kern.ipc.maxsockbuf=16777216
>> kern.ipc.nmbclusters=65535
>> kern.ipc.somaxconn=32768
>> #kern.maxfiles=65535
>> kern.maxvnodes=800000
>>
>> vfs.zfs.l2arc_noprefetch=0
>> vfs.zfs.l2arc_write_max=16777216
>> vfs.zfs.l2arc_write_boost=16777216
>>
>> net.inet.tcp.sendspace=65535
>> net.inet.tcp.recvspace=131072
>> net.inet.tcp.mssdflt=1452
>> net.inet.tcp.sendbuf_max=16777216
>> net.inet.tcp.sendbuf_inc=524288
>> net.inet.tcp.recvbuf_max=16777216
>> net.inet.tcp.recvbuf_inc=524288
>> net.inet.udp.recvspace=65535
>> net.inet.udp.maxdgram=65535
>> net.local.stream.recvspace=65535
>> net.local.stream.sendspace=65535
>> net.inet.tcp.delayed_ack=0
>>
>>
>> Any clues?
>>
>> Please mail directly to me or cc to sysop@prisjakt.nu
>>
>> Regards
>> Maxim
>>
>>
>> _______________________________________________
>> freebsd-net@freebsd.org mailing list
>> http://lists.freebsd.org/mailman/listinfo/freebsd-net
>> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
>>


From owner-freebsd-net@FreeBSD.ORG  Fri Mar  8 20:20:09 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id 52313AFC
 for <freebsd-net@freebsd.org>; Fri,  8 Mar 2013 20:20:09 +0000 (UTC)
 (envelope-from jeffrey.e.pieper@intel.com)
Received: from mga11.intel.com (mga11.intel.com [192.55.52.93])
 by mx1.freebsd.org (Postfix) with ESMTP id BAD05880
 for <freebsd-net@freebsd.org>; Fri,  8 Mar 2013 20:20:08 +0000 (UTC)
Received: from fmsmga001.fm.intel.com ([10.253.24.23])
 by fmsmga102.fm.intel.com with ESMTP; 08 Mar 2013 12:19:56 -0800
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="4.84,809,1355126400"; d="scan'208";a="298045704"
Received: from orsmsx105.amr.corp.intel.com ([10.22.225.132])
 by fmsmga001.fm.intel.com with ESMTP; 08 Mar 2013 12:19:51 -0800
Received: from orsmsx110.amr.corp.intel.com (10.22.225.11) by
 ORSMSX105.amr.corp.intel.com (10.22.225.132) with Microsoft SMTP Server (TLS)
 id 14.1.355.2; Fri, 8 Mar 2013 12:19:50 -0800
Received: from orsmsx101.amr.corp.intel.com ([169.254.8.213]) by
 ORSMSX110.amr.corp.intel.com ([10.22.225.11]) with mapi id 14.01.0355.002;
 Fri, 8 Mar 2013 12:19:50 -0800
From: "Pieper, Jeffrey E" <jeffrey.e.pieper@intel.com>
To: mxb <mxb@alumni.chalmers.se>, Jack Vogel <jfvogel@gmail.com>
Subject: RE: 9.1-RELEASE-p1: em0: Could not setup receive structures
Thread-Topic: 9.1-RELEASE-p1: em0: Could not setup receive structures
Thread-Index: AQHOHDnr1t1jJJcMs0m4zpHbw4SFOpicPASA
Date: Fri, 8 Mar 2013 20:19:49 +0000
Message-ID: <2A35EA60C3C77D438915767F458D65687D4608A9@ORSMSX101.amr.corp.intel.com>
References: <5587F8D1-2242-4579-B992-357C75425A37@alumni.chalmers.se>
 <CAFOYbcmr8LzhUt6VndDgvX+W-0r17X3W0xqt7JYgFJSUVUJF-A@mail.gmail.com>
 <2E6BC0C8-D435-433A-ABF7-D0E4F649F262@alumni.chalmers.se>
In-Reply-To: <2E6BC0C8-D435-433A-ABF7-D0E4F649F262@alumni.chalmers.se>
Accept-Language: en-US
Content-Language: en-US
X-MS-Has-Attach: 
X-MS-TNEF-Correlator: 
x-originating-ip: [10.22.254.138]
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Cc: "freebsd-net@freebsd.org" <freebsd-net@freebsd.org>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 08 Mar 2013 20:20:09 -0000

kern.ipc.nmbclusters

Jeff
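
[Editorial sketch, not part of the original message.] Jack's point is that
every RX ring has to be filled with receive buffers when the interface is
initialised, so the cluster pool must at least cover all rings on all
interfaces. The arithmetic can be sketched as below. The queue and descriptor
counts are hypothetical values chosen for illustration; only the interface
names and the nmbclusters value of 65535 come from the configuration quoted
in this thread.

```python
def rx_buffers_needed(interfaces):
    """Return the number of receive buffers needed to fill every RX ring once.

    Each entry is (name, rx_queues, rx_descriptors_per_queue).
    """
    return sum(queues * descriptors for _name, queues, descriptors in interfaces)


# Hypothetical queue/descriptor counts; check the real driver settings.
interfaces = [
    ("igb0", 4, 1024),
    ("igb1", 4, 1024),
    ("em0", 1, 1024),
    ("em1", 1, 1024),
]

needed = rx_buffers_needed(interfaces)
nmbclusters = 65535  # kern.ipc.nmbclusters from the quoted sysctl.conf

print("RX buffers needed to fill all rings:", needed)
print("Remaining for sockets and everything else:", nmbclusters - needed)
```

This only bounds the ring demand. Actual consumption on a running system,
including buffers held by sockets, is what `netstat -m` reports, and the
configured limit can be read with `sysctl kern.ipc.nmbclusters`.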

-----Original Message-----
From: owner-freebsd-net@freebsd.org [mailto:owner-freebsd-net@freebsd.org] On Behalf Of mxb
Sent: Friday, March 08, 2013 12:17 PM
To: Jack Vogel
Cc: freebsd-net@freebsd.org; mxb
Subject: Re: 9.1-RELEASE-p1: em0: Could not setup receive structures


Any sysctl I should look out for?

//maxim


On 8 mar 2013, at 21:03, Jack Vogel <jfvogel@gmail.com> wrote:

> The message occurs because you don't have enough mbufs to set up the RX
> ring, so you need to look at nmbclusters. It may be that em is just the
> victim, since you have igb interfaces as well from what I see.
>
> Jack
>
>
> On Fri, Mar 8, 2013 at 11:19 AM, mxb <mxb@alumni.chalmers.se> wrote:
>
>>
>> Hello list@,
>>
>> I'm mostly active on the OpenBSD side; however, I have several machines
>> running FreeBSD with ZFS.
>>
>> I recently upgraded (today) from 8.2-stable to 9.1-rel because em(4)
>> showed the problem in <subject> on 8.2-stable.
>> However, my problem has not disappeared after the upgrade.
>>
>> I serve VMware images from this machine.
>>
>> Configuration:
>>
>> lagg0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 9000
>>        options=4019b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4,VLAN_HWTSO>
>>        ether 00:25:90:24:70:e8
>>        inet 172.16.0.243 netmask 0xfffff800 broadcast 172.16.7.255
>>        nd6 options=1<PERFORMNUD>
>>        media: Ethernet autoselect
>>        status: active
>>        laggproto lacp lagghash l2,l3,l4
>>        laggport: igb0 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>
>>        laggport: em0 flags=18<COLLECTING,DISTRIBUTING>
>> lagg1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 9000
>>        options=4019b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4,VLAN_HWTSO>
>>        ether 00:25:90:24:70:e9
>>        inet 10.11.11.1 netmask 0xffffff00 broadcast 10.11.11.255
>>        inet 10.11.11.11 netmask 0xffffff00 broadcast 10.11.11.255
>>        inet 10.11.11.12 netmask 0xffffff00 broadcast 10.11.11.255
>>        inet 10.11.11.13 netmask 0xffffff00 broadcast 10.11.11.255
>>        inet 10.11.11.14 netmask 0xffffff00 broadcast 10.11.11.255
>>        inet 10.11.11.15 netmask 0xffffff00 broadcast 10.11.11.255
>>        inet 10.11.11.16 netmask 0xffffff00 broadcast 10.11.11.255
>>        inet 10.11.11.17 netmask 0xffffff00 broadcast 10.11.11.255
>>        inet 10.11.11.18 netmask 0xffffff00 broadcast 10.11.11.255
>>        inet 10.11.11.19 netmask 0xffffff00 broadcast 10.11.11.255
>>        inet 10.11.11.20 netmask 0xffffff00 broadcast 10.11.11.255
>>        inet 10.11.11.21 netmask 0xffffff00 broadcast 10.11.11.255
>>        inet 10.11.11.22 netmask 0xffffff00 broadcast 10.11.11.255
>>        inet 10.11.11.23 netmask 0xffffff00 broadcast 10.11.11.255
>>        inet 10.11.11.24 netmask 0xffffff00 broadcast 10.11.11.255
>>        inet 10.11.11.25 netmask 0xffffff00 broadcast 10.11.11.255
>>        inet 10.11.11.26 netmask 0xffffff00 broadcast 10.11.11.255
>>        inet 10.11.11.27 netmask 0xffffff00 broadcast 10.11.11.255
>>        inet 10.11.11.28 netmask 0xffffff00 broadcast 10.11.11.255
>>        nd6 options=1<PERFORMNUD>
>>        media: Ethernet autoselect
>>        status: active
>>        laggproto lacp lagghash l2,l3,l4
>>        laggport: igb1 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>
>>        laggport: em1 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>
>>
>> current sysctl.conf (not fixed after upgrade):
>> # Every socket is a file, so increase them
>> kern.maxfiles=204800
>> kern.maxfilesperproc=200000
>> kern.ipc.maxsockets=204800
>>
>> # Increase max command-line length shown in `ps` (e.g. for Tomcat/Java)
>> # Default is PAGE_SIZE / 16 or 256 on x86
>> # For more info see: http://www.freebsd.org/cgi/query-pr.cgi?pr=120749
>> kern.ps_arg_cache_limit=4096
>>
>> # Security
>> #net.inet.udp.blackhole=1
>> #net.inet.tcp.blackhole=2
>>
>> kern.ipc.maxsockbuf=16777216
>> kern.ipc.nmbclusters=65535
>> kern.ipc.somaxconn=32768
>> #kern.maxfiles=65535
>> kern.maxvnodes=800000
>>
>> vfs.zfs.l2arc_noprefetch=0
>> vfs.zfs.l2arc_write_max=16777216
>> vfs.zfs.l2arc_write_boost=16777216
>>
>> net.inet.tcp.sendspace=65535
>> net.inet.tcp.recvspace=131072
>> net.inet.tcp.mssdflt=1452
>> net.inet.tcp.sendbuf_max=16777216
>> net.inet.tcp.sendbuf_inc=524288
>> net.inet.tcp.recvbuf_max=16777216
>> net.inet.tcp.recvbuf_inc=524288
>> net.inet.udp.recvspace=65535
>> net.inet.udp.maxdgram=65535
>> net.local.stream.recvspace=65535
>> net.local.stream.sendspace=65535
>> net.inet.tcp.delayed_ack=0
>>
>>
>> Any clues?
>>
>> Please mail directly to me or cc to sysop@prisjakt.nu
>>
>> Regards
>> Maxim
>>
>>

From owner-freebsd-net@FreeBSD.ORG  Fri Mar  8 20:28:13 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id 08E08E80
 for <freebsd-net@freebsd.org>; Fri,  8 Mar 2013 20:28:13 +0000 (UTC)
 (envelope-from wollman@hergotha.csail.mit.edu)
Received: from hergotha.csail.mit.edu
 (wollman-1-pt.tunnel.tserv4.nyc4.ipv6.he.net [IPv6:2001:470:1f06:ccb::2])
 by mx1.freebsd.org (Postfix) with ESMTP id 9464B8E2
 for <freebsd-net@freebsd.org>; Fri,  8 Mar 2013 20:28:12 +0000 (UTC)
Received: from hergotha.csail.mit.edu (localhost [127.0.0.1])
 by hergotha.csail.mit.edu (8.14.5/8.14.5) with ESMTP id r28KSBKx005762;
 Fri, 8 Mar 2013 15:28:11 -0500 (EST)
 (envelope-from wollman@hergotha.csail.mit.edu)
Received: (from wollman@localhost)
 by hergotha.csail.mit.edu (8.14.5/8.14.4/Submit) id r28KSBQV005759;
 Fri, 8 Mar 2013 15:28:11 -0500 (EST) (envelope-from wollman)
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Message-ID: <20794.18907.530374.164737@hergotha.csail.mit.edu>
Date: Fri, 8 Mar 2013 15:28:11 -0500
From: Garrett Wollman <wollman@freebsd.org>
To: Jack Vogel <jfvogel@gmail.com>
Subject: UNS: Re: Limits on jumbo mbuf cluster allocation
In-Reply-To: <CAFOYbc=7iROKzUwnB0fMR=ix8VFo+ONfG=NX43jeF7jkp74JhQ@mail.gmail.com>
References: <20793.36593.774795.720959@hergotha.csail.mit.edu>
 <51399926.6020201@freebsd.org>
 <CAFOYbc=x7U-s70KvcZJdrVP6v-On716qMi=HN1P2Kj+d_K972A@mail.gmail.com>
 <20794.6692.191898.682241@hergotha.csail.mit.edu>
 <513A2887.2010408@freebsd.org>
 <CAFOYbc=7iROKzUwnB0fMR=ix8VFo+ONfG=NX43jeF7jkp74JhQ@mail.gmail.com>
X-Mailer: VM 7.17 under 21.4 (patch 22) "Instant Classic" XEmacs Lucid
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.2.7
 (hergotha.csail.mit.edu [127.0.0.1]); Fri, 08 Mar 2013 15:28:11 -0500 (EST)
X-Spam-Status: No, score=-1.0 required=5.0 tests=ALL_TRUSTED
 autolearn=disabled version=3.3.2
X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on
 hergotha.csail.mit.edu
Cc: freebsd-net@freebsd.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 08 Mar 2013 20:28:13 -0000

<<On Fri, 8 Mar 2013 12:13:28 -0800, Jack Vogel <jfvogel@gmail.com> said:

> Yes, in the past the code was in this form; it should work fine, Garrett.
> Just make sure the 4K pool is large enough.

I take it then that the hardware works in the traditional way, and
just keeps on using buffers until the packet is completely written,
then sets a field on the ring descriptor saying "the end of the packet
is HERE"?

I'll give that change a try when I get a chance.

-GAWollman


From owner-freebsd-net@FreeBSD.ORG  Fri Mar  8 20:33:11 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id 6FC7AD6;
 Fri,  8 Mar 2013 20:33:11 +0000 (UTC)
 (envelope-from jfvogel@gmail.com)
Received: from mail-vb0-x22b.google.com (mail-vb0-x22b.google.com
 [IPv6:2607:f8b0:400c:c02::22b])
 by mx1.freebsd.org (Postfix) with ESMTP id 166C090C;
 Fri,  8 Mar 2013 20:33:11 +0000 (UTC)
Received: by mail-vb0-f43.google.com with SMTP id fs19so818997vbb.30
 for <multiple recipients>; Fri, 08 Mar 2013 12:33:10 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:x-received:in-reply-to:references:date:message-id
 :subject:from:to:cc:content-type;
 bh=/iBg7AYMgRUajUfsdvZt+hW9WgMVNk0duoiVzI09uNY=;
 b=FscbCUFsB28Z6Q/CpT5RGNkHo3ZQY/PQtVk4uOBx3SG8nZWReOvpsGAfhVeRX/czH6
 qAssvofTgp4S6rr6yD4uTka81b2gzVfRxEdazuXqOHWVdVJ5cxwakOshNdmIUnJNt2uh
 hYLIGRjGKVT8R2tumDR5FqmxI0FGdGMl6fga2t3kC+T/66w8kTfvqPNpE6nAUkGSlyQA
 v7Now3ABgoPQP6mfjRLlljxM2U1HgvZc5cNKVxM29rdBX0uDDNRwXe40Dq2I8syoG8Fj
 XXwYnMp9qnl6xHqmzaJURwOR7j+KdyxQ4Xzm/Ga5GxMOYoBW6mxBZhoT4/xRo4ccuhz9
 KbnQ==
MIME-Version: 1.0
X-Received: by 10.52.19.239 with SMTP id i15mr1317983vde.47.1362774789931;
 Fri, 08 Mar 2013 12:33:09 -0800 (PST)
Received: by 10.220.191.132 with HTTP; Fri, 8 Mar 2013 12:33:09 -0800 (PST)
In-Reply-To: <20794.18907.530374.164737@hergotha.csail.mit.edu>
References: <20793.36593.774795.720959@hergotha.csail.mit.edu>
 <51399926.6020201@freebsd.org>
 <CAFOYbc=x7U-s70KvcZJdrVP6v-On716qMi=HN1P2Kj+d_K972A@mail.gmail.com>
 <20794.6692.191898.682241@hergotha.csail.mit.edu>
 <513A2887.2010408@freebsd.org>
 <CAFOYbc=7iROKzUwnB0fMR=ix8VFo+ONfG=NX43jeF7jkp74JhQ@mail.gmail.com>
 <20794.18907.530374.164737@hergotha.csail.mit.edu>
Date: Fri, 8 Mar 2013 12:33:09 -0800
Message-ID: <CAFOYbcnHy+Oh2kjYTG7zj=te2Wi4A6vAOZKmYttMt12EoeW=WQ@mail.gmail.com>
Subject: Re: UNS: Re: Limits on jumbo mbuf cluster allocation
From: Jack Vogel <jfvogel@gmail.com>
To: Garrett Wollman <wollman@freebsd.org>
Content-Type: text/plain; charset=ISO-8859-1
X-Content-Filtered-By: Mailman/MimeDel 2.1.14
Cc: freebsd-net@freebsd.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 08 Mar 2013 20:33:11 -0000

Yes, the write-back descriptor has a bit in the status field that says
whether it is EOP (end of packet) or not.

Jack
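
[Editorial sketch, not part of the original message.] The exchange above
describes a packet that is larger than one receive buffer being spread over
consecutive ring descriptors, with the last one flagged EOP. A minimal model
of how a driver might reassemble such a packet is shown below. The
`RxDescriptor` class and its `eop` field are hypothetical stand-ins for the
hardware write-back descriptor and its status bit; this is not the em(4) or
igb(4) code.

```python
from dataclasses import dataclass


@dataclass
class RxDescriptor:
    data: bytes  # contents of one receive buffer
    eop: bool    # stand-in for the EOP bit in the status field


def assemble_packets(ring):
    """Join consecutive buffers into packets, ending each packet at EOP."""
    packets = []
    fragments = []
    for desc in ring:
        fragments.append(desc.data)
        if desc.eop:
            # Last buffer of this packet: emit it and start a new chain.
            packets.append(b"".join(fragments))
            fragments = []
    return packets


ring = [
    RxDescriptor(b"aa", False),
    RxDescriptor(b"bb", False),
    RxDescriptor(b"c", True),   # end of a three-buffer packet
    RxDescriptor(b"d", True),   # a packet that fits in one buffer
]
print(assemble_packets(ring))
```

With smaller buffers (for example 4K instead of 9K), a jumbo frame simply
occupies more descriptors before the EOP bit appears, which is why the size
of the 4K pool matters.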


On Fri, Mar 8, 2013 at 12:28 PM, Garrett Wollman <wollman@freebsd.org> wrote:

> <<On Fri, 8 Mar 2013 12:13:28 -0800, Jack Vogel <jfvogel@gmail.com> said:
>
> > Yes, in the past the code was in this form; it should work fine, Garrett.
> > Just make sure the 4K pool is large enough.
>
> I take it then that the hardware works in the traditional way, and
> just keeps on using buffers until the packet is completely written,
> then sets a field on the ring descriptor saying "the end of the packet
> is HERE"?
>
> I'll give that change a try when I get a chance.
>
> -GAWollman
>
>

From owner-freebsd-net@FreeBSD.ORG  Fri Mar  8 20:45:42 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@smarthost.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id 5736C533;
 Fri,  8 Mar 2013 20:45:42 +0000 (UTC)
 (envelope-from melifaro@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org
 [IPv6:2001:1900:2254:206c::16:87])
 by mx1.freebsd.org (Postfix) with ESMTP id 1A30198A;
 Fri,  8 Mar 2013 20:45:42 +0000 (UTC)
Received: from freefall.freebsd.org (localhost [127.0.0.1])
 by freefall.freebsd.org (8.14.6/8.14.6) with ESMTP id r28Kjflt047423;
 Fri, 8 Mar 2013 20:45:41 GMT
 (envelope-from melifaro@freefall.freebsd.org)
Received: (from melifaro@localhost)
 by freefall.freebsd.org (8.14.6/8.14.6/Submit) id r28Kjf63047422;
 Fri, 8 Mar 2013 20:45:41 GMT (envelope-from melifaro)
Date: Fri, 8 Mar 2013 20:45:41 GMT
Message-Id: <201303082045.r28Kjf63047422@freefall.freebsd.org>
To: melifaro@FreeBSD.org, freebsd-net@FreeBSD.org, melifaro@FreeBSD.org
From: melifaro@FreeBSD.org
Subject: Re: kern/155772: ifconfig(8): ioctl (SIOCAIFADDR): File exists on
 directly connected networks
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 08 Mar 2013 20:45:42 -0000

Synopsis: ifconfig(8): ioctl (SIOCAIFADDR): File exists on directly connected networks

Responsible-Changed-From-To: freebsd-net->melifaro
Responsible-Changed-By: melifaro
Responsible-Changed-When: Fri Mar 8 20:45:18 UTC 2013
Responsible-Changed-Why: 
Take

http://www.freebsd.org/cgi/query-pr.cgi?pr=155772

From owner-freebsd-net@FreeBSD.ORG  Fri Mar  8 20:51:06 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id 5D08668E
 for <freebsd-net@freebsd.org>; Fri,  8 Mar 2013 20:51:06 +0000 (UTC)
 (envelope-from vegeta@tuxpowered.net)
Received: from mail-ee0-f53.google.com (mail-ee0-f53.google.com [74.125.83.53])
 by mx1.freebsd.org (Postfix) with ESMTP id B81859C4
 for <freebsd-net@freebsd.org>; Fri,  8 Mar 2013 20:51:05 +0000 (UTC)
Received: by mail-ee0-f53.google.com with SMTP id e53so1288731eek.40
 for <freebsd-net@freebsd.org>; Fri, 08 Mar 2013 12:51:04 -0800 (PST)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=google.com; s=20120113;
 h=x-received:from:to:subject:date:user-agent:cc:references
 :in-reply-to:mime-version:content-type:content-transfer-encoding
 :message-id:x-gm-message-state;
 bh=XWil0kz7DOw9U0/ENqWwDm2MDNiKp778Re89w4pDYG4=;
 b=duT3y2usus5Fq8hj6ewPBTLM4XKJ5tAK5O7ilSWUHK+je4NBmTpMm4Peo3wyzzIwvx
 ARhM+e3cs8WjBNd/5Wp7CMbzwT7fTiO31PAMkG6iupw2TitOPGYyMqBfJTSE7zp21/y4
 h8eWSvBjAa2vfbCwnUxSp4mOouU0OL5vqJDmRuA6zcU/NlG2OVAqDYtNtLDiU3ovTuk6
 /CzAKRZ9oD5T+/kriioHFH2itJODE5XpzY4mExy0uZT9hCLiumJhJe2VmIDCxwdYVwF4
 VL5vcXZeFnt10pvPs8C33XHoYZxtsnlN7iMIgS/grbG1E5BqRGtBQBDF0AtFKSUQSbLJ
 jrxQ==
X-Received: by 10.14.0.135 with SMTP id 7mr9518352eeb.5.1362775864101;
 Fri, 08 Mar 2013 12:51:04 -0800 (PST)
Received: from zvezda.localnet ([37.83.50.199])
 by mx.google.com with ESMTPS id s3sm9728785eem.4.2013.03.08.12.51.02
 (version=TLSv1 cipher=RC4-SHA bits=128/128);
 Fri, 08 Mar 2013 12:51:03 -0800 (PST)
From: Kajetan Staszkiewicz <vegeta@tuxpowered.net>
To: Ermal =?utf-8?q?Lu=C3=A7i?= <eri@freebsd.org>
Subject: Re: [patch] Source entries removing is awfully slow.
Date: Fri, 8 Mar 2013 21:51:00 +0100
User-Agent: KMail/1.13.5 (Linux/3.6.6-vegeta.1; KDE/4.4.5; x86_64; ; )
References: <201303081419.17743.vegeta@tuxpowered.net>
 <CAPBZQG2bb2xzPB2UoPUDx-ifyBdmjac6b8kV76DTPBUzLCDmJw@mail.gmail.com>
In-Reply-To: <CAPBZQG2bb2xzPB2UoPUDx-ifyBdmjac6b8kV76DTPBUzLCDmJw@mail.gmail.com>
MIME-Version: 1.0
Content-Type: Text/Plain;
  charset="utf-8"
Content-Transfer-Encoding: quoted-printable
Message-Id: <201303082151.00895.vegeta@tuxpowered.net>
X-Gm-Message-State: ALoCoQlvCQaYqgOa7UB12jEbGoAdKsTZFhU4qyepSvGKwF88C7hO+jHjs4wrr06SVtqLNw0uP+5q
Cc: "freebsd-net@freebsd.org" <freebsd-net@freebsd.org>,
 "freebsd-pf@freebsd.org" <freebsd-pf@freebsd.org>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 08 Mar 2013 20:51:06 -0000

On Friday, 8 March 2013 at 21:11:43, Ermal Luçi wrote:
> Is this FreeBSD 9.x or HEAD?

I found the problem and developed the patch on 9.1.

-- 
| pozdrawiam / greetings | powered by Debian, CentOS and FreeBSD |
|  Kajetan Staszkiewicz  | jabber,email: vegeta()tuxpowered net  |
|        Vegeta          | www: http://vegeta.tuxpowered.net     |
`------------------------^---------------------------------------'

From owner-freebsd-net@FreeBSD.ORG  Fri Mar  8 23:14:14 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@smarthost.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id 75FDAA5D;
 Fri,  8 Mar 2013 23:14:14 +0000 (UTC)
 (envelope-from linimon@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org
 [IPv6:2001:1900:2254:206c::16:87])
 by mx1.freebsd.org (Postfix) with ESMTP id 3C97DECD;
 Fri,  8 Mar 2013 23:14:14 +0000 (UTC)
Received: from freefall.freebsd.org (localhost [127.0.0.1])
 by freefall.freebsd.org (8.14.6/8.14.6) with ESMTP id r28NEE6c077371;
 Fri, 8 Mar 2013 23:14:14 GMT
 (envelope-from linimon@freefall.freebsd.org)
Received: (from linimon@localhost)
 by freefall.freebsd.org (8.14.6/8.14.6/Submit) id r28NEEud077370;
 Fri, 8 Mar 2013 23:14:14 GMT (envelope-from linimon)
Date: Fri, 8 Mar 2013 23:14:14 GMT
Message-Id: <201303082314.r28NEEud077370@freefall.freebsd.org>
To: linimon@FreeBSD.org, freebsd-bugs@FreeBSD.org, freebsd-net@FreeBSD.org
From: linimon@FreeBSD.org
Subject: Re: kern/176764: [net] [if_bridge] [patch] use-after-free in if_bridge
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 08 Mar 2013 23:14:14 -0000

Old Synopsis: [net] [if_bridge] use-after-free in if_bridge
New Synopsis: [net] [if_bridge] [patch] use-after-free in if_bridge

Responsible-Changed-From-To: freebsd-bugs->freebsd-net
Responsible-Changed-By: linimon
Responsible-Changed-When: Fri Mar 8 23:13:52 UTC 2013
Responsible-Changed-Why: 
Over to maintainer(s).

http://www.freebsd.org/cgi/query-pr.cgi?pr=176764

From owner-freebsd-net@FreeBSD.ORG  Sat Mar  9 00:48:22 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id CC08DDF5;
 Sat,  9 Mar 2013 00:48:22 +0000 (UTC)
 (envelope-from rmacklem@uoguelph.ca)
Received: from esa-jnhn.mail.uoguelph.ca (esa-jnhn.mail.uoguelph.ca
 [131.104.91.44]) by mx1.freebsd.org (Postfix) with ESMTP id 39715224;
 Sat,  9 Mar 2013 00:48:21 +0000 (UTC)
X-IronPort-Anti-Spam-Filtered: true
X-IronPort-Anti-Spam-Result: AqEEAIyFOlGDaFvO/2dsb2JhbABDiCW8NoFzdIIsAQEBAwEBAQEgBCcgCxsYAgINGQIpAQkmBggHBAEcBIdsBgypZ5I3gSOMMwV9NAeCLYETA4hxiySCPoEej1SDKE99CBce
X-IronPort-AV: E=Sophos;i="4.84,810,1355115600"; d="scan'208";a="20199368"
Received: from erie.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca)
 ([131.104.91.206])
 by esa-jnhn.mail.uoguelph.ca with ESMTP; 08 Mar 2013 19:47:13 -0500
Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1])
 by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id A6806B4032;
 Fri,  8 Mar 2013 19:47:13 -0500 (EST)
Date: Fri, 8 Mar 2013 19:47:13 -0500 (EST)
From: Rick Macklem <rmacklem@uoguelph.ca>
To: Garrett Wollman <wollman@bimajority.org>
Message-ID: <2050712270.3721724.1362790033662.JavaMail.root@erie.cs.uoguelph.ca>
In-Reply-To: <20794.7012.265887.99878@hergotha.csail.mit.edu>
Subject: Re: Limits on jumbo mbuf cluster allocation
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-Originating-IP: [172.17.91.201]
X-Mailer: Zimbra 6.0.10_GA_2692 (ZimbraWebClient - FF3.0 (Win)/6.0.10_GA_2692)
Cc: jfv@freebsd.org, freebsd-net@freebsd.org,
 Andre Oppermann <andre@freebsd.org>, Garrett Wollman <wollman@freebsd.org>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 09 Mar 2013 00:48:22 -0000

Garrett Wollman wrote:
> <<On Fri, 08 Mar 2013 08:54:14 +0100, Andre Oppermann
> <andre@freebsd.org> said:
> 
> > [stuff I wrote deleted]
> > You have an amd64 kernel running HEAD or 9.x?
> 
> Yes, these are 9.1 with some patches to reduce mutex contention on the
> NFS server's replay "cache".
> 
The cached replies are copies of the mbuf list done via m_copym().
As such, the clusters in these replies won't be freed (ref cnt -> 0)
until the cache is trimmed (nfsrv_trimcache() gets called after the
TCP layer has received an ACK for receipt of the reply from the client).

If reducing the size to 4K doesn't fix the problem, you might want to
consider shrinking the tunable vfs.nfsd.tcphighwater and suffering
the increased CPU overhead (and some increased mutex contention) of
calling nfsrv_trimcache() more frequently.
(I'm assuming that you are using drc2.patch + drc3.patch. If you are
 using one of ivoras@'s variants of the patch, I'm not sure if the
 tunable is called the same thing, although it should have basically
 the same effect.)

Good luck with it and thanks for running on the "bleeding edge" so
these issues get identified, rick

> > Jumbo pages come directly from the kernel_map which on amd64 is
> > 512GB.
> > So KVA shouldn't be a problem. Your problem indeed appears to come
> > from physical memory fragmentation in pmap.
> 
> I hadn't realized that they were physically contiguous, but that makes
> perfect sense.
> 
> > pages. Also since you're doing NFS serving almost all memory will be
> > in use for file caching.
> 
> I actually had the ZFS ARC tuned down to 64 GB (out of 96 GB physmem)
> when I experienced this, but there are plenty of data structures in
> the kernel that aren't subject to this limit and I could easily
> imagine them checkerboarding physical memory to the point where no
> contiguous three-page allocations were possible.
> 
> -GAWollman
> 
> _______________________________________________
> freebsd-net@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"

From owner-freebsd-net@FreeBSD.ORG  Sat Mar  9 01:40:03 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id 95E148D8
 for <freebsd-net@freebsd.org>; Sat,  9 Mar 2013 01:40:03 +0000 (UTC)
 (envelope-from wollman@hergotha.csail.mit.edu)
Received: from hergotha.csail.mit.edu
 (wollman-1-pt.tunnel.tserv4.nyc4.ipv6.he.net [IPv6:2001:470:1f06:ccb::2])
 by mx1.freebsd.org (Postfix) with ESMTP id 37E2436A
 for <freebsd-net@freebsd.org>; Sat,  9 Mar 2013 01:40:03 +0000 (UTC)
Received: from hergotha.csail.mit.edu (localhost [127.0.0.1])
 by hergotha.csail.mit.edu (8.14.5/8.14.5) with ESMTP id r291e1kw010071;
 Fri, 8 Mar 2013 20:40:01 -0500 (EST)
 (envelope-from wollman@hergotha.csail.mit.edu)
Received: (from wollman@localhost)
 by hergotha.csail.mit.edu (8.14.5/8.14.4/Submit) id r291e1O7010068;
 Fri, 8 Mar 2013 20:40:01 -0500 (EST) (envelope-from wollman)
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Message-ID: <20794.37617.822910.93537@hergotha.csail.mit.edu>
Date: Fri, 8 Mar 2013 20:40:01 -0500
From: Garrett Wollman <wollman@freebsd.org>
To: Rick Macklem <rmacklem@uoguelph.ca>
Subject: Re: Limits on jumbo mbuf cluster allocation
In-Reply-To: <2050712270.3721724.1362790033662.JavaMail.root@erie.cs.uoguelph.ca>
References: <20794.7012.265887.99878@hergotha.csail.mit.edu>
 <2050712270.3721724.1362790033662.JavaMail.root@erie.cs.uoguelph.ca>
X-Mailer: VM 7.17 under 21.4 (patch 22) "Instant Classic" XEmacs Lucid
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.2.7
 (hergotha.csail.mit.edu [127.0.0.1]); Fri, 08 Mar 2013 20:40:01 -0500 (EST)
X-Spam-Status: No, score=-1.0 required=5.0 tests=ALL_TRUSTED
 autolearn=disabled version=3.3.2
X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on
 hergotha.csail.mit.edu
Cc: freebsd-net@freebsd.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 09 Mar 2013 01:40:03 -0000

<<On Fri, 8 Mar 2013 19:47:13 -0500 (EST), Rick Macklem <rmacklem@uoguelph.ca> said:

> If reducing the size to 4K doesn't fix the problem, you might want to
> consider shrinking the tunable vfs.nfsd.tcphighwater and suffering
> the increased CPU overhead (and some increased mutex contention) of
> calling nfsrv_trimcache() more frequently.

Can't do that -- the system becomes intolerably slow when it gets into
that state, and seems to get stuck that way, such that the only way to
restore performance is to increase the size of the "cache".
(Essentially all of the nfsd service threads end up spinning most of
the time, load average goes to N, and goodput goes to nearly nil.)  It
does seem like a lot of effort for an extreme edge case that, in
practical terms, never happens.
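
One way to picture the stuck state described above is a toy model, sketched
here in Python rather than taken from the nfsd code. It assumes
(hypothetically) that a cached reply becomes trimmable only after a fixed
`lifetime`, and that a trim pass runs on every request while the cache is
above the limit. `simulate`, `rate` and `lifetime` are illustrative names,
not anything from drc2.patch or drc3.patch.

```python
from collections import deque


def simulate(highwater, rate, lifetime, seconds):
    """Toy model: count trim passes for a given highwater mark.

    highwater -- cache size above which a trim pass is attempted
    rate      -- replies cached per second (assumed constant)
    lifetime  -- seconds before an entry may be trimmed (assumption)
    seconds   -- how long to run the model
    """
    cache = deque()      # timestamps of cached replies, oldest first
    trim_passes = 0
    for now in range(seconds):
        for _ in range(rate):
            cache.append(now)
            if len(cache) > highwater:
                trim_passes += 1
                # Only entries older than `lifetime` can be dropped.
                while cache and now - cache[0] >= lifetime:
                    cache.popleft()
    return trim_passes


# Limit just above rate * lifetime versus a limit well below it.
print(simulate(6100, 100, 60, 120))
print(simulate(3000, 100, 60, 120))
```

In this model, once the limit is below rate * lifetime, nearly every request
triggers a trim pass that frees little or nothing, which is at least
consistent with the spinning behaviour reported above.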

> (I'm assuming that you are using drc2.patch + drc3.patch.

I believe that's what I have.  If my kernel coding skills were less
rusty, I'd fix it to have a separate cache-trimming thread.

One other weird thing that I've noticed is that netstat(1) reports the
send and receive queues on NFS connections as being far higher than I
have the limits configured.  Does NFS do something to override this?

-GAWollman


From owner-freebsd-net@FreeBSD.ORG  Sat Mar  9 01:52:46 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id B2E9AABF;
 Sat,  9 Mar 2013 01:52:46 +0000 (UTC)
 (envelope-from wollman@hergotha.csail.mit.edu)
Received: from hergotha.csail.mit.edu
 (wollman-1-pt.tunnel.tserv4.nyc4.ipv6.he.net [IPv6:2001:470:1f06:ccb::2])
 by mx1.freebsd.org (Postfix) with ESMTP id 5D8EB3E9;
 Sat,  9 Mar 2013 01:52:46 +0000 (UTC)
Received: from hergotha.csail.mit.edu (localhost [127.0.0.1])
 by hergotha.csail.mit.edu (8.14.5/8.14.5) with ESMTP id r291qj70010186;
 Fri, 8 Mar 2013 20:52:45 -0500 (EST)
 (envelope-from wollman@hergotha.csail.mit.edu)
Received: (from wollman@localhost)
 by hergotha.csail.mit.edu (8.14.5/8.14.4/Submit) id r291qjvK010183;
 Fri, 8 Mar 2013 20:52:45 -0500 (EST) (envelope-from wollman)
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Message-ID: <20794.38381.221980.5038@hergotha.csail.mit.edu>
Date: Fri, 8 Mar 2013 20:52:45 -0500
From: Garrett Wollman <wollman@freebsd.org>
To: Rick Macklem <rmacklem@uoguelph.ca>
Subject: Re: NFS DRC size
In-Reply-To: <2050712270.3721724.1362790033662.JavaMail.root@erie.cs.uoguelph.ca>
References: <20794.7012.265887.99878@hergotha.csail.mit.edu>
 <2050712270.3721724.1362790033662.JavaMail.root@erie.cs.uoguelph.ca>
X-Mailer: VM 7.17 under 21.4 (patch 22) "Instant Classic" XEmacs Lucid
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.2.7
 (hergotha.csail.mit.edu [127.0.0.1]); Fri, 08 Mar 2013 20:52:45 -0500 (EST)
X-Spam-Status: No, score=-1.0 required=5.0 tests=ALL_TRUSTED
 autolearn=disabled version=3.3.2
X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on
 hergotha.csail.mit.edu
Cc: freebsd-fs@freebsd.org, freebsd-net@freebsd.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 09 Mar 2013 01:52:46 -0000

<<On Fri, 8 Mar 2013 19:47:13 -0500 (EST), Rick Macklem <rmacklem@uoguelph.ca> said:

> The cached replies are copies of the mbuf list done via m_copym().
> As such, the clusters in these replies won't be free'd (ref cnt -> 0)
> until the cache is trimmed (nfsrv_trimcache() gets called after the
> TCP layer has received an ACK for receipt of the reply from the client).

I wonder if this bit is even working at all.  In my experience, the
size of the DRC quickly grows under load up to the maximum (or
actually, slightly beyond), and never drops much below that level.  On
my production server right now, "nfsstat -se" reports:

Server Info:
  Getattr   Setattr    Lookup  Readlink      Read     Write    Create    Remove
 13036780    359901   1723623      3420  36397693  12385668    346590    109984
   Rename      Link   Symlink     Mkdir     Rmdir   Readdir  RdirPlus    Access
    45173        16    116791     14192      1176        24  12876747   3398533
    Mknod    Fsstat    Fsinfo  PathConf    Commit   LookupP   SetClId SetClIdCf
        0      2703     14992      7502   1329196         0         1         1
     Open  OpenAttr OpenDwnGr  OpenCfrm DelePurge   DeleRet     GetFH      Lock
   263034         0         0    263019         0         0    545104         0
    LockT     LockU     Close    Verify   NVerify     PutFH  PutPubFH PutRootFH
        0         0    263012         0         0  23753375         0         1
    Renew RestoreFH    SaveFH   Secinfo RelLckOwn  V4Create
        2    263006    263033         0         0         0
Server:
Retfailed    Faults   Clients
        0         0         1
OpenOwner     Opens LockOwner     Locks    Delegs 
       56        10         0         0         0 
Server Cache Stats:
   Inprog      Idem  Non-idem    Misses CacheSize   TCPPeak
        0         0         0  81714128     60997     61017

It's only been up for about the last 24 hours.  Should I be setting
the size limit to something truly outrageous, like 200,000?  (I'd
definitely need to deal with the mbuf cluster issue then!)  The
average request rate over this time is about 1000/s, but that includes
several episodes of high-cpu spinning (which I resolved by increasing
the DRC limit).
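
As a rough sanity check on those numbers, here is some back-of-the-envelope
arithmetic as a Python sketch. It assumes the uptime is exactly 24 hours (it
is only "about" that) and treats every cache miss as one request that adds
one cache entry. Both are simplifications, and the function names are made
up for this note.

```python
def request_rate(misses, uptime_seconds):
    """Average requests per second, treating each miss as one request."""
    return misses / uptime_seconds


def retention_seconds(cache_entries, rate):
    """Seconds of traffic that fit in the cache at the given rate."""
    return cache_entries / rate


rate = request_rate(81_714_128, 24 * 3600)   # Misses over ~24 h of uptime
print(f"average rate: {rate:.0f} requests/s")
print(f"CacheSize 60997 covers about {retention_seconds(60_997, rate):.0f} s")
print(f"a limit of 200000 would cover about {retention_seconds(200_000, rate):.0f} s")
```

Under those assumptions the current limit corresponds to roughly a minute of
traffic, and a limit of 200,000 to a few minutes.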

Meanwhile, some relevant bits from sysctl:

vfs.nfsd.udphighwater: 500
vfs.nfsd.tcphighwater: 61000
vfs.nfsd.minthreads: 16
vfs.nfsd.maxthreads: 64
vfs.nfsd.threads: 64
vfs.nfsd.request_space_used: 1416
vfs.nfsd.request_space_used_highest: 4284672
vfs.nfsd.request_space_high: 47185920
vfs.nfsd.request_space_low: 31457280
vfs.nfsd.request_space_throttled: 0
vfs.nfsd.request_space_throttle_count: 0

(I'd actually like to put maxthreads back up at 256, which is where I
had it during testing, but I need to test that the jumbo-frames issue
is fixed first.  I did pre-production testing on a non-jumbo network.)

-GAWollman


From owner-freebsd-net@FreeBSD.ORG  Sat Mar  9 12:14:25 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id 716F8F4F;
 Sat,  9 Mar 2013 12:14:25 +0000 (UTC)
 (envelope-from ermal.luci@gmail.com)
Received: from mail-qe0-f42.google.com (mail-qe0-f42.google.com
 [209.85.128.42]) by mx1.freebsd.org (Postfix) with ESMTP id 21C54D0F;
 Sat,  9 Mar 2013 12:14:24 +0000 (UTC)
Received: by mail-qe0-f42.google.com with SMTP id f6so1549085qej.1
 for <multiple recipients>; Sat, 09 Mar 2013 04:14:18 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:x-received:sender:in-reply-to:references:date
 :x-google-sender-auth:message-id:subject:from:to:cc:content-type;
 bh=UHcs7vGdgH8kblpB8G3zinSns84zyrBia5+mdu0h4JI=;
 b=FaYXZcVSrwD+4NmmdszJDXIVru1k+chxA4egobAae5nsfklbiXzpU2nadlpVps1oWh
 zq76F8b+GoPKi4FW4H1x56YOOC8WsrYnSIT9dm5w4TZDdaRsmteMu05mjdLL/IEGRyR/
 e0qRYx3bv3lH2fuVBzUQxD1JZYGDurLWcJvidIZZ68KesaCGx3rVJ9O9uX+cZVcO/n7H
 ROM8KbnmmrJ5/wtcktUCUEeFqfMvdplRp3R0TRQ4bazuIFnVBdSZfp4V3HWFjHeDpHS/
 DCRQ6bc22+J0EKc0CNLg/rEz/klFMtYwPeOUOmfUZjMq41V3E50bljkiXPYO/T1VdOAs
 ziuA==
MIME-Version: 1.0
X-Received: by 10.49.6.101 with SMTP id z5mr9322969qez.50.1362831258610; Sat,
 09 Mar 2013 04:14:18 -0800 (PST)
Sender: ermal.luci@gmail.com
Received: by 10.49.27.197 with HTTP; Sat, 9 Mar 2013 04:14:16 -0800 (PST)
In-Reply-To: <201303082151.00895.vegeta@tuxpowered.net>
References: <201303081419.17743.vegeta@tuxpowered.net>
 <CAPBZQG2bb2xzPB2UoPUDx-ifyBdmjac6b8kV76DTPBUzLCDmJw@mail.gmail.com>
 <201303082151.00895.vegeta@tuxpowered.net>
Date: Sat, 9 Mar 2013 13:14:16 +0100
X-Google-Sender-Auth: PO_l65cnq0c2RwQhae4xh5miDZE
Message-ID: <CAPBZQG0Jj_c-XvVJNV2S02xcitr+nhs+mV=GjJm3YeM6iPUX7g@mail.gmail.com>
Subject: Re: [patch] Source entries removing is awfully slow.
From: =?ISO-8859-1?Q?Ermal_Lu=E7i?= <eri@freebsd.org>
To: Kajetan Staszkiewicz <vegeta@tuxpowered.net>
Content-Type: multipart/mixed; boundary=047d7bea40f00ef2e504d77ce15b
X-Content-Filtered-By: Mailman/MimeDel 2.1.14
Cc: "freebsd-net@freebsd.org" <freebsd-net@freebsd.org>,
 "freebsd-pf@freebsd.org" <freebsd-pf@freebsd.org>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 09 Mar 2013 12:14:25 -0000

--047d7bea40f00ef2e504d77ce15b
Content-Type: text/plain; charset=ISO-8859-2
Content-Transfer-Encoding: quoted-printable

On Fri, Mar 8, 2013 at 9:51 PM, Kajetan Staszkiewicz
<vegeta@tuxpowered.net> wrote:

> On Friday, 8 March 2013 at 21:11:43, Ermal Luçi wrote:
> > Is this FreeBSD 9.x or HEAD?
>
> I found the problem and developed the patch on 9.1.
>
> Can you please test this more 'beautiful' patch.
It's similar to yours, but it also defers src state removal to the proper
purge thread.

Although the src node removal option in pfctl -K does a lot of work to
clean things up, I still need to understand why it takes so much time for
you to loop through 500K states.
The purge thread does that every tick by partitioning the work into a few
states per time slot, but minutes is still way too long.

Can you please try to give a top -SH view of the time when this happens and
a pfctl -vvsa output?


> --
> | pozdrawiam / greetings | powered by Debian, CentOS and FreeBSD |
> |  Kajetan Staszkiewicz  | jabber,email: vegeta()tuxpowered net  |
> |        Vegeta          | www: http://vegeta.tuxpowered.net     |
> `------------------------^---------------------------------------'
>


--=20
Ermal

--047d7bea40f00ef2e504d77ce15b
Content-Type: application/octet-stream; 
	name="state_unlink_optimization2.diff"
Content-Disposition: attachment; filename="state_unlink_optimization2.diff"
Content-Transfer-Encoding: base64
X-Attachment-Id: f_he2q1w430

ZGlmZiAtLWdpdCBhL3N5cy9jb250cmliL3BmL25ldC9wZi5jIGIvc3lzL2NvbnRyaWIvcGYvbmV0
L3BmLmMKaW5kZXggOWZiMDVhZS4uNGRmNDBjYyAxMDA2NDQKLS0tIGEvc3lzL2NvbnRyaWIvcGYv
bmV0L3BmLmMKKysrIGIvc3lzL2NvbnRyaWIvcGYvbmV0L3BmLmMKQEAgLTcyMCw2ICs3MjAsOSBA
QCBwZl9pbnNlcnRfc3JjX25vZGUoc3RydWN0IHBmX3NyY19ub2RlICoqc24sIHN0cnVjdCBwZl9y
dWxlICpydWxlLAogCQkgICAgcnVsZS0+bWF4X3NyY19jb25uX3JhdGUubGltaXQsCiAJCSAgICBy
dWxlLT5tYXhfc3JjX2Nvbm5fcmF0ZS5zZWNvbmRzKTsKIAorI2lmZGVmIF9fRnJlZUJTRF9fCisJ
CVRBSUxRX0lOSVQoJigqc24pLT5zdGF0ZV9saXN0KTsKKyNlbmRpZgogCQkoKnNuKS0+YWYgPSBh
ZjsKIAkJaWYgKHJ1bGUtPnJ1bGVfZmxhZyAmIFBGUlVMRV9SVUxFU1JDVFJBQ0sgfHwKIAkJICAg
IHJ1bGUtPnJwb29sLm9wdHMgJiBQRl9QT09MX1NUSUNLWUFERFIpCkBAIC0xNDUzLDYgKzE0NTYs
OSBAQCBwZl9wdXJnZV9leHBpcmVkX3NyY19ub2RlcyhpbnQgd2FzbG9ja2VkKQogI2VuZGlmCiB7
CiAJc3RydWN0IHBmX3NyY19ub2RlCQkqY3VyLCAqbmV4dDsKKyNpZmRlZiBfX0ZyZWVCU0RfXwor
CXN0cnVjdCBwZl9zdGF0ZQkJCSpzOworI2VuZGlmCiAJaW50CQkJCSBsb2NrZWQgPSB3YXNsb2Nr
ZWQ7CiAKICNpZmRlZiBfX0ZyZWVCU0RfXwpAQCAtMTQ4Niw2ICsxNDkyLDEyIEBAIHBmX3B1cmdl
X2V4cGlyZWRfc3JjX25vZGVzKGludCB3YXNsb2NrZWQpCiAJCQkJCXBmX3JtX3J1bGUoTlVMTCwg
Y3VyLT5ydWxlLnB0cik7CiAJCQl9CiAjaWZkZWYgX19GcmVlQlNEX18KKwkJCXdoaWxlICghVEFJ
TFFfRU1QVFkoJmN1ci0+c3RhdGVfbGlzdCkpIHsKKwkJCQlzID0gVEFJTFFfRklSU1QoJmN1ci0+
c3RhdGVfbGlzdCk7CisJCQkJVEFJTFFfUkVNT1ZFKCZjdXItPnN0YXRlX2xpc3QsIHMsIHNyY25v
ZGVfbGluayk7CisJCQkJcy0+c3JjX25vZGUgPSBOVUxMOworCQkJCXMtPm5hdF9zcmNfbm9kZSA9
IE5VTEw7CisJCQl9CiAJCQlSQl9SRU1PVkUocGZfc3JjX3RyZWUsICZWX3RyZWVfc3JjX3RyYWNr
aW5nLCBjdXIpOwogCQkJVl9wZl9zdGF0dXMuc2NvdW50ZXJzW1NDTlRfU1JDX05PREVfUkVNT1ZB
TFNdKys7CiAJCQlWX3BmX3N0YXR1cy5zcmNfbm9kZXMtLTsKQEAgLTE1MjksNiArMTU0MSwxMCBA
QCBwZl9zcmNfdHJlZV9yZW1vdmVfc3RhdGUoc3RydWN0IHBmX3N0YXRlICpzKQogI2VuZGlmCiAJ
CQlzLT5zcmNfbm9kZS0+ZXhwaXJlID0gdGltZV9zZWNvbmQgKyB0aW1lb3V0OwogCQl9CisjaWZk
ZWYgX19GcmVlQlNEX18KKwkJaWYgKCFUQUlMUV9FTVBUWSgmcy0+c3JjX25vZGUtPnN0YXRlX2xp
c3QpKQorCQkJVEFJTFFfUkVNT1ZFKCZzLT5zcmNfbm9kZS0+c3RhdGVfbGlzdCwgcywgc3Jjbm9k
ZV9saW5rKTsKKyNlbmRpZgogCX0KIAlpZiAocy0+bmF0X3NyY19ub2RlICE9IHMtPnNyY19ub2Rl
ICYmIHMtPm5hdF9zcmNfbm9kZSAhPSBOVUxMKSB7CiAJCWlmICgtLXMtPm5hdF9zcmNfbm9kZS0+
c3RhdGVzIDw9IDApIHsKQEAgLTE1NDIsNiArMTU1OCwxMCBAQCBwZl9zcmNfdHJlZV9yZW1vdmVf
c3RhdGUoc3RydWN0IHBmX3N0YXRlICpzKQogI2VuZGlmCiAJCQlzLT5uYXRfc3JjX25vZGUtPmV4
cGlyZSA9IHRpbWVfc2Vjb25kICsgdGltZW91dDsKIAkJfQorI2lmZGVmIF9fRnJlZUJTRF9fCisJ
CWlmICghVEFJTFFfRU1QVFkoJnMtPm5hdF9zcmNfbm9kZS0+c3RhdGVfbGlzdCkpCisJCQlUQUlM
UV9SRU1PVkUoJnMtPm5hdF9zcmNfbm9kZS0+c3RhdGVfbGlzdCwgcywgc3Jjbm9kZV9saW5rKTsK
KyNlbmRpZgogCX0KIAlzLT5zcmNfbm9kZSA9IHMtPm5hdF9zcmNfbm9kZSA9IE5VTEw7CiB9CkBA
IC0zOTQ5LDggKzM5NjksMTggQEAgcGZfY3JlYXRlX3N0YXRlKHN0cnVjdCBwZl9ydWxlICpyLCBz
dHJ1Y3QgcGZfcnVsZSAqbnIsIHN0cnVjdCBwZl9ydWxlICphLAogCQlwb29sX3B1dCgmcGZfc3Rh
dGVfcGwsIHMpOwogI2VuZGlmCiAJCXJldHVybiAoUEZfRFJPUCk7CisjaWZkZWYgX19GcmVlQlNE
X18KKwl9IGVsc2UgeworCQlpZiAoc24gIT0gTlVMTCkKKwkJCVRBSUxRX0lOU0VSVF9IRUFEKCZz
bi0+c3RhdGVfbGlzdCwgcywgc3Jjbm9kZV9saW5rKTsKKwkJaWYgKG5zbiAhPSBOVUxMKQorCQkJ
VEFJTFFfSU5TRVJUX0hFQUQoJm5zbi0+c3RhdGVfbGlzdCwgcywgc3Jjbm9kZV9saW5rKTsKKwkJ
KnNtID0gczsKKwl9CisjZWxzZQogCX0gZWxzZQogCQkqc20gPSBzOworI2VuZGlmCiAKIAlwZl9z
ZXRfcnRfaWZwKHMsIHBkLT5zcmMpOwkvKiBuZWVkcyBzLT5zdGF0ZV9rZXkgc2V0ICovCiAJaWYg
KHRhZyA+IDApIHsKZGlmZiAtLWdpdCBhL3N5cy9jb250cmliL3BmL25ldC9wZl9pb2N0bC5jIGIv
c3lzL2NvbnRyaWIvcGYvbmV0L3BmX2lvY3RsLmMKaW5kZXggM2IxMzBlNS4uMjg2NGU5YSAxMDA2
NDQKLS0tIGEvc3lzL2NvbnRyaWIvcGYvbmV0L3BmX2lvY3RsLmMKKysrIGIvc3lzL2NvbnRyaWIv
cGYvbmV0L3BmX2lvY3RsLmMKQEAgLTM3ODksNyArMzc4OSw5IEBAIHBmaW9jdGwoZGV2X3QgZGV2
LCB1X2xvbmcgY21kLCBjYWRkcl90IGFkZHIsIGludCBmbGFncywgc3RydWN0IHByb2MgKnApCiAK
IAljYXNlIERJT0NLSUxMU1JDTk9ERVM6IHsKIAkJc3RydWN0IHBmX3NyY19ub2RlCSpzbjsKKyNp
Zm5kZWYgX19GcmVlQlNEX18KIAkJc3RydWN0IHBmX3N0YXRlCQkqczsKKyNlbmRpZgogCQlzdHJ1
Y3QgcGZpb2Nfc3JjX25vZGVfa2lsbCAqcHNuayA9CiAJCSAgICAoc3RydWN0IHBmaW9jX3NyY19u
b2RlX2tpbGwgKilhZGRyOwogCQl1X2ludAkJCWtpbGxlZCA9IDA7CkBAIC0zODA4LDYgKzM4MTAs
NyBAQCBwZmlvY3RsKGRldl90IGRldiwgdV9sb25nIGNtZCwgY2FkZHJfdCBhZGRyLCBpbnQgZmxh
Z3MsIHN0cnVjdCBwcm9jICpwKQogCQkJCSZwc25rLT5wc25rX2RzdC5hZGRyLnYuYS5tYXNrLAog
CQkJCSZzbi0+cmFkZHIsIHNuLT5hZikpIHsKIAkJCQkvKiBIYW5kbGUgc3RhdGUgdG8gc3JjX25v
ZGUgbGlua2FnZSAqLworI2lmbmRlZiBfX0ZyZWVCU0RfXyAKIAkJCQlpZiAoc24tPnN0YXRlcyAh
PSAwKSB7CiAJCQkJCVJCX0ZPUkVBQ0gocywgcGZfc3RhdGVfdHJlZV9pZCwKICNpZmRlZiBfX0Zy
ZWVCU0RfXwpAQCAtMzgyMiwxMyArMzgyNSwxNiBAQCBwZmlvY3RsKGRldl90IGRldiwgdV9sb25n
IGNtZCwgY2FkZHJfdCBhZGRyLCBpbnQgZmxhZ3MsIHN0cnVjdCBwcm9jICpwKQogCQkJCQl9CiAJ
CQkJCXNuLT5zdGF0ZXMgPSAwOwogCQkJCX0KKyNlbmRpZgogCQkJCXNuLT5leHBpcmUgPSAxOwog
CQkJCWtpbGxlZCsrOwogCQkJfQogCQl9CiAKKyNpZiAwCiAJCWlmIChraWxsZWQgPiAwKQogCQkJ
cGZfcHVyZ2VfZXhwaXJlZF9zcmNfbm9kZXMoMSk7CisjZW5kaWYKIAogCQlwc25rLT5wc25rX2tp
bGxlZCA9IGtpbGxlZDsKIAkJYnJlYWs7CmRpZmYgLS1naXQgYS9zeXMvY29udHJpYi9wZi9uZXQv
cGZ2YXIuaCBiL3N5cy9jb250cmliL3BmL25ldC9wZnZhci5oCmluZGV4IGRhYjcwYzUuLmUzMWQz
OWQgMTAwNjQ0Ci0tLSBhL3N5cy9jb250cmliL3BmL25ldC9wZnZhci5oCisrKyBiL3N5cy9jb250
cmliL3BmL25ldC9wZnZhci5oCkBAIC03MzksNiArNzM5LDkgQEAgc3RydWN0IHBmX3NyY19ub2Rl
IHsKIAlzdHJ1Y3QgcGZfYWRkcgkgcmFkZHI7CiAJdW5pb24gcGZfcnVsZV9wdHIgcnVsZTsKIAlz
dHJ1Y3QgcGZpX2tpZgkqa2lmOworI2lmZGVmIF9fRnJlZUJTRF9fCisJVEFJTFFfSEVBRCgsIHBm
X3N0YXRlKQlzdGF0ZV9saXN0OworI2VuZGlmCiAJdV9pbnQ2NF90CSBieXRlc1syXTsKIAl1X2lu
dDY0X3QJIHBhY2tldHNbMl07CiAJdV9pbnQzMl90CSBzdGF0ZXM7CkBAIC04NDAsNiArODQzLDkg
QEAgc3RydWN0IHBmX3N0YXRlIHsKIAogCVRBSUxRX0VOVFJZKHBmX3N0YXRlKQkgc3luY19saXN0
OwogCVRBSUxRX0VOVFJZKHBmX3N0YXRlKQkgZW50cnlfbGlzdDsKKyNpZmRlZiBfX0ZyZWVCU0Rf
XworCVRBSUxRX0VOVFJZKHBmX3N0YXRlKQkgc3Jjbm9kZV9saW5rOworI2VuZGlmCiAJUkJfRU5U
UlkocGZfc3RhdGUpCSBlbnRyeV9pZDsKIAlzdHJ1Y3QgcGZfc3RhdGVfcGVlcgkgc3JjOwogCXN0
cnVjdCBwZl9zdGF0ZV9wZWVyCSBkc3Q7Cg==
--047d7bea40f00ef2e504d77ce15b--

From owner-freebsd-net@FreeBSD.ORG  Sat Mar  9 12:15:05 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id 53D619D;
 Sat,  9 Mar 2013 12:15:05 +0000 (UTC)
 (envelope-from ermal.luci@gmail.com)
Received: from mail-qe0-f53.google.com (mail-qe0-f53.google.com
 [209.85.128.53]) by mx1.freebsd.org (Postfix) with ESMTP id 06BC3D26;
 Sat,  9 Mar 2013 12:15:04 +0000 (UTC)
Received: by mail-qe0-f53.google.com with SMTP id cz11so1542710qeb.12
 for <multiple recipients>; Sat, 09 Mar 2013 04:15:04 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:x-received:sender:in-reply-to:references:date
 :x-google-sender-auth:message-id:subject:from:to:cc:content-type;
 bh=IF4nE86t6qj/q8ZZ8pG7hCBxHuKDGdQchvqsuaVRHYE=;
 b=G2dj1zF4+MwPcCmU68wZgHh3FooxhAyYhrNaCtjJtehhstrk4HXuh8wcmYH6ov2NEn
 +twD0AD5BCM2smDgXr9IwOC12/cYUBfLBj3poXGR6hlqKJB1a+p9uzo3h+kh6zgkZilt
 ah1Xfnm7u3cIVLEsUHrcDbZvLT6/1eqCgkt3KqtsXAWLj2atE2WS+ouoDEE1lGq3Onqu
 Begh2eyRgBrc2sUKK4F4J254DAiNePjUieRqEejcAQwmcr+DupoCJsBrBDNDQWsMhdz/
 TGJ/LVDEJeJ6egMqnEhsdGIWihm76x6QkhsMdgh3jvMFlj0TkljWPLEQm4QmpWSwc/6j
 SMGg==
MIME-Version: 1.0
X-Received: by 10.224.186.82 with SMTP id cr18mr8691238qab.64.1362831304317;
 Sat, 09 Mar 2013 04:15:04 -0800 (PST)
Sender: ermal.luci@gmail.com
Received: by 10.49.27.197 with HTTP; Sat, 9 Mar 2013 04:15:04 -0800 (PST)
In-Reply-To: <CAPBZQG0Jj_c-XvVJNV2S02xcitr+nhs+mV=GjJm3YeM6iPUX7g@mail.gmail.com>
References: <201303081419.17743.vegeta@tuxpowered.net>
 <CAPBZQG2bb2xzPB2UoPUDx-ifyBdmjac6b8kV76DTPBUzLCDmJw@mail.gmail.com>
 <201303082151.00895.vegeta@tuxpowered.net>
 <CAPBZQG0Jj_c-XvVJNV2S02xcitr+nhs+mV=GjJm3YeM6iPUX7g@mail.gmail.com>
Date: Sat, 9 Mar 2013 13:15:04 +0100
X-Google-Sender-Auth: YuZhHC-J6WEuDQwMu0GUxbI9FRw
Message-ID: <CAPBZQG3B1-wcmWpUDwsDxLgaQkNHwSRR8ZTeOgs=8ZNRNTJPPA@mail.gmail.com>
Subject: Re: [patch] Source entries removing is awfully slow.
From: =?ISO-8859-1?Q?Ermal_Lu=E7i?= <eri@freebsd.org>
To: Kajetan Staszkiewicz <vegeta@tuxpowered.net>
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
X-Content-Filtered-By: Mailman/MimeDel 2.1.14
Cc: "freebsd-net@freebsd.org" <freebsd-net@freebsd.org>,
 "freebsd-pf@freebsd.org" <freebsd-pf@freebsd.org>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 09 Mar 2013 12:15:05 -0000

Also do not forget to rebuild pfctl so that statistics are shown correctly.


On Sat, Mar 9, 2013 at 1:14 PM, Ermal Luçi <eri@freebsd.org> wrote:

>
>
>
> On Fri, Mar 8, 2013 at 9:51 PM, Kajetan Staszkiewicz <
> vegeta@tuxpowered.net> wrote:
>
>> On Friday, 8 March 2013 at 21:11:43, Ermal Luçi wrote:
>> > Is this FreeBSD 9.x or HEAD?
>>
>> I found the problem and developed the patch on 9.1.
>>
>> Can you please test this more 'beautiful' patch.
> It's similar to yours, but it also defers src state removal to the proper
> purge thread.
>
> Although the src node removal option in pfctl -K does a lot of work to
> clean things up, I still need to understand why it takes so much time for
> you to loop through 500K states.
> The purge thread does that every tick by partitioning the work into a few
> states per time slot, but minutes is still way too long.
>
> Can you please try to give a top -SH view of the time when this happens
> and a pfctl -vvsa output?
>
>
>
>> --
>> | pozdrawiam / greetings | powered by Debian, CentOS and FreeBSD |
>> |  Kajetan Staszkiewicz  | jabber,email: vegeta()tuxpowered net  |
>> |        Vegeta          | www: http://vegeta.tuxpowered.net     |
>> `------------------------^---------------------------------------'
>>
>
>
>
> --
> Ermal
>


--=20
Ermal

From owner-freebsd-net@FreeBSD.ORG  Sat Mar  9 13:37:57 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id 8451DBDB
 for <freebsd-net@freebsd.org>; Sat,  9 Mar 2013 13:37:57 +0000 (UTC)
 (envelope-from vegeta@tuxpowered.net)
Received: from mail-ea0-x22a.google.com (mail-ea0-x22a.google.com
 [IPv6:2a00:1450:4013:c01::22a])
 by mx1.freebsd.org (Postfix) with ESMTP id F2FA01D5
 for <freebsd-net@freebsd.org>; Sat,  9 Mar 2013 13:37:56 +0000 (UTC)
Received: by mail-ea0-f170.google.com with SMTP id a15so551164eae.15
 for <freebsd-net@freebsd.org>; Sat, 09 Mar 2013 05:37:55 -0800 (PST)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=google.com; s=20120113;
 h=x-received:from:to:subject:date:user-agent:cc:references
 :in-reply-to:mime-version:content-type:content-transfer-encoding
 :message-id:x-gm-message-state;
 bh=ZfG3s3NFULdgi+srDZR0jl6TBJ+5Egt1AF4+r9mpUUI=;
 b=VNYp6mQA58OvDbOrGDKwua95jkxnt5bDmiRLkMFAoaIX0HmrjQv3OXLCkKSKc1ZK0z
 lm/4EXWW/SMwK6+2GAjHGU6584I61gwNEFQnsbFTP2smzhtK4oHjHJ0/i3YDrfjUjJdS
 D8Qole7pOddTcoQtqUa4w2FB0dquxQInAIsOFe6sH7ULwoz7LFm3VdCOQbpzJE2HDcWM
 mmVkekHVnJZQd2+hPiAO7DoT37TV7JRzoEj+Aicrt0zt8dnkPZOLX9aH0iotYPXqTO2G
 hvOr3kE0NGxB6dPC8zkU/YRfA5l4pKJjkco0X8S3bmmoarsHxPjPM0OHP5qs9bRviS16
 FP6Q==
X-Received: by 10.14.183.198 with SMTP id q46mr16472183eem.1.1362836275730;
 Sat, 09 Mar 2013 05:37:55 -0800 (PST)
Received: from zvezda.localnet ([37.81.64.97])
 by mx.google.com with ESMTPS id 44sm13262429eek.5.2013.03.09.05.37.53
 (version=TLSv1 cipher=RC4-SHA bits=128/128);
 Sat, 09 Mar 2013 05:37:54 -0800 (PST)
From: Kajetan Staszkiewicz <vegeta@tuxpowered.net>
To: Ermal =?utf-8?q?Lu=C3=A7i?= <eri@freebsd.org>
Subject: Re: [patch] Source entries removing is awfully slow.
Date: Sat, 9 Mar 2013 14:37:51 +0100
User-Agent: KMail/1.13.5 (Linux/3.6.6-vegeta.1; KDE/4.4.5; x86_64; ; )
References: <201303081419.17743.vegeta@tuxpowered.net>
 <201303082151.00895.vegeta@tuxpowered.net>
 <CAPBZQG0Jj_c-XvVJNV2S02xcitr+nhs+mV=GjJm3YeM6iPUX7g@mail.gmail.com>
In-Reply-To: <CAPBZQG0Jj_c-XvVJNV2S02xcitr+nhs+mV=GjJm3YeM6iPUX7g@mail.gmail.com>
MIME-Version: 1.0
Content-Type: Text/Plain;
  charset="utf-8"
Content-Transfer-Encoding: quoted-printable
Message-Id: <201303091437.51945.vegeta@tuxpowered.net>
X-Gm-Message-State: ALoCoQn09kjRt4d+P7fNlvvJYQ+w9TlP8yVULZMp79p7cm7tRAdsSin3D8UK8LSQTNNIBYIE08hQ
Cc: "freebsd-net@freebsd.org" <freebsd-net@freebsd.org>,
 "freebsd-pf@freebsd.org" <freebsd-pf@freebsd.org>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 09 Mar 2013 13:37:57 -0000

On Saturday, 9 March 2013 at 13:14:16, Ermal Luçi wrote:
> On Fri, Mar 8, 2013 at 9:51 PM, Kajetan Staszkiewicz
> <vegeta@tuxpowered.net> wrote:
> > On Friday, 8 March 2013 at 21:11:43, Ermal Luçi wrote:
> > > Is this FreeBSD 9.x or HEAD?
> >
> > I found the problem and developed the patch on 9.1.
> >
> Can you please test this more 'beautiful' patch.

Oh, somehow I did not notice that there is an existing implementation of
doubly linked lists. I'm quite new to kernel programming.

> It's similar to yours, but it also defers src state removal to the proper
> purge thread.

I'll try it right after the weekend.

> Although the src node removal option in pfctl -K does a lot of work to
> clean things up, I still need to understand why it takes so much time for
> you to loop through 500K states.

That is because the loop will not be called just once.

`pfctl -K 0.0.0.0/0 -K ip.of.internal.server.behind.this.loadbalancer` will
match multiple Source entries, up to a thousand of them in normal conditions
("normal" for my loadbalancers) and many many more when under a DDoS attack.
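
To put rough numbers on this, here is a Python sketch (not the pf code; all
names are hypothetical) comparing the two approaches: walking every state
once per matched Source entry, versus keeping a list of states on each
source node, which is the idea behind the linked-list patches in this
thread.

```python
from collections import defaultdict


def kill_by_full_scan(states, doomed_nodes):
    """Walk the whole state table once per doomed source node."""
    visited = 0
    for node in doomed_nodes:
        for state in states:
            visited += 1
            if state["src_node"] == node:
                state["src_node"] = None   # unlink the state from the node
    return visited


def kill_by_node_list(states_by_node, doomed_nodes):
    """Touch only the states already linked to each doomed node."""
    visited = 0
    for node in doomed_nodes:
        for state in states_by_node.pop(node, []):
            visited += 1
            state["src_node"] = None
    return visited


# Small stand-in for the table sizes discussed in this thread.
states = [{"id": i, "src_node": i % 50} for i in range(5000)]
states_by_node = defaultdict(list)
for state in states:
    states_by_node[state["src_node"]].append(state)

# Number of states touched when ten source nodes are removed via their lists.
print(kill_by_node_list(states_by_node, range(10)))
```

With 500K states and a thousand matched entries, the full scan would make
on the order of 5 x 10^8 visits, while the per-node list only visits the
states that actually belong to the matched entries.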

> The purge thread does that every tick by partitioning the work into a few
> states per time slot, but minutes is still way too long.
>
> Can you please try to give a top -SH view of the time when this happens and
> a pfctl -vvsa output?

I'll try on Monday, although as far as I remember the machine was quite
frozen during this operation.

-- 
| pozdrawiam / greetings | powered by Debian, CentOS and FreeBSD |
|  Kajetan Staszkiewicz  | jabber,email: vegeta()tuxpowered net  |
|        Vegeta          | www: http://vegeta.tuxpowered.net     |
`------------------------^---------------------------------------'

From owner-freebsd-net@FreeBSD.ORG  Sat Mar  9 15:11:57 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id 81FDBE46;
 Sat,  9 Mar 2013 15:11:57 +0000 (UTC)
 (envelope-from ermal.luci@gmail.com)
Received: from mail-qa0-f48.google.com (mail-qa0-f48.google.com
 [209.85.216.48]) by mx1.freebsd.org (Postfix) with ESMTP id 350727FA;
 Sat,  9 Mar 2013 15:11:57 +0000 (UTC)
Received: by mail-qa0-f48.google.com with SMTP id j8so295744qah.0
 for <multiple recipients>; Sat, 09 Mar 2013 07:11:56 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:x-received:sender:in-reply-to:references:date
 :x-google-sender-auth:message-id:subject:from:to:cc:content-type;
 bh=t8tBgO4mi0deOXxY60/CPlmuTr33/GZM/wLwrlvJua4=;
 b=DyMck+jw/XdCiePMWTFnQtBCr4St/A9AfdkCSSV75ALyi1Zp526hzee3Es+ZlSBMLH
 h8gWhjtI1wCgdIDzOtoiI1hLSL0VPSq9Ug3lWfrW68EzLI45PDdDMDWXGzEx8zW13SbL
 GdlnY0qsIDVw/ZuIWXtUpaEnubEFqLh57N1tJncoS0tAeOOBNx0XMwADeIZ47UzEA2LO
 FKO35jtOhIfAZe78ldaqx+6W1rqxVNYnyvOz/KC0ALcdbdRByLIYi3PtmSejYLY+IMA/
 5MfpdfbJ3Y9WrDOxVyBlcYdHtnOyVfkmebTk5egIOVK+m2zmJoGfftaXvTsvUm+6p9/e
 aVvA==
MIME-Version: 1.0
X-Received: by 10.224.178.77 with SMTP id bl13mr9338639qab.13.1362841916475;
 Sat, 09 Mar 2013 07:11:56 -0800 (PST)
Sender: ermal.luci@gmail.com
Received: by 10.49.27.197 with HTTP; Sat, 9 Mar 2013 07:11:56 -0800 (PST)
In-Reply-To: <201303091437.51945.vegeta@tuxpowered.net>
References: <201303081419.17743.vegeta@tuxpowered.net>
 <201303082151.00895.vegeta@tuxpowered.net>
 <CAPBZQG0Jj_c-XvVJNV2S02xcitr+nhs+mV=GjJm3YeM6iPUX7g@mail.gmail.com>
 <201303091437.51945.vegeta@tuxpowered.net>
Date: Sat, 9 Mar 2013 16:11:56 +0100
X-Google-Sender-Auth: SDQcnfZIop-Qf76jdAFs98G2DVc
Message-ID: <CAPBZQG0EyUb=MZFfFzesxQvA38CPBubjd7izt3OHyqpbMOMarA@mail.gmail.com>
Subject: Re: [patch] Source entries removing is awfully slow.
From: =?ISO-8859-1?Q?Ermal_Lu=E7i?= <eri@freebsd.org>
To: Kajetan Staszkiewicz <vegeta@tuxpowered.net>
Content-Type: text/plain; charset=ISO-8859-2
Content-Transfer-Encoding: quoted-printable
X-Content-Filtered-By: Mailman/MimeDel 2.1.14
Cc: "freebsd-net@freebsd.org" <freebsd-net@freebsd.org>,
 "freebsd-pf@freebsd.org" <freebsd-pf@freebsd.org>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 09 Mar 2013 15:11:57 -0000

On Sat, Mar 9, 2013 at 2:37 PM, Kajetan Staszkiewicz
<vegeta@tuxpowered.net> wrote:

> On Saturday, 9 March 2013 at 13:14:16, Ermal Luçi wrote:
> > On Fri, Mar 8, 2013 at 9:51 PM, Kajetan Staszkiewicz
> > <vegeta@tuxpowered.net> wrote:
> > > On Friday, 8 March 2013 at 21:11:43, Ermal Luçi wrote:
> > > > Is this FreeBSD 9.x or HEAD?
> > >
> > > I found the problem and developed the patch on 9.1.
> > >
> > Can you please test this more 'beautiful' patch?
>
> Oh, somehow I did not notice the existing implementation of a doubly linked
> list.
> I'm quite new to kernel programming.
>
> > It's similar to yours but also delays src state removal to the proper
> > purge thread.
>
> I'll try it right after the weekend.
>
> > Though the src node removal option through pfctl -K does a lot of work to
> > clean things up.
> > I still need to understand why it takes so much time for you to loop
> > through 500K states.
>
> That is because the loop will not be called just once.
>
> `pfctl -K 0.0.0.0/0 -K ip.of.internal.server.behind.this.loadbalancer`
> will match multiple Source entries, up to a thousand of them in normal
> conditions ("normal" for my loadbalancers) and many many more when under
> a DDoS attack.
>
>
I would expect proper software to kill the states from those clients and
then kill the srcnode for the backend server.
It does not make sense to remove src nodes without killing the states first,
since the states are what will impact your connectivity.

Though the patch improves your use case a lot, it would be even better to
kill those states during this step as well, with an extra option,
since otherwise you'd have to create a separate request for each of those
clients.

Do you control the application, so that you could test an extra addition to
this patch that allows killing the linked states as well?


> > The purge thread does that every tick by partitioning it to a few per time
> > slot, but minutes is still way too long.
> >
> > Can you please provide a top -SH view from the time when this happens and
> > the pfctl -vvsa output?
>
> I'll try on Monday, although as far as I remember the machine was quite
> frozen during this operation.
>
> --
> | pozdrawiam / greetings | powered by Debian, CentOS and FreeBSD |
> |  Kajetan Staszkiewicz  | jabber,email: vegeta()tuxpowered net  |
> |        Vegeta          | www: http://vegeta.tuxpowered.net     |
> `------------------------^---------------------------------------'
>


-- 
Ermal

From owner-freebsd-net@FreeBSD.ORG  Sat Mar  9 16:15:47 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id B5042E2B
 for <freebsd-net@freebsd.org>; Sat,  9 Mar 2013 16:15:47 +0000 (UTC)
 (envelope-from vegeta@tuxpowered.net)
Received: from mail-ea0-x229.google.com (mail-ea0-x229.google.com
 [IPv6:2a00:1450:4013:c01::229])
 by mx1.freebsd.org (Postfix) with ESMTP id 33C94E02
 for <freebsd-net@freebsd.org>; Sat,  9 Mar 2013 16:15:46 +0000 (UTC)
Received: by mail-ea0-f169.google.com with SMTP id z7so579293eaf.14
 for <freebsd-net@freebsd.org>; Sat, 09 Mar 2013 08:15:46 -0800 (PST)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=google.com; s=20120113;
 h=x-received:from:to:subject:date:user-agent:cc:references
 :in-reply-to:mime-version:content-type:content-transfer-encoding
 :message-id:x-gm-message-state;
 bh=87U5I1ti2Xiokb1uoM/2wytuL6lompIPNW4nwc8R2X8=;
 b=bpzguDRGogILgT5DTgJVShj4mbcVP47Bygb8Qo1ienaMCs9mPvcJM/BMRncKqo6Uwo
 sdxVk8Q4MlATwzBOeKBwgJjbSAaXMfehbS4DyCGoKe1+UtX3KJKD5T3VboQy62Y5HmeY
 6IeXX3rqjwbhKq6k9BJ6Fy9ioCsdvy8qtaUFzzDQPyXq3Ti05KufFe02ReImImMJeTSa
 hh1D2PdZSehEdKDB5Bzja5SmsxvbN1Oqylmz/Q00jUoA8Tyl8Js+iNg6vn8Cb6W9UOX7
 CZ2SStJ3PwzRYnOYZ5ZcvK4rGeiUtDnNQU2nBNvgDi5+qib6wXKnGo0BMWc7uIlbw4MI
 fm2w==
X-Received: by 10.14.4.69 with SMTP id 45mr17622104eei.0.1362845745816;
 Sat, 09 Mar 2013 08:15:45 -0800 (PST)
Received: from zvezda.localnet ([37.81.64.97])
 by mx.google.com with ESMTPS id 3sm13797558eej.6.2013.03.09.08.15.43
 (version=TLSv1 cipher=RC4-SHA bits=128/128);
 Sat, 09 Mar 2013 08:15:44 -0800 (PST)
From: Kajetan Staszkiewicz <vegeta@tuxpowered.net>
To: Ermal =?utf-8?q?Lu=C3=A7i?= <eri@freebsd.org>
Subject: Re: [patch] Source entries removing is awfully slow.
Date: Sat, 9 Mar 2013 17:15:42 +0100
User-Agent: KMail/1.13.5 (Linux/3.6.6-vegeta.1; KDE/4.4.5; x86_64; ; )
References: <201303081419.17743.vegeta@tuxpowered.net>
 <201303091437.51945.vegeta@tuxpowered.net>
 <CAPBZQG0EyUb=MZFfFzesxQvA38CPBubjd7izt3OHyqpbMOMarA@mail.gmail.com>
In-Reply-To: <CAPBZQG0EyUb=MZFfFzesxQvA38CPBubjd7izt3OHyqpbMOMarA@mail.gmail.com>
MIME-Version: 1.0
Content-Type: Text/Plain;
  charset="utf-8"
Content-Transfer-Encoding: quoted-printable
Message-Id: <201303091715.42624.vegeta@tuxpowered.net>
X-Gm-Message-State: ALoCoQmKh3qvj6TlKdybwlU8fvTHcwi2t84HCc2J6fSoQ2inHWVn75kU0MrAhiWpCdSRyo/Pm4b3
Cc: "freebsd-net@freebsd.org" <freebsd-net@freebsd.org>,
 "freebsd-pf@freebsd.org" <freebsd-pf@freebsd.org>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 09 Mar 2013 16:15:47 -0000

On Saturday, 9 March 2013 at 16:11:56, you wrote:

> > > Though the src node removal option through pfctl -K does a lot of work
> > > to clean things up.
> > > I still need to understand why it takes so much time for you to loop
> > > through 500K states.
> >
> > That is because the loop will not be called just once.
> >
> > `pfctl -K 0.0.0.0/0 -K ip.of.internal.server.behind.this.loadbalancer`
> > will match multiple Source entries, up to a thousand of them in normal
> > conditions ("normal" for my loadbalancers) and many many more when under
> > a DDoS attack.
>
> I would expect proper software to kill the states from those clients and
> then kill the srcnode for the backend server.

First of all, I do not know which clients are affected. I know which server
is dead. But I cannot remove states to this server using pfctl, as the states
are from the clients' public IP addresses to the loadbalancer's public IP
address. Sources, on the other hand, point to the internal IP address of the
broken server.

The second thing is that, under normal conditions, removing just a few states
would not help performance. Also, the server health-checking software is
unaware of DDoS attacks and will not remove states resulting from an attack in
advance.

> It does not make proper sense to not kill state before src nodes since that
> is what will impact your connectivity.

I agree, it only makes sense to remove both sources and linked states at the
same time. When only sources are removed, states still point to the broken
server and clients remain connected to it over existing TCP connections. If
states were removed as well, clients would lose all connectivity (which I
prefer to them seeing wrong data) and (hopefully) reconnect to another
live server.

> Though the patch improves your use case a lot still would be better to even
> kill those states during this step, with an extra option,
> since otherwise you'd have to create for each of those client a separate
> request.

That would be in the updated version of the patch, which I hope to send to
the list on Monday.

-- 
| pozdrawiam / greetings | powered by Debian, CentOS and FreeBSD |
|  Kajetan Staszkiewicz  | jabber,email: vegeta()tuxpowered net  |
|        Vegeta          | www: http://vegeta.tuxpowered.net     |
`------------------------^---------------------------------------'

From owner-freebsd-net@FreeBSD.ORG  Sat Mar  9 16:27:35 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id 50F51FD2;
 Sat,  9 Mar 2013 16:27:35 +0000 (UTC)
 (envelope-from rmacklem@uoguelph.ca)
Received: from esa-annu.net.uoguelph.ca (esa-annu.mail.uoguelph.ca
 [131.104.91.36]) by mx1.freebsd.org (Postfix) with ESMTP id B9078E61;
 Sat,  9 Mar 2013 16:27:34 +0000 (UTC)
X-IronPort-Anti-Spam-Filtered: true
X-IronPort-Anti-Spam-Result: AqEEAE5iO1GDaFvO/2dsb2JhbABDiCi8OIF1dIItAQEBAwEBAQEgBCcgCwUWGAICDRkCKQEJJgYIBwQBHASHbAYMqT2SC4EjjCkKBX00B4ItgRMDiHGLJYI+gR6PVYMoT30IFx4
X-IronPort-AV: E=Sophos;i="4.84,814,1355115600"; d="scan'208";a="17907963"
Received: from erie.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca)
 ([131.104.91.206])
 by esa-annu.net.uoguelph.ca with ESMTP; 09 Mar 2013 11:27:32 -0500
Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1])
 by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id BBA7BB4036;
 Sat,  9 Mar 2013 11:27:32 -0500 (EST)
Date: Sat, 9 Mar 2013 11:27:32 -0500 (EST)
From: Rick Macklem <rmacklem@uoguelph.ca>
To: Garrett Wollman <wollman@freebsd.org>
Message-ID: <1639798917.3728142.1362846452693.JavaMail.root@erie.cs.uoguelph.ca>
In-Reply-To: <20794.38381.221980.5038@hergotha.csail.mit.edu>
Subject: Re: NFS DRC size
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-Originating-IP: [172.17.91.202]
X-Mailer: Zimbra 6.0.10_GA_2692 (ZimbraWebClient - FF3.0 (Win)/6.0.10_GA_2692)
Cc: freebsd-fs@freebsd.org, freebsd-net@freebsd.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 09 Mar 2013 16:27:35 -0000

Garrett Wollman wrote:
> <<On Fri, 8 Mar 2013 19:47:13 -0500 (EST), Rick Macklem
> <rmacklem@uoguelph.ca> said:
> 
> > The cached replies are copies of the mbuf list done via m_copym().
> > As such, the clusters in these replies won't be free'd (ref cnt -> 0)
> > until the cache is trimmed (nfsrv_trimcache() gets called after the
> > TCP layer has received an ACK for receipt of the reply from the
> > client).
> 
> I wonder if this bit is even working at all. In my experience, the
> size of the DRC quickly grows under load up to the maximum (or
> actually, slightly beyond), and never drops much below that level. On
> my production server right now, "nfsstat -se" reports:
> 
Well, once you add the patches and turn vfs.nfsd.tcphighwater up, it
will only trim the cache when that highwater mark is exceeded. When
it does the trim, the size does drop for the simple testing I do with
a single client. (I'll take another look at drc3.patch and see if I
can spot anywhere this might be broken, although my hunch is
that you have a lot of TCP connections and enough activity that it
rapidly grows back up to the limit.) The fact that it trims down to
around the highwater mark basically indicates this is working. If it wasn't
throwing away replies where the receipt has been ack'd at the TCP
level, the cache would grow very large, since they would only be
discarded after a loonnngg timeout (12 hours unless you've changed
NFSRVCACHE_TCPTIMEOUT in sys/fs/nfs/nfs.h).

> Server Info:
>    Getattr   Setattr    Lookup  Readlink      Read     Write    Create    Remove
>   13036780    359901   1723623      3420  36397693  12385668    346590    109984
>     Rename      Link   Symlink     Mkdir     Rmdir   Readdir  RdirPlus    Access
>      45173        16    116791     14192      1176        24  12876747   3398533
>      Mknod    Fsstat    Fsinfo  PathConf    Commit   LookupP   SetClId SetClIdCf
>          0      2703     14992      7502   1329196         0         1         1
>       Open  OpenAttr OpenDwnGr  OpenCfrm DelePurge   DeleRet     GetFH      Lock
>     263034         0         0    263019         0         0    545104         0
>      LockT     LockU     Close    Verify   NVerify     PutFH  PutPubFH PutRootFH
>          0         0    263012         0         0  23753375         0         1
>      Renew RestoreFH    SaveFH   Secinfo RelLckOwn  V4Create
>          2    263006    263033         0         0         0
> Server:
>  Retfailed    Faults   Clients
>          0         0         1
>  OpenOwner     Opens LockOwner     Locks    Delegs
>         56        10         0         0         0
> Server Cache Stats:
>     Inprog      Idem  Non-idem    Misses CacheSize   TCPPeak
>          0         0         0  81714128     60997     61017
> 
> It's only been up for about the last 24 hours. Should I be setting
> the size limit to something truly outrageous, like 200,000? (I'd
> definitely need to deal with the mbuf cluster issue then!) The
> average request rate over this time is about 1000/s, but that includes
> several episodes of high-cpu spinning (which I resolved by increasing
> the DRC limit).
> 
It is the number of TCP connections from clients that determines how much
gets cached, not the request rate. For TCP, a scheme like LRU doesn't work,
because RPC retries (as opposed to TCP segment retransmits) only happen long
after the initial RPC request. (Usually after a TCP connection has broken and
the client has established a new connection, although some NFSv3 over TCP
clients will retry an RPC after a long timeout.) The cache needs to hold the
last N RPC replies for each TCP connection and discard them when further
traffic on the TCP connection indicates that the connection is still working.
(Some NFSv3 over TCP servers don't guarantee to generate a reply for an RPC
 when resource constrained, but the FreeBSD one always sends a reply, except
 for NFSv2, where it will close down the TCP connection when it has no choice.
 I doubt any client is doing NFSv2 over TCP, so I don't consider this relevant.)

If the CPU is spinning in nfsrc_trimcache() a lot, increasing vfs.nfsd.tcphighwater
should decrease that, but with an increase in mbuf cluster allocation.

If there is a lot of contention for mutexes, increasing the size of the hash
table might help. The drc3.patch bumped the hash table from 20->200,
but that would still be about 300 entries per hash list and one mutex for
those 300 entries, assuming the hash function is working well.
Increasing it only adds list head pointers and mutexes.
(It's NFSRVCACHE_HASHSIZE in sys/fs/nfs/nfsrvcache.h.)
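
For a rough sense of the numbers, here is a small sketch (Python, not kernel
code) of the average chain length for a few hash table sizes. It assumes the
cache sits at the vfs.nfsd.tcphighwater value of 61000 quoted below and that
the hash function spreads entries evenly. avg_chain_length() is an
illustrative helper, not something in the tree, and 1000 is just a
hypothetical larger table size.

```python
def avg_chain_length(cached_entries: int, hash_size: int) -> float:
    """Average entries per hash list, assuming a perfectly uniform hash."""
    if hash_size <= 0:
        raise ValueError("hash_size must be positive")
    return cached_entries / hash_size


if __name__ == "__main__":
    # 20 was the old table size, 200 is the drc3.patch size,
    # 1000 is a hypothetical larger value.
    for hash_size in (20, 200, 1000):
        per_list = avg_chain_length(61000, hash_size)
        print(f"hash size {hash_size}: about {per_list:.0f} entries per list")
```

With 61000 cached entries this works out to roughly 3050, 305 and 61 entries
per list (and per mutex) for sizes 20, 200 and 1000.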

Unfortunately, increasing it requires a kernel rebuild/reboot. Maybe the patch
for head should change the size of the hash table when vfs.nfsd.tcphighwater
is set much larger? (Not quite trivial and will probably result in a short stall of
the nfsd threads, since all the entries will need to be rehashed/moved to
new lists, but could be worth the effort.)

> Meanwhile, some relevant bits from sysctl:
> 
> vfs.nfsd.udphighwater: 500
> vfs.nfsd.tcphighwater: 61000
> vfs.nfsd.minthreads: 16
> vfs.nfsd.maxthreads: 64
> vfs.nfsd.threads: 64
> vfs.nfsd.request_space_used: 1416
> vfs.nfsd.request_space_used_highest: 4284672
> vfs.nfsd.request_space_high: 47185920
> vfs.nfsd.request_space_low: 31457280
> vfs.nfsd.request_space_throttled: 0
> vfs.nfsd.request_space_throttle_count: 0
> 
> (I'd actually like to put maxthreads back up at 256, which is where I
> had it during testing, but I need to test that the jumbo-frames issue
> is fixed first. I did pre-production testing on a non-jumbo network.)
> 
> -GAWollman
> 
Well, the DRC will try to cache replies until the client's TCP layer
acknowledges receipt of the reply. It is hard to say how many replies
that is for a given TCP connection, since it is a function of the level
of concurrency (# of nfsiod threads in the FreeBSD client)
in the client. I'd guess it's somewhere between 1 and 20?

Multiply that by the number of TCP connections from all clients and
you have about how big the server's DRC will be. (Some clients use
a single TCP connection for the client whereas others use a separate
TCP connection for each mount point.)
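
The estimate above can be sketched as a simple product. This is only a
back-of-the-envelope model: estimate_drc_entries() is a made-up helper, and
the figure of 500 connections is a hypothetical example, not a measurement.

```python
def estimate_drc_entries(tcp_connections: int, replies_per_connection: int) -> int:
    """Rough DRC size: un-ACKed cached replies per connection times connections."""
    if tcp_connections < 0 or replies_per_connection < 0:
        raise ValueError("counts must be non-negative")
    return tcp_connections * replies_per_connection


if __name__ == "__main__":
    connections = 500  # hypothetical number of client TCP connections
    low = estimate_drc_entries(connections, 1)    # one reply in flight each
    high = estimate_drc_entries(connections, 20)  # twenty replies in flight each
    print(f"{connections} connections: between {low} and {high} cached replies")
```

The highwater mark would then need to sit comfortably above the upper end of
that range to avoid constant trimming.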

When ivoras@ and I have a patch for head, it should probably allow
the DRC to be disabled for TCP mounts (by setting vfs.nfsd.tcphighwater == -1?).
I don't really like the idea, but I can see the argument that TCP
maintains a reliable enough RPC transport that the DRC isn't needed.

rick

> _______________________________________________
> freebsd-net@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"

From owner-freebsd-net@FreeBSD.ORG  Sat Mar  9 16:50:31 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id A911E222;
 Sat,  9 Mar 2013 16:50:31 +0000 (UTC)
 (envelope-from rmacklem@uoguelph.ca)
Received: from esa-annu.net.uoguelph.ca (esa-annu.mail.uoguelph.ca
 [131.104.91.36]) by mx1.freebsd.org (Postfix) with ESMTP id 5D10AF92;
 Sat,  9 Mar 2013 16:50:31 +0000 (UTC)
X-IronPort-Anti-Spam-Filtered: true
X-IronPort-Anti-Spam-Result: AqEEAPNmO1GDaFvO/2dsb2JhbABDiCi8OIF1dIItAQEBAwEBAQEgBCcgCwUWGAICDRkCKQEJJgYIBwQBHASHbAYMqUmSDIEjjDh9NAeCLYETA4hxiyWCPoEej1WDKE+BBTU
X-IronPort-AV: E=Sophos;i="4.84,814,1355115600"; d="scan'208";a="17909999"
Received: from erie.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca)
 ([131.104.91.206])
 by esa-annu.net.uoguelph.ca with ESMTP; 09 Mar 2013 11:50:30 -0500
Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1])
 by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id 72B81B3F51;
 Sat,  9 Mar 2013 11:50:30 -0500 (EST)
Date: Sat, 9 Mar 2013 11:50:30 -0500 (EST)
From: Rick Macklem <rmacklem@uoguelph.ca>
To: Garrett Wollman <wollman@freebsd.org>
Message-ID: <1700261042.3728432.1362847830447.JavaMail.root@erie.cs.uoguelph.ca>
In-Reply-To: <20794.37617.822910.93537@hergotha.csail.mit.edu>
Subject: Re: Limits on jumbo mbuf cluster allocation
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-Originating-IP: [172.17.91.202]
X-Mailer: Zimbra 6.0.10_GA_2692 (ZimbraWebClient - FF3.0 (Win)/6.0.10_GA_2692)
Cc: freebsd-net@freebsd.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 09 Mar 2013 16:50:31 -0000

Garrett Wollman wrote:
> <<On Fri, 8 Mar 2013 19:47:13 -0500 (EST), Rick Macklem
> <rmacklem@uoguelph.ca> said:
> 
> > If reducing the size to 4K doesn't fix the problem, you might want to
> > consider shrinking the tunable vfs.nfsd.tcphighwater and suffering
> > the increased CPU overhead (and some increased mutex contention) of
> > calling nfsrv_trimcache() more frequently.
> 
> Can't do that -- the system becomes intolerably slow when it gets into
> that state, and seems to get stuck that way, such that the only way to
> restore performance is to increase the size of the "cache".
> (Essentially all of the nfsd service threads end up spinning most of
> the time, load average goes to N, and goodput goes to nearly nil.) It
> does seem like a lot of effort for an extreme edge case that, in
> practical terms, never happens.
> 
So, it sounds like you've found a reasonable setting. Yes, if it is too
small, it will keep trimming over and over and over again...

I suspect this indicates that it isn't mutex contention, since the
threads would block waiting for the mutex for that case, I think?
(Bumping up NFSRVCACHE_HASHSIZE can't hurt if/when you get the chance.)

> > (I'm assuming that you are using drc2.patch + drc3.patch.
> 
> I believe that's what I have. If my kernel coding skills were less
> rusty, I'd fix it to have a separate cache-trimming thread.
> 
I've thought about this. My concern is that the separate thread might
not keep up with the trimming demand. If that occurred, the cache would
grow veryyy laarrggge, with effects like running out of mbuf clusters.

By having the nfsd threads do it, they slow down, which provides feedback
to the clients (slower RPC replies -> clients generate fewer requests -> less to cache).
(I think you are probably familiar with the generic concept that a system
 needs feedback to remain stable. An M/M/1 queue with open arrivals and
 no feedback to slow the arrival rate explodes when the arrival rate
 approaches the service rate, etc and so on...)
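
To make the queueing point concrete, here is a minimal sketch using the
textbook M/M/1 result that the mean number of requests in the system is
rho / (1 - rho), where rho is the arrival rate divided by the service rate.
The rates below are arbitrary illustrative values, not measurements from any
NFS server.

```python
import math


def mm1_mean_in_system(arrival_rate: float, service_rate: float) -> float:
    """Mean number of requests in an M/M/1 system; infinite once saturated."""
    if arrival_rate < 0 or service_rate <= 0:
        raise ValueError("rates must be non-negative and service_rate positive")
    if arrival_rate >= service_rate:
        return math.inf  # no steady state: the queue grows without bound
    rho = arrival_rate / service_rate  # utilisation
    return rho / (1.0 - rho)


if __name__ == "__main__":
    service_rate = 1000.0  # hypothetical requests per second the server can handle
    for arrival_rate in (500.0, 900.0, 990.0, 1000.0):
        mean = mm1_mean_in_system(arrival_rate, service_rate)
        print(f"arrival {arrival_rate:.0f}/s -> mean in system {mean:.1f}")
```

The backlog stays small at moderate load and blows up as the arrival rate
nears the service rate, which is why some form of back-pressure on the clients
matters.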

As such, I'm not convinced a separate thread is a good idea. I think
that simply allowing sysadmins to disable the DRC for TCP may make
sense. Although I prefer more reliable vs better performance, I can
see the argument that TCP transport for RPC is "good enough" for
some environments. (Basically, if a site has a high degree of
confidence in their network fabric, such that network partitioning
type failures are pretty well non-existent and the NFS server isn't
getting overloaded to the point of very slow RPC replies, I can
see TCP retransmits as being sufficient?)

> One other weird thing that I've noticed is that netstat(1) reports the
> send and receive queues on NFS connections as being far higher than I
> have the limits configured. Does NFS do something to override this?
> 
> -GAWollman
> 
The nfs server does soreserve(so, sb_max_adj, sb_max_adj); I can't
recall exactly why it is that way, except that it needs to be large
enough to handle the largest RPC request a client might generate.

I should take another look at this, in case sb_max_adj is now
too large?

rick


From owner-freebsd-net@FreeBSD.ORG  Sat Mar  9 17:34:51 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id C9184346
 for <freebsd-net@freebsd.org>; Sat,  9 Mar 2013 17:34:51 +0000 (UTC)
 (envelope-from wollman@hergotha.csail.mit.edu)
Received: from hergotha.csail.mit.edu
 (wollman-1-pt.tunnel.tserv4.nyc4.ipv6.he.net [IPv6:2001:470:1f06:ccb::2])
 by mx1.freebsd.org (Postfix) with ESMTP id 7F7031AD
 for <freebsd-net@freebsd.org>; Sat,  9 Mar 2013 17:34:51 +0000 (UTC)
Received: from hergotha.csail.mit.edu (localhost [127.0.0.1])
 by hergotha.csail.mit.edu (8.14.5/8.14.5) with ESMTP id r29HYo0R061832;
 Sat, 9 Mar 2013 12:34:50 -0500 (EST)
 (envelope-from wollman@hergotha.csail.mit.edu)
Received: (from wollman@localhost)
 by hergotha.csail.mit.edu (8.14.5/8.14.4/Submit) id r29HYohJ061829;
 Sat, 9 Mar 2013 12:34:50 -0500 (EST) (envelope-from wollman)
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Message-ID: <20795.29370.194678.963351@hergotha.csail.mit.edu>
Date: Sat, 9 Mar 2013 12:34:50 -0500
From: Garrett Wollman <wollman@freebsd.org>
To: Rick Macklem <rmacklem@uoguelph.ca>
Subject: Re: Limits on jumbo mbuf cluster allocation
In-Reply-To: <1700261042.3728432.1362847830447.JavaMail.root@erie.cs.uoguelph.ca>
References: <20794.37617.822910.93537@hergotha.csail.mit.edu>
 <1700261042.3728432.1362847830447.JavaMail.root@erie.cs.uoguelph.ca>
X-Mailer: VM 7.17 under 21.4 (patch 22) "Instant Classic" XEmacs Lucid
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.2.7
 (hergotha.csail.mit.edu [127.0.0.1]); Sat, 09 Mar 2013 12:34:50 -0500 (EST)
X-Spam-Status: No, score=-1.0 required=5.0 tests=ALL_TRUSTED
 autolearn=disabled version=3.3.2
X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on
 hergotha.csail.mit.edu
Cc: freebsd-net@freebsd.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 09 Mar 2013 17:34:51 -0000

<<On Sat, 9 Mar 2013 11:50:30 -0500 (EST), Rick Macklem <rmacklem@uoguelph.ca> said:

> I suspect this indicates that it isn't mutex contention, since the
> threads would block waiting for the mutex for that case, I think?

No, because our mutexes are adaptive, so each thread spins for a while
before blocking.  With the current implementation, all of them end up
doing this in pretty close to lock-step.

> (Bumping up NFSRVCACHE_HASHSIZE can't hurt if/when you get the chance.)

I already have it set to 129 (up from 20); I could see putting it up
to, say, 1023.  It would be nice to have a sysctl for maximum chain
length to see how bad it's getting (and if the hash function is
actually effective).

> I've thought about this. My concern is that the separate thread might
> not keep up with the trimming demand. If that occurred, the cache would
> grow veryyy laarrggge, with effects like running out of mbuf clusters.

At a minimum, once one nfsd thread is committed to doing the cache
trim, a flag should be set to discourage other threads from trying to
do it.  Having them all spinning their wheels punishes the clients
much too much.
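
A user-space sketch of that idea (Python, purely illustrative; this is not
the nfsd code and the names are made up): a non-blocking try-lock serves as
the "trim in progress" flag, so a thread that finds a trim already running
skips it instead of spinning. Dropping the oldest entries stands in for the
real rule of discarding replies whose receipt has been ACKed.

```python
import threading


class ReplyCache:
    """Toy reply cache in which at most one thread trims at a time."""

    def __init__(self, highwater: int) -> None:
        self.highwater = highwater
        self.entries = []
        self._entries_lock = threading.Lock()
        self._trim_lock = threading.Lock()  # held only while a trim is running

    def add(self, reply) -> None:
        with self._entries_lock:
            self.entries.append(reply)

    def maybe_trim(self) -> bool:
        """Trim down to highwater; return False if another thread is trimming."""
        if not self._trim_lock.acquire(blocking=False):
            return False  # someone else is on it; go back to serving RPCs
        try:
            with self._entries_lock:
                excess = len(self.entries) - self.highwater
                if excess > 0:
                    # Placeholder policy: drop the oldest entries.
                    del self.entries[:excess]
            return True
        finally:
            self._trim_lock.release()


if __name__ == "__main__":
    cache = ReplyCache(highwater=2)
    for i in range(5):
        cache.add(f"reply-{i}")
    print(cache.maybe_trim(), cache.entries)
```

The trade-off is that threads which skip the trim keep adding entries while
it runs, so the cache can overshoot the highwater mark for a short time.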

> By having the nfsd threads do it, they slow down, which provides feedback
> to the clients (slower RPC replies -> clients generate fewer requests -> less to cache).
> (I think you are probably familiar with the generic concept that a system
>  needs feedback to remain stable. An M/M/1 queue with open arrivals and
>  no feedback to slow the arrival rate explodes when the arrival rate
>  approaches the service rate, etc and so on...)

Unfortunately, the feedback channel that I have is: one user starts
500 virtual machines accessing a filesystem on the server -> other
users of this server see their goodput go to zero -> everyone sends in
angry trouble tickets -> I increase the DRC size manually.  It would
be nice if, by the time I next want to take a vacation, I have this
figured out.

I'm OK with throwing memory at the problem -- these servers have 96 GB
and can hold up to 144 GB -- so long as I can find a tuning that
provides stability and consistent, reasonable performance for the
users.

> The nfs server does soreserve(so, sb_max_adj, sb_max_adj); I can't
> recall exactly why it is that way, except that it needs to be large
> enough to handle the largest RPC request a client might generate.

> I should take another look at this, in case sb_max_adj is now
> too large?

It probably shouldn't be larger than the
net.inet.tcp.{send,recv}buf_max, and the read and write sizes that are
negotiated should be chosen so that a whole RPC can fit in that
space.  If that's too hard for whatever reason, nfsd should at least
log a message saying "hey, your socket buffer limits are too small,
I'm going to ignore them".

-GAWollman


From owner-freebsd-net@FreeBSD.ORG  Sat Mar  9 18:00:05 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id C5033C08;
 Sat,  9 Mar 2013 18:00:05 +0000 (UTC)
 (envelope-from wollman@hergotha.csail.mit.edu)
Received: from hergotha.csail.mit.edu
 (wollman-1-pt.tunnel.tserv4.nyc4.ipv6.he.net [IPv6:2001:470:1f06:ccb::2])
 by mx1.freebsd.org (Postfix) with ESMTP id 66E9627C;
 Sat,  9 Mar 2013 18:00:05 +0000 (UTC)
Received: from hergotha.csail.mit.edu (localhost [127.0.0.1])
 by hergotha.csail.mit.edu (8.14.5/8.14.5) with ESMTP id r29I04Sp062160;
 Sat, 9 Mar 2013 13:00:04 -0500 (EST)
 (envelope-from wollman@hergotha.csail.mit.edu)
Received: (from wollman@localhost)
 by hergotha.csail.mit.edu (8.14.5/8.14.4/Submit) id r29I04gL062157;
 Sat, 9 Mar 2013 13:00:04 -0500 (EST) (envelope-from wollman)
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Message-ID: <20795.30884.330015.123616@hergotha.csail.mit.edu>
Date: Sat, 9 Mar 2013 13:00:04 -0500
From: Garrett Wollman <wollman@freebsd.org>
To: Rick Macklem <rmacklem@uoguelph.ca>
Subject: Re: NFS DRC size
In-Reply-To: <1639798917.3728142.1362846452693.JavaMail.root@erie.cs.uoguelph.ca>
References: <20794.38381.221980.5038@hergotha.csail.mit.edu>
 <1639798917.3728142.1362846452693.JavaMail.root@erie.cs.uoguelph.ca>
X-Mailer: VM 7.17 under 21.4 (patch 22) "Instant Classic" XEmacs Lucid
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.2.7
 (hergotha.csail.mit.edu [127.0.0.1]); Sat, 09 Mar 2013 13:00:04 -0500 (EST)
X-Spam-Status: No, score=-1.0 required=5.0 tests=ALL_TRUSTED
 autolearn=disabled version=3.3.2
X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on
 hergotha.csail.mit.edu
Cc: freebsd-fs@freebsd.org, freebsd-net@freebsd.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 09 Mar 2013 18:00:05 -0000

<<On Sat, 9 Mar 2013 11:27:32 -0500 (EST), Rick Macklem <rmacklem@uoguelph.ca> said:

> around the highwater mark basically indicates this is working. If it wasn't
> throwing away replies where the receipt has been ack'd at the TCP
> level, the cache would grow very large, since they would only be
> discarded after a loonnngg timeout (12 hours unless you've changed
> NFSRVCACHE_TCPTIMEOUT in sys/fs/nfs/nfs.h).

That seems unreasonably large.

> Well, the DRC will try to cache replies until the client's TCP layer
> acknowledges receipt of the reply. It is hard to say how many replies
> that is for a given TCP connection, since it is a function of the level
> of concurrency (# of nfsiod threads in the FreeBSD client)
> in the client. I'd guess it's somewhere between 1<->20?

Nearly all our clients are Linux, so it's likely to be whatever Debian
does by default.

> Multiply that by the number of TCP connections from all clients and
> you have about how big the server's DRC will be. (Some clients use
> a single TCP connection for the client whereas others use a separate
> TCP connection for each mount point.)

The Debian client appears to use a single TCP connection for
everything.

So if I want to support 2,000 clients each with 20 requests in flight,
that would suggest that I need a DRC size of 40,000, which my
experience shows is not sufficient with even a much smaller number of
clients.
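
For the record, the estimate above is just a product of three numbers.
A throwaway sketch of the arithmetic (Python, purely illustrative; the
function and parameter names are mine, not anything in the NFS code):

```python
def drc_entries(clients, connections_per_client, in_flight_per_connection):
    """Rough DRC size under the 'one cached reply per un-acked RPC' model.

    All three inputs are estimates; nothing here is measured.
    """
    return clients * connections_per_client * in_flight_per_connection


# 2,000 Debian clients, one TCP connection each, 20 requests in flight.
estimate = drc_entries(2000, 1, 20)
print(estimate)
```

This only restates the model; it does not explain why the observed
cache grows past that figure.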

-GAWollman


From owner-freebsd-net@FreeBSD.ORG  Sat Mar  9 18:46:11 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id 7D31A6AA
 for <freebsd-net@freebsd.org>; Sat,  9 Mar 2013 18:46:11 +0000 (UTC)
 (envelope-from wollman@hergotha.csail.mit.edu)
Received: from hergotha.csail.mit.edu
 (wollman-1-pt.tunnel.tserv4.nyc4.ipv6.he.net [IPv6:2001:470:1f06:ccb::2])
 by mx1.freebsd.org (Postfix) with ESMTP id EE57C606
 for <freebsd-net@freebsd.org>; Sat,  9 Mar 2013 18:46:10 +0000 (UTC)
Received: from hergotha.csail.mit.edu (localhost [127.0.0.1])
 by hergotha.csail.mit.edu (8.14.5/8.14.5) with ESMTP id r29Ik927062597;
 Sat, 9 Mar 2013 13:46:09 -0500 (EST)
 (envelope-from wollman@hergotha.csail.mit.edu)
Received: (from wollman@localhost)
 by hergotha.csail.mit.edu (8.14.5/8.14.4/Submit) id r29Ik9jX062596;
 Sat, 9 Mar 2013 13:46:09 -0500 (EST) (envelope-from wollman)
Date: Sat, 9 Mar 2013 13:46:09 -0500 (EST)
From: Garrett Wollman <wollman@hergotha.csail.mit.edu>
Message-Id: <201303091846.r29Ik9jX062596@hergotha.csail.mit.edu>
To: rmacklem@uoguelph.ca
Subject: Re: Limits on jumbo mbuf cluster allocation
X-Newsgroups: mit.lcs.mail.freebsd-net
In-Reply-To: <20795.29370.194678.963351@hergotha.csail.mit.edu>
References: <20794.37617.822910.93537@hergotha.csail.mit.edu>
 <1700261042.3728432.1362847830447.JavaMail.root@erie.cs.uoguelph.ca>
Organization: none
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.2.7
 (hergotha.csail.mit.edu [127.0.0.1]); Sat, 09 Mar 2013 13:46:09 -0500 (EST)
X-Spam-Status: No, score=-1.0 required=5.0 tests=ALL_TRUSTED
 autolearn=disabled version=3.3.2
X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on
 hergotha.csail.mit.edu
Cc: freebsd-net@freebsd.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 09 Mar 2013 18:46:11 -0000

In article <20795.29370.194678.963351@hergotha.csail.mit.edu>, I wrote:
><<On Sat, 9 Mar 2013 11:50:30 -0500 (EST), Rick Macklem
><rmacklem@uoguelph.ca> said:
>> I've thought about this. My concern is that the separate thread might
>> not keep up with the trimming demand. If that occurred, the cache would
>> grow veryyy laarrggge, with effects like running out of mbuf clusters.
>
>At a minimum, once one nfsd thread is committed to doing the cache
>trim, a flag should be set to discourage other threads from trying to
>do it.  Having them all spinning their wheels punishes the clients
>much too much.
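
The flag I had in mind in the quoted paragraph amounts to a try-lock.
A minimal sketch, in user-space Python rather than kernel C, with a
hypothetical ReplyCache class standing in for the DRC (a real cache
would also need its own locking around the entry list):

```python
import threading


class ReplyCache:
    """Toy stand-in for the DRC; names and structure are hypothetical."""

    def __init__(self, highwater):
        self.highwater = highwater
        self.entries = []                    # oldest entry first
        self._trimming = threading.Lock()    # the "someone is trimming" flag

    def maybe_trim(self):
        """Trim only if no other thread is already trimming.

        Returns True if this caller did the trim, False if it backed off.
        """
        if not self._trimming.acquire(blocking=False):
            return False                     # another thread owns the trim
        try:
            excess = len(self.entries) - self.highwater
            if excess > 0:
                del self.entries[:excess]    # discard the oldest entries
            return True
        finally:
            self._trimming.release()
```

A thread that gets False back goes straight on to service its RPC
instead of waiting for the trim.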

Also, it occurs to me that this strategy is subject to livelock.  To
put backpressure on the clients, it is far better to get them to stop
sending (by advertising a small receive window) than to accept their
traffic but queue it for a long time.  By the time the NFS code gets
an RPC, the system has already invested so much into it that it should
be processed as quickly as possible, and this strategy essentially
guarantees[1] that, once those 2 MB socket buffers start to fill up, they
will stay filled, sending latency through the roof.  If nfsd didn't
override the usual socket-buffer sizing mechanisms, then sysadmins
could limit the buffers to ensure a stable response time.

The bandwidth-delay product in our network is somewhere between 12.5
kB and 125 kB, depending on how the client is connected and what sort
of latency they experience.  The usual theory would suggest that
socket buffers should be no more than twice that -- i.e., about 256
kB.
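
To show where figures of that size can come from, here is the
bandwidth-delay arithmetic as a small sketch. The link speed and
round-trip times are my assumptions, chosen only because they
reproduce the range above (1 Gbit/s at 100 us and at 1 ms); they are
not measurements from our network.

```python
def bdp_bytes(link_bits_per_s, rtt_us):
    """Bandwidth-delay product in bytes, using integer arithmetic.

    link_bits_per_s: link speed in bits per second
    rtt_us: round-trip time in microseconds
    """
    return link_bits_per_s * rtt_us // (8 * 1_000_000)


GIGABIT = 1_000_000_000          # assumed link speed

low = bdp_bytes(GIGABIT, 100)    # assumed 100 us round trip
high = bdp_bytes(GIGABIT, 1000)  # assumed 1 ms round trip
buffer_cap = 2 * high            # "no more than twice" the larger BDP
print(low, high, buffer_cap)
```

Twice the larger value is 250,000 bytes, which rounds to the 256 kB
figure.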

I'd actually like to see something like WFQ in the NFS server to allow
me to limit the amount of damage one client or group of clients can
do without unnecessarily limiting other clients.
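
A rough sketch of the kind of scheduling I mean. This is self-clocked
fair queueing, a simple approximation of WFQ rather than true WFQ,
written in Python for illustration only; the class, the per-client
weights and the "cost" of an RPC are all hypothetical, and nothing
like this exists in nfsd today.

```python
import heapq
from collections import defaultdict


class FairQueue:
    """Self-clocked fair queue keyed by client (approximation of WFQ)."""

    def __init__(self, weights):
        self.weights = weights                 # client -> relative share
        self.virtual_time = 0.0                # finish tag of last served item
        self.last_finish = defaultdict(float)  # client -> last finish tag
        self.heap = []                         # (finish tag, seq, client, cost)
        self.seq = 0                           # tie-breaker: arrival order

    def enqueue(self, client, cost):
        # A request starts when the client's previous one finishes, or
        # now (in virtual time) if the client has been idle.
        start = max(self.virtual_time, self.last_finish[client])
        finish = start + cost / self.weights[client]
        self.last_finish[client] = finish
        heapq.heappush(self.heap, (finish, self.seq, client, cost))
        self.seq += 1

    def dequeue(self):
        # Serve the request with the smallest finish tag.
        finish, _, client, cost = heapq.heappop(self.heap)
        self.virtual_time = finish
        return client, cost
```

With equal weights, a client that queues four requests ahead of
another client's single request still gets only one of them served
before the other client's request runs.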

-GAWollman

[1] The largest RPC is a bit more than 64 KiB (negotiated), so if the
server gets slow, the 2 MB receive queue will be refilled by the
client before the server manages to perform the RPC and send a
response.

From owner-freebsd-net@FreeBSD.ORG  Sat Mar  9 19:18:04 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id 0AB9EE8B;
 Sat,  9 Mar 2013 19:18:04 +0000 (UTC)
 (envelope-from ndenev@gmail.com)
Received: from mail-wg0-x22a.google.com (mail-wg0-x22a.google.com
 [IPv6:2a00:1450:400c:c00::22a])
 by mx1.freebsd.org (Postfix) with ESMTP id 2339D6E7;
 Sat,  9 Mar 2013 19:18:02 +0000 (UTC)
Received: by mail-wg0-f42.google.com with SMTP id 12so824752wgh.3
 for <multiple recipients>; Sat, 09 Mar 2013 11:18:02 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=x-received:subject:mime-version:content-type:from:in-reply-to:date
 :cc:content-transfer-encoding:message-id:references:to:x-mailer;
 bh=Vo2XatOVIi1H9+9J+ZkiPtFJDTs0Nf5R27czcbeFKvM=;
 b=c7QJCwLUWI+UDQFVBsk1cGZPxXWWVS3neWNsZExVlk2MvQmlfCglGa10tt8ePkU267
 /TkdYldy4zj/YWyV0N7ljcqxJHXrCXmblXg+zd044YhiOJ0AlrzpTh1X2ilxrsWeJnIS
 sKI9hQq+RIlIvKa7oNUOAkpUuEWWFZTYgTqonoa6ZnowLM52J2MrUJn5tU7SWOTuWaVz
 ybiMUtJgtWZRvPR6518pgGSrVt0AwjYOXG9MVOh9RJD1LwGMEmG71jfrYmnIh9HBxlad
 SnZWePQPKp9OKmbh+CYH4gZlw1AEU+srkMb/cBGxsOBjmv5diQfbrfz+n5wQd5bM1fG1
 BIzQ==
X-Received: by 10.180.82.33 with SMTP id f1mr4727528wiy.13.1362856682282;
 Sat, 09 Mar 2013 11:18:02 -0800 (PST)
Received: from [192.168.1.35] ([188.141.28.166])
 by mx.google.com with ESMTPS id c15sm6550408wiw.3.2013.03.09.11.18.00
 (version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128);
 Sat, 09 Mar 2013 11:18:01 -0800 (PST)
Subject: Re: [patch] interface routes
Mime-Version: 1.0 (Mac OS X Mail 6.2 \(1499\))
Content-Type: text/plain; charset=us-ascii
From: Nikolay Denev <ndenev@gmail.com>
In-Reply-To: <20130307214205.GD50035@funkthat.com>
Date: Sat, 9 Mar 2013 19:17:59 +0000
Content-Transfer-Encoding: quoted-printable
Message-Id: <5205A02F-E886-4B7E-8494-1D92F930933B@gmail.com>
References: <513834E4.7050203@FreeBSD.org> <51384443.5070209@freebsd.org>
 <20130307214205.GD50035@funkthat.com>
To: John-Mark Gurney <jmg@funkthat.com>
X-Mailer: Apple Mail (2.1499)
Cc: "Alexander V. Chernikov" <melifaro@freebsd.org>,
 Andre Oppermann <andre@freebsd.org>, net@freebsd.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 09 Mar 2013 19:18:04 -0000

On Mar 7, 2013, at 9:42 PM, John-Mark Gurney <jmg@funkthat.com> wrote:

> Andre Oppermann wrote this message on Thu, Mar 07, 2013 at 08:39 +0100:
>>> Adding interface address is handled via atomically deleting old prefix and
>>> adding interface one.
>>
>> This brings up a long standing sore point of our routing code
>> which this patch makes more pronounced.  When an interface link
>> state is down I don't want the route to it to persist but to
>> become inactive so another path can be chosen.  This is the very
>> point of running a routing daemon.  So on the link-down event
>> the installed interface routes should be removed from the routing
>> table.  The configured addresses though should persist and the
>> interface routes re-installed on a link-up event.  What's your
>> opinion on it?
>>
>> Other than these points I think your code is fine and can go
>> into the tree.
>
> The issue that I see with this is that if you bump your cable, all
> your connections will be dropped, because as soon as they try to send
> something, they'll get a no route to host, and this will break the
> TCP connection...  If we keep the routes when the link goes down,
> the packet will be queued or dropped (depending upon ethernet driver),
> but the TCP connection will not break...
>
> --
>  John-Mark Gurney				Voice: +1 415 225 5579
>
>     "All that I will do, has been done, All that I have, has not."

Maybe this can be made an option that can be turned on when needed.
What you describe can be very undesirable for a workstation/laptop or a
server, but a router that itself does not have many connections
originating or terminating on it could actually benefit from this.
The current state is actually much worse for routers. A link down does
not do anything, and while there may be an alternative route to be
installed, for example from OSPF, the interface without link retains
its routes and effectively blackholes all traffic.
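
To make the blackhole case concrete, here is a toy model in Python.
It is only a sketch: the Route class, the metric field, the interface
names and the remove_on_link_down switch are all hypothetical, and it
simplifies by keeping the OSPF-learned route in the table alongside
the interface route.

```python
import ipaddress
from dataclasses import dataclass


@dataclass
class Route:
    prefix: ipaddress.IPv4Network
    interface: str
    metric: int            # lower wins among equal-length prefixes


def lookup(routes, link_up, dst, remove_on_link_down):
    """Return the outgoing interface for dst, or None if no route matches.

    With remove_on_link_down set, routes over a down link are ignored.
    """
    addr = ipaddress.ip_address(dst)
    usable = [
        r for r in routes
        if addr in r.prefix
        and (link_up[r.interface] or not remove_on_link_down)
    ]
    if not usable:
        return None
    # Longest prefix first, then lowest metric.
    best = min(usable, key=lambda r: (-r.prefix.prefixlen, r.metric))
    return best.interface


net = ipaddress.ip_network("192.0.2.0/24")
routes = [
    Route(net, "em0", 0),    # interface route, link is down
    Route(net, "em1", 10),   # alternative learned from OSPF
]
link_up = {"em0": False, "em1": True}

print(lookup(routes, link_up, "192.0.2.7", remove_on_link_down=False))
print(lookup(routes, link_up, "192.0.2.7", remove_on_link_down=True))
```

In the first lookup the traffic is still sent to the dead em0; in the
second the alternative path over em1 is chosen.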

--
Nikolay


From owner-freebsd-net@FreeBSD.ORG  Sat Mar  9 19:20:23 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: net@freebsd.org
Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id C8813F6D;
 Sat,  9 Mar 2013 19:20:23 +0000 (UTC)
 (envelope-from melifaro@FreeBSD.org)
Received: from mail.ipfw.ru (unknown [IPv6:2a01:4f8:120:6141::2])
 by mx1.freebsd.org (Postfix) with ESMTP id 8E0C0701;
 Sat,  9 Mar 2013 19:20:23 +0000 (UTC)
Received: from v6.mpls.in ([2a02:978:2::5] helo=ws.su29.net)
 by mail.ipfw.ru with esmtpsa (TLSv1:CAMELLIA256-SHA:256)
 (Exim 4.76 (FreeBSD)) (envelope-from <melifaro@FreeBSD.org>)
 id 1UEPN0-00092D-Td; Sat, 09 Mar 2013 23:23:51 +0400
Message-ID: <513B8B56.1000005@FreeBSD.org>
Date: Sat, 09 Mar 2013 23:19:50 +0400
From: "Alexander V. Chernikov" <melifaro@FreeBSD.org>
User-Agent: Mozilla/5.0 (X11; FreeBSD amd64;
 rv:9.0) Gecko/20120121 Thunderbird/9.0
MIME-Version: 1.0
To: Nikolay Denev <ndenev@gmail.com>
Subject: Re: [patch] interface routes
References: <513834E4.7050203@FreeBSD.org> <51384443.5070209@freebsd.org>
 <20130307214205.GD50035@funkthat.com>
 <5205A02F-E886-4B7E-8494-1D92F930933B@gmail.com>
In-Reply-To: <5205A02F-E886-4B7E-8494-1D92F930933B@gmail.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: John-Mark Gurney <jmg@funkthat.com>, Andre Oppermann <andre@freebsd.org>,
 net@freebsd.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 09 Mar 2013 19:20:23 -0000

On 09.03.2013 23:17, Nikolay Denev wrote:
> On Mar 7, 2013, at 9:42 PM, John-Mark Gurney<jmg@funkthat.com>  wrote:
>
>> Andre Oppermann wrote this message on Thu, Mar 07, 2013 at 08:39 +0100:
>>>> Adding interface address is handled via atomically deleting old prefix and
>>>> adding interface one.
>>>
>>> This brings up a long standing sore point of our routing code
>>> which this patch makes more pronounced.  When an interface link
>>> state is down I don't want the route to it to persist but to
>>> become inactive so another path can be chosen.  This is the very
>>> point of running a routing daemon.  So on the link-down event
>>> the installed interface routes should be removed from the routing
>>> table.  The configured addresses though should persist and the
>>> interface routes re-installed on a link-up event.  What's your
>>> opinion on it?
>>>
>>> Other than these points I think your code is fine and can go
>>> into the tree.
>>
>> The issue that I see with this is that if you bump your cable, all
>> your connections will be dropped, because as soon as they try to send
>> something, they'll get a no route to host, and this will break the
>> TCP connection...  If we keep the routes when the link goes down,
>> the packet will be queued or dropped (depending upon ethernet driver),
>> but the TCP connection will not break...
>>
>> --
>>   John-Mark Gurney				Voice: +1 415 225 5579
>>
>>      "All that I will do, has been done, All that I have, has not."
>
> Maybe this can be made an option that can be turned on when needed.
Yes. There is another patch in this thread that adds a per-VNET sysctl,
"remove_iface_routes_on_change", which is turned off by default.
> What you describe can be very undesirable for a workstation/laptop or a server,
> but a router that itself does not have many connections originating or terminating on it could
> actually benefit from this.
> The current state is actually much worse for routers. A link down does not do anything, and
> while there may be an alternative route to be installed for example from OSPF, the interface without link
> retains its routes and effectively blackholes all traffic.
>
> --
> Nikolay
>
>