From owner-freebsd-net@FreeBSD.ORG Sun Mar 3 10:14:47 2013
References: <512BAA60.3060703@biostat.wisc.edu> <512BAF8D.7080308@biostat.wisc.edu>
Date: Sun, 3 Mar 2013 18:14:44 +0800
Subject: Re: igb network lockups
From: Sepherosa Ziehau
To: Nick Rogers
Cc: "freebsd-net@freebsd.org", Jack Vogel, "Christopher D. Harrison"
X-BeenThere: freebsd-net@freebsd.org
List-Id: Networking and TCP/IP with FreeBSD

On Sat, Mar 2, 2013 at 12:18 AM, Nick Rogers wrote:
> On Fri, Mar 1, 2013 at 8:04 AM, Nick Rogers wrote:
>> FWIW I have been experiencing a similar issue on a number of systems
>> using the em(4) driver under 9.1-RELEASE. This is after upgrading from
>> a snapshot of 8.3-STABLE. My systems use PF+ALTQ as well. The symptoms
>> are: interface stops passing traffic until the system is rebooted. I
>> have not yet been able to gain access to the systems to dig around
>> (after they have crashed), however my kernel/network settings are
>> properly tuned (high mbuf limit, hw.em.rxd/txd=4096, etc). It seems to
>> happen about once a day on systems with around a sustained 50Mb/s of
>> traffic.
>>
>> I realize this is not much to go on but perhaps it helps. I am
>> debating trying the e1000 driver in the latest CURRENT on top of
>> 9.1-RELEASE. I noticed the Intel shared code was updated about a week
>> ago. Would this change or perhaps another change to e1000 since
>> 9.1-RELEASE possibly affect stability in a positive way?
>>
>> Thanks.
>
> Heres relevant pciconf output:
>
> em0@pci0:1:0:0: class=0x020000 card=0x10d315d9 chip=0x10d38086 rev=0x00 hdr=0x00
>     vendor = 'Intel Corporation'
>     device = '82574L Gigabit Network Connection'
>     class = network
>     subclass = ethernet
>     cap 01[c8] = powerspec 2 supports D0 D3 current D0
>     cap 05[d0] = MSI supports 1 message, 64 bit
>     cap 10[e0] = PCI-Express 1 endpoint max data 128(256) link x1(x1)
>     cap 11[a0] = MSI-X supports 5 messages in map 0x1c enabled
>     ecap 0001[100] = AER 1 0 fatal 0 non-fatal 1 corrected

For the 82574L, i.e. the chip supported by em(4), MSI-X must _not_ be
enabled; it is simply broken (you can check the 82574 errata on Intel's
website to confirm this).
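To see which devices currently report MSI-X as enabled, the pciconf output quoted above can be filtered mechanically. What follows is only a minimal shell sketch and not part of the original message: the function name msix_enabled_devices is hypothetical, and it assumes input in the `pciconf -lc` layout shown in the quote (a `name@pci...` header line followed by its `cap` lines).

```shell
# Hypothetical helper (not from the original thread): print the name of every
# device whose MSI-X capability line ends up marked "enabled".
# Assumes `pciconf -lc` style text on stdin, as in the quoted output.
msix_enabled_devices() {
    awk '
        # Device header, e.g. "em0@pci0:1:0:0: class=0x020000 ..."
        /^[a-z0-9_]+@pci/ { split($1, parts, "@"); dev = parts[1] }
        # Capability line, e.g. "cap 11[a0] = MSI-X supports 5 messages ... enabled"
        /MSI-X/ && / enabled/ { print dev }
    '
}
```

On a live system this could be run as `pciconf -lc | msix_enabled_devices`. Whether MSI-X can then be switched off through a loader tunable such as hw.em.enable_msix or hw.igb.enable_msix depends on the driver version, so that is an assumption to verify against em(4), igb(4) or the driver source before relying on it.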
For the 82575, i.e. the chip supported by igb(4), MSI-X must _not_ be
enabled; it is simply broken (you can check the 82575 errata on Intel's
website to confirm this).

Best Regards,
sephe

-- 
Tomorrow Will Never Die

On Sat, Mar 2, 2013 at 12:18 AM, Nick Rogers wrote:
> On Fri, Mar 1, 2013 at 8:04 AM, Nick Rogers wrote:
>> FWIW I have been experiencing a similar issue on a number of systems
>> using the em(4) driver under 9.1-RELEASE. This is after upgrading from
>> a snapshot of 8.3-STABLE. My systems use PF+ALTQ as well. The symptoms
>> are: interface stops passing traffic until the system is rebooted. I
>> have not yet been able to gain access to the systems to dig around
>> (after they have crashed), however my kernel/network settings are
>> properly tuned (high mbuf limit, hw.em.rxd/txd=4096, etc). It seems to
>> happen about once a day on systems with around a sustained 50Mb/s of
>> traffic.
>>
>> I realize this is not much to go on but perhaps it helps. I am
>> debating trying the e1000 driver in the latest CURRENT on top of
>> 9.1-RELEASE. I noticed the Intel shared code was updated about a week
>> ago. Would this change or perhaps another change to e1000 since
>> 9.1-RELEASE possibly affect stability in a positive way?
>>
>> Thanks.
>
> Heres relevant pciconf output:
>
> em0@pci0:1:0:0: class=0x020000 card=0x10d315d9 chip=0x10d38086 rev=0x00 hdr=0x00
>     vendor = 'Intel Corporation'
>     device = '82574L Gigabit Network Connection'
>     class = network
>     subclass = ethernet
>     cap 01[c8] = powerspec 2 supports D0 D3 current D0
>     cap 05[d0] = MSI supports 1 message, 64 bit
>     cap 10[e0] = PCI-Express 1 endpoint max data 128(256) link x1(x1)
>     cap 11[a0] = MSI-X supports 5 messages in map 0x1c enabled
>     ecap 0001[100] = AER 1 0 fatal 0 non-fatal 1 corrected
> em1@pci0:2:0:0: class=0x020000 card=0x10d315d9 chip=0x10d38086 rev=0x00 hdr=0x00
>     vendor = 'Intel Corporation'
>     device = '82574L Gigabit Network Connection'
>     class = network
>     subclass = ethernet
>     cap 01[c8] = powerspec 2 supports D0 D3 current D0
>     cap 05[d0] = MSI supports 1 message, 64 bit
>     cap 10[e0] = PCI-Express 1 endpoint max data 128(256) link x1(x1)
>     cap 11[a0] = MSI-X supports 5 messages in map 0x1c enabled
>     ecap 0001[100] = AER 1 0 fatal 0 non-fatal 1 corrected
> em2@pci0:7:0:0: class=0x020000 card=0x10d315d9 chip=0x10d38086 rev=0x00 hdr=0x00
>     vendor = 'Intel Corporation'
>     device = '82574L Gigabit Network Connection'
>     class = network
>     subclass = ethernet
>     cap 01[c8] = powerspec 2 supports D0 D3 current D0
>     cap 05[d0] = MSI supports 1 message, 64 bit
>     cap 10[e0] = PCI-Express 1 endpoint max data 128(256) link x1(x1)
>     cap 11[a0] = MSI-X supports 5 messages in map 0x1c enabled
>     ecap 0001[100] = AER 1 0 fatal 0 non-fatal 1 corrected
> em3@pci0:8:0:0: class=0x020000 card=0x10d315d9 chip=0x10d38086 rev=0x00 hdr=0x00
>     vendor = 'Intel Corporation'
>     device = '82574L Gigabit Network Connection'
>     class = network
>     subclass = ethernet
>     cap 01[c8] = powerspec 2 supports D0 D3 current D0
>     cap 05[d0] = MSI supports 1 message, 64 bit
>     cap 10[e0] = PCI-Express 1 endpoint max data 128(256) link x1(x1)
>     cap 11[a0] = MSI-X supports 5 messages in map 0x1c enabled
>     ecap 0001[100] = AER 1 0 fatal 0 non-fatal 1 corrected
>
>
>>
>> On Mon, Feb 25, 2013 at 10:45 AM, Jack Vogel wrote:
>>> Have you done any poking around, looking at stats to determine why the
>>> hangs? For instance,
>>> might your mbuf pool be depleted? Some other network resource perhaps?
>>>
>>> Jack
>>>
>>>
>>> On Mon, Feb 25, 2013 at 10:38 AM, Christopher D. Harrison <
>>> harrison@biostat.wisc.edu> wrote:
>>>
>>>> Sure,
>>>> The problem appears on both systems running with ALTQ and vanilla.
>>>> -C
>>>>
>>>> On 02/25/13 12:29, Jack Vogel wrote:
>>>>
>>>> I've not heard of this problem, but I think most users do not use ALTQ,
>>>> and we (Intel) do not
>>>> test using it. Can it be eliminated from the equation?
>>>>
>>>> Jack
>>>>
>>>>
>>>> On Mon, Feb 25, 2013 at 10:16 AM, Christopher D. Harrison <
>>>> harrison@biostat.wisc.edu> wrote:
>>>>
>>>>> I recently have been experiencing network "freezes" and network "lockups"
>>>>> on our Freebsd 9.1 systems which are running zfs and nfs file servers.
>>>>> I upgraded from 9.0 to 9.1 about 2 months ago and we have been having
>>>>> issues with almost bi-monthly. The issue manifests in the system becomes
>>>>> unresponsive to any/all nfs clients. The system is not resource bound as
>>>>> our I/O is low to disk and our network is usually in the 20mbit/40mbit
>>>>> range. We do notice a correlation between temporary i/o spikes and
>>>>> network freezes but not enough to send our system in to "lockup" mode for
>>>>> the next 5min. Currently we have 4 igb nics in 2 aggr's with 8 queue's
>>>>> per nic and our dev.igb reports:
>>>>>
>>>>> dev.igb.3.%desc: Intel(R) PRO/1000 Network Connection version - 2.3.4
>>>>>
>>>>> I am almost certain the problem is with the ibg driver as a friend is
>>>>> also experiencing the same problem with the same intel igb nic. He has
>>>>> addressed the issue by restarting the network using netif on his systems.
>>>>> According to my friend, once the network interfaces get cleared, everything
>>>>> comes back and starts working as expected.
>>>>>
>>>>> I have noticed an issue with the igb driver and I was looking for
>>>>> thoughts on how to help address this problem.
>>>>>
>>>>> http://freebsd.1045724.n5.nabble.com/em-igb-if-transmit-drbr-and-ALTQ-td5760338.html
>>>>>
>>>>> Thoughts/Ideas are greatly appreciated!!!
>>>>>
>>>>> -C
>>>>>
>>>>> _______________________________________________
>>>>> freebsd-net@freebsd.org mailing list
>>>>> http://lists.freebsd.org/mailman/listinfo/freebsd-net
>>>>> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
>>>>>
>>>>
>>>>
>>>>

-- 
Tomorrow Will Never Die

From owner-freebsd-net@FreeBSD.ORG Sun Mar 3 15:20:53 2013
Message-ID: <513368EE.9090802@gmail.com>
Date: Sun, 03 Mar 2013 16:14:54 +0100
From: Pawel Worach
To: freebsd-net@freebsd.org
Subject: ipfw NAT, keepalive from wrong source

Hi,

In the scenario below, ipfw seems to send the keep-alive packets from the
wrong source address when the traffic is NATed: on the external interface,
the packet is sent to the server with the original (untranslated) source
address. Did I configure my ipfw rules incorrectly? I'm using in-kernel NAT
on FreeBSD 9-STABLE r247666 with r247626 merged from head (that patch did
not change the behavior).

An internal client (172.16.0.31) connects to an external ssh server
(192.0.2.100) with hide-NAT behind a.b.c.d.

tcpdump on the outside interface (the second packet is likely the keepalive
ACK that the client sent in response to the keepalive the ipfw gateway sent
on the inside, which then got forwarded on to the server; is that
intentional?):

15:36:28.075529 IP 172.16.0.31.41731 > 192.0.2.100.22: Flags [.], ack 2804620200, win 0, length 0
15:36:28.076823 IP a.b.c.d.41731 > 192.0.2.100.22: Flags [.], ack 2625, win 1040, options [nop,nop,TS val 151519866 ecr 3275697134], length 0
15:36:33.075499 IP 172.16.0.31.41731 > 192.0.2.100.22: Flags [.], ack 1, win 0, length 0
15:36:38.075497 IP 172.16.0.31.41731 > 192.0.2.100.22: Flags [.], ack 1, win 0, length 0
15:36:43.075519 IP 172.16.0.31.41731 > 192.0.2.100.22: Flags [.], ack 1, win 0, length 0

tcpdump on the inside interface:

15:36:28.078015 IP 192.0.2.100.22 > 172.16.0.31.41731: Flags [.], ack 517940233, win 0, length 0
15:36:28.078040 IP 172.16.0.31.41731 > 192.0.2.100.22: Flags [.], ack 1, win 1040, options [nop,nop,TS val 151519866 ecr 3275697134], length 0

State table (the keepalives were sent at about 20-19 seconds before
expiration):

03600 27 7867 (22s) STATE tcp 172.16.0.31 41731 <-> 192.0.2.100 22
03600 27 7867 (21s) STATE tcp 172.16.0.31 41731 <-> 192.0.2.100 22
03600 27 7867 (20s) STATE tcp 172.16.0.31 41731 <-> 192.0.2.100 22
03600 27 7867 (19s) STATE tcp 172.16.0.31 41731 <-> 192.0.2.100 22
03600 28 7919 (18s) STATE tcp 172.16.0.31 41731 <-> 192.0.2.100 22
03600 28 7919 (17s) STATE tcp 172.16.0.31 41731 <-> 192.0.2.100 22
03600 28 7919 (16s) STATE tcp 172.16.0.31 41731 <-> 192.0.2.100 22
03600 28 7919 (15s) STATE tcp 172.16.0.31 41731 <-> 192.0.2.100 22
03600 28 7919 (14s) STATE tcp 172.16.0.31 41731 <-> 192.0.2.100 22
.. continues to 1 and disappears ..

Rules (em0 is the external interface):

${fwcmd} nat 10 config if em0 log same_ports unreg_only
${fwcmd} add nat 10 all from 172.16.0.0/12 to any via em0
${fwcmd} add nat 10 all from not 172.16.0.0/12 any to me via em0
${fwcmd} add allow tcp from 172.16.0.0/12 to any established
${fwcmd} add allow tcp from 172.16.0.0/12 to any setup keep-state # (this is rule 03600)

Regards
Pawel

From owner-freebsd-net@FreeBSD.ORG Mon Mar 4 11:06:46 2013
Date: Mon, 4 Mar 2013 11:06:46 GMT
Message-Id: <201303041106.r24B6k00038830@freefall.freebsd.org>
From: FreeBSD bugmaster
To: freebsd-net@FreeBSD.org
Subject: Current problem reports assigned to freebsd-net@FreeBSD.org

Note: to view an individual PR, use:
  http://www.freebsd.org/cgi/query-pr.cgi?pr=(number).
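
The URL pattern in the note above can be wrapped in a small shell helper. This is a minimal sketch and not part of the bugmaster mail: the function name pr_url is hypothetical, and stripping the category prefix (for example "kern/") is an assumption based on the note asking for the PR number only.

```shell
# Hypothetical helper: print the query URL for one problem report.
# Accepts a bare number ("176596") or a category-qualified ID ("kern/176596").
pr_url() {
    pr_number=${1##*/}   # drop everything up to the last "/", if any
    printf 'http://www.freebsd.org/cgi/query-pr.cgi?pr=%s\n' "$pr_number"
}
```

For example, `pr_url kern/176596` prints the URL with `pr=176596` appended.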

The following is a listing of current problems submitted by FreeBSD users.
These represent problem reports covering all versions including
experimental development code and obsolete releases.

S Tracker      Resp.      Description
--------------------------------------------------------------------------------
o kern/176596  net        [firewire] [ip6] Crash with IPv6 and Firewire
o kern/176510  net        [udp] [panic] Kernel Panic in udp_input @ offset 0x475
o kern/176446  net        [netinet] [patch] Concurrency in ixgbe driving out-of-
o kern/176420  net        [kernel] [patch] incorrect errno for LOCAL_PEERCRED
o kern/176419  net        [kernel] [patch] socketpair support for LOCAL_PEERCRED
o kern/176401  net        [netgraph] page fault in netgraph
o kern/176167  net        [ipsec][lagg] using lagg and ipsec causes immediate pa
o kern/176097  net        [lagg] [patch] lagg/lacp broken when aggregated interf
o kern/176027  net        [em] [patch] flow control systcl consistency for em dr
o kern/176026  net        [tcp] [patch] TCP wrappers caused quite a lot of warni
o bin/175974   net        ppp(8): logic issue
o kern/175864  net        [re] Intel MB D510MO, onboard ethernet not working aft
o kern/175852  net        [amd64] [patch] in_cksum_hdr() behaves differently on
o kern/175734  net        no ethernet detected on system with EG20T PCH chipset
o kern/175267  net        [pf] [tap] pf + tap keep state problem
o kern/175236  net        [epair] [gif] epair and gif Devices On Bridge
o kern/175182  net        [panic] kernel panic on RADIX_MPATH when deleting rout
o kern/175153  net        [tcp] will there miss a FIN when do TSO?
o kern/174959  net        [net] [patch] rnh_walktree_from visits spurious nodes
o kern/174958  net        [net] [patch] rnh_walktree_from makes unreasonable ass
o kern/174897  net        [route] Interface routes are broken
o kern/174851  net        [bxe] [patch] UDP checksum offload is wrong in bxe dri
o kern/174850  net        [bxe] [patch] bxe driver does not receive multicasts
o kern/174849  net        [bxe] [patch] bxe driver can hang kernel when reset
o kern/174822  net        [tcp] Page fault in tcp_discardcb under high traffic
o kern/174602  net        [gif] [ipsec] traceroute issue on gif tunnel with ipse
o kern/174535  net        [tcp] TCP fast retransmit feature works strange
o kern/173475  net        [tun] tun(4) stays opened by PID after process is term
o kern/173201  net        [ixgbe] [patch] Missing / broken ixgbe sysctl's and tu
o kern/173137  net        [em] em(4) unable to run at gigabit with 9.1-RC2
o kern/173002  net        [patch] data type size problem in if_spppsubr.c
o kern/172985  net        [patch] [ip6] lltable leak when adding and removing IP
o kern/172895  net        [ixgb] [ixgbe] do not properly determine link-state
o kern/172683  net        [ip6] Duplicate IPv6 Link Local Addresses
o kern/172675  net        [netinet] [patch] sysctl_tcp_hc_list (net.inet.tcp.hos
o kern/172113  net        [panic] [e1000] [patch] 9.1-RC1/amd64 panices in igb(4
o kern/171840  net        [ip6] IPv6 packets transmitting only on queue 0
o kern/171739  net        [bce] [panic] bce related kernel panic
o kern/171711  net        [dummynet] [panic] Kernel panic in dummynet
o kern/171532  net        [ndis] ndis(4) driver includes 'pccard'-specific code,
o kern/171531  net        [ndis] undocumented dependency for ndis(4)
o kern/171524  net        [ipmi] ipmi driver crashes kernel by reboot or shutdow
s kern/171508  net        [epair] [request] Add the ability to name epair device
o kern/171228  net        [re] [patch] if_re - eeprom write issues
o kern/170701  net        [ppp] killl ppp or reboot with active ppp connection c
o kern/170267  net        [ixgbe] IXGBE_LE32_TO_CPUS is probably an unintentiona
o kern/170081  net        [fxp] pf/nat/jails not working if checksum offloading
o kern/169898  net        ifconfig(8) fails to set MTU on multiple interfaces.
o kern/169676  net        [bge] [hang] system hangs, fully or partially after re
o kern/169664  net        [bgp] Wrongful replacement of interface connected net
o kern/169620  net        [ng] [pf] ng_l2tp incoming packet bypass pf firewall
o kern/169459  net        [ppp] umodem/ppp/3g stopped working after update from
o kern/169438  net        [ipsec] ipv4-in-ipv6 tunnel mode IPsec does not work
p kern/168294  net        [ixgbe] [patch] ixgbe driver compiled in kernel has no
o kern/168246  net        [em] Multiple em(4) not working with qemu
o kern/168245  net        [arp] [regression] Permanent ARP entry not deleted on
o kern/168244  net        [arp] [regression] Unable to manually remove permanent
o kern/168183  net        [bce] bce driver hang system
o kern/167947  net        [setfib] [patch] arpresolve checks only the default FI
o kern/167603  net        [ip] IP fragment reassembly's broken: file transfer ov
o kern/167500  net        [em] [panic] Kernel panics in em driver
o kern/167325  net        [netinet] [patch] sosend sometimes return EINVAL with
o kern/167202  net        [igmp]: Sending multiple IGMP packets crashes kernel
o kern/167059  net        [tcp] [panic] System does panic in in_pcbbind() and ha
o kern/166940  net        [ipfilter] [panic] Double fault in kern 8.2
o kern/166462  net        [gre] gre(4) when using a tunnel source address from c
o kern/166372  net        [patch] ipfilter drops UDP packets with zero checksum
o kern/166285  net        [arp] FreeBSD v8.1 REL p8 arp: unknown hardware addres
o kern/166255  net        [net] [patch] It should be possible to disable "promis
o kern/165963  net        [panic] [ipf] ipfilter/nat NULL pointer deference
o kern/165903  net        mbuf leak
o kern/165643  net        [net] [patch] Missing vnet restores in net/if_ethersub
o kern/165622  net        [ndis][panic][patch] Unregistered use of FPU in kernel
s kern/165562  net        [request] add support for Intel i350 in FreeBSD 7.4
o kern/165526  net        [bxe] UDP packets checksum calculation whithin if_bxe
o kern/165488  net        [ppp] [panic] Fatal trap 12 jails and ppp , kernel wit
o kern/165305  net        [ip6] [request] Feature parity between IP_TOS and IPV6
o kern/165296  net        [vlan] [patch] Fix EVL_APPLY_VLID, update EVL_APPLY_PR
o kern/165181  net        [igb] igb freezes after about 2 weeks of uptime
o kern/165174  net        [patch] [tap] allow tap(4) to keep its address on clos
o kern/165152  net        [ip6] Does not work through the issue of ipv6 addresse
o kern/164495  net        [igb] connect double head igb to switch cause system t
o kern/164490  net        [pfil] Incorrect IP checksum on pfil pass from ip_outp
o kern/164475  net        [gre] gre misses RUNNING flag after a reboot
o kern/164265  net        [netinet] [patch] tcp_lro_rx computes wrong checksum i
o kern/163903  net        [igb] "igb0:tx(0)","bpf interface lock" v2.2.5 9-STABL
o kern/163481  net        freebsd do not add itself to ping route packet
o kern/162927  net        [tun] Modem-PPP error ppp[1538]: tun0: Phase: Clearing
o kern/162926  net        [ipfilter] Infinite loop in ipfilter with fragmented I
o kern/162558  net        [dummynet] [panic] seldom dummynet panics
o kern/162153  net        [em] intel em driver 7.2.4 don't compile
o kern/162110  net        [igb] [panic] RELENG_9 panics on boot in IGB driver -
o kern/162028  net        [ixgbe] [patch] misplaced #endif in ixgbe.c
o kern/161277  net        [em] [patch] BMC cannot receive IPMI traffic after loa
o kern/160873  net        [igb] igb(4) from HEAD fails to build on 7-STABLE
o kern/160750  net        Intel PRO/1000 connection breaks under load until rebo
o kern/160693  net        [gif] [em] Multicast packet are not passed from GIF0 t
o kern/160293  net        [ieee80211] ppanic] kernel panic during network setup
o kern/160206  net        [gif] gifX stops working after a while (IPv6 tunnel)
o kern/159817  net        [udp] write UDPv4: No buffer space available (code=55)
o kern/159629  net        [ipsec] [panic] kernel panic with IPsec in transport m
o kern/159621  net        [tcp] [panic] panic: soabort: so_count
o kern/159603  net        [netinet] [patch] in_ifscrubprefix() - network route c
o kern/159601  net        [netinet] [patch] in_scrubprefix() - loopback route re
o kern/159294  net        [em] em watchdog timeouts
o kern/159203  net        [wpi] Intel 3945ABG Wireless LAN not support IBSS
o kern/158930  net        [bpf] BPF element leak in ifp->bpf_if->bif_dlist
o kern/158726  net        [ip6] [patch] ICMPv6 Router Announcement flooding limi
o kern/158694  net        [ix] [lagg] ix0 is not working within lagg(4)
o kern/158665  net        [ip6] [panic] kernel pagefault in in6_setscope()
o kern/158635  net        [em] TSO breaks BPF packet captures with em driver
f kern/157802  net        [dummynet] [panic] kernel panic in dummynet
o kern/157785  net        amd64 + jail + ipfw + natd = very slow outbound traffi
o kern/157418  net        [em] em driver lockup during boot on Supermicro X9SCM-
o kern/157410  net        [ip6] IPv6 Router Advertisements Cause Excessive CPU U
o kern/157287  net        [re] [panic] INVARIANTS panic (Memory modified after f
o kern/157209  net        [ip6] [patch] locking error in rip6_input() (sys/netin
o kern/157200  net        [network.subr] [patch] stf(4) can not communicate betw
o kern/157182  net        [lagg] lagg interface not working together with epair
o kern/156877  net        [dummynet] [panic] dummynet move_pkt() null ptr derefe
o kern/156667  net        [em] em0 fails to init on CURRENT after March 17
o kern/156408  net        [vlan] Routing failure when using VLANs vs. Physical e
o kern/156328  net        [icmp]: host can ping other subnet but no have IP from
o kern/156317  net        [ip6] Wrong order of IPv6 NS DAD/MLD Report
o kern/156283  net        [ip6] [patch] nd6_ns_input - rtalloc_mpath does not re
o kern/156279  net        [if_bridge][divert][ipfw] unable to correctly re-injec
o kern/156226  net        [lagg]: failover does not announce the failover to swi
o kern/156030  net        [ip6] [panic] Crash in nd6_dad_start() due to null ptr
o kern/155772  net        ifconfig(8): ioctl (SIOCAIFADDR): File exists on direc
o kern/155680  net        [multicast] problems with multicast
s kern/155642  net        [new driver] [request] Add driver for Realtek RTL8191S
o kern/155597  net        [panic] Kernel panics with "sbdrop" message
o kern/155420  net        [vlan] adding vlan break existent vlan
o kern/155177  net        [route] [panic] Panic when inject routes in kernel
p kern/155030  net        [igb] igb(4) DEVICE_POLLING does not work with carp(4)
o kern/155010  net        [msk] ntfs-3g via iscsi using msk driver cause kernel
o kern/154943  net        [gif] ifconfig gifX create on existing gifX clears IP
s kern/154851  net        [new driver] [request]: Port brcm80211 driver from Lin
o kern/154850  net        [netgraph] [patch] ng_ether fails to name nodes when t
o kern/154679  net        [em] Fatal trap 12: "em1 taskq" only at startup (8.1-R
o kern/154600  net        [tcp] [panic] Random kernel panics on tcp_output
o kern/154557  net        [tcp] Freeze tcp-session of the clients, if in the gat
o kern/154443  net        [if_bridge] Kernel module bridgestp.ko missing after u
o kern/154286  net        [netgraph] [panic] 8.2-PRERELEASE panic in netgraph
o kern/154255  net        [nfs] NFS not responding
o kern/154214  net        [stf] [panic] Panic when creating stf interface
o kern/154185  net        race condition in mb_dupcl
o kern/154169  net        [multicast] [ip6] Node Information Query multicast add
o kern/154134  net        [ip6] stuck kernel state in LISTEN on ipv6 daemon whic
o kern/154091  net        [netgraph] [panic] netgraph, unaligned mbuf?
o conf/154062  net        [vlan] [patch] change to way of auto-generatation of v
o kern/153937  net        [ral] ralink panics the system (amd64 freeBSDD 8.X) wh
o kern/153936  net        [ixgbe] [patch] MPRC workaround incorrectly applied to
o kern/153816  net        [ixgbe] ixgbe doesn't work properly with the Intel 10g
o kern/153772  net        [ixgbe] [patch] sysctls reference wrong XON/XOFF varia
o kern/153497  net        [netgraph] netgraph panic due to race conditions
o kern/153454  net        [patch] [wlan] [urtw] Support ad-hoc and hostap modes
o kern/153308  net        [em] em interface use 100% cpu
o kern/153244  net        [em] em(4) fails to send UDP to port 0xffff
o kern/152893  net        [netgraph] [panic] 8.2-PRERELEASE panic in netgraph
o kern/152853  net        [em] tftpd (and likely other udp traffic) fails over e
o kern/152828  net        [em] poor performance on 8.1, 8.2-PRE
o kern/152569  net        [net]: Multiple ppp connections and routing table prob
o kern/152235  net        [arp] Permanent local ARP entries are not properly upd
o kern/152141  net        [vlan] [patch] encapsulate vlan in ng_ether before out
o kern/152036  net        [libc] getifaddrs(3) returns truncated sockaddrs for n
o kern/151690  net        [ep] network connectivity won't work until dhclient is
o kern/151681  net        [nfs] NFS mount via IPv6 leads to hang on client with
o kern/151593  net        [igb] [panic] Kernel panic when bringing up igb networ
o kern/150920  net        [ixgbe][igb] Panic when packets are dropped with heade
o kern/150557  net        [igb] igb0: Watchdog timeout -- resetting
o kern/150251  net        [patch] [ixgbe] Late cable insertion broken
o kern/150249  net        [ixgbe] Media type detection broken
o bin/150224   net        ppp(8) does not reassign static IP after kill -KILL co
f kern/149969  net        [wlan] [ral] ralink rt2661 fails to maintain connectio
o kern/149937  net        [ipfilter] [patch] kernel panic in ipfilter IP fragmen
o kern/149643  net        [rum] device not sending proper beacon frames in ap mo
o kern/149609  net        [panic] reboot after adding second default route
o kern/149117  net        [inet] [patch] in_pcbbind: redundant test
o kern/149086  net        [multicast] Generic multicast join failure in 8.1
o kern/148018  net        [flowtable] flowtable crashes on ia64
o kern/147912  net        [boot] FreeBSD 8 Beta won't boot on Thinkpad i1300 11
o kern/147894  net        [ipsec] IPv6-in-IPv4 does not work inside an ESP-only
o kern/147155  net        [ip6] setfb not work with ipv6
o kern/146845  net        [libc] close(2) returns error 54 (connection reset by
f kern/146792  net        [flowtable] flowcleaner 100% cpu's core load
o kern/146719  net        [pf] [panic] PF or dumynet kernel panic
o kern/146534  net        [icmp6] wrong source address in echo reply
o kern/146427  net        [mwl] Additional virtual access points don't work on m
f kern/146394  net        [vlan] IP source address for outgoing connections
o bin/146377   net        [ppp] [tun] Interface doesn't clear addresses when PPP
o kern/146358  net        [vlan] wrong destination MAC address
o kern/146165  net        [wlan] [panic] Setting bssid in adhoc mode causes pani
o kern/146082  net        [ng_l2tp] a false invaliant check was performed in ng_
o kern/146037  net        [panic] mpd + CoA = kernel panic
o kern/145825  net        [panic] panic: soabort: so_count
o kern/145728  net        [lagg] Stops working lagg between two servers.
p kern/145600  net        TCP/ECN behaves different to CE/CWR than ns2 reference
f kern/144917  net        [flowtable] [panic] flowtable crashes system [regressi
o kern/144882  net        MacBookPro =>4.1 does not connect to BSD in hostap wit
o kern/144874  net        [if_bridge] [patch] if_bridge frees mbuf after pfil ho
o conf/144700  net        [rc.d] async dhclient breaks stuff for too many people
o kern/144616  net        [nat] [panic] ip_nat panic FreeBSD 7.2
f kern/144315  net        [ipfw] [panic] freebsd 8-stable reboot after add ipfw
o kern/144231  net        bind/connect/sendto too strict about sockaddr length
o kern/143846  net        [gif] bringing gif3 tunnel down causes gif0 tunnel to
s kern/143673  net        [stf] [request] there should be a way to support multi
s kern/143666  net        [ip6] [request] PMTU black hole detection not implemen
o kern/143622  net        [pfil] [patch] unlock pfil lock while calling firewall
o kern/143593  net        [ipsec] When using IPSec, tcpdump doesn't show outgoin
o kern/143591  net        [ral] RT2561C-based DLink card (DWL-510) fails to work
o kern/143208  net        [ipsec] [gif] IPSec over gif interface not working
o kern/143034  net        [panic] system reboots itself in tcp code [regression]
o kern/142877  net        [hang] network-related repeatable 8.0-STABLE hard hang
o kern/142774  net        Problem with outgoing connections on interface with mu
o kern/142772  net        [libc] lla_lookup: new lle malloc failed
f kern/142518  net        [em] [lagg] Problem on 8.0-STABLE with em and lagg
o kern/142018  net        [iwi] [patch] Possibly wrong interpretation of beacon-
o kern/141861  net        [wi] data garbled with WEP and wi(4) with Prism 2.5
f kern/141741  net        Etherlink III NIC won't work after upgrade to FBSD 8,
o kern/140742  net        rum(4) Two asus-WL167G adapters cannot talk to each ot
o kern/140682  net        [netgraph] [panic] random panic in netgraph
f kern/140634  net        [vlan] destroying if_lagg interface with if_vlan membe
o kern/140619  net        [ifnet] [patch] refine obsolete if_var.h comments desc
o kern/140346  net        [wlan] High bandwidth use causes loss of wlan connecti
o kern/140142  net        [ip6] [panic] FreeBSD 7.2-amd64 panic w/IPv6
o kern/140066  net        [bwi] install report for 8.0 RC 2 (multiple problems)
o kern/139565  net        [ipfilter] ipfilter ioctl SIOCDELST broken
o kern/139387  net        [ipsec] Wrong lenth of PF_KEY messages in promiscuous
o bin/139346   net        [patch] arp(8) add option to remove static entries lis
o kern/139268  net        [if_bridge] [patch] allow if_bridge to forward just VL
p kern/139204  net        [arp] DHCP server replies rejected, ARP entry lost bef
o kern/139117  net        [lagg] + wlan boot timing (EBUSY)
o kern/139058  net        [ipfilter] mbuf cluster leak on FreeBSD 7.2
o kern/138850  net        [dummynet] dummynet doesn't work correctly on a bridge
o kern/138782  net        [panic] sbflush_internal: cc 0 || mb 0xffffff004127b00
o kern/138688  net        [rum] possibly broken on 8 Beta 4 amd64: able to wpa a
o kern/138678  net        [lo] FreeBSD does not assign linklocal address to loop
o kern/138407  net        [gre] gre(4) interface does not come up after reboot
o kern/138332  net        [tun] [lor] ifconfig tun0 destroy causes LOR if_adata/
o kern/138266  net        [panic] kernel panic when udp benchmark test used as r
o kern/138177  net        [ipfilter] FreeBSD crashing repeatedly in ip_nat.c:257
f kern/138029  net        [bpf] [panic] periodically kernel panic and reboot
o kern/137881  net        [netgraph] [panic] ng_pppoe fatal trap 12
p bin/137841   net        [patch] wpa_supplicant(8) cannot verify SHA256 signed
p kern/137776  net        [rum] panic in rum(4) driver on 8.0-BETA2
o bin/137641   net        ifconfig(8): various problems with "vlan_device.vlan_i
o kern/137392  net        [ip] [panic] crash in ip_nat.c line 2577
o kern/137372  net        [ral] FreeBSD doesn't support wireless interface from
o kern/137089  net        [lagg] lagg falsely triggers IPv6 duplicate address de
o bin/136994   net        [patch] ifconfig(8) print carp mac address
o kern/136911  net        [netgraph] [panic] system panic on kldload ng_bpf.ko t
o kern/136618  net        [pf][stf] panic on cloning interface without unit numb
o kern/135502  net        [periodic] Warning message raised by rtfree function i
o kern/134583  net        [hang] Machine with jail freezes after random amount o
o kern/134531  net        [route] [panic] kernel crash related to routes/zebra
o kern/134157  net        [dummynet] dummynet loads cpu for 100% and make a syst
o kern/133969  net        [dummynet] [panic] Fatal trap 12: page fault while in
o kern/133968  net        [dummynet] [panic] dummynet kernel panic
o kern/133736  net        [udp] ip_id not protected ...
o kern/133595  net        [panic] Kernel Panic at pcpu.h:195
o kern/133572  net        [ppp] [hang] incoming PPTP connection hangs the system
o kern/133490  net        [bpf] [panic] 'kmem_map too small' panic on Dell r900
o kern/133235  net        [netinet] [patch] Process SIOCDLIFADDR command incorre
f kern/133213  net        arp and sshd errors on 7.1-PRERELEASE
o kern/133060  net        [ipsec] [pfsync] [panic] Kernel panic with ipsec + pfs
o kern/132889  net        [ndis] [panic] NDIS kernel crash on load BCM4321 AGN d
o conf/132851  net        [patch] rc.conf(5): allow to setfib(1) for service run
o kern/132734  net        [ifmib] [panic] panic in net/if_mib.c
o kern/132705  net        [libwrap] [patch] libwrap - infinite loop if hosts.all
o kern/132672  net        [ndis] [panic] ndis with rt2860.sys causes kernel pani
o kern/132554  net        [ipl] There is no ippool start script/ipfilter magic t
o kern/132354  net        [nat] Getting some packages to ipnat(8) causes crash
o kern/132277  net        [crypto] [ipsec] poor performance using cryptodevice f
o kern/131781  net        [ndis] ndis keeps dropping the link
o kern/131776  net        [wi] driver fails to init
o kern/131753  net        [altq] [panic] kernel panic in hfsc_dequeue
o kern/131601  net        [ipfilter] [panic] 7-STABLE panic in nat_finalise (tcp
o bin/131365   net        route(8): route add changes interpretation of network
f kern/130820  net        [ndis] wpa_supplicant(8) returns 'no space on device'
o kern/130628  net        [nfs] NFS / rpc.lockd deadlock on 7.1-R
o conf/130555  net        [rc.d] [patch] No good way to set ipfilter variables a
o kern/130525  net        [ndis] [panic] 64 bit ar5008 ndisgen-erated driver cau
o kern/130311  net        [wlan_xauth] [panic] hostapd restart
causing kernel pa o kern/130109 net [ipfw] Can not set fib for packets originated from loc f kern/130059 net [panic] Leaking 50k mbufs/hour f kern/129719 net [nfs] [panic] Panic during shutdown, tcp_ctloutput: in o kern/129517 net [ipsec] [panic] double fault / stack overflow f kern/129508 net [carp] [panic] Kernel panic with EtherIP (may be relat o kern/129219 net [ppp] Kernel panic when using kernel mode ppp o kern/129197 net [panic] 7.0 IP stack related panic o bin/128954 net ifconfig(8) deletes valid routes o bin/128602 net [an] wpa_supplicant(8) crashes with an(4) o kern/128448 net [nfs] 6.4-RC1 Boot Fails if NFS Hostname cannot be res o bin/128295 net [patch] ifconfig(8) does not print TOE4 or TOE6 capabi o bin/128001 net wpa_supplicant(8), wlan(4), and wi(4) issues o kern/127826 net [iwi] iwi0 driver has reduced performance and connecti o kern/127815 net [gif] [patch] if_gif does not set vlan attributes from o kern/127724 net [rtalloc] rtfree: 0xc5a8f870 has 1 refs f bin/127719 net [arp] arp: Segmentation fault (core dumped) f kern/127528 net [icmp]: icmp socket receives icmp replies not owned by p kern/127360 net [socket] TOE socket options missing from sosetopt() o bin/127192 net routed(8) removes the secondary alias IP of interface f kern/127145 net [wi]: prism (wi) driver crash at bigger traffic o kern/126895 net [patch] [ral] Add antenna selection (marked as TBD) o kern/126874 net [vlan]: Zebra problem if ifconfig vlanX destroy o kern/126695 net rtfree messages and network disruption upon use of if_ o kern/126339 net [ipw] ipw driver drops the connection o kern/126075 net [inet] [patch] internet control accesses beyond end of o bin/125922 net [patch] Deadlock in arp(8) o kern/125920 net [arp] Kernel Routing Table loses Ethernet Link status o kern/125845 net [netinet] [patch] tcp_lro_rx() should make use of hard o kern/125258 net [socket] socket's SO_REUSEADDR option does not work o kern/125239 net [gre] kernel crash when using gre o kern/124341 net 
[ral] promiscuous mode for wireless device ral0 looses o kern/124225 net [ndis] [patch] ndis network driver sometimes loses net o kern/124160 net [libc] connect(2) function loops indefinitely o kern/124021 net [ip6] [panic] page fault in nd6_output() o kern/123968 net [rum] [panic] rum driver causes kernel panic with WPA. o kern/123892 net [tap] [patch] No buffer space available o kern/123890 net [ppp] [panic] crash & reboot on work with PPP low-spee o kern/123858 net [stf] [patch] stf not usable behind a NAT o kern/123796 net [ipf] FreeBSD 6.1+VPN+ipnat+ipf: port mapping does not o kern/123758 net [panic] panic while restarting net/freenet6 o bin/123633 net ifconfig(8) doesn't set inet and ether address in one o kern/123559 net [iwi] iwi periodically disassociates/associates [regre o bin/123465 net [ip6] route(8): route add -inet6 -interfac o kern/123463 net [ipsec] [panic] repeatable crash related to ipsec-tool o conf/123330 net [nsswitch.conf] Enabling samba wins in nsswitch.conf c o kern/123160 net [ip] Panic and reboot at sysctl kern.polling.enable=0 o kern/122989 net [swi] [panic] 6.3 kernel panic in swi1: net o kern/122954 net [lagg] IPv6 EUI64 incorrectly chosen for lagg devices f kern/122780 net [lagg] tcpdump on lagg interface during high pps wedge o kern/122685 net It is not visible passing packets in tcpdump(1) o kern/122319 net [wi] imposible to enable ad-hoc demo mode with Orinoco o kern/122290 net [netgraph] [panic] Netgraph related "kmem_map too smal o kern/122252 net [ipmi] [bge] IPMI problem with BCM5704 (does not work o kern/122033 net [ral] [lor] Lock order reversal in ral0 at bootup ieee o bin/121895 net [patch] rtsol(8)/rtsold(8) doesn't handle managed netw s kern/121774 net [swi] [panic] 6.3 kernel panic in swi1: net o kern/121555 net [panic] Fatal trap 12: current process = 12 (swi1: net o kern/121443 net [gif] [lor] icmp6_input/nd6_lookup o kern/121437 net [vlan] Routing to layer-2 address does not work on VLA o bin/121359 net [patch] 
[security] ppp(8): fix local stack overflow in o kern/121257 net [tcp] TSO + natd -> slow outgoing tcp traffic o kern/121181 net [panic] Fatal trap 3: breakpoint instruction fault whi o kern/120966 net [rum] kernel panic with if_rum and WPA encryption o kern/120566 net [request]: ifconfig(8) make order of arguments more fr o kern/120304 net [netgraph] [patch] netgraph source assumes 32-bit time o kern/120266 net [udp] [panic] gnugk causes kernel panic when closing U o bin/120060 net routed(8) deletes link-level routes in the presence of o kern/119945 net [rum] [panic] rum device in hostap mode, cause kernel o kern/119791 net [nfs] UDP NFS mount of aliased IP addresses from a Sol o kern/119617 net [nfs] nfs error on wpa network when reseting/shutdown f kern/119516 net [ip6] [panic] _mtx_lock_sleep: recursed on non-recursi o kern/119432 net [arp] route add -host -iface causes arp e o kern/119225 net [wi] 7.0-RC1 no carrier with Prism 2.5 wifi card [regr o kern/118727 net [netgraph] [patch] [request] add new ng_pf module o kern/117423 net [vlan] Duplicate IP on different interfaces o bin/117339 net [patch] route(8): loading routing management commands o bin/116643 net [patch] [request] fstat(1): add INET/INET6 socket deta o kern/116185 net [iwi] if_iwi driver leads system to reboot o kern/115239 net [ipnat] panic with 'kmem_map too small' using ipnat o kern/115019 net [netgraph] ng_ether upper hook packet flow stops on ad o kern/115002 net [wi] if_wi timeout. failed allocation (busy bit). 
ifco o kern/114915 net [patch] [pcn] pcn (sys/pci/if_pcn.c) ethernet driver f o kern/113432 net [ucom] WARNING: attempt to net_add_domain(netgraph) af o kern/112722 net [ipsec] [udp] IP v4 udp fragmented packet reject o kern/112686 net [patm] patm driver freezes System (FreeBSD 6.2-p4) i38 o bin/112557 net [patch] ppp(8) lock file should not use symlink name o kern/112528 net [nfs] NFS over TCP under load hangs with "impossible p o kern/111537 net [inet6] [patch] ip6_input() treats mbuf cluster wrong o kern/111457 net [ral] ral(4) freeze o kern/110284 net [if_ethersubr] Invalid Assumption in SIOCSIFADDR in et o kern/110249 net [kernel] [regression] [patch] setsockopt() error regre o kern/109470 net [wi] Orinoco Classic Gold PC Card Can't Channel Hop o bin/108895 net pppd(8): PPPoE dead connections on 6.2 [regression] o kern/107944 net [wi] [patch] Forget to unlock mutex-locks o conf/107035 net [patch] bridge(8): bridge interface given in rc.conf n o kern/106444 net [netgraph] [panic] Kernel Panic on Binding to an ip to o kern/106316 net [dummynet] dummynet with multipass ipfw drops packets o kern/105945 net Address can disappear from network interface s kern/105943 net Network stack may modify read-only mbuf chain copies o bin/105925 net problems with ifconfig(8) and vlan(4) [regression] o kern/104851 net [inet6] [patch] On link routes not configured when usi o kern/104751 net [netgraph] kernel panic, when getting info about my tr o kern/103191 net Unpredictable reboot o kern/103135 net [ipsec] ipsec with ipfw divert (not NAT) encodes a pac o kern/102540 net [netgraph] [patch] supporting vlan(4) by ng_fec(4) o conf/102502 net [netgraph] [patch] ifconfig name does't rename netgrap o kern/102035 net [plip] plip networking disables parallel port printing o kern/101948 net [ipf] [panic] Kernel Panic Trap No 12 Page Fault - cau o kern/100709 net [libc] getaddrinfo(3) should return TTL info o kern/100519 net [netisr] suggestion to fix suboptimal network polling o 
kern/98978 net [ipf] [patch] ipfilter drops OOW packets under 6.1-Rel o kern/98597 net [inet6] Bug in FreeBSD 6.1 IPv6 link-local DAD procedu o bin/98218 net wpa_supplicant(8) blacklist not working o kern/97306 net [netgraph] NG_L2TP locks after connection with failed o conf/97014 net [gif] gifconfig_gif? in rc.conf does not recognize IPv f kern/96268 net [socket] TCP socket performance drops by 3000% if pack o kern/95519 net [ral] ral0 could not map mbuf o kern/95288 net [pppd] [tty] [panic] if_ppp panic in sys/kern/tty_subr o kern/95277 net [netinet] [patch] IP Encapsulation mask_match() return o kern/95267 net packet drops periodically appear f kern/93378 net [tcp] Slow data transfer in Postfix and Cyrus IMAP (wo o kern/93019 net [ppp] ppp and tunX problems: no traffic after restarti o kern/92880 net [libc] [patch] almost rewritten inet_network(3) functi s kern/92279 net [dc] Core faults everytime I reboot, possible NIC issu o kern/91859 net [ndis] if_ndis does not work with Asus WL-138 s kern/91777 net [ipf] [patch] wrong behaviour with skip rule inside an o kern/91364 net [ral] [wep] WF-511 RT2500 Card PCI and WEP o kern/91311 net [aue] aue interface hanging o kern/87521 net [ipf] [panic] using ipfilter "auth" keyword leads to k o kern/87421 net [netgraph] [panic]: ng_ether + ng_eiface + if_bridge o kern/86871 net [tcp] [patch] allocation logic for PCBs in TIME_WAIT s o kern/86427 net [lor] Deadlock with FASTIPSEC and nat o kern/86103 net [ipf] Illegal NAT Traversal in IPFilter o kern/85780 net 'panic: bogus refcnt 0' in routing/ipv6 o bin/85445 net ifconfig(8): deprecated keyword to ifconfig inoperativ p kern/85320 net [gre] [patch] possible depletion of kernel stack in ip o bin/82975 net route change does not parse classfull network as given o kern/82881 net [netgraph] [panic] ng_fec(4) causes kernel panic after o kern/82468 net Using 64MB tcp send/recv buffers, trafficflow stops, i o bin/82185 net [patch] ndp(8) can delete the incorrect entry o kern/81095 
net IPsec connection stops working if associated network i o kern/78968 net FreeBSD freezes on mbufs exhaustion (network interface o kern/78090 net [ipf] ipf filtering on bridged packets doesn't work if o kern/77341 net [ip6] problems with IPV6 implementation s kern/77195 net [ipf] [patch] ipfilter ioctl SIOCGNATL does not match o kern/75873 net Usability problem with non-RFC-compliant IP spoof prot s kern/75407 net [an] an(4): no carrier after short time a kern/71474 net [route] route lookup does not skip interfaces marked d o kern/71469 net default route to internet magically disappears with mu o kern/70904 net [ipf] ipfilter ipnat problem with h323 proxy support o kern/68889 net [panic] m_copym, length > size of mbuf chain o kern/66225 net [netgraph] [patch] extend ng_eiface(4) control message o kern/65616 net IPSEC can't detunnel GRE packets after real ESP encryp s kern/60293 net [patch] FreeBSD arp poison patch a kern/56233 net IPsec tunnel (ESP) over IPv6: MTU computation is wrong s bin/41647 net ifconfig(8) doesn't accept lladdr along with inet addr o kern/39937 net ipstealth issue a kern/38554 net [patch] changing interface ipaddress doesn't seem to w o kern/34665 net [ipf] [hang] ipfilter rcmd proxy "hangs". o kern/31940 net ip queue length too short for >500kpps o kern/31647 net [libc] socket calls can return undocumented EINVAL o kern/30186 net [libc] getaddrinfo(3) does not handle incorrect servna o kern/27474 net [ipf] [ppp] Interactive use of user PPP and ipfilter c f kern/24959 net [patch] proper TCP_NOPUSH/TCP_CORK compatibility o conf/23063 net [arp] [patch] for static ARP tables in rc.network o kern/21998 net [socket] [patch] ident only for outgoing connections o kern/5877 net [socket] sb_cc counts control data as well as data dat 451 problems total. 
From owner-freebsd-net@FreeBSD.ORG Mon Mar 4 16:35:44 2013 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id EA51BE19 for ; Mon, 4 Mar 2013 16:35:44 +0000 (UTC) (envelope-from andre@freebsd.org) Received: from c00l3r.networx.ch (c00l3r.networx.ch [62.48.2.2]) by mx1.freebsd.org (Postfix) with ESMTP id 43B29721 for ; Mon, 4 Mar 2013 16:35:43 +0000 (UTC) Received: (qmail 34361 invoked from network); 4 Mar 2013 17:49:43 -0000 Received: from c00l3r.networx.ch (HELO [127.0.0.1]) ([62.48.2.2]) (envelope-sender ) by c00l3r.networx.ch (qmail-ldap-1.03) with SMTP for ; 4 Mar 2013 17:49:43 -0000 Message-ID: <5134CD5D.6090107@freebsd.org> Date: Mon, 04 Mar 2013 17:35:41 +0100 From: Andre Oppermann User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:17.0) Gecko/20130107 Thunderbird/17.0.2 MIME-Version: 1.0 To: Lawrence Stewart Subject: Re: Bug in sbsndptr() References: <512CBADB.3050004@freebsd.org> In-Reply-To: <512CBADB.3050004@freebsd.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: "freebsd-net@freebsd.org" X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 04 Mar 2013 16:35:45 -0000 On 26.02.2013 14:38, Lawrence Stewart wrote: > Hi Andre, Hi Lawrence, :-) > A colleague and I spent a very frustrating day tracing an accounting bug > in the multipath TCP patch we're working on at CAIA to a bug in > sbsndptr(). 
> I haven't tested it with regular TCP yet, but I believe the
> following patch fixes the bug (proposed commit log message is at the top
> of the patch):
>
> http://people.freebsd.org/~lstewart/patches/misctcp/sbsndptr_mnext_10.x.r247314.diff
>
> The patch should have no tangible effect to operation other than to
> ensure the function delivers on the promise to return the closest mbuf
> in the chain for the given offset.

I agree that the description of sbsndptr() can be misleading, as it refers
to the point in time when the pointer was last updated. Relative to now, the
real offset may be at the beginning of the next mbuf. As you note in the
proposed commit message, by the time the send pointer is calculated we may
have reached the end of the chain and must avoid storing a NULL pointer.

The mbuf copy routines simply skip over the additional mbuf in the chain
using the returned offset.

I wonder how this has caused trouble with your multipath patch. You'd have
to copy the sockbuf contents as well, and unless you're using custom sockbuf
and mbuf chain functions this shouldn't be a problem. Using custom functions
on a socket buffer is a delicate approach. For a sockbuf consumer, being able
to handle valid offsets into an mbuf chain is a core, must-have part of the
functionality.

> I would appreciate a review and any thoughts.

I think you have found a valid (micro-)optimization. However, you're still
making a dangerous assumption in that the next mbuf is indeed the one you
want. This may not be true in subtle ways when the chain contains m_len=0
mbufs. I'm not aware of it actually happening, but it can't be ruled out
either if custom sockbuf manipulation functions are in use.

I'd recommend the following: have your custom sockbuf function handle forward
seeking like the other m_copy() functions; and/or apply a patch along the
lines of the (untested) example below.
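
Independent of the kernel patch that follows, the forward-seeking idea can be
sketched in plain userland C. This is only an illustration under stated
assumptions: struct mbuf_sketch and chain_seek() are made-up names, and the
struct is a two-field stand-in for the real struct mbuf, not kernel code.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's struct mbuf (hypothetical). */
struct mbuf_sketch {
	struct mbuf_sketch *m_next;	/* next buffer in the chain, or NULL */
	int m_len;			/* bytes of data held in this buffer */
};

/*
 * Walk forward from m until *off falls inside the current buffer or the
 * chain ends.  Buffers with m_len == 0 are stepped over as long as a
 * successor exists.  On return *off is relative to the returned buffer.
 * If the chain is too short, the last buffer is returned and *off is
 * still >= its m_len, so the caller must check for that case.
 */
static struct mbuf_sketch *
chain_seek(struct mbuf_sketch *m, int *off)
{
	assert(m != NULL && off != NULL && *off >= 0);

	while (*off >= m->m_len && m->m_next != NULL) {
		*off -= m->m_len;
		m = m->m_next;
	}
	return (m);
}
```

With a chain of lengths 3, 0 and 4, an offset of 5 resolves to the third
buffer at offset 2; the empty middle buffer is skipped rather than returned.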

Cheers
-- 
Andre

Index: uipc_sockbuf.c
===================================================================
--- uipc_sockbuf.c	(revision 247775)
+++ uipc_sockbuf.c	(working copy)
@@ -936,10 +936,17 @@
 		return (sb->sb_mb);
 	}
 
-	/* Return closest mbuf in chain for current offset. */
+	/* Return closest known mbuf in chain for current offset. */
 	*moff = off - sb->sb_sndptroff;
 	m = ret = sb->sb_sndptr ? sb->sb_sndptr : sb->sb_mb;
 
+	/* Possibly seek forward to return the closest mbuf to the offset. */
+	while (*moff >= m->m_len && ret->m_next != NULL) {
+		*moff -= m->m_len;
+		ret = m->m_next;
+	}
+	KASSERT(*moff != NULL, ("%s: moff is NULL", __func__));
+
 	/* Advance by len to be as close as possible for the next transmit. */
 	for (off = off - sb->sb_sndptroff + len - 1;
 	    off > 0 && m != NULL && off >= m->m_len;

From owner-freebsd-net@FreeBSD.ORG Mon Mar 4 16:41:57 2013 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id E9FBA199 for ; Mon, 4 Mar 2013 16:41:57 +0000 (UTC) (envelope-from ncrogers@gmail.com) Received: from mail-vc0-f182.google.com (mail-vc0-f182.google.com [209.85.220.182]) by mx1.freebsd.org (Postfix) with ESMTP id A6CA276F for ; Mon, 4 Mar 2013 16:41:57 +0000 (UTC) Received: by mail-vc0-f182.google.com with SMTP id fl17so3540197vcb.27 for ; Mon, 04 Mar 2013 08:41:51 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:x-received:in-reply-to:references:date:message-id :subject:from:to:cc:content-type; bh=mLMCDuQOvB/E4N3J5JAoWAtZESzDmQ0eMdCy8LDaTpk=; b=Ai+dsVLGM4sadpofEti2R+umDi6eRcRvbOLpcpZ3r2ZAbR6PX5NeRfoTdiNql6XQtS 7qt6aX+isQwXsSl5J8d0ZGx9jfChAZ0nAkllsUHQlHWoHBcPaZfkxjQ1JFGx6xyjCdI3 9L0IY7K9ayEwkrcP5BzGp7Z9BVKZ5RcMcAMooIhmb7U2yzZxourWzLHqXfCC4z8KfImK U/Vca9h7NVTDIfbypayo/k68tVVyl8/vGcPylZnoFhyYZ8xhMWb3spe3CAwdJAqrI/SO sTGPe/rc3BjAt4OKkvXio921IxTn6XzIhHvZXqMD4am2d1RJLpEofc6+JToJQNhdqyns uXVg==
MIME-Version: 1.0 X-Received: by 10.220.219.73 with SMTP id ht9mr7873390vcb.47.1362415311147; Mon, 04 Mar 2013 08:41:51 -0800 (PST) Received: by 10.52.176.131 with HTTP; Mon, 4 Mar 2013 08:41:51 -0800 (PST) In-Reply-To: References: <512BAA60.3060703@biostat.wisc.edu> <512BAF8D.7080308@biostat.wisc.edu> Date: Mon, 4 Mar 2013 08:41:51 -0800 Message-ID: Subject: Re: igb network lockups From: Nick Rogers To: Sepherosa Ziehau Content-Type: text/plain; charset=ISO-8859-1 Cc: "freebsd-net@freebsd.org" , Jack Vogel , "Christopher D. Harrison" X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 04 Mar 2013 16:41:58 -0000 On Sun, Mar 3, 2013 at 2:14 AM, Sepherosa Ziehau wrote: > On Sat, Mar 2, 2013 at 12:18 AM, Nick Rogers wrote: >> On Fri, Mar 1, 2013 at 8:04 AM, Nick Rogers wrote: >>> FWIW I have been experiencing a similar issue on a number of systems >>> using the em(4) driver under 9.1-RELEASE. This is after upgrading from >>> a snapshot of 8.3-STABLE. My systems use PF+ALTQ as well. The symptoms >>> are: interface stops passing traffic until the system is rebooted. I >>> have not yet been able to gain access to the systems to dig around >>> (after they have crashed), however my kernel/network settings are >>> properly tuned (high mbuf limit, hw.em.rxd/txd=4096, etc). It seems to >>> happen about once a day on systems with around a sustained 50Mb/s of >>> traffic. >>> >>> I realize this is not much to go on but perhaps it helps. I am >>> debating trying the e1000 driver in the latest CURRENT on top of >>> 9.1-RELEASE. I noticed the Intel shared code was updated about a week >>> ago. Would this change or perhaps another change to e1000 since >>> 9.1-RELEASE possibly affect stability in a positive way? >>> >>> Thanks. 
>> >> Heres relevant pciconf output: >> >> em0@pci0:1:0:0: class=0x020000 card=0x10d315d9 chip=0x10d38086 rev=0x00 hdr=0x00 >> vendor = 'Intel Corporation' >> device = '82574L Gigabit Network Connection' >> class = network >> subclass = ethernet >> cap 01[c8] = powerspec 2 supports D0 D3 current D0 >> cap 05[d0] = MSI supports 1 message, 64 bit >> cap 10[e0] = PCI-Express 1 endpoint max data 128(256) link x1(x1) >> cap 11[a0] = MSI-X supports 5 messages in map 0x1c enabled >> ecap 0001[100] = AER 1 0 fatal 0 non-fatal 1 corrected > > For 82574L, i.e. supported by em(4), MSI-X must _not_ be enabled; it > is simply broken (you could check 82574 errata on Intel's website to > confirm what I have said here). Thanks. So on FreeBSD 9.1-RELEASE it is advisable to set hw.em.enable_msix=0 for 82574L? Are there other em(x) NICs where this is advisable? > > For 82575, i.e. supported by igb(4), MSI-X must _not_ be enabled; it > is simply broken (you could check 82575 errata on Intel's website to > confirm what I have said here). > > Best Regards, > sephe > > -- > Tomorrow Will Never Die > > On Sat, Mar 2, 2013 at 12:18 AM, Nick Rogers wrote: >> On Fri, Mar 1, 2013 at 8:04 AM, Nick Rogers wrote: >>> FWIW I have been experiencing a similar issue on a number of systems >>> using the em(4) driver under 9.1-RELEASE. This is after upgrading from >>> a snapshot of 8.3-STABLE. My systems use PF+ALTQ as well. The symptoms >>> are: interface stops passing traffic until the system is rebooted. I >>> have not yet been able to gain access to the systems to dig around >>> (after they have crashed), however my kernel/network settings are >>> properly tuned (high mbuf limit, hw.em.rxd/txd=4096, etc). It seems to >>> happen about once a day on systems with around a sustained 50Mb/s of >>> traffic. >>> >>> I realize this is not much to go on but perhaps it helps. I am >>> debating trying the e1000 driver in the latest CURRENT on top of >>> 9.1-RELEASE. 
I noticed the Intel shared code was updated about a week >>> ago. Would this change or perhaps another change to e1000 since >>> 9.1-RELEASE possibly affect stability in a positive way? >>> >>> Thanks. >> >> Heres relevant pciconf output: >> >> em0@pci0:1:0:0: class=0x020000 card=0x10d315d9 chip=0x10d38086 rev=0x00 hdr=0x00 >> vendor = 'Intel Corporation' >> device = '82574L Gigabit Network Connection' >> class = network >> subclass = ethernet >> cap 01[c8] = powerspec 2 supports D0 D3 current D0 >> cap 05[d0] = MSI supports 1 message, 64 bit >> cap 10[e0] = PCI-Express 1 endpoint max data 128(256) link x1(x1) >> cap 11[a0] = MSI-X supports 5 messages in map 0x1c enabled >> ecap 0001[100] = AER 1 0 fatal 0 non-fatal 1 corrected >> em1@pci0:2:0:0: class=0x020000 card=0x10d315d9 chip=0x10d38086 rev=0x00 hdr=0x00 >> vendor = 'Intel Corporation' >> device = '82574L Gigabit Network Connection' >> class = network >> subclass = ethernet >> cap 01[c8] = powerspec 2 supports D0 D3 current D0 >> cap 05[d0] = MSI supports 1 message, 64 bit >> cap 10[e0] = PCI-Express 1 endpoint max data 128(256) link x1(x1) >> cap 11[a0] = MSI-X supports 5 messages in map 0x1c enabled >> ecap 0001[100] = AER 1 0 fatal 0 non-fatal 1 corrected >> em2@pci0:7:0:0: class=0x020000 card=0x10d315d9 chip=0x10d38086 rev=0x00 hdr=0x00 >> vendor = 'Intel Corporation' >> device = '82574L Gigabit Network Connection' >> class = network >> subclass = ethernet >> cap 01[c8] = powerspec 2 supports D0 D3 current D0 >> cap 05[d0] = MSI supports 1 message, 64 bit >> cap 10[e0] = PCI-Express 1 endpoint max data 128(256) link x1(x1) >> cap 11[a0] = MSI-X supports 5 messages in map 0x1c enabled >> ecap 0001[100] = AER 1 0 fatal 0 non-fatal 1 corrected >> em3@pci0:8:0:0: class=0x020000 card=0x10d315d9 chip=0x10d38086 rev=0x00 hdr=0x00 >> vendor = 'Intel Corporation' >> device = '82574L Gigabit Network Connection' >> class = network >> subclass = ethernet >> cap 01[c8] = powerspec 2 supports D0 D3 current D0 >> cap 
05[d0] = MSI supports 1 message, 64 bit >> cap 10[e0] = PCI-Express 1 endpoint max data 128(256) link x1(x1) >> cap 11[a0] = MSI-X supports 5 messages in map 0x1c enabled >> ecap 0001[100] = AER 1 0 fatal 0 non-fatal 1 corrected >> >> >>> >>> On Mon, Feb 25, 2013 at 10:45 AM, Jack Vogel wrote: >>>> Have you done any poking around, looking at stats to determine why the >>>> hangs? For instance, >>>> might your mbuf pool be depleted? Some other network resource perhaps? >>>> >>>> Jack >>>> >>>> >>>> On Mon, Feb 25, 2013 at 10:38 AM, Christopher D. Harrison < >>>> harrison@biostat.wisc.edu> wrote: >>>> >>>>> Sure, >>>>> The problem appears on both systems running with ALTQ and vanilla. >>>>> -C >>>>> >>>>> On 02/25/13 12:29, Jack Vogel wrote: >>>>> >>>>> I've not heard of this problem, but I think most users do not use ALTQ, >>>>> and we (Intel) do not >>>>> test using it. Can it be eliminated from the equation? >>>>> >>>>> Jack >>>>> >>>>> >>>>> On Mon, Feb 25, 2013 at 10:16 AM, Christopher D. Harrison < >>>>> harrison@biostat.wisc.edu> wrote: >>>>> >>>>>> I recently have been experiencing network "freezes" and network "lockups" >>>>>> on our Freebsd 9.1 systems which are running zfs and nfs file servers. >>>>>> I upgraded from 9.0 to 9.1 about 2 months ago and we have been having >>>>>> issues with almost bi-monthly. The issue manifests in the system becomes >>>>>> unresponsive to any/all nfs clients. The system is not resource bound as >>>>>> our I/O is low to disk and our network is usually in the 20mbit/40mbit >>>>>> range. We do notice a correlation between temporary i/o spikes and >>>>>> network freezes but not enough to send our system in to "lockup" mode for >>>>>> the next 5min. 
Currently we have 4 igb nics in 2 aggr's with 8 queue's >>>>>> per nic and our dev.igb reports: >>>>>> >>>>>> dev.igb.3.%desc: Intel(R) PRO/1000 Network Connection version - 2.3.4 >>>>>> >>>>>> I am almost certain the problem is with the ibg driver as a friend is >>>>>> also experiencing the same problem with the same intel igb nic. He has >>>>>> addressed the issue by restarting the network using netif on his systems. >>>>>> According to my friend, once the network interfaces get cleared, everything >>>>>> comes back and starts working as expected. >>>>>> >>>>>> I have noticed an issue with the igb driver and I was looking for >>>>>> thoughts on how to help address this problem. >>>>>> >>>>>> http://freebsd.1045724.n5.nabble.com/em-igb-if-transmit-drbr-and-ALTQ-td5760338.html >>>>>> >>>>>> Thoughts/Ideas are greatly appreciated!!! >>>>>> >>>>>> -C >>>>>> >>>>>> _______________________________________________ >>>>>> freebsd-net@freebsd.org mailing list >>>>>> http://lists.freebsd.org/mailman/listinfo/freebsd-net >>>>>> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" >>>>>> >>>>> >>>>> >>>>> >>>> _______________________________________________ >>>> freebsd-net@freebsd.org mailing list >>>> http://lists.freebsd.org/mailman/listinfo/freebsd-net >>>> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" >> _______________________________________________ >> freebsd-net@freebsd.org mailing list >> http://lists.freebsd.org/mailman/listinfo/freebsd-net >> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" > > > > -- > Tomorrow Will Never Die From owner-freebsd-net@FreeBSD.ORG Mon Mar 4 18:16:19 2013 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 47D9665D for ; Mon, 4 Mar 2013 18:16:19 +0000 (UTC) (envelope-from jfvogel@gmail.com) Received: from mail-vc0-f177.google.com 
(mail-vc0-f177.google.com [209.85.220.177]) by mx1.freebsd.org (Postfix) with ESMTP id D939ECC0 for ; Mon, 4 Mar 2013 18:16:18 +0000 (UTC) Received: by mail-vc0-f177.google.com with SMTP id m18so3541785vcm.22 for ; Mon, 04 Mar 2013 10:16:18 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:x-received:in-reply-to:references:date:message-id :subject:from:to:cc:content-type; bh=CM8tmangjINKI2bTYR+D4LqV85uQDGow7zqBrVisZxw=; b=wL2eXN3K8hto9pQ40gwmt6IlX87odAVQAjIGI2/y/QEFjWb/nHQxwNa8B5Y/Q48Z5K c4V/PnZ7+dcs+Uqp5HA4po+t/2QXoT4F772I94MB30qBpzIsFvhAH/wFAZVMwZ0OfAYd j07UlMLrhq1enF92DOJ8OC20AVV8T04sLVMo/zjERe8u7Tf+CV9chvmEaFtADyOx/zd1 ea7gSK2ttlB50WTC3Ugk3HO3C7uCF2Dzq42GwksQ4laQfASqrCGHGoaqXSW/EhUBoHHF nHZwEGoH0438uUAAk/osv2l8Qx/vLRP8i52yq9XdMEtfAL3UwnMNsl4HNMGKesEVKpPj qmyQ== MIME-Version: 1.0 X-Received: by 10.58.214.231 with SMTP id od7mr8381659vec.44.1362420978031; Mon, 04 Mar 2013 10:16:18 -0800 (PST) Received: by 10.220.191.132 with HTTP; Mon, 4 Mar 2013 10:16:17 -0800 (PST) In-Reply-To: References: <512BAA60.3060703@biostat.wisc.edu> <512BAF8D.7080308@biostat.wisc.edu> Date: Mon, 4 Mar 2013 10:16:17 -0800 Message-ID: Subject: Re: igb network lockups From: Jack Vogel To: Sepherosa Ziehau Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.14 Cc: Nick Rogers , "freebsd-net@freebsd.org" , "Christopher D. Harrison" X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 04 Mar 2013 18:16:19 -0000 On Sun, Mar 3, 2013 at 2:14 AM, Sepherosa Ziehau wrote: ... > > > For 82574L, i.e. supported by em(4), MSI-X must _not_ be enabled; it > is simply broken (you could check 82574 errata on Intel's website to > confirm what I have said here). 
> If you actually check the errata you will find that it's not "simply broken". Furthermore, it most certainly SHOULD be enabled; it is by default in the Linux driver as well as mine. The issue is not with the 82574, it's with some system designs that have upstream PCIE problems. If you experience problems with a particular system, then we recommend disabling MSIX to determine if this hardware issue may be behind it. In most cases MSIX works just fine. > > For 82575, i.e. supported by igb(4), MSI-X must _not_ be enabled; it > is simply broken (you could check 82575 errata on Intel's website to > confirm what I have said here). > > The same issue obtains on the 82575; it's a system issue, and we have tested the part on our Reference systems in prolonged stress without any problem. So the same recommendation as above applies. Jack Vogel Intel Network Division From owner-freebsd-net@FreeBSD.ORG Mon Mar 4 18:22:40 2013 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 0A785857 for ; Mon, 4 Mar 2013 18:22:40 +0000 (UTC) (envelope-from jfvogel@gmail.com) Received: from mail-vc0-f170.google.com (mail-vc0-f170.google.com [209.85.220.170]) by mx1.freebsd.org (Postfix) with ESMTP id B5A58D14 for ; Mon, 4 Mar 2013 18:22:39 +0000 (UTC) Received: by mail-vc0-f170.google.com with SMTP id p16so3652890vcq.29 for ; Mon, 04 Mar 2013 10:22:33 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:x-received:in-reply-to:references:date:message-id :subject:from:to:cc:content-type; bh=Fib5MBKum0uhZMc7N0ASsM6cESlZHIDHixTm8hRSZ78=; b=sAcHMPMjpB/xcqCVTw1tu5i0lstURAtTTyQwGtu8VnQ7ees82yJ4/Fgy+8inbOX3gH oH5K4TbYJEdhq7kJcAHcKO3OUA+rjHlxTvESgH3mcO2sGQVdSLtRq9RpLTJO6pda2pp2 iEnX3iNrV7Yw1hO4eIYnMaLo273VWBsAqmFxb0gahwkFryqmgQP6QDx4c+yFzDaaS9Z8 QwdH24opM2UnH19zlBZuxxW68m1/bYis4S29yfuMOaSEoeSQcUIJvGn+gPM2luBfcuKY 
FJpWjR4HecKQaFI4zFNirFXMvtZT6qdsfntPdLOTr6D2rXC0O9sfVb/xxb40dtYvVqB4 /0jA== MIME-Version: 1.0 X-Received: by 10.220.153.143 with SMTP id k15mr8111012vcw.33.1362421353127; Mon, 04 Mar 2013 10:22:33 -0800 (PST) Received: by 10.220.191.132 with HTTP; Mon, 4 Mar 2013 10:22:32 -0800 (PST) In-Reply-To: References: <512BAA60.3060703@biostat.wisc.edu> <512BAF8D.7080308@biostat.wisc.edu> Date: Mon, 4 Mar 2013 10:22:32 -0800 Message-ID: Subject: Re: igb network lockups From: Jack Vogel To: Nick Rogers Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.14 Cc: Sepherosa Ziehau , "freebsd-net@freebsd.org" , "Christopher D. Harrison" X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 04 Mar 2013 18:22:40 -0000 > > Thanks. So on FreeBSD 9.1-RELEASE it is advisable to set > hw.em.enable_msix=0 for 82574L? Are there other em(x) NICs where this > is advisable? > > As I explained in a previous email, this is not advisable unless you are experiencing problems (like hangs). If you are, then it's one possible cause, so try falling back to MSI to see if it eliminates your problem. And the 82574 is the only device the em driver supports at present that is capable of MSIX; all others use the igb driver. 
Regards, Jack From owner-freebsd-net@FreeBSD.ORG Mon Mar 4 18:58:35 2013 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id C469C20E for ; Mon, 4 Mar 2013 18:58:35 +0000 (UTC) (envelope-from zbeeble@gmail.com) Received: from mail-ve0-f178.google.com (mail-ve0-f178.google.com [209.85.128.178]) by mx1.freebsd.org (Postfix) with ESMTP id 7865BE5C for ; Mon, 4 Mar 2013 18:58:35 +0000 (UTC) Received: by mail-ve0-f178.google.com with SMTP id db10so4987775veb.37 for ; Mon, 04 Mar 2013 10:58:34 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:x-received:in-reply-to:references:date:message-id :subject:from:to:cc:content-type; bh=hOzfVdUa4NIuj0zHKnleUyMv1C2irzEdfjdkArFukwI=; b=QBje+CyYniNWJIH3IF9aDu4EBQpoSuwtBe8rWNGrXMyTWRUOzaC88VwWkUFXtmmg3B gxyGyEifNTWbxAUXzp+OlHFuP00L+DRW4KAo5/+H1H8TselRvNxVc2euLlvjcnOdw0WU 3MCgWb/I/IcySCRsbZDum2lDl5WkemL5wpzBHaWflo8cjzHiPO5NEcsutz+s/fXkA3Pc vXjaENqmevDSceo8AYYkavka9fqof5ewamZJvZtPd6kvRkVXkKQbAOxLwmC1837LEXZH 4kDyzqROO2Eso5Qpy+fAOfcBO7tCxb8j+qwnCvjAJhosgtLcxIS8EOAbxJOZLkC6/gnM Rr0Q== MIME-Version: 1.0 X-Received: by 10.220.107.210 with SMTP id c18mr8280287vcp.5.1362423514653; Mon, 04 Mar 2013 10:58:34 -0800 (PST) Received: by 10.220.232.6 with HTTP; Mon, 4 Mar 2013 10:58:34 -0800 (PST) In-Reply-To: References: <512BAA60.3060703@biostat.wisc.edu> <512BAF8D.7080308@biostat.wisc.edu> Date: Mon, 4 Mar 2013 13:58:34 -0500 Message-ID: Subject: Re: igb network lockups From: Zaphod Beeblebrox To: Jack Vogel Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.14 Cc: Nick Rogers , Sepherosa Ziehau , "Christopher D. 
Harrison" , "freebsd-net@freebsd.org" X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 04 Mar 2013 18:58:35 -0000 For everyone having lockup problems with IGB, I'd like to ask if they could try disabling hyperthreads --- this worked for me on one system but has been unnecessary on others. From owner-freebsd-net@FreeBSD.ORG Mon Mar 4 20:13:55 2013 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 1B19DB44 for ; Mon, 4 Mar 2013 20:13:55 +0000 (UTC) (envelope-from ncrogers@gmail.com) Received: from mail-ve0-f181.google.com (mail-ve0-f181.google.com [209.85.128.181]) by mx1.freebsd.org (Postfix) with ESMTP id AD4551190 for ; Mon, 4 Mar 2013 20:13:54 +0000 (UTC) Received: by mail-ve0-f181.google.com with SMTP id d10so5026517vea.40 for ; Mon, 04 Mar 2013 12:13:48 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:x-received:in-reply-to:references:date:message-id :subject:from:to:cc:content-type; bh=S3Ck/6jUPhI7MikRgsek4u7CKU45PgCwVlbwCZbg6+A=; b=V75Eag5ZkSsaoz7FFPnrF93rOz0Gswj5oDLWy/fouoBHBmfCGChhwyljARq564Y9hA STjAhJCmcok9WQoF8rsENIHatGGTIrXFSaBBXd5c3JToxJPxkp3j/YP92D/PqNAa1cqi GOHNJxy0aFHbVf2y3RF7ZFvkQweWW+5X5hcK50N8ZwzUQQ6uhgLT1fOnIJ+2fO3qAecm o7U5r39M28d5Q/Rsau7Fv/jdA5wCVaH9Qr20KY2krZoMbybjaAv5IkROj0sZtPOJ5hIp KbTTTqa1lYMXlnjNxTqmn6ZZk/RJVlGQaFhakvog6yV+oRKB+p06GQ/aw7m2SJReBSBA sdsg== MIME-Version: 1.0 X-Received: by 10.52.16.40 with SMTP id c8mr7263867vdd.99.1362428028434; Mon, 04 Mar 2013 12:13:48 -0800 (PST) Received: by 10.52.176.131 with HTTP; Mon, 4 Mar 2013 12:13:48 -0800 (PST) In-Reply-To: References: <512BAA60.3060703@biostat.wisc.edu> <512BAF8D.7080308@biostat.wisc.edu> Date: Mon, 4 Mar 2013 12:13:48 -0800 
Message-ID: Subject: Re: igb network lockups From: Nick Rogers To: Jack Vogel Content-Type: text/plain; charset=ISO-8859-1 Cc: Sepherosa Ziehau , "freebsd-net@freebsd.org" , "Christopher D. Harrison" X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 04 Mar 2013 20:13:55 -0000 On Mon, Mar 4, 2013 at 10:22 AM, Jack Vogel wrote: > >> >> Thanks. So on FreeBSD 9.1-RELEASE it is advisable to set >> hw.em.enable_msix=0 for 82574L? Are there other em(x) NICs where this >> is advisable? >> > > As I explained in a previous email, this is not advisable unless you are > experiencing problems (like hangs). If you are, then it's one possible > cause, so try falling back to MSI to see if it eliminates your problem. > > And the 82574 is the only device the em driver supports at present that is > capable of MSIX; all others use the igb driver. > Jack, thanks for clarifying. It's much appreciated. 
> Regards, > > Jack > From owner-freebsd-net@FreeBSD.ORG Tue Mar 5 03:21:28 2013 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 313B7D60; Tue, 5 Mar 2013 03:21:28 +0000 (UTC) (envelope-from lstewart@freebsd.org) Received: from lauren.room52.net (lauren.room52.net [210.50.193.198]) by mx1.freebsd.org (Postfix) with ESMTP id BB199B81; Tue, 5 Mar 2013 03:21:27 +0000 (UTC) Received: from lstewart.caia.swin.edu.au (lstewart.caia.swin.edu.au [136.186.229.95]) by lauren.room52.net (Postfix) with ESMTPSA id 3210E7E820; Tue, 5 Mar 2013 14:21:18 +1100 (EST) Message-ID: <513564AD.7000006@freebsd.org> Date: Tue, 05 Mar 2013 14:21:17 +1100 From: Lawrence Stewart User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:17.0) Gecko/20130213 Thunderbird/17.0.2 MIME-Version: 1.0 To: Andre Oppermann Subject: Re: Bug in sbsndptr() References: <512CBADB.3050004@freebsd.org> <5134CD5D.6090107@freebsd.org> In-Reply-To: <5134CD5D.6090107@freebsd.org> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-Spam-Status: No, score=0.0 required=5.0 tests=UNPARSEABLE_RELAY autolearn=unavailable version=3.3.2 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on lauren.room52.net Cc: "freebsd-net@freebsd.org" X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 05 Mar 2013 03:21:28 -0000 On 03/05/13 03:35, Andre Oppermann wrote: > On 26.02.2013 14:38, Lawrence Stewart wrote: >> Hi Andre, > > Hi Lawrence, :-) > >> A colleague and I spent a very frustrating day tracing an accounting bug >> in the multipath TCP patch we're working on at CAIA to a bug in >> sbsndptr(). 
I haven't tested it with regular TCP yet, but I believe the >> following patch fixes the bug (proposed commit log message is at the top >> of the patch): >> >> http://people.freebsd.org/~lstewart/patches/misctcp/sbsndptr_mnext_10.x.r247314.diff >> >> >> The patch should have no tangible effect to operation other than to >> ensure the function delivers on the promise to return the closest mbuf >> in the chain for the given offset. > > I agree that the description of sbsndptr() can be misleading as it refers > to the point in time when the pointer was updated last. Relative to now > the real offset may be at the beginning of the next mbuf. Right, and we ran into the issue because we made an assumption based on the use of the present tense in the comment: "Return closest mbuf in chain for current offset." > As you note in the proposed commit message by the time the send pointer > is calculated we may have reached the end of the chain and must avoid > storing a NULL pointer. The mbuf copy routines simply skips over the > additional mbuf in the chain using the returned offset. > > I wonder how this has caused trouble with your multipath patch. You'd > have to copy the sockbuf contents as well and unless you're using custom > sockbuf and mbuf chain functions this shouldn't be a problem. Using > custom functions on a socket buffer is a delicate approach. For a sockbuf > consumer being able to handle valid offsets into an mbuf chain is a core > feature and must-have part of the functionality. No custom sockbuf or mbuf routines are in use. We've implemented a mapping shim between subflows and the socket buffer. When a subflow asks the multipath layer for some data to send, the multipath layer returns a mapping onto the socket buffer, which will remain valid until such time as the subflow has marked the mapped data as acknowledged. Part of the map accounting is tracking the pointer of the first mbuf in the sockbuf where the map's data begins. 
Our accounting assumed the mbuf + the offset returned by sbsndptr had data available, which is how we triggered the problem. We could have accounted for the issue in our new map accounting code, but that would add additional complexity to some already complex code and the better solution is to make sbsndptr DTRT. >> I would appreciate a review and any thoughts. > > I think you have found a valid (micro-)optimization. However you're > still making a dangerous assumption in that the next mbuf is indeed > the one you want. This may not be true in subtle ways when the chain > contains m_len=0 mbufs in it. I'm not aware of it actually happening > but it can't be ruled out either if custom sockbuf manipulation functions > are in use. True, though I'm struggling to think why there would be m_len=0 mbufs interspersed with m_len > 0 mbufs in a socket send buffer mbuf chain. > I'd recommend the following: > have you custom sockbuf function handle forward seeking like the other > m_copy() functions; and/or apply a patch along the (untested) example > below. If you believe it is both correct and possible for m_len=0 mbufs to exist in a socket buffer chain, then I agree that we should amend my proposed patch to loop and skip over m_len=0 mbufs as you've suggested. However, I'm more inclined to suspect it is undesirable and potentially buggy behaviour to end up with m_len=0 mbufs in a socket buffer chain on which sbsndptr is being used, and would instead suggest a "KASSERT(ret->m_len > 0, (...));" be added to the end of my proposed if block. Thoughts? 
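To make the two options discussed here concrete, the following is a minimal userland sketch. It is not the patch linked above and not the kernel's sbsndptr(); `struct mbuf_model`, `seek_to_data()` and `seek_once_strict()` are hypothetical stand-ins invented for illustration. The first function loops past every mbuf that cannot hold the requested offset (which also skips m_len=0 mbufs); the second steps at most once and asserts that the next mbuf is non-empty, in the spirit of the suggested KASSERT.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct mbuf: only the fields this thread uses. */
struct mbuf_model {
	struct mbuf_model *m_next;	/* next mbuf in the chain */
	int m_len;			/* bytes of data held by this mbuf */
};

/*
 * Normalize (m, *off) so that the returned mbuf really holds the byte at
 * *off. Stepping while *off >= m_len covers both cases from the thread:
 * an offset sitting exactly at the end of an mbuf, and m_len=0 mbufs.
 * Returns NULL when the offset is at or beyond the end of the chain
 * (an assumption of this model; a real caller would avoid storing NULL).
 */
static struct mbuf_model *
seek_to_data(struct mbuf_model *m, int *off)
{
	while (m != NULL && *off >= m->m_len) {
		*off -= m->m_len;
		m = m->m_next;
	}
	return (m);
}

/*
 * Alternative: step forward at most once, never past the last mbuf, and
 * treat an empty successor as an invariant violation.
 */
static struct mbuf_model *
seek_once_strict(struct mbuf_model *m, int *off)
{
	if (*off == m->m_len && m->m_next != NULL) {
		*off = 0;
		m = m->m_next;
		assert(m->m_len > 0);	/* plays the role of the KASSERT */
	}
	return (m);
}
```

The looping variant tolerates whatever the chain contains; the strict variant is cheaper and documents the expectation that a socket send buffer never holds empty mbufs.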
Cheers, Lawrence From owner-freebsd-net@FreeBSD.ORG Tue Mar 5 09:36:49 2013 Return-Path: Delivered-To: freebsd-net@smarthost.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 27969227; Tue, 5 Mar 2013 09:36:49 +0000 (UTC) (envelope-from glebius@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:1900:2254:206c::16:87]) by mx1.freebsd.org (Postfix) with ESMTP id ED9ADE07; Tue, 5 Mar 2013 09:36:48 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.6/8.14.6) with ESMTP id r259amPH057833; Tue, 5 Mar 2013 09:36:48 GMT (envelope-from glebius@freefall.freebsd.org) Received: (from glebius@localhost) by freefall.freebsd.org (8.14.6/8.14.6/Submit) id r259ampR057832; Tue, 5 Mar 2013 09:36:48 GMT (envelope-from glebius) Date: Tue, 5 Mar 
2013 09:36:48 GMT Message-Id: <201303050936.r259ampR057832@freefall.freebsd.org> To: asa@cs.txstate.edu, glebius@FreeBSD.org, freebsd-net@FreeBSD.org, glebius@FreeBSD.org From: glebius@FreeBSD.org Subject: Re: kern/176510: [udp] [panic] Kernel Panic in udp_input @ offset 0x475 X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 05 Mar 2013 09:36:49 -0000 Synopsis: [udp] [panic] Kernel Panic in udp_input @ offset 0x475 State-Changed-From-To: open->closed State-Changed-By: glebius State-Changed-When: Tue Mar 5 09:36:16 UTC 2013 State-Changed-Why: Fixed in stable/9 in r241435. Responsible-Changed-From-To: freebsd-net->glebius Responsible-Changed-By: glebius Responsible-Changed-When: Tue Mar 5 09:36:16 UTC 2013 Responsible-Changed-Why: Fixed in stable/9 in r241435. http://www.freebsd.org/cgi/query-pr.cgi?pr=176510 From owner-freebsd-net@FreeBSD.ORG Tue Mar 5 13:54:56 2013 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 512B3616 for ; Tue, 5 Mar 2013 13:54:56 +0000 (UTC) (envelope-from s.khanchi@gmail.com) Received: from mail-we0-x229.google.com (mail-we0-x229.google.com [IPv6:2a00:1450:400c:c03::229]) by mx1.freebsd.org (Postfix) with ESMTP id D8DB5FD1 for ; Tue, 5 Mar 2013 13:54:55 +0000 (UTC) Received: by mail-we0-f169.google.com with SMTP id t11so6486599wey.28 for ; Tue, 05 Mar 2013 05:54:55 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=x-received:mime-version:sender:from:date:x-google-sender-auth :message-id:subject:to:content-type; bh=X2eJpBnhlRRhOEaQq7XHdAt0DaO55myJly0uxs/yZgc=; b=trTFzGsxRRuUd0RK+tQ96x+81932bV/C26KtHSRfbmSKDp+/7/nYpnsT8MQwbgPlF8 UUPFfS85Z2+24fSybHu8Z+IpsIGRIpCGb6XebHhFZnwdK6uovh8izfA2MHsD/3cWshVr 
gOKUMr0oVQbZkuewhvQ62L1UW6KN1i6sUhbI7sS0cqvftG6LjkFvOtVj7Jmn0V09jaUe 84LM6Pv1xex8grIwtZj4GxFJiZ8Wj4mFBzNAzUku4VAyV8gzlCeP97ZOOFjnDz1Ftjxg Qdb9zV66+cWAW6zRJGM04UyjzGhgMFt3h4KZYyZ6sTsDisGh5Rq8dpexPwQzafQAhLVj 0U5A== X-Received: by 10.180.81.164 with SMTP id b4mr18908535wiy.34.1362491689049; Tue, 05 Mar 2013 05:54:49 -0800 (PST) MIME-Version: 1.0 Sender: s.khanchi@gmail.com Received: by 10.194.121.104 with HTTP; Tue, 5 Mar 2013 05:54:29 -0800 (PST) From: h bagade Date: Tue, 5 Mar 2013 17:24:29 +0330 X-Google-Sender-Auth: Qb6d16XEcq-m2MOhHnUmU3zsT30 Message-ID: Subject: how to get mac address info in kernel code? To: freebsd-net@freebsd.org Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.14 X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 05 Mar 2013 13:54:56 -0000 Hi all, I need to get interface MAC address within the kernel code and I couldn't use "getifaddrs" because it's user-mode. How can I have the MAC address information within kernel code? Any hints or comments are really appreciated. 
From owner-freebsd-net@FreeBSD.ORG Tue Mar 5 14:03:57 2013 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 0075E801 for ; Tue, 5 Mar 2013 14:03:56 +0000 (UTC) (envelope-from andre@freebsd.org) Received: from c00l3r.networx.ch (c00l3r.networx.ch [62.48.2.2]) by mx1.freebsd.org (Postfix) with ESMTP id 77294120 for ; Tue, 5 Mar 2013 14:03:56 +0000 (UTC) Received: (qmail 41221 invoked from network); 5 Mar 2013 15:17:45 -0000 Received: from c00l3r.networx.ch (HELO [127.0.0.1]) ([62.48.2.2]) (envelope-sender ) by c00l3r.networx.ch (qmail-ldap-1.03) with SMTP for ; 5 Mar 2013 15:17:45 -0000 Message-ID: <5135FB48.1000809@freebsd.org> Date: Tue, 05 Mar 2013 15:03:52 +0100 From: Andre Oppermann User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:17.0) Gecko/20130107 Thunderbird/17.0.2 MIME-Version: 1.0 To: Lawrence Stewart Subject: Re: Bug in sbsndptr() References: <512CBADB.3050004@freebsd.org> <5134CD5D.6090107@freebsd.org> <513564AD.7000006@freebsd.org> In-Reply-To: <513564AD.7000006@freebsd.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: "freebsd-net@freebsd.org" X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 05 Mar 2013 14:03:57 -0000 On 05.03.2013 04:21, Lawrence Stewart wrote: > On 03/05/13 03:35, Andre Oppermann wrote: >> On 26.02.2013 14:38, Lawrence Stewart wrote: >>> Hi Andre, >> >> Hi Lawrence, :-) >> >>> A colleague and I spent a very frustrating day tracing an accounting bug >>> in the multipath TCP patch we're working on at CAIA to a bug in >>> sbsndptr(). 
I haven't tested it with regular TCP yet, but I believe the >>> following patch fixes the bug (proposed commit log message is at the top >>> of the patch): >>> >>> http://people.freebsd.org/~lstewart/patches/misctcp/sbsndptr_mnext_10.x.r247314.diff >>> >>> >>> The patch should have no tangible effect to operation other than to >>> ensure the function delivers on the promise to return the closest mbuf >>> in the chain for the given offset. >> >> I agree that the description of sbsndptr() can be misleading as it refers >> to the point in time when the pointer was updated last. Relative to now >> the real offset may be at the beginning of the next mbuf. > > Right, and we ran into the issue because we made an assumption based on > the use of the present tense in the comment: > > "Return closest mbuf in chain for current offset." I apologize for the incorrect and misleading description. :-) >> As you note in the proposed commit message by the time the send pointer >> is calculated we may have reached the end of the chain and must avoid >> storing a NULL pointer. The mbuf copy routines simply skips over the >> additional mbuf in the chain using the returned offset. >> >> I wonder how this has caused trouble with your multipath patch. You'd >> have to copy the sockbuf contents as well and unless you're using custom >> sockbuf and mbuf chain functions this shouldn't be a problem. Using >> custom functions on a socket buffer is a delicate approach. For a sockbuf >> consumer being able to handle valid offsets into an mbuf chain is a core >> feature and must-have part of the functionality. > > No custom sockbuf or mbuf routines are in use. We've implemented a > mapping shim between subflows and the socket buffer. When a subflow asks > the multipath layer for some data to send, the multipath layer returns a > mapping onto the socket buffer, which will remain valid until such time > as the subflow has marked the mapped data as acknowledged. 
>
> Part of the map accounting is tracking the pointer of the first mbuf in
> the sockbuf where the map's data begins. Our accounting assumed the mbuf
> + the offset returned by sbsndptr had data available, which is how we
> triggered the problem. We could have accounted for the issue in our new
> map accounting code, but that would add additional complexity to some
> already complex code and the better solution is to make sbsndptr DTRT.

So effectively you run a separate sbsndptr for each subflow using the
real sbsndptr to track the head of the queue?

/me fears the day a mptcp import comes up. tcp-complexity^^3. :-o

>>> I would appreciate a review and any thoughts.
>>
>> I think you have found a valid (micro-)optimization. However you're
>> still making a dangerous assumption in that the next mbuf is indeed
>> the one you want. This may not be true in subtle ways when the chain
>> contains m_len=0 mbufs in it. I'm not aware of it actually happening
>> but it can't be ruled out either if custom sockbuf manipulation functions
>> are in use.
>
> True, though I'm struggling to think why there would be m_len=0 mbufs
> interspersed with m_len > 0 mbufs in a socket send buffer mbuf chain.

sbcompress() doesn't allow for m_len=0 mbufs. This holds true as long
as the sbappend functions are used. If not, we may get anything there.

As long as nobody is using custom sockbuf appends we're safe. Because
I first assumed from your description that some custom sockbuf munging
was involved, the guarantee wouldn't have been there anymore.

>> I'd recommend the following:
>> have your custom sockbuf function handle forward seeking like the other
>> m_copy() functions; and/or apply a patch along the (untested) example
>> below.
>
> If you believe it is both correct and possible for m_len=0 mbufs to
> exist in a socket buffer chain, then I agree that we should amend my
> proposed patch to loop and skip over m_len=0 mbufs as you've suggested.

No. So far it is neither possible nor correct.
> However, I'm more inclined to suspect it is undesirable and potentially > buggy behaviour to end up with m_len=0 mbufs in a socket buffer chain on > which sbsndptr is being used, and would instead suggest a > "KASSERT(ret->m_len > 0, (...));" be added to the end of my proposed if > block. Agreed. -- Andre From owner-freebsd-net@FreeBSD.ORG Tue Mar 5 15:53:43 2013 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 46BA846B for ; Tue, 5 Mar 2013 15:53:43 +0000 (UTC) (envelope-from gnn@neville-neil.com) Received: from vps.hungerhost.com (vps.hungerhost.com [216.38.53.176]) by mx1.freebsd.org (Postfix) with ESMTP id 0A1059F6 for ; Tue, 5 Mar 2013 15:53:42 +0000 (UTC) Received: from [209.249.190.124] (port=53951 helo=dhcp-10-2-210-24.hudson-trading.com) by vps.hungerhost.com with esmtpsa (TLSv1:AES128-SHA:128) (Exim 4.80) (envelope-from ) id 1UCuBS-0005rG-Cu; Tue, 05 Mar 2013 10:53:42 -0500 Content-Type: text/plain; charset=us-ascii Mime-Version: 1.0 (Mac OS X Mail 6.2 \(1499\)) Subject: Re: how to get mac address info in kernel code? 
From: George Neville-Neil In-Reply-To: Date: Tue, 5 Mar 2013 10:53:42 -0500 Content-Transfer-Encoding: quoted-printable Message-Id: <8EB66934-D33C-425E-A076-66E31B618DCA@neville-neil.com> References: To: h bagade X-Mailer: Apple Mail (2.1499) X-AntiAbuse: This header was added to track abuse, please include it with any abuse report X-AntiAbuse: Primary Hostname - vps.hungerhost.com X-AntiAbuse: Original Domain - freebsd.org X-AntiAbuse: Originator/Caller UID/GID - [47 12] / [47 12] X-AntiAbuse: Sender Address Domain - neville-neil.com X-Get-Message-Sender-Via: vps.hungerhost.com: authenticated_id: gnn@neville-neil.com Cc: freebsd-net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 05 Mar 2013 15:53:43 -0000 On Mar 5, 2013, at 08:54 , h bagade wrote: > Hi all, > > I need to get interface MAC address within the kernel code and I couldn't > use "getifaddrs" because it's user-mode. How can I have the MAC address > information within kernel code? > > Any hints or comments are really appreciated. If you have access to the struct ifnet you can look at the if_addr member, which is a struct ifaddr, defined in if_var.h.
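As a rough illustration of that pointer chase: for an Ethernet interface the ifaddr's ifa_addr is a link-level sockaddr (struct sockaddr_dl), which stores the interface name followed by the hardware address in sdl_data; the LLADDR() macro from net/if_dl.h skips over the name. As far as I recall, the IF_LLADDR(ifp) macro in the tree wraps exactly this, but check your sources. The sketch below is a userland model only, not kernel code: `struct sockaddr_dl_model`, `lladdr_model()` and `copy_mac()` are hypothetical names that mirror just the fields needed to show the offset arithmetic.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAC_LEN_MODEL 6	/* Ethernet address length */

/* Hypothetical, trimmed-down stand-in for struct sockaddr_dl. */
struct sockaddr_dl_model {
	unsigned char sdl_nlen;	/* length of the interface name */
	unsigned char sdl_alen;	/* length of the link-level address */
	char sdl_data[46];	/* name bytes, then address bytes */
};

/* Same idea as LLADDR(): the address starts right after the name. */
static const unsigned char *
lladdr_model(const struct sockaddr_dl_model *sdl)
{
	return ((const unsigned char *)sdl->sdl_data + sdl->sdl_nlen);
}

/*
 * Copy the MAC address out of the link-level sockaddr.
 * Returns 0 on success, -1 if the address is not 6 bytes long.
 */
static int
copy_mac(const struct sockaddr_dl_model *sdl, unsigned char mac[MAC_LEN_MODEL])
{
	assert(sdl != NULL && mac != NULL);
	if (sdl->sdl_alen != MAC_LEN_MODEL)
		return (-1);
	memcpy(mac, lladdr_model(sdl), MAC_LEN_MODEL);
	return (0);
}
```

In real kernel code the sockaddr would come from ifp->if_addr->ifa_addr rather than being filled in by hand, and the usual locking rules for touching an ifnet apply.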
Best, George From owner-freebsd-net@FreeBSD.ORG Tue Mar 5 17:39:40 2013 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id E953118D for ; Tue, 5 Mar 2013 17:39:40 +0000 (UTC) (envelope-from ncrogers@gmail.com) Received: from mail-vb0-x231.google.com (mail-vb0-x231.google.com [IPv6:2607:f8b0:400c:c02::231]) by mx1.freebsd.org (Postfix) with ESMTP id A22A81D3 for ; Tue, 5 Mar 2013 17:39:40 +0000 (UTC) Received: by mail-vb0-f49.google.com with SMTP id s24so1348745vbi.22 for ; Tue, 05 Mar 2013 09:39:40 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:x-received:date:message-id:subject:from:to :content-type; bh=7F9ZNFFu2MhjeuJJobpDMVVbM3mN86LBeufB7a0f1Jo=; b=0U0dbviakVyoj1lH4prDPDIkTry/sBF8Isqij9cQqC12SKIu5eAtP9NaYb0KbwiFb0 4Ram6q07P082eBYcKF6TRAkIU89J0iWnVcQ0A3/zhWTtLpaImErjDO2W9TFOhwE6SnwL GhadL0nD/PcbSytEsUo6lUAGS1OtOJKv8jNw4ITfx69MAuroMq1BStSYZzx5OfBfIpID m8uzKxYbX3zl/c5LqLVKP1raqCzb2Aus/icS8u1CwC4AFFyRzPxgi2yYcd18HzO6CpJj 5/4/oOMn+3HPmTbWxhs/T80nvuviS2Ja4//MsCTT1fvsxGzs8nFxMvrLkqeaCXdqPEDI lxGQ== MIME-Version: 1.0 X-Received: by 10.220.227.131 with SMTP id ja3mr8434935vcb.54.1362505180089; Tue, 05 Mar 2013 09:39:40 -0800 (PST) Received: by 10.52.176.131 with HTTP; Tue, 5 Mar 2013 09:39:39 -0800 (PST) Date: Tue, 5 Mar 2013 09:39:39 -0800 Message-ID: Subject: Default route changes unexpectedly From: Nick Rogers To: "freebsd-net@freebsd.org" Content-Type: text/plain; charset=ISO-8859-1 X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 05 Mar 2013 17:39:41 -0000 Hello, I am attempting to create awareness of a serious issue affecting users of FreeBSD 9.x and PF. 
There appears to be a bug that allows the kernel's routing table to be
corrupted by traffic routing through the system. Under heavy traffic
load, the default route can seemingly randomly change to an IP address
that is not directly connected to the network (i.e., is not configured
anywhere). Dhclient is not in the mix, nor is routed, bgpd, etc.
Running `route monitor` shows no evidence of the change in the default
route. The one commonality between all the systems experiencing this
problem seems to be the use of PF.

Obviously this is a serious problem as it causes all Internet-bound
traffic to stop routing until the default route is corrected. Some users,
including myself, are working around this problem by installing a script
that runs multiple times a second to check if the default route is
incorrect and fix it if necessary, which reduces the downtime caused by
the bug.

Please refer to these past posts for more examples and evidence of
other users experiencing this problem:

http://forums.freebsd.org/showthread.php?p=211610#post211610
http://freebsd.1045724.n5.nabble.com/Default-route-quot-random-quot-gateway-modification-bug-td5750820.html
http://lists.freebsd.org/pipermail/freebsd-net/2012-March/031879.html
http://lists.freebsd.org/pipermail/freebsd-ipfw/2010-September/004361.html

There is also a PR that was incorrectly labeled as an IPFW issue. Others
and I believe this issue is not restricted to the use of IPFW and that
the PR should be relabeled. I am inclined to think it is strictly a PF
issue since I am not using IPFW; however, there is evidence of the
default route changing for people using IPFW on past versions of FreeBSD
(7.x/8.x), so perhaps this is related.

http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/174749

Another PR for the same problem, but specific to IPFW and 8.2-RELEASE:

http://www.freebsd.org/cgi/query-pr.cgi?pr=157796

I am hoping someone reading this can give the problem the attention it
deserves. Thank you.
-Nick From owner-freebsd-net@FreeBSD.ORG Tue Mar 5 21:18:05 2013 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 4DC7A836 for ; Tue, 5 Mar 2013 21:18:05 +0000 (UTC) (envelope-from barney_cordoba@yahoo.com) Received: from nm5-vm1.bullet.mail.ne1.yahoo.com (nm5-vm1.bullet.mail.ne1.yahoo.com [98.138.91.32]) by mx1.freebsd.org (Postfix) with ESMTP id D3E90DDA for ; Tue, 5 Mar 2013 21:18:04 +0000 (UTC) Received: from [98.138.226.176] by nm5.bullet.mail.ne1.yahoo.com with NNFMP; 05 Mar 2013 21:17:58 -0000 Received: from [98.138.226.166] by tm11.bullet.mail.ne1.yahoo.com with NNFMP; 05 Mar 2013 21:17:58 -0000 Received: from [127.0.0.1] by omp1067.mail.ne1.yahoo.com with NNFMP; 05 Mar 2013 21:17:57 -0000 X-Yahoo-Newman-Property: ymail-3 X-Yahoo-Newman-Id: 988950.57047.bm@omp1067.mail.ne1.yahoo.com Received: (qmail 11860 invoked by uid 60001); 5 Mar 2013 21:17:57 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yahoo.com; s=s1024; t=1362518277; bh=QeBARCoW0PPYZndkrrLFXYjsSVbP6hHpuGGFJLWRtK4=; h=X-YMail-OSG:Received:X-Rocket-MIMEInfo:X-Mailer:Message-ID:Date:From:Subject:To:Cc:In-Reply-To:MIME-Version:Content-Type; b=zy4PMMWzSpm6IJcF154tLUSpo9zqhAJwyC61qKpM70RNLrgP1vI/5T10Rt9x0pUr+jf+wI3dMwAyMh+ZudDyjkWEeJq8Vj50oXMvgvRTHeZrpgDRR8aIO6Gd6aQoZ5HmCBPqVtoU59lTBfYy4jJX3XJaWVnWI1crUlem/fIlArg= DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=s1024; d=yahoo.com; h=X-YMail-OSG:Received:X-Rocket-MIMEInfo:X-Mailer:Message-ID:Date:From:Subject:To:Cc:In-Reply-To:MIME-Version:Content-Type; b=nJjw7CAoNhZKXaqIzpIpRn3PHFAqbKKRfCicQsm5HyVc9rLygjHBc/5tmr+r9VmsyWCRF94ffdDRl94iGMXkixSC++B4Y/taHzHfWDuSOuBkRe5WFaNj3TMn2MsC4ze3Kaw57GbDmypSDWCeTz1392ToIGyrGYB5+3+8qaxY8lI=; X-YMail-OSG: .hcdfk0VM1lj8vyuBrDtNWgP0GPkC7pqIC3Far0gBQNFLab 05UeLHBSwQHREiEB1D2jKsMgOq0t.1TlNW9Fetgv8oo13M4zDVpBof.plS69 zg434GOIPDqb.2dTRLev89aD8apfKTOXM8SaqJfmAK_wBTzcPKYzvyi7UcsZ 
ymLVZLvhcuscrEnO8xXzge4ITQ0_2Y26kUnzBw5HSIy4D9Xoc2p2.cKQA1tG VXMhJ9gXq5XBUKIXGV5mPj7z35sDzkKi1UWoPzanpSxSM.my_DothvH_2HQ3 SUL6ydwKcy5vgQXuHDpeGlFliuDqA6jyyopuXpBLVLpVu8NOwspOQsJanY9t BOu43kMDLDomaR07fUfS5iIbcQd6U09xX.eKVwSMzEnZhqHwQwYx.JtIxYpq WigNNfWQRsGeW86zfjDeDrnKmySEuCyySukRaKjDn05u._E2N1Wa63iGnMKW yGaH17mn2.0GEmZW5W0WghljPeyul98cUFQ5p9UmNP38sCDP1ZXToJNaKpiZ DYwKno4vh_p2dgiVdZt6Gnm1TleFn Received: from [174.48.128.27] by web121603.mail.ne1.yahoo.com via HTTP; Tue, 05 Mar 2013 13:17:57 PST X-Rocket-MIMEInfo: 001.001, DQoNCi0tLSBPbiBNb24sIDMvNC8xMywgWmFwaG9kIEJlZWJsZWJyb3ggPHpiZWVibGVAZ21haWwuY29tPiB3cm90ZToNCg0KPiBGcm9tOiBaYXBob2QgQmVlYmxlYnJveCA8emJlZWJsZUBnbWFpbC5jb20.DQo.IFN1YmplY3Q6IFJlOiBpZ2IgbmV0d29yayBsb2NrdXBzDQo.IFRvOiAiSmFjayBWb2dlbCIgPGpmdm9nZWxAZ21haWwuY29tPg0KPiBDYzogIk5pY2sgUm9nZXJzIiA8bmNyb2dlcnNAZ21haWwuY29tPiwgIlNlcGhlcm9zYSBaaWVoYXUiIDxzZXBoZXJvc2FAZ21haWwuY29tPiwgIkNocmlzdG9waGUBMAEBAQE- X-Mailer: YahooMailClassic/15.1.4 YahooMailWebService/0.8.135.514 Message-ID: <1362518277.2420.YahooMailClassic@web121603.mail.ne1.yahoo.com> Date: Tue, 5 Mar 2013 13:17:57 -0800 (PST) From: Barney Cordoba Subject: Re: igb network lockups To: Jack Vogel , Zaphod Beeblebrox In-Reply-To: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Cc: Nick Rogers , Sepherosa Ziehau , "Christopher D. Harrison" , "freebsd-net@freebsd.org" X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 05 Mar 2013 21:18:05 -0000 --- On Mon, 3/4/13, Zaphod Beeblebrox wrote: > From: Zaphod Beeblebrox > Subject: Re: igb network lockups > To: "Jack Vogel" > Cc: "Nick Rogers" , "Sepherosa Ziehau" , "Christopher D. 
Harrison" , "freebsd-net@freebsd.org" > Date: Monday, March 4, 2013, 1:58 PM > For everyone having lockup problems > with IGB, I'd like to ask if they could > try disabling hyperthreads --- this worked for me on one > system but has > been unnecessary on others. Gee, maybe binding an interrupt to a virtual cpu isn't a good idea? BC
From owner-freebsd-net@FreeBSD.ORG Wed Mar 6 01:54:18 2013 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id C3E81CF3; Wed, 6 Mar 2013 01:54:18 +0000 (UTC) (envelope-from lstewart@freebsd.org) Received: from lauren.room52.net (lauren.room52.net [210.50.193.198]) by mx1.freebsd.org (Postfix) with ESMTP id 482419A6; Wed, 6 Mar 2013 01:54:17 +0000 (UTC) Received: from lstewart.caia.swin.edu.au (lstewart.caia.swin.edu.au [136.186.229.95]) by lauren.room52.net (Postfix) with ESMTPSA id 414EC7E84A; Wed, 6 Mar 2013 12:54:15 +1100 (EST) Message-ID: <5136A1C6.4000406@freebsd.org> Date: Wed, 06 Mar 2013 12:54:14 +1100 From: Lawrence Stewart User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:17.0) Gecko/20130213 Thunderbird/17.0.2 MIME-Version: 1.0 To: Andre Oppermann Subject: Re: Bug in sbsndptr() References: <512CBADB.3050004@freebsd.org> <5134CD5D.6090107@freebsd.org> <513564AD.7000006@freebsd.org> <5135FB48.1000809@freebsd.org> In-Reply-To: <5135FB48.1000809@freebsd.org> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit
X-Spam-Status: No, score=0.0 required=5.0 tests=UNPARSEABLE_RELAY autolearn=unavailable version=3.3.2
X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on lauren.room52.net
Cc: "freebsd-net@freebsd.org"
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: ,
X-List-Received-Date: Wed, 06 Mar 2013 01:54:18 -0000

On 03/06/13 01:03, Andre Oppermann wrote:
> On 05.03.2013 04:21, Lawrence Stewart wrote:
>> On 03/05/13 03:35, Andre Oppermann wrote:
>>> On 26.02.2013 14:38, Lawrence Stewart wrote:
>>>> Hi Andre,
>>>
>>> Hi Lawrence, :-)
>>>
>>>> A colleague and I spent a very frustrating day tracing an accounting
>>>> bug in the multipath TCP patch we're working on at CAIA to a bug in
>>>> sbsndptr(). I haven't tested it with regular TCP yet, but I believe the
>>>> following patch fixes the bug (proposed commit log message is at the
>>>> top of the patch):
>>>>
>>>> http://people.freebsd.org/~lstewart/patches/misctcp/sbsndptr_mnext_10.x.r247314.diff
>>>>
>>>> The patch should have no tangible effect to operation other than to
>>>> ensure the function delivers on the promise to return the closest mbuf
>>>> in the chain for the given offset.
>>>
>>> I agree that the description of sbsndptr() can be misleading as it
>>> refers to the point in time when the pointer was updated last. Relative
>>> to now the real offset may be at the beginning of the next mbuf.
>>
>> Right, and we ran into the issue because we made an assumption based on
>> the use of the present tense in the comment:
>>
>> "Return closest mbuf in chain for current offset."
>
> I apologize for the incorrect and misleading description. :-)

No drama, just explaining the crux of the problem from our perspective
so it's clear why we ran into this.
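
The boundary case under discussion can be modeled in a few lines. This is a simplified sketch in Python, not the FreeBSD code and not the proposed patch: `Mbuf` and `normalize` are hypothetical names, and the model only captures the idea that an offset equal to an mbuf's length really addresses the first byte of the next mbuf.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Mbuf:
    """Toy stand-in for a kernel mbuf: a payload length and a next pointer."""
    m_len: int
    m_next: Optional["Mbuf"] = None


def normalize(m, moff):
    """Return (mbuf, offset) so that the offset addresses a real byte.

    If the offset sits exactly at the end of an mbuf and a successor
    exists, step forward and reset the offset to 0. At the end of the
    chain the pair is returned unchanged, so no null pointer is produced.
    """
    if moff == m.m_len and m.m_next is not None:
        m, moff = m.m_next, 0
        # Same spirit as the KASSERT suggested later in this thread:
        # send-buffer chains are not expected to hold empty mbufs.
        assert m.m_len > 0, "empty mbuf in chain"
    return m, moff
```

A consumer that assumes "mbuf plus offset has data available" is only safe after a step like this; without it, the pair (first mbuf, offset 100) on a 100-byte mbuf points one past its last byte.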
>>> As you note in the proposed commit message by the time the send pointer
>>> is calculated we may have reached the end of the chain and must avoid
>>> storing a NULL pointer. The mbuf copy routines simply skip over the
>>> additional mbuf in the chain using the returned offset.
>>>
>>> I wonder how this has caused trouble with your multipath patch. You'd
>>> have to copy the sockbuf contents as well and unless you're using custom
>>> sockbuf and mbuf chain functions this shouldn't be a problem. Using
>>> custom functions on a socket buffer is a delicate approach. For a
>>> sockbuf consumer being able to handle valid offsets into an mbuf chain
>>> is a core feature and must-have part of the functionality.
>>
>> No custom sockbuf or mbuf routines are in use. We've implemented a
>> mapping shim between subflows and the socket buffer. When a subflow asks
>> the multipath layer for some data to send, the multipath layer returns a
>> mapping onto the socket buffer, which will remain valid until such time
>> as the subflow has marked the mapped data as acknowledged.
>>
>> Part of the map accounting is tracking the pointer of the first mbuf in
>> the sockbuf where the map's data begins. Our accounting assumed the mbuf
>> + the offset returned by sbsndptr had data available, which is how we
>> triggered the problem. We could have accounted for the issue in our new
>> map accounting code, but that would add additional complexity to some
>> already complex code and the better solution is to make sbsndptr DTRT.
>
> So effectively you run a separate sbsndptr for each subflow using the
> real sbsndptr to track the head of the queue?

Yes, essentially works as you describe. The initial goal/design was to
make multi-stream support a first class citizen inside the socket
buffer, but we ran out of time to do this.
The design we've come up with is a reasonable interim to get to an alpha
patch release, which should be happening later this week if you're
interested to take a look. We'll make an announcement when it's up on
the website.

> /me fears the day a mptcp import comes up. tcp-complexity^^3. :-o

Yeah it's pretty invasive but does bring some useful features too. There
is a lot more work to do before I'd consider proposing we import it into
the stack and even then, we'll want to have a robust discussion about
when and how to do it. Given that this is being done as part of a
research project, we've also taken the opportunity to experiment with
changing some ideas and idiosyncrasies in the existing stack code and
will be doing a lot of experimental research with the stack and
iteratively refining things as we go.

>>>> I would appreciate a review and any thoughts.
>>>
>>> I think you have found a valid (micro-)optimization. However you're
>>> still making a dangerous assumption in that the next mbuf is indeed
>>> the one you want. This may not be true in subtle ways when the chain
>>> contains m_len=0 mbufs in it. I'm not aware of it actually happening
>>> but it can't be ruled out either if custom sockbuf manipulation
>>> functions are in use.
>>
>> True, though I'm struggling to think why there would be m_len=0 mbufs
>> interspersed with m_len > 0 mbufs in a socket send buffer mbuf chain.
>
> sbcompress() doesn't allow for m_len=0 mbufs. This holds true as long
> as the sbappend functions are used. If not, we may get anything there.
> As long as nobody is using custom sockbuf appends we're safe. Because
> I first assumed from your description some custom sockbuf munging the
> guarantee wouldn't have been there anymore.

Ok cool.

>>> I'd recommend the following:
>>> have your custom sockbuf function handle forward seeking like the other
>>> m_copy() functions; and/or apply a patch along the (untested) example
>>> below.
>>
>> If you believe it is both correct and possible for m_len=0 mbufs to
>> exist in a socket buffer chain, then I agree that we should amend my
>> proposed patch to loop and skip over m_len=0 mbufs as you've suggested.
>
> No. So far it is neither possible nor correct.
>
>> However, I'm more inclined to suspect it is undesirable and potentially
>> buggy behaviour to end up with m_len=0 mbufs in a socket buffer chain on
>> which sbsndptr is being used, and would instead suggest a
>> "KASSERT(ret->m_len > 0, (...));" be added to the end of my proposed if
>> block.
>
> Agreed.

How does this look?

http://people.freebsd.org/~lstewart/patches/misctcp/sbsndptr_mnext_10.x.r247314_v2.diff

Sockbuf code is tricky so I'll test this for a while and commit after it
has had a reasonable run and not shown any side effects.

Cheers,
Lawrence

From owner-freebsd-net@FreeBSD.ORG Wed Mar 6 05:48:17 2013
Return-Path:
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 4B720D20 for ; Wed, 6 Mar 2013 05:48:17 +0000 (UTC) (envelope-from emz@norma.perm.ru)
Received: from elf.hq.norma.perm.ru (unknown [IPv6:2001:470:1f09:14c0::2]) by mx1.freebsd.org (Postfix) with ESMTP id F357D234 for ; Wed, 6 Mar 2013 05:48:16 +0000 (UTC)
Received: from bsdrookie.norma.com. ([IPv6:fd00::726]) by elf.hq.norma.perm.ru (8.14.5/8.14.5) with ESMTP id r265mDQA002265 (version=TLSv1/SSLv3 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO) for ; Wed, 6 Mar 2013 11:48:14 +0600 (YEKT) (envelope-from emz@norma.perm.ru)
Message-ID: <5136D89D.4000902@norma.perm.ru>
Date: Wed, 06 Mar 2013 11:48:13 +0600
From: "Eugene M.
Zheganin"
User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:17.0) Gecko/17.0 Thunderbird/17.0
MIME-Version: 1.0
To: freebsd-net@freebsd.org
Subject: Re: FreeBSD 9.1-RELEASE + bge0 == watchdog timeout
References: <201302241106.42477.vegeta@tuxpowered.net> <20130225082042.GB1426@michelle.cdnetworks.com> <512CF97B.8030805@norma.perm.ru> <20130227020123.GA3581@michelle.cdnetworks.com> <512DE968.4020409@quip.cz> <20130228053558.GA1474@michelle.cdnetworks.com>
In-Reply-To: <20130228053558.GA1474@michelle.cdnetworks.com>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.2.7 (elf.hq.norma.perm.ru [IPv6:fd00::30a]); Wed, 06 Mar 2013 11:48:14 +0600 (YEKT)
X-Spam-Status: No hits=-97.8 bayes=0.5 testhits RDNS_NONE=1.274, SPF_SOFTFAIL=0.972,USER_IN_WHITELIST=-100 autolearn=no version=3.3.2
X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on elf.hq.norma.perm.ru
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: ,
X-List-Received-Date: Wed, 06 Mar 2013 05:48:17 -0000

Hi.

On 28.02.2013 11:35, YongHyeon PYUN wrote:
> The reporter said the machine was Sun Fire X2200 M2 so I guess you
> may see the same issue on both stable/9 and stable/8. Ideally the
> loader tunable hw.bge.allow_asf should not be there and driver
> should take care of it by checking the existence of ASF/IPMI
> firmware.
>
>
Unfortunately, I just had the 'bge0 - watchdog timeout - resetting' on a
recent 8.3-STABLE and a 'Broadcom NetXtreme BCM5722 Gigabit (94309)'
(according to the pciconf -lv) controller. I haven't seen this in a year
or two (I guess), the machine was running 8.2-STABLE. So, in order to
fight this (machine is freezing during these messages) what should I do?
Is upgrading to 10.0-CURRENT an option? hw.bge.allow_asf is 0 already.

Thanks.
Eugene.
From owner-freebsd-net@FreeBSD.ORG Wed Mar 6 06:27:06 2013 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 57F587E3 for ; Wed, 6 Mar 2013 06:27:06 +0000 (UTC) (envelope-from pyunyh@gmail.com) Received: from mail-pb0-f52.google.com (mail-pb0-f52.google.com [209.85.160.52]) by mx1.freebsd.org (Postfix) with ESMTP id F276C368 for ; Wed, 6 Mar 2013 06:27:05 +0000 (UTC) Received: by mail-pb0-f52.google.com with SMTP id ma3so5532032pbc.25 for ; Tue, 05 Mar 2013 22:27:05 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=x-received:from:date:to:cc:subject:message-id:reply-to:references :mime-version:content-type:content-disposition:in-reply-to :user-agent; bh=5pksi+SItNBqsH0a7bRloNG3e9weeUdPPTblC0GN32M=; b=v+e2oe/P80/6agI6D3qmtB8ovtJVYCsYHtru50/x/+ft+BCiR/62Yy58bsd8hA7eny X8IQJ+DFnOyEdCTOGhr+PN4sFm01kjqDTTf0uDdU/nbiNsq2PIUWO4jzCBbvw57CkqrZ EwIdr69CBwEq2JZQLRK7ONK9enu75RSPQDf38dALo7pao9LsBIdzvyZVeC4i7v994lvc YeUAkobRSCdH6bwJGK/Mc62gICRu/UWDsg5vVRIv0pUT7f9aqO+VHVa5+yebxoIzwlqN 8bfvpCgrYdDTiDUoKGlalvCXxUhKQLrH1bpTIJWXa3GeKVxnLtSIlkq5LZIEHysIUYRh jsxw== X-Received: by 10.68.138.225 with SMTP id qt1mr42815544pbb.82.1362551225280; Tue, 05 Mar 2013 22:27:05 -0800 (PST) Received: from pyunyh@gmail.com (lpe4.p59-icn.cdngp.net. [114.111.62.249]) by mx.google.com with ESMTPS id zm1sm29929930pbc.26.2013.03.05.22.27.01 (version=TLSv1 cipher=RC4-SHA bits=128/128); Tue, 05 Mar 2013 22:27:04 -0800 (PST) Received: by pyunyh@gmail.com (sSMTP sendmail emulation); Wed, 06 Mar 2013 15:26:58 +0900 From: YongHyeon PYUN Date: Wed, 6 Mar 2013 15:26:58 +0900 To: "Eugene M. 
Zheganin"
Subject: Re: FreeBSD 9.1-RELEASE + bge0 == watchdog timeout
Message-ID: <20130306062658.GC1483@michelle.cdnetworks.com>
References: <201302241106.42477.vegeta@tuxpowered.net> <20130225082042.GB1426@michelle.cdnetworks.com> <512CF97B.8030805@norma.perm.ru> <20130227020123.GA3581@michelle.cdnetworks.com> <512DE968.4020409@quip.cz> <20130228053558.GA1474@michelle.cdnetworks.com> <5136D89D.4000902@norma.perm.ru>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <5136D89D.4000902@norma.perm.ru>
User-Agent: Mutt/1.4.2.3i
Cc: freebsd-net@freebsd.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
Reply-To: pyunyh@gmail.com
List-Id: Networking and TCP/IP with FreeBSD
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: ,
X-List-Received-Date: Wed, 06 Mar 2013 06:27:06 -0000

On Wed, Mar 06, 2013 at 11:48:13AM +0600, Eugene M. Zheganin wrote:
> Hi.
>
> On 28.02.2013 11:35, YongHyeon PYUN wrote:
> > The reporter said the machine was Sun Fire X2200 M2 so I guess you
> > may see the same issue on both stable/9 and stable/8. Ideally the
> > loader tunable hw.bge.allow_asf should not be there and driver
> > should take care of it by checking the existence of ASF/IPMI
> > firmware.
> >
> >
> Unfortunately, I just had the 'bge0 - watchdog timeout - resetting' on a
> recent 8.3-STABLE and a 'Broadcom NetXtreme BCM5722 Gigabit (94309)'
> (according to the pciconf -lv) controller. I haven't seen this in a year
> or two (I guess), the machine was running 8.2-STABLE. So, in order to
> fight this (machine is freezing during these messages) what should I do?
> Is upgrading to 10.0-CURRENT an option? hw.bge.allow_asf is 0 already.

If you were using the latest stable/8, the result would be the same on
CURRENT. How frequently do you see the watchdog timeouts? Is there a way
to reproduce it?
Would you show me the output of dmesg (bge(4) and brgphy(4) only) and
"pciconf -lcbv"?
From owner-freebsd-net@FreeBSD.ORG Wed Mar 6 06:30:57 2013 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id EEDE8B3F for ; Wed, 6 Mar 2013 06:30:57 +0000 (UTC) (envelope-from sodynet1@gmail.com) Received: from mail-pb0-f53.google.com (mail-pb0-f53.google.com [209.85.160.53]) by mx1.freebsd.org (Postfix) with ESMTP id CD38B390 for ; Wed, 6 Mar 2013 06:30:57 +0000 (UTC) Received: by mail-pb0-f53.google.com with SMTP id un1so5499463pbc.40 for ; Tue, 05 Mar 2013 22:30:57 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:x-received:in-reply-to:references:date:message-id :subject:from:to:cc:content-type; bh=T6iM0YcmDpDtHfIJIQgDrMTmWTOR5/cSTNpaa1aLJnY=; b=jik4d+bvvP9bFkrtc5kxFh7InZf8gvhIVvy5roXu/yQWEbJBe7MWW7lqzV1eSsu4tS 5pCSwPDH3nhMrgO9U5RzMxG1MU46avkjGu51UivY2HkyTK/wFRxorLP4yQjaYrGoPPo8 7hOeKenl/HivmEO6fHJYsW4saRPrbYZgJMBZjkknsRmR3oJLubXsY1nQoK322ZaVl3ih y4W2j1ts0ekfo9iNc0nRkHVvH6RLMRG2hb9wmX+bYaswwY1IgrPgujzx1mthqdUd5u0v 6plfEAT84bPm077KSxf/I4ddQf6imVWPXIyupMV3s1DrsSJh7C0Fyhxn80huscUUTTLN cHPg== MIME-Version: 1.0 X-Received: by 10.68.196.225 with SMTP id ip1mr43465231pbc.72.1362551456896; Tue, 05 Mar 2013 22:30:56 -0800 (PST) Received: by 10.70.34.103 with HTTP; Tue, 5 Mar 2013 22:30:56 -0800 (PST) Received: by 10.70.34.103 with HTTP; Tue, 5 Mar 2013 22:30:56 -0800 (PST) In-Reply-To: References: Date: Wed, 6 Mar 2013 08:30:56 +0200 Message-ID: Subject: Re: Default route changes unexpectedly From: Sami Halabi To: Nick Rogers Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.14 Cc: freebsd-net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 06 Mar 2013 06:30:58 -0000 Hi, I can say also i 
faced this problem in 9.1-preRelease. And I'm not using pf, I usually use ipfw. But I didn't see this happening for a while...

Sami

On Mar 5, 2013 7:39 PM, "Nick Rogers" wrote:
> Hello,
>
> I am attempting to create awareness of a serious issue affecting users
> of FreeBSD 9.x and PF. There appears to be a bug that allows the
> kernel's routing table to be corrupted by traffic routing through the
> system. Under heavy traffic load, the default route can seemingly
> randomly change to an IP address that is not directly connected to the
> network (i.e., is not configured anywhere). Dhclient is not in the
> mix, nor is routed, bgpd, etc. Running `route monitor` shows no
> evidence of the change in the default route. The one commonality
> between all the systems experiencing this problem seems to be the use
> of PF.
>
> Obviously this is a serious problem as it causes all Internet-bound
> traffic to stop routing until the default route is corrected. Some
> users, including myself, are working around this problem by installing
> a script that runs multiple times a second to check if the default
> route is incorrect and fixing it if necessary, which mitigates the
> amount of downtime caused by the bug.
>
> Please refer to these past posts for more examples and evidence of
> other users experiencing this problem:
>
> http://forums.freebsd.org/showthread.php?p=211610#post211610
>
> http://freebsd.1045724.n5.nabble.com/Default-route-quot-random-quot-gateway-modification-bug-td5750820.html
>
> http://lists.freebsd.org/pipermail/freebsd-net/2012-March/031879.html
>
> http://lists.freebsd.org/pipermail/freebsd-ipfw/2010-September/004361.html
>
> There is also a PR that was incorrectly labeled as an IPFW issue.
> Others and I believe this issue is not restricted to the use of
> IPFW and that the PR should be relabeled.
I am inclined to think it is > strictly a PF issue since I am not using IPFW, however there is > evidence of the default route changing on people using IPFW for past > versions of FreeBSD (7.x/8.x), so perhaps this is related. > > http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/174749 > > Another PR for the same problem but specific to IPFW and 8.2-RELEASE > > http://www.freebsd.org/cgi/query-pr.cgi?pr=157796 > > I am hoping someone reading this can give the problem the attention it > deserves. Thank you. > > -Nick > _______________________________________________ > freebsd-net@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-net > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" > From owner-freebsd-net@FreeBSD.ORG Wed Mar 6 06:39:57 2013 Return-Path: Delivered-To: freebsd-net@smarthost.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 5913FD3D; Wed, 6 Mar 2013 06:39:57 +0000 (UTC) (envelope-from linimon@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:1900:2254:206c::16:87]) by mx1.freebsd.org (Postfix) with ESMTP id 1CDA33E3; Wed, 6 Mar 2013 06:39:57 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.6/8.14.6) with ESMTP id r266du6g007177; Wed, 6 Mar 2013 06:39:56 GMT (envelope-from linimon@freefall.freebsd.org) Received: (from linimon@localhost) by freefall.freebsd.org (8.14.6/8.14.6/Submit) id r266duAQ007176; Wed, 6 Mar 2013 06:39:56 GMT (envelope-from linimon) Date: Wed, 6 Mar 2013 06:39:56 GMT Message-Id: <201303060639.r266duAQ007176@freefall.freebsd.org> To: linimon@FreeBSD.org, freebsd-bugs@FreeBSD.org, freebsd-net@FreeBSD.org From: linimon@FreeBSD.org Subject: Re: kern/176671: [epair] MAC address for epair device not unique X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 06 Mar 2013 06:39:57 -0000 Old Synopsis: MAC address for epair device not unique New Synopsis: [epair] MAC address for epair device not unique Responsible-Changed-From-To: freebsd-bugs->freebsd-net Responsible-Changed-By: linimon Responsible-Changed-When: Wed Mar 6 06:39:09 UTC 2013 Responsible-Changed-Why: Over to maintainer(s). http://www.freebsd.org/cgi/query-pr.cgi?pr=176671 From owner-freebsd-net@FreeBSD.ORG Wed Mar 6 06:44:02 2013 Return-Path: Delivered-To: freebsd-net@smarthost.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 06D16EA5; Wed, 6 Mar 2013 06:44:02 +0000 (UTC) (envelope-from linimon@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:1900:2254:206c::16:87]) by mx1.freebsd.org (Postfix) with ESMTP id CBB1E5EC; Wed, 6 Mar 2013 06:44:01 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.6/8.14.6) with ESMTP id r266i1jc008797; Wed, 6 Mar 2013 06:44:01 GMT (envelope-from linimon@freefall.freebsd.org) Received: (from linimon@localhost) by freefall.freebsd.org (8.14.6/8.14.6/Submit) id r266i1Dh008796; Wed, 6 Mar 2013 06:44:01 GMT (envelope-from linimon) Date: Wed, 6 Mar 2013 06:44:01 GMT Message-Id: <201303060644.r266i1Dh008796@freefall.freebsd.org> To: linimon@FreeBSD.org, freebsd-bugs@FreeBSD.org, freebsd-net@FreeBSD.org From: linimon@FreeBSD.org Subject: Re: kern/176667: [libalias] [patch] libalias locks on uninitalized data X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 06 Mar 2013 06:44:02 -0000 Old Synopsis: libalias locks on uninitalized data New Synopsis: [libalias] [patch] libalias locks on 
uninitalized data Responsible-Changed-From-To: freebsd-bugs->freebsd-net Responsible-Changed-By: linimon Responsible-Changed-When: Wed Mar 6 06:43:36 UTC 2013 Responsible-Changed-Why: Over to maintainer(s). http://www.freebsd.org/cgi/query-pr.cgi?pr=176667 From owner-freebsd-net@FreeBSD.ORG Wed Mar 6 07:01:40 2013 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id F0EDC767 for ; Wed, 6 Mar 2013 07:01:40 +0000 (UTC) (envelope-from VenkatKumar.Duvvuru@Emulex.Com) Received: from CMEXEDGE1.ext.emulex.com (cmexedge1.ext.emulex.com [138.239.224.99]) by mx1.freebsd.org (Postfix) with ESMTP id A22B16A7 for ; Wed, 6 Mar 2013 07:01:40 +0000 (UTC) Received: from CMEXHTCAS2.ad.emulex.com (138.239.115.218) by CMEXEDGE1.ext.emulex.com (138.239.224.99) with Microsoft SMTP Server (TLS) id 14.2.318.4; Tue, 5 Mar 2013 23:02:53 -0800 Received: from CMEXMB1.ad.emulex.com ([169.254.1.137]) by CMEXHTCAS2.ad.emulex.com ([2002:8aef:73da::8aef:73da]) with mapi id 14.02.0318.004; Tue, 5 Mar 2013 23:01:31 -0800 From: "Duvvuru,Venkat Kumar" To: Josh Paetzel Subject: RE: OCE driver patches Thread-Topic: OCE driver patches Thread-Index: Ac4FKZvQ2m3Cu1QRR3u/QVUeigTouwB8AcmAA9n7JiAAEvolAADaXNqw Date: Wed, 6 Mar 2013 07:01:31 +0000 Message-ID: References: In-Reply-To: Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: yes X-MS-TNEF-Correlator: x-originating-ip: [138.239.140.229] Content-Type: multipart/mixed; boundary="_002_BF3270C86E8B1349A26C34E4EC1C44CB215D9B59CMEXMB1ademulex_" MIME-Version: 1.0 X-Content-Filtered-By: Mailman/MimeDel 2.1.14 Cc: "freebsd-net@freebsd.org" X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 06 Mar 2013 07:01:41 -0000 
--_002_BF3270C86E8B1349A26C34E4EC1C44CB215D9B59CMEXMB1ademulex_
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable

Hi Josh,

I'm attaching a .tgz file of the patches (oce0.patch to oce24.patch; please make sure that you apply them in the same order) that I told you about for Emulex's OCE driver.

I had opened a PR for the same. However, .tgz files are not allowed as attachments to the problem report, so I just renamed the .tgz file to .txt for uploading.
However, I have not gotten any email notification on that problem report, which could be because of the attachment problem, I guess.

Please let me know if I could open a PR without any attachment and send you that PR number.

Also, there is a notification about the FreeBSD 8.4 code freeze by March 8. It would be nice if we could get these patches in before the code freeze on 8.4 as well.

Pls suggest.

Thanks,
Venkat.

-----Original Message-----
From: Josh Paetzel [mailto:josh@tcbug.org]
Sent: Friday, March 01, 2013 8:08 PM
To: Duvvuru,Venkat Kumar
Cc: freebsd-net@freebsd.org
Subject: Re: OCE driver patches

On Mar 1, 2013, at 5:36 AM, "Duvvuru,Venkat Kumar" wrote:

> Hi Josh,
> I have a bunch of patches (~25 in number) to submit. Please let me know the process to submit them.
> Do I just attach them in a single email or open pr's for each of them??
> Pls suggest.
>
> /Venkat
>

Venkat,

I think it depends on how you want them committed to FreeBSD.

If the patches are atomic changes that should be kept atomic in the FreeBSD source tree then I'll commit them separately. This is a tad time consuming as I test them atomically before committing them. If they can be committed in one go then I can just apply them all, test the end result, and commit that.

One PR with the patches attached and a note saying these can all go in in one go is appropriate in the latter case, the former would be best served by separate PRs.
Thanks, Josh Paetzel --_002_BF3270C86E8B1349A26C34E4EC1C44CB215D9B59CMEXMB1ademulex_-- From owner-freebsd-net@FreeBSD.ORG Wed Mar 6 07:32:42 2013 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 9EB14F2C for ; Wed, 6 Mar 2013 07:32:42 +0000 (UTC) (envelope-from adrian.chadd@gmail.com) Received: from mail-wg0-f46.google.com (mail-wg0-f46.google.com [74.125.82.46]) by mx1.freebsd.org (Postfix) with ESMTP id 133B67A1 for ; Wed, 6 Mar 2013 07:32:41 +0000 (UTC) Received: by mail-wg0-f46.google.com with SMTP id fg15so6853704wgb.13 for ; Tue, 05 Mar 2013 23:32:40 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:x-received:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type; bh=Tynyg9fXL6DfIoA1/e9slZogS8k9CTqbtZDi5floLgk=; b=CTF+x27Ya5LjbXh32BYYzSe+bDbdk55VVecTJKiXzkox3XKd46wlgds1rMbbjnDpgA RjpVsbOxjxOH0SAQyLdLMp86HxQX4nbnfHXACkbGbWxnRFC6FJQI196yM3mIB/oi6Gni t1abd7GwwuHMskkzWEy6sL76S9r/kOr7sirRJ+1RjYraxq12guiE/i4feeIjVt5EquHC N0iUvfWnSmvjGZe2LFb4SkHkBpIGnRsGxwzo4BlUDfZkqAkr8PF61mrBndUD/i42d6uH xBRvgwRGnFpTL3/BJeEJa7oAxnzMVi44dATkPlPhkQ0iS7CXdckBXfalb/G+ZxyH7nV0 A6+Q== MIME-Version: 1.0 X-Received: by 10.180.108.3 with SMTP id hg3mr23594376wib.33.1362555160454; Tue, 05 Mar 2013 23:32:40 -0800 (PST) Sender: adrian.chadd@gmail.com Received: by 10.216.114.201 with HTTP; Tue, 5 Mar 2013 23:32:40 -0800 (PST) In-Reply-To: References: Date: Tue, 5 Mar 2013 23:32:40 -0800 X-Google-Sender-Auth: O_oVYtcpGYq0nhueZ0odFTAQD1w Message-ID: Subject: Re: Default route changes unexpectedly From: Adrian Chadd To: Nick Rogers Content-Type: text/plain; charset=ISO-8859-1 Cc: "freebsd-net@freebsd.org" X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: 
List-Help: List-Subscribe: , X-List-Received-Date: Wed, 06 Mar 2013 07:32:42 -0000 It's a known problem; it just seems that it doesn't overlap/intersect the day to day activities of any network focused freebsd developers. If you guys want it fixed then you may have to find a developer to hire on contract to fix it, or find some kind of ruleset/traffic generation setup that reliably triggers the bug. Adrian On 5 March 2013 09:39, Nick Rogers wrote: > Hello, > > I am attempting to create awareness of a serious issue affecting users > of FreeBSD 9.x and PF. There appears to be a bug that allows the > kernel's routing table to be corrupted by traffic routing through the > system. Under heavy traffic load, the default route can seemingly > randomly change to an IP address that is not directly connected to the > network (i.e., is not configured anywhere). Dhclient is not in the > mix, nor is routed, bgpd, etc. Running `route monitor` shows no > evidence of the change in the default route. The one commonality > between all the systems experiencing this problem seems to be the use > of PF. > > Obviously this is a serious problem as it causes all Internet-bound > traffic to stop routing until the default route is corrected. Some > users, including myself, are working around this problem by installing > a script that runs multiple times a second to check if the default > route is incorrect and fixing it if necessary, which mitigates the > amount of downtime caused by the bug. 
> > Please refer to these past posts for more examples and evidence of > other users experiencing this problem: > > http://forums.freebsd.org/showthread.php?p=211610#post211610 > > http://freebsd.1045724.n5.nabble.com/Default-route-quot-random-quot-gateway-modification-bug-td5750820.html > > http://lists.freebsd.org/pipermail/freebsd-net/2012-March/031879.html > > http://lists.freebsd.org/pipermail/freebsd-ipfw/2010-September/004361.html > > There is also a PR that was incorrectly labeled as an IPFW issue. > Myself and others believe this issue is not restricted to the use of > IPFW and that the PR should be relabeled. I am inclined to think it is > strictly a PF issue since I am not using IPFW, however there is > evidence of the default route changing on people using IPFW for past > versions of FreeBSD (7.x/8.x), so perhaps this is related. > > http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/174749 > > Another PR for the same problem but specific to IPFW and 8.2-RELEASE > > http://www.freebsd.org/cgi/query-pr.cgi?pr=157796 > > I am hoping someone reading this can give the problem the attention it > deserves. Thank you. 
> > -Nick > _______________________________________________ > freebsd-net@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-net > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" From owner-freebsd-net@FreeBSD.ORG Wed Mar 6 08:25:26 2013 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id B0BD7B73 for ; Wed, 6 Mar 2013 08:25:26 +0000 (UTC) (envelope-from andre@freebsd.org) Received: from c00l3r.networx.ch (c00l3r.networx.ch [62.48.2.2]) by mx1.freebsd.org (Postfix) with ESMTP id 11C3598F for ; Wed, 6 Mar 2013 08:25:25 +0000 (UTC) Received: (qmail 52640 invoked from network); 6 Mar 2013 09:39:06 -0000 Received: from c00l3r.networx.ch (HELO [127.0.0.1]) ([62.48.2.2]) (envelope-sender ) by c00l3r.networx.ch (qmail-ldap-1.03) with SMTP for ; 6 Mar 2013 09:39:06 -0000 Message-ID: <5136FD71.6000408@freebsd.org> Date: Wed, 06 Mar 2013 09:25:21 +0100 From: Andre Oppermann User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:17.0) Gecko/20130107 Thunderbird/17.0.2 MIME-Version: 1.0 To: Nick Rogers Subject: Re: Default route changes unexpectedly References: In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: "freebsd-net@freebsd.org" X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 06 Mar 2013 08:25:26 -0000 On 05.03.2013 18:39, Nick Rogers wrote: > Hello, > > I am attempting to create awareness of a serious issue affecting users > of FreeBSD 9.x and PF. There appears to be a bug that allows the > kernel's routing table to be corrupted by traffic routing through the > system. 
Under heavy traffic load, the default route can seemingly > randomly change to an IP address that is not directly connected to the > network (i.e., is not configured anywhere). Dhclient is not in the > mix, nor is routed, bgpd, etc. Running `route monitor` shows no > evidence of the change in the default route. The one commonality > between all the systems experiencing this problem seems to be the use > of PF. > > Obviously this is a serious problem as it causes all Internet-bound > traffic to stop routing until the default route is corrected. Some > users, including myself, are working around this problem by installing > a script that runs multiple times a second to check if the default > route is incorrect and fixing it if necessary, which mitigates the > amount of downtime caused by the bug. Can you describe your traffic forwarding setup in more detail? Is it only pf, or do you run netgraph, or other things as well? Do you use flow routing? How frequent does this happen? I'm trying to create a stack graph to see which parts of the network stack are involved in handling your packet. -- Andre > Please refer to these past posts for more examples and evidence of > other users experiencing this problem: > > http://forums.freebsd.org/showthread.php?p=211610#post211610 > > http://freebsd.1045724.n5.nabble.com/Default-route-quot-random-quot-gateway-modification-bug-td5750820.html > > http://lists.freebsd.org/pipermail/freebsd-net/2012-March/031879.html > > http://lists.freebsd.org/pipermail/freebsd-ipfw/2010-September/004361.html > > There is also a PR that was incorrectly labeled as an IPFW issue. > Myself and others believe this issue is not restricted to the use of > IPFW and that the PR should be relabeled. I am inclined to think it is > strictly a PF issue since I am not using IPFW, however there is > evidence of the default route changing on people using IPFW for past > versions of FreeBSD (7.x/8.x), so perhaps this is related. 
> > http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/174749 > > Another PR for the same problem but specific to IPFW and 8.2-RELEASE > > http://www.freebsd.org/cgi/query-pr.cgi?pr=157796 > > I am hoping someone reading this can give the problem the attention it > deserves. Thank you. > > -Nick > _______________________________________________ > freebsd-net@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-net > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" > > From owner-freebsd-net@FreeBSD.ORG Wed Mar 6 08:45:27 2013 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 5C7D311C; Wed, 6 Mar 2013 08:45:27 +0000 (UTC) (envelope-from krzysiek@airnet.opole.pl) Received: from base.airnet.opole.pl (ns2.airmax.pl [176.111.128.3]) by mx1.freebsd.org (Postfix) with ESMTP id 1AE26A53; Wed, 6 Mar 2013 08:45:25 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by base.airnet.opole.pl (Postfix) with ESMTP id 036B87FF059; Wed, 6 Mar 2013 09:38:47 +0100 (CET) Received: from base.airnet.opole.pl ([127.0.0.1]) by localhost (mail.airnet.opole.pl [127.0.0.1]) (maiad, port 10024) with ESMTP id 66913-06; Wed, 6 Mar 2013 09:38:46 +0100 (CET) Received: from [10.10.11.223] (unknown [176.111.138.12]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (No client certificate requested) (Authenticated sender: krzysiek@airnet.opole.pl) by base.airnet.opole.pl (Postfix) with ESMTPSA id 1C8E87FF04D; Wed, 6 Mar 2013 09:38:44 +0100 (CET) Message-ID: <51370093.40009@airnet.opole.pl> Date: Wed, 06 Mar 2013 09:38:43 +0100 From: Krzysztof Barcikowski User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:17.0) Gecko/20130215 Thunderbird/17.0.3 MIME-Version: 1.0 To: Andre Oppermann Subject: Re: Default route changes unexpectedly References: <5136FD71.6000408@freebsd.org> In-Reply-To: <5136FD71.6000408@freebsd.org> Content-Type: 
text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: "freebsd-net@freebsd.org" X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 06 Mar 2013 08:45:27 -0000 On 2013-03-06 09:25, Andre Oppermann wrote: > Can you describe your traffic forwarding setup in more detail? > Is it only pf, or do you run netgraph, or other things as well? > Do you use flow routing? > > How frequent does this happen? > > I'm trying to create a stack graph to see which parts of the network > stack are involved in handling your packet. > Hi, In my case, I do use PF for filtering and NAT (without routing options like 'route-to' or 'reply-to') together with ALTQ (PRIQ). I also use IPFW+Dummynet combo for shaping. net.inet.ip.sourceroute: 0 net.inet.ip.accept_sourceroute: 0 Router traffic is about 300Mb/s in peak. Frequency: Wed Oct 3 14:19:15 CEST 2012 Thu Dec 13 04:39:43 CET 2012 Thu Dec 13 04:39:46 CET 2012 Thu Dec 13 04:39:47 CET 2012 Thu Dec 13 04:39:50 CET 2012 Thu Dec 13 04:39:53 CET 2012 Thu Dec 13 04:39:59 CET 2012 Thu Dec 13 04:40:11 CET 2012 Fri Jan 4 07:47:00 CET 2013 Mon Jan 28 18:35:43 CET 2013 Sat Feb 2 22:43:01 CET 2013 I do only monitor default route change, but this bug also affects static routes (i.e. I have one static route and it changes more frequently than default route). Please let me know if I can provide any more feedback.
Krzysiek From owner-freebsd-net@FreeBSD.ORG Wed Mar 6 09:12:13 2013 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id B9448686; Wed, 6 Mar 2013 09:12:13 +0000 (UTC) (envelope-from dhartmei@insomnia.benzedrine.cx) Received: from insomnia.benzedrine.cx (cust.static.213-3-30-106.swisscomdata.ch [213.3.30.106]) by mx1.freebsd.org (Postfix) with ESMTP id D3995BE4; Wed, 6 Mar 2013 09:12:12 +0000 (UTC) Received: from insomnia.benzedrine.cx (localhost [127.0.0.1]) by insomnia.benzedrine.cx (8.14.5/8.14.5) with ESMTP id r268rCxp018755 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Wed, 6 Mar 2013 09:53:12 +0100 (MET) Received: (from dhartmei@localhost) by insomnia.benzedrine.cx (8.14.5/8.14.5/Submit) id r268rBUr023680; Wed, 6 Mar 2013 09:53:11 +0100 (MET) Date: Wed, 6 Mar 2013 09:53:11 +0100 From: Daniel Hartmeier To: Andre Oppermann Subject: Re: Default route changes unexpectedly Message-ID: <20130306085311.GA12382@insomnia.benzedrine.cx> References: <5136FD71.6000408@freebsd.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <5136FD71.6000408@freebsd.org> User-Agent: Mutt/1.5.21 (2010-09-15) Cc: Nick Rogers , "freebsd-net@freebsd.org" X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 06 Mar 2013 09:12:13 -0000 On Wed, Mar 06, 2013 at 09:25:21AM +0100, Andre Oppermann wrote: > I'm trying to create a stack graph to see which parts of the network > stack are involved in handling your packet. Ask people if they're using multiple pfil hooks (even just having ipfilter loaded counts, for instance). 
If that's a common factor, see http://marc.info/?l=freebsd-net&m=133888532814565&w=2 Daniel From owner-freebsd-net@FreeBSD.ORG Wed Mar 6 09:13:50 2013 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 213FE88F; Wed, 6 Mar 2013 09:13:50 +0000 (UTC) (envelope-from ermal.luci@gmail.com) Received: from mail-qe0-f49.google.com (mail-qe0-f49.google.com [209.85.128.49]) by mx1.freebsd.org (Postfix) with ESMTP id B7EB5BFC; Wed, 6 Mar 2013 09:13:49 +0000 (UTC) Received: by mail-qe0-f49.google.com with SMTP id 1so5145504qec.8 for ; Wed, 06 Mar 2013 01:13:43 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:x-received:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type; bh=WPUlVVDT8iKgrMxHdE8jqje/PV2KmIfkfRsKR3Hd4xg=; b=y6dLyPqK9SCtBHCTa5h1mD9ikmogMNJZQrvqlTtBFGsxSTOlKCNQZN5oqcUQ/nyHSw bnVOUemuaFSoFK147ylsavhBAv7qqXbfq/081RggUvx8cENh1vsCs8k5ba939zxBXZbd PyTQ3tkgQei+n/NwDMr1hdRnFCaGN2RIpzx4MvFo7hXJjK16YqU4+dJrDK+81V7Uh1Bk uT/XGDwnKVvF7zWT1bSzzTfBKcf3a/k5Tsl9+1XOsj4By7qtUXs76sxsU26ynjuA2X91 xbnL89FLJFtsqAOVKkmchsgEHKU64s6mDnl8R9tbrUFnsGFpJblshbW+TwP7LzlVO8fq Jghg== MIME-Version: 1.0 X-Received: by 10.49.30.70 with SMTP id q6mr46092639qeh.28.1362561223158; Wed, 06 Mar 2013 01:13:43 -0800 (PST) Sender: ermal.luci@gmail.com Received: by 10.49.27.197 with HTTP; Wed, 6 Mar 2013 01:13:43 -0800 (PST) In-Reply-To: <51370093.40009@airnet.opole.pl> References: <5136FD71.6000408@freebsd.org> <51370093.40009@airnet.opole.pl> Date: Wed, 6 Mar 2013 10:13:43 +0100 X-Google-Sender-Auth: -lDIT0I12nZHgWq6L3qEAhFF5O0 Message-ID: Subject: Re: Default route changes unexpectedly From: =?ISO-8859-1?Q?Ermal_Lu=E7i?= To: Krzysztof Barcikowski Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.14 Cc: "freebsd-net@freebsd.org" , Andre Oppermann X-BeenThere: 
freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 06 Mar 2013 09:13:50 -0000 On Wed, Mar 6, 2013 at 9:38 AM, Krzysztof Barcikowski <krzysiek@airnet.opole.pl> wrote: > On 2013-03-06 09:25, Andre Oppermann wrote: > > Can you describe your traffic forwarding setup in more detail? >> Is it only pf, or do you run netgraph, or other things as well? >> Do you use flow routing? >> >> How frequent does this happen? >> >> I'm trying to create a stack graph to see which parts of the network >> stack are involved in handling your packet. >> >> > Hi, > In my case, I do use PF for filtering and NAT (without routing options > like 'route-to' or 'reply-to') together with ALTQ (PRIQ). > I also use IPFW+Dummynet combo for shaping. > > net.inet.ip.sourceroute: 0 > net.inet.ip.accept_sourceroute: 0 > > Router traffic is about 300Mb/s in peak. > > Frequency: > Wed Oct 3 14:19:15 CEST 2012 > Thu Dec 13 04:39:43 CET 2012 > Thu Dec 13 04:39:46 CET 2012 > Thu Dec 13 04:39:47 CET 2012 > Thu Dec 13 04:39:50 CET 2012 > Thu Dec 13 04:39:53 CET 2012 > Thu Dec 13 04:39:59 CET 2012 > Thu Dec 13 04:40:11 CET 2012 > Fri Jan 4 07:47:00 CET 2013 > Mon Jan 28 18:35:43 CET 2013 > Sat Feb 2 22:43:01 CET 2013 > > I do only monitor default route change, but this bug also affects static > routes (i.e. I have one static route and it changes more frequently than > default route). > > Please let me know if I can provide any more feedback. > > Krzysiek > > > > Do you have flowtable support in your kernel? Can you try without it enabled?
> > > ______________________________**_________________ > freebsd-net@freebsd.org mailing list > http://lists.freebsd.org/**mailman/listinfo/freebsd-net > To unsubscribe, send any mail to "freebsd-net-unsubscribe@**freebsd.org > " > -- Ermal From owner-freebsd-net@FreeBSD.ORG Wed Mar 6 09:29:03 2013 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 2FF88EB for ; Wed, 6 Mar 2013 09:29:03 +0000 (UTC) (envelope-from krzysiek@airnet.opole.pl) Received: from base.airnet.opole.pl (ns2.airmax.pl [176.111.128.3]) by mx1.freebsd.org (Postfix) with ESMTP id D801DD34 for ; Wed, 6 Mar 2013 09:29:02 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by base.airnet.opole.pl (Postfix) with ESMTP id 0453D7FF055 for ; Wed, 6 Mar 2013 10:29:00 +0100 (CET) Received: from base.airnet.opole.pl ([127.0.0.1]) by localhost (mail.airnet.opole.pl [127.0.0.1]) (maiad, port 10024) with ESMTP id 54708-04 for ; Wed, 6 Mar 2013 10:28:59 +0100 (CET) Received: from [10.10.11.223] (unknown [176.111.138.12]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (No client certificate requested) (Authenticated sender: krzysiek@airnet.opole.pl) by base.airnet.opole.pl (Postfix) with ESMTPSA id C93937FF051 for ; Wed, 6 Mar 2013 10:28:59 +0100 (CET) Message-ID: <51370C5A.1080701@airnet.opole.pl> Date: Wed, 06 Mar 2013 10:28:58 +0100 From: Krzysztof Barcikowski User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:17.0) Gecko/20130215 Thunderbird/17.0.3 MIME-Version: 1.0 To: freebsd-net@freebsd.org Subject: Re: Default route changes unexpectedly References: <5136FD71.6000408@freebsd.org> <51370093.40009@airnet.opole.pl> In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 8bit X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: 
List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 06 Mar 2013 09:29:03 -0000 I believe I don't have flowtable support in the kernel (no FLOWTABLE option), and there are no sysctls related to flowtable. How can I check whether I'm using multiple pfil hooks? Best regards! Krzysiek On 2013-03-06 10:13, Ermal Luçi wrote: > On Wed, Mar 6, 2013 at 9:38 AM, Krzysztof Barcikowski <krzysiek@airnet.opole.pl> wrote: > >> On 2013-03-06 09:25, Andre Oppermann wrote: >> >> Can you describe your traffic forwarding setup in more detail? >>> Is it only pf, or do you run netgraph, or other things as well? >>> Do you use flow routing? >>> >>> How frequent does this happen? >>> >>> I'm trying to create a stack graph to see which parts of the network >>> stack are involved in handling your packet. >>> >>> >> Hi, >> In my case, I do use PF for filtering and NAT (without routing options >> like 'route-to' or 'reply-to') together with ALTQ (PRIQ). >> I also use IPFW+Dummynet combo for shaping. >> >> net.inet.ip.sourceroute: 0 >> net.inet.ip.accept_sourceroute: 0 >> >> Router traffic is about 300Mb/s in peak. >> >> Frequency: >> Wed Oct 3 14:19:15 CEST 2012 >> Thu Dec 13 04:39:43 CET 2012 >> Thu Dec 13 04:39:46 CET 2012 >> Thu Dec 13 04:39:47 CET 2012 >> Thu Dec 13 04:39:50 CET 2012 >> Thu Dec 13 04:39:53 CET 2012 >> Thu Dec 13 04:39:59 CET 2012 >> Thu Dec 13 04:40:11 CET 2012 >> Fri Jan 4 07:47:00 CET 2013 >> Mon Jan 28 18:35:43 CET 2013 >> Sat Feb 2 22:43:01 CET 2013 >> >> I do only monitor default route change, but this bug also affects static >> routes (i.e. I have one static route and it changes more frequently than >> default route). >> >> Please let me know if I can provide any more feedback. >> >> Krzysiek >> >> >> >> > Do you have flowtable support in your kernel? > Can you try without it enabled?
> > >> >> ______________________________**_________________ >> freebsd-net@freebsd.org mailing list >> http://lists.freebsd.org/**mailman/listinfo/freebsd-net >> To unsubscribe, send any mail to "freebsd-net-unsubscribe@**freebsd.org >> " >> > > From owner-freebsd-net@FreeBSD.ORG Wed Mar 6 10:00:38 2013 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 0E850C2A for ; Wed, 6 Mar 2013 10:00:38 +0000 (UTC) (envelope-from emz@norma.perm.ru) Received: from elf.hq.norma.perm.ru (unknown [IPv6:2001:470:1f09:14c0::2]) by mx1.freebsd.org (Postfix) with ESMTP id 575CFE5D for ; Wed, 6 Mar 2013 10:00:37 +0000 (UTC) Received: from bsdrookie.norma.com. ([IPv6:fd00::726]) by elf.hq.norma.perm.ru (8.14.5/8.14.5) with ESMTP id r26A0YB7029546 (version=TLSv1/SSLv3 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO) for ; Wed, 6 Mar 2013 16:00:35 +0600 (YEKT) (envelope-from emz@norma.perm.ru) Message-ID: <513713C2.1000007@norma.perm.ru> Date: Wed, 06 Mar 2013 16:00:34 +0600 From: "Eugene M. 
Zheganin" User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:17.0) Gecko/17.0 Thunderbird/17.0 MIME-Version: 1.0 To: freebsd-net@freebsd.org Subject: Re: FreeBSD 9.1-RELEASE + bge0 == watchdog timeout References: <201302241106.42477.vegeta@tuxpowered.net> <20130225082042.GB1426@michelle.cdnetworks.com> <512CF97B.8030805@norma.perm.ru> <20130227020123.GA3581@michelle.cdnetworks.com> <512DE968.4020409@quip.cz> <20130228053558.GA1474@michelle.cdnetworks.com> <5136D89D.4000902@norma.perm.ru> <20130306062658.GC1483@michelle.cdnetworks.com> In-Reply-To: <20130306062658.GC1483@michelle.cdnetworks.com> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.2.7 (elf.hq.norma.perm.ru [IPv6:fd00::30a]); Wed, 06 Mar 2013 16:00:35 +0600 (YEKT) X-Spam-Status: No hits=-97.8 bayes=0.5 testhits RDNS_NONE=1.274, SPF_SOFTFAIL=0.972,USER_IN_WHITELIST=-100 autolearn=no version=3.3.2 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on elf.hq.norma.perm.ru X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 06 Mar 2013 10:00:38 -0000 Hi. On 06.03.2013 12:26, YongHyeon PYUN wrote: > If you were using latest stable/8, the result would be same on > CURRENT. > How frequently do you see the watchdog timeouts? Is there way to > reproduce it? > Would you show me the output of dmesg (bge(4) and brgphy(4) only) > and "pciconf -lcbv"? I upgraded one of my routers to 8.3-STABLE 2 days ago, and got a freeze today. Uptime was less than a day. I have dozens of these IBM System x3250 servers, all of them running various 8.2-STABLE versions, which is why I worry that much. I don't know if this is triggered by some of my actions. These routers run gre/ipsec, different routing stuff (quagga, bird), proxies and pf.
In 2011/early 2012 I saw similar watchdog issues on these machines, and I disabled the tso on them. I don't know whether this is a coincidence or it really helps, but after that I didn't see these watchdog issues until today. I've also discovered that this particular server is running some old bioses/firmwares including the fact that it misses some NetXtreme updates available from IBM. Would applying such updates resolve the situation ? I am ok with that fact that I cannot run ipmi/sol on these machines, but it would be nice if this watchdog issue could be somehow resolved. Furthermore, I have some spare machines that I can provide full access to, including ipkvm stuff. Since the machine is only partially freezing, I cannot even rely on the ichwd and watchdogd to reboot it. pciconf (there's two controllers in this server, I use the first, but anyway): bge0@pci0:2:0:0: class=0x020000 card=0x03781014 chip=0x165a14e4 rev=0x00 hdr=0x00 vendor = 'Broadcom Corporation' device = 'Broadcom NetXtreme BCM5722 Gigabit (94309)' class = network subclass = ethernet bar [10] = type Memory, range 64, base 0xe8200000, size 65536, enabled cap 01[48] = powerspec 3 supports D0 D3 current D0 cap 03[50] = VPD cap 09[58] = vendor (length 120) cap 05[e8] = MSI supports 1 message, 64 bit enabled with 1 message cap 10[d0] = PCI-Express 1 endpoint max data 128(128) link x1(x1) speed 2.5(2.5) ecap 0001[100] = AER 1 0 fatal 0 non-fatal 2 corrected ecap 0002[13c] = VC 1 max VC0 ecap 0003[160] = Serial 1 001a64fffe21962d ecap 0004[16c] = Power Budgeting 1 bge1@pci0:3:1:0: class=0x020000 card=0x026f1014 chip=0x16c714e4 rev=0x10 hdr=0x00 vendor = 'Broadcom Corporation' device = 'BCM5703A3 NetXtreme Gigabit Ethernet' class = network subclass = ethernet bar [10] = type Memory, range 64, base 0xe8400000, size 65536, enabled cap 07[40] = PCI-X 64-bit supports 133MHz, 2048 burst read, 1 split transaction cap 01[48] = powerspec 2 supports D0 D3 current D0 cap 03[50] = VPD cap 05[58] = MSI supports 8 
messages, 64 bit dmesg: bge0: mem 0xe8200000-0xe820ffff irq 16 at device 0.0 on pci2 bge0: CHIP ID 0x0000a200; ASIC REV 0x0a; CHIP REV 0xa2; PCI-E miibus0: on bge0 bge0: Ethernet address: 00:1a:64:21:96:2d bge0: [FILTER] bge1: mem 0xe8400000-0xe840ffff irq 21 at device 1.0 on pci3 bge1: CHIP ID 0x00001100; ASIC REV 0x01; CHIP REV 0x11; PCI on PCI-X 33 MHz; 32bit miibus1: on bge1 bge1: Ethernet address: 00:1a:64:21:96:2e bge1: [ITHREAD] [emz@omega:~]# cat /var/run/dmesg.boot | egrep 'bge|brg' bge0: mem 0xe8200000-0xe820ffff irq 16 at device 0.0 on pci2 bge0: CHIP ID 0x0000a200; ASIC REV 0x0a; CHIP REV 0xa2; PCI-E miibus0: on bge0 brgphy0: PHY 1 on miibus0 brgphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow bge0: Ethernet address: 00:1a:64:21:96:2d bge0: [FILTER] bge1: mem 0xe8400000-0xe840ffff irq 21 at device 1.0 on pci3 bge1: CHIP ID 0x00001100; ASIC REV 0x01; CHIP REV 0x11; PCI on PCI-X 33 MHz; 32bit miibus1: on bge1 brgphy1: PHY 1 on miibus1 brgphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow bge1: Ethernet address: 00:1a:64:21:96:2e bge1: [ITHREAD] Thanks. Eugene. 
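The `pciconf -lcbv` output requested above can also be checked mechanically for which interrupt capabilities are actually enabled on each device. The following is only a rough sketch in Python, assuming the original multi-line layout of the output (one device header line followed by indented `cap` lines, not the flattened form shown in this archive); the function name `msi_status` and the trimmed `SAMPLE` text are made up for this example.

```python
import re

# Device header, e.g. "bge0@pci0:2:0:0: class=0x020000 ..."
DEVICE_RE = re.compile(r"^(\w+)@pci\S*")
# Capability line, e.g. "    cap 05[e8] = MSI supports 1 message, 64 bit enabled ..."
# ("ecap" lines are deliberately not matched.)
CAP_RE = re.compile(r"^\s+cap\s+[0-9a-f]{2}\[[0-9a-f]+\]\s*=\s*(.+)$")

# Trimmed excerpt of the output posted in this thread (layout assumed).
SAMPLE = """\
bge0@pci0:2:0:0: class=0x020000 card=0x03781014 chip=0x165a14e4 rev=0x00 hdr=0x00
    vendor     = 'Broadcom Corporation'
    cap 01[48] = powerspec 3  supports D0 D3  current D0
    cap 05[e8] = MSI supports 1 message, 64 bit enabled with 1 message
    ecap 0001[100] = AER 1 0 fatal 0 non-fatal 2 corrected
bge1@pci0:3:1:0: class=0x020000 card=0x026f1014 chip=0x16c714e4 rev=0x10 hdr=0x00
    cap 07[40] = PCI-X 64-bit supports 133MHz, 2048 burst read, 1 split transaction
    cap 05[58] = MSI supports 8 messages, 64 bit
"""


def msi_status(pciconf_text):
    """Map each device to {"MSI": bool, "MSI-X": bool} for the capabilities it lists.

    The boolean is True when the capability line contains the word "enabled".
    """
    status = {}
    device = None
    for line in pciconf_text.splitlines():
        match = DEVICE_RE.match(line)
        if match:
            device = match.group(1)
            status[device] = {}
            continue
        match = CAP_RE.match(line)
        if match and device is not None:
            words = match.group(1).split()
            kind = words[0]
            if kind in ("MSI", "MSI-X"):
                status[device][kind] = "enabled" in words
    return status
```

Calling `msi_status(SAMPLE)` gives a per-device dictionary that can be compared across machines, for instance between the hosts that show watchdog timeouts and those that do not.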
From owner-freebsd-net@FreeBSD.ORG Wed Mar 6 10:12:11 2013 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 4CC711CF for ; Wed, 6 Mar 2013 10:12:11 +0000 (UTC) (envelope-from s.khanchi@gmail.com) Received: from mail-wg0-f49.google.com (mail-wg0-f49.google.com [74.125.82.49]) by mx1.freebsd.org (Postfix) with ESMTP id E2AFDEF1 for ; Wed, 6 Mar 2013 10:12:10 +0000 (UTC) Received: by mail-wg0-f49.google.com with SMTP id 15so7048352wgd.28 for ; Wed, 06 Mar 2013 02:12:10 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=x-received:mime-version:sender:in-reply-to:references:from:date :x-google-sender-auth:message-id:subject:to:cc:content-type; bh=9lMR4V2uS0cOekjpDYrJNsbRjYW9wEFwVn9BHGebNcs=; b=l2NaisGQprDMO3rJF2qNkCZidkDHC31iZkXN+ziBdLzcuHog1SBNHk4+s1cBv9ZSM6 TvvIYVLnoC/4Pmdp3NE7ujTAz+sXF7T5/sZDc93QVH1sFpqJ3cy+UkwRev6GqJshXQRQ lMqs1nekO3J2R7D/V0DbXnu2eZaH+CTNmh8P+pm0BTDM/AGqbTEDprEC8Cm+HYr9bBpj z7BkV1CorlKMLWbvZ3adWnB2yf6xSmIv7OSDhV6hKSWtS8m1myVw4z3MR3uh7XXQPFzL y2+oxZ4uvuM/Z4/09TWcn1uqenJtPixBhUSyhtqv49pgw3mvDeXaoXt1Ee/tiJ/LmQJp HQfQ== X-Received: by 10.194.170.165 with SMTP id an5mr45512266wjc.41.1362564730098; Wed, 06 Mar 2013 02:12:10 -0800 (PST) MIME-Version: 1.0 Sender: s.khanchi@gmail.com Received: by 10.194.121.104 with HTTP; Wed, 6 Mar 2013 02:11:47 -0800 (PST) In-Reply-To: <8EB66934-D33C-425E-A076-66E31B618DCA@neville-neil.com> References: <8EB66934-D33C-425E-A076-66E31B618DCA@neville-neil.com> From: h bagade Date: Wed, 6 Mar 2013 13:41:47 +0330 X-Google-Sender-Auth: 6_LdrbOOVlHX6mo_X5ws-iu2EzI Message-ID: Subject: Re: how to get mac address info in kernel code? 
To: George Neville-Neil Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.14 Cc: freebsd-net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 06 Mar 2013 10:12:11 -0000 On Tue, Mar 5, 2013 at 7:23 PM, George Neville-Neil wrote: > > On Mar 5, 2013, at 08:54 , h bagade wrote: > > > Hi all, > > > > I need to get interface MAC address within the kernel code and I couldn't > > use "getifaddrs" because it's user-mode. How can I have the MAC address > > information within kernel code? > > > > Any hints or comments are really appreciated. > > If you have access to the struct ifnet you can look at the if_addr member, > which is > a struct ifaddr, defined in if_var.h . > > Best, > George > Thanks for your suggestion. I will give it a try. From owner-freebsd-net@FreeBSD.ORG Wed Mar 6 11:56:33 2013 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 82CEA686 for ; Wed, 6 Mar 2013 11:56:33 +0000 (UTC) (envelope-from emz@norma.perm.ru) Received: from elf.hq.norma.perm.ru (unknown [IPv6:2001:470:1f09:14c0::2]) by mx1.freebsd.org (Postfix) with ESMTP id 53010601 for ; Wed, 6 Mar 2013 11:56:32 +0000 (UTC) Received: from bsdrookie.norma.com. ([IPv6:fd00::726]) by elf.hq.norma.perm.ru (8.14.5/8.14.5) with ESMTP id r26BuTII049900 (version=TLSv1/SSLv3 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO) for ; Wed, 6 Mar 2013 17:56:30 +0600 (YEKT) (envelope-from emz@norma.perm.ru) Message-ID: <51372EED.7080803@norma.perm.ru> Date: Wed, 06 Mar 2013 17:56:29 +0600 From: "Eugene M.
Zheganin" User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:17.0) Gecko/17.0 Thunderbird/17.0 MIME-Version: 1.0 CC: freebsd-net@freebsd.org Subject: Re: FreeBSD 9.1-RELEASE + bge0 == watchdog timeout References: <201302241106.42477.vegeta@tuxpowered.net> <20130225082042.GB1426@michelle.cdnetworks.com> <512CF97B.8030805@norma.perm.ru> <20130227020123.GA3581@michelle.cdnetworks.com> <512DE968.4020409@quip.cz> <20130228053558.GA1474@michelle.cdnetworks.com> <5136D89D.4000902@norma.perm.ru> <20130306062658.GC1483@michelle.cdnetworks.com> In-Reply-To: <20130306062658.GC1483@michelle.cdnetworks.com> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.2.7 (elf.hq.norma.perm.ru [IPv6:fd00::30a]); Wed, 06 Mar 2013 17:56:30 +0600 (YEKT) X-Spam-Status: No hits=-96.5 bayes=0.5 testhits MISSING_HEADERS=1.207, RDNS_NONE=1.274,SPF_SOFTFAIL=0.972,USER_IN_WHITELIST=-100 autolearn=no version=3.3.2 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on elf.hq.norma.perm.ru X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 06 Mar 2013 11:56:33 -0000 Hi. On 06.03.2013 12:26, YongHyeon PYUN wrote: > If you were using latest stable/8, the result would be same on > CURRENT. > How frequently do you see the watchdog timeouts? Is there way to > reproduce it? > Would you show me the output of dmesg (bge(4) and brgphy(4) only) > and "pciconf -lcbv"? I just had a thought: I have never seen a watchdog timeout on i386. Like, never (on the same System x3250 and the same controllers; these servers are from the same batch). However, all of my i386 machines run less recent versions of FreeBSD. Does this make sense? I mean amd64 and related stuff. Thanks Eugene.
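For the workaround mentioned in the "Default route changes unexpectedly" thread (a script that checks the default route several times a second and restores it when it is wrong), a minimal sketch in Python might look like the following. It is not the script any poster actually runs: the gateway address 192.0.2.1, the polling interval and all function names are assumptions made for illustration, and it only looks at the IPv4 default route. It relies on `netstat -rn -f inet` and `route change default`, both standard FreeBSD commands.

```python
import subprocess
import time

EXPECTED_GATEWAY = "192.0.2.1"  # hypothetical; replace with the real gateway
POLL_INTERVAL = 0.25            # seconds; assumed value for "multiple times a second"


def parse_default_gateway(netstat_output):
    """Return the gateway column of the 'default' row, or None if there is none."""
    for line in netstat_output.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "default":
            return fields[1]
    return None


def read_routing_table():
    """Run `netstat -rn -f inet` and return its text output."""
    result = subprocess.run(["netstat", "-rn", "-f", "inet"],
                            capture_output=True, text=True, check=True)
    return result.stdout


def restore_default_route(gateway):
    """Point the default route back at the expected gateway (needs root)."""
    subprocess.run(["route", "change", "default", gateway], check=True)


def check_once(expected, read_table=read_routing_table,
               restore=restore_default_route):
    """Return True if the default route was already correct, False if it was reset."""
    current = parse_default_gateway(read_table())
    if current == expected:
        return True
    restore(expected)
    return False


def watch(expected=EXPECTED_GATEWAY, interval=POLL_INTERVAL):
    """Poll forever and print a timestamp each time the route had to be corrected."""
    while True:
        if not check_once(expected):
            print(time.strftime("%c"), "default route corrected")
        time.sleep(interval)
```

Calling `watch()` as root starts the loop. The timestamps it prints would give the same kind of frequency log that was posted earlier in the thread, which may help correlate the route changes with traffic load. This only hides the symptom; it does nothing about the underlying bug.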
From owner-freebsd-net@FreeBSD.ORG Wed Mar 6 15:21:09 2013
Return-Path:
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id A5AFEE7B for ; Wed, 6 Mar 2013 15:21:09 +0000 (UTC) (envelope-from ncrogers@gmail.com)
Received: from sam.nabble.com (sam.nabble.com [216.139.236.26]) by mx1.freebsd.org (Postfix) with ESMTP id 775A8E5 for ; Wed, 6 Mar 2013 15:21:08 +0000 (UTC)
Received: from [192.168.236.26] (helo=sam.nabble.com) by sam.nabble.com with esmtp (Exim 4.72) (envelope-from ) id 1UDG9O-0004DD-Ax for freebsd-net@freebsd.org; Wed, 06 Mar 2013 07:21:02 -0800
Date: Wed, 6 Mar 2013 07:21:02 -0800 (PST)
From: Courtland
To: freebsd-net@freebsd.org
Message-ID: <1362583262334-5793139.post@n5.nabble.com>
In-Reply-To: <2DE61B0869B7484997BCA012845482C7EBE62DDD88@WIN2008.Domnt.abi.ca>
References: <2DE61B0869B7484997BCA012845482C7EBE62DDD88@WIN2008.Domnt.abi.ca>
Subject: Re: kernel: arpresolve: can't allocate llinfo for 65.59.233.102
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD
X-List-Received-Date: Wed, 06 Mar 2013 15:21:09 -0000

Has there been any progress on resolving this problem? Does anyone have a
better idea as to where it is breaking down?

I am experiencing the same problem under FreeBSD 9.1-RELEASE. I use PF for
NAT, ALTQ, and RDR/filter rules. I'm not using PPPoE or dhclient. The
default gateway changes to an IP that is not on my network when under heavy
network load.

The last time this happened I had a stream of arpresolve messages in the
kernel for the IP that the default route was changed to.
Mar 5 19:12:53 kernel: arpresolve: can't allocate llinfo for 50.142.201.101 The default route was changed to 50.142.201.101 after these messages. -- View this message in context: http://freebsd.1045724.n5.nabble.com/kernel-arpresolve-can-t-allocate-llinfo-for-65-59-233-102-tp5742320p5793139.html Sent from the freebsd-net mailing list archive at Nabble.com. From owner-freebsd-net@FreeBSD.ORG Wed Mar 6 16:16:28 2013 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 3EFB9EE2 for ; Wed, 6 Mar 2013 16:16:28 +0000 (UTC) (envelope-from adrian.chadd@gmail.com) Received: from mail-we0-x22f.google.com (mail-we0-x22f.google.com [IPv6:2a00:1450:400c:c03::22f]) by mx1.freebsd.org (Postfix) with ESMTP id D88855F2 for ; Wed, 6 Mar 2013 16:16:27 +0000 (UTC) Received: by mail-we0-f175.google.com with SMTP id x8so8338885wey.20 for ; Wed, 06 Mar 2013 08:16:27 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:x-received:sender:date:x-google-sender-auth:message-id :subject:from:to:cc:content-type; bh=4qEZV/i3KKzSEHVRGpNufiVbrhmx3176fBsOpgkhyPY=; b=uxgywnmT5phLvZ5mEVfjL2AH4lHLBxQ8qVuRl6hFv6bO3npOOjmMz7tmcUsxPkgJuy FbelbZPuxDvj/dTPcmoQbqYueGP+S2tZhlZVmqTWbrVmZ+PLmYyU65JOL5iSRcjeR2Yb rCkEyoI7UclFg/s/vdSvDfj8SKd0+AyDEDJPCGBNjsarLP42numuufbkyWpFGpsqW9AY riBCbk0HkGIwK2jtmynQ/j8uOyX5to5yB5Wz28QWhjmg2ElZJDiD1O/fJAbM5JlbNmSU 1+cL+AskzCTBhO7LZqtYH2w8TSL63WqTfSfXzn3OxDJNO4K4/LYi3oK6f4IH+S1JVIjf gpSA== MIME-Version: 1.0 X-Received: by 10.180.87.170 with SMTP id az10mr27682576wib.3.1362586584029; Wed, 06 Mar 2013 08:16:24 -0800 (PST) Sender: adrian.chadd@gmail.com Received: by 10.217.51.2 with HTTP; Wed, 6 Mar 2013 08:16:23 -0800 (PST) Date: Wed, 6 Mar 2013 08:16:23 -0800 X-Google-Sender-Auth: XyQlxVmqtxkTvTsrIfDMhQHFid4 Message-ID: Subject: Default route changes unexpectedly #2 (was Re: kernel: arpresolve: can't 
allocate llinfo for 65.59.233.102) From: Adrian Chadd To: Courtland Content-Type: text/plain; charset=ISO-8859-1 Cc: freebsd-net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 06 Mar 2013 16:16:28 -0000 Another instance of it.. Adrian On 6 March 2013 07:21, Courtland wrote: > Has there been any progress on resolving this problem. Does anyone have a > better idea as to where it is breaking down? > > I am experiencing the same problem under FreeBSD 9.1-RELEASE. I use PF for > NAT, ALTQ, and RDR/filter rules. I'm not using PPPoE or dhclient. The > default gateway changes to an IP that is not on my network when under heavy > network load. > > The last time this happened I had a stream of arpresolve messages in the > kernel for the IP that the default route was changed to. > Mar 5 19:12:53 kernel: arpresolve: can't allocate llinfo for > 50.142.201.101 > The default route was changed to 50.142.201.101 after these messages. > > > > > -- > View this message in context: http://freebsd.1045724.n5.nabble.com/kernel-arpresolve-can-t-allocate-llinfo-for-65-59-233-102-tp5742320p5793139.html > Sent from the freebsd-net mailing list archive at Nabble.com. 
> _______________________________________________ > freebsd-net@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-net > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" From owner-freebsd-net@FreeBSD.ORG Wed Mar 6 18:27:48 2013 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 745B14DB for ; Wed, 6 Mar 2013 18:27:48 +0000 (UTC) (envelope-from andre@freebsd.org) Received: from c00l3r.networx.ch (c00l3r.networx.ch [62.48.2.2]) by mx1.freebsd.org (Postfix) with ESMTP id D2C13DBB for ; Wed, 6 Mar 2013 18:27:47 +0000 (UTC) Received: (qmail 67564 invoked from network); 6 Mar 2013 19:41:23 -0000 Received: from c00l3r.networx.ch (HELO [127.0.0.1]) ([62.48.2.2]) (envelope-sender ) by c00l3r.networx.ch (qmail-ldap-1.03) with SMTP for ; 6 Mar 2013 19:41:23 -0000 Message-ID: <51378A9D.6080306@freebsd.org> Date: Wed, 06 Mar 2013 19:27:41 +0100 From: Andre Oppermann User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:17.0) Gecko/20130107 Thunderbird/17.0.2 MIME-Version: 1.0 To: Courtland Subject: Re: Default route changes unexpectedly #2 (was Re: kernel: arpresolve: can't allocate llinfo for 65.59.233.102) References: In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 06 Mar 2013 18:27:48 -0000 Courtland, the arpresolve observation is very important. Do you have flowtable enabled in your kernel? -- Andre On 06.03.2013 17:16, Adrian Chadd wrote: > Another instance of it.. > Adrian > On 6 March 2013 07:21, Courtland wrote: >> Has there been any progress on resolving this problem. 
Does anyone have a >> better idea as to where it is breaking down? >> >> I am experiencing the same problem under FreeBSD 9.1-RELEASE. I use PF for >> NAT, ALTQ, and RDR/filter rules. I'm not using PPPoE or dhclient. The >> default gateway changes to an IP that is not on my network when under heavy >> network load. >> >> The last time this happened I had a stream of arpresolve messages in the >> kernel for the IP that the default route was changed to. >> Mar 5 19:12:53 kernel: arpresolve: can't allocate llinfo for >> 50.142.201.101 >> The default route was changed to 50.142.201.101 after these messages. >> >> >> >> >> -- >> View this message in context: http://freebsd.1045724.n5.nabble.com/kernel-arpresolve-can-t-allocate-llinfo-for-65-59-233-102-tp5742320p5793139.html >> Sent from the freebsd-net mailing list archive at Nabble.com. >> _______________________________________________ >> freebsd-net@freebsd.org mailing list >> http://lists.freebsd.org/mailman/listinfo/freebsd-net >> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" > _______________________________________________ > freebsd-net@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-net > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" > > From owner-freebsd-net@FreeBSD.ORG Wed Mar 6 21:02:37 2013 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 008BC8E8 for ; Wed, 6 Mar 2013 21:02:36 +0000 (UTC) (envelope-from fbsdmail@dnswatch.com) Received: from udns.ultimateDNS.NET (ultimatedns.net [209.180.214.225]) by mx1.freebsd.org (Postfix) with ESMTP id B7ABA836 for ; Wed, 6 Mar 2013 21:02:35 +0000 (UTC) Received: from udns.ultimateDNS.NET (localhost [127.0.0.1]) by udns.ultimateDNS.NET (8.14.5/8.14.5) with ESMTP id r26L2TUN033493 for ; Wed, 6 Mar 2013 13:02:35 -0800 (PST) (envelope-from fbsdmail@dnswatch.com) 
Received: (from www@localhost) by udns.ultimateDNS.NET (8.14.5/8.14.5/Submit) id r26L2OPa033492; Wed, 6 Mar 2013 13:02:24 -0800 (PST) (envelope-from fbsdmail@dnswatch.com)
Received: from udns.ultimatedns.net ([209.180.214.225]) (UDNSMS authenticated user chrish) by ultimatedns.net with HTTP; Wed, 6 Mar 2013 13:02:24 -0800 (PST)
Message-ID:
Date: Wed, 6 Mar 2013 13:02:24 -0800 (PST)
Subject: Implementing IP6 in 8.3
From: "freebsd-net"
To: "freebsd-net"
User-Agent: UDNSMS/2.0.3
MIME-Version: 1.0
Content-Type: text/plain;charset=utf-8
Content-Transfer-Encoding: 8bit
X-Priority: 3 (Normal)
Importance: Normal
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD
X-List-Received-Date: Wed, 06 Mar 2013 21:02:37 -0000

Greetings,

I'm evaluating an ISP for the sake of building BSD operating systems on
hardware that they use (DSL modems, in this case). When I had my old NEC
server, I had a MIPS environment to develop in. I managed a 28k kernel.
In any case, I'm back at it for use in a lot of hardware I have lying
around.

In my current situation, I'm using a ZYXEL Q1000Z modem to connect to
their service. While it's a relatively new modem, it doesn't support IP6.
It is my hope to replace the OS with one that does. :)

I leased a /48 of IP4's from them, which /also/ came with as many IP6's.
So, not having implemented IP6 on any of my boxes (except by way of tunnel
brokers), I'm wondering 2 things:

If my underlying OS (FreeBSD-8.3) can support IP6, will it still function,
even though my gateway (modem) doesn't?

Am I /correctly/ attempting to use it?

I'm answering authoritatively for the many domains I own. They have all
functioned well for many years via IP4. I have added the requisite AAAA
records in all the zones, as well as the associated RR's.
While the gateway (modem) /does/ have an IP6 address, I can't "speak" for it out of DNS, because it would be an "out of zone" record. Even tho I'm the RP for the /48. So it's up to the modem to answer accordingly. BUT, I'm not sure I'm initiating any of this correctly via rc(8). Or more specifically, via rc.conf(5). While I've read as much as I can find on the topic related to BSD, boot messages indicate at least -- "IP6 gateway unreachable". I'm currently using: rc.conf(5): ipv6_ifconfig_re0="2602:00d1:b4d6:e100:0000:0000:0000:0000" ipv6_defaultrouter="2602:00d1:b4d6:e600:0000:0000:0000:0000" I also have the corresponding host IP in hosts(5). Any help, pointers, guidance, answers /greatly/ appreciated. Thank you for all your time, and consideration. --Chris From owner-freebsd-net@FreeBSD.ORG Thu Mar 7 02:25:00 2013 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id EA82ADD2 for ; Thu, 7 Mar 2013 02:25:00 +0000 (UTC) (envelope-from pyunyh@gmail.com) Received: from mail-pb0-f47.google.com (mail-pb0-f47.google.com [209.85.160.47]) by mx1.freebsd.org (Postfix) with ESMTP id 94D46926 for ; Thu, 7 Mar 2013 02:25:00 +0000 (UTC) Received: by mail-pb0-f47.google.com with SMTP id rp2so6899271pbb.20 for ; Wed, 06 Mar 2013 18:24:54 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=x-received:from:date:to:cc:subject:message-id:reply-to:references :mime-version:content-type:content-disposition:in-reply-to :user-agent; bh=YPZ8COZutJIEvqb0+oF9RXA1P51EFK3FBcE9iwJVvH8=; b=r5pS8R6NfJMjusQLxzlaVWwwEro8drxWKzLr4BR0VE8bEI8nBhPIP/NI1/agAYyp+J V0wm6n/1NuQco0paym2Rnvt6P3sJW3ZMEVzWhLOmn1AeDpbffT0tQZtjfKzHSHTYZfiH LOoOEcPVVmzIsAr3EW5PmOYeSQMHmS2TLlcng8VTJfcjaP00xRSvZyTaE88VXkw97nme lO0uHstQy7XMzUljgvn6/LD/ZZL/wUyRUeAD9T3uo+eBYrbRYWlXTttI+mPzxIKRCqDf LFrlIr1USSYbS+ydvD7ghzOZyPWAFIqa4Wbcjv9W++e7kjQht0lwpUuWQgIkrBaUgzw2 YsZw== X-Received: by 
10.68.33.98 with SMTP id q2mr51351215pbi.135.1362623094627; Wed, 06 Mar 2013 18:24:54 -0800 (PST) Received: from pyunyh@gmail.com (lpe4.p59-icn.cdngp.net. [114.111.62.249]) by mx.google.com with ESMTPS id av14sm243052pac.18.2013.03.06.18.24.51 (version=TLSv1 cipher=RC4-SHA bits=128/128); Wed, 06 Mar 2013 18:24:53 -0800 (PST) Received: by pyunyh@gmail.com (sSMTP sendmail emulation); Thu, 07 Mar 2013 11:24:46 +0900 From: YongHyeon PYUN Date: Thu, 7 Mar 2013 11:24:46 +0900 To: "Eugene M. Zheganin" Subject: Re: FreeBSD 9.1-RELEASE + bge0 == watchdog timeout Message-ID: <20130307022446.GB3108@michelle.cdnetworks.com> References: <201302241106.42477.vegeta@tuxpowered.net> <20130225082042.GB1426@michelle.cdnetworks.com> <512CF97B.8030805@norma.perm.ru> <20130227020123.GA3581@michelle.cdnetworks.com> <512DE968.4020409@quip.cz> <20130228053558.GA1474@michelle.cdnetworks.com> <5136D89D.4000902@norma.perm.ru> <20130306062658.GC1483@michelle.cdnetworks.com> <513713C2.1000007@norma.perm.ru> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <513713C2.1000007@norma.perm.ru> User-Agent: Mutt/1.4.2.3i Cc: freebsd-net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list Reply-To: pyunyh@gmail.com List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 07 Mar 2013 02:25:01 -0000 On Wed, Mar 06, 2013 at 04:00:34PM +0600, Eugene M. Zheganin wrote: > Hi. > Hi. > > On 06.03.2013 12:26, YongHyeon PYUN wrote: > > If you were using latest stable/8, the result would be same on > > CURRENT. > > How frequently do you see the watchdog timeouts? Is there way to > > reproduce it? > > Would you show me the output of dmesg (bge(4) and brgphy(4) only) > > and "pciconf -lcbv"? > I upgraded one om my routers 2 days ago to 8.3-STABLE, and got today a > freeze. Uptime was less than a day. 
> I have like dozens of these IBM system x3250, all of them run various > 8.2-STABLE's, that's why I worry that much. I don't know if this is What was previous SVN revision number on that machine? The support for 5718/5719/5720 was merged to stable/8 about 3 months ago. > triggered by some of my actions. These routers run gre/ipsec, dirrerent > routing stuff (quagga, bird), proxies and pf. In 2011/early 2012 I saw > similar watchdog issues on these machines, and I disabled the tso on > them. I don't know whether this is a coincidence or it really helps, but > after that I didn't see these watchdog issues until today. I'm not aware of TSO issue on your controller. pf(4) had TSO issue but I guess it was fixed long time ago. > > I've also discovered that this particular server is running some old > bioses/firmwares including the fact that it misses some NetXtreme > updates available from IBM. Would applying such updates resolve the > situation ? > Updating etherent controller firmware is always good idea. But I'm not sure whether this address the issue. > I am ok with that fact that I cannot run ipmi/sol on these machines, but > it would be nice if this watchdog issue could be somehow resolved. Actually this is the first report after the merge which seems to break bge(4). > Furthermore, I have some spare machines that I can provide full access > to, including ipkvm stuff. Since the machine is only partially freezing, > I cannot even rely on the ichwd and watchdogd to reboot it. Sorry no clue yet. > > pciconf (there's two controllers in this server, I use the first, but > anyway): Thanks for the info. [...] 
From owner-freebsd-net@FreeBSD.ORG Thu Mar 7 05:09:00 2013 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id BCC0B624 for ; Thu, 7 Mar 2013 05:09:00 +0000 (UTC) (envelope-from emz@norma.perm.ru) Received: from elf.hq.norma.perm.ru (unknown [IPv6:2001:470:1f09:14c0::2]) by mx1.freebsd.org (Postfix) with ESMTP id 4C780F16 for ; Thu, 7 Mar 2013 05:09:00 +0000 (UTC) Received: from [192.168.248.33] ([192.168.248.33]) by elf.hq.norma.perm.ru (8.14.5/8.14.5) with ESMTP id r2758vVt080708 (version=TLSv1/SSLv3 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO) for ; Thu, 7 Mar 2013 11:08:57 +0600 (YEKT) (envelope-from emz@norma.perm.ru) Message-ID: <513820E2.806@norma.perm.ru> Date: Thu, 07 Mar 2013 11:08:50 +0600 From: "Eugene M. Zheganin" User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:17.0) Gecko/20130215 Thunderbird/17.0.3 MIME-Version: 1.0 To: freebsd-net@freebsd.org Subject: Re: FreeBSD 9.1-RELEASE + bge0 == watchdog timeout References: <201302241106.42477.vegeta@tuxpowered.net> <20130225082042.GB1426@michelle.cdnetworks.com> <512CF97B.8030805@norma.perm.ru> <20130227020123.GA3581@michelle.cdnetworks.com> <512DE968.4020409@quip.cz> <20130228053558.GA1474@michelle.cdnetworks.com> <5136D89D.4000902@norma.perm.ru> <20130306062658.GC1483@michelle.cdnetworks.com> <513713C2.1000007@norma.perm.ru> <20130307022446.GB3108@michelle.cdnetworks.com> In-Reply-To: <20130307022446.GB3108@michelle.cdnetworks.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.2.7 (elf.hq.norma.perm.ru [192.168.3.10]); Thu, 07 Mar 2013 11:08:57 +0600 (YEKT) X-Spam-Status: No hits=-101.0 bayes=0.5 testhits ALL_TRUSTED=-1, USER_IN_WHITELIST=-100 autolearn=unavailable version=3.3.2 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on elf.hq.norma.perm.ru 
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD
X-List-Received-Date: Thu, 07 Mar 2013 05:09:00 -0000

Hi.

On 07.03.2013 8:24, YongHyeon PYUN wrote:
> What was previous SVN revision number on that machine?
> The support for 5718/5719/5720 was merged to stable/8 about 3
> months ago.
>

It was definitely older than "months". It was running something similar
to "FreeBSD 8.2-STABLE #0: Mon Sep 19 08:10:00 YEKST 2011"; this is the
uname from a neighboring machine.

I have, as I said, identical servers running FreeBSD. Here are some of
the unames that I don't see timeouts on:

8.3-STABLE #2: Wed Aug 29 13:00:02 YEKT 2012 (up 187 days)
8.3-PRERELEASE #1: Thu Mar 29 16:14:11 MSK 2012 (up 15 days, previous uptime around 180 days)
8.2-STABLE #0: Wed Dec 14 16:56:11 YEKT 2011 (up 99 days)

One more question: could it be a ZFS-related issue? Some kernel-level
locking? All of those run ZFS as well (no UFS at all).

Thanks.
Eugene.
From owner-freebsd-net@FreeBSD.ORG Thu Mar 7 06:23:45 2013 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 122D6EFB for ; Thu, 7 Mar 2013 06:23:45 +0000 (UTC) (envelope-from pyunyh@gmail.com) Received: from mail-pb0-f54.google.com (mail-pb0-f54.google.com [209.85.160.54]) by mx1.freebsd.org (Postfix) with ESMTP id AE742196 for ; Thu, 7 Mar 2013 06:23:44 +0000 (UTC) Received: by mail-pb0-f54.google.com with SMTP id rr4so120094pbb.41 for ; Wed, 06 Mar 2013 22:23:44 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=x-received:from:date:to:cc:subject:message-id:reply-to:references :mime-version:content-type:content-disposition:in-reply-to :user-agent; bh=JDLNkJAP0h5DdluY+tYGUCTMWl05ObqPfXOZ945CAYM=; b=FtFTWEbTe6mjYy83ylJsxMS6HOYUhOxrtGfORWWcA3WoK4axgH82WABYEtOrkbYr1O RG1d7mYWAQVJd5DGGJrcF7fEdHdr83UtgM39jwkoV8gdU9UEXYXqocCt5a0dROI7ZZjs Llk9RkB5FDbNQVdOXwcUom8nHngxO7GxRC0/EPRcGEUryAWAmsXJrKZhtccfAEpdZ3hE LdhQF/k1pJcxc8BxCnbRs6OkBRab3kYtOq8J3e0YlPx59adUl/Z4NLCyrtk/u2FkTz17 QcKGyHV9eQxu/oEZE/fLyXdFy0Em0d4Y5iNGaFkMYyrRuV7AI8ywOkzL/FKPtQwQE0Wa cu9A== X-Received: by 10.68.116.169 with SMTP id jx9mr51243706pbb.94.1362637424182; Wed, 06 Mar 2013 22:23:44 -0800 (PST) Received: from pyunyh@gmail.com (lpe4.p59-icn.cdngp.net. [114.111.62.249]) by mx.google.com with ESMTPS id f10sm1014220paf.17.2013.03.06.22.23.40 (version=TLSv1 cipher=RC4-SHA bits=128/128); Wed, 06 Mar 2013 22:23:42 -0800 (PST) Received: by pyunyh@gmail.com (sSMTP sendmail emulation); Thu, 07 Mar 2013 15:23:35 +0900 From: YongHyeon PYUN Date: Thu, 7 Mar 2013 15:23:35 +0900 To: "Eugene M. 
Zheganin" Subject: Re: FreeBSD 9.1-RELEASE + bge0 == watchdog timeout Message-ID: <20130307062335.GB1478@michelle.cdnetworks.com> References: <20130225082042.GB1426@michelle.cdnetworks.com> <512CF97B.8030805@norma.perm.ru> <20130227020123.GA3581@michelle.cdnetworks.com> <512DE968.4020409@quip.cz> <20130228053558.GA1474@michelle.cdnetworks.com> <5136D89D.4000902@norma.perm.ru> <20130306062658.GC1483@michelle.cdnetworks.com> <513713C2.1000007@norma.perm.ru> <20130307022446.GB3108@michelle.cdnetworks.com> <513820E2.806@norma.perm.ru> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <513820E2.806@norma.perm.ru> User-Agent: Mutt/1.4.2.3i Cc: freebsd-net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list Reply-To: pyunyh@gmail.com List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 07 Mar 2013 06:23:45 -0000 On Thu, Mar 07, 2013 at 11:08:50AM +0600, Eugene M. Zheganin wrote: > Hi. > > On 07.03.2013 8:24, YongHyeon PYUN wrote: > >What was previous SVN revision number on that machine? > >The support for 5718/5719/5720 was merged to stable/8 about 3 > >months ago. > > > It was definitely older than "months". It was running something similar > to "FreeBSD 8.2-STABLE #0: Mon Sep 19 08:10:00 YEKST 2011", this is the > uname from a neighbor machine. > > I have, as I said, identical servers running FreeBSD. Here are some of > the unames that I don't see timeouts on: > > 8.3-STABLE #2: Wed Aug 29 13:00:02 YEKT 2012 (up 187 days) > 8.3-PRERELEASE #1: Thu Mar 29 16:14:11 MSK 2012 (up 15 days, previous > uptime around 180 days) These servers do not have 5718/5719/5720 changes. > 8.2-STABLE #0: Wed Dec 14 16:56:11 YEKT 2011 (up 99 days) This server has the bge(4) change but it didn't trigger watchdog timeouts. Does this server use the same controller? If yes, the issue didn't come from bge(4) change. 
> > One more question: could it be a zfs-related issue ? Some kernel-level > locking ? All of those run zfs also (no ufs at all). Sorry I have no idea on ZFS. From owner-freebsd-net@FreeBSD.ORG Thu Mar 7 06:24:57 2013 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 311C3F94 for ; Thu, 7 Mar 2013 06:24:57 +0000 (UTC) (envelope-from zeus@ibs.dn.ua) Received: from relay.ibs.dn.ua (relay.ibs.dn.ua [91.216.196.25]) by mx1.freebsd.org (Postfix) with ESMTP id 7ED5F1A6 for ; Thu, 7 Mar 2013 06:24:56 +0000 (UTC) Received: from ibs.dn.ua (relay.ibs.dn.ua [91.216.196.25]) by relay.ibs.dn.ua with ESMTP id r276MpqC085463 for ; Thu, 7 Mar 2013 08:22:52 +0200 (EET) Message-ID: <20130307082251.85461@relay.ibs.dn.ua> Date: Thu, 07 Mar 2013 08:22:51 +0300 From: Zeus Panchenko To: Subject: Re: FreeBSD 9.1-RELEASE + bge0 == watchdog timeout In-reply-to: Your message of Thu, 7 Mar 2013 11:24:46 +0900 <20130307022446.GB3108@michelle.cdnetworks.com> References: <201302241106.42477.vegeta@tuxpowered.net> <20130225082042.GB1426@michelle.cdnetworks.com> <512CF97B.8030805@norma.perm.ru> <20130227020123.GA3581@michelle.cdnetworks.com> <512DE968.4020409@quip.cz> <20130228053558.GA1474@michelle.cdnetworks.com> <5136D89D.4000902@norma.perm.ru> <20130306062658.GC1483@michelle.cdnetworks.com> <513713C2.1000007@norma.perm.ru> <20130307022446.GB3108@michelle.cdnetworks.com> Organization: I.B.S. 
LLC
X-Mailer: MH-E 8.3.1; GNU Mailutils 2.99.97; GNU Emacs 24.0.93
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
Reply-To: Zeus Panchenko
List-Id: Networking and TCP/IP with FreeBSD
X-List-Received-Date: Thu, 07 Mar 2013 06:24:57 -0000

Hi,

here is my situation, which looks much like this issue.

On 06.03.2013 12:26, YongHyeon PYUN wrote:
> If you were using latest stable/8, the result would be same on
> CURRENT.

I use FreeBSD 9.1-RELEASE #0 r243825 (amd64) with ZFS on an HP ProLiant
DL360e Gen8.

According to the pciconf data, the box has two four-port cards: an
igb(4) I350 and a bge(4) NetXtreme BCM5719.

> How frequently do you see the watchdog timeouts? Is there way to
> reproduce it?

I noticed that after activation, bge(4) stops responding and the
interface becomes useless, while igb(4) works fine after some sysctl
tuning.

For now I'm forced not to use bge(4) at all :(

> Would you show me the output of dmesg (bge(4) and brgphy(4) only)
> and "pciconf -lcbv"?
> grep "bge\|brgphy" dmesg.boot bge0: mem 0xfa3f0000-0xfa3fffff,0xfa3e0000-0xfa3effff,0xfa3d0000-0xfa3dffff irq 40 at device 0.0 on pci6 bge0: CHIP ID 0x05719001; ASIC REV 0x5719; CHIP REV 0x57190; PCI-E miibus0: on bge0 bge0: Ethernet address: ac:16:2d:83:ec:2c bge1: mem 0xfa3c0000-0xfa3cffff,0xfa3b0000-0xfa3bffff,0xfa3a0000-0xfa3affff irq 44 at device 0.1 on pci6 bge1: CHIP ID 0x05719001; ASIC REV 0x5719; CHIP REV 0x57190; PCI-E miibus1: on bge1 bge1: Ethernet address: ac:16:2d:83:ec:2d bge2: mem 0xfa390000-0xfa39ffff,0xfa380000-0xfa38ffff,0xfa370000-0xfa37ffff irq 40 at device 0.2 on pci6 bge2: CHIP ID 0x05719001; ASIC REV 0x5719; CHIP REV 0x57190; PCI-E miibus2: on bge2 bge2: Ethernet address: ac:16:2d:83:ec:2e bge3: mem 0xfa360000-0xfa36ffff,0xfa350000-0xfa35ffff,0xfa340000-0xfa34ffff irq 44 at device 0.3 on pci6 bge3: CHIP ID 0x05719001; ASIC REV 0x5719; CHIP REV 0x57190; PCI-E miibus3: on bge3 bge3: Ethernet address: ac:16:2d:83:ec:2f brgphy0: PHY 1 on miibus0 brgphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow brgphy1: PHY 2 on miibus1 brgphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow brgphy2: PHY 3 on miibus2 brgphy2: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow brgphy3: PHY 4 on miibus3 brgphy3: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow > pciconf -lcbv hostb0@pci0:0:0:0: class=0x060000 card=0x18a8103c chip=0x3c008086 rev=0x07 hdr=0x00 vendor = 'Intel Corporation' device = 'Sandy Bridge DMI2' class = bridge subclass = HOST-PCI cap 10[90] = PCI-Express 2 root port max data 128(128) link x0(x4) cap 01[e0] = powerspec 3 supports D0 D3 current D0 ecap 000b[100] = unknown 1 ecap 000b[144] = unknown 1 ecap 000b[1d0] = 
unknown 1 ecap 000b[280] = unknown 1 pcib1@pci0:0:1:0: class=0x060400 card=0x18a8103c chip=0x3c028086 rev=0x07 hdr=0x01 vendor = 'Intel Corporation' device = 'Sandy Bridge IIO PCI Express Root Port 1a' class = bridge subclass = PCI-PCI cap 0d[40] = PCI Bridge card=0x18a8103c cap 05[60] = MSI supports 2 messages, vector masks cap 10[90] = PCI-Express 2 root port max data 256(256) link x0(x4) cap 01[e0] = powerspec 3 supports D0 D3 current D0 ecap 000b[100] = unknown 1 ecap 000d[110] = unknown 1 ecap 0001[148] = AER 1 0 fatal 0 non-fatal 0 corrected ecap 000b[1d0] = unknown 1 ecap 0019[250] = unknown 1 ecap 000b[280] = unknown 1 pcib2@pci0:0:1:1: class=0x060400 card=0x18a8103c chip=0x3c038086 rev=0x07 hdr=0x01 vendor = 'Intel Corporation' device = 'Sandy Bridge IIO PCI Express Root Port 1b' class = bridge subclass = PCI-PCI cap 0d[40] = PCI Bridge card=0x18a8103c cap 05[60] = MSI supports 2 messages, vector masks cap 10[90] = PCI-Express 2 root port max data 256(256) link x0(x4) cap 01[e0] = powerspec 3 supports D0 D3 current D0 ecap 000b[100] = unknown 1 ecap 000d[110] = unknown 1 ecap 0001[148] = AER 1 0 fatal 0 non-fatal 0 corrected ecap 000b[1d0] = unknown 1 ecap 0019[250] = unknown 1 ecap 000b[280] = unknown 1 pcib3@pci0:0:3:0: class=0x060400 card=0x18a8103c chip=0x3c088086 rev=0x07 hdr=0x01 vendor = 'Intel Corporation' device = 'Sandy Bridge IIO PCI Express Root Port 3a in PCI Express Mode' class = bridge subclass = PCI-PCI cap 0d[40] = PCI Bridge card=0x18a8103c cap 05[60] = MSI supports 2 messages, vector masks cap 10[90] = PCI-Express 2 root port max data 256(256) link x4(x16) cap 01[e0] = powerspec 3 supports D0 D3 current D0 ecap 000b[100] = unknown 1 ecap 000d[110] = unknown 1 ecap 0001[148] = AER 1 0 fatal 0 non-fatal 0 corrected ecap 000b[1d0] = unknown 1 ecap 0019[250] = unknown 1 ecap 000b[280] = unknown 1 pcib4@pci0:0:3:1: class=0x060400 card=0x18a8103c chip=0x3c098086 rev=0x07 hdr=0x01 vendor = 'Intel Corporation' device = 'Sandy Bridge IIO PCI 
Express Root Port 3b' class = bridge subclass = PCI-PCI cap 0d[40] = PCI Bridge card=0x18a8103c cap 05[60] = MSI supports 2 messages, vector masks cap 10[90] = PCI-Express 2 root port max data 256(256) link x0(x4) cap 01[e0] = powerspec 3 supports D0 D3 current D0 ecap 000b[100] = unknown 1 ecap 000d[110] = unknown 1 ecap 0001[148] = AER 1 0 fatal 0 non-fatal 0 corrected ecap 000b[1d0] = unknown 1 ecap 0019[250] = unknown 1 ecap 000b[280] = unknown 1 pcib5@pci0:0:3:2: class=0x060400 card=0x18a8103c chip=0x3c0a8086 rev=0x07 hdr=0x01 vendor = 'Intel Corporation' device = 'Sandy Bridge IIO PCI Express Root Port 3c' class = bridge subclass = PCI-PCI cap 0d[40] = PCI Bridge card=0x18a8103c cap 05[60] = MSI supports 2 messages, vector masks cap 10[90] = PCI-Express 2 root port max data 256(256) link x0(x4) cap 01[e0] = powerspec 3 supports D0 D3 current D0 ecap 000b[100] = unknown 1 ecap 000d[110] = unknown 1 ecap 0001[148] = AER 1 0 fatal 0 non-fatal 0 corrected ecap 000b[1d0] = unknown 1 ecap 0019[250] = unknown 1 ecap 000b[280] = unknown 1 pcib6@pci0:0:3:3: class=0x060400 card=0x18a8103c chip=0x3c0b8086 rev=0x07 hdr=0x01 vendor = 'Intel Corporation' device = 'Sandy Bridge IIO PCI Express Root Port 3d' class = bridge subclass = PCI-PCI cap 0d[40] = PCI Bridge card=0x18a8103c cap 05[60] = MSI supports 2 messages, vector masks cap 10[90] = PCI-Express 2 root port max data 256(256) link x0(x4) cap 01[e0] = powerspec 3 supports D0 D3 current D0 ecap 000b[100] = unknown 1 ecap 000d[110] = unknown 1 ecap 0001[148] = AER 1 0 fatal 0 non-fatal 0 corrected ecap 000b[1d0] = unknown 1 ecap 0019[250] = unknown 1 ecap 000b[280] = unknown 1 none0@pci0:0:4:0: class=0x088000 card=0x18a8103c chip=0x3c208086 rev=0x07 hdr=0x00 vendor = 'Intel Corporation' device = 'Sandy Bridge DMA Channel 0' class = base peripheral bar [10] = type Memory, range 64, base 0xfa4f0000, size 16384, enabled cap 11[80] = MSI-X supports 1 message in map 0x10 cap 10[90] = PCI-Express 2 root endpoint max data 
128(128) link x0(x0) cap 01[e0] = powerspec 3 supports D0 D3 current D0 none1@pci0:0:4:1: class=0x088000 card=0x18a8103c chip=0x3c218086 rev=0x07 hdr=0x00 vendor = 'Intel Corporation' device = 'Sandy Bridge DMA Channel 1' class = base peripheral bar [10] = type Memory, range 64, base 0xfa4e0000, size 16384, enabled cap 11[80] = MSI-X supports 1 message in map 0x10 cap 10[90] = PCI-Express 2 root endpoint max data 128(128) link x0(x0) cap 01[e0] = powerspec 3 supports D0 D3 current D0 none2@pci0:0:4:2: class=0x088000 card=0x18a8103c chip=0x3c228086 rev=0x07 hdr=0x00 vendor = 'Intel Corporation' device = 'Sandy Bridge DMA Channel 2' class = base peripheral bar [10] = type Memory, range 64, base 0xfa4d0000, size 16384, enabled cap 11[80] = MSI-X supports 1 message in map 0x10 cap 10[90] = PCI-Express 2 root endpoint max data 128(128) link x0(x0) cap 01[e0] = powerspec 3 supports D0 D3 current D0 none3@pci0:0:4:3: class=0x088000 card=0x18a8103c chip=0x3c238086 rev=0x07 hdr=0x00 vendor = 'Intel Corporation' device = 'Sandy Bridge DMA Channel 3' class = base peripheral bar [10] = type Memory, range 64, base 0xfa4c0000, size 16384, enabled cap 11[80] = MSI-X supports 1 message in map 0x10 cap 10[90] = PCI-Express 2 root endpoint max data 128(128) link x0(x0) cap 01[e0] = powerspec 3 supports D0 D3 current D0 none4@pci0:0:4:4: class=0x088000 card=0x18a8103c chip=0x3c248086 rev=0x07 hdr=0x00 vendor = 'Intel Corporation' device = 'Sandy Bridge DMA Channel 4' class = base peripheral bar [10] = type Memory, range 64, base 0xfa4b0000, size 16384, enabled cap 11[80] = MSI-X supports 1 message in map 0x10 cap 10[90] = PCI-Express 2 root endpoint max data 128(128) link x0(x0) cap 01[e0] = powerspec 3 supports D0 D3 current D0 none5@pci0:0:4:5: class=0x088000 card=0x18a8103c chip=0x3c258086 rev=0x07 hdr=0x00 vendor = 'Intel Corporation' device = 'Sandy Bridge DMA Channel 5' class = base peripheral bar [10] = type Memory, range 64, base 0xfa4a0000, size 16384, enabled cap 11[80] = 
MSI-X supports 1 message in map 0x10 cap 10[90] = PCI-Express 2 root endpoint max data 128(128) link x0(x0) cap 01[e0] = powerspec 3 supports D0 D3 current D0 none6@pci0:0:4:6: class=0x088000 card=0x18a8103c chip=0x3c268086 rev=0x07 hdr=0x00 vendor = 'Intel Corporation' device = 'Sandy Bridge DMA Channel 6' class = base peripheral bar [10] = type Memory, range 64, base 0xfa490000, size 16384, enabled cap 11[80] = MSI-X supports 1 message in map 0x10 cap 10[90] = PCI-Express 2 root endpoint max data 128(128) link x0(x0) cap 01[e0] = powerspec 3 supports D0 D3 current D0 none7@pci0:0:4:7: class=0x088000 card=0x18a8103c chip=0x3c278086 rev=0x07 hdr=0x00 vendor = 'Intel Corporation' device = 'Sandy Bridge DMA Channel 7' class = base peripheral bar [10] = type Memory, range 64, base 0xfa480000, size 16384, enabled cap 11[80] = MSI-X supports 1 message in map 0x10 cap 10[90] = PCI-Express 2 root endpoint max data 128(128) link x0(x0) cap 01[e0] = powerspec 3 supports D0 D3 current D0 none8@pci0:0:5:0: class=0x088000 card=0x18a8103c chip=0x3c288086 rev=0x07 hdr=0x00 vendor = 'Intel Corporation' device = 'Sandy Bridge Address Map, VTd_Misc, System Management' class = base peripheral cap 10[40] = PCI-Express 2 root endpoint max data 128(128) link x0(x0) none9@pci0:0:5:2: class=0x088000 card=0x18a8103c chip=0x3c2a8086 rev=0x07 hdr=0x00 vendor = 'Intel Corporation' device = 'Sandy Bridge Control Status and Global Errors' class = base peripheral cap 10[40] = PCI-Express 2 root endpoint max data 128(128) link x0(x0) ioapic0@pci0:0:5:4: class=0x080020 card=0x18a8103c chip=0x3c2c8086 rev=0x07 hdr=0x00 vendor = 'Intel Corporation' device = 'Sandy Bridge I/O APIC' class = base peripheral subclass = interrupt controller bar [10] = type Memory, range 32, base 0xfa470000, size 4096, enabled cap 01[6c] = powerspec 3 supports D0 D3 current D0 pcib7@pci0:0:17:0: class=0x060400 card=0x18a9103c chip=0x1d3e8086 rev=0x05 hdr=0x01 vendor = 'Intel Corporation' device = 'Patsburg PCI Express 
Virtual Root Port' class = bridge subclass = PCI-PCI cap 10[40] = PCI-Express 2 root port max data 128(128) link x1(x1) cap 01[80] = powerspec 3 supports D0 D3 current D0 cap 0d[88] = PCI Bridge card=0x18a9103c cap 05[90] = MSI supports 1 message ecap 0001[100] = AER 1 0 fatal 0 non-fatal 0 corrected ecap 000d[138] = unknown 1 ehci0@pci0:0:26:0: class=0x0c0320 card=0x18a9103c chip=0x1d2d8086 rev=0x05 hdr=0x00 vendor = 'Intel Corporation' device = 'Patsburg USB2 Enhanced Host Controller' class = serial bus subclass = USB bar [10] = type Memory, range 32, base 0xfa460000, size 1024, enabled cap 01[50] = powerspec 2 supports D0 D3 current D0 cap 0a[58] = EHCI Debug Port at offset 0xa0 in map 0x14 cap 13[98] = PCI Advanced Features: FLR TP pcib8@pci0:0:28:0: class=0x060400 card=0x18a9103c chip=0x1d108086 rev=0xb5 hdr=0x01 vendor = 'Intel Corporation' device = 'Patsburg PCI Express Root Port 1' class = bridge subclass = PCI-PCI cap 10[40] = PCI-Express 2 root port max data 128(128) link x0(x4) cap 05[80] = MSI supports 1 message cap 0d[90] = PCI Bridge card=0x18a9103c cap 01[a0] = powerspec 2 supports D0 D3 current D0 ecap 0001[100] = AER 1 0 fatal 0 non-fatal 0 corrected pcib9@pci0:0:28:4: class=0x060400 card=0x18a9103c chip=0x1d188086 rev=0xb5 hdr=0x01 vendor = 'Intel Corporation' device = 'Patsburg PCI Express Root Port 5' class = bridge subclass = PCI-PCI cap 10[40] = PCI-Express 2 root port max data 128(128) link x2(x2) cap 05[80] = MSI supports 1 message cap 0d[90] = PCI Bridge card=0x18a9103c cap 01[a0] = powerspec 2 supports D0 D3 current D0 ecap 0001[100] = AER 1 0 fatal 0 non-fatal 0 corrected pcib10@pci0:0:28:7: class=0x060400 card=0x18a9103c chip=0x1d1e8086 rev=0xb5 hdr=0x01 vendor = 'Intel Corporation' device = 'Patsburg PCI Express Root Port 8' class = bridge subclass = PCI-PCI cap 10[40] = PCI-Express 2 root port max data 128(128) link x1(x1) cap 05[80] = MSI supports 1 message cap 0d[90] = PCI Bridge card=0x18a9103c cap 01[a0] = powerspec 2 supports D0 
D3 current D0 ecap 0001[100] = AER 1 0 fatal 0 non-fatal 0 corrected ehci1@pci0:0:29:0: class=0x0c0320 card=0x18a9103c chip=0x1d268086 rev=0x05 hdr=0x00 vendor = 'Intel Corporation' device = 'Patsburg USB2 Enhanced Host Controller' class = serial bus subclass = USB bar [10] = type Memory, range 32, base 0xfa450000, size 1024, enabled cap 01[50] = powerspec 2 supports D0 D3 current D0 cap 0a[58] = EHCI Debug Port at offset 0xa0 in map 0x14 cap 13[98] = PCI Advanced Features: FLR TP pcib11@pci0:0:30:0: class=0x060401 card=0x18a9103c chip=0x244e8086 rev=0xa5 hdr=0x01 vendor = 'Intel Corporation' device = '82801 PCI Bridge' class = bridge subclass = PCI-PCI cap 0d[50] = PCI Bridge card=0x18a9103c isab0@pci0:0:31:0: class=0x060100 card=0x00000000 chip=0x1d418086 rev=0x05 hdr=0x00 vendor = 'Intel Corporation' device = 'Patsburg LPC Controller' class = bridge subclass = PCI-ISA cap 09[e0] = vendor (length 12) Intel cap 1 version 0 features: AMT, 4 PCI-e x1 slots ahci0@pci0:0:31:2: class=0x010601 card=0x18a9103c chip=0x1d028086 rev=0x05 hdr=0x00 vendor = 'Intel Corporation' device = 'Patsburg 6-Port SATA AHCI Controller' class = mass storage subclass = SATA bar [10] = type I/O Port, range 32, base 0x4000, size 8, enabled bar [14] = type I/O Port, range 32, base 0x4008, size 4, enabled bar [18] = type I/O Port, range 32, base 0x4010, size 8, enabled bar [1c] = type I/O Port, range 32, base 0x4018, size 4, enabled bar [20] = type I/O Port, range 32, base 0x4020, size 32, enabled bar [24] = type Memory, range 32, base 0xfa440000, size 2048, enabled cap 05[80] = MSI supports 1 message enabled with 1 message cap 01[70] = powerspec 3 supports D0 D3 current D0 cap 12[a8] = SATA Index-Data Pair cap 13[b0] = PCI Advanced Features: FLR TP bge0@pci0:6:0:0: class=0x020000 card=0x3383103c chip=0x165714e4 rev=0x01 hdr=0x00 vendor = 'Broadcom Corporation' device = 'NetXtreme BCM5719 Gigabit Ethernet PCIe' class = network subclass = ethernet bar [10] = type Prefetchable Memory, range 64, 
base 0xfa3f0000, size 65536, enabled bar [18] = type Prefetchable Memory, range 64, base 0xfa3e0000, size 65536, enabled bar [20] = type Prefetchable Memory, range 64, base 0xfa3d0000, size 65536, enabled cap 01[48] = powerspec 3 supports D0 D3 current D0 cap 03[50] = VPD cap 05[58] = MSI supports 8 messages, 64 bit enabled with 1 message cap 11[a0] = MSI-X supports 17 messages in map 0x20 cap 10[ac] = PCI-Express 2 endpoint max data 256(256) link x4(x4) ecap 0001[100] = AER 1 0 fatal 1 non-fatal 1 corrected ecap 0003[13c] = Serial 1 0000ac162d83ec2c ecap 0004[150] = unknown 1 ecap 0002[160] = VC 1 max VC0 ecap 0017[230] = unknown 1 bge1@pci0:6:0:1: class=0x020000 card=0x3383103c chip=0x165714e4 rev=0x01 hdr=0x00 vendor = 'Broadcom Corporation' device = 'NetXtreme BCM5719 Gigabit Ethernet PCIe' class = network subclass = ethernet bar [10] = type Prefetchable Memory, range 64, base 0xfa3c0000, size 65536, enabled bar [18] = type Prefetchable Memory, range 64, base 0xfa3b0000, size 65536, enabled bar [20] = type Prefetchable Memory, range 64, base 0xfa3a0000, size 65536, enabled cap 01[48] = powerspec 3 supports D0 D3 current D0 cap 03[50] = VPD cap 05[58] = MSI supports 8 messages, 64 bit enabled with 1 message cap 11[a0] = MSI-X supports 17 messages in map 0x20 cap 10[ac] = PCI-Express 2 endpoint max data 256(256) link x4(x4) ecap 0001[100] = AER 1 0 fatal 1 non-fatal 1 corrected ecap 0003[13c] = Serial 1 0000ac162d83ec2d ecap 0004[150] = unknown 1 ecap 0002[160] = VC 1 max VC0 ecap 0017[230] = unknown 1 bge2@pci0:6:0:2: class=0x020000 card=0x3383103c chip=0x165714e4 rev=0x01 hdr=0x00 vendor = 'Broadcom Corporation' device = 'NetXtreme BCM5719 Gigabit Ethernet PCIe' class = network subclass = ethernet bar [10] = type Prefetchable Memory, range 64, base 0xfa390000, size 65536, enabled bar [18] = type Prefetchable Memory, range 64, base 0xfa380000, size 65536, enabled bar [20] = type Prefetchable Memory, range 64, base 0xfa370000, size 65536, enabled cap 01[48] = 
powerspec 3 supports D0 D3 current D0 cap 03[50] = VPD cap 05[58] = MSI supports 8 messages, 64 bit enabled with 1 message cap 11[a0] = MSI-X supports 17 messages in map 0x20 cap 10[ac] = PCI-Express 2 endpoint max data 256(256) link x4(x4) ecap 0001[100] = AER 1 0 fatal 1 non-fatal 1 corrected ecap 0003[13c] = Serial 1 0000ac162d83ec2e ecap 0004[150] = unknown 1 ecap 0002[160] = VC 1 max VC0 ecap 0017[230] = unknown 1 bge3@pci0:6:0:3: class=0x020000 card=0x3383103c chip=0x165714e4 rev=0x01 hdr=0x00 vendor = 'Broadcom Corporation' device = 'NetXtreme BCM5719 Gigabit Ethernet PCIe' class = network subclass = ethernet bar [10] = type Prefetchable Memory, range 64, base 0xfa360000, size 65536, enabled bar [18] = type Prefetchable Memory, range 64, base 0xfa350000, size 65536, enabled bar [20] = type Prefetchable Memory, range 64, base 0xfa340000, size 65536, enabled cap 01[48] = powerspec 3 supports D0 D3 current D0 cap 03[50] = VPD cap 05[58] = MSI supports 8 messages, 64 bit enabled with 1 message cap 11[a0] = MSI-X supports 17 messages in map 0x20 cap 10[ac] = PCI-Express 2 endpoint max data 256(256) link x4(x4) ecap 0001[100] = AER 1 0 fatal 1 non-fatal 1 corrected ecap 0003[13c] = Serial 1 0000ac162d83ec2f ecap 0004[150] = unknown 1 ecap 0002[160] = VC 1 max VC0 ecap 0017[230] = unknown 1 igb0@pci0:2:0:0: class=0x020000 card=0x3380103c chip=0x15218086 rev=0x01 hdr=0x00 vendor = 'Intel Corporation' device = 'I350 Gigabit Network Connection' class = network subclass = ethernet bar [10] = type Memory, range 32, base 0xfbf00000, size 1048576, enabled bar [18] = type I/O Port, range 32, base 0x5000, size 32, enabled bar [1c] = type Memory, range 32, base 0xfbef0000, size 16384, enabled cap 01[40] = powerspec 3 supports D0 D3 current D0 cap 05[50] = MSI supports 1 message, 64 bit, vector masks cap 11[70] = MSI-X supports 10 messages in map 0x1c enabled cap 10[a0] = PCI-Express 2 endpoint max data 128(512) link x2(x4) ecap 0001[100] = AER 2 ecap 0003[140] = Serial 1 
6c3be5ffffb2dba0 ecap 000e[150] = unknown 1 ecap 0010[160] = unknown 1 ecap 0017[1a0] = unknown 1 ecap 0018[1c0] = unknown 1 ecap 000d[1d0] = unknown 1 igb1@pci0:2:0:1: class=0x020000 card=0x3380103c chip=0x15218086 rev=0x01 hdr=0x00 vendor = 'Intel Corporation' device = 'I350 Gigabit Network Connection' class = network subclass = ethernet bar [10] = type Memory, range 32, base 0xfbd00000, size 1048576, enabled bar [18] = type I/O Port, range 32, base 0x5020, size 32, enabled bar [1c] = type Memory, range 32, base 0xfbcf0000, size 16384, enabled cap 01[40] = powerspec 3 supports D0 D3 current D0 cap 05[50] = MSI supports 1 message, 64 bit, vector masks cap 11[70] = MSI-X supports 10 messages in map 0x1c enabled cap 10[a0] = PCI-Express 2 endpoint max data 128(512) link x2(x4) ecap 0001[100] = AER 2 ecap 0003[140] = Serial 1 6c3be5ffffb2dba0 ecap 000e[150] = unknown 1 ecap 0010[160] = unknown 1 ecap 0017[1a0] = unknown 1 ecap 000d[1d0] = unknown 1 igb2@pci0:2:0:2: class=0x020000 card=0x3380103c chip=0x15218086 rev=0x01 hdr=0x00 vendor = 'Intel Corporation' device = 'I350 Gigabit Network Connection' class = network subclass = ethernet bar [10] = type Memory, range 32, base 0xfbb00000, size 1048576, enabled bar [18] = type I/O Port, range 32, base 0x5040, size 32, enabled bar [1c] = type Memory, range 32, base 0xfbaf0000, size 16384, enabled cap 01[40] = powerspec 3 supports D0 D3 current D0 cap 05[50] = MSI supports 1 message, 64 bit, vector masks cap 11[70] = MSI-X supports 10 messages in map 0x1c enabled cap 10[a0] = PCI-Express 2 endpoint max data 128(512) link x2(x4) ecap 0001[100] = AER 2 ecap 0003[140] = Serial 1 6c3be5ffffb2dba0 ecap 000e[150] = unknown 1 ecap 0010[160] = unknown 1 ecap 0017[1a0] = unknown 1 ecap 000d[1d0] = unknown 1 igb3@pci0:2:0:3: class=0x020000 card=0x3380103c chip=0x15218086 rev=0x01 hdr=0x00 vendor = 'Intel Corporation' device = 'I350 Gigabit Network Connection' class = network subclass = ethernet bar [10] = type Memory, range 32, base 
0xfb900000, size 1048576, enabled bar [18] = type I/O Port, range 32, base 0x5060, size 32, enabled bar [1c] = type Memory, range 32, base 0xfb8f0000, size 16384, enabled cap 01[40] = powerspec 3 supports D0 D3 current D0 cap 05[50] = MSI supports 1 message, 64 bit, vector masks cap 11[70] = MSI-X supports 10 messages in map 0x1c enabled cap 10[a0] = PCI-Express 2 endpoint max data 128(512) link x2(x4) ecap 0001[100] = AER 2 ecap 0003[140] = Serial 1 6c3be5ffffb2dba0 ecap 000e[150] = unknown 1 ecap 0010[160] = unknown 1 ecap 0017[1a0] = unknown 1 ecap 000d[1d0] = unknown 1 none10@pci0:1:0:0: class=0x088000 card=0x3381103c chip=0x3306103c rev=0x05 hdr=0x00 vendor = 'Hewlett-Packard Company' device = 'Integrated Lights-Out Standard Slave Instrumentation & System Support' class = base peripheral bar [10] = type I/O Port, range 32, base 0x3000, size 256, enabled bar [14] = type Memory, range 32, base 0xfb7f0000, size 512, enabled bar [18] = type I/O Port, range 32, base 0x3400, size 256, enabled cap 01[78] = powerspec 3 supports D0 D3 current D0 cap 05[b0] = MSI supports 1 message, 64 bit cap 10[c0] = PCI-Express 1 legacy endpoint max data 128(128) link x1(x1) vgapci0@pci0:1:0:1: class=0x030000 card=0x3381103c chip=0x0533102b rev=0x00 hdr=0x00 vendor = 'Matrox Graphics, Inc.' 
device = 'MGA G200EH' class = display subclass = VGA bar [10] = type Prefetchable Memory, range 32, base 0xf9000000, size 16777216, enabled bar [14] = type Memory, range 32, base 0xfb7e0000, size 16384, enabled bar [18] = type Memory, range 32, base 0xfa800000, size 8388608, enabled cap 01[a8] = powerspec 3 supports D0 D3 current D0 cap 05[b0] = MSI supports 1 message, 64 bit cap 10[c0] = PCI-Express 1 legacy endpoint max data 128(128) link x1(x1) none11@pci0:1:0:2: class=0x088000 card=0x3381103c chip=0x3307103c rev=0x05 hdr=0x00 vendor = 'Hewlett-Packard Company' device = 'Integrated Lights-Out Standard Management Processor Support and Messaging' class = base peripheral bar [10] = type I/O Port, range 32, base 0x3800, size 256, enabled bar [14] = type Memory, range 32, base 0xfa7f0000, size 256, enabled bar [18] = type Memory, range 32, base 0xfa600000, size 1048576, enabled bar [1c] = type Memory, range 32, base 0xfa580000, size 524288, enabled bar [20] = type Memory, range 32, base 0xfa570000, size 32768, enabled bar [24] = type Memory, range 32, base 0xfa560000, size 32768, enabled cap 01[78] = powerspec 3 supports D0 D3 current D0 cap 05[b0] = MSI supports 1 message, 64 bit cap 10[c0] = PCI-Express 1 legacy endpoint max data 128(128) link x1(x1) uhci0@pci0:1:0:4: class=0x0c0300 card=0x3381103c chip=0x3300103c rev=0x02 hdr=0x00 vendor = 'Hewlett-Packard Company' device = 'Integrated Lights-Out Standard Virtual USB Controller' class = serial bus subclass = USB bar [20] = type I/O Port, range 32, base 0x3c00, size 32, enabled cap 05[70] = MSI supports 1 message, 64 bit cap 10[80] = PCI-Express 1 legacy endpoint max data 128(128) link x1(x1) cap 01[f0] = powerspec 3 supports D0 D3 current D0 -- Zeus V. Panchenko jid:zeus@im.ibs.dn.ua IT Dpt., I.B.S. 
LLC GMT+2 (EET)

From owner-freebsd-net@FreeBSD.ORG Thu Mar 7 06:34:37 2013 Return-Path: Delivered-To: net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 60890175; Thu, 7 Mar 2013 06:34:37 +0000 (UTC) (envelope-from melifaro@FreeBSD.org) Received: from mail.ipfw.ru (unknown [IPv6:2a01:4f8:120:6141::2]) by mx1.freebsd.org (Postfix) with ESMTP id 0522C1E9; Thu, 7 Mar 2013 06:34:37 +0000 (UTC) Received: from v6.mpls.in ([2a02:978:2::5] helo=ws.su29.net) by mail.ipfw.ru with esmtpsa (TLSv1:CAMELLIA256-SHA:256) (Exim 4.76 (FreeBSD)) (envelope-from ) id 1UDUSq-0008ZT-Pk; Thu, 07 Mar 2013 10:38:04 +0400 Message-ID: <513834E4.7050203@FreeBSD.org> Date: Thu, 07 Mar 2013 10:34:12 +0400 From: "Alexander V. Chernikov" User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:9.0) Gecko/20120121 Thunderbird/9.0 MIME-Version: 1.0 To: net@freebsd.org Subject: [patch] interface routes Content-Type: multipart/mixed; boundary="------------070403050505050004040202" Cc: Andre Oppermann X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 07 Mar 2013 06:34:37 -0000

This is a multi-part message in MIME format.
--------------070403050505050004040202
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit

Hello list!

There is a known long-lived issue with interface routes addition/deletion:

ifconfig iface inet 1.2.3.4/24 can fail if the given prefix is already in
the kernel route table (for example, advertised by an IGP like OSPF).

An interface route can be deleted via route(8) or any route socket user
(sometimes this happens with popular open-source daemons like bird/quagga).

The problem is reported at least in kern/106722 and kern/155772.

This can be fixed in the following way:
The immutable route flag (RTF_PINNED, added in 19995 with a 'for future use'
comment) is used to mark a route 'immutable'.
rtrequest1_fib refuses to delete routes with the given flag unless RTF_PINNED
is set in rti_flags.

Every interface address manipulation is done via rtinit[1], so
rtinit1() sets this flag (and behavior does not change here).

Adding an interface address is handled by atomically deleting the old prefix
and adding the interface one.

--------------070403050505050004040202
Content-Type: text/plain; name="iface_routes.diff"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename="iface_routes.diff"

Index: sys/net/if.c
===================================================================
--- sys/net/if.c	(revision 247623)
+++ sys/net/if.c	(working copy)
@@ -1357,7 +1357,8 @@ if_rtdel(struct radix_node *rn, void *arg)
 			return (0);
 
 		err = rtrequest_fib(RTM_DELETE, rt_key(rt), rt->rt_gateway,
-				rt_mask(rt), rt->rt_flags|RTF_RNH_LOCKED,
+				rt_mask(rt),
+				rt->rt_flags|RTF_RNH_LOCKED|RTF_PINNED,
 				(struct rtentry **) NULL, rt->rt_fibnum);
 		if (err) {
 			log(LOG_WARNING, "if_rtdel: error %d\n", err);
Index: sys/net/route.c
===================================================================
--- sys/net/route.c	(revision 247842)
+++ sys/net/route.c	(working copy)
@@ -1112,6 +1112,16 @@ rtrequest1_fib(int req, struct rt_addrinfo *info,
 			error = 0;
 		}
 #endif
+		if ((flags & RTF_PINNED) == 0) {
+			/*
+			 * Check if can delete target route.
+			 */
+			rt = (struct rtentry *)rnh->rnh_lookup(dst,
+			    netmask, rnh);
+			if ((rt != NULL) && (rt->rt_flags & RTF_PINNED))
+				senderr(EPERM);
+		}
+
 		/*
 		 * Remove the item from the tree and return it.
 		 * Complain if it is not there and do no more processing.
@@ -1430,6 +1440,7 @@ rtinit1(struct ifaddr *ifa, int cmd, int flags, in
 	int didwork = 0;
 	int a_failure = 0;
 	static struct sockaddr_dl null_sdl = {sizeof(null_sdl), AF_LINK};
+	struct radix_node_head *rnh;
 
 	if (flags & RTF_HOST) {
 		dst = ifa->ifa_dstaddr;
@@ -1488,7 +1499,6 @@ rtinit1(struct ifaddr *ifa, int cmd, int flags, in
 	 */
 	for ( fibnum = startfib; fibnum <= endfib; fibnum++) {
 		if (cmd == RTM_DELETE) {
-			struct radix_node_head *rnh;
 			struct radix_node *rn;
 			/*
 			 * Look up an rtentry that is in the routing tree and
@@ -1538,7 +1548,8 @@ rtinit1(struct ifaddr *ifa, int cmd, int flags, in
 		 */
 		bzero((caddr_t)&info, sizeof(info));
 		info.rti_ifa = ifa;
-		info.rti_flags = flags | (ifa->ifa_flags & ~IFA_RTSELF);
+		info.rti_flags = flags |
+		    (ifa->ifa_flags & ~IFA_RTSELF) | RTF_PINNED;
 		info.rti_info[RTAX_DST] = dst;
 		/*
 		 * doing this for compatibility reasons
@@ -1550,6 +1561,32 @@ rtinit1(struct ifaddr *ifa, int cmd, int flags, in
 			info.rti_info[RTAX_GATEWAY] = ifa->ifa_addr;
 		info.rti_info[RTAX_NETMASK] = netmask;
 		error = rtrequest1_fib(cmd, &info, &rt, fibnum);
+
+		if ((error == EEXIST) && (cmd == RTM_ADD)) {
+			/*
+			 * Interface route addition failed.
+			 * Note we probably already checked
+			 * other interface addresses if given prefix exists.
+			 * Atomically delete current prefix generating
+			 * RTM_DELETE message, and retry adding
+			 * interface address.
+			 */
+			rnh = rt_tables_get_rnh(fibnum, dst->sa_family);
+			RADIX_NODE_HEAD_LOCK(rnh);
+			/* Delete old prefix */
+			info.rti_ifa = NULL;
+			info.rti_flags = RTF_RNH_LOCKED;
+			error = rtrequest1_fib(RTM_DELETE, &info, &rt, fibnum);
+			if (error == 0) {
+				info.rti_ifa = ifa;
+				info.rti_flags = flags | RTF_RNH_LOCKED |
+				    (ifa->ifa_flags & ~IFA_RTSELF) | RTF_PINNED;
+				error = rtrequest1_fib(cmd, &info, &rt, fibnum);
+			}
+			RADIX_NODE_HEAD_UNLOCK(rnh);
+		}
+
+
 		if (error == 0 && rt != NULL) {
 			/*
 			 * notify any listening routing agents of the change
Index: sys/net/route.h
===================================================================
--- sys/net/route.h	(revision 247623)
+++ sys/net/route.h	(working copy)
@@ -176,7 +176,7 @@ struct ortentry {
 			/* 0x20000 unused, was RTF_WASCLONED */
 #define RTF_PROTO3	0x40000		/* protocol specific routing flag */
 			/* 0x80000 unused */
-#define RTF_PINNED	0x100000	/* future use */
+#define RTF_PINNED	0x100000	/* route is immutable */
 #define RTF_LOCAL	0x200000	/* route represents a local address */
 #define RTF_BROADCAST	0x400000	/* route represents a bcast address */
 #define RTF_MULTICAST	0x800000	/* route represents a mcast address */

--------------070403050505050004040202--

From owner-freebsd-net@FreeBSD.ORG Thu Mar 7 06:45:14 2013 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id DF787391 for ; Thu, 7 Mar 2013 06:45:14 +0000 (UTC) (envelope-from pyunyh@gmail.com) Received: from mail-pa0-f52.google.com (mail-pa0-f52.google.com [209.85.220.52]) by mx1.freebsd.org (Postfix) with ESMTP id 913E1247 for ; Thu, 7 Mar 2013 06:45:14 +0000 (UTC) Received: by mail-pa0-f52.google.com with SMTP id fb1so223808pad.11 for ; Wed, 06 Mar 2013 22:45:08 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=x-received:from:date:to:cc:subject:message-id:reply-to:references
:mime-version:content-type:content-disposition:in-reply-to :user-agent; bh=a/VwAV22ylSHJzEXo/ipQmpRMMpJ3lxrlYGPfXm+0q0=; b=L3Pf7WekeiRVK+zuHw9dUSKiMhno3/wEIXg9WIWF0LBEFOWjfYHqwwYlkMTJYPfyMa z/UOpJiaE8kTrP2q9DsK3fbBhFdpfRW0ETAf/dqVOHVL566s9rJNIVFdSsoA9DoEXr7q sWizznJshUtQuXtk1AIGZbk2j9eEdMdAZ6Lx7sEKwOmdv22/lbR6oj/qZdB+dy7WJ8xg t/Z1EavDWK1O+OLRKA0QO8h8lJy9XhW6p42QppSFgFxgETD5TuuRuIKuMyMWWx6856wR 4AttT7G5wQFb+8MaW+f90OTVEeGLvIQt/dELTu9iopsXmGAsPzkmf9/v8aSmvKaUZuGm zdyQ== X-Received: by 10.66.51.198 with SMTP id m6mr1321535pao.215.1362638707969; Wed, 06 Mar 2013 22:45:07 -0800 (PST) Received: from pyunyh@gmail.com (lpe4.p59-icn.cdngp.net. [114.111.62.249]) by mx.google.com with ESMTPS id c8sm619022pbq.10.2013.03.06.22.45.04 (version=TLSv1 cipher=RC4-SHA bits=128/128); Wed, 06 Mar 2013 22:45:06 -0800 (PST) Received: by pyunyh@gmail.com (sSMTP sendmail emulation); Thu, 07 Mar 2013 15:45:00 +0900 From: YongHyeon PYUN Date: Thu, 7 Mar 2013 15:45:00 +0900 To: Zeus Panchenko Subject: Re: FreeBSD 9.1-RELEASE + bge0 == watchdog timeout Message-ID: <20130307064500.GC1478@michelle.cdnetworks.com> References: <20130225082042.GB1426@michelle.cdnetworks.com> <512CF97B.8030805@norma.perm.ru> <20130227020123.GA3581@michelle.cdnetworks.com> <512DE968.4020409@quip.cz> <20130228053558.GA1474@michelle.cdnetworks.com> <5136D89D.4000902@norma.perm.ru> <20130306062658.GC1483@michelle.cdnetworks.com> <513713C2.1000007@norma.perm.ru> <20130307022446.GB3108@michelle.cdnetworks.com> <20130307082251.85461@relay.ibs.dn.ua> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20130307082251.85461@relay.ibs.dn.ua> User-Agent: Mutt/1.4.2.3i Cc: freebsd-net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list Reply-To: pyunyh@gmail.com List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 07 Mar 2013 06:45:14 -0000 On Thu, 
Mar 07, 2013 at 08:22:51AM +0300, Zeus Panchenko wrote:
> Hi,
>
> here is my situation, much like the issue
>

No, your issue is a completely different one.

> On 06.03.2013 12:26, YongHyeon PYUN wrote:
> > If you were using latest stable/8, the result would be same on
> > CURRENT.
>
> I use FreeBSD 9.1-RELEASE #0 r243825: amd64 + ZFS
> on HP ProLiant DL360e Gen8
>
> the box has two 4 headed cards igb(4) I350 and bge(4) NetXtreme BCM5719
> according to the pciconf data
>
> > How frequently do you see the watchdog timeouts? Is there way to
> > reproduce it?
>
> I noticed that after activation, bge(4) stops responding and the interface
> becomes useless, while igb(4) works fine after some sysctl-ing
>
> for now I'm forced not to use bge(4) at all :(

9.1-RELEASE does not have the required code to support your controller.
Use stable/9.

From owner-freebsd-net@FreeBSD.ORG Thu Mar 7 07:14:06 2013 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id DFE0DA4A for ; Thu, 7 Mar 2013 07:14:06 +0000 (UTC) (envelope-from emz@norma.perm.ru) Received: from elf.hq.norma.perm.ru (unknown [IPv6:2001:470:1f09:14c0::2]) by mx1.freebsd.org (Postfix) with ESMTP id 8141E35A for ; Thu, 7 Mar 2013 07:14:06 +0000 (UTC) Received: from bsdrookie.norma.com. ([IPv6:fd00::726]) by elf.hq.norma.perm.ru (8.14.5/8.14.5) with ESMTP id r277E3eK006676 (version=TLSv1/SSLv3 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO) for ; Thu, 7 Mar 2013 13:14:04 +0600 (YEKT) (envelope-from emz@norma.perm.ru) Message-ID: <51383E3B.5030007@norma.perm.ru> Date: Thu, 07 Mar 2013 13:14:03 +0600 From: "Eugene M.
Zheganin" User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:17.0) Gecko/17.0 Thunderbird/17.0 MIME-Version: 1.0 To: freebsd-net@freebsd.org Subject: Re: FreeBSD 9.1-RELEASE + bge0 == watchdog timeout References: <20130225082042.GB1426@michelle.cdnetworks.com> <512CF97B.8030805@norma.perm.ru> <20130227020123.GA3581@michelle.cdnetworks.com> <512DE968.4020409@quip.cz> <20130228053558.GA1474@michelle.cdnetworks.com> <5136D89D.4000902@norma.perm.ru> <20130306062658.GC1483@michelle.cdnetworks.com> <513713C2.1000007@norma.perm.ru> <20130307022446.GB3108@michelle.cdnetworks.com> <513820E2.806@norma.perm.ru> <20130307062335.GB1478@michelle.cdnetworks.com> In-Reply-To: <20130307062335.GB1478@michelle.cdnetworks.com> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.2.7 (elf.hq.norma.perm.ru [IPv6:fd00::30a]); Thu, 07 Mar 2013 13:14:04 +0600 (YEKT) X-Spam-Status: No hits=-97.8 bayes=0.5 testhits RDNS_NONE=1.274, SPF_SOFTFAIL=0.972,USER_IN_WHITELIST=-100 autolearn=no version=3.3.2 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on elf.hq.norma.perm.ru X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 07 Mar 2013 07:14:06 -0000

Hi.

On 07.03.2013 12:23, YongHyeon PYUN wrote:
> On Thu, Mar 07, 2013 at 11:08:50AM +0600, Eugene M. Zheganin wrote:
>> It was definitely older than "months". It was running something similar
>> to "FreeBSD 8.2-STABLE #0: Mon Sep 19 08:10:00 YEKST 2011", this is the
>> uname from a neighbor machine.
>>
>> I have, as I said, identical servers running FreeBSD. Here are some of
>> the unames that I don't see timeouts on:
>>
>> 8.3-STABLE #2: Wed Aug 29 13:00:02 YEKT 2012 (up 187 days)
>> 8.3-PRERELEASE #1: Thu Mar 29 16:14:11 MSK 2012 (up 15 days, previous
>> uptime around 180 days)
> These servers do not have 5718/5719/5720 changes.
>
>> 8.2-STABLE #0: Wed Dec 14 16:56:11 YEKT 2011 (up 99 days)
> This server has the bge(4) change but it didn't trigger watchdog
> timeouts. Does this server use the same controller? If yes, the
> issue didn't come from bge(4) change.
>
How's that? It's running an even older version than the previous two. I guess
you misread the year.

Eugene.

From owner-freebsd-net@FreeBSD.ORG Thu Mar 7 07:39:54 2013 Return-Path: Delivered-To: net@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id E2FE191 for ; Thu, 7 Mar 2013 07:39:54 +0000 (UTC) (envelope-from andre@freebsd.org) Received: from c00l3r.networx.ch (c00l3r.networx.ch [62.48.2.2]) by mx1.freebsd.org (Postfix) with ESMTP id 5B5DE61A for ; Thu, 7 Mar 2013 07:39:54 +0000 (UTC) Received: (qmail 80793 invoked from network); 7 Mar 2013 08:53:24 -0000 Received: from c00l3r.networx.ch (HELO [127.0.0.1]) ([62.48.2.2]) (envelope-sender ) by c00l3r.networx.ch (qmail-ldap-1.03) with SMTP for ; 7 Mar 2013 08:53:24 -0000 Message-ID: <51384443.5070209@freebsd.org> Date: Thu, 07 Mar 2013 08:39:47 +0100 From: Andre Oppermann User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:17.0) Gecko/20130107 Thunderbird/17.0.2 MIME-Version: 1.0 To: "Alexander V.
Chernikov" Subject: Re: [patch] interface routes References: <513834E4.7050203@FreeBSD.org> In-Reply-To: <513834E4.7050203@FreeBSD.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 07 Mar 2013 07:39:54 -0000

On 07.03.2013 07:34, Alexander V. Chernikov wrote:
> Hello list!
>
> There is a known long-lived issue with interface routes addition/deletion:
>
> ifconfig iface inet 1.2.3.4/24 can fail if the given prefix is already in the kernel route table (for
> example, advertised by an IGP like OSPF).
>
> An interface route can be deleted via route(8) or any route socket user (sometimes this happens with
> popular open-source daemons like bird/quagga).
>
> The problem is reported at least in kern/106722 and kern/155772.

Your patch is a welcome addition.

> This can be fixed in the following way:
> The immutable route flag (RTF_PINNED, added in 19995 with a 'for future use' comment) is used to mark
> a route 'immutable'.
> rtrequest1_fib refuses to delete routes with the given flag unless RTF_PINNED is set in rti_flags.

How do the routing daemons react to being unable to change/delete
such a route?

EADDRINUSE would likely be a more descriptive error than EPERM?

> Every interface address manipulation is done via rtinit[1], so
> rtinit1() sets this flag (and behavior does not change here).
>
> Adding an interface address is handled by atomically deleting the old prefix and adding the interface one.

This brings up a long-standing sore point of our routing code
which this patch makes more pronounced. When an interface link
state is down I don't want the route to it to persist but to
become inactive so another path can be chosen. This is the very
point of running a routing daemon. So on the link-down event
the installed interface routes should be removed from the routing
table. The configured addresses though should persist and the
interface routes re-installed on a link-up event. What's your
opinion on it?

Other than these points I think your code is fine and can go into
the tree.

-- 
Andre

From owner-freebsd-net@FreeBSD.ORG Thu Mar 7 07:59:55 2013 Return-Path: Delivered-To: net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id C98A1505 for ; Thu, 7 Mar 2013 07:59:55 +0000 (UTC) (envelope-from sthaug@nethelp.no) Received: from bizet.nethelp.no (bizet.nethelp.no [195.1.209.33]) by mx1.freebsd.org (Postfix) with SMTP id 109576CB for ; Thu, 7 Mar 2013 07:59:54 +0000 (UTC) Received: (qmail 75461 invoked from network); 7 Mar 2013 07:53:12 -0000 Received: from bizet.nethelp.no (HELO localhost) (195.1.209.33) by bizet.nethelp.no with SMTP; 7 Mar 2013 07:53:12 -0000 Date: Thu, 07 Mar 2013 08:53:12 +0100 (CET) Message-Id: <20130307.085312.41695129.sthaug@nethelp.no> To: andre@freebsd.org Subject: Re: [patch] interface routes From: sthaug@nethelp.no In-Reply-To: <51384443.5070209@freebsd.org> References: <513834E4.7050203@FreeBSD.org> <51384443.5070209@freebsd.org> X-Mailer: Mew version 3.3 on Emacs 21.3 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit Cc: melifaro@FreeBSD.org, net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 07 Mar 2013 07:59:55 -0000

> This brings up a long-standing sore point of our routing code
> which this patch makes more pronounced. When an interface link
> state is down I don't want the route to it to persist but to
> become inactive so another path can be chosen. This is the very
> point of running a routing daemon. So on the link-down event
> the installed interface routes should be removed from the routing
> table. The configured addresses though should persist and the
> interface routes re-installed on a link-up event. What's your
> opinion on it?

Yes please! This is what I take for granted on my routers.

Steinar Haug, Nethelp consulting, sthaug@nethelp.no

From owner-freebsd-net@FreeBSD.ORG Thu Mar 7 08:16:02 2013 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 7B24C915 for ; Thu, 7 Mar 2013 08:16:02 +0000 (UTC) (envelope-from pyunyh@gmail.com) Received: from mail-pa0-f41.google.com (mail-pa0-f41.google.com [209.85.220.41]) by mx1.freebsd.org (Postfix) with ESMTP id 4A7A5767 for ; Thu, 7 Mar 2013 08:16:02 +0000 (UTC) Received: by mail-pa0-f41.google.com with SMTP id fb11so286102pad.0 for ; Thu, 07 Mar 2013 00:15:56 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=x-received:from:date:to:cc:subject:message-id:reply-to:references :mime-version:content-type:content-disposition:in-reply-to :user-agent; bh=+aDdCEe8i1j6ruNA+ZjB/s+ACQLuzBWpoIyrLdzBB7Q=; b=cKQ68uYCPEXPIf+WUxtdb7qD/kD2nAaht42Z2/NmThaVMA36b4NPtO1cDY32n2d9l/ 953EARcRvETYRtJkNpOMIyq+RJ9eKtxQKaIr14evDdBje6S18ICxY1mfQs2+lEW6G6MT 7qTDvxlIFoi/5vanmq8B9saU4bvwTO8ORs70aSQXXb7Cvp2Hcr6sh7JqJakaoO4R5DXo v7h414Ila586o+Wg3BfhRqbqB4aoi3xU/gg/aVN7Yr/c1hQHV3KW6Ai0EPYVhAmtQSkC qMrvcvR+90zQx/JlYIPP8In2eoVe+bkVEyTh7lkNIMlUuUrRZ70Q/wD/etCMimB2QAxY Ngmw== X-Received: by 10.68.195.33 with SMTP id ib1mr52532958pbc.105.1362644156666; Thu, 07 Mar 2013 00:15:56 -0800 (PST) Received: from pyunyh@gmail.com (lpe4.p59-icn.cdngp.net.
[114.111.62.249]) by mx.google.com with ESMTPS id eg1sm871866pbb.33.2013.03.07.00.15.53 (version=TLSv1 cipher=RC4-SHA bits=128/128); Thu, 07 Mar 2013 00:15:55 -0800 (PST) Received: by pyunyh@gmail.com (sSMTP sendmail emulation); Thu, 07 Mar 2013 17:15:48 +0900 From: YongHyeon PYUN Date: Thu, 7 Mar 2013 17:15:48 +0900 To: "Eugene M. Zheganin" Subject: Re: FreeBSD 9.1-RELEASE + bge0 == watchdog timeout Message-ID: <20130307081548.GD1478@michelle.cdnetworks.com> References: <20130227020123.GA3581@michelle.cdnetworks.com> <512DE968.4020409@quip.cz> <20130228053558.GA1474@michelle.cdnetworks.com> <5136D89D.4000902@norma.perm.ru> <20130306062658.GC1483@michelle.cdnetworks.com> <513713C2.1000007@norma.perm.ru> <20130307022446.GB3108@michelle.cdnetworks.com> <513820E2.806@norma.perm.ru> <20130307062335.GB1478@michelle.cdnetworks.com> <51383E3B.5030007@norma.perm.ru> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <51383E3B.5030007@norma.perm.ru> User-Agent: Mutt/1.4.2.3i Cc: freebsd-net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list Reply-To: pyunyh@gmail.com List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 07 Mar 2013 08:16:02 -0000 On Thu, Mar 07, 2013 at 01:14:03PM +0600, Eugene M. Zheganin wrote: > Hi. > > On 07.03.2013 12:23, YongHyeon PYUN wrote: > > On Thu, Mar 07, 2013 at 11:08:50AM +0600, Eugene M. Zheganin wrote: > >> It was definitely older than "months". It was running something similar > >> to "FreeBSD 8.2-STABLE #0: Mon Sep 19 08:10:00 YEKST 2011", this is the > >> uname from a neighbor machine. > >> > >> I have, as I said, identical servers running FreeBSD. 
Here are some of > >> the unames that I don't see timeouts on: > >> > >> 8.3-STABLE #2: Wed Aug 29 13:00:02 YEKT 2012 (up 187 days) > >> 8.3-PRERELEASE #1: Thu Mar 29 16:14:11 MSK 2012 (up 15 days, previous > >> uptime around 180 days) > > These servers do not have 5718/5719/5720 changes. > > > >> 8.2-STABLE #0: Wed Dec 14 16:56:11 YEKT 2011 (up 99 days) > > This server has the bge(4) change but it didn't trigger watchdog > > timeouts. Does this server use the same controller? If yes, the > > issue didn't come from bge(4) change. > > > How's that ? It's running even older version than previous two. I guess > you misread the year. Oops, you're right. From owner-freebsd-net@FreeBSD.ORG Thu Mar 7 10:20:01 2013 Return-Path: Delivered-To: freebsd-net@smarthost.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id B32E36E8 for ; Thu, 7 Mar 2013 10:20:01 +0000 (UTC) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:1900:2254:206c::16:87]) by mx1.freebsd.org (Postfix) with ESMTP id 8A409F5D for ; Thu, 7 Mar 2013 10:20:01 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.6/8.14.6) with ESMTP id r27AK1ID040962 for ; Thu, 7 Mar 2013 10:20:01 GMT (envelope-from gnats@freefall.freebsd.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.6/8.14.6/Submit) id r27AK1k7040961; Thu, 7 Mar 2013 10:20:01 GMT (envelope-from gnats) Date: Thu, 7 Mar 2013 10:20:01 GMT Message-Id: <201303071020.r27AK1k7040961@freefall.freebsd.org> To: freebsd-net@FreeBSD.org Cc: From: "Charbon, Julien" Subject: Re: kern/176446: [netinet] [patch] Concurrency in ixgbe driving out-of-order packet process and spurious RST X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list Reply-To: "Charbon, Julien" List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: 
List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 07 Mar 2013 10:20:01 -0000 The following reply was made to PR kern/176446; it has been noted by GNATS. From: "Charbon, Julien" To: John Baldwin Cc: bug-followup@freebsd.org, "De La Gueronniere, Marc" Subject: Re: kern/176446: [netinet] [patch] Concurrency in ixgbe driving out-of-order packet process and spurious RST Date: Thu, 07 Mar 2013 11:11:25 +0100 On 2/28/13 8:10 PM, Charbon, Julien wrote: > On 2/28/13 4:57 PM, John Baldwin wrote: >> Can you try the fixes from http://svnweb.freebsd.org/base?view=revision&revision=240968? > > Actually, Marc (I CC'ed him) did find the r240968 fix for concurrency > between ixgbe_msix_que() and ixgbe_handle_que(), and made a backport for > release-8.3.0 (see patch [1] below). However, the issue was still > reproducible, then Marc found another place for concurrency from > ixgbe_local_timer() and fixed it (see patch [2]). But it was still not > enough, and he found a last place for concurrency due to > ixgbe_rearm_queues() call (see patch [3]). With all these patches > applied, we were not able to reproduce this issue.
Just for the record: As expected, this issue is reproducible on 9.1-RELEASE:

# uname -a
FreeBSD atlas 9.1-RELEASE FreeBSD 9.1-RELEASE #1 r247851M: Wed Mar 6 11:17:43 UTC 2013 jcharbon@atlas:/usr/obj/app/jcharbon/9.1.0/sys/GENERIC amd64

Enable TCP debug log:

# sysctl net.inet.tcp.log_debug=1

Put enough load on a TCP service and, due to ixgbe race conditions between ixgbe_msix_que() and ixgbe_handle_que(), you will get:

Mar 7 10:01:04 atlas kernel: TCP: [192.168.100.21]:12918 to [192.168.100.152]:8080; syncache_socket: in_pcbconnect failed with error 48
Mar 7 10:01:04 atlas kernel: TCP: [192.168.100.21]:12918 to [192.168.100.152]:8080 tcpflags 0x10; tcp_input: Listen socket: Socket allocation failed due to limits or memory shortage, sending RST
Mar 7 10:01:04 atlas kernel: TCP: [192.168.100.21]:12918 to [192.168.100.152]:8080 tcpflags 0x4; syncache_chkrst: Spurious RST without matching syncache entry (possibly syncookie only), segment ignored

We will provide our current fix patch for 9.1-RELEASE.

-- Julien

From owner-freebsd-net@FreeBSD.ORG Thu Mar 7 11:44:16 2013 Return-Path: Delivered-To: net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 39D33C23; Thu, 7 Mar 2013 11:44:16 +0000 (UTC) (envelope-from melifaro@FreeBSD.org) Received: from mail.ipfw.ru (unknown [IPv6:2a01:4f8:120:6141::2]) by mx1.freebsd.org (Postfix) with ESMTP id EE6C4315; Thu, 7 Mar 2013 11:44:15 +0000 (UTC) Received: from [2a02:6b8:0:401:222:4dff:fe50:cd2f] (helo=dhcp170-36-red.yandex.net) by mail.ipfw.ru with esmtpsa (TLSv1:CAMELLIA256-SHA:256) (Exim 4.76 (FreeBSD)) (envelope-from ) id 1UDZIV-000Aup-L8; Thu, 07 Mar 2013 15:47:43 +0400 Message-ID: <51387D4A.9030408@FreeBSD.org> Date: Thu, 07 Mar 2013 15:43:06 +0400 From: "Alexander V.
Chernikov" User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:17.0) Gecko/17.0 Thunderbird/17.0 MIME-Version: 1.0 To: Andre Oppermann Subject: Re: [patch] interface routes References: <513834E4.7050203@FreeBSD.org> <51384443.5070209@freebsd.org> In-Reply-To: <51384443.5070209@freebsd.org> X-Enigmail-Version: 1.4.6 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 07 Mar 2013 11:44:16 -0000 On 07.03.2013 11:39, Andre Oppermann wrote: > On 07.03.2013 07:34, Alexander V. Chernikov wrote: >> Hello list! >> >> There is a known long-lived issue with interface routes >> addition/deletion: >> >> ifconfig iface inet 1.2.3.4/24 can fail if given prefix is already in >> kernel route table (for >> example, advertised by IGP like OSPF). >> >> Interface route can be deleted via route(8) or any route socket user >> (sometimes this happens with >> popular opensource daemons like bird/quagga). >> >> Problem is reported at least in kern/106722 and kern/155772. > > Your patch is a welcome addition. > >> This can be fixed the following way: >> Immutable route flag (RTM_PINNED, added in 19995 with 'for future use' >> comment) is utilised to mark >> route 'immutable'. >> rtrequest1_fib refuses to delete routes with given flag unless >> RTM_PINNED is set in rti_flags. > > How do the routing daemons react to being unable to change/delete > such a route? Routing daemons have long lived with the fact that their route socket commands can fail (and there is the route(8) utility, which can do anything), so typically bird/quagga complains with something like 'bird: KRT: Error sending route 11.0.0.0/24 to kernel: File exists' and marks the given route as not installed in its internal RIB.
Additionally, the daemon will probably retry inserting such routes on every periodic KRT rescan (tens of minutes). Given that such situations usually last only a very short time (e.g. physical link flaps), everything should return to a normal state quickly. > > EADDRINUSE would likely be a more descriptive error instead of EPERM? Well, I'm not sure EADDRINUSE is very descriptive for _deleting_ a route. "Yes, I know that it is in use so that's the reason I'm trying to delete it". > >> Every interface address manipulation is done via rtinit[1], so >> rtinit1() sets this flag (and behavior does not change here). >> >> Adding interface address is handled via atomically deleting old prefix >> and adding interface one. > > This brings up a long standing sore point of our routing code > which this patch makes more pronounced. When an interface link > state is down I don't want the route to it to persist but to > become inactive so another path can be chosen. This is the very > point of running a routing daemon. So on the link-down event > the installed interface routes should be removed from the routing > table. The configured addresses though should persist and the > interface routes re-installed on a link-up event. What's your > opinion on it? This is exactly what the current code does for IPv4: if_down calls if_unroute(), which calls pfctlinput() for every interface address, and a domain-dependent function like rip_ctlinput calls in_ifscrub(), which removes the given interface route. However, the address route (/32) still remains (but routing daemons, at least bird, tend to ignore it since it is not listed as a valid interface address/mask). This is not done for IPv6 and we should probably do the same. > > Other than these points I think your code is fine and can go > into the tree.
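The deletion check discussed above can be sketched as a toy user-space model. This is only an illustration under stated assumptions, not the patch itself: toy_route, toy_request, TOY_ROUTE_PINNED and toy_route_delete are hypothetical names, req_flags stands in for rti_flags, and EPERM is used because the thread implies that is what the patch returns (EADDRINUSE is the alternative being debated).

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical flag bit standing in for the "pinned" route flag. */
#define TOY_ROUTE_PINNED 0x1u

/* Toy route entry: a flags word plus an "is it in the table" marker. */
struct toy_route {
	unsigned int flags;
	int installed;		/* 1 while the route is in the toy table */
};

/* Toy delete request: req_flags plays the role of rti_flags. */
struct toy_request {
	unsigned int req_flags;
};

/*
 * Delete a route unless it is pinned and the caller did not explicitly
 * pass the pinned flag in the request. Returns 0 or an errno value.
 */
static int
toy_route_delete(struct toy_route *rt, const struct toy_request *req)
{
	if (rt == NULL || !rt->installed)
		return (ESRCH);		/* nothing to delete */

	if ((rt->flags & TOY_ROUTE_PINNED) != 0 &&
	    (req->req_flags & TOY_ROUTE_PINNED) == 0)
		return (EPERM);		/* thread also discusses EADDRINUSE */

	rt->installed = 0;
	return (0);
}
```

In this model a routing daemon that sends a plain delete for a pinned interface route gets an error back and the route stays, while the address-management path, which sets the flag in its request, can still remove it.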
> -- WBR, Alexander From owner-freebsd-net@FreeBSD.ORG Thu Mar 7 11:56:00 2013 Return-Path: Delivered-To: net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 68849149 for ; Thu, 7 Mar 2013 11:56:00 +0000 (UTC) (envelope-from andre@freebsd.org) Received: from c00l3r.networx.ch (c00l3r.networx.ch [62.48.2.2]) by mx1.freebsd.org (Postfix) with ESMTP id DE9543A6 for ; Thu, 7 Mar 2013 11:55:59 +0000 (UTC) Received: (qmail 92765 invoked from network); 7 Mar 2013 13:09:25 -0000 Received: from c00l3r.networx.ch (HELO [127.0.0.1]) ([62.48.2.2]) (envelope-sender ) by c00l3r.networx.ch (qmail-ldap-1.03) with SMTP for ; 7 Mar 2013 13:09:25 -0000 Message-ID: <51388046.7040408@freebsd.org> Date: Thu, 07 Mar 2013 12:55:50 +0100 From: Andre Oppermann User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:17.0) Gecko/20130107 Thunderbird/17.0.2 MIME-Version: 1.0 To: "Alexander V. Chernikov" Subject: Re: [patch] interface routes References: <513834E4.7050203@FreeBSD.org> <51384443.5070209@freebsd.org> <51387D4A.9030408@FreeBSD.org> In-Reply-To: <51387D4A.9030408@FreeBSD.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 07 Mar 2013 11:56:00 -0000 On 07.03.2013 12:43, Alexander V. Chernikov wrote: > On 07.03.2013 11:39, Andre Oppermann wrote: >> On 07.03.2013 07:34, Alexander V. Chernikov wrote: >>> Hello list! >>> >>> There is a known long-lived issue with interface routes >>> addition/deletion: >>> >>> ifconfig iface inet 1.2.3.4/24 can fail if given prefix is already in >>> kernel route table (for >>> example, advertised by IGP like OSPF). 
>>> >>> Interface route can be deleted via route(8) or any route socket user >>> (sometimes this happens with >>> popular opensource daemons like bird/quagga). >>> >>> Problem is reported at least in kern/106722 and kern/155772. >> >> Your patch is a welcome addition. >> >>> This can be fixed the following way: >>> Immutable route flag (RTM_PINNED, added in 19995 with 'for future use' >>> comment) is utilised to mark >>> route 'immutable'. >>> rtrequest1_fib refuses to delete routes with given flag unless >>> RTM_PINNED is set in rti_flags. >> >> How do the routing daemons react to being unable to change/delete >> such a route? > Routing daemons have long lived with the fact that their route socket commands can > fail (and there is the route(8) utility, which can do anything), so typically > bird/quagga complains with something like > 'bird: KRT: Error sending route 11.0.0.0/24 to kernel: File exists' > and marks the given route as not installed in its internal RIB. Additionally, > the daemon will probably retry inserting such routes on every periodic KRT > rescan (tens of minutes). OK. No problem then. > Given that such situations usually last only a very short time (e.g. > physical link flaps), everything should return to a normal state quickly. > >> >> EADDRINUSE would likely be a more descriptive error instead of EPERM? > Well, I'm not sure EADDRINUSE is very descriptive for _deleting_ a route. > "Yes, I know that it is in use so that's the reason I'm trying to delete > it". I'm thinking of distinguishing a permission denial caused by insufficient rights (jail or something like that) from an explicitly pinned route. With EPERM you may look for the problem in the wrong place. E*INUSE is a common error for something that can't be removed because it is still being used by or for something else. That is the case here, so it may be more appropriate. >>> Every interface address manipulation is done via rtinit[1], so >>> rtinit1() sets this flag (and behavior does not change here).
>>> >>> Adding interface address is handled via atomically deleting old prefix >>> and adding interface one. >> >> This brings up a long standing sore point of our routing code >> which this patch makes more pronounced. When an interface link >> state is down I don't want the route to it to persist but to >> become inactive so another path can be chosen. This is the very >> point of running a routing daemon. So on the link-down event >> the installed interface routes should be removed from the routing >> table. The configured addresses though should persist and the >> interface routes re-installed on a link-up event. What's your >> opinion on it? > > This is exactly what the current code does for IPv4: > if_down calls if_unroute(), which calls pfctlinput() for every interface > address, and a domain-dependent function like rip_ctlinput calls > in_ifscrub(), which removes the given interface route. > However, the address route (/32) still remains (but routing daemons, at least > bird, tend to ignore it since it is not listed as a valid interface > address/mask). IF_DOWN and link state down are not the same thing. When the cable is unplugged the link state goes down but not the interface. > This is not done for IPv6 and we should probably do the same. Yes, they should be synchronized. >> Other than these points I think your code is fine and can go >> into the tree.
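The difference between an administratively down interface and a link-down event can be sketched roughly as follows. This is a toy user-space model of the behaviour proposed above, not existing kernel code: toy_iface, toy_link_state, toy_iface_sync_route and toy_link_event are hypothetical names, and treating an unknown link state as usable is an assumption made only for this sketch.

```c
#include <stdbool.h>

/* Hypothetical carrier states for the toy model. */
enum toy_link_state { TOY_LINK_UNKNOWN, TOY_LINK_DOWN, TOY_LINK_UP };

struct toy_iface {
	bool admin_up;			/* administrative state ("ifconfig up/down") */
	enum toy_link_state link;	/* carrier state, e.g. cable plugged or not */
	bool has_address;		/* configured address, kept across link flaps */
	bool route_installed;		/* interface route currently in the table */
};

/*
 * Recompute whether the interface route should be installed: only when an
 * address is configured, the interface is administratively up and the link
 * is not known to be down (assumption: "unknown" counts as usable).
 */
static void
toy_iface_sync_route(struct toy_iface *ifp)
{
	ifp->route_installed = ifp->has_address && ifp->admin_up &&
	    ifp->link != TOY_LINK_DOWN;
}

/* Link-state change: the address is untouched, only the route follows. */
static void
toy_link_event(struct toy_iface *ifp, enum toy_link_state state)
{
	ifp->link = state;
	toy_iface_sync_route(ifp);
}
```

Unplugging the cable in this model withdraws the interface route while has_address stays set, so a link-up event can reinstall the route without reconfiguring the address.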
-- Andre From owner-freebsd-net@FreeBSD.ORG Thu Mar 7 12:39:21 2013 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id C5D2EF9B for ; Thu, 7 Mar 2013 12:39:21 +0000 (UTC) (envelope-from araujobsdport@gmail.com) Received: from mail-wi0-x22d.google.com (mail-wi0-x22d.google.com [IPv6:2a00:1450:400c:c05::22d]) by mx1.freebsd.org (Postfix) with ESMTP id 69D6E7A0 for ; Thu, 7 Mar 2013 12:39:21 +0000 (UTC) Received: by mail-wi0-f173.google.com with SMTP id hq4so752131wib.12 for ; Thu, 07 Mar 2013 04:39:20 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:x-received:reply-to:date:message-id:subject:from:to :content-type; bh=N09z2CTOmQFdQJz+y3+QxWIj5EQ0G5wVtCHnQcUhrXE=; b=rSprFeEk36ojNn+lzSTGNTaZ2bWngnElmfFMUvhfMPIlQDmcNxFfC9PvGnujEXy8i5 sCN4zNQ0lvWNAMe+yql8hwdQZUHYorHvk8FrQAfGuGPRoSsoUtDW6XHjQFrsGUZqGHK0 /JcDFZ+4t9TeU1hAmXsMxDCjjPqiARf9/9gGEXxDqHTuqOvdagVT5PG7/owLH5bS3HMz 6TXUTXYp+AcZoAgUvrlRpD4FBn/IAcVEziXL1vOSMbXwyqQmVULarixRPjOxVvEDDFQB 50nkD0T3YczoM8PG7u4kKfac3DwQ8PO26WnCsN3q9bC8j/VK4ce0YyxFuvdydyFXc4sJ 2A7w== MIME-Version: 1.0 X-Received: by 10.194.21.233 with SMTP id y9mr47215955wje.47.1362659960628; Thu, 07 Mar 2013 04:39:20 -0800 (PST) Received: by 10.180.212.51 with HTTP; Thu, 7 Mar 2013 04:39:20 -0800 (PST) Date: Thu, 7 Mar 2013 20:39:20 +0800 Message-ID: Subject: dhclient issue. From: Marcelo Araujo To: freebsd-net@freebsd.org Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.14 X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list Reply-To: araujo@FreeBSD.org List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 07 Mar 2013 12:39:21 -0000 Hello Guys, I've faced out some problem with dhclient during this week on 9.1-RELEASE! 
Below is the log:

[root@home ~]# uname -a
FreeBSD HOME 9.1-RELEASE FreeBSD 9.1-RELEASE #10: Tue Mar 5 18:57:14 CST 2013 root@home:/usr/src/sys/HOME.amd64 amd64

[root@home ~]# dhclient ix0
PID = 3276, PPID = 3274
fibnum = 0
fibcmd = setfib 0
interface = ix0
ifconfig: ioctl (SIOCAIFADDR): File exists
ix0: not found
exiting.

[root@home ~]# tail /var/log/messages
Mar 17 14:53:52 ESSD46B70 dhclient[3244]: exiting.
Mar 17 14:54:01 ESSD46B70 login: ROOT LOGIN (root) ON ttyv0
Mar 17 14:54:15 ESSD46B70 dhclient[3257]: ix0: not found
Mar 17 14:54:15 ESSD46B70 dhclient[3257]: exiting.
Mar 17 14:54:15 ESSD46B70 dhclient[3258]: connection closed
Mar 17 14:54:15 ESSD46B70 dhclient[3258]: exiting.
Mar 17 14:54:57 ESSD46B70 dhclient[3274]: ix0: not found

[root@home ~]# ifconfig ix0
ix0: flags=8843 metric 0 mtu 1500
        options=403bb
        ether 00:08:9b:d4:6b:71
        nd6 options=29
        media: Ethernet autoselect (10Gbase-T )
        status: active

I have another interface, em0, and it works properly there. Any idea what is going on?
Best Regards, -- Marcelo Araujo araujo@FreeBSD.org From owner-freebsd-net@FreeBSD.ORG Thu Mar 7 13:38:21 2013 Return-Path: Delivered-To: net@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 096565A9; Thu, 7 Mar 2013 13:38:21 +0000 (UTC) (envelope-from ermal.luci@gmail.com) Received: from mail-qc0-x231.google.com (mail-qc0-x231.google.com [IPv6:2607:f8b0:400d:c01::231]) by mx1.freebsd.org (Postfix) with ESMTP id 9D089A4D; Thu, 7 Mar 2013 13:38:20 +0000 (UTC) Received: by mail-qc0-f177.google.com with SMTP id u28so142582qcs.36 for ; Thu, 07 Mar 2013 05:38:20 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:x-received:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type; bh=NKMLvr9xiGvggkPzktzt8+mN1aKo8yIewmNrtHWScr0=; b=gBHB+impy4qYtLRSPCgHZKBtQA0x0+jBuy7pw/ijW24Zp3f6hy4HNnGDkbsnbWqeo7 YCAC2abc/bsFKdlYpgtmd9+vhlYBxRHY8p+wTa1CoEGrnoqnSpdkTSBnCxPzhOtWCgQs NZUrZHmNonujaWueZMTtYLWU6AhMgdqv82Or8hMYW12kLGeE39vNi0NqO+Xs/OSWwyzG N1R8cvuo6PsAbBv4htXl5irLlkhXwUtQsaTXxZMywu4rRP4HXDwV6w6+4lEpj0gY3Lnw jAU7FD8X+5mFykswxPZXZQTCNR6994pN7py44HiCsnc1TZgWuOFR4H353ZYHGGUBznI8 bDPA== MIME-Version: 1.0 X-Received: by 10.49.120.225 with SMTP id lf1mr54103536qeb.14.1362663500045; Thu, 07 Mar 2013 05:38:20 -0800 (PST) Sender: ermal.luci@gmail.com Received: by 10.49.27.197 with HTTP; Thu, 7 Mar 2013 05:38:19 -0800 (PST) In-Reply-To: <51388046.7040408@freebsd.org> References: <513834E4.7050203@FreeBSD.org> <51384443.5070209@freebsd.org> <51387D4A.9030408@FreeBSD.org> <51388046.7040408@freebsd.org> Date: Thu, 7 Mar 2013 14:38:19 +0100 X-Google-Sender-Auth: LaUBMKo0jFb4HxaoZviAR3UNSs0 Message-ID: Subject: Re: [patch] interface routes From: =?ISO-8859-1?Q?Ermal_Lu=E7i?= To: Andre Oppermann Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.14 Cc: "Alexander V. 
Chernikov" , net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 07 Mar 2013 13:38:21 -0000 On Thu, Mar 7, 2013 at 12:55 PM, Andre Oppermann wrote: > On 07.03.2013 12:43, Alexander V. Chernikov wrote: > >> On 07.03.2013 11:39, Andre Oppermann wrote: >> >>> On 07.03.2013 07:34, Alexander V. Chernikov wrote: >>> >>>> Hello list! >>>> >>>> There is a known long-lived issue with interface routes >>>> addition/deletion: >>>> >>>> ifconfig iface inet 1.2.3.4/24 can fail if given prefix is already in >>>> kernel route table (for >>>> example, advertised by IGP like OSPF). >>>> >>>> Interface route can be deleted via route(8) or any route socket user >>>> (sometimes this happens with >>>> popular opensource daemons like bird/quagga). >>>> >>>> Problem is reported at least in kern/106722 and kern/155772. >>>> >>> >>> You patch is a welcome addition. >>> >>> This can be fixed the following way: >>>> Immutable route flag (RTM_PINNED, added in 19995 with 'for future use' >>>> comment) is utilised to mark >>>> route 'immutable'. >>>> rtrequest1_fib refuses to delete routes with given flag unless >>>> RTM_PINNED is set in rti_flags. >>>> >>> >>> How do the routing daemons react to being unable to change/delete >>> such a route? >>> >> routing daemons live long with the fact that there route socket cmds can >> fail (and the is route(8) utility which can do anything), so typically >> bird/quagga yells like >> 'bird: KRT: Error sending route 11.0.0.0/24 to kernel: File exists' >> and marks given route as not installed in internal RIB. Additionally, >> daemon will probably re-try to insert such routes on every periodic KRT >> rescan (tens of minutes). >> > > Isn't it better to teach the routing code about metrics. Routing daemons cope better this way and they can handle this. 
That way the policy for this behaviour can be controlled by the administrator rather than by code! With metrics you can add interface routes with a bigger metric and routes from routing daemons with a lower one. This might also help somewhat with interfaces that are configured with the same subnet. > OK. No problem then. > > > Given that such situations usually happens for a very short time (e.g. >> physical link flaps) everything should become to normal state quickly. >> >> >>> EADDRINUSE would likely be a more descriptive error instead of EPERM? >>> >> Well, not sure if EADDRINUSE is very descriptive for _deleting_ route. >> "Yes, I know that it is in use so that's the reason I'm trying to delete >> it". >> > > I'm thinking of distinguishing it from a permission denial, because of > insufficient rights (jail or something like that) vs. an explicitly > pinned route. With EPERM you may look for the problem in the wrong > place. E*INUSE is a common error for something that can't be removed due > to it still being used by or for something else. Which is the case > here and may be more appropriate. > > > Every interface address manipulation is done via rtinit[1], so >>>> rtinit1() sets this flag (and behavior does not change here). >>>> >>>> Adding interface address is handled via atomically deleting old prefix >>>> and adding interface one. >>>> >>> >>> This brings up a long standing sore point of our routing code >>> which this patch makes more pronounced. When an interface link >>> state is down I don't want the route to it to persist but to >>> become inactive so another path can be chosen. This is the very >>> point of running a routing daemon. So on the link-down event >>> the installed interface routes should be removed from the routing >>> table. The configured addresses though should persist and the >>> interface routes re-installed on a link-up event. What's your >>> opinion on it?
>>> >> > > >> This is exactly what is done in current code for IPv4: >> if_down calls if_unroute(), it cals prctlinput() for every interface >> address, and domain-dependent function like rip_ctlinput calls >> in_ifscrub() cleaning given interface route. >> However, address route (/32) still remains (but route daemons, at least >> bird, tends to ignore it since it is not listed as valid interface >> address/mask). >> > > IF_DOWN and link state down are not the same thing. When the cable > is unplugged the link state goes down but not the interface. > > > This is not done for IPv6 and we should probably do the same. >> > > Yes, they should be synchronized. > > > Other than these points I think your code is fine and can go >>> into the tree. >>> >> > -- > Andre > > > ______________________________**_________________ > freebsd-net@freebsd.org mailing list > http://lists.freebsd.org/**mailman/listinfo/freebsd-net > To unsubscribe, send any mail to "freebsd-net-unsubscribe@**freebsd.org > " > -- Ermal From owner-freebsd-net@FreeBSD.ORG Thu Mar 7 13:51:14 2013 Return-Path: Delivered-To: net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 73EA88EE for ; Thu, 7 Mar 2013 13:51:14 +0000 (UTC) (envelope-from andre@freebsd.org) Received: from c00l3r.networx.ch (c00l3r.networx.ch [62.48.2.2]) by mx1.freebsd.org (Postfix) with ESMTP id C5FC0AF6 for ; Thu, 7 Mar 2013 13:51:13 +0000 (UTC) Received: (qmail 98301 invoked from network); 7 Mar 2013 15:04:41 -0000 Received: from c00l3r.networx.ch (HELO [127.0.0.1]) ([62.48.2.2]) (envelope-sender ) by c00l3r.networx.ch (qmail-ldap-1.03) with SMTP for ; 7 Mar 2013 15:04:41 -0000 Message-ID: <51389B4B.1060003@freebsd.org> Date: Thu, 07 Mar 2013 14:51:07 +0100 From: Andre Oppermann User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:17.0) Gecko/20130107 Thunderbird/17.0.2 MIME-Version: 1.0 To: =?ISO-8859-1?Q?Ermal_Lu=E7i?= Subject: Re: [patch] interface 
routes References: <513834E4.7050203@FreeBSD.org> <51384443.5070209@freebsd.org> <51387D4A.9030408@FreeBSD.org> <51388046.7040408@freebsd.org> In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 8bit Cc: "Alexander V. Chernikov" , net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 07 Mar 2013 13:51:14 -0000 On 07.03.2013 14:38, Ermal Luçi wrote: > On Thu, Mar 7, 2013 at 12:55 PM, Andre Oppermann > wrote: > > On 07.03.2013 12:43, Alexander V. Chernikov wrote: > > On 07.03.2013 11:39, Andre Oppermann wrote: > > On 07.03.2013 07:34, Alexander V. Chernikov wrote: > > Hello list! > > There is a known long-lived issue with interface routes > addition/deletion: > > ifconfig iface inet 1.2.3.4/24 can fail if given prefix is > already in > kernel route table (for > example, advertised by IGP like OSPF). > > Interface route can be deleted via route(8) or any route socket user > (sometimes this happens with > popular opensource daemons like bird/quagga). > > Problem is reported at least in kern/106722 and kern/155772. > > > You patch is a welcome addition. > > This can be fixed the following way: > Immutable route flag (RTM_PINNED, added in 19995 with 'for future use' > comment) is utilised to mark > route 'immutable'. > rtrequest1_fib refuses to delete routes with given flag unless > RTM_PINNED is set in rti_flags. > > > How do the routing daemons react to being unable to change/delete > such a route? > > routing daemons live long with the fact that there route socket cmds can > fail (and the is route(8) utility which can do anything), so typically > bird/quagga yells like > 'bird: KRT: Error sending route 11.0.0.0/24 to kernel: File exists' > and marks given route as not installed in internal RIB. 
Additionally, > daemon will probably re-try to insert such routes on every periodic KRT > rescan (tens of minutes). > > > > Isn't it better to teach the routing code about metrics. > Routing daemons cope better this way and they can handle this. > So the policy of this behaviour can be controled by administrator rather than by code! > With metrics you can add routes with bigger metric for interfaces and lower from routing daemons. > This also can mitigate somehow on interfaces with the same subnet configured possibly. Generally I agree with you that this would be the ideal outcome. However we're still quite a bit away from reaching that goal. To make this really work we have make mpath plus metrics a first class citizen in the routing code and also the update the routing daemons kernel interfaces to know about this. I hope we get there in the not too distant future. As a first step I think it is important that Alexanders patch goes in to fix a long standing and very annoying problem with the code we have. Also the link down route withdraw should be added asap. Then we can take the next steps towards the ultimate goal you describe. I hope you do not object to Alexanders patch? -- Andre From owner-freebsd-net@FreeBSD.ORG Thu Mar 7 13:55:58 2013 Return-Path: Delivered-To: net@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id E354DAE3; Thu, 7 Mar 2013 13:55:58 +0000 (UTC) (envelope-from melifaro@FreeBSD.org) Received: from mail.ipfw.ru (unknown [IPv6:2a01:4f8:120:6141::2]) by mx1.freebsd.org (Postfix) with ESMTP id 8808AB3D; Thu, 7 Mar 2013 13:55:58 +0000 (UTC) Received: from [2a02:6b8:0:401:222:4dff:fe50:cd2f] (helo=dhcp170-36-red.yandex.net) by mail.ipfw.ru with esmtpsa (TLSv1:CAMELLIA256-SHA:256) (Exim 4.76 (FreeBSD)) (envelope-from ) id 1UDbLy-000CJ5-BD; Thu, 07 Mar 2013 17:59:26 +0400 Message-ID: <51389C29.8000407@FreeBSD.org> Date: Thu, 07 Mar 2013 17:54:49 +0400 From: "Alexander V. 
Chernikov" User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:17.0) Gecko/17.0 Thunderbird/17.0 MIME-Version: 1.0 To: Andre Oppermann Subject: Re: [patch] interface routes References: <513834E4.7050203@FreeBSD.org> <51384443.5070209@freebsd.org> <51387D4A.9030408@FreeBSD.org> <51388046.7040408@freebsd.org> In-Reply-To: <51388046.7040408@freebsd.org> X-Enigmail-Version: 1.4.6 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 07 Mar 2013 13:55:59 -0000 On 07.03.2013 15:55, Andre Oppermann wrote: > On 07.03.2013 12:43, Alexander V. Chernikov wrote: >> On 07.03.2013 11:39, Andre Oppermann wrote: >>> On 07.03.2013 07:34, Alexander V. Chernikov wrote: >>>> Hello list! >>>> >>>> There is a known long-lived issue with interface routes >>>> addition/deletion: >>>> >>>> ifconfig iface inet 1.2.3.4/24 can fail if given prefix is already in >>>> kernel route table (for >>>> example, advertised by IGP like OSPF). >>>> >>>> Interface route can be deleted via route(8) or any route socket user >>>> (sometimes this happens with >>>> popular opensource daemons like bird/quagga). >>>> >>>> Problem is reported at least in kern/106722 and kern/155772. >>> >>> You patch is a welcome addition. >>> >>>> This can be fixed the following way: >>>> Immutable route flag (RTM_PINNED, added in 19995 with 'for future use' >>>> comment) is utilised to mark >>>> route 'immutable'. >>>> rtrequest1_fib refuses to delete routes with given flag unless >>>> RTM_PINNED is set in rti_flags. >>> >>> How do the routing daemons react to being unable to change/delete >>> such a route? 
>> routing daemons live long with the fact that their route socket cmds can
>> fail (and there is the route(8) utility which can do anything), so typically
>> bird/quagga yells like
>> 'bird: KRT: Error sending route 11.0.0.0/24 to kernel: File exists'
>> and marks given route as not installed in internal RIB. Additionally,
>> daemon will probably re-try to insert such routes on every periodic KRT
>> rescan (tens of minutes).
>
> OK. No problem then.
>
>> Given that such situations usually happen for a very short time (e.g.
>> physical link flaps) everything should return to a normal state quickly.
>>
>>>
>>> EADDRINUSE would likely be a more descriptive error instead of EPERM?
>> Well, not sure if EADDRINUSE is very descriptive for _deleting_ route.
>> "Yes, I know that it is in use so that's the reason I'm trying to delete
>> it". OK.
>
> I'm thinking of distinguishing it from a permission denial, because of
> insufficient rights (jail or something like that) vs. an explicitly
> pinned route. With EPERM you may look for the problem in the wrong
> place. E*INUSE is a common error for something that can't be removed due
> to it still being used by or for something else. Which is the case
> here and may be more appropriate.
>
>>>> Every interface address manipulation is done via rtinit[1], so
>>>> rtinit1() sets this flag (and behavior does not change here).
>>>>
>>>> Adding interface address is handled via atomically deleting old prefix
>>>> and adding interface one.
>>>
>>> This brings up a long standing sore point of our routing code
>>> which this patch makes more pronounced. When an interface link
>>> state is down I don't want the route to it to persist but to
>>> become inactive so another path can be chosen. This is the very
>>> point of running a routing daemon. So on the link-down event
>>> the installed interface routes should be removed from the routing
>>> table.
The configured addresses though should persist and the
>>> interface routes re-installed on a link-up event. What's your
>>> opinion on it?
>>
>> This is exactly what is done in current code for IPv4:
>> if_down calls if_unroute(), it calls pfctlinput() for every interface
>> address, and domain-dependent function like rip_ctlinput calls
>> in_ifscrub() cleaning given interface route.
>> However, address route (/32) still remains (but route daemons, at least
>> bird, tend to ignore it since it is not listed as valid interface
>> address/mask).
>
> IF_DOWN and link state down are not the same thing. When the cable
> is unplugged the link state goes down but not the interface.

Oops, I missed the 'link' keyword.
IMHO 'operational down' should behave exactly the same as 'admin down',
i.e. delete interface routes from the routing table.
It should not be very hard to do.

>
>> This is not done for IPv6 and we should probably do the same.
>
> Yes, they should be synchronized.
>
>>> Other than these points I think your code is fine and can go
>>> into the tree.
> -- WBR, Alexander From owner-freebsd-net@FreeBSD.ORG Thu Mar 7 14:03:47 2013 Return-Path: Delivered-To: net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 5A8E1DE9 for ; Thu, 7 Mar 2013 14:03:47 +0000 (UTC) (envelope-from andre@freebsd.org) Received: from c00l3r.networx.ch (c00l3r.networx.ch [62.48.2.2]) by mx1.freebsd.org (Postfix) with ESMTP id C63D8BF7 for ; Thu, 7 Mar 2013 14:03:46 +0000 (UTC) Received: (qmail 98964 invoked from network); 7 Mar 2013 15:17:08 -0000 Received: from c00l3r.networx.ch (HELO [127.0.0.1]) ([62.48.2.2]) (envelope-sender ) by c00l3r.networx.ch (qmail-ldap-1.03) with SMTP for ; 7 Mar 2013 15:17:08 -0000 Message-ID: <51389E36.3020104@freebsd.org> Date: Thu, 07 Mar 2013 15:03:34 +0100 From: Andre Oppermann User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:17.0) Gecko/20130107 Thunderbird/17.0.2 MIME-Version: 1.0 To: "Alexander V. Chernikov" Subject: Re: [patch] interface routes References: <513834E4.7050203@FreeBSD.org> <51384443.5070209@freebsd.org> <51387D4A.9030408@FreeBSD.org> <51388046.7040408@freebsd.org> <51389C29.8000407@FreeBSD.org> In-Reply-To: <51389C29.8000407@FreeBSD.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 07 Mar 2013 14:03:47 -0000 On 07.03.2013 14:54, Alexander V. Chernikov wrote: > On 07.03.2013 15:55, Andre Oppermann wrote: >> On 07.03.2013 12:43, Alexander V. Chernikov wrote: >>> On 07.03.2013 11:39, Andre Oppermann wrote: >>>> This brings up a long standing sore point of our routing code >>>> which this patch makes more pronounced. 
When an interface link
>>>> state is down I don't want the route to it to persist but to
>>>> become inactive so another path can be chosen. This is the very
>>>> point of running a routing daemon. So on the link-down event
>>>> the installed interface routes should be removed from the routing
>>>> table. The configured addresses though should persist and the
>>>> interface routes re-installed on a link-up event. What's your
>>>> opinion on it?
>>>
>>> This is exactly what is done in current code for IPv4:
>>> if_down calls if_unroute(), it calls pfctlinput() for every interface
>>> address, and domain-dependent function like rip_ctlinput calls
>>> in_ifscrub() cleaning given interface route.
>>> However, address route (/32) still remains (but route daemons, at least
>>> bird, tend to ignore it since it is not listed as valid interface
>>> address/mask).
>>
>> IF_DOWN and link state down are not the same thing. When the cable
>> is unplugged the link state goes down but not the interface.
>
> Oops, I missed the 'link' keyword.
> IMHO 'operational down' should behave exactly the same as 'admin down',
> i.e. delete interface routes from the routing table.
> It should not be very hard to do.

Are you going to implement it after the pinning patch?
;-) -- Andre From owner-freebsd-net@FreeBSD.ORG Thu Mar 7 14:10:59 2013 Return-Path: Delivered-To: net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 9FDF9164; Thu, 7 Mar 2013 14:10:59 +0000 (UTC) (envelope-from melifaro@ipfw.ru) Received: from mail.ipfw.ru (unknown [IPv6:2a01:4f8:120:6141::2]) by mx1.freebsd.org (Postfix) with ESMTP id 66FA3CD0; Thu, 7 Mar 2013 14:10:59 +0000 (UTC) Received: from [213.87.139.85] (helo=[10.231.93.102]) by mail.ipfw.ru with esmtpsa (TLSv1:AES128-SHA:128) (Exim 4.76 (FreeBSD)) (envelope-from ) id 1UDbaO-000CPP-16; Thu, 07 Mar 2013 18:14:27 +0400 References: <513834E4.7050203@FreeBSD.org> <51384443.5070209@freebsd.org> <51387D4A.9030408@FreeBSD.org> <51388046.7040408@freebsd.org> <51389C29.8000407@FreeBSD.org> <51389E36.3020104@freebsd.org> Mime-Version: 1.0 (1.0) In-Reply-To: <51389E36.3020104@freebsd.org> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-Id: <0B3FF217-306D-485D-A332-C57B1D7D2F4F@ipfw.ru> X-Mailer: iPhone Mail (10B146) From: "Alexander V. Chernikov" Subject: Re: [patch] interface routes Date: Thu, 7 Mar 2013 18:11:44 +0400 To: Andre Oppermann Cc: "Alexander V. Chernikov" , "net@freebsd.org" X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 07 Mar 2013 14:10:59 -0000 It seems I have no choice :) WBR, Alexander On 07.03.2013, at 18:03, Andre Oppermann wrote: > On 07.03.2013 14:54, Alexander V. Chernikov wrote: >> On 07.03.2013 15:55, Andre Oppermann wrote: >>> On 07.03.2013 12:43, Alexander V. Chernikov wrote: >>>> On 07.03.2013 11:39, Andre Oppermann wrote: >>>>> This brings up a long standing sore point of our routing code >>>>> which this patch makes more pronounced. 
When an interface link
>>>>> state is down I don't want the route to it to persist but to
>>>>> become inactive so another path can be chosen. This is the very
>>>>> point of running a routing daemon. So on the link-down event
>>>>> the installed interface routes should be removed from the routing
>>>>> table. The configured addresses though should persist and the
>>>>> interface routes re-installed on a link-up event. What's your
>>>>> opinion on it?
>>>>
>>>> This is exactly what is done in current code for IPv4:
>>>> if_down calls if_unroute(), it calls pfctlinput() for every interface
>>>> address, and domain-dependent function like rip_ctlinput calls
>>>> in_ifscrub() cleaning given interface route.
>>>> However, address route (/32) still remains (but route daemons, at least
>>>> bird, tend to ignore it since it is not listed as valid interface
>>>> address/mask).
>>>
>>> IF_DOWN and link state down are not the same thing. When the cable
>>> is unplugged the link state goes down but not the interface.
>>
>> Oops, I missed the 'link' keyword.
>> IMHO 'operational down' should behave exactly the same as 'admin down',
>> i.e. delete interface routes from the routing table.
>> It should not be very hard to do.
>
> Are you going to implement it after the pinning patch?
;-) > > -- > Andre > > From owner-freebsd-net@FreeBSD.ORG Thu Mar 7 15:22:21 2013 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 87AFAB8D for ; Thu, 7 Mar 2013 15:22:21 +0000 (UTC) (envelope-from milu@dat.pl) Received: from jab.dat.pl (dat.pl [80.51.155.34]) by mx1.freebsd.org (Postfix) with ESMTP id F075EE7 for ; Thu, 7 Mar 2013 15:22:20 +0000 (UTC) Received: from jab.dat.pl (jsrv.dat.pl [127.0.0.1]) by jab.dat.pl (Postfix) with ESMTP id 46F9C123; Thu, 7 Mar 2013 16:13:49 +0100 (CET) X-Virus-Scanned: amavisd-new at dat.pl Received: from jab.dat.pl ([127.0.0.1]) by jab.dat.pl (jab.dat.pl [127.0.0.1]) (amavisd-new, port 10024) with LMTP id cGJcUAJLhwyx; Thu, 7 Mar 2013 16:13:44 +0100 (CET) Received: from [10.0.6.80] (unknown [212.69.68.42]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (No client certificate requested) by jab.dat.pl (Postfix) with ESMTPSA id D73A22D; Thu, 7 Mar 2013 16:13:43 +0100 (CET) Message-ID: <5138AED9.1020801@dat.pl> Date: Thu, 07 Mar 2013 16:14:33 +0100 From: Maciej Milewski User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130221 Thunderbird/17.0.3 MIME-Version: 1.0 To: freebsd-net Subject: Re: Implementing IP6 in 8.3 References: In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Content-Filtered-By: Mailman/MimeDel 2.1.14 Cc: freebsd-net X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 07 Mar 2013 15:22:21 -0000 On 06.03.2013 22:02, freebsd-net wrote: > Greetings, > I'm evaluating an ISP for the sake of building BSD operating systems on hardware > that they use (DSL modems, in this case). 
When I had my old NEC server, I had a > MIPS environment to develop in. I managed a 28k kernel. In any case, I'm back at > it for use in alot of hardware I have laying around. In my current situation, I'm > using a ZYXEL Q1000Z modem to connect to their service. While it's a relatively > new modem, it doesn't support IP6. It is my hope to replace the OS with one that > does. :) If it doesn't support IPv6 you can always try to use it in Transparent Bridging (RFC1483) mode. You can then put other router/computer that does IPv6 routing just after that modem. > I leased a /48 of IP4's from them, which /also/ came with as many IP6's. > So, not having implemented IP6 on any of my boxes (except by way of tunnel brokers), > I'm wondering 2 things: > If my underlying OS (FreeBSD-8.3) can support IP6, will it still function, even tho > my gateway (modem) doesn't? > Am I /correctly/ attempting to use it? > I'm answering authoritatively for the many domains I own. They have all functioned > well for many years via IP4. I have added the requisite AAAA records in all the zones, > as well as the associated RR's. > While the gateway (modem) /does/ have an IP6 address, I can't "speak" for it out of > DNS, because it would be an "out of zone" record. Even tho I'm the RP for the /48. > So it's up to the modem to answer accordingly. > BUT, I'm not sure I'm initiating any of this correctly via rc(8). Or more specifically, > via rc.conf(5). While I've read as much as I can find on the topic related to BSD, > boot messages indicate at least -- "IP6 gateway unreachable". > I'm currently using: > rc.conf(5): > ipv6_ifconfig_re0="2602:00d1:b4d6:e100:0000:0000:0000:0000" > ipv6_defaultrouter="2602:00d1:b4d6:e600:0000:0000:0000:0000" > I also have the corresponding host IP in hosts(5). > > Any help, pointers, guidance, answers /greatly/ appreciated. > > Thank you for all your time, and consideration. 
> > --Chris > > _______________________________________________ > freebsd-net@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-net > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" -- Pozdrawiam, Maciej Milewski From owner-freebsd-net@FreeBSD.ORG Thu Mar 7 15:35:49 2013 Return-Path: Delivered-To: net@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 49182F75; Thu, 7 Mar 2013 15:35:49 +0000 (UTC) (envelope-from melifaro@FreeBSD.org) Received: from mail.ipfw.ru (unknown [IPv6:2a01:4f8:120:6141::2]) by mx1.freebsd.org (Postfix) with ESMTP id E1A5F160; Thu, 7 Mar 2013 15:35:48 +0000 (UTC) Received: from dhcp170-36-red.yandex.net ([95.108.170.36]) by mail.ipfw.ru with esmtpsa (TLSv1:CAMELLIA256-SHA:256) (Exim 4.76 (FreeBSD)) (envelope-from ) id 1UDcub-000Cxl-8q; Thu, 07 Mar 2013 19:39:17 +0400 Message-ID: <5138B390.2080806@FreeBSD.org> Date: Thu, 07 Mar 2013 19:34:40 +0400 From: "Alexander V. Chernikov" User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:17.0) Gecko/17.0 Thunderbird/17.0 MIME-Version: 1.0 To: Andre Oppermann Subject: Re: [patch] interface routes References: <513834E4.7050203@FreeBSD.org> <51384443.5070209@freebsd.org> <51387D4A.9030408@FreeBSD.org> <51388046.7040408@freebsd.org> <51389B4B.1060003@freebsd.org> In-Reply-To: <51389B4B.1060003@freebsd.org> X-Enigmail-Version: 1.4.6 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 8bit Cc: =?ISO-8859-1?Q?Ermal_Lu=E7i?= , net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 07 Mar 2013 15:35:49 -0000 On 07.03.2013 17:51, Andre Oppermann wrote: > On 07.03.2013 14:38, Ermal Luçi wrote: >> On Thu, Mar 7, 2013 at 12:55 PM, Andre Oppermann > > wrote: >> >> On 07.03.2013 12:43, Alexander V. 
Chernikov wrote: >> >> On 07.03.2013 11:39, Andre Oppermann wrote: >> >> On 07.03.2013 07:34, Alexander V. Chernikov wrote: >> >> Hello list! >> >> There is a known long-lived issue with interface routes >> addition/deletion: >> >> ifconfig iface inet 1.2.3.4/24 can >> fail if given prefix is >> already in >> kernel route table (for >> example, advertised by IGP like OSPF). >> >> Interface route can be deleted via route(8) or any >> route socket user >> (sometimes this happens with >> popular opensource daemons like bird/quagga). >> >> Problem is reported at least in kern/106722 and >> kern/155772. >> >> >> You patch is a welcome addition. >> >> This can be fixed the following way: >> Immutable route flag (RTM_PINNED, added in 19995 with >> 'for future use' >> comment) is utilised to mark >> route 'immutable'. >> rtrequest1_fib refuses to delete routes with given >> flag unless >> RTM_PINNED is set in rti_flags. >> >> >> How do the routing daemons react to being unable to >> change/delete >> such a route? >> >> routing daemons live long with the fact that there route >> socket cmds can >> fail (and the is route(8) utility which can do anything), so >> typically >> bird/quagga yells like >> 'bird: KRT: Error sending route 11.0.0.0/24 >> to kernel: File exists' >> and marks given route as not installed in internal RIB. >> Additionally, >> daemon will probably re-try to insert such routes on every >> periodic KRT >> rescan (tens of minutes). >> >> >> >> Isn't it better to teach the routing code about metrics. >> Routing daemons cope better this way and they can handle this. >> So the policy of this behaviour can be controled by administrator >> rather than by code! >> With metrics you can add routes with bigger metric for interfaces and >> lower from routing daemons. >> This also can mitigate somehow on interfaces with the same subnet >> configured possibly. > > Generally I agree with you that this would be the ideal outcome. 
> However we're still quite a bit away from reaching that goal.
> To make this really work we have to make mpath plus metrics a first
> class citizen in the routing code and also update the routing
> daemons' kernel interfaces to know about this. I hope we get there
> in the not too distant future.

Radix is already over-bloated.

Typically, in performance-oriented solutions (hardware/software routers
from vendors) there is a clear separation between the RIB (where routing
protocol attributes, best candidate routes, and routes with different
priorities exist) and the FIB, which is typically some kind of radix
with the minimum needed info, e.g. prefix, nexthops, their interfaces,
and optional L2 data to prepend.

Our radix stands somewhere between RIB and FIB (since we have to support
route(8) and upper layer protocols): it serves badly as a RIB (little
functionality) and as a FIB (too much overhead and inefficient, overly
general code). For example, sizeof(rt_nodes[2]) (the first element of
rte) is 96 bytes on amd64. Additionally, the rte refcount approach is
totally broken.

I'm currently thinking of adding some kind of hooks to the current
route/radix code to permit building an efficient trie (or other
structure) for a given address family and using it for forwarding
purposes only. For example, I don't need a trie while doing MPLS label
switching: assuming the control plane allocates a contiguous label
space, I can use a label array for efficient lookup.

>
> As a first step I think it is important that Alexander's patch goes
> in to fix a long standing and very annoying problem with the code
> we have. Also the link down route withdraw should be added asap.
> Then we can take the next steps towards the ultimate goal you describe.
>
> I hope you do not object to Alexander's patch?
> -- WBR, Alexander From owner-freebsd-net@FreeBSD.ORG Thu Mar 7 16:40:01 2013 Return-Path: Delivered-To: freebsd-net@smarthost.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 5C20CE43 for ; Thu, 7 Mar 2013 16:40:01 +0000 (UTC) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:1900:2254:206c::16:87]) by mx1.freebsd.org (Postfix) with ESMTP id 3E21363B for ; Thu, 7 Mar 2013 16:40:01 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.6/8.14.6) with ESMTP id r27Ge1C5014140 for ; Thu, 7 Mar 2013 16:40:01 GMT (envelope-from gnats@freefall.freebsd.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.6/8.14.6/Submit) id r27Ge14V014139; Thu, 7 Mar 2013 16:40:01 GMT (envelope-from gnats) Date: Thu, 7 Mar 2013 16:40:01 GMT Message-Id: <201303071640.r27Ge14V014139@freefall.freebsd.org> To: freebsd-net@FreeBSD.org Cc: From: Gleb Smirnoff Subject: Re: kern/176667: libalias locks on uninitalized data X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list Reply-To: Gleb Smirnoff List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 07 Mar 2013 16:40:01 -0000 The following reply was made to PR kern/176667; it has been noted by GNATS. 
From: Gleb Smirnoff To: Lutz Donnerhacke Cc: freebsd-gnats-submit@FreeBSD.org Subject: Re: kern/176667: libalias locks on uninitalized data Date: Thu, 7 Mar 2013 20:30:26 +0400 On Tue, Mar 05, 2013 at 03:54:50PM +0000, Lutz Donnerhacke wrote: L> L> >Number: 176667 L> >Category: kern L> >Synopsis: libalias locks on uninitalized data L> >Confidential: no L> >Severity: non-critical L> >Priority: low L> >Responsible: freebsd-bugs L> >State: open L> >Quarter: L> >Keywords: L> >Date-Required: L> >Class: sw-bug L> >Submitter-Id: current-users L> >Arrival-Date: Tue Mar 05 16:00:00 UTC 2013 L> >Closed-Date: L> >Last-Modified: L> >Originator: Lutz Donnerhacke L> >Release: FreeBSD 8.3-RELEASE (GENERIC) L> >Organization: L> IKS Service GmbH L> >Environment: L> FreeBSD server7.net.encoline.de 8.3-RELEASE FreeBSD 8.3-RELEASE #0: Mon Apr 9 21:23:18 UTC 2012 root@mason.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC amd64 L> L> >Description: L> While testing terminating a huge number of PPPoX clients the kernel panics while doing in-kernel NAT. L> L> #4 0xffffffff808e8775 at calltrap+0x8 L> #5 0xffffffff80fa0f01 at HouseKeeping+0xa1 L> #6 0xffffffff80f9e6ab at LibAliasOutLocked+0x3b L> L> Please note, that the stack trace is incomplete. There are calls to IncrementalCleanup() and DeleteLink(), which are not reported in the stack trace. L> L> The problem seems to come from incorrect locking, so the contents of the libalias database get corrupted. L> L> This patch might be not the full solution, but is an obvious fix for an obvious bug. 
L> >How-To-Repeat:
L> Setting up ipfw nat, add more than 9000 clients using mpd5.6, generate traffic
L> >Fix:
L> --- sys/netinet/libalias/alias_db.c.ORIG 2013-03-05 16:49:13.000000000 +0100
L> +++ sys/netinet/libalias/alias_db.c 2013-03-05 16:50:09.000000000 +0100
L> @@ -2767,8 +2767,8 @@
L>      struct ip_fw rule;    /* On-the-fly built rule */
L>      int fwhole;           /* Where to punch hole */
L>
L> -    LIBALIAS_LOCK_ASSERT(la);
L>      la = lnk->la;
L> +    LIBALIAS_LOCK_ASSERT(la);
L>
L>      /* Don't do anything unless we are asked to */
L>      if (!(la->packetAliasMode & PKT_ALIAS_PUNCH_FW) ||
L> @@ -2841,8 +2841,8 @@
L>  {
L>      struct libalias *la;
L>
L> -    LIBALIAS_LOCK_ASSERT(la);
L>      la = lnk->la;
L> +    LIBALIAS_LOCK_ASSERT(la);
L>      if (lnk->link_type == LINK_TCP) {
L>          int fwhole = lnk->data.tcp->fwhole;    /* Where is the firewall
L>                                                  * hole? */

Neither the code being edited nor the patch is correct. The fw punching
isn't supported when libalias is compiled into the kernel. The
LIBALIAS_LOCK_ASSERT(la) on an uninitialized variable couldn't even get
past the compiler if the fw punching code were enabled at all. So these
lines should simply be removed for sanity.

Unfortunately, this isn't related to the panic you are hitting. Do you
have cores from that panic?

--
Totus tuus, Glebius.
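For readers following the patch quoted above, here is a minimal, self-contained sketch of why the order of the two lines matters in general. All names (demo_alias, demo_link, DEMO_LOCK_ASSERT, demo_clear_hole) are hypothetical stand-ins and not the libalias definitions. The sketch also says nothing about the panic itself, which the reply above considers unrelated.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the real libalias structures. */
struct demo_alias {
	bool locked;	/* stands in for "the instance lock is held" */
	int cleared;	/* counts calls, only so the demo has a result */
};

struct demo_link {
	struct demo_alias *la;	/* owning instance */
};

/* Hypothetical equivalent of a lock assertion macro. */
#define DEMO_LOCK_ASSERT(la)	assert((la) != NULL && (la)->locked)

/*
 * Assign first, assert second: the assertion has to inspect the
 * instance taken from the link, never an uninitialized local pointer.
 */
static int
demo_clear_hole(struct demo_link *lnk)
{
	struct demo_alias *la;

	la = lnk->la;
	DEMO_LOCK_ASSERT(la);
	la->cleared++;
	return (la->cleared);
}
```

With the two statements swapped, the macro would read `la` before it holds a value, which is undefined behaviour in C regardless of what the assertion checks.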
From owner-freebsd-net@FreeBSD.ORG Thu Mar 7 16:54:35 2013 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id EFFE553F; Thu, 7 Mar 2013 16:54:34 +0000 (UTC) (envelope-from ncrogers@gmail.com) Received: from mail-wg0-x22a.google.com (mail-wg0-x22a.google.com [IPv6:2a00:1450:400c:c00::22a]) by mx1.freebsd.org (Postfix) with ESMTP id 6AED76F9; Thu, 7 Mar 2013 16:54:34 +0000 (UTC) Received: by mail-wg0-f42.google.com with SMTP id 12so7028656wgh.5 for ; Thu, 07 Mar 2013 08:54:33 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:x-received:in-reply-to:references:date:message-id :subject:from:to:cc:content-type; bh=3nlPI5HskgFaltsThx3r8qsuhy086Tj8i7ftUYOgtDc=; b=MJCls8w9vapM/LQkmwU/F2WCPahpdpx79TBxZhsa82yVHS4+De+UYPTprQQRC3gc/O IciommG+Xefjgryf9ffkY03e7EmsLTxwIXcNbxO9+du/bThXXDhs9Cx5GZr4ruBsvNp+ tBNHMECL1ZMlbUh5RpzGxKdR+xuug1pabyMFKUm9udMQKDJzb2ji4ZDGSqfWgcZjSlpN uanQwVC3sN81d1HV8HP1oJkro48SHV5GKOXoyImeJvGDAx2e5YYdzF39kdWXNuyIBYtd b18armTxasrbraE++gSMAgcE4dFGrYdRSeLMNZes4c1F2w/3lvdwWASQOfXSYyfuoBNn L+yg== MIME-Version: 1.0 X-Received: by 10.180.81.164 with SMTP id b4mr34955782wiy.34.1362675273327; Thu, 07 Mar 2013 08:54:33 -0800 (PST) Received: by 10.194.110.195 with HTTP; Thu, 7 Mar 2013 08:54:33 -0800 (PST) In-Reply-To: <51378A9D.6080306@freebsd.org> References: <51378A9D.6080306@freebsd.org> Date: Thu, 7 Mar 2013 08:54:33 -0800 Message-ID: Subject: Re: Default route changes unexpectedly #2 (was Re: kernel: arpresolve: can't allocate llinfo for 65.59.233.102) From: Nick Rogers To: Andre Oppermann Content-Type: text/plain; charset=ISO-8859-1 Cc: "freebsd-net@freebsd.org" X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 07 Mar 2013 16:54:35 
-0000

I'm not sure. I have not explicitly enabled/disabled it. I am using
the GENERIC kernel from 9.1 plus PF+ALTQ.

# sysctl net.inet.flowtable.enable
sysctl: unknown oid 'net.inet.flowtable.enable'
# sysctl -a | grep flow
kern.sigqueue.overflow: 0
net.inet.tcp.reass.overflows: 0
net.inet6.ip6.auto_flowlabel: 1

uname -v
FreeBSD 9.1-RELEASE #0 r245436M: Mon Jan 14 16:34:21 EST 2013
root@fbsd_91:/usr/obj/usr/src/sys/CUSTOM

8.0 release notes say flowtable is enabled by default on amd64/i386.
So I presume it is enabled? I can't seem to find much information
about this for FreeBSD 9.x

On Wed, Mar 6, 2013 at 10:27 AM, Andre Oppermann wrote:
> Courtland,
>
> the arpresolve observation is very important. Do you have flowtable
> enabled in your kernel?
>
> --
> Andre
>
>
> On 06.03.2013 17:16, Adrian Chadd wrote:
>>
>> Another instance of it..
>> Adrian
>> On 6 March 2013 07:21, Courtland wrote:
>>>
>>> Has there been any progress on resolving this problem. Does anyone have a
>>> better idea as to where it is breaking down?
>>>
>>> I am experiencing the same problem under FreeBSD 9.1-RELEASE. I use PF for
>>> NAT, ALTQ, and RDR/filter rules. I'm not using PPPoE or dhclient. The
>>> default gateway changes to an IP that is not on my network when under heavy
>>> network load.
>>>
>>> The last time this happened I had a stream of arpresolve messages in the
>>> kernel for the IP that the default route was changed to.
>>> Mar 5 19:12:53 kernel: arpresolve: can't allocate llinfo for 50.142.201.101
>>> The default route was changed to 50.142.201.101 after these messages.
>>>
>>>
>>>
>>>
>>> --
>>> View this message in context:
>>> http://freebsd.1045724.n5.nabble.com/kernel-arpresolve-can-t-allocate-llinfo-for-65-59-233-102-tp5742320p5793139.html
>>> Sent from the freebsd-net mailing list archive at Nabble.com.
>>> _______________________________________________ >>> freebsd-net@freebsd.org mailing list >>> http://lists.freebsd.org/mailman/listinfo/freebsd-net >>> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" >> >> _______________________________________________ >> freebsd-net@freebsd.org mailing list >> http://lists.freebsd.org/mailman/listinfo/freebsd-net >> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" >> >> > From owner-freebsd-net@FreeBSD.ORG Thu Mar 7 16:56:08 2013 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id F09115ED for ; Thu, 7 Mar 2013 16:56:08 +0000 (UTC) (envelope-from fbsdmail@dnswatch.com) Received: from udns.ultimateDNS.NET (ultimatedns.net [209.180.214.225]) by mx1.freebsd.org (Postfix) with ESMTP id B2A01713 for ; Thu, 7 Mar 2013 16:56:08 +0000 (UTC) Received: from udns.ultimateDNS.NET (localhost [127.0.0.1]) by udns.ultimateDNS.NET (8.14.5/8.14.5) with ESMTP id r27Gu1vO002076; Thu, 7 Mar 2013 08:56:07 -0800 (PST) (envelope-from fbsdmail@dnswatch.com) Received: (from www@localhost) by udns.ultimateDNS.NET (8.14.5/8.14.5/Submit) id r27Gttj8002070; Thu, 7 Mar 2013 08:55:55 -0800 (PST) (envelope-from fbsdmail@dnswatch.com) Received: from udns.ultimatedns.net ([209.180.214.225]) (UDNSMS authenticated user chrish) by ultimatedns.net with HTTP; Thu, 7 Mar 2013 08:55:55 -0800 (PST) Message-ID: In-Reply-To: <5138AED9.1020801@dat.pl> References: <5138AED9.1020801@dat.pl> Date: Thu, 7 Mar 2013 08:55:55 -0800 (PST) Subject: Re: Implementing IP6 in 8.3 From: "freebsd-net" To: "Maciej Milewski" User-Agent: UDNSMS/2.0.3 MIME-Version: 1.0 Content-Type: text/plain;charset=utf-8 Content-Transfer-Encoding: 8bit X-Priority: 3 (Normal) Importance: Normal Cc: freebsd-net X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD 
X-List-Received-Date: Thu, 07 Mar 2013 16:56:09 -0000

Greetings Maciej Milewski, and thank you for your thoughtful reply.

> On 06.03.2013 22:02, freebsd-net wrote:
>> Greetings,
>> I'm evaluating an ISP for the sake of building BSD operating systems on hardware
>> that they use (DSL modems, in this case). When I had my old NEC server, I had a
>> MIPS environment to develop in. I managed a 28k kernel. In any case, I'm back at
>> it for use in a lot of hardware I have lying around. In my current situation, I'm
>> using a ZYXEL Q1000Z modem to connect to their service. While it's a relatively
>> new modem, it doesn't support IP6. It is my hope to replace the OS with one that
>> does. :)
> If it doesn't support IPv6 you can always try to use it in Transparent
> Bridging (RFC1483) mode.
>
> You can then put another router/computer that does IPv6 routing just after
> that modem.
>
Thank you for the links. I was aware of that, but it requires that every
connection made directly to the modem send the PPPoE creds to the modem.
While it's simple enough to connect a router/switch between the modem
and the clients, it adds an additional hop. I think I'll be better served
building a (free)BSD kernel and drivers for the modem -- assuming that,
because the modem doesn't support IP6, it's not possible to route IP6
traffic directly, unless through a "tunnel broker".

>> I leased a /48 of IP4's from them, which /also/ came with as many IP6's.
>> So, not having implemented IP6 on any of my boxes (except by way of tunnel brokers),
>> I'm wondering 2 things:
>> If my underlying OS (FreeBSD-8.3) can support IP6, will it still function, even tho
>> my gateway (modem) doesn't?
>> Am I /correctly/ attempting to use it?
>> I'm answering authoritatively for the many domains I own. They have all functioned
>> well for many years via IP4. I have added the requisite AAAA records in all the zones,
>> as well as the associated RR's.
>> While the gateway (modem) /does/ have an IP6 address, I can't "speak" for it out of >> DNS, because it would be an "out of zone" record. Even tho I'm the RP for the /48. >> So it's up to the modem to answer accordingly. >> BUT, I'm not sure I'm initiating any of this correctly via rc(8). Or more specifically, >> via rc.conf(5). While I've read as much as I can find on the topic related to BSD, >> boot messages indicate at least -- "IP6 gateway unreachable". >> I'm currently using: >> rc.conf(5): >> ipv6_ifconfig_re0="2602:00d1:b4d6:e100:0000:0000:0000:0000" >> ipv6_defaultrouter="2602:00d1:b4d6:e600:0000:0000:0000:0000" >> I also have the corresponding host IP in hosts(5). >> >> Any help, pointers, guidance, answers /greatly/ appreciated. >> >> Thank you for all your time, and consideration. >> >> --Chris >> >> _______________________________________________ >> freebsd-net@freebsd.org mailing list >> http://lists.freebsd.org/mailman/listinfo/freebsd-net >> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" > > -- > Pozdrawiam, > Maciej Milewski Thanks again, for taking the time to respond. 
--Chris > > _______________________________________________ > freebsd-net@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-net > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" > From owner-freebsd-net@FreeBSD.ORG Thu Mar 7 17:07:51 2013 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 429C78F5; Thu, 7 Mar 2013 17:07:51 +0000 (UTC) (envelope-from ncrogers@gmail.com) Received: from mail-ee0-f49.google.com (mail-ee0-f49.google.com [74.125.83.49]) by mx1.freebsd.org (Postfix) with ESMTP id 8CC1B7A3; Thu, 7 Mar 2013 17:07:49 +0000 (UTC) Received: by mail-ee0-f49.google.com with SMTP id d41so529474eek.36 for ; Thu, 07 Mar 2013 09:07:43 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:x-received:in-reply-to:references:date:message-id :subject:from:to:cc:content-type; bh=t/FA6a5iw9i48jJqavMhUnLC7Qp6NR1n6q27Ku9YD+U=; b=TIfQPfStDVHYmIx8wi9sebwkCrKJgTWTP8I0H+G7sUjiwkVFO35OVewTU5y1MCuEZC yT1HaL3y1e/Cr2AaOCtGv91rz63hAmgJzAdamCg8N4XvSQ8mZC/hiRurIl1ZVxgCeP5J qX6kLsTiBRmo5U3qRwfmXlPHMV2DV8Sy/U4TnBDcX1CgkCJ3htHBi/XmLLz8w4M49grS 5LQKmZDU+iL1I1LBEVy+7BjKNS8J/nQHmS2RrmECAwBOEWHzychjr0bpjWumJQsxDFrp OoG0WvCAKXWQmBmCxeowJkQl/sscP2eaXm4rFtQemcLpZfBqDvvkQOZkNTBlfksctsXq gxFQ== MIME-Version: 1.0 X-Received: by 10.195.12.133 with SMTP id eq5mr55608632wjd.52.1362676063291; Thu, 07 Mar 2013 09:07:43 -0800 (PST) Received: by 10.194.110.195 with HTTP; Thu, 7 Mar 2013 09:07:43 -0800 (PST) In-Reply-To: <5136FD71.6000408@freebsd.org> References: <5136FD71.6000408@freebsd.org> Date: Thu, 7 Mar 2013 09:07:43 -0800 Message-ID: Subject: Re: Default route changes unexpectedly From: Nick Rogers To: Andre Oppermann Content-Type: text/plain; charset=ISO-8859-1 Cc: "freebsd-net@freebsd.org" X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list 
List-Id: Networking and TCP/IP with FreeBSD
X-List-Received-Date: Thu, 07 Mar 2013 17:07:51 -0000

On Wed, Mar 6, 2013 at 12:25 AM, Andre Oppermann wrote:
> On 05.03.2013 18:39, Nick Rogers wrote:
>>
>> Hello,
>>
>> I am attempting to create awareness of a serious issue affecting users
>> of FreeBSD 9.x and PF. There appears to be a bug that allows the
>> kernel's routing table to be corrupted by traffic routing through the
>> system. Under heavy traffic load, the default route can seemingly
>> randomly change to an IP address that is not directly connected to the
>> network (i.e., is not configured anywhere). Dhclient is not in the
>> mix, nor is routed, bgpd, etc. Running `route monitor` shows no
>> evidence of the change in the default route. The one commonality
>> between all the systems experiencing this problem seems to be the use
>> of PF.
>>
>> Obviously this is a serious problem as it causes all Internet-bound
>> traffic to stop routing until the default route is corrected. Some
>> users, including myself, are working around this problem by installing
>> a script that runs multiple times a second to check if the default
>> route is incorrect and fixing it if necessary, which mitigates the
>> amount of downtime caused by the bug.
>
>
> Can you describe your traffic forwarding setup in more detail?
> Is it only pf, or do you run netgraph, or other things as well?
> Do you use flow routing?

I use PF for NAT, filtering, and rdr rules. ALTQ for bandwidth
management. I do not use netgraph. I use vlans. PF redirects to squid
as a transproxy. I'm not familiar with flow routing, so unless it's
enabled in 9.1 by default I do not use it.

>
> How frequently does this happen?

Every other day during periods of heavier Internet-bound traffic.

>
> I'm trying to create a stack graph to see which parts of the network
> stack are involved in handling your packet.
> > -- > Andre > >> Please refer to these past posts for more examples and evidence of >> other users experiencing this problem: >> >> http://forums.freebsd.org/showthread.php?p=211610#post211610 >> >> >> http://freebsd.1045724.n5.nabble.com/Default-route-quot-random-quot-gateway-modification-bug-td5750820.html >> >> http://lists.freebsd.org/pipermail/freebsd-net/2012-March/031879.html >> >> http://lists.freebsd.org/pipermail/freebsd-ipfw/2010-September/004361.html >> >> There is also a PR that was incorrectly labeled as an IPFW issue. >> Myself and others believe this issue is not restricted to the use of >> IPFW and that the PR should be relabeled. I am inclined to think it is >> strictly a PF issue since I am not using IPFW, however there is >> evidence of the default route changing on people using IPFW for past >> versions of FreeBSD (7.x/8.x), so perhaps this is related. >> >> http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/174749 >> >> Another PR for the same problem but specific to IPFW and 8.2-RELEASE >> >> http://www.freebsd.org/cgi/query-pr.cgi?pr=157796 >> >> I am hoping someone reading this can give the problem the attention it >> deserves. Thank you. 
>> >> -Nick >> _______________________________________________ >> freebsd-net@freebsd.org mailing list >> http://lists.freebsd.org/mailman/listinfo/freebsd-net >> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" >> >> > From owner-freebsd-net@FreeBSD.ORG Thu Mar 7 17:09:35 2013 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id B9E64A8A for ; Thu, 7 Mar 2013 17:09:35 +0000 (UTC) (envelope-from andre@freebsd.org) Received: from c00l3r.networx.ch (c00l3r.networx.ch [62.48.2.2]) by mx1.freebsd.org (Postfix) with ESMTP id 1760D7C1 for ; Thu, 7 Mar 2013 17:09:34 +0000 (UTC) Received: (qmail 8318 invoked from network); 7 Mar 2013 18:23:01 -0000 Received: from c00l3r.networx.ch (HELO [127.0.0.1]) ([62.48.2.2]) (envelope-sender ) by c00l3r.networx.ch (qmail-ldap-1.03) with SMTP for ; 7 Mar 2013 18:23:01 -0000 Message-ID: <5138C9C8.6030809@freebsd.org> Date: Thu, 07 Mar 2013 18:09:28 +0100 From: Andre Oppermann User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:17.0) Gecko/20130107 Thunderbird/17.0.2 MIME-Version: 1.0 To: Nick Rogers Subject: Re: Default route changes unexpectedly #2 (was Re: kernel: arpresolve: can't allocate llinfo for 65.59.233.102) References: <51378A9D.6080306@freebsd.org> In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: "freebsd-net@freebsd.org" X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 07 Mar 2013 17:09:35 -0000 On 07.03.2013 17:54, Nick Rogers wrote: > I'm not sure. I have not explicitly enabled/disabled it. I am using > the GENERIC kernel from 9.1 plus PF+ALTQ. 
>
> # sysctl net.inet.flowtable.enable
> sysctl: unknown oid 'net.inet.flowtable.enable'
> # sysctl -a | grep flow
> kern.sigqueue.overflow: 0
> net.inet.tcp.reass.overflows: 0
> net.inet6.ip6.auto_flowlabel: 1
>
> uname -v
> FreeBSD 9.1-RELEASE #0 r245436M: Mon Jan 14 16:34:21 EST 2013
> root@fbsd_91:/usr/obj/usr/src/sys/CUSTOM
>
> 8.0 release notes say flowtable is enabled by default on amd64/i386.
> So I presume it is enabled? I can't seem to find much information
> about this for FreeBSD 9.x

It's not compiled in GENERIC on 9.x because it had/has some stability
issues. I just wanted to make sure that the problem really comes out
of the arpresolve area before digging into it.

--
Andre

> On Wed, Mar 6, 2013 at 10:27 AM, Andre Oppermann wrote:
>> Courtland,
>>
>> the arpresolve observation is very important. Do you have flowtable
>> enabled in your kernel?
>>
>> --
>> Andre
>>
>>
>> On 06.03.2013 17:16, Adrian Chadd wrote:
>>>
>>> Another instance of it..
>>> Adrian
>>> On 6 March 2013 07:21, Courtland wrote:
>>>>
>>>> Has there been any progress on resolving this problem? Does anyone have a
>>>> better idea as to where it is breaking down?
>>>>
>>>> I am experiencing the same problem under FreeBSD 9.1-RELEASE. I use PF for
>>>> NAT, ALTQ, and RDR/filter rules. I'm not using PPPoE or dhclient. The
>>>> default gateway changes to an IP that is not on my network when under heavy
>>>> network load.
>>>>
>>>> The last time this happened I had a stream of arpresolve messages in the
>>>> kernel for the IP that the default route was changed to.
>>>> Mar 5 19:12:53 kernel: arpresolve: can't allocate llinfo for 50.142.201.101
>>>> The default route was changed to 50.142.201.101 after these messages.
>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> View this message in context:
>>>> http://freebsd.1045724.n5.nabble.com/kernel-arpresolve-can-t-allocate-llinfo-for-65-59-233-102-tp5742320p5793139.html
>>>> Sent from the freebsd-net mailing list archive at Nabble.com.
>>>> _______________________________________________ >>>> freebsd-net@freebsd.org mailing list >>>> http://lists.freebsd.org/mailman/listinfo/freebsd-net >>>> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" >>> >>> _______________________________________________ >>> freebsd-net@freebsd.org mailing list >>> http://lists.freebsd.org/mailman/listinfo/freebsd-net >>> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" >>> >>> >> > > From owner-freebsd-net@FreeBSD.ORG Thu Mar 7 17:26:34 2013 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id D6DABD2C; Thu, 7 Mar 2013 17:26:34 +0000 (UTC) (envelope-from fbsdmail@dnswatch.com) Received: from udns.ultimateDNS.NET (ultimatedns.net [209.180.214.225]) by mx1.freebsd.org (Postfix) with ESMTP id 8E6D1879; Thu, 7 Mar 2013 17:26:34 +0000 (UTC) Received: from udns.ultimateDNS.NET (localhost [127.0.0.1]) by udns.ultimateDNS.NET (8.14.5/8.14.5) with ESMTP id r27HQRaU003640; Thu, 7 Mar 2013 09:26:33 -0800 (PST) (envelope-from fbsdmail@dnswatch.com) Received: (from www@localhost) by udns.ultimateDNS.NET (8.14.5/8.14.5/Submit) id r27HQMHF003634; Thu, 7 Mar 2013 09:26:22 -0800 (PST) (envelope-from fbsdmail@dnswatch.com) Received: from udns.ultimatedns.net ([209.180.214.225]) (UDNSMS authenticated user chrish) by ultimatedns.net with HTTP; Thu, 7 Mar 2013 09:26:22 -0800 (PST) Message-ID: In-Reply-To: References: Date: Thu, 7 Mar 2013 09:26:22 -0800 (PST) Subject: Re: dhclient issue. 
From: "freebsd-net"
To: araujo@freebsd.org
User-Agent: UDNSMS/2.0.3
MIME-Version: 1.0
Content-Type: text/plain;charset=utf-8
Content-Transfer-Encoding: 8bit
X-Priority: 3 (Normal)
Importance: Normal
Cc: freebsd-net@freebsd.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD
X-List-Received-Date: Thu, 07 Mar 2013 17:26:34 -0000

> Hello Guys,
>
> I've run into a problem with dhclient this week on 9.1-RELEASE!
> The log is below:
>
> [root@home ~]# uname -a
> FreeBSD HOME 9.1-RELEASE FreeBSD 9.1-RELEASE #10: Tue Mar 5 18:57:14 CST
> 2013 root@home:/usr/src/sys/HOME.amd64 amd64
>
>
> [root@home ~]# dhclient ix0
> PID = 3276, PPID = 3274
> fibnum = 0
> fibcmd = setfib 0
> interface = ix0
> ifconfig: ioctl (SIOCAIFADDR): File exists
> ix0: not found
> exiting.
>
> [root@home ~]# tail /var/log/messages
> Mar 17 14:53:52 ESSD46B70 dhclient[3244]: exiting.
> Mar 17 14:54:01 ESSD46B70 login: ROOT LOGIN (root) ON ttyv0
> Mar 17 14:54:15 ESSD46B70 dhclient[3257]: ix0: not found
> Mar 17 14:54:15 ESSD46B70 dhclient[3257]: exiting.
> Mar 17 14:54:15 ESSD46B70 dhclient[3258]: connection closed
> Mar 17 14:54:15 ESSD46B70 dhclient[3258]: exiting.
> Mar 17 14:54:57 ESSD46B70 dhclient[3274]: ix0: not found
>
> [root@home ~]# ifconfig ix0
> ix0: flags=8843 metric 0 mtu 1500
>         options=403bb
>         ether 00:08:9b:d4:6b:71
>         nd6 options=29
>         media: Ethernet autoselect (10Gbase-T )
>         status: active
>
>
> I have another interface em0, and there it works properly!
> Any idea what is going on?

Is there anything in rc.conf(5) that might conflict with your attempt
to bring this interface up via dhclient(8)? For example, if you already
have ifconfig_ix0="DHCP" and that failed during boot (init), then
dhclient(8) may still be trying to bring ix0 up, which will make your
manual attempt fail.
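
To make that rc.conf(5) check concrete, here is a minimal sketch (not
something from the original report) that scans rc.conf-style text for
interfaces already set to DHCP. The function name and the sample text
are hypothetical; the only assumption about rc.conf(5) is that an
ifconfig_<interface> value containing DHCP, SYNCDHCP or NOSYNCDHCP
starts dhclient(8) at boot.

```python
import re

def dhcp_interfaces(rc_conf_text):
    """Return names from ifconfig_<name>= lines whose value requests DHCP.

    Hypothetical helper for illustration; it only understands simple
    one-line assignments and treats '#' as the start of a comment.
    """
    found = []
    for line in rc_conf_text.splitlines():
        line = line.split("#", 1)[0].strip()          # drop trailing comments
        m = re.match(r'^ifconfig_(\w+)=["\']?([^"\']*)["\']?$', line)
        if not m:
            continue
        tokens = m.group(2).upper().split()
        # DHCP, SYNCDHCP and NOSYNCDHCP all end in "DHCP"
        if any(tok.endswith("DHCP") for tok in tokens):
            found.append(m.group(1))
    return found

# Made-up sample, not the poster's actual configuration.
sample = '''
hostname="home"
ifconfig_em0="DHCP"
ifconfig_ix0="DHCP"   # set at install time
'''
print(dhcp_interfaces(sample))
```

If ix0 shows up in such a list, a dhclient instance started at boot is
a plausible suspect before running dhclient by hand.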
--Chris > > Best Regards, > -- > Marcelo Araujo > araujo@FreeBSD.org > _______________________________________________ > freebsd-net@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-net > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" > From owner-freebsd-net@FreeBSD.ORG Thu Mar 7 19:27:44 2013 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 4FEB68EA; Thu, 7 Mar 2013 19:27:44 +0000 (UTC) (envelope-from krzysiek@airnet.opole.pl) Received: from base.airnet.opole.pl (ns2.airmax.pl [176.111.128.3]) by mx1.freebsd.org (Postfix) with ESMTP id 0C659E50; Thu, 7 Mar 2013 19:27:43 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by base.airnet.opole.pl (Postfix) with ESMTP id EA0727FF031; Thu, 7 Mar 2013 20:27:37 +0100 (CET) Received: from base.airnet.opole.pl ([127.0.0.1]) by localhost (mail.airnet.opole.pl [127.0.0.1]) (maiad, port 10024) with ESMTP id 70250-06; Thu, 7 Mar 2013 20:27:37 +0100 (CET) Received: from [10.10.11.223] (unknown [176.111.138.12]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (No client certificate requested) (Authenticated sender: krzysiek@airnet.opole.pl) by base.airnet.opole.pl (Postfix) with ESMTPSA id B6B417FF02D; Thu, 7 Mar 2013 20:27:37 +0100 (CET) Message-ID: <5138EA26.4000403@airnet.opole.pl> Date: Thu, 07 Mar 2013 20:27:34 +0100 From: Krzysztof Barcikowski User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:17.0) Gecko/20130215 Thunderbird/17.0.3 MIME-Version: 1.0 To: Andre Oppermann Subject: Re: Default route changes unexpectedly #2 (was Re: kernel: arpresolve: can't allocate llinfo for 65.59.233.102) References: <51378A9D.6080306@freebsd.org> <5138C9C8.6030809@freebsd.org> In-Reply-To: <5138C9C8.6030809@freebsd.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: Nick Rogers , "freebsd-net@freebsd.org" 
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD
X-List-Received-Date: Thu, 07 Mar 2013 19:27:44 -0000

On 2013-03-07 18:09, Andre Oppermann wrote:
> On 07.03.2013 17:54, Nick Rogers wrote:
>> I'm not sure. I have not explicitly enabled/disabled it. I am using
>> the GENERIC kernel from 9.1 plus PF+ALTQ.
>>
>> # sysctl net.inet.flowtable.enable
>> sysctl: unknown oid 'net.inet.flowtable.enable'
>> # sysctl -a | grep flow
>> kern.sigqueue.overflow: 0
>> net.inet.tcp.reass.overflows: 0
>> net.inet6.ip6.auto_flowlabel: 1
>>
>> uname -v
>> FreeBSD 9.1-RELEASE #0 r245436M: Mon Jan 14 16:34:21 EST 2013
>> root@fbsd_91:/usr/obj/usr/src/sys/CUSTOM
>>
>> 8.0 release notes say flowtable is enabled by default on amd64/i386.
>> So I presume it is enabled? I can't seem to find much information
>> about this for FreeBSD 9.x
>
> It's not compiled in GENERIC on 9.x because it had/has some stability
> issues. I just wanted to make sure that the problem really comes out
> of the arpresolve area before digging into it.
>

I can confirm I get these messages as well:

Mar 7 19:40:25 opole kernel: arpresolve: can't allocate llinfo for 86.58.122.125
Mar 7 19:40:25 opole kernel: arpresolve: can't allocate llinfo for 86.58.122.125

The IP 86.58.122.125 is not from any IP pool that I use.
Krzysiek From owner-freebsd-net@FreeBSD.ORG Thu Mar 7 20:26:29 2013 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 0273B679 for ; Thu, 7 Mar 2013 20:26:29 +0000 (UTC) (envelope-from andre@freebsd.org) Received: from c00l3r.networx.ch (c00l3r.networx.ch [62.48.2.2]) by mx1.freebsd.org (Postfix) with ESMTP id 5CBCF124 for ; Thu, 7 Mar 2013 20:26:27 +0000 (UTC) Received: (qmail 17974 invoked from network); 7 Mar 2013 21:39:52 -0000 Received: from c00l3r.networx.ch (HELO [127.0.0.1]) ([62.48.2.2]) (envelope-sender ) by c00l3r.networx.ch (qmail-ldap-1.03) with SMTP for ; 7 Mar 2013 21:39:52 -0000 Message-ID: <5138F7EC.30804@freebsd.org> Date: Thu, 07 Mar 2013 21:26:20 +0100 From: Andre Oppermann User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:17.0) Gecko/20130107 Thunderbird/17.0.2 MIME-Version: 1.0 To: Krzysztof Barcikowski Subject: Re: Default route changes unexpectedly #2 (was Re: kernel: arpresolve: can't allocate llinfo for 65.59.233.102) References: <51378A9D.6080306@freebsd.org> <5138C9C8.6030809@freebsd.org> <5138EA26.4000403@airnet.opole.pl> In-Reply-To: <5138EA26.4000403@airnet.opole.pl> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: Nick Rogers , "freebsd-net@freebsd.org" X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 07 Mar 2013 20:26:29 -0000 On 07.03.2013 20:27, Krzysztof Barcikowski wrote: > W dniu 2013-03-07 18:09, Andre Oppermann pisze: >> On 07.03.2013 17:54, Nick Rogers wrote: >>> I'm not sure. I have not explicitly enabled/disabled it. I am using >>> the GENERIC kernel from 9.1 plus PF+ALTQ. 
>>> >>> # sysctl net.inet.flowtable.enable >>> sysctl: unknown oid 'net.inet.flowtable.enable' >>> # sysctl -a | grep flow >>> kern.sigqueue.overflow: 0 >>> net.inet.tcp.reass.overflows: 0 >>> net.inet6.ip6.auto_flowlabel: 1 >>> >>> uname -v >>> FreeBSD 9.1-RELEASE #0 r245436M: Mon Jan 14 16:34:21 EST 2013 >>> root@fbsd_91:/usr/obj/usr/src/sys/CUSTOM >>> >>> 8.0 release notes say flowtable is enabled by default on amd64/i386. >>> So I presume it is enabled? I can't seem to find much information >>> about this for FreeBSD 9.x >> >> It's not compiled in GENERIC on 9.x because it had/has some stability >> issues. I just wanted to make sure that the problem really come out >> of the arpresolve area before digging into it. >> > > I can confirm I get these messages as well: > > Mar 7 19:40:25 opole kernel: arpresolve: can't allocate llinfo for 86.58.122.125 > Mar 7 19:40:25 opole kernel: arpresolve: can't allocate llinfo for 86.58.122.125 OK. Then this is the common factor. > IP 86.58.122.125 is not from IP pool used by me. You mean it's not from one of the subnets on your interfaces? 
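
That question can be answered mechanically. As a minimal sketch (not
part of the original exchange), the check below tests whether an
address falls inside any subnet configured on the local interfaces.
The function name and the subnets are made-up placeholders; on a real
system they would be taken from the ifconfig output.

```python
import ipaddress

def on_link_subnets(addr, interface_subnets):
    """Return the configured subnets (as strings) that contain addr."""
    ip = ipaddress.ip_address(addr)
    return [str(net) for net in interface_subnets if ip in net]

# Hypothetical interface subnets (documentation ranges), for illustration only.
subnets = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/25"),
]

print(on_link_subnets("86.58.122.125", subnets))  # off-net: empty list
print(on_link_subnets("192.0.2.17", subnets))     # on-link: one match
```

An empty result means the address is off-net, so a default route that
points at it cannot be resolved by ARP.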
-- Andre From owner-freebsd-net@FreeBSD.ORG Thu Mar 7 20:53:40 2013 Return-Path: Delivered-To: net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 5192F895 for ; Thu, 7 Mar 2013 20:53:40 +0000 (UTC) (envelope-from andre@freebsd.org) Received: from c00l3r.networx.ch (c00l3r.networx.ch [62.48.2.2]) by mx1.freebsd.org (Postfix) with ESMTP id C909C24D for ; Thu, 7 Mar 2013 20:53:39 +0000 (UTC) Received: (qmail 19311 invoked from network); 7 Mar 2013 22:07:04 -0000 Received: from c00l3r.networx.ch (HELO [127.0.0.1]) ([62.48.2.2]) (envelope-sender ) by c00l3r.networx.ch (qmail-ldap-1.03) with SMTP for ; 7 Mar 2013 22:07:04 -0000 Message-ID: <5138FE4C.5030307@freebsd.org> Date: Thu, 07 Mar 2013 21:53:32 +0100 From: Andre Oppermann User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:17.0) Gecko/20130107 Thunderbird/17.0.2 MIME-Version: 1.0 To: "Alexander V. Chernikov" Subject: Re: [patch] interface routes References: <513834E4.7050203@FreeBSD.org> <51384443.5070209@freebsd.org> <51387D4A.9030408@FreeBSD.org> <51388046.7040408@freebsd.org> <51389B4B.1060003@freebsd.org> <5138B390.2080806@FreeBSD.org> In-Reply-To: <5138B390.2080806@FreeBSD.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 8bit Cc: =?ISO-8859-1?Q?Ermal_Lu=E7i?= , net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 07 Mar 2013 20:53:40 -0000 On 07.03.2013 16:34, Alexander V. Chernikov wrote: > On 07.03.2013 17:51, Andre Oppermann wrote: >> On 07.03.2013 14:38, Ermal Luçi wrote: >>> Isn't it better to teach the routing code about metrics. >>> Routing daemons cope better this way and they can handle this. >>> So the policy of this behaviour can be controled by administrator >>> rather than by code! 
>>> With metrics you can add routes with bigger metric for interfaces and
>>> lower from routing daemons.
>>> This also can mitigate somehow on interfaces with the same subnet
>>> configured possibly.
>>
>> Generally I agree with you that this would be the ideal outcome.
>> However we're still quite a bit away from reaching that goal.
>> To make this really work we have to make mpath plus metrics a first
>> class citizen in the routing code and also update the routing
>> daemons' kernel interfaces to know about this. I hope we get there
>> in the not too distant future.
>
> Radix is already over-bloated. Typically in performance-oriented
> solutions (hardware/software routers from vendors) there is a clear
> separation between RIB (where route protocol attributes, best candidate
> routes, routes with different priority exist) and FIB, which is
> typically some kind of radix with minimum needed info, e.g.:
> prefix, nexthops, their interfaces, optional L2 data to prepend.

ACK. Though the bloat in itself is not the main problem, other than
kernel memory consumption. If you think of it in terms of cache line
misses, everything more than 128 bytes away is potentially a cache miss.
The additional distance due to a large or small structure makes no
difference. What makes an important difference is the internal layout
of the structure and whether the relevant variables are within the same
cache line. This can be a problem in a large structure when some data
is at the beginning and other data at the end on a different cache line.
Here potentially twice the cache miss latency per trie element hurts.

If we can manage to put everything for a trie search into the first
cache line we're quite good already. The additional win for tighter
packing isn't that large anymore.

> Our radix stands somewhere between RIB and FIB (since we have to support
> route(8) and upper layer protocols): it serves badly as RIB (little
> functionality) and as FIB: too much overhead and inefficient/too general
> code.

ACK. There is a big philosophical question on the model. Make it a
RIB so that independent but complementary routing daemons can add
routes concurrently and the kernel knows which have higher priority
or are equal cost for traffic balancing (as in bgpd+ospfd). Or strip
it to a FIB and have an external program do the RIB and coordination
across routing daemons (as in the Quagga suite).

> For example, sizeof(rt_nodes[2]) (first element of rte) is 96 bytes on
> amd64.

That is a problem if the trie traversal function accesses fields beyond
this cache line. The main problem is that key and mask are pointers
and thus external to the radix_node, adding even more cache misses.

> Additionally, rte refcount approach is totally broken.

ACK. Copy and out. No references or external pointers into the table.

> I'm currently thinking of adding some kind of hooks to current
> route/radix code to permit building efficient trie (or other structure)
> for given address family and to use it for forwarding purposes only.

AFAIK Marko Zec and/or Luigi have done some work in this area as well.

> For example, I don't need trie while doing MPLS label switching:
> assuming control plane allocates contiguous label space, I can use label
> array for efficient lookup.

Nobody's forcing you to use a radix trie for MPLS. In theory each
protocol can choose its own best method.
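
The label-array idea quoted above can be sketched as follows. This is
only an illustration of lookup by direct index over a contiguous label
range, not kernel code and not anything from the patch under
discussion. The class name, the base label, the table size and the
(next hop, outgoing label) entry layout are all hypothetical.

```python
class LabelTable:
    """Direct-indexed table for a contiguous range of MPLS labels.

    Hypothetical illustration: labels base .. base+size-1 map to
    slots 0 .. size-1, so a lookup is one subtraction and one index.
    """

    def __init__(self, base, size):
        self.base = base
        self.entries = [None] * size

    def install(self, label, next_hop, out_label):
        idx = label - self.base
        if not 0 <= idx < len(self.entries):
            raise ValueError("label outside the allocated range")
        self.entries[idx] = (next_hop, out_label)

    def lookup(self, label):
        idx = label - self.base
        if 0 <= idx < len(self.entries):
            return self.entries[idx]   # None if the slot is unused
        return None                    # label not in this table

# Made-up values for illustration.
table = LabelTable(base=1000, size=4)
table.install(1002, "10.0.0.2", 2002)
print(table.lookup(1002))
print(table.lookup(1001))
```

The lookup cost does not depend on how many labels are installed,
which is the property that makes a trie unnecessary in this case.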
-- Andre From owner-freebsd-net@FreeBSD.ORG Thu Mar 7 20:58:20 2013 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id EA6E696A; Thu, 7 Mar 2013 20:58:20 +0000 (UTC) (envelope-from qing.li@bluecoat.com) Received: from plsvl-mailgw-01.bluecoat.com (plsvl-mailgw-01.bluecoat.com [199.91.133.11]) by mx1.freebsd.org (Postfix) with ESMTP id BEA9F278; Thu, 7 Mar 2013 20:58:20 +0000 (UTC) Received: from pwsvl-exchts-03.internal.cacheflow.com (pwsvl-exchts-03.bluecoat.com [10.2.2.160]) by plsvl-mailgw-01.bluecoat.com (Postfix) with ESMTP id 852BE81A0BE; Thu, 7 Mar 2013 11:51:46 -0900 (AKST) Received: from PWSVL-EXCMBX-04.internal.cacheflow.com ([fe80::c596:c77:dd67:b72d]) by pwsvl-exchts-03.internal.cacheflow.com ([fe80::a508:17dc:1550:e9f6%12]) with mapi id 14.01.0355.002; Thu, 7 Mar 2013 12:51:45 -0800 From: "Li, Qing" To: Krzysztof Barcikowski , Andre Oppermann Subject: RE: Default route changes unexpectedly #2 (was Re: kernel: arpresolve: can't allocate llinfo for 65.59.233.102) Thread-Topic: Default route changes unexpectedly #2 (was Re: kernel: arpresolve: can't allocate llinfo for 65.59.233.102) Thread-Index: AQHOG2nqowWL/iUfmkKP6+iB7XtxKZias+mw Date: Thu, 7 Mar 2013 20:51:45 +0000 Message-ID: References: <51378A9D.6080306@freebsd.org> <5138C9C8.6030809@freebsd.org> <5138EA26.4000403@airnet.opole.pl> In-Reply-To: <5138EA26.4000403@airnet.opole.pl> Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-originating-ip: [10.2.2.106] Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Cc: Nick Rogers , "freebsd-net@freebsd.org" X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 07 Mar 2013 20:58:21 -0000 Hi, 
>
> I can confirm I get these messages as well:
>
> Mar 7 19:40:25 opole kernel: arpresolve: can't allocate llinfo for
> 86.58.122.125
> Mar 7 19:40:25 opole kernel: arpresolve: can't allocate llinfo for
> 86.58.122.125
>
> IP 86.58.122.125 is not from IP pool used by me.
>

This kernel message is merely a side effect of a bad route (with an
off-net IP address) being injected as a default route replacement.

--Qing

From owner-freebsd-net@FreeBSD.ORG Thu Mar 7 21:26:37 2013
Return-Path:
Delivered-To: net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id 8EB13620;
 Thu, 7 Mar 2013 21:26:37 +0000 (UTC) (envelope-from melifaro@FreeBSD.org)
Received: from mail.ipfw.ru (unknown [IPv6:2a01:4f8:120:6141::2])
 by mx1.freebsd.org (Postfix) with ESMTP id 32DC538F;
 Thu, 7 Mar 2013 21:26:37 +0000 (UTC)
Received: from v6.mpls.in ([2a02:978:2::5] helo=ws.su29.net)
 by mail.ipfw.ru with esmtpsa (TLSv1:CAMELLIA256-SHA:256) (Exim 4.76 (FreeBSD))
 (envelope-from ) id 1UDiO5-000Fet-By; Fri, 08 Mar 2013 01:30:05 +0400
Message-ID: <513905F2.1050409@FreeBSD.org>
Date: Fri, 08 Mar 2013 01:26:10 +0400
From: "Alexander V.
Chernikov" User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:9.0) Gecko/20120121 Thunderbird/9.0 MIME-Version: 1.0 To: Andre Oppermann Subject: Re: [patch] interface routes References: <513834E4.7050203@FreeBSD.org> <51384443.5070209@freebsd.org> <51387D4A.9030408@FreeBSD.org> <51388046.7040408@freebsd.org> <51389B4B.1060003@freebsd.org> <5138B390.2080806@FreeBSD.org> <5138FE4C.5030307@freebsd.org> In-Reply-To: <5138FE4C.5030307@freebsd.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 8bit Cc: =?ISO-8859-1?Q?Ermal_Lu=E7i?= , net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 07 Mar 2013 21:26:37 -0000 On 08.03.2013 00:53, Andre Oppermann wrote: > On 07.03.2013 16:34, Alexander V. Chernikov wrote: >> On 07.03.2013 17:51, Andre Oppermann wrote: >>> On 07.03.2013 14:38, Ermal Luçi wrote: >>>> Isn't it better to teach the routing code about metrics. >>>> Routing daemons cope better this way and they can handle this. >>>> So the policy of this behaviour can be controled by administrator >>>> rather than by code! >>>> With metrics you can add routes with bigger metric for interfaces and >>>> lower from routing daemons. >>>> This also can mitigate somehow on interfaces with the same subnet >>>> configured possibly. >>> >>> Generally I agree with you that this would be the ideal outcome. >>> However we're still quite a bit away from reaching that goal. >>> To make this really work we have make mpath plus metrics a first >>> class citizen in the routing code and also the update the routing >>> daemons kernel interfaces to know about this. I hope we get there >>> in the not too distant future. > > >> Radix is already over-bloated. 
>> Typically in performance-oriented
>> solutions (hardware/software routers from vendors) there is a clear
>> separation between the RIB (where route protocol attributes, best candidate
>> routes, and routes with different priorities exist) and the FIB, which is
>> typically some kind of radix with the minimum needed info, e.g.:
>> prefix, nexthops, their interfaces, optional L2 data to prepend.
>
> ACK. Though the bloat in itself is not the main problem other than kernel
> memory consumption. If you think of it in cache line misses, everything
> more than 128 bytes away is potentially a cache miss. The additional
> distance due to a large or small structure makes no difference. What
> makes an important difference is the internal layout of the structure
> and whether the relevant variables are within the same cache line.
> This can be a problem in a large structure when some data is at the
> beginning and other data at the end on a different cache line. Here
> potentially twice the cache miss latency per trie element hurts.

Yup. I'm talking in cache line terms only.

>
> If we can manage to put everything for a trie search into the first
> cache line we're quite good already. The additional win for tighter
> packing isn't that large anymore.
>
>> Our radix stands somewhere between RIB and FIB (since we have to support
>> route(8) and upper layer protocols): it serves badly as RIB (little
>> functionality) and as FIB: too much overhead and inefficient/too general
>> code.
>
> ACK. There is a big philosophical question on the model. Make it a
> RIB so that independent but complementary routing daemons can add
> routes concurrently and the kernel knows which have higher priority
> or are equal cost for traffic balancing (as in bgpd+ospfd). Or strip
> it to a FIB and have an external program do the RIB and coordination
> across routing daemons (as in the Quagga suite).
>
>> For example, sizeof(rt_nodes[2]) (first element of rte) is 96 bytes on
>> amd64.
>
> That is a problem if the trie traversal function accesses fields beyond
> this cache line. The main problem is that key and mask are pointers
> and thus external to the radix_node, adding even more cache misses.

Yes.

>
>> Additionally, rte refcount approach is totally broken.
>
> ACK. Copy and out. No references or external pointers into the table.
>
>> I'm currently thinking of adding some kind of hooks to current
>> route/radix code to permit building efficient trie (or other structure)
>> for given address family and to use it for forwarding purposes only.
>
> AFAIK Marco Zec and/or Luigi have done some work in this area as well.
>
>> For example, I don't need trie while doing MPLS label switching:
>> assuming control plane allocates contiguous label space, I can use label
>> array for efficient lookup.
>
> Nobody's forcing you to use a radix trie for MPLS. In theory each
> protocol can choose its own best method.

Well, actually this is not quite true, and that is the problem.
Userland has to manage kernel MPLS entries somehow, and the route socket
is bound to radix pretty heavily.
Additionally, our route(8) abuses the kvm(3) interface and simply walks
through the in-kernel radix tree to print routes and additional
information like refcounts/use count.
There is very-very-old (but still working) code there printing more or
less the same via the sysctl API, but the additional info is not propagated.
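
The label-array idea mentioned above can be sketched in a few lines. This is only an illustration, not code from the thread or from any FreeBSD source: `LabelTable` and `NextHop` are made-up names, and the sketch assumes the control plane hands out labels from one contiguous range, so a lookup is a bounds check plus an array index with no trie walk.

```python
from typing import List, NamedTuple, Optional


class NextHop(NamedTuple):
    """Hypothetical forwarding entry: outgoing label and interface name."""
    out_label: int
    ifname: str


class LabelTable:
    """Flat array indexed by (label - base) for one contiguous label range."""

    def __init__(self, base: int, size: int) -> None:
        self.base = base
        self.slots: List[Optional[NextHop]] = [None] * size

    def insert(self, label: int, nh: NextHop) -> None:
        idx = label - self.base
        if not 0 <= idx < len(self.slots):
            raise ValueError(f"label {label} outside allocated range")
        self.slots[idx] = nh

    def lookup(self, label: int) -> Optional[NextHop]:
        # O(1): no key/mask pointers to chase, just an index computation.
        idx = label - self.base
        if 0 <= idx < len(self.slots):
            return self.slots[idx]
        return None


# MPLS labels 0-15 are reserved, so the assumed range starts at 16.
table = LabelTable(base=16, size=1024)
table.insert(100, NextHop(out_label=200, ifname="em0"))
print(table.lookup(100))
print(table.lookup(101))
```

The trade-off is memory for unused slots in exchange for a lookup that touches a single array element, which is the cache-line argument made earlier in the thread.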

From owner-freebsd-net@FreeBSD.ORG Thu Mar 7 21:42:12 2013
Date: Thu, 7 Mar 2013 13:42:05 -0800
From: John-Mark Gurney
To: Andre Oppermann
Subject: Re: [patch] interface routes
Message-ID: <20130307214205.GD50035@funkthat.com>
References: <513834E4.7050203@FreeBSD.org> <51384443.5070209@freebsd.org>
In-Reply-To: <51384443.5070209@freebsd.org>
Cc: "Alexander V.
Chernikov", net@freebsd.org

Andre Oppermann wrote this message on Thu, Mar 07, 2013 at 08:39 +0100:
> >Adding interface address is handled via atomically deleting old prefix and
> >adding interface one.
>
> This brings up a long standing sore point of our routing code
> which this patch makes more pronounced. When an interface link
> state is down I don't want the route to it to persist but to
> become inactive so another path can be chosen. This is the very
> point of running a routing daemon. So on the link-down event
> the installed interface routes should be removed from the routing
> table. The configured addresses though should persist and the
> interface routes re-installed on a link-up event. What's your
> opinion on it?
>
> Other than these points I think your code is fine and can go
> into the tree.

The issue that I see with this is that if you bump your cable, all your
connections will be dropped, because as soon as they try to send
something, they'll get a no route to host, and this will break the TCP
connection...

If we keep the routes when the link goes down, the packet will be queued
or dropped (depending upon ethernet driver), but the TCP connection will
not break...

--
John-Mark Gurney Voice: +1 415 225 5579

"All that I will do, has been done, All that I have, has not."

From owner-freebsd-net@FreeBSD.ORG Fri Mar 8 00:32:53 2013
Message-ID: <3a292f3eabb7a27bd9f942f98d6b0e20.authenticated@ultimatedns.net>
Date: Thu, 7 Mar 2013 16:32:45 -0800 (PST)
Subject: Re: Implementing IP6 in 8.3
From: "freebsd-net"
To: freebsd-net@freebsd.org

> Greetings,
> I'm evaluating an ISP for the sake of building BSD operating systems on hardware
> that they use (DSL modems, in this case). When I had my old NEC server, I had a
> MIPS environment to develop in. I managed a 28k kernel. In any case, I'm back at
> it for use in a lot of hardware I have lying around.
> In my current situation, I'm
> using a ZYXEL Q1000Z modem to connect to their service. While it's a relatively
> new modem, it doesn't support IP6. It is my hope to replace the OS with one that
> does. :)
> I leased a /48 of IP4's from them, which /also/ came with as many IP6's.

EDIT
The above line /should/ have read:
I leased a /29 of IP4's from them, which /also/ came with as many IP6's.
___________^^^
/EDIT

Sorry.

> So, not having implemented IP6 on any of my boxes (except by way of tunnel brokers),
> I'm wondering 2 things:
> If my underlying OS (FreeBSD-8.3) can support IP6, will it still function, even though
> my gateway (modem) doesn't?
> Am I /correctly/ attempting to use it?
> I'm answering authoritatively for the many domains I own. They have all functioned
> well for many years via IP4. I have added the requisite AAAA records in all the zones,
> as well as the associated RR's.
> While the gateway (modem) /does/ have an IP6 address, I can't "speak" for it out of
> DNS, because it would be an "out of zone" record. Even though I'm the RP for the /48.
> So it's up to the modem to answer accordingly.
> BUT, I'm not sure I'm initiating any of this correctly via rc(8). Or more specifically,
> via rc.conf(5). While I've read as much as I can find on the topic related to BSD,
> boot messages indicate at least -- "IP6 gateway unreachable".
> I'm currently using:
> rc.conf(5):
> ipv6_ifconfig_re0="2602:00d1:b4d6:e100:0000:0000:0000:0000"
> ipv6_defaultrouter="2602:00d1:b4d6:e600:0000:0000:0000:0000"
> I also have the corresponding host IP in hosts(5).
>
> Any help, pointers, guidance, answers /greatly/ appreciated.
>
> Thank you for all your time, and consideration.
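
One thing that can be checked offline is whether the configured default router sits in the same prefix as the interface address. The following is a minimal sketch using Python's standard `ipaddress` module. The two addresses are copied from the rc.conf lines quoted above; the prefix lengths tried are assumptions, since those lines do not give one, and `on_link` is a made-up helper name.

```python
import ipaddress

# Values copied from the quoted rc.conf lines.
host = ipaddress.IPv6Address("2602:00d1:b4d6:e100:0000:0000:0000:0000")
gateway = ipaddress.IPv6Address("2602:00d1:b4d6:e600:0000:0000:0000:0000")


def on_link(addr: ipaddress.IPv6Address,
            gw: ipaddress.IPv6Address,
            prefixlen: int) -> bool:
    """True if gw lies inside the prefix of the given length containing addr."""
    net = ipaddress.IPv6Network(f"{addr}/{prefixlen}", strict=False)
    return gw in net


# Assumed candidate prefix lengths; the real one depends on the ISP's setup.
for plen in (64, 56, 48):
    print(f"/{plen}: gateway on-link = {on_link(host, gateway, plen)}")

# An all-zero interface identifier is the prefix itself, not a host part.
print("interface ID is zero:", int(host) & ((1 << 64) - 1) == 0)
```

If the interface is in fact on a /64, the router as configured would not be on-link, which is one possible (unconfirmed) explanation for the unreachable-gateway message. The interface address also has an all-zero interface identifier, which may be worth a second look.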
>
> --Chris
>

From owner-freebsd-net@FreeBSD.ORG Fri Mar 8 01:16:00 2013
Date: Thu, 7 Mar 2013 17:15:53 -0800
Subject: Re: Default route changes unexpectedly #2 (was Re: kernel: arpresolve: can't allocate llinfo for 65.59.233.102)
From: Nick Rogers
To: "Li, Qing"
References: <51378A9D.6080306@freebsd.org> <5138C9C8.6030809@freebsd.org> <5138EA26.4000403@airnet.opole.pl>
Cc: Krzysztof Barcikowski
, "freebsd-net@freebsd.org", Andre Oppermann

On Thu, Mar 7, 2013 at 12:51 PM, Li, Qing wrote:
> Hi,
>
>>
>> I can confirm I get these messages as well:
>>
>> Mar 7 19:40:25 opole kernel: arpresolve: can't allocate llinfo for 86.58.122.125
>> Mar 7 19:40:25 opole kernel: arpresolve: can't allocate llinfo for 86.58.122.125
>>
>> IP 86.58.122.125 is not from IP pool used by me.
>>
>
> This kernel message is merely a side effect of a bad route (with an
> off-net IP address) being injected as a default route replacement.

I would normally agree; however, in the last case the arpresolve
messages started happening about two hours before the default route
was changed to the IP in the arpresolve message. At least, that's when
my script that runs every second detected a change in the route. The
script parses netstat -rn output to determine if the default route is
correct or not. There was not a 2 hour downtime.

Here's the logging of my script followed by the corresponding /var/log/messages.

2013/03/05 21:13:02 rxgd[10816] DEBUG> Rxg::Route::parseRoutes - /usr/bin/netstat -rnlW -f inet
2013/03/05 21:13:02 rxgd[10816] DEBUG> Rxg::Route::parseRoutes - done parsing routes
2013/03/05 21:13:02 rxgd[10816] INFO> Rxg::Route::checkDefaultRoute - deleting incorrect default route em0/50.142.201.101

Mar 5 19:12:48 westmar kernel: arpresolve: can't allocate llinfo for 50.142.201.101
Mar 5 19:12:48 westmar last message repeated 107 times
Mar 5 21:12:48 westmar named[10906]: internal_send: 66.187.177.13#53: Invalid argument
Mar 5 19:12:48 westmar kernel: arpresolve: can't allocate llinfo for 50.142.201.101
Mar 5 19:12:48 westmar last message repeated 24 times

I don't understand the timestamps however.
It is peculiar to have an arpresolve message at 19:12 followed by bind
logging from 21:12, then another arpresolve at 19:12. Maybe it is just
because of losing the default route.

>
> --Qing
>

From owner-freebsd-net@FreeBSD.ORG Fri Mar 8 07:10:43 2013
Message-ID: <20793.36593.774795.720959@hergotha.csail.mit.edu>
Date: Fri, 8 Mar 2013 02:10:41 -0500
From: Garrett Wollman
To: freebsd-net@freebsd.org
Subject: Limits on jumbo mbuf cluster allocation
Cc: jfv@freebsd.org

I have a machine (actually six of them) with an Intel dual-10G NIC on
the motherboard. Two of them (so far) are connected to a network
using jumbo frames, with an MTU a little under 9k, so the ixgbe driver
allocates 32,000 9k clusters for its receive rings. I have noticed,
on the machine that is an active NFS server, that it can get into a
state where allocating more 9k clusters fails (as reflected in the
mbuf failure counters) at a utilization far lower than the configured
limits -- in fact, quite close to the number allocated by the driver
for its rx ring. Eventually, network traffic grinds completely to a
halt, and if one of the interfaces is administratively downed, it
cannot be brought back up again. There's generally plenty of physical
memory free (at least two or three GB).

There are no console messages generated to indicate what is going on,
and overall UMA usage doesn't look extreme. I'm guessing that this is
a result of kernel memory fragmentation, although I'm a little bit
unclear as to how this actually comes about. I am assuming that this
hardware has only limited scatter-gather capability and can't receive
a single packet into multiple buffers of a smaller size, which would
reduce the requirement for two-and-a-quarter consecutive pages of KVA
for each packet. In actual usage, most of our clients aren't on a
jumbo network, so most of the time, all the packets will fit into a
normal 2k cluster, and we've never observed this issue when the
*server* is on a non-jumbo network.

Does anyone have suggestions for dealing with this issue? Will
increasing the amount of KVA (to, say, twice physical memory) help
things? It seems to me like a bug that these large packets don't have
their own submap to ensure that allocation is always possible when
sufficient physical pages are available.

-GAWollman

From owner-freebsd-net@FreeBSD.ORG Fri Mar 8 07:54:23 2013
Message-ID: <51399926.6020201@freebsd.org>
Date: Fri, 08 Mar 2013 08:54:14 +0100
From: Andre Oppermann
To: Garrett Wollman
Subject: Re: Limits on jumbo mbuf cluster allocation
References: <20793.36593.774795.720959@hergotha.csail.mit.edu>
In-Reply-To: <20793.36593.774795.720959@hergotha.csail.mit.edu>
Cc: jfv@freebsd.org, freebsd-net@freebsd.org

On 08.03.2013 08:10, Garrett Wollman wrote:
> I have a machine (actually six of them) with an Intel dual-10G NIC on
> the motherboard. Two of them (so far) are connected to a network
> using jumbo frames, with an MTU a little under 9k, so the ixgbe driver
> allocates 32,000 9k clusters for its receive rings.
> I have noticed,
> on the machine that is an active NFS server, that it can get into a
> state where allocating more 9k clusters fails (as reflected in the
> mbuf failure counters) at a utilization far lower than the configured
> limits -- in fact, quite close to the number allocated by the driver
> for its rx ring. Eventually, network traffic grinds completely to a
> halt, and if one of the interfaces is administratively downed, it
> cannot be brought back up again. There's generally plenty of physical
> memory free (at least two or three GB).

You have an amd64 kernel running HEAD or 9.x?

> There are no console messages generated to indicate what is going on,
> and overall UMA usage doesn't look extreme. I'm guessing that this is
> a result of kernel memory fragmentation, although I'm a little bit
> unclear as to how this actually comes about. I am assuming that this
> hardware has only limited scatter-gather capability and can't receive
> a single packet into multiple buffers of a smaller size, which would
> reduce the requirement for two-and-a-quarter consecutive pages of KVA
> for each packet. In actual usage, most of our clients aren't on a
> jumbo network, so most of the time, all the packets will fit into a
> normal 2k cluster, and we've never observed this issue when the
> *server* is on a non-jumbo network.
>
> Does anyone have suggestions for dealing with this issue? Will
> increasing the amount of KVA (to, say, twice physical memory) help
> things? It seems to me like a bug that these large packets don't have
> their own submap to ensure that allocation is always possible when
> sufficient physical pages are available.

Jumbo pages come directly from the kernel_map which on amd64 is 512GB.
So KVA shouldn't be a problem. Your problem indeed appears to come from
physical memory fragmentation in pmap.
There is a buddy memory allocator at work
but I fear it runs into serious trouble when it has to allocate a large
number of objects spanning more than 2 contiguous pages. Also since you're
doing NFS serving almost all memory will be in use for file caching.

Running a NIC with jumbo frames enabled gives some interesting trade-offs.
Unfortunately most NICs can't have multiple DMA buffer sizes on the
same receive queue and pick the best size for the incoming frame. That
means they need to use the largest jumbo mbuf for all receive traffic, even
a tiny 40 byte ACK. The send side is not constrained in such a way and
tries to use PAGE_SIZE clusters for socket buffers whenever it can.

Many, but not all, NICs are able to split a received jumbo frame into
multiple smaller DMA segments forming an mbuf chain. The ixgbe hardware
is capable of doing this; the driver supports it but doesn't actively
make use of it.

Another issue with many drivers is their inability to deal with mbuf
allocation failure for their receive DMA ring. They try to fill it
up to the maximal ring size and balk on failure. Rings have become
very big and usually are a power of two. The driver could function
with a partially filled RX ring too, maybe with some performance
impact when it gets really low. On every rxeof it tries to refill
the ring, so when resources become available again it'd balance out.
NICs with multiple receive queues/rings make this problem even more
acute.

A theoretical fix would be to dedicate an entire super page of 1GB
or so exclusively to the jumbo frame UMA zone as backing memory. That
memory is gone for all other uses though, even if not actually used.
Allocating the superpage and determining its size would have to be
done manually by setting loader variables. I don't see a reasonable
way to do this with autotuning because it requires advance knowledge
of the usage patterns.
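
The sizes under discussion can be put into numbers. This is a small sketch, not driver code: it assumes a 4096-byte page and the 2k, page-sized, and 9k cluster sizes named in this thread, and the function names are made up for illustration.

```python
import math

PAGE_SIZE = 4096  # assumed base page size (amd64)

# Cluster sizes named in the thread; the dict keys are just labels.
CLUSTERS = {"2k": 2048, "page": PAGE_SIZE, "9k": 9 * 1024}


def pages_spanned(cluster_bytes: int) -> int:
    """Whole pages one cluster of this size must cover contiguously."""
    return math.ceil(cluster_bytes / PAGE_SIZE)


def wasted_bytes(frame_bytes: int, cluster_bytes: int) -> int:
    """Unused buffer space when a frame is received into one or more
    clusters of the given size (chained if it does not fit in one)."""
    nclusters = math.ceil(frame_bytes / cluster_bytes)
    return nclusters * cluster_bytes - frame_bytes


for label, size in CLUSTERS.items():
    print(f"{label:>4}: {size / PAGE_SIZE:.2f} pages "
          f"({pages_spanned(size)} contiguous), "
          f"waste for 40-byte ACK = {wasted_bytes(40, size)}, "
          f"for 9000-byte frame = {wasted_bytes(9000, size)}")
```

A 9k cluster works out to 2.25 pages, matching the "two-and-a-quarter consecutive pages" figure from the original report, while a page-sized cluster never needs more than one page at a time.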

IMHO the right fix is to strongly discourage use of jumbo clusters larger
than PAGE_SIZE when the hardware is capable of splitting the frame into
multiple clusters. The allocation constraint then is only available memory
and no longer contiguous pages. Also the waste factor for small frames is
much lower. The performance impact is minimal to non-existent. In addition
drivers shouldn't break down when the RX ring can't be filled to the max.

I recently got yelled at for suggesting to remove jumbo > PAGE_SIZE.
However your case proves that such jumbo frames are indeed their own can
of worms and should really only and exclusively be used for NICs that
have to do jumbo *and* are incapable of RX scatter DMA.

--
Andre

From owner-freebsd-net@FreeBSD.ORG Fri Mar 8 07:55:13 2013
From: YongHyeon PYUN
Date: Fri, 8 Mar 2013 16:54:58 +0900
To: Garrett Wollman
Subject: Re: Limits on jumbo mbuf cluster allocation
Message-ID: <20130308075458.GA1442@michelle.cdnetworks.com>
References: <20793.36593.774795.720959@hergotha.csail.mit.edu>
In-Reply-To: <20793.36593.774795.720959@hergotha.csail.mit.edu>
Cc: jfv@freebsd.org, freebsd-net@freebsd.org
Reply-To: pyunyh@gmail.com

On Fri, Mar 08, 2013 at 02:10:41AM -0500, Garrett Wollman wrote:
> I have a machine (actually six of them) with an Intel dual-10G NIC on
> the motherboard. Two of them (so far) are connected to a network
> using jumbo frames, with an MTU a little under 9k, so the ixgbe driver
> allocates 32,000 9k clusters for its receive rings. I have noticed,
> on the machine that is an active NFS server, that it can get into a
> state where allocating more 9k clusters fails (as reflected in the
> mbuf failure counters) at a utilization far lower than the configured
> limits -- in fact, quite close to the number allocated by the driver
> for its rx ring. Eventually, network traffic grinds completely to a
> halt, and if one of the interfaces is administratively downed, it
> cannot be brought back up again.
> There's generally plenty of physical
> memory free (at least two or three GB).
>
> There are no console messages generated to indicate what is going on,
> and overall UMA usage doesn't look extreme. I'm guessing that this is
> a result of kernel memory fragmentation, although I'm a little bit
> unclear as to how this actually comes about. I am assuming that this
> hardware has only limited scatter-gather capability and can't receive
> a single packet into multiple buffers of a smaller size, which would
> reduce the requirement for two-and-a-quarter consecutive pages of KVA
> for each packet. In actual usage, most of our clients aren't on a
> jumbo network, so most of the time, all the packets will fit into a
> normal 2k cluster, and we've never observed this issue when the
> *server* is on a non-jumbo network.
>

AFAIK all Intel controllers generate jumbo frames by concatenating
multiple mbufs on the RX side so there is no physically contiguous 9KB
allocation. I vaguely guess there could be mbuf leakage when jumbo
frames are enabled. I would check how the driver handles mbuf shortage or
frame errors while mbuf concatenation for a jumbo frame is in
progress.

> Does anyone have suggestions for dealing with this issue? Will
> increasing the amount of KVA (to, say, twice physical memory) help
> things? It seems to me like a bug that these large packets don't have
> their own submap to ensure that allocation is always possible when
> sufficient physical pages are available.
>
> -GAWollman

From owner-freebsd-net@FreeBSD.ORG Fri Mar 8 08:27:44 2013
In-Reply-To: <20130308075458.GA1442@michelle.cdnetworks.com>
References: <20793.36593.774795.720959@hergotha.csail.mit.edu> <20130308075458.GA1442@michelle.cdnetworks.com>
Date: Fri, 8 Mar 2013 00:27:37 -0800
Subject: Re: Limits on jumbo mbuf cluster allocation
From: Jack Vogel
To: pyunyh@gmail.com
Cc: jfv@freebsd.org, freebsd-net@freebsd.org, Garrett Wollman

On Thu, Mar 7, 2013 at 11:54 PM, YongHyeon PYUN wrote:
> On Fri, Mar 08, 2013 at 02:10:41AM -0500, Garrett Wollman wrote:
> > I have a machine (actually six of them) with an Intel dual-10G NIC on
> > the motherboard. Two of them (so far) are connected to a network
> > using jumbo frames, with an MTU a little under 9k, so the ixgbe driver
> > allocates 32,000 9k clusters for its receive rings. I have noticed,
> > on the machine that is an active NFS server, that it can get into a
> > state where allocating more 9k clusters fails (as reflected in the
> > mbuf failure counters) at a utilization far lower than the configured
> > limits -- in fact, quite close to the number allocated by the driver
> > for its rx ring. Eventually, network traffic grinds completely to a
> > halt, and if one of the interfaces is administratively downed, it
> > cannot be brought back up again. There's generally plenty of physical
> > memory free (at least two or three GB).
> >
> > There are no console messages generated to indicate what is going on,
> > and overall UMA usage doesn't look extreme. I'm guessing that this is
> > a result of kernel memory fragmentation, although I'm a little bit
> > unclear as to how this actually comes about. I am assuming that this
> > hardware has only limited scatter-gather capability and can't receive
> > a single packet into multiple buffers of a smaller size, which would
> > reduce the requirement for two-and-a-quarter consecutive pages of KVA
> > for each packet. In actual usage, most of our clients aren't on a
> > jumbo network, so most of the time, all the packets will fit into a
> > normal 2k cluster, and we've never observed this issue when the
> > *server* is on a non-jumbo network.
> > > > AFAIK all Intel controllers generate jumbo frame by concatenating > multiple mbufs on RX side so there is no physically contiguous 9KB > allocation. I vaguely guess there could be mbuf leakage when jumbo > frame is enabled. I would check how driver handles mbuf shortage or > frame errors while mbuf concatenation for jumbo frame is in > progress. > No, this is not true, if using a 9K jumbo it will actually use the larger mbuf pool, the code has been this way for a little while now. Jack > > > Does anyone have suggestions for dealing with this issue? Will > > increasing the amount of KVA (to, say, twice physical memory) help > > things? It seems to me like a bug that these large packets don't have > > their own submap to ensure that allocation is always possible when > > sufficient physical pages are available. > > > > -GAWollman > From owner-freebsd-net@FreeBSD.ORG Fri Mar 8 08:31:19 2013 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 77458ABE; Fri, 8 Mar 2013 08:31:19 +0000 (UTC) (envelope-from jfvogel@gmail.com) Received: from mail-vb0-x22d.google.com (mail-vb0-x22d.google.com [IPv6:2607:f8b0:400c:c02::22d]) by mx1.freebsd.org (Postfix) with ESMTP id E3F15FB; Fri, 8 Mar 2013 08:31:18 +0000 (UTC) Received: by mail-vb0-f45.google.com with SMTP id p1so541209vbi.18 for ; Fri, 08 Mar 2013 00:31:18 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:x-received:in-reply-to:references:date:message-id :subject:from:to:cc:content-type; bh=qPvTnpHswbOK3Z0nN+72XEjySWIXeQkiYSdWHWSEGo8=; b=sS/WOwlYJtiR62rUHNnTTy//dzLQwUQP1iWD+fsC0k8y+l6D2I6nn1GGNKUnS8sMaG 2qQ+jD2vpgdzPS3siDedT34fGf+0m5HdgLCqQVIOfkqpVuiECDz7qQIIAnpXr+CLrPOX ghJayHkWLJsYHZImGsahYnj5yTrlyc4dFRA9A6vV1AtwD14ny3zeZhN+KmvsWR7bPkEi meJSLpzW7hB7jCMDhuTCOBrOAdBwOWQL7g/sJFrT16Li4Xwoi0HivqfHhdkRErh/umLt 
S2ollcl7IQUNV9U3j+k28w+UjPHwO+VU6jphh0RNtxexmwOqw7K6kRRgQxa+uXKuf8HQ /gOQ== MIME-Version: 1.0 X-Received: by 10.52.19.239 with SMTP id i15mr520505vde.47.1362731478407; Fri, 08 Mar 2013 00:31:18 -0800 (PST) Received: by 10.220.191.132 with HTTP; Fri, 8 Mar 2013 00:31:18 -0800 (PST) In-Reply-To: <51399926.6020201@freebsd.org> References: <20793.36593.774795.720959@hergotha.csail.mit.edu> <51399926.6020201@freebsd.org> Date: Fri, 8 Mar 2013 00:31:18 -0800 Message-ID: Subject: Re: Limits on jumbo mbuf cluster allocation From: Jack Vogel To: Andre Oppermann Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.14 Cc: jfv@freebsd.org, freebsd-net@freebsd.org, Garrett Wollman X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 08 Mar 2013 08:31:19 -0000 On Thu, Mar 7, 2013 at 11:54 PM, Andre Oppermann wrote: > On 08.03.2013 08:10, Garrett Wollman wrote: > >> I have a machine (actually six of them) with an Intel dual-10G NIC on >> the motherboard. Two of them (so far) are connected to a network >> using jumbo frames, with an MTU a little under 9k, so the ixgbe driver >> allocates 32,000 9k clusters for its receive rings. I have noticed, >> on the machine that is an active NFS server, that it can get into a >> state where allocating more 9k clusters fails (as reflected in the >> mbuf failure counters) at a utilization far lower than the configured >> limits -- in fact, quite close to the number allocated by the driver >> for its rx ring. Eventually, network traffic grinds completely to a >> halt, and if one of the interfaces is administratively downed, it >> cannot be brought back up again. There's generally plenty of physical >> memory free (at least two or three GB). >> > > You have an amd64 kernel running HEAD or 9.x? 
> > > There are no console messages generated to indicate what is going on, >> and overall UMA usage doesn't look extreme. I'm guessing that this is >> a result of kernel memory fragmentation, although I'm a little bit >> unclear as to how this actually comes about. I am assuming that this >> hardware has only limited scatter-gather capability and can't receive >> a single packet into multiple buffers of a smaller size, which would >> reduce the requirement for two-and-a-quarter consecutive pages of KVA >> for each packet. In actual usage, most of our clients aren't on a >> jumbo network, so most of the time, all the packets will fit into a >> normal 2k cluster, and we've never observed this issue when the >> *server* is on a non-jumbo network. >> >> Does anyone have suggestions for dealing with this issue? Will >> increasing the amount of KVA (to, say, twice physical memory) help >> things? It seems to me like a bug that these large packets don't have >> their own submap to ensure that allocation is always possible when >> sufficient physical pages are available. >> > > Jumbo pages come directly from the kernel_map which on amd64 is 512GB. > So KVA shouldn't be a problem. Your problem indeed appears to come > from physical memory fragmentation in pmap. There is a buddy memory > allocator at work but I fear it runs into serious trouble when it has > to allocate a large number of objects spanning more than 2 contiguous > pages. Also since you're doing NFS serving almost all memory will be > in use for file caching. > > Running a NIC with jumbo frames enabled gives some interesting > trade-offs. Unfortunately most NICs can't have multiple DMA buffer sizes > on the same receive queue and pick the best size for the incoming frame. > That means they need to use the largest jumbo mbuf for all receive traffic, > even a tiny 40 byte ACK. The send side is not constrained in such a way > and tries to use PAGE_SIZE clusters for socket buffers whenever it can. 
> > Many, but not all, NICs are able to split a received jumbo frame into > multiple smaller DMA segments forming an mbuf chain. The ixgbe hardware > is capable of doing this, and the driver supports it but doesn't > actively make use of it. > > Another issue with many drivers is their inability to deal with mbuf > allocation failure for their receive DMA ring. They try to fill it > up to the maximal ring size and balk on failure. Rings have become > very big and usually are a power of two. The driver could function > with a partially filled RX ring too, maybe with some performance > impact when it gets really low. On every rxeof it tries to refill > the ring, so when resources become available again it'd balance out. > NICs with multiple receive queues/rings make this problem even more > acute. > > A theoretical fix would be to dedicate an entire super page of 1GB > or so exclusively to the jumbo frame UMA zone as backing memory. That > memory is gone for all other uses though, even if not actually used. > Allocating the superpage and determining its size would have to be > done manually by setting loader variables. I don't see a reasonable > way to do this with autotuning because it requires advance knowledge > of the usage patterns. > > IMHO the right fix is to strongly discourage use of jumbo clusters > larger than PAGE_SIZE when the hardware is capable of splitting the > frame into multiple clusters. The allocation constraint then is only > available memory and no longer contiguous pages. Also the waste > factor for small frames is much lower. The performance impact is > minimal to non-existent. In addition drivers shouldn't break down > when the RX ring can't be filled to the max. > > I recently got yelled at for suggesting to remove jumbo > PAGE_SIZE. > However your case proves that such jumbo frames are indeed their own > can of worms and should really only and exclusively be used for NICs > that have to do jumbo *and* are incapable of RX scatter DMA. 
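[Editorial sketch, not part of the original message.] The numbers in the quoted discussion can be checked with a little arithmetic. The cluster sizes below (2048 and 9 * 1024 bytes) match FreeBSD's MCLBYTES and MJUM9BYTES; the 4096-byte page size is an assumption for amd64, and the helper names are hypothetical. Rounding a 9k cluster up to whole pages is also an assumption about how the allocator behaves, not something stated in the thread.

```c
#include <stddef.h>

/* Assumed page size (amd64) and the 2k / 9k cluster sizes discussed above. */
#define PAGE_SIZE_ASSUMED 4096u
#define CLUSTER_2K        2048u
#define CLUSTER_9K        (9u * 1024u)

/* Whole pages needed to back one buffer of `size` bytes (rounded up). */
static unsigned pages_needed(unsigned size)
{
    return (size + PAGE_SIZE_ASSUMED - 1) / PAGE_SIZE_ASSUMED;
}

/* Buffer bytes left unused when a `frame`-byte packet lands in a cluster. */
static unsigned wasted_bytes(unsigned cluster, unsigned frame)
{
    return cluster > frame ? cluster - frame : 0;
}

/* Total bytes tied up by `n` receive buffers of `cluster` bytes each. */
static unsigned long long ring_bytes(unsigned long long n, unsigned cluster)
{
    return n * cluster;
}
```

Under these assumptions a 9k cluster is 2.25 pages (so 3 contiguous pages once rounded up), a 40-byte ACK leaves 9176 bytes of a 9k cluster unused, and 32,000 such clusters tie up roughly 281 MiB for the receive rings alone.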
> > I am not strongly opposed to trying the 4k mbuf pool for all larger sizes, Garrett maybe if you would try that on your system and see if that helps you, I could envision making this a tunable at some point perhaps? Thanks for the input Andre. Jack From owner-freebsd-net@FreeBSD.ORG Fri Mar 8 08:39:47 2013 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 727C9D61; Fri, 8 Mar 2013 08:39:47 +0000 (UTC) (envelope-from pyunyh@gmail.com) Received: from mail-pa0-f54.google.com (mail-pa0-f54.google.com [209.85.220.54]) by mx1.freebsd.org (Postfix) with ESMTP id 437FB13D; Fri, 8 Mar 2013 08:39:47 +0000 (UTC) Received: by mail-pa0-f54.google.com with SMTP id fa10so1149766pad.27 for ; Fri, 08 Mar 2013 00:39:41 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=x-received:from:date:to:cc:subject:message-id:reply-to:references :mime-version:content-type:content-disposition:in-reply-to :user-agent; bh=hcaP4Rxws/QUH/cTA6IdYsVtS4rQ7XmdHcWQlat+qzw=; b=FzHrlkExhdrQDeLCHCKg/dkyQJEJos++rKokAApLBYrcGE/qcBUvmcdPbIVKqq9X6F jpqO+K+dgSHroOwe4jYHRNibtLAr+3alP56zpI3n7emFODykzut6HbKY2aSl0RoSpLhV 5aMC+ORTU5e7Ejrdos9R/xq4uE6yZTQoAfZ/wyXnwRvyT2tYEVfNE4rPbkrk+SfnDhQ2 5WAn5fsFSsKVUBx2N88fumm6gRTCXaCKrF2gNCyaTCuZmj5TnGJFitmrSN3V6eHNEbW5 qS7c8Tpl5BwnLXMwAdcQoI/wYcAOSylLnqpcVJJzImsNiCJSUTjUmeXLrHR5iw0zPtbB aufw== X-Received: by 10.66.9.69 with SMTP id x5mr2713559paa.204.1362731981669; Fri, 08 Mar 2013 00:39:41 -0800 (PST) Received: from pyunyh@gmail.com (lpe4.p59-icn.cdngp.net. 
[114.111.62.249]) by mx.google.com with ESMTPS id ip8sm4822866pbc.39.2013.03.08.00.39.37 (version=TLSv1 cipher=RC4-SHA bits=128/128); Fri, 08 Mar 2013 00:39:40 -0800 (PST) Received: by pyunyh@gmail.com (sSMTP sendmail emulation); Fri, 08 Mar 2013 17:39:32 +0900 From: YongHyeon PYUN Date: Fri, 8 Mar 2013 17:39:32 +0900 To: Jack Vogel Subject: Re: Limits on jumbo mbuf cluster allocation Message-ID: <20130308083932.GB1442@michelle.cdnetworks.com> References: <20793.36593.774795.720959@hergotha.csail.mit.edu> <20130308075458.GA1442@michelle.cdnetworks.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.4.2.3i Cc: jfv@freebsd.org, freebsd-net@freebsd.org, Garrett Wollman X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list Reply-To: pyunyh@gmail.com List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 08 Mar 2013 08:39:47 -0000 On Fri, Mar 08, 2013 at 12:27:37AM -0800, Jack Vogel wrote: > On Thu, Mar 7, 2013 at 11:54 PM, YongHyeon PYUN wrote: > > > On Fri, Mar 08, 2013 at 02:10:41AM -0500, Garrett Wollman wrote: > > > I have a machine (actually six of them) with an Intel dual-10G NIC on > > > the motherboard. Two of them (so far) are connected to a network > > > using jumbo frames, with an MTU a little under 9k, so the ixgbe driver > > > allocates 32,000 9k clusters for its receive rings. I have noticed, > > > on the machine that is an active NFS server, that it can get into a > > > state where allocating more 9k clusters fails (as reflected in the > > > mbuf failure counters) at a utilization far lower than the configured > > > limits -- in fact, quite close to the number allocated by the driver > > > for its rx ring. Eventually, network traffic grinds completely to a > > > halt, and if one of the interfaces is administratively downed, it > > > cannot be brought back up again. 
There's generally plenty of physical > > > memory free (at least two or three GB). > > > > > > There are no console messages generated to indicate what is going on, > > > and overall UMA usage doesn't look extreme. I'm guessing that this is > > > a result of kernel memory fragmentation, although I'm a little bit > > > unclear as to how this actually comes about. I am assuming that this > > > hardware has only limited scatter-gather capability and can't receive > > > a single packet into multiple buffers of a smaller size, which would > > > reduce the requirement for two-and-a-quarter consecutive pages of KVA > > > for each packet. In actual usage, most of our clients aren't on a > > > jumbo network, so most of the time, all the packets will fit into a > > > normal 2k cluster, and we've never observed this issue when the > > > *server* is on a non-jumbo network. > > > > > > > AFAIK all Intel controllers generate jumbo frame by concatenating > > multiple mbufs on RX side so there is no physically contiguous 9KB > > allocation. I vaguely guess there could be mbuf leakage when jumbo > > frame is enabled. I would check how driver handles mbuf shortage or > > frame errors while mbuf concatenation for jumbo frame is in > > progress. > > > > No, this is not true, if using a 9K jumbo it will actually use the larger > mbuf pool, the code has been this way for a little while now. Ah, thanks for correcting me. If the H/W is still able to support old-style chaining like em(4), wouldn't it be better to use that rather than allocating a 9KB buffer? Allocating a 9KB buffer to handle a pure TCP ACK segment looks inefficient. > > Jack > > > > > > > Does anyone have suggestions for dealing with this issue? Will > > > increasing the amount of KVA (to, say, twice physical memory) help > > > things? It seems to me like a bug that these large packets don't have > > > their own submap to ensure that allocation is always possible when > > > sufficient physical pages are available. 
> > > > > > -GAWollman > > From owner-freebsd-net@FreeBSD.ORG Fri Mar 8 08:39:54 2013 Return-Path: Delivered-To: net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id C902BDDF; Fri, 8 Mar 2013 08:39:54 +0000 (UTC) (envelope-from ermal.luci@gmail.com) Received: from mail-qc0-x22b.google.com (mail-qc0-x22b.google.com [IPv6:2607:f8b0:400d:c01::22b]) by mx1.freebsd.org (Postfix) with ESMTP id 6ABFD13E; Fri, 8 Mar 2013 08:39:54 +0000 (UTC) Received: by mail-qc0-f171.google.com with SMTP id d1so475877qca.16 for ; Fri, 08 Mar 2013 00:39:53 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:x-received:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type; bh=HWXKvWPnsDTPeX/WxELjL+Qn397KGxjDDvzqQ8wKgZs=; b=0LMkfwfWUSj/zxxqFlidIPaa4xajS2LOhDx01hf8VTYYGVv5/dgk8VLgk24AEqhQ2b r1BEzdjMyY0TQD13xaVirUqNbxZHEvGBvWdBBqEMbh6RanTVEkaFNomNRSvb8Pyh5+cY KoyfS1UVXoInWvcWhQ/9GzflVzvdciv1Cw2CToThsSPTixuBsMVJRD0K+LlPSwdviSVz mEPDpKNqzaAnP9hFL9GLuAq7h19TLsrOMYih2CJl54uoR+nGmaIzOIMDiS58eVp0p4tO GujewDFRBKODKp+YqYn+IhqU7GSjnvTCgQJZqGcn3IGhDuIt0M9sdrrvK4WtprF7/Jmj oEMg== MIME-Version: 1.0 X-Received: by 10.229.69.24 with SMTP id x24mr450213qci.16.1362731993893; Fri, 08 Mar 2013 00:39:53 -0800 (PST) Sender: ermal.luci@gmail.com Received: by 10.49.27.197 with HTTP; Fri, 8 Mar 2013 00:39:53 -0800 (PST) In-Reply-To: <51389B4B.1060003@freebsd.org> References: <513834E4.7050203@FreeBSD.org> <51384443.5070209@freebsd.org> <51387D4A.9030408@FreeBSD.org> <51388046.7040408@freebsd.org> <51389B4B.1060003@freebsd.org> Date: Fri, 8 Mar 2013 09:39:53 +0100 X-Google-Sender-Auth: 9vm07c8oiJGPEjF1irXoPF-BOpE Message-ID: Subject: Re: [patch] interface routes From: =?ISO-8859-1?Q?Ermal_Lu=E7i?= To: Andre Oppermann Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable X-Content-Filtered-By: 
Mailman/MimeDel 2.1.14 Cc: "Alexander V. Chernikov" , net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 08 Mar 2013 08:39:54 -0000 On Thu, Mar 7, 2013 at 2:51 PM, Andre Oppermann wrote: > On 07.03.2013 14:38, Ermal Lu=E7i wrote: > >> On Thu, Mar 7, 2013 at 12:55 PM, Andre Oppermann > andre@freebsd.org>> wrote: >> >> On 07.03.2013 12:43, Alexander V. Chernikov wrote: >> >> On 07.03.2013 11:39, Andre Oppermann wrote: >> >> On 07.03.2013 07:34, Alexander V. Chernikov wrote: >> >> Hello list! >> >> There is a known long-lived issue with interface routes >> addition/deletion: >> >> ifconfig iface inet 1.2.3.4/24 can >> fail if given prefix is >> >> already in >> kernel route table (for >> example, advertised by IGP like OSPF). >> >> Interface route can be deleted via route(8) or any route >> socket user >> (sometimes this happens with >> popular opensource daemons like bird/quagga). >> >> Problem is reported at least in kern/106722 and >> kern/155772. >> >> >> You patch is a welcome addition. >> >> This can be fixed the following way: >> Immutable route flag (RTM_PINNED, added in 19995 with >> 'for future use' >> comment) is utilised to mark >> route 'immutable'. >> rtrequest1_fib refuses to delete routes with given flag >> unless >> RTM_PINNED is set in rti_flags. >> >> >> How do the routing daemons react to being unable to >> change/delete >> such a route? >> >> routing daemons live long with the fact that there route socket >> cmds can >> fail (and the is route(8) utility which can do anything), so >> typically >> bird/quagga yells like >> 'bird: KRT: Error sending route 11.0.0.0/24 >> to kernel: File exists' >> >> and marks given route as not installed in internal RIB. 
>> Additionally, >> daemon will probably re-try to insert such routes on every >> periodic KRT >> rescan (tens of minutes). >> >> >> >> Isn't it better to teach the routing code about metrics? >> Routing daemons cope better this way and they can handle this. >> So the policy of this behaviour can be controlled by administrator rather >> than by code! >> With metrics you can add routes with bigger metric for interfaces and >> lower from routing daemons. >> This also can mitigate somehow on interfaces with the same subnet >> configured possibly. >> > > Generally I agree with you that this would be the ideal outcome. > However we're still quite a bit away from reaching that goal. > To make this really work we have to make mpath plus metrics a first > class citizen in the routing code and also update the routing > daemons' kernel interfaces to know about this. I hope we get there > in the not too distant future. > > As a first step I think it is important that Alexander's patch goes > in to fix a long standing and very annoying problem with the code > we have. Also the link down route withdraw should be added asap. > Then we can take the next steps towards the ultimate goal you describe. > > I hope you do not object to Alexander's patch? No objection, just trying to put the focus where it needs to be. Yeah, it's good to have options there; just, as always, the interface route should not be scrubbed on an interface event, since sockets bound to that interface will behave strangely. 
> > > -- > Andre > > --=20 Ermal From owner-freebsd-net@FreeBSD.ORG Fri Mar 8 09:00:23 2013 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id D869C1FF for ; Fri, 8 Mar 2013 09:00:23 +0000 (UTC) (envelope-from vpenkoff@gmail.com) Received: from mail-la0-x22f.google.com (mail-la0-x22f.google.com [IPv6:2a00:1450:4010:c03::22f]) by mx1.freebsd.org (Postfix) with ESMTP id 44D1A201 for ; Fri, 8 Mar 2013 09:00:23 +0000 (UTC) Received: by mail-la0-f47.google.com with SMTP id fj20so1449973lab.34 for ; Fri, 08 Mar 2013 01:00:22 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:x-received:date:message-id:subject:from:to :content-type; bh=+bXbQQ2fs383BOb+yIayLEKGoTbSeLuz7zSRzDzVAE8=; b=MTBCh2kszzfrSRh1GhKQcZ628lhaakDYWk0g8KRnRirGsDcjI4lJcqq24/p045BenL CqWdwYtR7qVk/GOGB1U9ePEFfay/ZAxluobSobXg5O1Z42rf5zRSUMzNvloEfZAA5wXk XgOBpcQYillkbJCX59xJ3wZJyw9epdtE6H3OKmhTlfREDOvIfmZaQiiYx65IBVJ4mtuF xDQadyyDcxwQKxPr4EDLmsPHxFHlUI0wjTk2CuFrNB54AyhSK5JtecPIdfc4d0faaaD3 eGEPDIW8UAFBiCP1Ax2itTsYnqrzSsccMcSmrCtTqR0bbRtzbuQ54sy7en7nJu0on5Rf Ckhw== MIME-Version: 1.0 X-Received: by 10.112.16.199 with SMTP id i7mr757383lbd.65.1362733222135; Fri, 08 Mar 2013 01:00:22 -0800 (PST) Received: by 10.112.18.43 with HTTP; Fri, 8 Mar 2013 01:00:22 -0800 (PST) Date: Fri, 8 Mar 2013 11:00:22 +0200 Message-ID: Subject: BPF data representation From: Viktor Penkoff To: freebsd-net@freebsd.org Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.14 X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 08 Mar 2013 09:00:23 -0000 Hi guys. 
I'm digging into some BPF stuff and I can't figure out why there are 3 types of data representations: words, halfwords and bytes. I mean, how can I know which one is best to use in a given place? In a basic example, e.g. for packet capturing, following the BPF manual, I use a halfword representation for the ETHERTYPE in the Ethernet header, but a word representation for an IP address. Let's say we have some read instructions:
    BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 12),
    BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, ETHERTYPE_IP, X, Y),
    ....
    BPF_STMT(BPF_LD+BPF_W+BPF_ABS, 26),
    BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 0xABABABAB, X, Y)
Can someone explain? Thanks! 
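[Editorial sketch, not part of the original message.] One way to read the three load sizes is that they simply match the width of the header field being tested: the EtherType is a 16-bit field at offset 12, while an IPv4 source address is a 32-bit field at offset 26 (14 bytes of Ethernet header plus 12 into the IP header). The helpers below are hypothetical stand-ins that mimic what a BPF_LD with BPF_B, BPF_H or BPF_W and BPF_ABS computes: they read 1, 2 or 4 bytes at a fixed offset and interpret them in network byte order. They are not the kernel's implementation.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical equivalents of BPF_LD+BPF_B/H/W+BPF_ABS at offset k. */
static uint32_t ld_b(const uint8_t *p, size_t k)
{
    return p[k];                                   /* 8-bit field  */
}

static uint32_t ld_h(const uint8_t *p, size_t k)
{
    return ((uint32_t)p[k] << 8) | p[k + 1];       /* 16-bit field */
}

static uint32_t ld_w(const uint8_t *p, size_t k)
{
    return ((uint32_t)p[k] << 24) | ((uint32_t)p[k + 1] << 16) |
           ((uint32_t)p[k + 2] << 8) | p[k + 3];   /* 32-bit field */
}

/* Returns 1 if the frame is IPv4 (EtherType 0x0800) with source `src`. */
static int match_ipv4_src(const uint8_t *pkt, size_t len, uint32_t src)
{
    if (len < 30)
        return 0;                  /* too short to hold the source address */
    if (ld_h(pkt, 12) != 0x0800)
        return 0;                  /* EtherType: halfword at offset 12 */
    return ld_w(pkt, 26) == src;   /* IPv4 source: word at offset 26 */
}
```

A byte load would be the natural choice for a one-byte field such as the IPv4 protocol number at offset 23. Loading a wider size than the field would pull neighbouring bytes into the comparison, which is why the size is picked per field rather than once per filter.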
From owner-freebsd-net@FreeBSD.ORG Fri Mar 8 11:56:21 2013 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id F0364C3B for ; Fri, 8 Mar 2013 11:56:21 +0000 (UTC) (envelope-from milu@dat.pl) Received: from jab.dat.pl (dat.pl [80.51.155.34]) by mx1.freebsd.org (Postfix) with ESMTP id A9FB3B6D for ; Fri, 8 Mar 2013 11:56:20 +0000 (UTC) Received: from jab.dat.pl (jsrv.dat.pl [127.0.0.1]) by jab.dat.pl (Postfix) with ESMTP id BDB32F8; Fri, 8 Mar 2013 12:56:18 +0100 (CET) X-Virus-Scanned: amavisd-new at dat.pl Received: from jab.dat.pl ([127.0.0.1]) by jab.dat.pl (jab.dat.pl [127.0.0.1]) (amavisd-new, port 10024) with LMTP id WZ8NBY1bV2Wy; Fri, 8 Mar 2013 12:56:12 +0100 (CET) Received: from [10.0.6.80] (unknown [212.69.68.42]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (No client certificate requested) by jab.dat.pl (Postfix) with ESMTPSA id 8A98C4C; Fri, 8 Mar 2013 12:56:12 +0100 (CET) Message-ID: <5139D20F.4050901@dat.pl> Date: Fri, 08 Mar 2013 12:57:03 +0100 From: Maciej Milewski User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130221 Thunderbird/17.0.3 MIME-Version: 1.0 To: freebsd-net Subject: Re: Implementing IP6 in 8.3 References: <5138AED9.1020801@dat.pl> In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-net X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 08 Mar 2013 11:56:22 -0000 On 07.03.2013 17:55, freebsd-net wrote: > Greetings Maciej Milewski, and thank you for your thoughtful reply. 
>> On 06.03.2013 22:02, freebsd-net wrote: >>> Greetings, >>> I'm evaluating an ISP for the sake of building BSD operating systems on hardware >>> that they use (DSL modems, in this case). When I had my old NEC server, I had a >>> MIPS environment to develop in. I managed a 28k kernel. In any case, I'm back at >>> it for use in alot of hardware I have laying around. In my current situation, I'm >>> using a ZYXEL Q1000Z modem to connect to their service. While it's a relatively >>> new modem, it doesn't support IP6. It is my hope to replace the OS with one that >>> does. :) >> If it doesn't support IPv6 you can always try to use it in Transparent >> Bridging (RFC1483) mode. >> >> You can then put other router/computer that does IPv6 routing just after >> that modem. >> > Thank you for the links. I was aware of that, but requires that every connection > directly to the modem, send the PPPoE creds to the modem. While it's simple enough > to connect a router/switch between the modem, and clients, it adds an additional > hop. I think I'll be better served building a (free)BSD kernel, and drivers for > the modem -- assuming that because the modem doesn't IP6, it's not possible to > route IP6 traffic directly, unless through a "tunnel broker". If you are sure that you can build a kernel for that modem then try it. In my experience it's rather hard, mainly because today's hardware is too cheap to have dedicated hardware interfaces (like a DSL modem), so it's all done in software. The shortest and fastest way would be to set this modem up as a transparent bridge and then put your own IPv6-capable router/gateway behind it. The router connects through PPPoE to your ISP on the WAN side, and to your switch or your computers directly on the LAN/WLAN side. It will be an additional device between you and your ISP, but in many cases that's much better than having an all-in-one (which can't do IPv6). I'd go that way. > Thanks again, for taking the time to respond. 
> > --Chris I hope that puts more light to what you try to do. -- Pozdrawiam, Maciej Milewski From owner-freebsd-net@FreeBSD.ORG Fri Mar 8 13:19:21 2013 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 4D879BD9 for ; Fri, 8 Mar 2013 13:19:21 +0000 (UTC) (envelope-from vegeta@tuxpowered.net) Received: from mail-bk0-x22a.google.com (mail-bk0-x22a.google.com [IPv6:2a00:1450:4008:c01::22a]) by mx1.freebsd.org (Postfix) with ESMTP id D6F22CD for ; Fri, 8 Mar 2013 13:19:20 +0000 (UTC) Received: by mail-bk0-f42.google.com with SMTP id jk7so688542bkc.15 for ; Fri, 08 Mar 2013 05:19:19 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20120113; h=x-received:from:to:subject:date:user-agent:mime-version:x-uid :content-type:content-transfer-encoding:message-id :x-gm-message-state; bh=VplLr68ahVUntemVm74O8Aql7acd7eYNjzo7VLVGfH4=; b=aqS8CbZGI+ay9ovXIfSMdKJ0lq3RfX2RHyf24DJpQbrSb/gBn/e955rRvK5Z69OX5D 7Q4MdKcRxxP4rwZ1p0rjjrfmLLKkqhu1G+guWaKYhFm3Z5eNIzf6q0fxP1ZsOk1z+87M jLr/zGHqp8HV7ZXEk1hFwnfjJtk1p5hABWzXYsJ+TprOTwnmkHfPtsofuyDqQ17oA2Gw Dm1V11tLUT5ItXhY/6jJq1Mk4V77bcRrL7EAhVb5JJn6Gyf4GIw0qlJ3coipindq7oMq MKgSgrycMZ+DSfXrdzWgX4ENnF2pk1jLB5dMNCxei+MNBewHt/VGfhgv7VP3pMG61GdB xICQ== X-Received: by 10.204.198.3 with SMTP id em3mr821253bkb.96.1362748759383; Fri, 08 Mar 2013 05:19:19 -0800 (PST) Received: from zvezda.localnet ([212.48.107.10]) by mx.google.com with ESMTPS id z6sm1676495bkv.11.2013.03.08.05.19.18 (version=TLSv1 cipher=RC4-SHA bits=128/128); Fri, 08 Mar 2013 05:19:18 -0800 (PST) From: Kajetan Staszkiewicz To: "freebsd-net@freebsd.org" Subject: [patch] Source entries removing is awfully slow. 
Date: Fri, 8 Mar 2013 14:19:17 +0100 User-Agent: KMail/1.13.5 (Linux/3.6.6-vegeta.1; KDE/4.4.5; x86_64; ; ) MIME-Version: 1.0 X-UID: 1998 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Message-Id: <201303081419.17743.vegeta@tuxpowered.net> X-Gm-Message-State: ALoCoQlhTnlJoQ5UWXk/k82qQf2EQF2TP65X5JmF+Bc8mNaNOzgeWbO0dwLIlC4JHkbJHQW5yG92 X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 08 Mar 2013 13:19:21 -0000 Hello there! In my environment, where I use FreeBSD machines as loadbalancers, after a server is detected as dead, the loadbalancer removes the broken server from a table used in a route-to pf rule and then removes the Source entries pointing clients to that server, so clients previously assigned to the broken server are re-loadbalanced to live servers. Each loadbalancer has around 50k Source and 500k State entries. Under those conditions, removing Sources from anywhere to a dead server with `pfctl -K 0.0.0.0/0 -K internal.IP.of.server` freezes the machine for a few seconds (or even up to a minute in another datacenter segment, where different services are served, causing thousands instead of just a few hundred States to be matched). Under a DDoS attack, when removing Sources to a server under attack, the kernel freezes permanently (I gave up after 10 minutes of waiting and restarted the machine). 
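[Editorial sketch, not part of the original message.] The slowdown described above is what one would expect if every removed Source entry triggers a walk over the whole State table, so that the cost grows as (matching sources) x (all states). The sketch below contrasts that with keeping a per-source list of states, which is what the patch name "link-states-to-src_node" suggests. This is an assumption drawn only from that name and the description; the structures and function names are hypothetical and are not pf's.

```c
#include <stddef.h>

struct src_node;

/* Hypothetical, simplified state entry. */
struct state {
    struct src_node *src;         /* source entry this state is bound to */
    struct state    *next_all;    /* link in the global list of states */
    struct state    *next_on_src; /* link in the owning source's list */
};

/* Hypothetical, simplified source entry. */
struct src_node {
    struct state *states;         /* head of the per-source state list */
};

/* Scan approach: visits every state in the table for one source entry.
 * Returns the number of states visited. */
static size_t detach_by_scan(struct state *all, struct src_node *sn)
{
    size_t visited = 0;

    for (struct state *s = all; s != NULL; s = s->next_all) {
        visited++;
        if (s->src == sn)
            s->src = NULL;
    }
    return visited;
}

/* Linked approach: visits only the states attached to this source entry.
 * Returns the number of states visited. */
static size_t detach_by_link(struct src_node *sn)
{
    size_t visited = 0;
    struct state *s = sn->states;

    while (s != NULL) {
        struct state *next = s->next_on_src;

        s->src = NULL;
        s->next_on_src = NULL;
        visited++;
        s = next;
    }
    sn->states = NULL;
    return visited;
}
```

With the figures from the message, the scan variant would touch about 500k states for each removed Source entry, while the linked variant touches only the states that actually reference it.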
A patch fixing the issue can be found here: http://vegeta.tuxpowered.net/download/link-states-to-src_node.patch -- | pozdrawiam / greetings | powered by Debian, CentOS and FreeBSD | | Kajetan Staszkiewicz | jabber,email: vegeta()tuxpowered net | | Vegeta | www: http://vegeta.tuxpowered.net | `------------------------^---------------------------------------' From owner-freebsd-net@FreeBSD.ORG Fri Mar 8 15:07:14 2013 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 66552B7F for ; Fri, 8 Mar 2013 15:07:14 +0000 (UTC) (envelope-from fbsdmail@dnswatch.com) Received: from udns.ultimateDNS.NET (ultimatedns.net [209.180.214.225]) by mx1.freebsd.org (Postfix) with ESMTP id D1997964 for ; Fri, 8 Mar 2013 15:07:13 +0000 (UTC) Received: from udns.ultimateDNS.NET (localhost [127.0.0.1]) by udns.ultimateDNS.NET (8.14.5/8.14.5) with ESMTP id r28F70Xg023502; Fri, 8 Mar 2013 07:07:06 -0800 (PST) (envelope-from fbsdmail@dnswatch.com) Received: (from www@localhost) by udns.ultimateDNS.NET (8.14.5/8.14.5/Submit) id r28F6r2Y023489; Fri, 8 Mar 2013 07:06:53 -0800 (PST) (envelope-from fbsdmail@dnswatch.com) Received: from udns.ultimatedns.net ([209.180.214.225]) (UDNSMS authenticated user chrish) by ultimatedns.net with HTTP; Fri, 8 Mar 2013 07:06:53 -0800 (PST) Message-ID: <97d1f60d519956584c4927f72c43e97f.authenticated@ultimatedns.net> In-Reply-To: <5139D20F.4050901@dat.pl> References: <5138AED9.1020801@dat.pl> <5139D20F.4050901@dat.pl> Date: Fri, 8 Mar 2013 07:06:53 -0800 (PST) Subject: Re: Implementing IP6 in 8.3 From: "freebsd-net" To: "Maciej Milewski" User-Agent: UDNSMS/2.0.3 MIME-Version: 1.0 Content-Type: text/plain;charset=utf-8 Content-Transfer-Encoding: 8bit X-Priority: 3 (Normal) Importance: Normal Cc: freebsd-net X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: 
List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 08 Mar 2013 15:07:14 -0000 Maciej Milewski, and thank you for your reply. > On 07.03.2013 17:55, freebsd-net wrote: >> Greetings Maciej Milewski, and thank you for your thoughtful reply. >>> On 06.03.2013 22:02, freebsd-net wrote: >>>> Greetings, >>>> I'm evaluating an ISP for the sake of building BSD operating systems on hardware >>>> that they use (DSL modems, in this case). When I had my old NEC server, I had a >>>> MIPS environment to develop in. I managed a 28k kernel. In any case, I'm back at >>>> it for use in alot of hardware I have laying around. In my current situation, I'm >>>> using a ZYXEL Q1000Z modem to connect to their service. While it's a relatively >>>> new modem, it doesn't support IP6. It is my hope to replace the OS with one that >>>> does. :) >>> If it doesn't support IPv6 you can always try to use it in Transparent >>> Bridging (RFC1483) mode. >>> >>> You can then put other router/computer that does IPv6 routing just after >>> that modem. >>> >> Thank you for the links. I was aware of that, but requires that every connection >> directly to the modem, send the PPPoE creds to the modem. While it's simple enough >> to connect a router/switch between the modem, and clients, it adds an additional >> hop. I think I'll be better served building a (free)BSD kernel, and drivers for >> the modem -- assuming that because the modem doesn't IP6, it's not possible to >> route IP6 traffic directly, unless through a "tunnel broker". > If you are sure that you can build kernel for that modem device then try > it. From my experience it's rather hard. Mainly because today's hw is > too cheap to have working hw interfaces(like DSL modem) and it's all > done in software way. > Shortest and fastest way would be setting this modem as transparent > bridge. Then put your own router/gateway(which is IPv6 capable). 
Router
> on WAN side connects through PPPoE to your ISP and LAN/WLAN side
> connects to your switch or you computers directly. It will be additional
> device between you and your ISP but in many cases that's much better
> than having all-in-one(which can't do IPv6). I'd go that way.
>
>> Thanks again, for taking the time to respond.
>>
>> --Chris
>
> I hope that puts more light to what you try to do.

I agree that inserting a router/switch between the modem and the clients/servers would be the shortest and easiest solution. In the end, though, I think the investment in building a (free)bsd kernel and drivers for the modem would provide the biggest reward. Truth be told, I have accumulated quite a mass of this type of equipment over the years, and I'd like to take a stab at building a (free)bsd kernel with associated drivers for them. They're all MIPS based, and many of them have ~32Mb and ~64Mb of flash space and RAM, so resources aren't too unreasonable. In the end, having something /I/ have control over makes these devices a great deal more valuable. It also empowers others who are currently subject to the limitations their ISP imposes on them.

Thank you again for taking the time to respond.
--Chris > -- > Pozdrawiam, > Maciej Milewski > > _______________________________________________ > freebsd-net@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-net > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" > From owner-freebsd-net@FreeBSD.ORG Fri Mar 8 15:27:01 2013 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 24B2EFF6 for ; Fri, 8 Mar 2013 15:27:01 +0000 (UTC) (envelope-from milu@dat.pl) Received: from jab.dat.pl (dat.pl [80.51.155.34]) by mx1.freebsd.org (Postfix) with ESMTP id D53D4ADE for ; Fri, 8 Mar 2013 15:27:00 +0000 (UTC) Received: from jab.dat.pl (jsrv.dat.pl [127.0.0.1]) by jab.dat.pl (Postfix) with ESMTP id C38D3113; Fri, 8 Mar 2013 16:26:52 +0100 (CET) X-Virus-Scanned: amavisd-new at dat.pl Received: from jab.dat.pl ([127.0.0.1]) by jab.dat.pl (jab.dat.pl [127.0.0.1]) (amavisd-new, port 10024) with LMTP id uzxTyBqV-abg; Fri, 8 Mar 2013 16:26:48 +0100 (CET) Received: from [10.0.6.80] (unknown [212.69.68.42]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (No client certificate requested) by jab.dat.pl (Postfix) with ESMTPSA id 37FF090; Fri, 8 Mar 2013 16:26:48 +0100 (CET) Message-ID: <513A036A.9040406@dat.pl> Date: Fri, 08 Mar 2013 16:27:38 +0100 From: Maciej Milewski User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130221 Thunderbird/17.0.3 MIME-Version: 1.0 To: freebsd-net Subject: Re: Implementing IP6 in 8.3 References: <5138AED9.1020801@dat.pl> <5139D20F.4050901@dat.pl> <97d1f60d519956584c4927f72c43e97f.authenticated@ultimatedns.net> In-Reply-To: <97d1f60d519956584c4927f72c43e97f.authenticated@ultimatedns.net> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-net X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , 
List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 08 Mar 2013 15:27:01 -0000

On 08.03.2013 16:06, freebsd-net wrote:
> While I agree, inserting a router/switch between the modem & the clients/servers
> would be the shortest/easiest solution. In the end, I think the investment in
> building a (free)bsd kernel && drivers for the modem would have/provide the
> biggest reward(s). Truth be told; I have accumulated quite a mass of this type
> of equipment over the years, and I'd like to take a stab at building a
> (free)bsd kernel with associated drivers for them. Their all MIPS based, and
> many of them have ~32Mb && ~64Mb flash space & RAM. So, resources aren't too
> unreasonable. In the end, the benefits of having something /I/ have control over,
> makes these devices a great more valuable. It also empowers others whom are
> currently subject to the limitations their ISP imposes on them.
>
> Thank you again for taking the time to respond.
>
> --Chris

That's all correct as long as all the pieces are there. For example, I've heard of low-level problems with some of the chipsets. The wifi chipsets are the best known for this, and I think I've heard about xDSL chipsets with similar problems.

I wish you all the best with making your own firmware for this hardware.
-- Pozdrawiam, Maciej Milewski From owner-freebsd-net@FreeBSD.ORG Fri Mar 8 15:32:10 2013 Return-Path: Delivered-To: net@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 5BA302BB; Fri, 8 Mar 2013 15:32:10 +0000 (UTC) (envelope-from melifaro@FreeBSD.org) Received: from mail.ipfw.ru (unknown [IPv6:2a01:4f8:120:6141::2]) by mx1.freebsd.org (Postfix) with ESMTP id F3D81B51; Fri, 8 Mar 2013 15:32:09 +0000 (UTC) Received: from v6.mpls.in ([2a02:978:2::5] helo=ws.su29.net) by mail.ipfw.ru with esmtpsa (TLSv1:CAMELLIA256-SHA:256) (Exim 4.76 (FreeBSD)) (envelope-from ) id 1UDzKb-000NOA-KA; Fri, 08 Mar 2013 19:35:37 +0400 Message-ID: <513A0459.7010809@FreeBSD.org> Date: Fri, 08 Mar 2013 19:31:37 +0400 From: "Alexander V. Chernikov" User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:9.0) Gecko/20120121 Thunderbird/9.0 MIME-Version: 1.0 To: jmg@funkthat.com, Andre Oppermann , net@freebsd.org Subject: Re: [patch] interface routes References: <513834E4.7050203@FreeBSD.org> <51384443.5070209@freebsd.org> <20130307214205.GD50035@funkthat.com> In-Reply-To: <20130307214205.GD50035@funkthat.com> Content-Type: multipart/mixed; boundary="------------050906070400020401060909" X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 08 Mar 2013 15:32:10 -0000 This is a multi-part message in MIME format. --------------050906070400020401060909 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit On 08.03.2013 01:42, John-Mark Gurney wrote: > Andre Oppermann wrote this message on Thu, Mar 07, 2013 at 08:39 +0100: >>> Adding interface address is handled via atomically deleting old prefix and >>> adding interface one. 
>>
>> This brings up a long standing sore point of our routing code
>> which this patch makes more pronounced. When an interface link
>> state is down I don't want the route to it to persist but to
>> become inactive so another path can be chosen. This the very
>> point of running a routing daemon. So on the link-down event
>> the installed interface routes should be removed from the routing
>> table. The configured addresses though should persist and the
>> interface routes re-installed on a link-up event. What's your
>> opinion on it?
>>
>> Other than these points I think your code is fine and can go
>> into the tree.
>
> The issue that I see with this is that if you bump your cable, all
> your connections will be dropped, because as soon as they try to send
> something, they'll get a no route to host, and this will break the
> TCP connection... If we keep the routes when the link goes down,
> the packet will be queued or dropped (depending upon ethernet driver),
> but the TCP connection will not break...

Yes. Older drivers that use if_start with the OS queue should queue the traffic, while if_transmit-based ones probably drop it. So this behavior should be configurable depending on the system's role. Patch attached. Other issues (carp, IPv6 and similar) can arise, so this definitely deserves wider discussion.
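The conditions the attached patch introduces can be summarized in a small user-space model. This is only a sketch, not kernel code: the enum and struct names below are hypothetical stand-ins for the kernel's if_link_state and IFF_UP handling, and remove_if_routes stands for the value of the proposed net.link.remove_iface_routes_on_change sysctl.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins; these are not the kernel's definitions. */
enum link_state { LINK_UNKNOWN, LINK_DOWN, LINK_UP };
enum route_action { ROUTES_KEEP, ROUTES_REMOVE, ROUTES_ANNOUNCE };

struct iface {
	bool admin_up;         /* models IFF_UP in if_flags */
	enum link_state link;  /* models if_link_state */
};

/*
 * Models the checks the patch adds to do_link_state_change():
 * routes are touched only when the sysctl is enabled and the
 * interface is administratively up.
 */
enum route_action
on_link_change(const struct iface *ifp, bool remove_if_routes)
{
	assert(ifp != NULL);
	if (!remove_if_routes || !ifp->admin_up)
		return ROUTES_KEEP;
	if (ifp->link == LINK_DOWN)
		return ROUTES_REMOVE;
	if (ifp->link == LINK_UP)
		return ROUTES_ANNOUNCE;
	return ROUTES_KEEP;    /* unknown link state: leave routes alone */
}

/*
 * Models the check the patch adds to if_up(): with the sysctl
 * enabled, routes are announced only if the link is already up.
 */
bool
announce_on_admin_up(const struct iface *ifp, bool remove_if_routes)
{
	assert(ifp != NULL);
	return !remove_if_routes || ifp->link == LINK_UP;
}
```

With the sysctl left at 0, on_link_change() always returns ROUTES_KEEP, which corresponds to keeping interface routes across a bumped cable, as described in the quoted objection.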
> --------------050906070400020401060909 Content-Type: text/plain; name="remove_iface_routes.diff" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="remove_iface_routes.diff" Index: sys/net/if.c =================================================================== --- sys/net/if.c (revision 247623) +++ sys/net/if.c (working copy) @@ -112,6 +112,12 @@ SYSCTL_INT(_net_link, OID_AUTO, log_link_state_cha &log_link_state_change, 0, "log interface link state change events"); +static VNET_DEFINE(int, remove_if_routes) = 0; +#define V_remove_if_routes VNET(remove_if_routes) +SYSCTL_VNET_INT(_net_link, OID_AUTO, remove_iface_routes_on_change, CTLFLAG_RW, + &VNET_NAME(remove_if_routes), 0, + "Remove iface routes on link state change"); + /* Interface description */ static unsigned int ifdescr_maxlen = 1024; SYSCTL_UINT(_net, OID_AUTO, ifdescr_maxlen, CTLFLAG_RW, @@ -161,10 +167,10 @@ static int ifconf(u_long, caddr_t); static void if_freemulti(struct ifmultiaddr *); static void if_init(void *); static void if_grow(void); -static void if_route(struct ifnet *, int flag, int fam); +static void if_route(struct ifnet *, int fam); static int if_setflag(struct ifnet *, int, int, int *, int); static int if_transmit(struct ifnet *ifp, struct mbuf *m); -static void if_unroute(struct ifnet *, int flag, int fam); +static void if_unroute(struct ifnet *, int fam); static void link_rtrequest(int, struct rtentry *, struct rt_addrinfo *); static int if_rtdel(struct radix_node *, void *); static int ifhwioctl(u_long, struct ifnet *, caddr_t, struct thread *); @@ -1834,22 +1841,13 @@ link_rtrequest(int cmd, struct rtentry *rt, struct * the transition. 
*/ static void -if_unroute(struct ifnet *ifp, int flag, int fam) +if_unroute(struct ifnet *ifp, int fam) { struct ifaddr *ifa; - KASSERT(flag == IFF_UP, ("if_unroute: flag != IFF_UP")); - - ifp->if_flags &= ~flag; - getmicrotime(&ifp->if_lastchange); TAILQ_FOREACH(ifa, &ifp->if_addrhead, ifa_link) if (fam == PF_UNSPEC || (fam == ifa->ifa_addr->sa_family)) pfctlinput(PRC_IFDOWN, ifa->ifa_addr); - ifp->if_qflush(ifp); - - if (ifp->if_carp) - (*carp_linkstate_p)(ifp); - rt_ifmsg(ifp); } /* @@ -1857,23 +1855,13 @@ static void * the transition. */ static void -if_route(struct ifnet *ifp, int flag, int fam) +if_route(struct ifnet *ifp, int fam) { struct ifaddr *ifa; - KASSERT(flag == IFF_UP, ("if_route: flag != IFF_UP")); - - ifp->if_flags |= flag; - getmicrotime(&ifp->if_lastchange); TAILQ_FOREACH(ifa, &ifp->if_addrhead, ifa_link) if (fam == PF_UNSPEC || (fam == ifa->ifa_addr->sa_family)) pfctlinput(PRC_IFUP, ifa->ifa_addr); - if (ifp->if_carp) - (*carp_linkstate_p)(ifp); - rt_ifmsg(ifp); -#ifdef INET6 - in6_if_up(ifp); -#endif } void (*vlan_link_state_p)(struct ifnet *); /* XXX: private from if_vlan */ @@ -1909,8 +1897,19 @@ do_link_state_change(void *arg, int pending) int link_state = ifp->if_link_state; CURVNET_SET(ifp->if_vnet); + /* Remove routes if link goes down */ + if (V_remove_if_routes != 0 && link_state == LINK_STATE_DOWN && + (ifp->if_flags & IFF_UP)) + if_unroute(ifp, PF_UNSPEC); + /* Notify that the link state has changed. 
*/ rt_ifmsg(ifp); + + /* Announce routes IFF Oper & Admin state is UP */ + if (V_remove_if_routes != 0 && link_state == LINK_STATE_UP && + (ifp->if_flags & IFF_UP)) + if_route(ifp, PF_UNSPEC); + if (ifp->if_vlantrunk != NULL) (*vlan_link_state_p)(ifp); @@ -1945,7 +1944,16 @@ void if_down(struct ifnet *ifp) { - if_unroute(ifp, IFF_UP, AF_UNSPEC); + ifp->if_flags &= ~IFF_UP; + getmicrotime(&ifp->if_lastchange); + + if_unroute(ifp, AF_UNSPEC); + + ifp->if_qflush(ifp); + + if (ifp->if_carp) + (*carp_linkstate_p)(ifp); + rt_ifmsg(ifp); } /* @@ -1956,7 +1964,18 @@ void if_up(struct ifnet *ifp) { - if_route(ifp, IFF_UP, AF_UNSPEC); + ifp->if_flags |= IFF_UP; + getmicrotime(&ifp->if_lastchange); + + if (V_remove_if_routes == 0 || ifp->if_link_state == LINK_STATE_UP) + if_route(ifp, AF_UNSPEC); + + if (ifp->if_carp) + (*carp_linkstate_p)(ifp); + rt_ifmsg(ifp); +#ifdef INET6 + in6_if_up(ifp); +#endif } /* --------------050906070400020401060909-- From owner-freebsd-net@FreeBSD.ORG Fri Mar 8 17:04:37 2013 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id B1251B97 for ; Fri, 8 Mar 2013 17:04:37 +0000 (UTC) (envelope-from wollman@hergotha.csail.mit.edu) Received: from hergotha.csail.mit.edu (wollman-1-pt.tunnel.tserv4.nyc4.ipv6.he.net [IPv6:2001:470:1f06:ccb::2]) by mx1.freebsd.org (Postfix) with ESMTP id 6B70D377 for ; Fri, 8 Mar 2013 17:04:37 +0000 (UTC) Received: from hergotha.csail.mit.edu (localhost [127.0.0.1]) by hergotha.csail.mit.edu (8.14.5/8.14.5) with ESMTP id r28H4a4N003421; Fri, 8 Mar 2013 12:04:36 -0500 (EST) (envelope-from wollman@hergotha.csail.mit.edu) Received: (from wollman@localhost) by hergotha.csail.mit.edu (8.14.5/8.14.4/Submit) id r28H4aD4003418; Fri, 8 Mar 2013 12:04:36 -0500 (EST) (envelope-from wollman) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-ID: 
<20794.6692.191898.682241@hergotha.csail.mit.edu> Date: Fri, 8 Mar 2013 12:04:36 -0500 From: Garrett Wollman To: Jack Vogel Subject: Re: Limits on jumbo mbuf cluster allocation In-Reply-To: References: <20793.36593.774795.720959@hergotha.csail.mit.edu> <51399926.6020201@freebsd.org> X-Mailer: VM 7.17 under 21.4 (patch 22) "Instant Classic" XEmacs Lucid X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.2.7 (hergotha.csail.mit.edu [127.0.0.1]); Fri, 08 Mar 2013 12:04:36 -0500 (EST) X-Spam-Status: No, score=-1.0 required=5.0 tests=ALL_TRUSTED autolearn=disabled version=3.3.2 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on hergotha.csail.mit.edu Cc: freebsd-net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 08 Mar 2013 17:04:37 -0000 < said: > I am not strongly opposed to trying the 4k mbuf pool for all larger sizes, > Garrett maybe if you would try that on your system and see if that helps > you, I could envision making this a tunable at some point perhaps? If you can provide a patch I can certainly build it in to our kernel and have it ready the next time the production server crashes. I'd like it to be at least a *little* tested by someone else beforehand, though. 
-GAWollman From owner-freebsd-net@FreeBSD.ORG Fri Mar 8 17:09:57 2013 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 5D674D4E; Fri, 8 Mar 2013 17:09:57 +0000 (UTC) (envelope-from wollman@hergotha.csail.mit.edu) Received: from hergotha.csail.mit.edu (wollman-1-pt.tunnel.tserv4.nyc4.ipv6.he.net [IPv6:2001:470:1f06:ccb::2]) by mx1.freebsd.org (Postfix) with ESMTP id 064B15EA; Fri, 8 Mar 2013 17:09:56 +0000 (UTC) Received: from hergotha.csail.mit.edu (localhost [127.0.0.1]) by hergotha.csail.mit.edu (8.14.5/8.14.5) with ESMTP id r28H9ugC003460; Fri, 8 Mar 2013 12:09:56 -0500 (EST) (envelope-from wollman@hergotha.csail.mit.edu) Received: (from wollman@localhost) by hergotha.csail.mit.edu (8.14.5/8.14.4/Submit) id r28H9uOI003457; Fri, 8 Mar 2013 12:09:56 -0500 (EST) (envelope-from wollman) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-ID: <20794.7012.265887.99878@hergotha.csail.mit.edu> Date: Fri, 8 Mar 2013 12:09:56 -0500 From: Garrett Wollman To: Andre Oppermann Subject: Re: Limits on jumbo mbuf cluster allocation In-Reply-To: <51399926.6020201@freebsd.org> References: <20793.36593.774795.720959@hergotha.csail.mit.edu> <51399926.6020201@freebsd.org> X-Mailer: VM 7.17 under 21.4 (patch 22) "Instant Classic" XEmacs Lucid X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.2.7 (hergotha.csail.mit.edu [127.0.0.1]); Fri, 08 Mar 2013 12:09:56 -0500 (EST) X-Spam-Status: No, score=-1.0 required=5.0 tests=ALL_TRUSTED autolearn=disabled version=3.3.2 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on hergotha.csail.mit.edu Cc: jfv@freebsd.org, freebsd-net@freebsd.org, Garrett Wollman X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: 
List-Subscribe: , X-List-Received-Date: Fri, 08 Mar 2013 17:09:57 -0000 < said: > [stuff I wrote deleted] > You have an amd64 kernel running HEAD or 9.x? Yes, these are 9.1 with some patches to reduce mutex contention on the NFS server's replay "cache". > Jumbo pages come directly from the kernel_map which on amd64 is 512GB. > So KVA shouldn't be a problem. Your problem indeed appears to come > physical memory fragmentation in pmap. I hadn't realized that they were physically contiguous, but that makes perfect sense. > pages. Also since you're doing NFS serving almost all memory will be > in use for file caching. I actually had the ZFS ARC tuned down to 64 GB (out of 96 GB physmem) when I experienced this, but there are plenty of data structures in the kernel that aren't subject to this limit and I could easily imagine them checkerboarding physical memory to the point where no contiguous three-page allocations were possible. -GAWollman From owner-freebsd-net@FreeBSD.ORG Fri Mar 8 18:06:13 2013 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 5437A2D0 for ; Fri, 8 Mar 2013 18:06:13 +0000 (UTC) (envelope-from andre@freebsd.org) Received: from c00l3r.networx.ch (c00l3r.networx.ch [62.48.2.2]) by mx1.freebsd.org (Postfix) with ESMTP id B9162A97 for ; Fri, 8 Mar 2013 18:06:12 +0000 (UTC) Received: (qmail 93868 invoked from network); 8 Mar 2013 19:19:21 -0000 Received: from c00l3r.networx.ch (HELO [127.0.0.1]) ([62.48.2.2]) (envelope-sender ) by c00l3r.networx.ch (qmail-ldap-1.03) with SMTP for ; 8 Mar 2013 19:19:21 -0000 Message-ID: <513A2887.2010408@freebsd.org> Date: Fri, 08 Mar 2013 19:05:59 +0100 From: Andre Oppermann User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:17.0) Gecko/20130107 Thunderbird/17.0.2 MIME-Version: 1.0 To: Garrett Wollman Subject: Re: Limits on jumbo mbuf cluster allocation References: <20793.36593.774795.720959@hergotha.csail.mit.edu> 
<51399926.6020201@freebsd.org> <20794.6692.191898.682241@hergotha.csail.mit.edu> In-Reply-To: <20794.6692.191898.682241@hergotha.csail.mit.edu> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-net@freebsd.org, Jack Vogel X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 08 Mar 2013 18:06:13 -0000 On 08.03.2013 18:04, Garrett Wollman wrote: > < said: > >> I am not strongly opposed to trying the 4k mbuf pool for all larger sizes, >> Garrett maybe if you would try that on your system and see if that helps >> you, I could envision making this a tunable at some point perhaps? > > If you can provide a patch I can certainly build it in to our kernel > and have it ready the next time the production server crashes. I'd > like it to be at least a *little* tested by someone else beforehand, > though. This should do the trick. 
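The effect of this change on receive buffer sizing can be modelled outside the kernel. The sketch below is not driver code: the constants restate the usual FreeBSD cluster sizes (2048 bytes, one 4096-byte page, 9216 and 16384 bytes) as assumptions, and the function names are made up for illustration.

```c
#include <assert.h>

/* Assumed sizes; on a real system they come from the kernel headers. */
enum {
	PAGE_SZ = 4096,   /* assumed page size (amd64) */
	CL_2K   = 2048,   /* standard cluster */
	CL_PAGE = 4096,   /* page-sized jumbo cluster */
	CL_9K   = 9216,   /* 9k jumbo cluster */
	CL_16K  = 16384   /* 16k jumbo cluster */
};

/* Buffer size selection as it looks before the patch. */
int
rx_buf_size_old(int max_frame_size)
{
	assert(max_frame_size > 0);
	if (max_frame_size <= 2048)
		return CL_2K;
	if (max_frame_size <= 4096)
		return CL_PAGE;
	if (max_frame_size <= 9216)
		return CL_9K;
	return CL_16K;
}

/* Selection after the patch: never larger than one page. */
int
rx_buf_size_new(int max_frame_size)
{
	assert(max_frame_size > 0);
	return (max_frame_size <= 2048) ? CL_2K : CL_PAGE;
}

/* Physically contiguous pages one buffer of the given size occupies. */
int
pages_needed(int buf_size)
{
	assert(buf_size > 0);
	return (buf_size + PAGE_SZ - 1) / PAGE_SZ;
}
```

For a 9000-byte MTU (about 9018 bytes per frame, assuming 14 bytes of Ethernet header and 4 bytes of CRC), the old selection picks a 9216-byte cluster that needs three physically contiguous pages, while the new one picks a single page-sized cluster, so a large frame has to span several buffers instead.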
-- Andre Index: dev/ixgbe/ixgbe.c =================================================================== --- dev/ixgbe/ixgbe.c (revision 247893) +++ dev/ixgbe/ixgbe.c (working copy) @@ -1120,12 +1120,8 @@ */ if (adapter->max_frame_size <= 2048) adapter->rx_mbuf_sz = MCLBYTES; - else if (adapter->max_frame_size <= 4096) + else adapter->rx_mbuf_sz = MJUMPAGESIZE; - else if (adapter->max_frame_size <= 9216) - adapter->rx_mbuf_sz = MJUM9BYTES; - else - adapter->rx_mbuf_sz = MJUM16BYTES; /* Prepare receive descriptors and buffers */ if (ixgbe_setup_receive_structures(adapter)) { From owner-freebsd-net@FreeBSD.ORG Fri Mar 8 19:19:30 2013 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id B98161BA for ; Fri, 8 Mar 2013 19:19:30 +0000 (UTC) (envelope-from mxb@alumni.chalmers.se) Received: from mail-lb0-f171.google.com (mail-lb0-f171.google.com [209.85.217.171]) by mx1.freebsd.org (Postfix) with ESMTP id 3B9B37DB for ; Fri, 8 Mar 2013 19:19:28 +0000 (UTC) Received: by mail-lb0-f171.google.com with SMTP id gg13so1597740lbb.30 for ; Fri, 08 Mar 2013 11:19:27 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20120113; h=x-received:from:content-type:content-transfer-encoding:subject :message-id:date:to:mime-version:x-mailer:x-gm-message-state; bh=hpP2DfqP0i7Cbc+wFXdogReSXv9+5ysn6PhmsGsBvPg=; b=bpYNttKSr/EG12Mj+4/q0wS049grHQyxDJBdIQy9qvoNL39B0ejeQOY1oHVXM0C5PH 5i71iW3sspPVssrk7YABSqRI4kG3ByztAFslixhiz+lucZrGAUiBBaKlj3l9bxFn0oMA PGFBD7RfbnAivLuRiIrbDinVxQTV1p3sdb0su67ClyaD2aS3T1nkNhHgrVMjfhzRBkoW jVp2cJm1n24F8LFCiRmiymK5eOCq3vWAM1/prr3tZ3tnfR8FTIMenWHDgDKnzhR6AjYU 74nDdWVCfA5dEeDZ0QQTVv5huD3vSF8YT5C03iRIadhS2swdQXlP54GHzgPYQ5NKWpUR eQ3A== X-Received: by 10.152.104.80 with SMTP id gc16mr2937538lab.49.1362770367656; Fri, 08 Mar 2013 11:19:27 -0800 (PST) Received: from grey.home.unixconn.com (h-75-17.a183.priv.bahnhof.se. 
[46.59.75.17]) by mx.google.com with ESMTPS id xw14sm3072505lab.6.2013.03.08.11.19.25 (version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128); Fri, 08 Mar 2013 11:19:26 -0800 (PST) From: mxb Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: quoted-printable Subject: 9.1-RELEASE-p1: em0: Could not setup receive structures Message-Id: <5587F8D1-2242-4579-B992-357C75425A37@alumni.chalmers.se> Date: Fri, 8 Mar 2013 20:19:24 +0100 To: freebsd-net@freebsd.org Mime-Version: 1.0 (Mac OS X Mail 6.2 \(1499\)) X-Mailer: Apple Mail (2.1499) X-Gm-Message-State: ALoCoQnplJjATAZxjQmJP+UGWD6UIBjCMX5FPWaXxwMH9eolz2KL0oPbdrG973j6mgaCcMlVJ4Pq X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 08 Mar 2013 19:19:30 -0000

Hello list@,

I'm mostly active on the OpenBSD side, but I have several machines running FreeBSD with ZFS.

I upgraded today from 8.2-STABLE to 9.1-RELEASE because of a problem with em(4) on 8.2-STABLE. However, the problem has not disappeared after the upgrade.

I serve VMware images from this machine.
Configuration:

lagg0: flags=8843 metric 0 mtu 9000
	options=4019b
	ether 00:25:90:24:70:e8
	inet 172.16.0.243 netmask 0xfffff800 broadcast 172.16.7.255
	nd6 options=1
	media: Ethernet autoselect
	status: active
	laggproto lacp lagghash l2,l3,l4
	laggport: igb0 flags=1c
	laggport: em0 flags=18
lagg1: flags=8843 metric 0 mtu 9000
	options=4019b
	ether 00:25:90:24:70:e9
	inet 10.11.11.1 netmask 0xffffff00 broadcast 10.11.11.255
	inet 10.11.11.11 netmask 0xffffff00 broadcast 10.11.11.255
	inet 10.11.11.12 netmask 0xffffff00 broadcast 10.11.11.255
	inet 10.11.11.13 netmask 0xffffff00 broadcast 10.11.11.255
	inet 10.11.11.14 netmask 0xffffff00 broadcast 10.11.11.255
	inet 10.11.11.15 netmask 0xffffff00 broadcast 10.11.11.255
	inet 10.11.11.16 netmask 0xffffff00 broadcast 10.11.11.255
	inet 10.11.11.17 netmask 0xffffff00 broadcast 10.11.11.255
	inet 10.11.11.18 netmask 0xffffff00 broadcast 10.11.11.255
	inet 10.11.11.19 netmask 0xffffff00 broadcast 10.11.11.255
	inet 10.11.11.20 netmask 0xffffff00 broadcast 10.11.11.255
	inet 10.11.11.21 netmask 0xffffff00 broadcast 10.11.11.255
	inet 10.11.11.22 netmask 0xffffff00 broadcast 10.11.11.255
	inet 10.11.11.23 netmask 0xffffff00 broadcast 10.11.11.255
	inet 10.11.11.24 netmask 0xffffff00 broadcast 10.11.11.255
	inet 10.11.11.25 netmask 0xffffff00 broadcast 10.11.11.255
	inet 10.11.11.26 netmask 0xffffff00 broadcast 10.11.11.255
	inet 10.11.11.27 netmask 0xffffff00 broadcast 10.11.11.255
	inet 10.11.11.28 netmask 0xffffff00 broadcast 10.11.11.255
	nd6 options=1
	media: Ethernet autoselect
	status: active
	laggproto lacp lagghash l2,l3,l4
	laggport: igb1 flags=1c
	laggport: em1 flags=1c

current sysctl.conf (not fixed after upgrade):
# Every socket is a file, so increase them
kern.maxfiles=204800
kern.maxfilesperproc=200000
kern.ipc.maxsockets=204800

# Increase max command-line length showed in `ps` (e.g for Tomcat/Java)
# Default is PAGE_SIZE / 16 or 256 on x86
# For more info see:
http://www.freebsd.org/cgi/query-pr.cgi?pr=120749
kern.ps_arg_cache_limit=4096

# Security
#net.inet.udp.blackhole=1
#net.inet.tcp.blackhole=2

kern.ipc.maxsockbuf=16777216
kern.ipc.nmbclusters=65535
kern.ipc.somaxconn=32768
#kern.maxfiles=65535
kern.maxvnodes=800000

vfs.zfs.l2arc_noprefetch=0
vfs.zfs.l2arc_write_max=16777216
vfs.zfs.l2arc_write_boost=16777216

net.inet.tcp.sendspace=65535
net.inet.tcp.recvspace=131072
net.inet.tcp.mssdflt=1452
net.inet.tcp.sendbuf_max=16777216
net.inet.tcp.sendbuf_inc=524288
net.inet.tcp.recvbuf_max=16777216
net.inet.tcp.recvbuf_inc=524288
net.inet.udp.recvspace=65535
net.inet.udp.maxdgram=65535
net.local.stream.recvspace=65535
net.local.stream.sendspace=65535
net.inet.tcp.delayed_ack=0

Any clues?

Please mail directly to me or cc to sysop@prisjakt.nu

Regards
Maxim

From owner-freebsd-net@FreeBSD.ORG Fri Mar 8 20:03:21 2013 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 8FFBF41A for ; Fri, 8 Mar 2013 20:03:21 +0000 (UTC) (envelope-from jfvogel@gmail.com) Received: from mail-vc0-f179.google.com (mail-vc0-f179.google.com [209.85.220.179]) by mx1.freebsd.org (Postfix) with ESMTP id 537CD718 for ; Fri, 8 Mar 2013 20:03:21 +0000 (UTC) Received: by mail-vc0-f179.google.com with SMTP id k1so1095174vck.38 for ; Fri, 08 Mar 2013 12:03:20 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:x-received:in-reply-to:references:date:message-id :subject:from:to:cc:content-type; bh=nedf9ngqEBAa4nY079vCkQ80qvaCJ3Sbsdtri/ES8t8=; b=X/arb8emGD9aaULCqa5ckKIQjQEDaVMlRTl6V89g68ot0mrjSxA54kWeIQz30Qvf39 JoFT6jw3/w6O8Qyl2R6/iQzDZM0Y+t3JUj5++dl5Imdw+zcUnBHtyUfh5BPuz15Rxdd2 DfcHH2BW6rsX/fd9yJp9lc+yzt6cJs4G8/4j9ECMTxa9wA6keY0OWhuEbdum7ghwq5jC 0w7tP3631rGSnPU1HRq38V0GgB/5J1OXErK9Z0Q/HUWtAuZucV1ozc5st3tK9fEgSLgm
W4gEwrEYrNUz5epIfSVCYShxePtrvqVgmxKoh1z6ezESCpT3EcZwdRDmkkLcpenahWee WPmA== MIME-Version: 1.0 X-Received: by 10.52.30.48 with SMTP id p16mr1275596vdh.118.1362773000366; Fri, 08 Mar 2013 12:03:20 -0800 (PST) Received: by 10.220.191.132 with HTTP; Fri, 8 Mar 2013 12:03:20 -0800 (PST) In-Reply-To: <5587F8D1-2242-4579-B992-357C75425A37@alumni.chalmers.se> References: <5587F8D1-2242-4579-B992-357C75425A37@alumni.chalmers.se> Date: Fri, 8 Mar 2013 12:03:20 -0800 Message-ID: Subject: Re: 9.1-RELEASE-p1: em0: Could not setup receive structures From: Jack Vogel To: mxb Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.14 Cc: freebsd-net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 08 Mar 2013 20:03:21 -0000 The message occurs because you don't have enough mbufs to setup the RX ring, so you need to look at nmbclusters. It may be that em is just the victim, since you have igb interfaces as well from what I see. Jack On Fri, Mar 8, 2013 at 11:19 AM, mxb wrote: > > Hello list@, > > I'm mostly active on OpenBSD-side, however I have several machines running > fbsd with ZFS. > > I'v recently upgraded(today) from 8.2-stable to 9.1-rel because of em(4) > with the problem on 8.2-stable. > However, my problem has not disappeared after mentioned upgrade. > > I serve VMWare images from this machine. 
> > Configuration: > > lagg0: flags=8843 metric 0 mtu 9000 > > options=4019b > ether 00:25:90:24:70:e8 > inet 172.16.0.243 netmask 0xfffff800 broadcast 172.16.7.255 > nd6 options=1 > media: Ethernet autoselect > status: active > laggproto lacp lagghash l2,l3,l4 > laggport: igb0 flags=1c > laggport: em0 flags=18 > lagg1: flags=8843 metric 0 mtu 9000 > > options=4019b > ether 00:25:90:24:70:e9 > inet 10.11.11.1 netmask 0xffffff00 broadcast 10.11.11.255 > inet 10.11.11.11 netmask 0xffffff00 broadcast 10.11.11.255 > inet 10.11.11.12 netmask 0xffffff00 broadcast 10.11.11.255 > inet 10.11.11.13 netmask 0xffffff00 broadcast 10.11.11.255 > inet 10.11.11.14 netmask 0xffffff00 broadcast 10.11.11.255 > inet 10.11.11.15 netmask 0xffffff00 broadcast 10.11.11.255 > inet 10.11.11.16 netmask 0xffffff00 broadcast 10.11.11.255 > inet 10.11.11.17 netmask 0xffffff00 broadcast 10.11.11.255 > inet 10.11.11.18 netmask 0xffffff00 broadcast 10.11.11.255 > inet 10.11.11.19 netmask 0xffffff00 broadcast 10.11.11.255 > inet 10.11.11.20 netmask 0xffffff00 broadcast 10.11.11.255 > inet 10.11.11.21 netmask 0xffffff00 broadcast 10.11.11.255 > inet 10.11.11.22 netmask 0xffffff00 broadcast 10.11.11.255 > inet 10.11.11.23 netmask 0xffffff00 broadcast 10.11.11.255 > inet 10.11.11.24 netmask 0xffffff00 broadcast 10.11.11.255 > inet 10.11.11.25 netmask 0xffffff00 broadcast 10.11.11.255 > inet 10.11.11.26 netmask 0xffffff00 broadcast 10.11.11.255 > inet 10.11.11.27 netmask 0xffffff00 broadcast 10.11.11.255 > inet 10.11.11.28 netmask 0xffffff00 broadcast 10.11.11.255 > nd6 options=1 > media: Ethernet autoselect > status: active > laggproto lacp lagghash l2,l3,l4 > laggport: igb1 flags=1c > laggport: em1 flags=1c > > current sysctl.conf (not fixed after upgrade): > # Every socket is a file, so increase them > kern.maxfiles=204800 > kern.maxfilesperproc=200000 > kern.ipc.maxsockets=204800 > > # Increase max command-line length showed in `ps` (e.g for Tomcat/Java) > # Default is PAGE_SIZE / 16 or 256 on x86 
> # For more info see: http://www.freebsd.org/cgi/query-pr.cgi?pr=120749 > kern.ps_arg_cache_limit=4096 > > # Security > #net.inet.udp.blackhole=1 > #net.inet.tcp.blackhole=2 > > kern.ipc.maxsockbuf=16777216 > kern.ipc.nmbclusters=65535 > kern.ipc.somaxconn=32768 > #kern.maxfiles=65535 > kern.maxvnodes=800000 > > vfs.zfs.l2arc_noprefetch=0 > vfs.zfs.l2arc_write_max=16777216 > vfs.zfs.l2arc_write_boost=16777216 > > net.inet.tcp.sendspace=65535 > net.inet.tcp.recvspace=131072 > net.inet.tcp.mssdflt=1452 > net.inet.tcp.sendbuf_max=16777216 > net.inet.tcp.sendbuf_inc=524288 > net.inet.tcp.recvbuf_max=16777216 > net.inet.tcp.recvbuf_inc=524288 > net.inet.udp.recvspace=65535 > net.inet.udp.maxdgram=65535 > net.local.stream.recvspace=65535 > net.local.stream.sendspace=65535 > net.inet.tcp.delayed_ack=0 > > > Any clues? > > Please mail directly to me or cc to sysop@prisjakt.nu > > Regards > Maxim > > > _______________________________________________ > freebsd-net@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-net > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" > From owner-freebsd-net@FreeBSD.ORG Fri Mar 8 20:11:50 2013 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 163E0881; Fri, 8 Mar 2013 20:11:50 +0000 (UTC) (envelope-from ermal.luci@gmail.com) Received: from mail-qe0-f43.google.com (mail-qe0-f43.google.com [209.85.128.43]) by mx1.freebsd.org (Postfix) with ESMTP id A7D6E82C; Fri, 8 Mar 2013 20:11:49 +0000 (UTC) Received: by mail-qe0-f43.google.com with SMTP id 1so1241902qee.2 for ; Fri, 08 Mar 2013 12:11:43 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:x-received:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type; bh=6dfG5ak3dFdXeVLMvxfuj1OJZUZSzvkRA24n915cEJ0=; 
b=USSvbn+UoCu3EK8sPoJWRMDiTWtDb9brRRGWLHAZW9eVjBHSv7H5kqgidwdDyxmM/i p3ggkKyvVcorXyUt+4oE9F41QqWJtJj6wKbA5K6gqINOgK0dhnGFwEYfCQWmvMMSFTts R9vzlaMny4m3vBJdZPp7oRmXHgT3hfQvSsctmVj7u2qockJOcnBz96t87yt5sgr1topB SbdZNG2nfiUVuBq8eXQXnhSr6tkqPYkBEYnc/UiJK4itXxfm8moFaeRk9d7735p0yGv5 v3fqzD/7Q33nQsiWveGxiI7znfhqNC92ggkNqg2aAUUgrE26KduAK8DrMik4STf/HDrp qGEg== MIME-Version: 1.0 X-Received: by 10.224.184.130 with SMTP id ck2mr5848224qab.41.1362773503493; Fri, 08 Mar 2013 12:11:43 -0800 (PST) Sender: ermal.luci@gmail.com Received: by 10.49.27.197 with HTTP; Fri, 8 Mar 2013 12:11:43 -0800 (PST) In-Reply-To: <201303081419.17743.vegeta@tuxpowered.net> References: <201303081419.17743.vegeta@tuxpowered.net> Date: Fri, 8 Mar 2013 21:11:43 +0100 X-Google-Sender-Auth: 9xXpcPwr1C64h_-MLHQWtFTBtYw Message-ID: Subject: Re: [patch] Source entries removing is awfully slow. From: =?ISO-8859-1?Q?Ermal_Lu=E7i?= To: Kajetan Staszkiewicz Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.14 Cc: "freebsd-net@freebsd.org" , "freebsd-pf@freebsd.org" X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 08 Mar 2013 20:11:50 -0000 Is this FreeBSD 9.x or HEAD? On Fri, Mar 8, 2013 at 2:19 PM, Kajetan Staszkiewicz wrote: > Hello there! > > In my enviroment, where I use FreeBSD machines as loadbalancers, after a > server > is detected as dead, loadbalancer removes the the broken server from a > table > used in route-to pf rule and then removes Source entries pointing clients > to > that server, so clients previously assigned to the broken server are re- > loadbalanced to alive servers. > > Each loadbalancer has around 50k Source and 500k State entries. 
Under those > conditions removing a Source from anywhere to a dead server with `pfctl -K > 0.0.0.0/0 -K internal.IP.of.server` freezes the machine for a few seconds > (or > even up to a minute in other datacenter segment, where different services > are > served, causing thousands instead of just a few hundred States to be > matched). > Under a DDoS attack, when removing Sources to a server under attack, kernel > freezes permanently (I gave up after 10 minutes waiting and restarted the > machine). > > A patch fixing the issue can be found here: > > http://vegeta.tuxpowered.net/download/link-states-to-src_node.patch > > -- > | pozdrawiam / greetings | powered by Debian, CentOS and FreeBSD | > | Kajetan Staszkiewicz | jabber,email: vegeta()tuxpowered net | > | Vegeta | www: http://vegeta.tuxpowered.net | > `------------------------^---------------------------------------' > _______________________________________________ > freebsd-net@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-net > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" > -- Ermal From owner-freebsd-net@FreeBSD.ORG Fri Mar 8 20:13:29 2013 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id DE245969; Fri, 8 Mar 2013 20:13:29 +0000 (UTC) (envelope-from jfvogel@gmail.com) Received: from mail-vb0-x22a.google.com (mail-vb0-x22a.google.com [IPv6:2607:f8b0:400c:c02::22a]) by mx1.freebsd.org (Postfix) with ESMTP id 7BEF4844; Fri, 8 Mar 2013 20:13:29 +0000 (UTC) Received: by mail-vb0-f42.google.com with SMTP id ff1so806973vbb.1 for ; Fri, 08 Mar 2013 12:13:29 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:x-received:in-reply-to:references:date:message-id :subject:from:to:cc:content-type; bh=WSoNqN97tyowqEdqZLWxrDvqQNDpD7DH6Yyp44TzIZo=; 
b=YKpA/eGUhk18+UYso/10hrMYdAzx6ZUj5x7P8mlFQC8Sr7KRgSB/xkMPhmG6EYC/EG xa6LbqI3Stls3gC6lboeXurnmx1rlcRFWi2xV/Tc+pk52B9Dzitj6zNYGNYVpRSWLtqb Ib7YphqtJdq2KwNPQmOiV6+w5zj2oz9bUR9GRENWmIIuNDnxM6gPAgP8NtB0NKrl4eI8 MigSZH2mSBPG4V0ni9tVRToLrn5WJ01qutU1woqSgWhRm4wBX1rida4iMMfLLBm1N6Iw 6RYTk7uUegtmickuNHP2Z8fN/TbP/HcHalfe2BTYE9Le9C9PhtV8M0gDP/PxeEi37TVi e01w== MIME-Version: 1.0 X-Received: by 10.52.19.239 with SMTP id i15mr1292070vde.47.1362773608886; Fri, 08 Mar 2013 12:13:28 -0800 (PST) Received: by 10.220.191.132 with HTTP; Fri, 8 Mar 2013 12:13:28 -0800 (PST) In-Reply-To: <513A2887.2010408@freebsd.org> References: <20793.36593.774795.720959@hergotha.csail.mit.edu> <51399926.6020201@freebsd.org> <20794.6692.191898.682241@hergotha.csail.mit.edu> <513A2887.2010408@freebsd.org> Date: Fri, 8 Mar 2013 12:13:28 -0800 Message-ID: Subject: Re: Limits on jumbo mbuf cluster allocation From: Jack Vogel To: Andre Oppermann Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.14 Cc: freebsd-net@freebsd.org, Garrett Wollman X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 08 Mar 2013 20:13:29 -0000 Yes, in the past the code was in this form, it should work fine Garrett, just make sure the 4K pool is large enough. I've actually been thinking about making the ring mbuf allocation sparse, and what type of strategy could be used. Right now I'm thinking of implementing a tunable threshold, and as long as I'm doing that, the 82599 hardware has an interrupt that can be set up for a low descriptor condition. This could have some performance benefits, I could decouple the mbuf refresh even further from rxeof, relying on the interrupt to initiate the refresh rather than a count as it is now. It needs some experimentation/testing first, but I will look into this. 
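For illustration only, the threshold idea described above could be modelled roughly as follows. This is a user-space sketch, not ixgbe code: the structure, the function names (rx_consume, rx_low, rx_refresh) and all numbers are made up for the example, and rx_low() merely stands in for the 82599 low-descriptor interrupt condition.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one RX ring; not the ixgbe data structures. */
struct rx_ring_model {
	size_t ndesc;      /* total descriptors in the ring */
	size_t posted;     /* descriptors that currently own an mbuf */
	size_t low_thresh; /* tunable: ask for a refresh below this */
};

/* The hardware used one posted buffer for an incoming fragment. */
bool
rx_consume(struct rx_ring_model *r)
{
	if (r->posted == 0)
		return false;	/* nothing posted: the frame would be dropped */
	r->posted--;
	return true;
}

/* Stand-in for the "low descriptor" condition that would raise the interrupt. */
bool
rx_low(const struct rx_ring_model *r)
{
	return r->posted < r->low_thresh;
}

/*
 * Top the ring up towards 'target' posted buffers, limited by what is
 * left in the pool. Returns the number of buffers taken from the pool.
 */
size_t
rx_refresh(struct rx_ring_model *r, size_t target, size_t *pool_free)
{
	size_t want, got;

	if (target > r->ndesc)
		target = r->ndesc;
	want = (r->posted < target) ? target - r->posted : 0;
	got = (want < *pool_free) ? want : *pool_free;
	r->posted += got;
	*pool_free -= got;
	return got;
}
```

With a 1024-entry ring that is only populated with 256 buffers and a threshold of 64, rx_low() becomes true once fewer than 64 buffers remain posted, and rx_refresh() tops the ring back up as far as the pool allows.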
Jack

On Fri, Mar 8, 2013 at 10:05 AM, Andre Oppermann wrote:

> On 08.03.2013 18:04, Garrett Wollman wrote:
>
>> < said:
>>
>> I am not strongly opposed to trying the 4k mbuf pool for all larger
>>> sizes,
>>> Garrett maybe if you would try that on your system and see if that helps
>>> you, I could envision making this a tunable at some point perhaps?
>>>
>>
>> If you can provide a patch I can certainly build it in to our kernel
>> and have it ready the next time the production server crashes. I'd
>> like it to be at least a *little* tested by someone else beforehand,
>> though.
>>
>
> This should do the trick.
>
> --
> Andre
>
> Index: dev/ixgbe/ixgbe.c
> ===================================================================
> --- dev/ixgbe/ixgbe.c (revision 247893)
> +++ dev/ixgbe/ixgbe.c (working copy)
> @@ -1120,12 +1120,8 @@
>  	 */
>  	if (adapter->max_frame_size <= 2048)
>  		adapter->rx_mbuf_sz = MCLBYTES;
> -	else if (adapter->max_frame_size <= 4096)
> +	else
>  		adapter->rx_mbuf_sz = MJUMPAGESIZE;
> -	else if (adapter->max_frame_size <= 9216)
> -		adapter->rx_mbuf_sz = MJUM9BYTES;
> -	else
> -		adapter->rx_mbuf_sz = MJUM16BYTES;
>
>  	/* Prepare receive descriptors and buffers */
>  	if (ixgbe_setup_receive_structures(adapter)) {
>

From owner-freebsd-net@FreeBSD.ORG Fri Mar 8 20:16:55 2013 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id B1509A25 for ; Fri, 8 Mar 2013 20:16:55 +0000 (UTC) (envelope-from mxb@alumni.chalmers.se) Received: from mail-la0-x230.google.com (mail-la0-x230.google.com [IPv6:2a00:1450:4010:c03::230]) by mx1.freebsd.org (Postfix) with ESMTP id 1073B865 for ; Fri, 8 Mar 2013 20:16:54 +0000 (UTC) Received: by mail-la0-f48.google.com with SMTP id fq13so2107035lab.35 for ; Fri, 08 Mar 2013 12:16:54 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20120113;
h=x-received:content-type:mime-version:subject:from:in-reply-to:date :cc:content-transfer-encoding:message-id:references:to:x-mailer :x-gm-message-state; bh=ngCr5DDSFRRX6s9lrYxa2hy/4HXmTC0ze58H1zEkJUg=; b=ZAMkI5l1Gg6aKrJV+X7DdiyaXn9sblK/W73zpGyArr4az/tkMUXIOFcztO26n1Sljb r4BAXo4bgkUdAOBoYK6OCLWWp3sIGFRbACDI7KBoe45z/QqzvroHylm293rHAmEd2A6p kLXNSwRSE3NadcteipSIX8Y5BEtvLPH8Xgxj9+ehU+aHcx31WEjjTI4fKHJmUn4gpOqS z7MODRq3W3VolEUbUrP2iq/eHKfPDqFuPYJxOPqJIyOn1DcK5vhVmu75+njSCIMRRmC9 dZwZ5m8ZYGTDaPEmttHAIcGuPLzFADN0T04ZWtKXkJIQnhgdbalPFnuEG71N5ubBjDv8 g8Qg== X-Received: by 10.112.104.103 with SMTP id gd7mr1538285lbb.54.1362773813868; Fri, 08 Mar 2013 12:16:53 -0800 (PST) Received: from grey.home.unixconn.com (h-75-17.a183.priv.bahnhof.se. [46.59.75.17]) by mx.google.com with ESMTPS id q9sm2011142lbz.3.2013.03.08.12.16.51 (version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128); Fri, 08 Mar 2013 12:16:52 -0800 (PST) Content-Type: text/plain; charset=us-ascii Mime-Version: 1.0 (Mac OS X Mail 6.2 \(1499\)) Subject: Re: 9.1-RELEASE-p1: em0: Could not setup receive structures From: mxb In-Reply-To: Date: Fri, 8 Mar 2013 21:16:50 +0100 Content-Transfer-Encoding: quoted-printable Message-Id: <2E6BC0C8-D435-433A-ABF7-D0E4F649F262@alumni.chalmers.se> References: <5587F8D1-2242-4579-B992-357C75425A37@alumni.chalmers.se> To: Jack Vogel X-Mailer: Apple Mail (2.1499) X-Gm-Message-State: ALoCoQnQs7wPieT+/P0jO+kHHaSP6N62Obz/c4+bNoOxC15/ncOI4u0wxGzsF83kTrR6HrZZ3Eeg Cc: freebsd-net@freebsd.org, mxb X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 08 Mar 2013 20:16:55 -0000 Any sysctl I'd should look out for? //maxim On 8 mar 2013, at 21:03, Jack Vogel wrote: > The message occurs because you don't have enough mbufs to setup the RX > ring, so you > need to look at nmbclusters. 
It may be that em is just the victim, since > you have igb interfaces > as well from what I see. > > Jack > > > On Fri, Mar 8, 2013 at 11:19 AM, mxb wrote: > >> >> Hello list@, >> >> I'm mostly active on OpenBSD-side, however I have several machines running >> fbsd with ZFS. >> >> I'v recently upgraded(today) from 8.2-stable to 9.1-rel because of em(4) >> with the problem on 8.2-stable. >> However, my problem has not disappeared after mentioned upgrade. >> >> I serve VMWare images from this machine. >> >> Configuration: >> >> lagg0: flags=8843 metric 0 mtu 9000 >> >> options=4019b >> ether 00:25:90:24:70:e8 >> inet 172.16.0.243 netmask 0xfffff800 broadcast 172.16.7.255 >> nd6 options=1 >> media: Ethernet autoselect >> status: active >> laggproto lacp lagghash l2,l3,l4 >> laggport: igb0 flags=1c >> laggport: em0 flags=18 >> lagg1: flags=8843 metric 0 mtu 9000 >> >> options=4019b >> ether 00:25:90:24:70:e9 >> inet 10.11.11.1 netmask 0xffffff00 broadcast 10.11.11.255 >> inet 10.11.11.11 netmask 0xffffff00 broadcast 10.11.11.255 >> inet 10.11.11.12 netmask 0xffffff00 broadcast 10.11.11.255 >> inet 10.11.11.13 netmask 0xffffff00 broadcast 10.11.11.255 >> inet 10.11.11.14 netmask 0xffffff00 broadcast 10.11.11.255 >> inet 10.11.11.15 netmask 0xffffff00 broadcast 10.11.11.255 >> inet 10.11.11.16 netmask 0xffffff00 broadcast 10.11.11.255 >> inet 10.11.11.17 netmask 0xffffff00 broadcast 10.11.11.255 >> inet 10.11.11.18 netmask 0xffffff00 broadcast 10.11.11.255 >> inet 10.11.11.19 netmask 0xffffff00 broadcast 10.11.11.255 >> inet 10.11.11.20 netmask 0xffffff00 broadcast 10.11.11.255 >> inet 10.11.11.21 netmask 0xffffff00 broadcast 10.11.11.255 >> inet 10.11.11.22 netmask 0xffffff00 broadcast 10.11.11.255 >> inet 10.11.11.23 netmask 0xffffff00 broadcast 10.11.11.255 >> inet 10.11.11.24 netmask 0xffffff00 broadcast 10.11.11.255 >> inet 10.11.11.25 netmask 0xffffff00 broadcast 10.11.11.255 >> inet 10.11.11.26
netmask 0xffffff00 broadcast 10.11.11.255 >> inet 10.11.11.27 netmask 0xffffff00 broadcast 10.11.11.255 >> inet 10.11.11.28 netmask 0xffffff00 broadcast 10.11.11.255 >> nd6 options=1 >> media: Ethernet autoselect >> status: active >> laggproto lacp lagghash l2,l3,l4 >> laggport: igb1 flags=1c >> laggport: em1 flags=1c >> >> current sysctl.conf (not fixed after upgrade): >> # Every socket is a file, so increase them >> kern.maxfiles=204800 >> kern.maxfilesperproc=200000 >> kern.ipc.maxsockets=204800 >> >> # Increase max command-line length showed in `ps` (e.g for Tomcat/Java) >> # Default is PAGE_SIZE / 16 or 256 on x86 >> # For more info see: http://www.freebsd.org/cgi/query-pr.cgi?pr=120749 >> kern.ps_arg_cache_limit=4096 >> >> # Security >> #net.inet.udp.blackhole=1 >> #net.inet.tcp.blackhole=2 >> >> kern.ipc.maxsockbuf=16777216 >> kern.ipc.nmbclusters=65535 >> kern.ipc.somaxconn=32768 >> #kern.maxfiles=65535 >> kern.maxvnodes=800000 >> >> vfs.zfs.l2arc_noprefetch=0 >> vfs.zfs.l2arc_write_max=16777216 >> vfs.zfs.l2arc_write_boost=16777216 >> >> net.inet.tcp.sendspace=65535 >> net.inet.tcp.recvspace=131072 >> net.inet.tcp.mssdflt=1452 >> net.inet.tcp.sendbuf_max=16777216 >> net.inet.tcp.sendbuf_inc=524288 >> net.inet.tcp.recvbuf_max=16777216 >> net.inet.tcp.recvbuf_inc=524288 >> net.inet.udp.recvspace=65535 >> net.inet.udp.maxdgram=65535 >> net.local.stream.recvspace=65535 >> net.local.stream.sendspace=65535 >> net.inet.tcp.delayed_ack=0 >> >> >> Any clues?
>> >> Please mail directly to me or cc to sysop@prisjakt.nu >> >> Regards >> Maxim >> >> >> _______________________________________________ >> freebsd-net@freebsd.org mailing list >> http://lists.freebsd.org/mailman/listinfo/freebsd-net >> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" >> > _______________________________________________ > freebsd-net@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-net > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" From owner-freebsd-net@FreeBSD.ORG Fri Mar 8 20:20:09 2013 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 52313AFC for ; Fri, 8 Mar 2013 20:20:09 +0000 (UTC) (envelope-from jeffrey.e.pieper@intel.com) Received: from mga11.intel.com (mga11.intel.com [192.55.52.93]) by mx1.freebsd.org (Postfix) with ESMTP id BAD05880 for ; Fri, 8 Mar 2013 20:20:08 +0000 (UTC) Received: from fmsmga001.fm.intel.com ([10.253.24.23]) by fmsmga102.fm.intel.com with ESMTP; 08 Mar 2013 12:19:56 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="4.84,809,1355126400"; d="scan'208";a="298045704" Received: from orsmsx105.amr.corp.intel.com ([10.22.225.132]) by fmsmga001.fm.intel.com with ESMTP; 08 Mar 2013 12:19:51 -0800 Received: from orsmsx110.amr.corp.intel.com (10.22.225.11) by ORSMSX105.amr.corp.intel.com (10.22.225.132) with Microsoft SMTP Server (TLS) id 14.1.355.2; Fri, 8 Mar 2013 12:19:50 -0800 Received: from orsmsx101.amr.corp.intel.com ([169.254.8.213]) by ORSMSX110.amr.corp.intel.com ([10.22.225.11]) with mapi id 14.01.0355.002; Fri, 8 Mar 2013 12:19:50 -0800 From: "Pieper, Jeffrey E" To: mxb , Jack Vogel Subject: RE: 9.1-RELEASE-p1: em0: Could not setup receive structures Thread-Topic: 9.1-RELEASE-p1: em0: Could not setup receive structures Thread-Index: AQHOHDnr1t1jJJcMs0m4zpHbw4SFOpicPASA Date: Fri, 8 Mar 2013 20:19:49
+0000 Message-ID: <2A35EA60C3C77D438915767F458D65687D4608A9@ORSMSX101.amr.corp.intel.com> References: <5587F8D1-2242-4579-B992-357C75425A37@alumni.chalmers.se> <2E6BC0C8-D435-433A-ABF7-D0E4F649F262@alumni.chalmers.se> In-Reply-To: <2E6BC0C8-D435-433A-ABF7-D0E4F649F262@alumni.chalmers.se> Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-originating-ip: [10.22.254.138] Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Cc: "freebsd-net@freebsd.org" X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 08 Mar 2013 20:20:09 -0000 kern.ipc.nmbclusters Jeff -----Original Message----- From: owner-freebsd-net@freebsd.org [mailto:owner-freebsd-net@freebsd.org] On Behalf Of mxb Sent: Friday, March 08, 2013 12:17 PM To: Jack Vogel Cc: freebsd-net@freebsd.org; mxb Subject: Re: 9.1-RELEASE-p1: em0: Could not setup receive structures Any sysctl I'd should look out for? //maxim On 8 mar 2013, at 21:03, Jack Vogel wrote: > The message occurs because you don't have enough mbufs to setup the RX > ring, so you > need to look at nmbclusters. It may be that em is just the victim, since > you have igb interfaces > as well from what I see. > > Jack > > > On Fri, Mar 8, 2013 at 11:19 AM, mxb wrote: > >> >> Hello list@, >> >> I'm mostly active on OpenBSD-side, however I have several machines running >> fbsd with ZFS. >> >> I'v recently upgraded(today) from 8.2-stable to 9.1-rel because of em(4) >> with the problem on 8.2-stable. >> However, my problem has not disappeared after mentioned upgrade. >> >> I serve VMWare images from this machine.
>> >> Configuration: >> >> lagg0: flags=8843 metric 0 mtu 9000 >> >> options=4019b >> ether 00:25:90:24:70:e8 >> inet 172.16.0.243 netmask 0xfffff800 broadcast 172.16.7.255 >> nd6 options=1 >> media: Ethernet autoselect >> status: active >> laggproto lacp lagghash l2,l3,l4 >> laggport: igb0 flags=1c >> laggport: em0 flags=18 >> lagg1: flags=8843 metric 0 mtu 9000 >> >> options=4019b >> ether 00:25:90:24:70:e9 >> inet 10.11.11.1 netmask 0xffffff00 broadcast 10.11.11.255 >> inet 10.11.11.11 netmask 0xffffff00 broadcast 10.11.11.255 >> inet 10.11.11.12 netmask 0xffffff00 broadcast 10.11.11.255 >> inet 10.11.11.13 netmask 0xffffff00 broadcast 10.11.11.255 >> inet 10.11.11.14 netmask 0xffffff00 broadcast 10.11.11.255 >> inet 10.11.11.15 netmask 0xffffff00 broadcast 10.11.11.255 >> inet 10.11.11.16 netmask 0xffffff00 broadcast 10.11.11.255 >> inet 10.11.11.17 netmask 0xffffff00 broadcast 10.11.11.255 >> inet 10.11.11.18 netmask 0xffffff00 broadcast 10.11.11.255 >> inet 10.11.11.19 netmask 0xffffff00 broadcast 10.11.11.255 >> inet 10.11.11.20 netmask 0xffffff00 broadcast 10.11.11.255 >> inet 10.11.11.21 netmask 0xffffff00 broadcast 10.11.11.255 >> inet 10.11.11.22 netmask 0xffffff00 broadcast 10.11.11.255 >> inet 10.11.11.23 netmask 0xffffff00 broadcast 10.11.11.255 >> inet 10.11.11.24 netmask 0xffffff00 broadcast 10.11.11.255 >> inet 10.11.11.25 netmask 0xffffff00 broadcast 10.11.11.255 >> inet 10.11.11.26 netmask 0xffffff00 broadcast 10.11.11.255 >> inet 10.11.11.27 netmask 0xffffff00 broadcast 10.11.11.255 >> inet 10.11.11.28 netmask 0xffffff00 broadcast 10.11.11.255 >> nd6 options=1 >> media: Ethernet autoselect >> status: active >> laggproto lacp lagghash l2,l3,l4 >> laggport: igb1 flags=1c >> laggport: em1 flags=1c >> >> current sysctl.conf (not fixed after upgrade): >> # Every socket is a file, so increase them >> kern.maxfiles=204800 >> kern.maxfilesperproc=200000 >> kern.ipc.maxsockets=204800 >> >> # Increase max
command-line length showed in `ps` (e.g for Tomcat/Java) >> # Default is PAGE_SIZE / 16 or 256 on x86 >> # For more info see: http://www.freebsd.org/cgi/query-pr.cgi?pr=120749 >> kern.ps_arg_cache_limit=4096 >> >> # Security >> #net.inet.udp.blackhole=1 >> #net.inet.tcp.blackhole=2 >> >> kern.ipc.maxsockbuf=16777216 >> kern.ipc.nmbclusters=65535 >> kern.ipc.somaxconn=32768 >> #kern.maxfiles=65535 >> kern.maxvnodes=800000 >> >> vfs.zfs.l2arc_noprefetch=0 >> vfs.zfs.l2arc_write_max=16777216 >> vfs.zfs.l2arc_write_boost=16777216 >> >> net.inet.tcp.sendspace=65535 >> net.inet.tcp.recvspace=131072 >> net.inet.tcp.mssdflt=1452 >> net.inet.tcp.sendbuf_max=16777216 >> net.inet.tcp.sendbuf_inc=524288 >> net.inet.tcp.recvbuf_max=16777216 >> net.inet.tcp.recvbuf_inc=524288 >> net.inet.udp.recvspace=65535 >> net.inet.udp.maxdgram=65535 >> net.local.stream.recvspace=65535 >> net.local.stream.sendspace=65535 >> net.inet.tcp.delayed_ack=0 >> >> >> Any clues?
>> >> Please mail directly to me or cc to sysop@prisjakt.nu >> >> Regards >> Maxim >> >> >> _______________________________________________ >> freebsd-net@freebsd.org mailing list >> http://lists.freebsd.org/mailman/listinfo/freebsd-net >> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" >> > _______________________________________________ > freebsd-net@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-net > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" _______________________________________________ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" From owner-freebsd-net@FreeBSD.ORG Fri Mar 8 20:28:13 2013 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 08E08E80 for ; Fri, 8 Mar 2013 20:28:13 +0000 (UTC) (envelope-from wollman@hergotha.csail.mit.edu) Received: from hergotha.csail.mit.edu (wollman-1-pt.tunnel.tserv4.nyc4.ipv6.he.net [IPv6:2001:470:1f06:ccb::2]) by mx1.freebsd.org (Postfix) with ESMTP id 9464B8E2 for ; Fri, 8 Mar 2013 20:28:12 +0000 (UTC) Received: from hergotha.csail.mit.edu (localhost [127.0.0.1]) by hergotha.csail.mit.edu (8.14.5/8.14.5) with ESMTP id r28KSBKx005762; Fri, 8 Mar 2013 15:28:11 -0500 (EST) (envelope-from wollman@hergotha.csail.mit.edu) Received: (from wollman@localhost) by hergotha.csail.mit.edu (8.14.5/8.14.4/Submit) id r28KSBQV005759; Fri, 8 Mar 2013 15:28:11 -0500 (EST) (envelope-from wollman) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-ID: <20794.18907.530374.164737@hergotha.csail.mit.edu> Date: Fri, 8 Mar 2013 15:28:11 -0500 From: Garrett Wollman To: Jack Vogel Subject: UNS: Re: Limits on jumbo mbuf cluster allocation In-Reply-To: References:
<20793.36593.774795.720959@hergotha.csail.mit.edu> <51399926.6020201@freebsd.org> <20794.6692.191898.682241@hergotha.csail.mit.edu> <513A2887.2010408@freebsd.org> X-Mailer: VM 7.17 under 21.4 (patch 22) "Instant Classic" XEmacs Lucid X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.2.7 (hergotha.csail.mit.edu [127.0.0.1]); Fri, 08 Mar 2013 15:28:11 -0500 (EST) X-Spam-Status: No, score=-1.0 required=5.0 tests=ALL_TRUSTED autolearn=disabled version=3.3.2 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on hergotha.csail.mit.edu Cc: freebsd-net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 08 Mar 2013 20:28:13 -0000 < said: > Yes, in the past the code was in this form, it should work fine Garrett, > just make sure > the 4K pool is large enough. I take it then that the hardware works in the traditional way, and just keeps on using buffers until the packet is completely written, then sets a field on the ring descriptor saying "the end of the packet is HERE"? I'll give that change a try when I get a chance. 
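As a rough sketch of the behaviour being asked about, assuming 4096-byte receive buffers (MJUMPAGESIZE on a machine with 4 KB pages) and a descriptor that carries only a length and an end-of-packet flag. The names rx_wb_desc, rx_scatter and rx_frame_length are invented for this example and are not the driver's.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical write-back descriptor: only the fields needed here. */
struct rx_wb_desc {
	size_t length;	/* bytes the hardware wrote into this buffer */
	bool   eop;	/* set only on the descriptor that ends the frame */
};

/*
 * Model of the hardware side: spread one frame over fixed-size buffers.
 * Returns the number of descriptors used, or 0 if the frame is empty
 * or does not fit in 'max' descriptors.
 */
size_t
rx_scatter(size_t frame_len, size_t buf_size, struct rx_wb_desc *d, size_t max)
{
	size_t n = 0;

	if (frame_len == 0 || buf_size == 0)
		return 0;
	while (frame_len > 0) {
		if (n == max)
			return 0;
		d[n].length = (frame_len < buf_size) ? frame_len : buf_size;
		frame_len -= d[n].length;
		d[n].eop = (frame_len == 0);
		n++;
	}
	return n;
}

/*
 * Model of the driver side: walk descriptors until the end-of-packet
 * flag and add up the lengths. Returns 0 if no flag is seen.
 */
size_t
rx_frame_length(const struct rx_wb_desc *d, size_t n)
{
	size_t total = 0;

	for (size_t i = 0; i < n; i++) {
		total += d[i].length;
		if (d[i].eop)
			return total;
	}
	return 0;
}
```

Under these assumptions a 9000-byte frame occupies three descriptors of 4096, 4096 and 808 bytes, with the flag set only on the third.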
-GAWollman From owner-freebsd-net@FreeBSD.ORG Fri Mar 8 20:33:11 2013 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 6FC7AD6; Fri, 8 Mar 2013 20:33:11 +0000 (UTC) (envelope-from jfvogel@gmail.com) Received: from mail-vb0-x22b.google.com (mail-vb0-x22b.google.com [IPv6:2607:f8b0:400c:c02::22b]) by mx1.freebsd.org (Postfix) with ESMTP id 166C090C; Fri, 8 Mar 2013 20:33:11 +0000 (UTC) Received: by mail-vb0-f43.google.com with SMTP id fs19so818997vbb.30 for ; Fri, 08 Mar 2013 12:33:10 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:x-received:in-reply-to:references:date:message-id :subject:from:to:cc:content-type; bh=/iBg7AYMgRUajUfsdvZt+hW9WgMVNk0duoiVzI09uNY=; b=FscbCUFsB28Z6Q/CpT5RGNkHo3ZQY/PQtVk4uOBx3SG8nZWReOvpsGAfhVeRX/czH6 qAssvofTgp4S6rr6yD4uTka81b2gzVfRxEdazuXqOHWVdVJ5cxwakOshNdmIUnJNt2uh hYLIGRjGKVT8R2tumDR5FqmxI0FGdGMl6fga2t3kC+T/66w8kTfvqPNpE6nAUkGSlyQA v7Now3ABgoPQP6mfjRLlljxM2U1HgvZc5cNKVxM29rdBX0uDDNRwXe40Dq2I8syoG8Fj XXwYnMp9qnl6xHqmzaJURwOR7j+KdyxQ4Xzm/Ga5GxMOYoBW6mxBZhoT4/xRo4ccuhz9 KbnQ== MIME-Version: 1.0 X-Received: by 10.52.19.239 with SMTP id i15mr1317983vde.47.1362774789931; Fri, 08 Mar 2013 12:33:09 -0800 (PST) Received: by 10.220.191.132 with HTTP; Fri, 8 Mar 2013 12:33:09 -0800 (PST) In-Reply-To: <20794.18907.530374.164737@hergotha.csail.mit.edu> References: <20793.36593.774795.720959@hergotha.csail.mit.edu> <51399926.6020201@freebsd.org> <20794.6692.191898.682241@hergotha.csail.mit.edu> <513A2887.2010408@freebsd.org> <20794.18907.530374.164737@hergotha.csail.mit.edu> Date: Fri, 8 Mar 2013 12:33:09 -0800 Message-ID: Subject: Re: UNS: Re: Limits on jumbo mbuf cluster allocation From: Jack Vogel To: Garrett Wollman Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.14 Cc: freebsd-net@freebsd.org X-BeenThere: freebsd-net@freebsd.org 
X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 08 Mar 2013 20:33:11 -0000 Yes, the write-back descriptor has a bit in the status field that says its EOP (end of packet) or not. Jack On Fri, Mar 8, 2013 at 12:28 PM, Garrett Wollman wrote: > < said: > > > Yes, in the past the code was in this form, it should work fine Garrett, > > just make sure > > the 4K pool is large enough. > > I take it then that the hardware works in the traditional way, and > just keeps on using buffers until the packet is completely written, > then sets a field on the ring descriptor saying "the end of the packet > is HERE"? > > I'll give that change a try when I get a chance. > > -GAWollman > > From owner-freebsd-net@FreeBSD.ORG Fri Mar 8 20:45:42 2013 Return-Path: Delivered-To: freebsd-net@smarthost.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 5736C533; Fri, 8 Mar 2013 20:45:42 +0000 (UTC) (envelope-from melifaro@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:1900:2254:206c::16:87]) by mx1.freebsd.org (Postfix) with ESMTP id 1A30198A; Fri, 8 Mar 2013 20:45:42 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.6/8.14.6) with ESMTP id r28Kjflt047423; Fri, 8 Mar 2013 20:45:41 GMT (envelope-from melifaro@freefall.freebsd.org) Received: (from melifaro@localhost) by freefall.freebsd.org (8.14.6/8.14.6/Submit) id r28Kjf63047422; Fri, 8 Mar 2013 20:45:41 GMT (envelope-from melifaro) Date: Fri, 8 Mar 2013 20:45:41 GMT Message-Id: <201303082045.r28Kjf63047422@freefall.freebsd.org> To: melifaro@FreeBSD.org, freebsd-net@FreeBSD.org, melifaro@FreeBSD.org From: melifaro@FreeBSD.org Subject: Re: kern/155772: ifconfig(8): ioctl (SIOCAIFADDR): File exists on directly connected networks 
X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 08 Mar 2013 20:45:42 -0000 Synopsis: ifconfig(8): ioctl (SIOCAIFADDR): File exists on directly connected networks Responsible-Changed-From-To: freebsd-net->melifaro Responsible-Changed-By: melifaro Responsible-Changed-When: Fri Mar 8 20:45:18 UTC 2013 Responsible-Changed-Why: Take http://www.freebsd.org/cgi/query-pr.cgi?pr=155772 From owner-freebsd-net@FreeBSD.ORG Fri Mar 8 20:51:06 2013 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 5D08668E for ; Fri, 8 Mar 2013 20:51:06 +0000 (UTC) (envelope-from vegeta@tuxpowered.net) Received: from mail-ee0-f53.google.com (mail-ee0-f53.google.com [74.125.83.53]) by mx1.freebsd.org (Postfix) with ESMTP id B81859C4 for ; Fri, 8 Mar 2013 20:51:05 +0000 (UTC) Received: by mail-ee0-f53.google.com with SMTP id e53so1288731eek.40 for ; Fri, 08 Mar 2013 12:51:04 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20120113; h=x-received:from:to:subject:date:user-agent:cc:references :in-reply-to:mime-version:content-type:content-transfer-encoding :message-id:x-gm-message-state; bh=XWil0kz7DOw9U0/ENqWwDm2MDNiKp778Re89w4pDYG4=; b=duT3y2usus5Fq8hj6ewPBTLM4XKJ5tAK5O7ilSWUHK+je4NBmTpMm4Peo3wyzzIwvx ARhM+e3cs8WjBNd/5Wp7CMbzwT7fTiO31PAMkG6iupw2TitOPGYyMqBfJTSE7zp21/y4 h8eWSvBjAa2vfbCwnUxSp4mOouU0OL5vqJDmRuA6zcU/NlG2OVAqDYtNtLDiU3ovTuk6 /CzAKRZ9oD5T+/kriioHFH2itJODE5XpzY4mExy0uZT9hCLiumJhJe2VmIDCxwdYVwF4 VL5vcXZeFnt10pvPs8C33XHoYZxtsnlN7iMIgS/grbG1E5BqRGtBQBDF0AtFKSUQSbLJ jrxQ== X-Received: by 10.14.0.135 with SMTP id 7mr9518352eeb.5.1362775864101; Fri, 08 Mar 2013 12:51:04 -0800 (PST) Received: from zvezda.localnet ([37.83.50.199]) by mx.google.com with 
ESMTPS id s3sm9728785eem.4.2013.03.08.12.51.02 (version=TLSv1 cipher=RC4-SHA bits=128/128); Fri, 08 Mar 2013 12:51:03 -0800 (PST) From: Kajetan Staszkiewicz To: Ermal Luçi Subject: Re: [patch] Source entries removing is awfully slow. Date: Fri, 8 Mar 2013 21:51:00 +0100 User-Agent: KMail/1.13.5 (Linux/3.6.6-vegeta.1; KDE/4.4.5; x86_64; ; ) References: <201303081419.17743.vegeta@tuxpowered.net> In-Reply-To: MIME-Version: 1.0 Content-Type: Text/Plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <201303082151.00895.vegeta@tuxpowered.net> X-Gm-Message-State: ALoCoQlvCQaYqgOa7UB12jEbGoAdKsTZFhU4qyepSvGKwF88C7hO+jHjs4wrr06SVtqLNw0uP+5q Cc: "freebsd-net@freebsd.org" , "freebsd-pf@freebsd.org" X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 08 Mar 2013 20:51:06 -0000 On Friday, 8 March 2013 at 21:11:43, Ermal Luçi wrote: > Is this FreeBSD 9.x or HEAD? I found the problem and developed the patch on 9.1.
-- | pozdrawiam / greetings | powered by Debian, CentOS and FreeBSD | | Kajetan Staszkiewicz | jabber,email: vegeta()tuxpowered net | | Vegeta | www: http://vegeta.tuxpowered.net | `------------------------^---------------------------------------' From owner-freebsd-net@FreeBSD.ORG Fri Mar 8 23:14:14 2013 Return-Path: Delivered-To: freebsd-net@smarthost.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 75FDAA5D; Fri, 8 Mar 2013 23:14:14 +0000 (UTC) (envelope-from linimon@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:1900:2254:206c::16:87]) by mx1.freebsd.org (Postfix) with ESMTP id 3C97DECD; Fri, 8 Mar 2013 23:14:14 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.6/8.14.6) with ESMTP id r28NEE6c077371; Fri, 8 Mar 2013 23:14:14 GMT (envelope-from linimon@freefall.freebsd.org) Received: (from linimon@localhost) by freefall.freebsd.org (8.14.6/8.14.6/Submit) id r28NEEud077370; Fri, 8 Mar 2013 23:14:14 GMT (envelope-from linimon) Date: Fri, 8 Mar 2013 23:14:14 GMT Message-Id: <201303082314.r28NEEud077370@freefall.freebsd.org> To: linimon@FreeBSD.org, freebsd-bugs@FreeBSD.org, freebsd-net@FreeBSD.org From: linimon@FreeBSD.org Subject: Re: kern/176764: [net] [if_bridge] [patch] use-after-free in if_bridge X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 08 Mar 2013 23:14:14 -0000 Old Synopsis: [net] [if_bridge] use-after-free in if_bridge New Synopsis: [net] [if_bridge] [patch] use-after-free in if_bridge Responsible-Changed-From-To: freebsd-bugs->freebsd-net Responsible-Changed-By: linimon Responsible-Changed-When: Fri Mar 8 23:13:52 UTC 2013 Responsible-Changed-Why: Over to maintainer(s).
http://www.freebsd.org/cgi/query-pr.cgi?pr=176764 From owner-freebsd-net@FreeBSD.ORG Sat Mar 9 00:48:22 2013 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id CC08DDF5; Sat, 9 Mar 2013 00:48:22 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from esa-jnhn.mail.uoguelph.ca (esa-jnhn.mail.uoguelph.ca [131.104.91.44]) by mx1.freebsd.org (Postfix) with ESMTP id 39715224; Sat, 9 Mar 2013 00:48:21 +0000 (UTC) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AqEEAIyFOlGDaFvO/2dsb2JhbABDiCW8NoFzdIIsAQEBAwEBAQEgBCcgCxsYAgINGQIpAQkmBggHBAEcBIdsBgypZ5I3gSOMMwV9NAeCLYETA4hxiySCPoEej1SDKE99CBce X-IronPort-AV: E=Sophos;i="4.84,810,1355115600"; d="scan'208";a="20199368" Received: from erie.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca) ([131.104.91.206]) by esa-jnhn.mail.uoguelph.ca with ESMTP; 08 Mar 2013 19:47:13 -0500 Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1]) by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id A6806B4032; Fri, 8 Mar 2013 19:47:13 -0500 (EST) Date: Fri, 8 Mar 2013 19:47:13 -0500 (EST) From: Rick Macklem To: Garrett Wollman Message-ID: <2050712270.3721724.1362790033662.JavaMail.root@erie.cs.uoguelph.ca> In-Reply-To: <20794.7012.265887.99878@hergotha.csail.mit.edu> Subject: Re: Limits on jumbo mbuf cluster allocation MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Originating-IP: [172.17.91.201] X-Mailer: Zimbra 6.0.10_GA_2692 (ZimbraWebClient - FF3.0 (Win)/6.0.10_GA_2692) Cc: jfv@freebsd.org, freebsd-net@freebsd.org, Andre Oppermann , Garrett Wollman X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 09 Mar 2013 00:48:22 -0000 Garrett Wollman wrote: > < said: > > > [stuff I 
wrote deleted] > > You have an amd64 kernel running HEAD or 9.x? > > Yes, these are 9.1 with some patches to reduce mutex contention on the > NFS server's replay "cache". > The cached replies are copies of the mbuf list done via m_copym(). As such, the clusters in these replies won't be freed (ref cnt -> 0) until the cache is trimmed (nfsrv_trimcache() gets called after the TCP layer has received an ACK for receipt of the reply from the client). If reducing the size to 4K doesn't fix the problem, you might want to consider shrinking the tunable vfs.nfsd.tcphighwater and suffering the increased CPU overhead (and some increased mutex contention) of calling nfsrv_trimcache() more frequently. (I'm assuming that you are using drc2.patch + drc3.patch. If you are using one of ivoras@'s variants of the patch, I'm not sure if the tunable is called the same thing, although it should have basically the same effect.) Good luck with it and thanks for running on the "bleeding edge" so these issues get identified, rick > > Jumbo pages come directly from the kernel_map which on amd64 is > > 512GB. > > So KVA shouldn't be a problem. Your problem indeed appears to come > > from physical memory fragmentation in pmap. > > I hadn't realized that they were physically contiguous, but that makes > perfect sense. > > > pages. Also since you're doing NFS serving almost all memory will be > > in use for file caching. > > I actually had the ZFS ARC tuned down to 64 GB (out of 96 GB physmem) > when I experienced this, but there are plenty of data structures in > the kernel that aren't subject to this limit and I could easily > imagine them checkerboarding physical memory to the point where no > contiguous three-page allocations were possible.
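
The "checkerboarding" described above can be sketched in a few lines of userland C. This is only an illustration under stated assumptions: 4 KB pages, a 9 KB (9216-byte) jumbo cluster, and a toy boolean page map. The names pages_needed(), count_free(), has_contiguous_run() and fill_checkerboard() are hypothetical and are not kernel interfaces.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed sizes for illustration: 4 KB pages, 9 KB jumbo clusters. */
#define PAGE_BYTES   4096u
#define JUMBO9_BYTES 9216u

/* Number of whole pages needed to hold `bytes` (rounds up). */
static size_t
pages_needed(size_t bytes)
{
	return (bytes + PAGE_BYTES - 1) / PAGE_BYTES;
}

/* Count how many pages in the toy map are free. */
static size_t
count_free(const bool *page_free, size_t npages)
{
	size_t i, n = 0;

	assert(page_free != NULL);
	for (i = 0; i < npages; i++)
		if (page_free[i])
			n++;
	return n;
}

/* Return true if the map holds at least `need` consecutive free pages. */
static bool
has_contiguous_run(const bool *page_free, size_t npages, size_t need)
{
	size_t i, run = 0;

	assert(page_free != NULL);
	if (need == 0)
		return true;
	for (i = 0; i < npages; i++) {
		run = page_free[i] ? run + 1 : 0;
		if (run >= need)
			return true;
	}
	return false;
}

/* Mark every other page free: an alternating free/used pattern. */
static void
fill_checkerboard(bool *page_free, size_t npages)
{
	size_t i;

	assert(page_free != NULL);
	for (i = 0; i < npages; i++)
		page_free[i] = (i % 2 == 0);
}
```

With the alternating pattern, half of the pages are free, yet has_contiguous_run(map, n, pages_needed(JUMBO9_BYTES)) is false, while a single-page request still succeeds. That is the reasoning behind trying page-sized (4K) clusters first.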
> > -GAWollman > > _______________________________________________ > freebsd-net@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-net > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" From owner-freebsd-net@FreeBSD.ORG Sat Mar 9 01:40:03 2013 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 95E148D8 for ; Sat, 9 Mar 2013 01:40:03 +0000 (UTC) (envelope-from wollman@hergotha.csail.mit.edu) Received: from hergotha.csail.mit.edu (wollman-1-pt.tunnel.tserv4.nyc4.ipv6.he.net [IPv6:2001:470:1f06:ccb::2]) by mx1.freebsd.org (Postfix) with ESMTP id 37E2436A for ; Sat, 9 Mar 2013 01:40:03 +0000 (UTC) Received: from hergotha.csail.mit.edu (localhost [127.0.0.1]) by hergotha.csail.mit.edu (8.14.5/8.14.5) with ESMTP id r291e1kw010071; Fri, 8 Mar 2013 20:40:01 -0500 (EST) (envelope-from wollman@hergotha.csail.mit.edu) Received: (from wollman@localhost) by hergotha.csail.mit.edu (8.14.5/8.14.4/Submit) id r291e1O7010068; Fri, 8 Mar 2013 20:40:01 -0500 (EST) (envelope-from wollman) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-ID: <20794.37617.822910.93537@hergotha.csail.mit.edu> Date: Fri, 8 Mar 2013 20:40:01 -0500 From: Garrett Wollman To: Rick Macklem Subject: Re: Limits on jumbo mbuf cluster allocation In-Reply-To: <2050712270.3721724.1362790033662.JavaMail.root@erie.cs.uoguelph.ca> References: <20794.7012.265887.99878@hergotha.csail.mit.edu> <2050712270.3721724.1362790033662.JavaMail.root@erie.cs.uoguelph.ca> X-Mailer: VM 7.17 under 21.4 (patch 22) "Instant Classic" XEmacs Lucid X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.2.7 (hergotha.csail.mit.edu [127.0.0.1]); Fri, 08 Mar 2013 20:40:01 -0500 (EST) X-Spam-Status: No, score=-1.0 required=5.0 tests=ALL_TRUSTED autolearn=disabled version=3.3.2 X-Spam-Checker-Version: SpamAssassin 3.3.2 
(2011-06-06) on hergotha.csail.mit.edu Cc: freebsd-net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 09 Mar 2013 01:40:03 -0000 < said: > If reducing the size to 4K doesn't fix the problem, you might want to > consider shrinking the tunable vfs.nfsd.tcphighwater and suffering > the increased CPU overhead (and some increased mutex contention) of > calling nfsrv_trimcache() more frequently. Can't do that -- the system becomes intolerably slow when it gets into that state, and seems to get stuck that way, such that the only way to restore performance is to increase the size of the "cache". (Essentially all of the nfsd service threads end up spinning most of the time, load average goes to N, and goodput goes to nearly nil.) It does seem like a lot of effort for an extreme edge case that, in practical terms, never happens. > (I'm assuming that you are using drc2.patch + drc3.patch. I believe that's what I have. If my kernel coding skills were less rusty, I'd fix it to have a separate cache-trimming thread. One other weird thing that I've noticed is that netstat(1) reports the send and receive queues on NFS connections as being far higher than I have the limits configured. Does NFS do something to override this? 
-GAWollman From owner-freebsd-net@FreeBSD.ORG Sat Mar 9 01:52:46 2013 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id B2E9AABF; Sat, 9 Mar 2013 01:52:46 +0000 (UTC) (envelope-from wollman@hergotha.csail.mit.edu) Received: from hergotha.csail.mit.edu (wollman-1-pt.tunnel.tserv4.nyc4.ipv6.he.net [IPv6:2001:470:1f06:ccb::2]) by mx1.freebsd.org (Postfix) with ESMTP id 5D8EB3E9; Sat, 9 Mar 2013 01:52:46 +0000 (UTC) Received: from hergotha.csail.mit.edu (localhost [127.0.0.1]) by hergotha.csail.mit.edu (8.14.5/8.14.5) with ESMTP id r291qj70010186; Fri, 8 Mar 2013 20:52:45 -0500 (EST) (envelope-from wollman@hergotha.csail.mit.edu) Received: (from wollman@localhost) by hergotha.csail.mit.edu (8.14.5/8.14.4/Submit) id r291qjvK010183; Fri, 8 Mar 2013 20:52:45 -0500 (EST) (envelope-from wollman) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-ID: <20794.38381.221980.5038@hergotha.csail.mit.edu> Date: Fri, 8 Mar 2013 20:52:45 -0500 From: Garrett Wollman To: Rick Macklem Subject: Re: NFS DRC size In-Reply-To: <2050712270.3721724.1362790033662.JavaMail.root@erie.cs.uoguelph.ca> References: <20794.7012.265887.99878@hergotha.csail.mit.edu> <2050712270.3721724.1362790033662.JavaMail.root@erie.cs.uoguelph.ca> X-Mailer: VM 7.17 under 21.4 (patch 22) "Instant Classic" XEmacs Lucid X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.2.7 (hergotha.csail.mit.edu [127.0.0.1]); Fri, 08 Mar 2013 20:52:45 -0500 (EST) X-Spam-Status: No, score=-1.0 required=5.0 tests=ALL_TRUSTED autolearn=disabled version=3.3.2 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on hergotha.csail.mit.edu Cc: freebsd-fs@freebsd.org, freebsd-net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: 
List-Subscribe: , X-List-Received-Date: Sat, 09 Mar 2013 01:52:46 -0000 < said: > The cached replies are copies of the mbuf list done via m_copym(). > As such, the clusters in these replies won't be freed (ref cnt -> 0) > until the cache is trimmed (nfsrv_trimcache() gets called after the > TCP layer has received an ACK for receipt of the reply from the client). I wonder if this bit is even working at all. In my experience, the size of the DRC quickly grows under load up to the maximum (or actually, slightly beyond), and never drops much below that level. On my production server right now, "nfsstat -se" reports:

Server Info:
   Getattr   Setattr    Lookup  Readlink      Read     Write    Create    Remove
  13036780    359901   1723623      3420  36397693  12385668    346590    109984
    Rename      Link   Symlink     Mkdir     Rmdir   Readdir  RdirPlus    Access
     45173        16    116791     14192      1176        24  12876747   3398533
     Mknod    Fsstat    Fsinfo  PathConf    Commit   LookupP   SetClId SetClIdCf
         0      2703     14992      7502   1329196         0         1         1
      Open  OpenAttr OpenDwnGr  OpenCfrm DelePurge   DeleRet     GetFH      Lock
    263034         0         0    263019         0         0    545104         0
     LockT     LockU     Close    Verify   NVerify     PutFH  PutPubFH PutRootFH
         0         0    263012         0         0  23753375         0         1
     Renew RestoreFH    SaveFH   Secinfo RelLckOwn  V4Create
         2    263006    263033         0         0         0
Server:
 Retfailed    Faults   Clients
         0         0         1
 OpenOwner     Opens LockOwner     Locks    Delegs
        56        10         0         0         0
Server Cache Stats:
    Inprog      Idem  Non-idem    Misses CacheSize   TCPPeak
         0         0         0  81714128     60997     61017

It's only been up for about the last 24 hours. Should I be setting the size limit to something truly outrageous, like 200,000? (I'd definitely need to deal with the mbuf cluster issue then!) The average request rate over this time is about 1000/s, but that includes several episodes of high-cpu spinning (which I resolved by increasing the DRC limit).
Meanwhile, some relevant bits from sysctl:

vfs.nfsd.udphighwater: 500
vfs.nfsd.tcphighwater: 61000
vfs.nfsd.minthreads: 16
vfs.nfsd.maxthreads: 64
vfs.nfsd.threads: 64
vfs.nfsd.request_space_used: 1416
vfs.nfsd.request_space_used_highest: 4284672
vfs.nfsd.request_space_high: 47185920
vfs.nfsd.request_space_low: 31457280
vfs.nfsd.request_space_throttled: 0
vfs.nfsd.request_space_throttle_count: 0

(I'd actually like to put maxthreads back up at 256, which is where I had it during testing, but I need to test that the jumbo-frames issue is fixed first. I did pre-production testing on a non-jumbo network.) -GAWollman From owner-freebsd-net@FreeBSD.ORG Sat Mar 9 12:14:25 2013 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 716F8F4F; Sat, 9 Mar 2013 12:14:25 +0000 (UTC) (envelope-from ermal.luci@gmail.com) Received: from mail-qe0-f42.google.com (mail-qe0-f42.google.com [209.85.128.42]) by mx1.freebsd.org (Postfix) with ESMTP id 21C54D0F; Sat, 9 Mar 2013 12:14:24 +0000 (UTC) Received: by mail-qe0-f42.google.com with SMTP id f6so1549085qej.1 for ; Sat, 09 Mar 2013 04:14:18 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:x-received:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type; bh=UHcs7vGdgH8kblpB8G3zinSns84zyrBia5+mdu0h4JI=; b=FaYXZcVSrwD+4NmmdszJDXIVru1k+chxA4egobAae5nsfklbiXzpU2nadlpVps1oWh zq76F8b+GoPKi4FW4H1x56YOOC8WsrYnSIT9dm5w4TZDdaRsmteMu05mjdLL/IEGRyR/ e0qRYx3bv3lH2fuVBzUQxD1JZYGDurLWcJvidIZZ68KesaCGx3rVJ9O9uX+cZVcO/n7H ROM8KbnmmrJ5/wtcktUCUEeFqfMvdplRp3R0TRQ4bazuIFnVBdSZfp4V3HWFjHeDpHS/ DCRQ6bc22+J0EKc0CNLg/rEz/klFMtYwPeOUOmfUZjMq41V3E50bljkiXPYO/T1VdOAs ziuA== MIME-Version: 1.0 X-Received: by 10.49.6.101 with SMTP id z5mr9322969qez.50.1362831258610; Sat, 09 Mar 2013 04:14:18 -0800 (PST) Sender: ermal.luci@gmail.com Received: by
10.49.27.197 with HTTP; Sat, 9 Mar 2013 04:14:16 -0800 (PST) In-Reply-To: <201303082151.00895.vegeta@tuxpowered.net> References: <201303081419.17743.vegeta@tuxpowered.net> <201303082151.00895.vegeta@tuxpowered.net> Date: Sat, 9 Mar 2013 13:14:16 +0100 X-Google-Sender-Auth: PO_l65cnq0c2RwQhae4xh5miDZE Message-ID: Subject: Re: [patch] Source entries removing is awfully slow. From: =?ISO-8859-1?Q?Ermal_Lu=E7i?= To: Kajetan Staszkiewicz Content-Type: multipart/mixed; boundary=047d7bea40f00ef2e504d77ce15b X-Content-Filtered-By: Mailman/MimeDel 2.1.14 Cc: "freebsd-net@freebsd.org" , "freebsd-pf@freebsd.org" X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 09 Mar 2013 12:14:25 -0000 --047d7bea40f00ef2e504d77ce15b Content-Type: text/plain; charset=ISO-8859-2 Content-Transfer-Encoding: quoted-printable On Fri, Mar 8, 2013 at 9:51 PM, Kajetan Staszkiewicz wrote: > On Friday, 8 March 2013 at 21:11:43, Ermal Luçi wrote: > > Is this FreeBSD 9.x or HEAD? > > I found the problem and developed the patch on 9.1. > > Can you please test this more 'beautiful' patch? It's similar to yours but also delays src state removal to the proper purge thread. Though the src node removal option through pfctl -K does a lot of work to clean things up, I still need to understand why it takes so much time for you to loop through 500K states. The purge thread does that every tick by partitioning it into a few per time slot, but minutes is still way too long. Can you please try to give a top -SH view from the time when this happens and a pfctl -vvsa output?
> -- > | pozdrawiam / greetings | powered by Debian, CentOS and FreeBSD | > | Kajetan Staszkiewicz | jabber,email: vegeta()tuxpowered net | > | Vegeta | www: http://vegeta.tuxpowered.net | > `------------------------^---------------------------------------' > --=20 Ermal --047d7bea40f00ef2e504d77ce15b Content-Type: application/octet-stream; name="state_unlink_optimization2.diff" Content-Disposition: attachment; filename="state_unlink_optimization2.diff" Content-Transfer-Encoding: base64 X-Attachment-Id: f_he2q1w430 ZGlmZiAtLWdpdCBhL3N5cy9jb250cmliL3BmL25ldC9wZi5jIGIvc3lzL2NvbnRyaWIvcGYvbmV0 L3BmLmMKaW5kZXggOWZiMDVhZS4uNGRmNDBjYyAxMDA2NDQKLS0tIGEvc3lzL2NvbnRyaWIvcGYv bmV0L3BmLmMKKysrIGIvc3lzL2NvbnRyaWIvcGYvbmV0L3BmLmMKQEAgLTcyMCw2ICs3MjAsOSBA QCBwZl9pbnNlcnRfc3JjX25vZGUoc3RydWN0IHBmX3NyY19ub2RlICoqc24sIHN0cnVjdCBwZl9y dWxlICpydWxlLAogCQkgICAgcnVsZS0+bWF4X3NyY19jb25uX3JhdGUubGltaXQsCiAJCSAgICBy dWxlLT5tYXhfc3JjX2Nvbm5fcmF0ZS5zZWNvbmRzKTsKIAorI2lmZGVmIF9fRnJlZUJTRF9fCisJ CVRBSUxRX0lOSVQoJigqc24pLT5zdGF0ZV9saXN0KTsKKyNlbmRpZgogCQkoKnNuKS0+YWYgPSBh ZjsKIAkJaWYgKHJ1bGUtPnJ1bGVfZmxhZyAmIFBGUlVMRV9SVUxFU1JDVFJBQ0sgfHwKIAkJICAg IHJ1bGUtPnJwb29sLm9wdHMgJiBQRl9QT09MX1NUSUNLWUFERFIpCkBAIC0xNDUzLDYgKzE0NTYs OSBAQCBwZl9wdXJnZV9leHBpcmVkX3NyY19ub2RlcyhpbnQgd2FzbG9ja2VkKQogI2VuZGlmCiB7 CiAJc3RydWN0IHBmX3NyY19ub2RlCQkqY3VyLCAqbmV4dDsKKyNpZmRlZiBfX0ZyZWVCU0RfXwor CXN0cnVjdCBwZl9zdGF0ZQkJCSpzOworI2VuZGlmCiAJaW50CQkJCSBsb2NrZWQgPSB3YXNsb2Nr ZWQ7CiAKICNpZmRlZiBfX0ZyZWVCU0RfXwpAQCAtMTQ4Niw2ICsxNDkyLDEyIEBAIHBmX3B1cmdl X2V4cGlyZWRfc3JjX25vZGVzKGludCB3YXNsb2NrZWQpCiAJCQkJCXBmX3JtX3J1bGUoTlVMTCwg Y3VyLT5ydWxlLnB0cik7CiAJCQl9CiAjaWZkZWYgX19GcmVlQlNEX18KKwkJCXdoaWxlICghVEFJ TFFfRU1QVFkoJmN1ci0+c3RhdGVfbGlzdCkpIHsKKwkJCQlzID0gVEFJTFFfRklSU1QoJmN1ci0+ c3RhdGVfbGlzdCk7CisJCQkJVEFJTFFfUkVNT1ZFKCZjdXItPnN0YXRlX2xpc3QsIHMsIHNyY25v ZGVfbGluayk7CisJCQkJcy0+c3JjX25vZGUgPSBOVUxMOworCQkJCXMtPm5hdF9zcmNfbm9kZSA9 IE5VTEw7CisJCQl9CiAJCQlSQl9SRU1PVkUocGZfc3JjX3RyZWUsICZWX3RyZWVfc3JjX3RyYWNr 
aW5nLCBjdXIpOwogCQkJVl9wZl9zdGF0dXMuc2NvdW50ZXJzW1NDTlRfU1JDX05PREVfUkVNT1ZB TFNdKys7CiAJCQlWX3BmX3N0YXR1cy5zcmNfbm9kZXMtLTsKQEAgLTE1MjksNiArMTU0MSwxMCBA QCBwZl9zcmNfdHJlZV9yZW1vdmVfc3RhdGUoc3RydWN0IHBmX3N0YXRlICpzKQogI2VuZGlmCiAJ CQlzLT5zcmNfbm9kZS0+ZXhwaXJlID0gdGltZV9zZWNvbmQgKyB0aW1lb3V0OwogCQl9CisjaWZk ZWYgX19GcmVlQlNEX18KKwkJaWYgKCFUQUlMUV9FTVBUWSgmcy0+c3JjX25vZGUtPnN0YXRlX2xp c3QpKQorCQkJVEFJTFFfUkVNT1ZFKCZzLT5zcmNfbm9kZS0+c3RhdGVfbGlzdCwgcywgc3Jjbm9k ZV9saW5rKTsKKyNlbmRpZgogCX0KIAlpZiAocy0+bmF0X3NyY19ub2RlICE9IHMtPnNyY19ub2Rl ICYmIHMtPm5hdF9zcmNfbm9kZSAhPSBOVUxMKSB7CiAJCWlmICgtLXMtPm5hdF9zcmNfbm9kZS0+ c3RhdGVzIDw9IDApIHsKQEAgLTE1NDIsNiArMTU1OCwxMCBAQCBwZl9zcmNfdHJlZV9yZW1vdmVf c3RhdGUoc3RydWN0IHBmX3N0YXRlICpzKQogI2VuZGlmCiAJCQlzLT5uYXRfc3JjX25vZGUtPmV4 cGlyZSA9IHRpbWVfc2Vjb25kICsgdGltZW91dDsKIAkJfQorI2lmZGVmIF9fRnJlZUJTRF9fCisJ CWlmICghVEFJTFFfRU1QVFkoJnMtPm5hdF9zcmNfbm9kZS0+c3RhdGVfbGlzdCkpCisJCQlUQUlM UV9SRU1PVkUoJnMtPm5hdF9zcmNfbm9kZS0+c3RhdGVfbGlzdCwgcywgc3Jjbm9kZV9saW5rKTsK KyNlbmRpZgogCX0KIAlzLT5zcmNfbm9kZSA9IHMtPm5hdF9zcmNfbm9kZSA9IE5VTEw7CiB9CkBA IC0zOTQ5LDggKzM5NjksMTggQEAgcGZfY3JlYXRlX3N0YXRlKHN0cnVjdCBwZl9ydWxlICpyLCBz dHJ1Y3QgcGZfcnVsZSAqbnIsIHN0cnVjdCBwZl9ydWxlICphLAogCQlwb29sX3B1dCgmcGZfc3Rh dGVfcGwsIHMpOwogI2VuZGlmCiAJCXJldHVybiAoUEZfRFJPUCk7CisjaWZkZWYgX19GcmVlQlNE X18KKwl9IGVsc2UgeworCQlpZiAoc24gIT0gTlVMTCkKKwkJCVRBSUxRX0lOU0VSVF9IRUFEKCZz bi0+c3RhdGVfbGlzdCwgcywgc3Jjbm9kZV9saW5rKTsKKwkJaWYgKG5zbiAhPSBOVUxMKQorCQkJ VEFJTFFfSU5TRVJUX0hFQUQoJm5zbi0+c3RhdGVfbGlzdCwgcywgc3Jjbm9kZV9saW5rKTsKKwkJ KnNtID0gczsKKwl9CisjZWxzZQogCX0gZWxzZQogCQkqc20gPSBzOworI2VuZGlmCiAKIAlwZl9z ZXRfcnRfaWZwKHMsIHBkLT5zcmMpOwkvKiBuZWVkcyBzLT5zdGF0ZV9rZXkgc2V0ICovCiAJaWYg KHRhZyA+IDApIHsKZGlmZiAtLWdpdCBhL3N5cy9jb250cmliL3BmL25ldC9wZl9pb2N0bC5jIGIv c3lzL2NvbnRyaWIvcGYvbmV0L3BmX2lvY3RsLmMKaW5kZXggM2IxMzBlNS4uMjg2NGU5YSAxMDA2 NDQKLS0tIGEvc3lzL2NvbnRyaWIvcGYvbmV0L3BmX2lvY3RsLmMKKysrIGIvc3lzL2NvbnRyaWIv 
cGYvbmV0L3BmX2lvY3RsLmMKQEAgLTM3ODksNyArMzc4OSw5IEBAIHBmaW9jdGwoZGV2X3QgZGV2 LCB1X2xvbmcgY21kLCBjYWRkcl90IGFkZHIsIGludCBmbGFncywgc3RydWN0IHByb2MgKnApCiAK IAljYXNlIERJT0NLSUxMU1JDTk9ERVM6IHsKIAkJc3RydWN0IHBmX3NyY19ub2RlCSpzbjsKKyNp Zm5kZWYgX19GcmVlQlNEX18KIAkJc3RydWN0IHBmX3N0YXRlCQkqczsKKyNlbmRpZgogCQlzdHJ1 Y3QgcGZpb2Nfc3JjX25vZGVfa2lsbCAqcHNuayA9CiAJCSAgICAoc3RydWN0IHBmaW9jX3NyY19u b2RlX2tpbGwgKilhZGRyOwogCQl1X2ludAkJCWtpbGxlZCA9IDA7CkBAIC0zODA4LDYgKzM4MTAs NyBAQCBwZmlvY3RsKGRldl90IGRldiwgdV9sb25nIGNtZCwgY2FkZHJfdCBhZGRyLCBpbnQgZmxh Z3MsIHN0cnVjdCBwcm9jICpwKQogCQkJCSZwc25rLT5wc25rX2RzdC5hZGRyLnYuYS5tYXNrLAog CQkJCSZzbi0+cmFkZHIsIHNuLT5hZikpIHsKIAkJCQkvKiBIYW5kbGUgc3RhdGUgdG8gc3JjX25v ZGUgbGlua2FnZSAqLworI2lmbmRlZiBfX0ZyZWVCU0RfXyAKIAkJCQlpZiAoc24tPnN0YXRlcyAh PSAwKSB7CiAJCQkJCVJCX0ZPUkVBQ0gocywgcGZfc3RhdGVfdHJlZV9pZCwKICNpZmRlZiBfX0Zy ZWVCU0RfXwpAQCAtMzgyMiwxMyArMzgyNSwxNiBAQCBwZmlvY3RsKGRldl90IGRldiwgdV9sb25n IGNtZCwgY2FkZHJfdCBhZGRyLCBpbnQgZmxhZ3MsIHN0cnVjdCBwcm9jICpwKQogCQkJCQl9CiAJ CQkJCXNuLT5zdGF0ZXMgPSAwOwogCQkJCX0KKyNlbmRpZgogCQkJCXNuLT5leHBpcmUgPSAxOwog CQkJCWtpbGxlZCsrOwogCQkJfQogCQl9CiAKKyNpZiAwCiAJCWlmIChraWxsZWQgPiAwKQogCQkJ cGZfcHVyZ2VfZXhwaXJlZF9zcmNfbm9kZXMoMSk7CisjZW5kaWYKIAogCQlwc25rLT5wc25rX2tp bGxlZCA9IGtpbGxlZDsKIAkJYnJlYWs7CmRpZmYgLS1naXQgYS9zeXMvY29udHJpYi9wZi9uZXQv cGZ2YXIuaCBiL3N5cy9jb250cmliL3BmL25ldC9wZnZhci5oCmluZGV4IGRhYjcwYzUuLmUzMWQz OWQgMTAwNjQ0Ci0tLSBhL3N5cy9jb250cmliL3BmL25ldC9wZnZhci5oCisrKyBiL3N5cy9jb250 cmliL3BmL25ldC9wZnZhci5oCkBAIC03MzksNiArNzM5LDkgQEAgc3RydWN0IHBmX3NyY19ub2Rl IHsKIAlzdHJ1Y3QgcGZfYWRkcgkgcmFkZHI7CiAJdW5pb24gcGZfcnVsZV9wdHIgcnVsZTsKIAlz dHJ1Y3QgcGZpX2tpZgkqa2lmOworI2lmZGVmIF9fRnJlZUJTRF9fCisJVEFJTFFfSEVBRCgsIHBm X3N0YXRlKQlzdGF0ZV9saXN0OworI2VuZGlmCiAJdV9pbnQ2NF90CSBieXRlc1syXTsKIAl1X2lu dDY0X3QJIHBhY2tldHNbMl07CiAJdV9pbnQzMl90CSBzdGF0ZXM7CkBAIC04NDAsNiArODQzLDkg QEAgc3RydWN0IHBmX3N0YXRlIHsKIAogCVRBSUxRX0VOVFJZKHBmX3N0YXRlKQkgc3luY19saXN0 
OwogCVRBSUxRX0VOVFJZKHBmX3N0YXRlKQkgZW50cnlfbGlzdDsKKyNpZmRlZiBfX0ZyZWVCU0Rf XworCVRBSUxRX0VOVFJZKHBmX3N0YXRlKQkgc3Jjbm9kZV9saW5rOworI2VuZGlmCiAJUkJfRU5U UlkocGZfc3RhdGUpCSBlbnRyeV9pZDsKIAlzdHJ1Y3QgcGZfc3RhdGVfcGVlcgkgc3JjOwogCXN0 cnVjdCBwZl9zdGF0ZV9wZWVyCSBkc3Q7Cg== --047d7bea40f00ef2e504d77ce15b-- From owner-freebsd-net@FreeBSD.ORG Sat Mar 9 12:15:05 2013 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 53D619D; Sat, 9 Mar 2013 12:15:05 +0000 (UTC) (envelope-from ermal.luci@gmail.com) Received: from mail-qe0-f53.google.com (mail-qe0-f53.google.com [209.85.128.53]) by mx1.freebsd.org (Postfix) with ESMTP id 06BC3D26; Sat, 9 Mar 2013 12:15:04 +0000 (UTC) Received: by mail-qe0-f53.google.com with SMTP id cz11so1542710qeb.12 for ; Sat, 09 Mar 2013 04:15:04 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:x-received:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type; bh=IF4nE86t6qj/q8ZZ8pG7hCBxHuKDGdQchvqsuaVRHYE=; b=G2dj1zF4+MwPcCmU68wZgHh3FooxhAyYhrNaCtjJtehhstrk4HXuh8wcmYH6ov2NEn +twD0AD5BCM2smDgXr9IwOC12/cYUBfLBj3poXGR6hlqKJB1a+p9uzo3h+kh6zgkZilt ah1Xfnm7u3cIVLEsUHrcDbZvLT6/1eqCgkt3KqtsXAWLj2atE2WS+ouoDEE1lGq3Onqu Begh2eyRgBrc2sUKK4F4J254DAiNePjUieRqEejcAQwmcr+DupoCJsBrBDNDQWsMhdz/ TGJ/LVDEJeJ6egMqnEhsdGIWihm76x6QkhsMdgh3jvMFlj0TkljWPLEQm4QmpWSwc/6j SMGg== MIME-Version: 1.0 X-Received: by 10.224.186.82 with SMTP id cr18mr8691238qab.64.1362831304317; Sat, 09 Mar 2013 04:15:04 -0800 (PST) Sender: ermal.luci@gmail.com Received: by 10.49.27.197 with HTTP; Sat, 9 Mar 2013 04:15:04 -0800 (PST) In-Reply-To: References: <201303081419.17743.vegeta@tuxpowered.net> <201303082151.00895.vegeta@tuxpowered.net> Date: Sat, 9 Mar 2013 13:15:04 +0100 X-Google-Sender-Auth: YuZhHC-J6WEuDQwMu0GUxbI9FRw Message-ID: Subject: Re: [patch] Source entries removing is awfully slow. 
From: =?ISO-8859-1?Q?Ermal_Lu=E7i?= To: Kajetan Staszkiewicz Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable X-Content-Filtered-By: Mailman/MimeDel 2.1.14 Cc: "freebsd-net@freebsd.org" , "freebsd-pf@freebsd.org" X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 09 Mar 2013 12:15:05 -0000 Also, do not forget to rebuild pfctl so that statistics are shown correctly. On Sat, Mar 9, 2013 at 1:14 PM, Ermal Luçi wrote: > > > > On Fri, Mar 8, 2013 at 9:51 PM, Kajetan Staszkiewicz < > vegeta@tuxpowered.net> wrote: > >> On Friday, 8 March 2013 at 21:11:43, Ermal Luçi wrote: >> > Is this FreeBSD 9.x or HEAD? >> >> I found the problem and developed the patch on 9.1. >> >> Can you please test this more 'beautiful' patch? > It's similar to yours but also delays src state removal to the proper purge > thread. > > Though the src node removal option through pfctl -K does a lot of work to > clean things up, > I still need to understand why it takes so much time for you to loop through > 500K states. > The purge thread does that every tick by partitioning it into a few per time > slot, but minutes is still way too long. > > Can you please try to give a top -SH view from the time when this happens > and a pfctl -vvsa output?
> > > >> -- >> | pozdrawiam / greetings | powered by Debian, CentOS and FreeBSD | >> | Kajetan Staszkiewicz | jabber,email: vegeta()tuxpowered net | >> | Vegeta | www: http://vegeta.tuxpowered.net | >> `------------------------^---------------------------------------' >> > > > > -- > Ermal > --=20 Ermal From owner-freebsd-net@FreeBSD.ORG Sat Mar 9 13:37:57 2013 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 8451DBDB for ; Sat, 9 Mar 2013 13:37:57 +0000 (UTC) (envelope-from vegeta@tuxpowered.net) Received: from mail-ea0-x22a.google.com (mail-ea0-x22a.google.com [IPv6:2a00:1450:4013:c01::22a]) by mx1.freebsd.org (Postfix) with ESMTP id F2FA01D5 for ; Sat, 9 Mar 2013 13:37:56 +0000 (UTC) Received: by mail-ea0-f170.google.com with SMTP id a15so551164eae.15 for ; Sat, 09 Mar 2013 05:37:55 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20120113; h=x-received:from:to:subject:date:user-agent:cc:references :in-reply-to:mime-version:content-type:content-transfer-encoding :message-id:x-gm-message-state; bh=ZfG3s3NFULdgi+srDZR0jl6TBJ+5Egt1AF4+r9mpUUI=; b=VNYp6mQA58OvDbOrGDKwua95jkxnt5bDmiRLkMFAoaIX0HmrjQv3OXLCkKSKc1ZK0z lm/4EXWW/SMwK6+2GAjHGU6584I61gwNEFQnsbFTP2smzhtK4oHjHJ0/i3YDrfjUjJdS D8Qole7pOddTcoQtqUa4w2FB0dquxQInAIsOFe6sH7ULwoz7LFm3VdCOQbpzJE2HDcWM mmVkekHVnJZQd2+hPiAO7DoT37TV7JRzoEj+Aicrt0zt8dnkPZOLX9aH0iotYPXqTO2G hvOr3kE0NGxB6dPC8zkU/YRfA5l4pKJjkco0X8S3bmmoarsHxPjPM0OHP5qs9bRviS16 FP6Q== X-Received: by 10.14.183.198 with SMTP id q46mr16472183eem.1.1362836275730; Sat, 09 Mar 2013 05:37:55 -0800 (PST) Received: from zvezda.localnet ([37.81.64.97]) by mx.google.com with ESMTPS id 44sm13262429eek.5.2013.03.09.05.37.53 (version=TLSv1 cipher=RC4-SHA bits=128/128); Sat, 09 Mar 2013 05:37:54 -0800 (PST) From: Kajetan Staszkiewicz To: Ermal =?utf-8?q?Lu=C3=A7i?= Subject: Re: [patch] Source entries removing 
is awfully slow. Date: Sat, 9 Mar 2013 14:37:51 +0100 User-Agent: KMail/1.13.5 (Linux/3.6.6-vegeta.1; KDE/4.4.5; x86_64; ; ) References: <201303081419.17743.vegeta@tuxpowered.net> <201303082151.00895.vegeta@tuxpowered.net> In-Reply-To: MIME-Version: 1.0 Content-Type: Text/Plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <201303091437.51945.vegeta@tuxpowered.net> X-Gm-Message-State: ALoCoQn09kjRt4d+P7fNlvvJYQ+w9TlP8yVULZMp79p7cm7tRAdsSin3D8UK8LSQTNNIBYIE08hQ Cc: "freebsd-net@freebsd.org" , "freebsd-pf@freebsd.org" X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 09 Mar 2013 13:37:57 -0000 On Saturday, 9 March 2013 at 13:14:16, Ermal Luçi wrote: > On Fri, Mar 8, 2013 at 9:51 PM, Kajetan Staszkiewicz > > wrote: > > On Friday, 8 March 2013 at 21:11:43, Ermal Luçi wrote: > > > Is this FreeBSD 9.x or HEAD? > > > > I found the problem and developed the patch on 9.1. > > > Can you please test this more 'beautiful' patch? Oh, somehow I did not notice the existing doubly linked list implementation. I'm quite new to kernel programming. > It's similar to yours but also delays src state removal to the proper purge > thread. I'll try it right after the weekend. > Though the src node removal option through pfctl -K does a lot of work to > clean things up, > I still need to understand why it takes so much time for you to loop through > 500K states. That is because the loop does not run just once. `pfctl -K 0.0.0.0/0 -K ip.of.internal.server.behind.this.loadbalancer` will match multiple Source entries, up to a thousand of them in normal conditions ("normal" for my loadbalancers) and many, many more when under a DDoS attack.
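
A minimal userland sketch of why the per-node list matters, assuming heavily simplified stand-ins for the pf structures. Only the TAILQ macros from <sys/queue.h> and the field names state_list and srcnode_link (taken from the diff attached earlier in this thread) are real; struct state, struct src_node and all four functions are hypothetical. The full scan visits every state once per matching Source entry, while the list walk visits only the states linked to that entry.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/queue.h>

struct src_node;

/* Simplified stand-in for a pf state entry (hypothetical). */
struct state {
	struct src_node		*src_node;	/* owning source node, or NULL */
	TAILQ_ENTRY(state)	 srcnode_link;	/* linkage on the node's list */
};

/* Simplified stand-in for a pf source node (hypothetical). */
struct src_node {
	TAILQ_HEAD(, state)	 state_list;	/* states referencing this node */
	unsigned int		 states;	/* number of linked states */
};

static void
src_node_init(struct src_node *sn)
{
	TAILQ_INIT(&sn->state_list);
	sn->states = 0;
}

/* Link a state to its source node when the state is created. */
static void
state_attach(struct state *s, struct src_node *sn)
{
	s->src_node = sn;
	TAILQ_INSERT_HEAD(&sn->state_list, s, srcnode_link);
	sn->states++;
}

/*
 * Full scan: look at every state in the table to find the ones that
 * point at sn. Returns the number of states visited (always nstates).
 */
static size_t
detach_by_full_scan(struct state *all, size_t nstates, struct src_node *sn)
{
	size_t i;

	for (i = 0; i < nstates; i++) {
		if (all[i].src_node == sn) {
			TAILQ_REMOVE(&sn->state_list, &all[i], srcnode_link);
			all[i].src_node = NULL;
			sn->states--;
		}
	}
	return nstates;
}

/*
 * List walk: visit only the states linked to sn.
 * Returns the number of states visited.
 */
static size_t
detach_by_list(struct src_node *sn)
{
	struct state *s;
	size_t visited = 0;

	while ((s = TAILQ_FIRST(&sn->state_list)) != NULL) {
		assert(s->src_node == sn);
		TAILQ_REMOVE(&sn->state_list, s, srcnode_link);
		s->src_node = NULL;
		visited++;
	}
	sn->states = 0;
	return visited;
}
```

With N states in the table and M matching Source entries, the full scan costs on the order of M * N visits in total, whereas the list walk touches each linked state once.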
> The purge thread does that every tick by partitioning it into a few per time > slot, but minutes is still way too long. > > Can you please try to give a top -SH view from the time when this happens and > a pfctl -vvsa output? I'll try on Monday, although as far as I remember the machine was quite frozen during this operation. -- | pozdrawiam / greetings | powered by Debian, CentOS and FreeBSD | | Kajetan Staszkiewicz | jabber,email: vegeta()tuxpowered net | | Vegeta | www: http://vegeta.tuxpowered.net | `------------------------^---------------------------------------' From owner-freebsd-net@FreeBSD.ORG Sat Mar 9 15:11:57 2013 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 81FDBE46; Sat, 9 Mar 2013 15:11:57 +0000 (UTC) (envelope-from ermal.luci@gmail.com) Received: from mail-qa0-f48.google.com (mail-qa0-f48.google.com [209.85.216.48]) by mx1.freebsd.org (Postfix) with ESMTP id 350727FA; Sat, 9 Mar 2013 15:11:57 +0000 (UTC) Received: by mail-qa0-f48.google.com with SMTP id j8so295744qah.0 for ; Sat, 09 Mar 2013 07:11:56 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:x-received:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type; bh=t8tBgO4mi0deOXxY60/CPlmuTr33/GZM/wLwrlvJua4=; b=DyMck+jw/XdCiePMWTFnQtBCr4St/A9AfdkCSSV75ALyi1Zp526hzee3Es+ZlSBMLH h8gWhjtI1wCgdIDzOtoiI1hLSL0VPSq9Ug3lWfrW68EzLI45PDdDMDWXGzEx8zW13SbL GdlnY0qsIDVw/ZuIWXtUpaEnubEFqLh57N1tJncoS0tAeOOBNx0XMwADeIZ47UzEA2LO FKO35jtOhIfAZe78ldaqx+6W1rqxVNYnyvOz/KC0ALcdbdRByLIYi3PtmSejYLY+IMA/ 5MfpdfbJ3Y9WrDOxVyBlcYdHtnOyVfkmebTk5egIOVK+m2zmJoGfftaXvTsvUm+6p9/e aVvA== MIME-Version: 1.0 X-Received: by 10.224.178.77 with SMTP id bl13mr9338639qab.13.1362841916475; Sat, 09 Mar 2013 07:11:56 -0800 (PST) Sender: ermal.luci@gmail.com Received: by 10.49.27.197 with HTTP; Sat, 9
Mar 2013 07:11:56 -0800 (PST) In-Reply-To: <201303091437.51945.vegeta@tuxpowered.net> References: <201303081419.17743.vegeta@tuxpowered.net> <201303082151.00895.vegeta@tuxpowered.net> <201303091437.51945.vegeta@tuxpowered.net> Date: Sat, 9 Mar 2013 16:11:56 +0100 X-Google-Sender-Auth: SDQcnfZIop-Qf76jdAFs98G2DVc Message-ID: Subject: Re: [patch] Source entries removing is awfully slow. From: =?ISO-8859-1?Q?Ermal_Lu=E7i?= To: Kajetan Staszkiewicz Content-Type: text/plain; charset=ISO-8859-2 Content-Transfer-Encoding: quoted-printable X-Content-Filtered-By: Mailman/MimeDel 2.1.14 Cc: "freebsd-net@freebsd.org" , "freebsd-pf@freebsd.org" X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 09 Mar 2013 15:11:57 -0000 On Sat, Mar 9, 2013 at 2:37 PM, Kajetan Staszkiewicz wrote: > On Saturday, 9 March 2013 at 13:14:16, Ermal Luçi wrote: > > On Fri, Mar 8, 2013 at 9:51 PM, Kajetan Staszkiewicz > > > > wrote: > > > On Friday, 8 March 2013 at 21:11:43, Ermal Luçi wrote: > > > > Is this FreeBSD 9.x or HEAD? > > > > > > I found the problem and developed the patch on 9.1. > > > > > Can you please test this more 'beautiful' patch? > > Oh, somehow I did not notice the existing doubly linked list > implementation. > I'm quite new to kernel programming. > > > It's similar to yours but also delays src state removal to the proper > purge > > thread. > > I'll try it right after the weekend. > > > Though the src node removal option through pfctl -K does a lot of work to > > clean things up, > > I still need to understand why it takes so much time for you to loop through > > 500K states. > > That is because the loop does not run just once.
> > `pfctl -K 0.0.0.0/0 -K ip.of.internal.server.behind.this.loadbalancer` > will > match multiple Source entries, up to a thousand of them in normal > conditions > ("normal" for my loadbalancers) and many, many more when under a DDoS > attack. > > I would expect proper software to kill the states from those clients first and then kill the srcnode for the backend server. It does not make sense to kill the src nodes without killing the states first, since the states are what will impact your connectivity. Although the patch improves your use case a lot, it would be even better to kill those states during this step as well, with an extra option, since otherwise you would have to issue a separate request for each of those clients. Do you control the application, so that you could test an extra addition to this patch that allows killing the linked states as well? > > The purge thread does that every tick by partitioning it into a few per > time > > slot, but minutes is still way too long. > > > > Can you please try to give a top -SH view from the time when this happens > and > > a pfctl -vvsa output? > > I'll try on Monday, although as far as I remember the machine was quite > frozen > during this operation.
> > -- > | pozdrawiam / greetings | powered by Debian, CentOS and FreeBSD | > | Kajetan Staszkiewicz | jabber,email: vegeta()tuxpowered net | > | Vegeta | www: http://vegeta.tuxpowered.net | > `------------------------^---------------------------------------' > -- Ermal From owner-freebsd-net@FreeBSD.ORG Sat Mar 9 16:15:47 2013 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id B5042E2B for ; Sat, 9 Mar 2013 16:15:47 +0000 (UTC) (envelope-from vegeta@tuxpowered.net) Received: from mail-ea0-x229.google.com (mail-ea0-x229.google.com [IPv6:2a00:1450:4013:c01::229]) by mx1.freebsd.org (Postfix) with ESMTP id 33C94E02 for ; Sat, 9 Mar 2013 16:15:46 +0000 (UTC) Received: by mail-ea0-f169.google.com with SMTP id z7so579293eaf.14 for ; Sat, 09 Mar 2013 08:15:46 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20120113; h=x-received:from:to:subject:date:user-agent:cc:references :in-reply-to:mime-version:content-type:content-transfer-encoding :message-id:x-gm-message-state; bh=87U5I1ti2Xiokb1uoM/2wytuL6lompIPNW4nwc8R2X8=; b=bpzguDRGogILgT5DTgJVShj4mbcVP47Bygb8Qo1ienaMCs9mPvcJM/BMRncKqo6Uwo sdxVk8Q4MlATwzBOeKBwgJjbSAaXMfehbS4DyCGoKe1+UtX3KJKD5T3VboQy62Y5HmeY 6IeXX3rqjwbhKq6k9BJ6Fy9ioCsdvy8qtaUFzzDQPyXq3Ti05KufFe02ReImImMJeTSa hh1D2PdZSehEdKDB5Bzja5SmsxvbN1Oqylmz/Q00jUoA8Tyl8Js+iNg6vn8Cb6W9UOX7 CZ2SStJ3PwzRYnOYZ5ZcvK4rGeiUtDnNQU2nBNvgDi5+qib6wXKnGo0BMWc7uIlbw4MI fm2w== X-Received: by 10.14.4.69 with SMTP id 45mr17622104eei.0.1362845745816; Sat, 09 Mar 2013 08:15:45 -0800 (PST) Received: from zvezda.localnet ([37.81.64.97]) by mx.google.com with ESMTPS id 3sm13797558eej.6.2013.03.09.08.15.43 (version=TLSv1 cipher=RC4-SHA bits=128/128); Sat, 09 Mar 2013 08:15:44 -0800 (PST) From: Kajetan Staszkiewicz To: Ermal Luçi Subject: Re: [patch] Source entries removing is awfully slow.
Date: Sat, 9 Mar 2013 17:15:42 +0100 User-Agent: KMail/1.13.5 (Linux/3.6.6-vegeta.1; KDE/4.4.5; x86_64; ; ) References: <201303081419.17743.vegeta@tuxpowered.net> <201303091437.51945.vegeta@tuxpowered.net> In-Reply-To: MIME-Version: 1.0 Content-Type: Text/Plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <201303091715.42624.vegeta@tuxpowered.net> X-Gm-Message-State: ALoCoQmKh3qvj6TlKdybwlU8fvTHcwi2t84HCc2J6fSoQ2inHWVn75kU0MrAhiWpCdSRyo/Pm4b3 Cc: "freebsd-net@freebsd.org" , "freebsd-pf@freebsd.org" X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 09 Mar 2013 16:15:47 -0000 On Saturday, 9 March 2013 at 16:11:56, you wrote: > > > Though the src node removal option through pfctl -K does a lot of work > > > to clean things up, > > > I still need to understand why it takes so much time for you to loop > > > through 500K states. > > > > That is because the loop will not be called just once. > > > > `pfctl -K 0.0.0.0/0 -K ip.of.internal.server.behind.this.loadbalancer` will > > match multiple Source entries, up to a thousand of them in normal conditions > > ("normal" for my loadbalancers) and many, many more when under a DDoS attack. > > I would expect proper software to kill the states from those clients first and > then kill the srcnode for the backend server. First of all, I do not know which clients are affected. I know which server is dead. But I cannot remove states to this server using pfctl, as the states are from the clients' public IP addresses to the loadbalancer's public IP address. Sources, on the other hand, point to the internal IP address of the broken server. Second, under normal conditions removing just a few states would not help performance.
Also, the server health-checking software is unaware of DDoS attacks and will not remove states resulting from the attack in advance. > It does not make sense to kill src nodes without killing the states first, since > the states are what affect your connectivity. I agree, it only makes sense to remove both sources and linked states at the same time. When only sources are removed, states still point to the broken server and clients remain connected to it over existing TCP connections. If the states were also removed, clients would lose all connectivity (which I prefer to them seeing wrong data) and (hopefully) reconnect to another live server. > Although the patch improves your use case a lot, it would be better still to > kill those states during this step as well, via an extra option, > since otherwise you would have to issue a separate request for each of those > clients. That will be in the updated version of the patch, which I hope to send to the list on Monday.
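As a rough illustration of the point above (this is not the patch under discussion): if source nodes and states are kept in separate collections, a dead backend can be removed by first collecting the matching source nodes and then making a single pass over the states, rather than rescanning every state once per source node. The `SrcNode` and `State` classes, their field names, and `kill_backend` are hypothetical stand-ins; pf's real kernel structures and pfctl's behaviour differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SrcNode:
    """Hypothetical source-tracking entry: client pinned to a backend."""
    client: str   # client's public address
    backend: str  # internal address of the server behind the load balancer


@dataclass
class State:
    """Hypothetical state entry: client -> load balancer, linked to a SrcNode."""
    client: str
    lb_addr: str
    src_node: SrcNode


def kill_backend(states, src_nodes, dead_backend):
    """Remove source nodes for dead_backend together with their linked states.

    Matching nodes are collected once into a set, so the states are scanned
    a single time instead of once per matching source node.
    """
    doomed = {n for n in src_nodes if n.backend == dead_backend}
    kept_states = [s for s in states if s.src_node not in doomed]
    kept_nodes = [n for n in src_nodes if n not in doomed]
    return kept_states, kept_nodes
```

With this shape, clients pinned to the dead backend lose their states in the same step, which matches the behaviour both posters say they want.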
-- | pozdrawiam / greetings | powered by Debian, CentOS and FreeBSD | | Kajetan Staszkiewicz | jabber,email: vegeta()tuxpowered net | | Vegeta | www: http://vegeta.tuxpowered.net | `------------------------^---------------------------------------' From owner-freebsd-net@FreeBSD.ORG Sat Mar 9 16:27:35 2013 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 50F51FD2; Sat, 9 Mar 2013 16:27:35 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from esa-annu.net.uoguelph.ca (esa-annu.mail.uoguelph.ca [131.104.91.36]) by mx1.freebsd.org (Postfix) with ESMTP id B9078E61; Sat, 9 Mar 2013 16:27:34 +0000 (UTC) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AqEEAE5iO1GDaFvO/2dsb2JhbABDiCi8OIF1dIItAQEBAwEBAQEgBCcgCwUWGAICDRkCKQEJJgYIBwQBHASHbAYMqT2SC4EjjCkKBX00B4ItgRMDiHGLJYI+gR6PVYMoT30IFx4 X-IronPort-AV: E=Sophos;i="4.84,814,1355115600"; d="scan'208";a="17907963" Received: from erie.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca) ([131.104.91.206]) by esa-annu.net.uoguelph.ca with ESMTP; 09 Mar 2013 11:27:32 -0500 Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1]) by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id BBA7BB4036; Sat, 9 Mar 2013 11:27:32 -0500 (EST) Date: Sat, 9 Mar 2013 11:27:32 -0500 (EST) From: Rick Macklem To: Garrett Wollman Message-ID: <1639798917.3728142.1362846452693.JavaMail.root@erie.cs.uoguelph.ca> In-Reply-To: <20794.38381.221980.5038@hergotha.csail.mit.edu> Subject: Re: NFS DRC size MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Originating-IP: [172.17.91.202] X-Mailer: Zimbra 6.0.10_GA_2692 (ZimbraWebClient - FF3.0 (Win)/6.0.10_GA_2692) Cc: freebsd-fs@freebsd.org, freebsd-net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post:
List-Help: List-Subscribe: , X-List-Received-Date: Sat, 09 Mar 2013 16:27:35 -0000 Garrett Wollman wrote: > Rick Macklem said: > > > The cached replies are copies of the mbuf list done via m_copym(). > > As such, the clusters in these replies won't be free'd (ref cnt -> 0) > > until the cache is trimmed (nfsrv_trimcache() gets called after the > > TCP layer has received an ACK for receipt of the reply from the > > client). > > I wonder if this bit is even working at all. In my experience, the > size of the DRC quickly grows under load up to the maximum (or > actually, slightly beyond), and never drops much below that level. On > my production server right now, "nfsstat -se" reports: > Well, once you add the patches and turn vfs.nfsd.tcphighwater up, it will only trim the cache when that highwater mark is exceeded. When it does the trim, the size does drop for the simple testing I do with a single client. (I'll take another look at drc3.patch and see if I can spot anywhere this might be broken, although my hunch is that you have a lot of TCP connections and enough activity that it rapidly grows back up to the limit.) The fact that it trims down to around the highwater mark basically indicates this is working. If it wasn't throwing away replies where the receipt has been ack'd at the TCP level, the cache would grow very large, since they would only be discarded after a loonnngg timeout (12 hours unless you've changed NFSRVCACHE_TCPTIMEOUT in sys/fs/nfs/nfs.h).
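The trimming rule described above can be sketched as a toy model. This is only an illustration of the policy as stated in this message (trim only above the highwater mark; drop replies that have been ACKed or that have outlived the 12-hour timeout), not the kernel code. `CachedReply`, `trim_cache`, and their fields are hypothetical names.

```python
from dataclasses import dataclass

TCP_TIMEOUT = 12 * 60 * 60  # seconds; the 12-hour default mentioned above


@dataclass
class CachedReply:
    """Hypothetical cache entry for one RPC reply on a TCP connection."""
    xid: int
    cached_at: float     # time the reply was cached, in seconds
    acked: bool = False  # True once TCP has ACKed receipt of the reply


def trim_cache(cache, highwater, now, timeout=TCP_TIMEOUT):
    """Return the cache after one trim pass.

    Nothing is trimmed while the cache is at or below the highwater mark.
    Above it, entries that were ACKed or are older than the timeout go away.
    """
    if len(cache) <= highwater:
        return cache
    return [e for e in cache
            if not e.acked and now - e.cached_at < timeout]
```

In this model the cache hovers around the highwater mark under steady load, which is the behaviour reported in the thread; a cache that stayed far above the mark would suggest ACKed replies were not being discarded.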
> Server Info:
>    Getattr   Setattr    Lookup  Readlink      Read     Write    Create    Remove
>   13036780    359901   1723623      3420  36397693  12385668    346590    109984
>     Rename      Link   Symlink     Mkdir     Rmdir   Readdir  RdirPlus    Access
>      45173        16    116791     14192      1176        24  12876747   3398533
>      Mknod    Fsstat    Fsinfo  PathConf    Commit   LookupP   SetClId SetClIdCf
>          0      2703     14992      7502   1329196         0         1         1
>       Open  OpenAttr OpenDwnGr  OpenCfrm DelePurge   DeleRet     GetFH      Lock
>     263034         0         0    263019         0         0    545104         0
>      LockT     LockU     Close    Verify   NVerify     PutFH  PutPubFH PutRootFH
>          0         0    263012         0         0  23753375         0         1
>      Renew RestoreFH    SaveFH   Secinfo RelLckOwn  V4Create
>          2    263006    263033         0         0         0
> Server:
>  Retfailed    Faults   Clients
>          0         0         1
>  OpenOwner     Opens LockOwner     Locks    Delegs
>         56        10         0         0         0
> Server Cache Stats:
>     Inprog      Idem  Non-idem    Misses CacheSize   TCPPeak
>          0         0         0  81714128     60997     61017
>
> It's only been up for about the last 24 hours. Should I be setting
> the size limit to something truly outrageous, like 200,000? (I'd
> definitely need to deal with the mbuf cluster issue then!) The
> average request rate over this time is about 1000/s, but that includes
> several episodes of high-cpu spinning (which I resolved by increasing
> the DRC limit).
>
It is the number of TCP connections from clients that determines how much gets cached, not the request rate. For TCP, a scheme like LRU doesn't work, because RPC retries (as opposed to TCP segment retransmits) only happen long after the initial RPC request. (Usually after a TCP connection has broken and the client has established a new connection, although some NFSv3 over TCP clients will retry an RPC after a long timeout.) The cache needs to hold the last N RPC replies for each TCP connection and discard them when further traffic on the TCP connection indicates that the connection is still working.
(Some NFSv3 over TCP servers don't guarantee to generate a reply for an RPC when resource constrained, but the FreeBSD one always sends a reply, except for NFSv2, where it will close down the TCP connection when it has no choice. I doubt any client is doing NFSv2 over TCP, so I don't consider this relevant.) If the CPU is spinning in nfsrc_trimcache() a lot, increasing vfs.nfsd.tcphighwater should decrease that, but with an increase in mbuf cluster allocation. If there is a lot of contention for mutexes, increasing the size of the hash table might help. The drc3.patch bumped the hash table from 20->200, but that would still be about 300 entries per hash list and one mutex for those 300 entries, assuming the hash function is working well. Increasing it only adds list head pointers and mutexes. (It's NFSRVCACHE_HASHSIZE in sys/fs/nfs/nfsrvcache.h.) Unfortunately, increasing it requires a kernel rebuild/reboot. Maybe the patch for head should change the size of the hash table when vfs.nfsd.tcphighwater is set much larger? (Not quite trivial and will probably result in a short stall of the nfsd threads, since all the entries will need to be rehashed/moved to new lists, but could be worth the effort.) > Meanwhile, some relevant bits from sysctl: > > vfs.nfsd.udphighwater: 500 > vfs.nfsd.tcphighwater: 61000 > vfs.nfsd.minthreads: 16 > vfs.nfsd.maxthreads: 64 > vfs.nfsd.threads: 64 > vfs.nfsd.request_space_used: 1416 > vfs.nfsd.request_space_used_highest: 4284672 > vfs.nfsd.request_space_high: 47185920 > vfs.nfsd.request_space_low: 31457280 > vfs.nfsd.request_space_throttled: 0 > vfs.nfsd.request_space_throttle_count: 0 > > (I'd actually like to put maxthreads back up at 256, which is where I > had it during testing, but I need to test that the jumbo-frames issue > is fixed first. I did pre-production testing on a non-jumbo network.) > > -GAWollman > Well, the DRC will try to cache replies until the client's TCP layer acknowledges receipt of the reply.
It is hard to say how many replies that is for a given TCP connection, since it is a function of the level of concurrency in the client (the number of nfsiod threads, for the FreeBSD client). I'd guess it's somewhere between 1 and 20? Multiply that by the number of TCP connections from all clients and you have about how big the server's DRC will be. (Some clients use a single TCP connection for the client whereas others use a separate TCP connection for each mount point.) When ivoras@ and I have a patch for head, it should probably allow the DRC to be disabled for TCP mounts (by setting vfs.nfsd.tcphighwater == -1?). I don't really like the idea, but I can see the argument that TCP maintains a reliable enough RPC transport that the DRC isn't needed. rick > _______________________________________________ > freebsd-net@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-net > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" From owner-freebsd-net@FreeBSD.ORG Sat Mar 9 16:50:31 2013 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id A911E222; Sat, 9 Mar 2013 16:50:31 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from esa-annu.net.uoguelph.ca (esa-annu.mail.uoguelph.ca [131.104.91.36]) by mx1.freebsd.org (Postfix) with ESMTP id 5D10AF92; Sat, 9 Mar 2013 16:50:31 +0000 (UTC) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AqEEAPNmO1GDaFvO/2dsb2JhbABDiCi8OIF1dIItAQEBAwEBAQEgBCcgCwUWGAICDRkCKQEJJgYIBwQBHASHbAYMqUmSDIEjjDh9NAeCLYETA4hxiyWCPoEej1WDKE+BBTU X-IronPort-AV: E=Sophos;i="4.84,814,1355115600"; d="scan'208";a="17909999" Received: from erie.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca) ([131.104.91.206]) by esa-annu.net.uoguelph.ca with ESMTP; 09 Mar 2013 11:50:30 -0500 Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1]) by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id
72B81B3F51; Sat, 9 Mar 2013 11:50:30 -0500 (EST) Date: Sat, 9 Mar 2013 11:50:30 -0500 (EST) From: Rick Macklem To: Garrett Wollman Message-ID: <1700261042.3728432.1362847830447.JavaMail.root@erie.cs.uoguelph.ca> In-Reply-To: <20794.37617.822910.93537@hergotha.csail.mit.edu> Subject: Re: Limits on jumbo mbuf cluster allocation MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Originating-IP: [172.17.91.202] X-Mailer: Zimbra 6.0.10_GA_2692 (ZimbraWebClient - FF3.0 (Win)/6.0.10_GA_2692) Cc: freebsd-net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 09 Mar 2013 16:50:31 -0000 Garrett Wollman wrote: > < said: > > > If reducing the size to 4K doesn't fix the problem, you might want > > to > > consider shrinking the tunable vfs.nfsd.tcphighwater and suffering > > the increased CPU overhead (and some increased mutex contention) of > > calling nfsrv_trimcache() more frequently. > > Can't do that -- the system becomes intolerably slow when it gets into > that state, and seems to get stuck that way, such that the only way to > restore performance is to increase the size of the "cache". > (Essentially all of the nfsd service threads end up spinning most of > the time, load average goes to N, and goodput goes to nearly nil.) It > does seem like a lot of effort for an extreme edge case that, in > practical terms, never happens. > So, it sounds like you've found a reasonable setting. Yes, if it is too small, it will keep trimming over and over and over again... I suspect this indicates that it isn't mutex contention, since the threads would block waiting for the mutex for that case, I think? (Bumping up NFSRVCACHE_HASHSIZE can't hurt if/when you get the chance.) > > (I'm assuming that you are using drc2.patch + drc3.patch. 
> > I believe that's what I have. If my kernel coding skills were less > rusty, I'd fix it to have a separate cache-trimming thread. > I've thought about this. My concern is that the separate thread might not keep up with the trimming demand. If that occurred, the cache would grow veryyy laarrggge, with effects like running out of mbuf clusters. By having the nfsd threads do it, they slow down, which provides feedback to the clients (slower RPC replies->generate fewer request->less to cache). (I think you are probably familiar with the generic concept that a system needs feedback to remain stable. An M/M/1 queue with open arrivals and no feedback to slow the arrival rate explodes when the arrival rate approaches the service rate, etc and so on...) As such, I'm not convinced a separate thread is a good idea. I think that simply allowing sysadmins to disable the DRC for TCP may make sense. Although I prefer more reliable vs better performance, I can see the argument that TCP transport for RPC is "good enough" for some environments. (Basically, if a site has a high degree of confidence in their network fabric, such that network partitioning type failures are pretty well non-existent and the NFS server isn't getting overloaded to the point of very slow RPC replies, I can see TCP retransmits as being sufficient?) > One other weird thing that I've noticed is that netstat(1) reports the > send and receive queues on NFS connections as being far higher than I > have the limits configured. Does NFS do something to override this? > > -GAWollman > The nfs server does soreserve(so, sb_max_adj, sb_max_adj); I can't recall exactly why it is that way, except that it needs to be large enough to handle the largest RPC request a client might generate. I should take another look at this, in case sb_max_adj is now too large? 
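To put numbers on the M/M/1 remark earlier in this message: for an M/M/1 queue with arrival rate λ and service rate μ, the mean number of requests in the system is ρ/(1 − ρ) with ρ = λ/μ, which grows without bound as ρ approaches 1. The sketch below is just that textbook formula; the request rates used in the usage note are made-up figures, not measurements from either server.

```python
def mm1_mean_in_system(arrival_rate, service_rate):
    """Mean number of requests in an M/M/1 system (queued plus in service).

    Uses L = rho / (1 - rho) with rho = arrival_rate / service_rate.
    At or beyond saturation the queue has no steady state, so infinity
    is returned.
    """
    if arrival_rate >= service_rate:
        return float("inf")
    rho = arrival_rate / service_rate
    return rho / (1.0 - rho)
```

For example, with a hypothetical service rate of 1000 requests/s, an arrival rate of 500/s gives a mean of 1 request in the system, 900/s gives about 9, and 990/s gives about 99. That steep rise near saturation is why some form of feedback to the clients matters.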
rick > _______________________________________________ > freebsd-net@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-net > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" From owner-freebsd-net@FreeBSD.ORG Sat Mar 9 17:34:51 2013 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id C9184346 for ; Sat, 9 Mar 2013 17:34:51 +0000 (UTC) (envelope-from wollman@hergotha.csail.mit.edu) Received: from hergotha.csail.mit.edu (wollman-1-pt.tunnel.tserv4.nyc4.ipv6.he.net [IPv6:2001:470:1f06:ccb::2]) by mx1.freebsd.org (Postfix) with ESMTP id 7F7031AD for ; Sat, 9 Mar 2013 17:34:51 +0000 (UTC) Received: from hergotha.csail.mit.edu (localhost [127.0.0.1]) by hergotha.csail.mit.edu (8.14.5/8.14.5) with ESMTP id r29HYo0R061832; Sat, 9 Mar 2013 12:34:50 -0500 (EST) (envelope-from wollman@hergotha.csail.mit.edu) Received: (from wollman@localhost) by hergotha.csail.mit.edu (8.14.5/8.14.4/Submit) id r29HYohJ061829; Sat, 9 Mar 2013 12:34:50 -0500 (EST) (envelope-from wollman) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-ID: <20795.29370.194678.963351@hergotha.csail.mit.edu> Date: Sat, 9 Mar 2013 12:34:50 -0500 From: Garrett Wollman To: Rick Macklem Subject: Re: Limits on jumbo mbuf cluster allocation In-Reply-To: <1700261042.3728432.1362847830447.JavaMail.root@erie.cs.uoguelph.ca> References: <20794.37617.822910.93537@hergotha.csail.mit.edu> <1700261042.3728432.1362847830447.JavaMail.root@erie.cs.uoguelph.ca> X-Mailer: VM 7.17 under 21.4 (patch 22) "Instant Classic" XEmacs Lucid X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.2.7 (hergotha.csail.mit.edu [127.0.0.1]); Sat, 09 Mar 2013 12:34:50 -0500 (EST) X-Spam-Status: No, score=-1.0 required=5.0 tests=ALL_TRUSTED autolearn=disabled version=3.3.2 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on 
hergotha.csail.mit.edu Cc: freebsd-net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 09 Mar 2013 17:34:51 -0000 < said: > I suspect this indicates that it isn't mutex contention, since the > threads would block waiting for the mutex for that case, I think? No, because our mutexes are adaptive, so each thread spins for a while before blocking. With the current implementation, all of them end up doing this in pretty close to lock-step. > (Bumping up NFSRVCACHE_HASHSIZE can't hurt if/when you get the chance.) I already have it set to 129 (up from 20); I could see putting it up to, say, 1023. It would be nice to have a sysctl for maximum chain length to see how bad it's getting (and if the hash function is actually effective). > I've thought about this. My concern is that the separate thread might > not keep up with the trimming demand. If that occurred, the cache would > grow veryyy laarrggge, with effects like running out of mbuf clusters. At a minimum, once one nfsd thread is committed to doing the cache trim, a flag should be set to discourage other threads from trying to do it. Having them all spinning their wheels punishes the clients much too much. > By having the nfsd threads do it, they slow down, which provides feedback > to the clients (slower RPC replies->generate fewer request->less to cache). > (I think you are probably familiar with the generic concept that a system > needs feedback to remain stable. An M/M/1 queue with open arrivals and > no feedback to slow the arrival rate explodes when the arrival rate > approaches the service rate, etc and so on...) 
Unfortunately, the feedback channel that I have is: one user starts 500 virtual machines accessing a filesystem on the server -> other users of this server see their goodput go to zero -> everyone sends in angry trouble tickets -> I increase the DRC size manually. It would be nice if, by the time I next want to take a vacation, I have this figured out. I'm OK with throwing memory at the problem -- these servers have 96 GB and can hold up to 144 GB -- so long as I can find a tuning that provides stability and consistent, reasonable performance for the users. > The nfs server does soreserve(so, sb_max_adj, sb_max_adj); I can't > recall exactly why it is that way, except that it needs to be large > enough to handle the largest RPC request a client might generate. > I should take another look at this, in case sb_max_adj is now > too large? It probably shouldn't be larger than the net.inet.tcp.{send,recv}buf_max, and the read and write sizes that are negotiated should be chosen so that a whole RPC can fit in that space. If that's too hard for whatever reason, nfsd should at least log a message saying "hey, your socket buffer limits are too small, I'm going to ignore them". 
-GAWollman From owner-freebsd-net@FreeBSD.ORG Sat Mar 9 18:00:05 2013 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id C5033C08; Sat, 9 Mar 2013 18:00:05 +0000 (UTC) (envelope-from wollman@hergotha.csail.mit.edu) Received: from hergotha.csail.mit.edu (wollman-1-pt.tunnel.tserv4.nyc4.ipv6.he.net [IPv6:2001:470:1f06:ccb::2]) by mx1.freebsd.org (Postfix) with ESMTP id 66E9627C; Sat, 9 Mar 2013 18:00:05 +0000 (UTC) Received: from hergotha.csail.mit.edu (localhost [127.0.0.1]) by hergotha.csail.mit.edu (8.14.5/8.14.5) with ESMTP id r29I04Sp062160; Sat, 9 Mar 2013 13:00:04 -0500 (EST) (envelope-from wollman@hergotha.csail.mit.edu) Received: (from wollman@localhost) by hergotha.csail.mit.edu (8.14.5/8.14.4/Submit) id r29I04gL062157; Sat, 9 Mar 2013 13:00:04 -0500 (EST) (envelope-from wollman) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-ID: <20795.30884.330015.123616@hergotha.csail.mit.edu> Date: Sat, 9 Mar 2013 13:00:04 -0500 From: Garrett Wollman To: Rick Macklem Subject: Re: NFS DRC size In-Reply-To: <1639798917.3728142.1362846452693.JavaMail.root@erie.cs.uoguelph.ca> References: <20794.38381.221980.5038@hergotha.csail.mit.edu> <1639798917.3728142.1362846452693.JavaMail.root@erie.cs.uoguelph.ca> X-Mailer: VM 7.17 under 21.4 (patch 22) "Instant Classic" XEmacs Lucid X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.2.7 (hergotha.csail.mit.edu [127.0.0.1]); Sat, 09 Mar 2013 13:00:04 -0500 (EST) X-Spam-Status: No, score=-1.0 required=5.0 tests=ALL_TRUSTED autolearn=disabled version=3.3.2 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on hergotha.csail.mit.edu Cc: freebsd-fs@freebsd.org, freebsd-net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: 
List-Help: List-Subscribe: , X-List-Received-Date: Sat, 09 Mar 2013 18:00:05 -0000 < said: > around the highwater mark basically indicates this is working. If it wasn't > throwing away replies where the receipt has been ack'd at the TCP > level, the cache would grow very large, since they would only be > discarded after a loonnngg timeout (12hours unless you've changes > NFSRVCACHE_TCPTIMEOUT in sys/fs/nfs/nfs.h). That seems unreasonably large. > Well, the DRC will try to cache replies until the client's TCP layer > acknowledges receipt of the reply. It is hard to say how many replies > that is for a given TCP connection, since it is a function of the level > of concurrently (# of nfsiod threads in the FreeBSD client) > in the client. I'd guess it's somewhere between 1<->20? Nearly all our clients are Linux, so it's likely to be whatever Debian does by default. > Multiply that by the number of TCP connections from all clients and > you have about how big the server's DRC will be. (Some clients use > a single TCP connection for the client whereas others use a separate > TCP connection for each mount point.) The Debian client appears to use a single TCP connection for everything. So if I want to support 2,000 clients each with 20 requests in flight, that would suggest that I need a DRC size of 40,000, which my experience shows is not sufficient with even a much smaller number of clients. 
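The back-of-the-envelope sizing used in this exchange can be written out as below. It only restates the arithmetic from the thread (connections times replies held per connection, and cache entries divided by hash buckets); the function names are made up for illustration, and, as noted above, the first estimate has not matched observed behaviour on this server.

```python
def estimate_drc_entries(tcp_connections, replies_per_connection):
    """Rough DRC size: replies held per connection times connection count."""
    return tcp_connections * replies_per_connection


def mean_chain_length(cache_entries, hash_size):
    """Average entries per hash list, assuming the hash spreads evenly."""
    return cache_entries / hash_size
```

With 2,000 single-connection clients and 20 requests in flight each, `estimate_drc_entries(2000, 20)` gives 40,000. With the 61,000-entry highwater mark and 200 hash buckets mentioned earlier in the thread, `mean_chain_length(61000, 200)` gives 305, in line with the "about 300 entries per hash list" figure.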
-GAWollman From owner-freebsd-net@FreeBSD.ORG Sat Mar 9 18:46:11 2013 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 7D31A6AA for ; Sat, 9 Mar 2013 18:46:11 +0000 (UTC) (envelope-from wollman@hergotha.csail.mit.edu) Received: from hergotha.csail.mit.edu (wollman-1-pt.tunnel.tserv4.nyc4.ipv6.he.net [IPv6:2001:470:1f06:ccb::2]) by mx1.freebsd.org (Postfix) with ESMTP id EE57C606 for ; Sat, 9 Mar 2013 18:46:10 +0000 (UTC) Received: from hergotha.csail.mit.edu (localhost [127.0.0.1]) by hergotha.csail.mit.edu (8.14.5/8.14.5) with ESMTP id r29Ik927062597; Sat, 9 Mar 2013 13:46:09 -0500 (EST) (envelope-from wollman@hergotha.csail.mit.edu) Received: (from wollman@localhost) by hergotha.csail.mit.edu (8.14.5/8.14.4/Submit) id r29Ik9jX062596; Sat, 9 Mar 2013 13:46:09 -0500 (EST) (envelope-from wollman) Date: Sat, 9 Mar 2013 13:46:09 -0500 (EST) From: Garrett Wollman Message-Id: <201303091846.r29Ik9jX062596@hergotha.csail.mit.edu> To: rmacklem@uoguelph.ca Subject: Re: Limits on jumbo mbuf cluster allocation X-Newsgroups: mit.lcs.mail.freebsd-net In-Reply-To: <20795.29370.194678.963351@hergotha.csail.mit.edu> References: <20794.37617.822910.93537@hergotha.csail.mit.edu> <1700261042.3728432.1362847830447.JavaMail.root@erie.cs.uoguelph.ca> Organization: none X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.2.7 (hergotha.csail.mit.edu [127.0.0.1]); Sat, 09 Mar 2013 13:46:09 -0500 (EST) X-Spam-Status: No, score=-1.0 required=5.0 tests=ALL_TRUSTED autolearn=disabled version=3.3.2 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on hergotha.csail.mit.edu Cc: freebsd-net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 09 Mar 2013 18:46:11 -0000 
In article <20795.29370.194678.963351@hergotha.csail.mit.edu>, I wrote: >< said: >> I've thought about this. My concern is that the separate thread might >> not keep up with the trimming demand. If that occurred, the cache would >> grow veryyy laarrggge, with effects like running out of mbuf clusters. > >At a minimum, once one nfsd thread is committed to doing the cache >trim, a flag should be set to discourage other threads from trying to >do it. Having them all spinning their wheels punishes the clients >much too much. Also, it occurs to me that this strategy is subject to livelock. To put backpressure on the clients, it is far better to get them to stop sending (by advertising a small receive window) than to accept their traffic but queue it for a long time. By the time the NFS code gets an RPC, the system has already invested so much into it that it should be processed as quickly as possible, and this strategy essentially guarantees[1] that, once those 2 MB socket buffers start to fill up, they will stay filled, sending latency through the roof. If nfsd didn't override the usual socket-buffer sizing mechanisms, then sysadmins could limit the buffers to ensure a stable response time. The bandwidth-delay product in our network is somewhere between 12.5 kB and 125 kB, depending on how the client is connected and what sort of latency they experience. The usual theory would suggest that socket buffers should be no more than twice that -- i.e., about 256 kB. I'd actually like to see something like WFQ in the NFS server to allow me to limit the amount of damage one client or group of clients can do without unnecessarily limiting other clients. -GAWollman [1] The largest RPC is a bit more than 64 KiB (negotiated), so if the server gets slow, the 2 MB receive queue will be refilled by the client before the server manages to perform the RPC and send a response. 
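For reference, the figures in this message can be reproduced with a little arithmetic. The link speed and round-trip times below are assumptions chosen to land on the 12.5 kB and 125 kB endpoints quoted above (the message does not say what they actually are), and the helper names are hypothetical.

```python
def bdp_bytes(bandwidth_bits_per_s, rtt_microseconds):
    """Bandwidth-delay product in bytes, using integer arithmetic."""
    return bandwidth_bits_per_s * rtt_microseconds // 8_000_000


def rpcs_that_fit(buffer_bytes, rpc_bytes):
    """How many maximum-size RPCs a socket buffer of this size can hold."""
    return buffer_bytes // rpc_bytes


GIGABIT = 1_000_000_000                 # assumed 1 Gb/s link
low = bdp_bytes(GIGABIT, 100)           # assumed 0.1 ms RTT -> 12,500 bytes
high = bdp_bytes(GIGABIT, 1000)         # assumed 1 ms RTT   -> 125,000 bytes
suggested_cap = 2 * high                # 250,000 bytes, i.e. roughly 256 kB
queued = rpcs_that_fit(2 * 1024 * 1024, 64 * 1024)  # 2 MB buffer, 64 KiB RPCs
```

Under these assumptions a 2 MB receive buffer holds 32 maximum-size RPCs, which is the backlog footnote [1] is concerned about, whereas a buffer capped near twice the bandwidth-delay product would hold only three or four.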
From owner-freebsd-net@FreeBSD.ORG Sat Mar 9 19:18:04 2013 Return-Path: Delivered-To: net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 0AB9EE8B; Sat, 9 Mar 2013 19:18:04 +0000 (UTC) (envelope-from ndenev@gmail.com) Received: from mail-wg0-x22a.google.com (mail-wg0-x22a.google.com [IPv6:2a00:1450:400c:c00::22a]) by mx1.freebsd.org (Postfix) with ESMTP id 2339D6E7; Sat, 9 Mar 2013 19:18:02 +0000 (UTC) Received: by mail-wg0-f42.google.com with SMTP id 12so824752wgh.3 for ; Sat, 09 Mar 2013 11:18:02 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=x-received:subject:mime-version:content-type:from:in-reply-to:date :cc:content-transfer-encoding:message-id:references:to:x-mailer; bh=Vo2XatOVIi1H9+9J+ZkiPtFJDTs0Nf5R27czcbeFKvM=; b=c7QJCwLUWI+UDQFVBsk1cGZPxXWWVS3neWNsZExVlk2MvQmlfCglGa10tt8ePkU267 /TkdYldy4zj/YWyV0N7ljcqxJHXrCXmblXg+zd044YhiOJ0AlrzpTh1X2ilxrsWeJnIS sKI9hQq+RIlIvKa7oNUOAkpUuEWWFZTYgTqonoa6ZnowLM52J2MrUJn5tU7SWOTuWaVz ybiMUtJgtWZRvPR6518pgGSrVt0AwjYOXG9MVOh9RJD1LwGMEmG71jfrYmnIh9HBxlad SnZWePQPKp9OKmbh+CYH4gZlw1AEU+srkMb/cBGxsOBjmv5diQfbrfz+n5wQd5bM1fG1 BIzQ== X-Received: by 10.180.82.33 with SMTP id f1mr4727528wiy.13.1362856682282; Sat, 09 Mar 2013 11:18:02 -0800 (PST) Received: from [192.168.1.35] ([188.141.28.166]) by mx.google.com with ESMTPS id c15sm6550408wiw.3.2013.03.09.11.18.00 (version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128); Sat, 09 Mar 2013 11:18:01 -0800 (PST) Subject: Re: [patch] interface routes Mime-Version: 1.0 (Mac OS X Mail 6.2 \(1499\)) Content-Type: text/plain; charset=us-ascii From: Nikolay Denev In-Reply-To: <20130307214205.GD50035@funkthat.com> Date: Sat, 9 Mar 2013 19:17:59 +0000 Content-Transfer-Encoding: quoted-printable Message-Id: <5205A02F-E886-4B7E-8494-1D92F930933B@gmail.com> References: <513834E4.7050203@FreeBSD.org> <51384443.5070209@freebsd.org> <20130307214205.GD50035@funkthat.com> To: 
John-Mark Gurney X-Mailer: Apple Mail (2.1499) Cc: "Alexander V. Chernikov" , Andre Oppermann , net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 09 Mar 2013 19:18:04 -0000 On Mar 7, 2013, at 9:42 PM, John-Mark Gurney wrote: > Andre Oppermann wrote this message on Thu, Mar 07, 2013 at 08:39 +0100: >>> Adding interface address is handled via atomically deleting old prefix and >>> adding interface one. >> >> This brings up a long standing sore point of our routing code >> which this patch makes more pronounced. When an interface link >> state is down I don't want the route to it to persist but to >> become inactive so another path can be chosen. This the very >> point of running a routing daemon. So on the link-down event >> the installed interface routes should be removed from the routing >> table. The configured addresses though should persist and the >> interface routes re-installed on a link-up event. What's your >> opinion on it? >> >> Other than these points I think your code is fine and can go >> into the tree. > > The issue that I see with this is that if you bump your cable, all > your connections will be dropped, because as soon as they try to send > something, they'll get a no route to host, and this will break the > TCP connection... If we keep the routes when the link goes down, > the packet will be queued or dropped (depending upon ethernet driver), > but the TCP connection will not break... > > -- > John-Mark Gurney Voice: +1 415 225 5579 > > "All that I will do, has been done, All that I have, has not." Maybe this can be made an option that can be turned on when needed.
What you describe can be very undesirable for a workstation/laptop or a server, but a router that itself does not have many connections originating or terminating on it could actually benefit from this. The current state is actually much worse for routers. A link down does not do anything, and while there may be an alternative route to be installed, for example from OSPF, the interface without link retains its routes and effectively blackholes all traffic. -- Nikolay From owner-freebsd-net@FreeBSD.ORG Sat Mar 9 19:20:23 2013 Return-Path: Delivered-To: net@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id C8813F6D; Sat, 9 Mar 2013 19:20:23 +0000 (UTC) (envelope-from melifaro@FreeBSD.org) Received: from mail.ipfw.ru (unknown [IPv6:2a01:4f8:120:6141::2]) by mx1.freebsd.org (Postfix) with ESMTP id 8E0C0701; Sat, 9 Mar 2013 19:20:23 +0000 (UTC) Received: from v6.mpls.in ([2a02:978:2::5] helo=ws.su29.net) by mail.ipfw.ru with esmtpsa (TLSv1:CAMELLIA256-SHA:256) (Exim 4.76 (FreeBSD)) (envelope-from ) id 1UEPN0-00092D-Td; Sat, 09 Mar 2013 23:23:51 +0400 Message-ID: <513B8B56.1000005@FreeBSD.org> Date: Sat, 09 Mar 2013 23:19:50 +0400 From: "Alexander V. 
Chernikov" User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:9.0) Gecko/20120121 Thunderbird/9.0 MIME-Version: 1.0 To: Nikolay Denev Subject: Re: [patch] interface routes References: <513834E4.7050203@FreeBSD.org> <51384443.5070209@freebsd.org> <20130307214205.GD50035@funkthat.com> <5205A02F-E886-4B7E-8494-1D92F930933B@gmail.com> In-Reply-To: <5205A02F-E886-4B7E-8494-1D92F930933B@gmail.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: John-Mark Gurney , Andre Oppermann , net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 09 Mar 2013 19:20:23 -0000 On 09.03.2013 23:17, Nikolay Denev wrote: > On Mar 7, 2013, at 9:42 PM, John-Mark Gurney wrote: > >> Andre Oppermann wrote this message on Thu, Mar 07, 2013 at 08:39 +0100: >>>> Adding interface address is handled via atomically deleting old prefix and >>>> adding interface one. >>> >>> This brings up a long standing sore point of our routing code >>> which this patch makes more pronounced. When an interface link >>> state is down I don't want the route to it to persist but to >>> become inactive so another path can be chosen. This the very >>> point of running a routing daemon. So on the link-down event >>> the installed interface routes should be removed from the routing >>> table. The configured addresses though should persist and the >>> interface routes re-installed on a link-up event. What's your >>> opinion on it? >>> >>> Other than these points I think your code is fine and can go >>> into the tree. >> >> The issue that I see with this is that if you bump your cable, all >> your connections will be dropped, because as soon as they try to send >> something, they'll get a no route to host, and this will break the >> TCP connection... 
If we keep the routes when the link goes down, >> the packet will be queued or dropped (depending upon ethernet driver), >> but the TCP connection will not break... >> >> -- >> John-Mark Gurney Voice: +1 415 225 5579 >> >> "All that I will do, has been done, All that I have, has not." > > Maybe this can be made a option that can be turned on when needed. Yes. There is another patch in this thread with "remove_iface_routes_on_change" per-VNET sysctl, turned off by default. > What you describe can be very undesirable for a workstation/laptop or a server, > but a router that itself does not have many connections originating or terminating on it could > actually benefit from this. > The current state is actually much worse for routers. A link down does not do anything, and > while there may be a alternative route to be installed for example from OSPF, the interface without link > pertains its routes and effectively blackholes all traffic. > > -- > Nikolay > >
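For readers following the trade-off discussed above, here is a rough toy model of the two behaviours. It is only a sketch and not the kernel code or the patch from this thread: the `Router` and `Route` classes, the `demo` helper, the interface names, and the idea of pre-installing an OSPF-learned route with a worse metric are all assumptions made for illustration. Only the toggle name `remove_iface_routes_on_change` comes from the message, and its exact semantics here are guessed from the description.

```python
from dataclasses import dataclass, field
from ipaddress import ip_address, ip_network


@dataclass
class Route:
    prefix: str       # destination prefix, e.g. "10.0.0.0/24"
    ifname: str       # outgoing interface
    source: str       # "iface" for interface routes, "ospf" for learned ones
    metric: int = 0   # lower is preferred among equal-length prefixes


@dataclass
class Router:
    # Hypothetical stand-in for the per-VNET sysctl named in the thread.
    remove_iface_routes_on_change: bool = False
    link_up: dict = field(default_factory=dict)
    configured: list = field(default_factory=list)  # addresses persist here
    table: list = field(default_factory=list)       # active routing table

    def add_address(self, prefix, ifname):
        """Configure an address; this installs an interface route."""
        route = Route(prefix, ifname, "iface")
        self.configured.append(route)
        self.link_up.setdefault(ifname, True)
        self.table.append(route)

    def add_learned(self, prefix, ifname, metric):
        """Install a route learned from a routing daemon (e.g. OSPF)."""
        self.link_up.setdefault(ifname, True)
        self.table.append(Route(prefix, ifname, "ospf", metric))

    def set_link(self, ifname, up):
        """Handle a link-state event on an interface."""
        self.link_up[ifname] = up
        if not self.remove_iface_routes_on_change:
            return  # default: routes stay in the table regardless of link
        if up:
            # Re-install interface routes for the still-configured addresses.
            for r in self.configured:
                if r.ifname == ifname and r not in self.table:
                    self.table.append(r)
        else:
            # Withdraw interface routes so another path can be chosen.
            self.table = [r for r in self.table
                          if not (r.source == "iface" and r.ifname == ifname)]

    def forward(self, dst):
        """Longest-prefix match, then lowest metric; report the outcome."""
        addr = ip_address(dst)
        matches = [r for r in self.table if addr in ip_network(r.prefix)]
        if not matches:
            return "no route to host"
        best = min(matches,
                   key=lambda r: (-ip_network(r.prefix).prefixlen, r.metric))
        if not self.link_up[best.ifname]:
            return f"blackholed on {best.ifname}"
        return f"sent via {best.ifname}"


def demo(toggle):
    """Link on em0 goes down while em1 offers an alternative path."""
    router = Router(remove_iface_routes_on_change=toggle)
    router.add_address("10.0.0.0/24", "em0")
    router.add_learned("10.0.0.0/24", "em1", metric=10)
    router.set_link("em0", False)
    return router.forward("10.0.0.5")


if __name__ == "__main__":
    print("toggle off:", demo(False))
    print("toggle on: ", demo(True))
```

With the toggle off, the interface route on the dead link still wins the lookup, which is the blackholing case Nikolay describes for routers. With it on, the route is withdrawn and the alternative path is used; if no alternative existed, the lookup would fail outright, which is the "no route to host" case John-Mark is worried about for hosts with open TCP connections.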