From owner-freebsd-net@FreeBSD.ORG Sun Nov 11 17:28:43 2012
Return-Path:
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 8E85F256C for ; Sun, 11 Nov 2012 17:28:43 +0000 (UTC) (envelope-from thesaurarius.romae@yandex.ru)
Received: from forward10.mail.yandex.net (forward10.mail.yandex.net [IPv6:2a02:6b8:0:202::5]) by mx1.freebsd.org (Postfix) with ESMTP id ABAFB8FC12 for ; Sun, 11 Nov 2012 17:28:42 +0000 (UTC)
Received: from smtp9.mail.yandex.net (smtp9.mail.yandex.net [77.88.61.35]) by forward10.mail.yandex.net (Yandex) with ESMTP id 4DE7D1020542 for ; Sun, 11 Nov 2012 21:28:38 +0400 (MSK)
Received: from smtp9.mail.yandex.net (localhost [127.0.0.1]) by smtp9.mail.yandex.net (Yandex) with ESMTP id 2A3D01520780 for ; Sun, 11 Nov 2012 21:28:38 +0400 (MSK)
Received: from mail-vb0-f54.google.com (mail-vb0-f54.google.com [209.85.212.54]) by smtp9.mail.yandex.net (nwsmtp/Yandex) with ESMTP id SaLuHAUq-SbLW9KRr; Sun, 11 Nov 2012 21:28:37 +0400
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yandex.ru; s=mail; t=1352654918; bh=gicTzs1/5Vm5Lsl/OBVG/xdrQ0RuNjVr71UuIqDxrKo=; h=Received:X-Google-DKIM-Signature:MIME-Version:Received:Received: X-Originating-IP:Received:In-Reply-To:References:Date:Message-ID: Subject:From:To:Content-Type:X-Gm-Message-State; b=voAcxqklNmnzYQE1jcnRuuOu5SMRbHy2MNKt8QaEfmMf4OrK6UOxrMr/u5OJqWKeC 8her8bXJUqjBxQB+6MeVrQG9aBha/+n6BcBlJo1wGrf6r/JjKEEnh5Yurj+9TO62t8 mAHbDpf3YKbNjxKDSqxB/f9TlquJCVjs5E+NBJHM=
Received: by mail-vb0-f54.google.com with SMTP id l1so7139403vba.13 for ; Sun, 11 Nov 2012 09:28:36 -0800 (PST)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20120113; h=mime-version:x-originating-ip:in-reply-to:references:date :message-id:subject:from:to:content-type:x-gm-message-state; bh=gicTzs1/5Vm5Lsl/OBVG/xdrQ0RuNjVr71UuIqDxrKo=; b=R3deoF/5DFbEPKJutZ0LEmydpzQIgFg/RDe/VlNeJLTR550GutFGfwrlINnVLl3/AJ xnSvl6gCT4F/Reo6j869lmAa71z2JawVATNhGClkY5EUTKYiYtPG1c+9oh0iTxKSabSH FY4vThnxZvQEYEHdXe+WPDUQTSXK1vZrly9LBiXWFUyNZz54H5+nXA/kt+ue1osi6QQn d8Fw9fzbrYsXHb7IESo/jW7IV1aE5NSxFDK5c1KE3PLlggqOPxGqwJbMz3n9LRWnYvdq ++tWNHD6Btb9G+xRWLT3izqNx5uuVZXRtJ+3P+r3HPGBh1fMOI49+/7EUS2mjgQfE0Q1 r8FQ==
MIME-Version: 1.0
Received: by 10.220.231.8 with SMTP id jo8mr20869819vcb.40.1352654916116; Sun, 11 Nov 2012 09:28:36 -0800 (PST)
Received: by 10.58.229.200 with HTTP; Sun, 11 Nov 2012 09:28:36 -0800 (PST)
X-Originating-IP: [78.36.196.85]
Received: by 10.58.229.200 with HTTP; Sun, 11 Nov 2012 09:28:36 -0800 (PST)
In-Reply-To:
References:
Date: Sun, 11 Nov 2012 21:28:36 +0400
Message-ID:
Subject: IPv6 NDP Proxy
From: Thesaurarius Romae
To: freebsd-net@freebsd.org
X-Gm-Message-State: ALoCoQl3h1BkBh7zXTPDxYl2VNdqbZ3YFhnAy/miY7RXwV1bmfqwRPxsvWI2fKak9y4KsxTcae4r
Content-Type: text/plain; charset=ISO-8859-1
X-Content-Filtered-By: Mailman/MimeDel 2.1.14
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Sun, 11 Nov 2012 17:28:43 -0000

Hello all!

I have a small problem: I need to give IPv6 addresses to several machines on different network interfaces, but the provider who gave me an IPv6 /64 network wants to see all the IPv6 hosts in the same L2 network.

To be more exact, I have a physical server at hetzner.de with interface re0. I successfully configured an IPv6 address on this interface and everything works fine, but I also have VMs on interfaces tap0 and vboxnet0, and OpenVPN clients on tun0.

I googled the subject, and the solution I found is to use an NDP proxy. But all the examples I found are for Linux, while I use FreeBSD and can't figure out how to do the same thing on it.
Here's the Linux solution:
http://www.stocksy.co.uk/articles/Networks/ipv6_for_xen_hosts_on_a_hetzner_leased_server_with_a_routed_ipv4_allocation

I'm pretty sure that solution is easy to implement and understand, but I guess my lack of IPv6 knowledge prevents me from figuring it out.

Thanks for your reply!

P.S.: Please CC me personally, since I'm not subscribed to the list.

From owner-freebsd-net@FreeBSD.ORG Mon Nov 12 07:10:39 2012
Return-Path:
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id BF7F691C; Mon, 12 Nov 2012 07:10:39 +0000 (UTC) (envelope-from bright@mu.org)
Received: from elvis.mu.org (elvis.mu.org [192.203.228.196]) by mx1.freebsd.org (Postfix) with ESMTP id A1D218FC0C; Mon, 12 Nov 2012 07:10:39 +0000 (UTC)
Received: from Alfreds-MacBook-Pro-5.local (c-67-180-208-218.hsd1.ca.comcast.net [67.180.208.218]) by elvis.mu.org (Postfix) with ESMTPSA id 36F641A3CC0; Sun, 11 Nov 2012 23:10:39 -0800 (PST)
Message-ID: <50A0A0EF.3020109@mu.org>
Date: Sun, 11 Nov 2012 23:10:39 -0800
From: Alfred Perlstein
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7; rv:16.0) Gecko/20121026 Thunderbird/16.0.2
MIME-Version: 1.0
To: freebsd-net@freebsd.org, Peter Wemm , Adrian Chadd
Subject: auto tuning tcp
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Mon, 12 Nov 2012 07:10:39 -0000

I noticed that TCBHASHSIZE does not autotune.

What do you think of the following algorithm?

Basically round down to next power of two based on nmbclusters / 64.
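To make the rule Alfred describes concrete, here is a minimal, portable sketch of the same rounding. The function names are hypothetical, and the 512 floor is taken from the program attached to this message; it only illustrates the arithmetic and is not kernel code.

```c
/*
 * Hypothetical helper: largest power of two that is <= v,
 * or 0 when v is 0. Avoids the BSD-specific fls().
 */
static unsigned int
round_down_pow2(unsigned int v)
{
    unsigned int p = 1;

    if (v == 0)
        return (0);
    /* Double p while the doubled value still fits under v. */
    while (p <= v / 2)
        p *= 2;
    return (p);
}

/*
 * Hypothetical name: hash size derived from nmbclusters / 64,
 * rounded down to a power of two, never smaller than 512.
 */
static unsigned int
tcbhash_from_nmbclusters(unsigned int nmbclusters)
{
    unsigned int size = round_down_pow2(nmbclusters / 64);

    if (size < 512)
        size = 512;
    return (size);
}
```

With these definitions, 25600 clusters give 25600 / 64 = 400, which rounds down to 256 and is then raised to the 512 floor, while 262144 clusters give 4096 directly.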
-Alfred

#include <stdio.h>
#include <stdlib.h>
#include <strings.h>    /* fls() */

int
main(int argc, char **argv)
{
    int nmbclusters;
    int pow2cl;

    nmbclusters = atoi(argv[1]);
    /* Largest power of two that does not exceed nmbclusters / 64. */
    pow2cl = 1 << (fls(nmbclusters / 64)-1);
    /* Enforce a floor of 512. */
    if (pow2cl < 512)
        pow2cl = 512;
    printf("%d\n", pow2cl);
    return (0);
}

From owner-freebsd-net@FreeBSD.ORG Mon Nov 12 07:28:12 2012
Return-Path:
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id C677DE22 for ; Mon, 12 Nov 2012 07:28:12 +0000 (UTC) (envelope-from oppermann@networx.ch)
Received: from c00l3r.networx.ch (c00l3r.networx.ch [62.48.2.2]) by mx1.freebsd.org (Postfix) with ESMTP id 30C768FC0C for ; Mon, 12 Nov 2012 07:28:10 +0000 (UTC)
Received: (qmail 94325 invoked from network); 12 Nov 2012 09:02:30 -0000
Received: from c00l3r.networx.ch (HELO [127.0.0.1]) ([62.48.2.2]) (envelope-sender ) by c00l3r.networx.ch (qmail-ldap-1.03) with SMTP for ; 12 Nov 2012 09:02:30 -0000
Message-ID: <50A0A502.1030306@networx.ch>
Date: Mon, 12 Nov 2012 08:28:02 +0100
From: Andre Oppermann
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:16.0) Gecko/20121010 Thunderbird/16.0.1
MIME-Version: 1.0
To: Alfred Perlstein
Subject: Re: auto tuning tcp
References: <50A0A0EF.3020109@mu.org>
In-Reply-To: <50A0A0EF.3020109@mu.org>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: freebsd-net@freebsd.org, Adrian Chadd , Peter Wemm
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Mon, 12 Nov 2012 07:28:12 -0000

On 12.11.2012 08:10, Alfred Perlstein wrote:
> I noticed that TCBHASHSIZE does not autotune.
>
> What do you think of the following algorithm?
>
> Basically round down to next power of two based on nmbclusters / 64.

Please wait out for a real fix of the various mbuf-whatever tuning issue I'll propose shortly.
This approach may become inappropriate. Also the mbuf limits can be changed at runtime by sysctl.

-- 
Andre

> -Alfred
>
> #include <stdio.h>
> #include <stdlib.h>
> #include <strings.h>
>
>
> int
> main(int argc, char **argv)
> {
>     int nmbclusters;
>     int pow2cl;
>
>     nmbclusters = atoi(argv[1]);
>     pow2cl = 1 << (fls(nmbclusters / 64)-1);
>     if (pow2cl < 512)
>         pow2cl = 512;
>     printf("%d\n", pow2cl);
>     return (0);
>
> }
>
> _______________________________________________
> freebsd-net@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
>
>

From owner-freebsd-net@FreeBSD.ORG Mon Nov 12 08:35:33 2012
Return-Path:
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id A3BFCCB6; Mon, 12 Nov 2012 08:35:33 +0000 (UTC) (envelope-from fabien.thomas@netasq.com)
Received: from work.netasq.com (gwlille.netasq.com [91.212.116.1]) by mx1.freebsd.org (Postfix) with ESMTP id 5E48A8FC0C; Mon, 12 Nov 2012 08:35:32 +0000 (UTC)
Received: from [10.2.1.1] (unknown [10.2.1.1]) by work.netasq.com (Postfix) with ESMTPSA id 6DCD527052B6; Mon, 12 Nov 2012 09:35:26 +0100 (CET)
Subject: Re: [patch] reducing arp locking
Mime-Version: 1.0 (Apple Message framework v1283)
Content-Type: text/plain; charset=iso-8859-1
From: Fabien Thomas
In-Reply-To: <509D518A.9000803@FreeBSD.org>
Date: Mon, 12 Nov 2012 09:35:25 +0100
Content-Transfer-Encoding: quoted-printable
Message-Id:
References: <509AEDAC.10002@FreeBSD.org> <509B884F.7040106@networx.ch> <509B88B1.3070905@FreeBSD.org> <49EE4F42-6162-40F4-9DE0-1ACA1289B225@netasq.com> <509CC776.9010200@FreeBSD.org> <37E1F76F-D951-4B36-AF00-039DA1CC5CF3@netasq.com> <509CE6A2.2040200@FreeBSD.org> <8169CD67-E444-46FC-A7C8-DD6FB59091E1@netasq.com> <509D32CC.6000201@xip.at> <509D518A.9000803@FreeBSD.org>
To: "Alexander V. Chernikov"
X-Mailer: Apple Mail (2.1283)
Cc: freebsd-net@freebsd.org, Ingo Flaschberger
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Mon, 12 Nov 2012 08:35:33 -0000

On 9 Nov 2012, at 19:55, Alexander V. Chernikov wrote:

> On 09.11.2012 20:51, Fabien Thomas wrote:
>>
>> On 9 Nov 2012, at 17:43, Ingo Flaschberger wrote:
>>
>>> On 09.11.2012 15:03, Fabien Thomas wrote:
>>>> In in_arpinput only exclusive access to the entry is taken during the update, with no IF_AFDATA_LOCK; that's why I was surprised.
> I'll update the patch to reflect the changes discussed in previous e-mails.
>>>
>>> what about this:
>>
>> I'm not against optimizing, but an API that seemed clear (correct this if I'm wrong):
>> - one lock for list modification
>> - one RW lock for lle entry access
>> - one refcount for ptr unref
>>
>> is now a lot more unclear and, from my point of view, dangerous.
>
> This can be changed/documented as the following:
> - table RW lock for list modification
> - table RW lock for lle_addr, la_expire change
> - per-lle RW lock for refcount and other fields not used by 'main path' code

Yes, that's fine if documented and if every access to lle_addr + la_expire is under the table lock.

>>
>> My next question is: why do we need a per-entry lock if we use the table lock to protect entry access?
> Because there are other cases, like sending traffic to an unresolved rte (ARP request sent, but reply not received, and we have to maintain a packet queue to that destination).
>
> .. and it seems flags handling (LLE_VALID) should be done with more care.
>>
>> Fabien
>>
>
>
>
> -- 
> WBR, Alexander
>

From owner-freebsd-net@FreeBSD.ORG Mon Nov 12 08:52:44 2012
Return-Path:
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id E23DBFAB; Mon, 12 Nov 2012 08:52:44 +0000 (UTC) (envelope-from bright@mu.org)
Received: from elvis.mu.org (elvis.mu.org [192.203.228.196]) by mx1.freebsd.org (Postfix) with ESMTP id C62C18FC08; Mon, 12 Nov 2012 08:52:44 +0000 (UTC)
Received: from Alfreds-MacBook-Pro-5.local (c-67-180-208-218.hsd1.ca.comcast.net [67.180.208.218]) by elvis.mu.org (Postfix) with ESMTPSA id 2EFF91A3CCD; Mon, 12 Nov 2012 00:52:44 -0800 (PST)
Message-ID: <50A0B8DA.9090409@mu.org>
Date: Mon, 12 Nov 2012 00:52:42 -0800
From: Alfred Perlstein
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7; rv:16.0) Gecko/20121026 Thunderbird/16.0.2
MIME-Version: 1.0
To: Andre Oppermann
Subject: Re: auto tuning tcp
References: <50A0A0EF.3020109@mu.org> <50A0A502.1030306@networx.ch>
In-Reply-To: <50A0A502.1030306@networx.ch>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: freebsd-net@freebsd.org, Adrian Chadd , Peter Wemm
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Mon, 12 Nov 2012 08:52:44 -0000

On 11/11/12 11:28 PM, Andre Oppermann wrote:
> On 12.11.2012 08:10, Alfred Perlstein wrote:
>> I noticed that TCBHASHSIZE does not autotune.
>>
>> What do you think of the following algorithm?
>>
>> Basically round down to next power of two based on nmbclusters / 64.
>
> Please wait out for a real fix of the various mbuf-whatever tuning
> issue I'll propose shortly. This approach may become inappropriate.
> Also the mbuf limits can be changed at runtime by sysctl.
>
What timeline are you asking me to wait for?

-Alfred

From owner-freebsd-net@FreeBSD.ORG Mon Nov 12 09:27:20 2012
Return-Path:
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id CFEC661C for ; Mon, 12 Nov 2012 09:27:20 +0000 (UTC) (envelope-from oppermann@networx.ch)
Received: from c00l3r.networx.ch (c00l3r.networx.ch [62.48.2.2]) by mx1.freebsd.org (Postfix) with ESMTP id 372C58FC0C for ; Mon, 12 Nov 2012 09:27:19 +0000 (UTC)
Received: (qmail 94758 invoked from network); 12 Nov 2012 11:01:43 -0000
Received: from c00l3r.networx.ch (HELO [127.0.0.1]) ([62.48.2.2]) (envelope-sender ) by c00l3r.networx.ch (qmail-ldap-1.03) with SMTP for ; 12 Nov 2012 11:01:43 -0000
Message-ID: <50A0C0F4.8010706@networx.ch>
Date: Mon, 12 Nov 2012 10:27:16 +0100
From: Andre Oppermann
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:16.0) Gecko/20121010 Thunderbird/16.0.1
MIME-Version: 1.0
To: Alfred Perlstein
Subject: Re: auto tuning tcp
References: <50A0A0EF.3020109@mu.org> <50A0A502.1030306@networx.ch> <50A0B8DA.9090409@mu.org>
In-Reply-To: <50A0B8DA.9090409@mu.org>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: freebsd-net@freebsd.org, Adrian Chadd , Peter Wemm
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Mon, 12 Nov 2012 09:27:21 -0000

On 12.11.2012 09:52, Alfred Perlstein wrote:
> On 11/11/12 11:28 PM, Andre Oppermann wrote:
>> On 12.11.2012 08:10, Alfred Perlstein wrote:
>>> I noticed that TCBHASHSIZE does not autotune.
>>>
>>> What do you think of the following algorithm?
>>>
>>> Basically round down to next power of two based on nmbclusters / 64.
>>
>> Please wait out for a real fix of the various mbuf-whatever tuning
>> issue I'll propose shortly. This approach may become inappropriate.
>> Also the mbuf limits can be changed at runtime by sysctl.
>>
> What timeline are you asking me to wait for?

http://svnweb.freebsd.org/changeset/base/242910

-- 
Andre

From owner-freebsd-net@FreeBSD.ORG Mon Nov 12 11:06:47 2012
Return-Path:
Delivered-To: freebsd-net@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id BD274A4C for ; Mon, 12 Nov 2012 11:06:47 +0000 (UTC) (envelope-from owner-bugmaster@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:1900:2254:206c::16:87]) by mx1.freebsd.org (Postfix) with ESMTP id A08688FC08 for ; Mon, 12 Nov 2012 11:06:47 +0000 (UTC)
Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.5/8.14.5) with ESMTP id qACB6l2n000425 for ; Mon, 12 Nov 2012 11:06:47 GMT (envelope-from owner-bugmaster@FreeBSD.org)
Received: (from gnats@localhost) by freefall.freebsd.org (8.14.5/8.14.5/Submit) id qACB6lcY000423 for freebsd-net@FreeBSD.org; Mon, 12 Nov 2012 11:06:47 GMT (envelope-from owner-bugmaster@FreeBSD.org)
Date: Mon, 12 Nov 2012 11:06:47 GMT
Message-Id: <201211121106.qACB6lcY000423@freefall.freebsd.org>
X-Authentication-Warning: freefall.freebsd.org: gnats set sender to owner-bugmaster@FreeBSD.org using -f
From: FreeBSD bugmaster
To: freebsd-net@FreeBSD.org
Subject: Current problem reports assigned to freebsd-net@FreeBSD.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Mon, 12 Nov 2012 11:06:47 -0000

Note: to view an individual PR, use:
  http://www.freebsd.org/cgi/query-pr.cgi?pr=(number).

The following is a listing of current problems submitted by FreeBSD users.
These represent problem reports covering all versions including
experimental development code and obsolete releases.

S Tracker      Resp. Description
--------------------------------------------------------------------------------
o kern/173481  net   [NFS] RH63 NFSv4 client does not reconnect to FreeBSD
o kern/173479  net   [nfs] chown and chgrp operations fail between FreeBSD
o kern/173475  net   [tun] tun(4) stays opened by PID after process is term
o kern/173201  net   [ixgbe] [patch] Missing / broken ixgbe sysctl's and tu
o kern/173137  net   [em] em(4) unable to run at gigabit with 9.1-RC2
o kern/173002  net   [patch] data type size problem in if_spppsubr.c
o kern/172985  net   [patch] [ip6] lltable leak when adding and removing IP
o kern/172895  net   [ixgb] [ixgbe] do not properly determine link-state
o kern/172683  net   [ip6] Duplicate IPv6 Link Local Addresses
o kern/172675  net   [netinet] [patch] sysctl_tcp_hc_list (net.inet.tcp.hos
o kern/172113  net   [panic] [e1000] [patch] 9.1-RC1/amd64 panices in igb(4
o kern/171840  net   [ip6] IPv6 packets transmitting only on queue 0
o kern/171838  net   [oce] [patch] Possible lock reversal and duplicate loc
o kern/171739  net   [bce] [panic] bce related kernel panic
o kern/171728  net   [arp] arp issue
o kern/171711  net   [dummynet] [panic] Kernel panic in dummynet
o kern/171697  net   [ip6] [ndp] panic when changing routes
o kern/171532  net   [ndis] ndis(4) driver includes 'pccard'-specific code,
o kern/171531  net   [ndis] undocumented dependency for ndis(4)
o kern/171524  net   [ipmi] ipmi driver crashes kernel by reboot or shutdow
s kern/171508  net   [epair] [request] Add the ability to name epair device
o kern/171228  net   [re] [patch] if_re - eeprom write issues
o kern/170701  net   [ppp] killl ppp or reboot with active ppp connection c
o kern/170267  net   [ixgbe] IXGBE_LE32_TO_CPUS is probably an unintentiona
o kern/170081  net   [fxp] pf/nat/jails not working if checksum offloading
o kern/169898  net   ifconfig(8) fails to set MTU on multiple interfaces.
o kern/169676  net   [bge] [hang] system hangs, fully or partially after re
o kern/169664  net   [bgp] Wrongful replacement of interface connected net
o kern/169620  net   [ng] [pf] ng_l2tp incoming packet bypass pf firewall
o kern/169459  net   [ppp] umodem/ppp/3g stopped working after update from
o kern/169438  net   [ipsec] ipv4-in-ipv6 tunnel mode IPsec does not work
p kern/168294  net   [ixgbe] [patch] ixgbe driver compiled in kernel has no
o kern/168246  net   [em] Multiple em(4) not working with qemu
o kern/168245  net   [arp] [regression] Permanent ARP entry not deleted on
o kern/168244  net   [arp] [regression] Unable to manually remove permanent
o kern/168183  net   [bce] bce driver hang system
o kern/167947  net   [setfib] [patch] arpresolve checks only the default FI
o kern/167603  net   [ip] IP fragment reassembly's broken: file transfer ov
o kern/167500  net   [em] [panic] Kernel panics in em driver
o kern/167325  net   [netinet] [patch] sosend sometimes return EINVAL with
o kern/167202  net   [igmp]: Sending multiple IGMP packets crashes kernel
o kern/167059  net   [tcp] [panic] System does panic in in_pcbbind() and ha
o kern/166940  net   [ipfilter] [panic] Double fault in kern 8.2
o kern/166462  net   [gre] gre(4) when using a tunnel source address from c
o kern/166372  net   [patch] ipfilter drops UDP packets with zero checksum
o kern/166285  net   [arp] FreeBSD v8.1 REL p8 arp: unknown hardware addres
o kern/166255  net   [net] [patch] It should be possible to disable "promis
o kern/165963  net   [panic] [ipf] ipfilter/nat NULL pointer deference
o kern/165903  net   mbuf leak
o kern/165643  net   [net] [patch] Missing vnet restores in net/if_ethersub
o kern/165622  net   [ndis][panic][patch] Unregistered use of FPU in kernel
s kern/165562  net   [request] add support for Intel i350 in FreeBSD 7.4
o kern/165526  net   [bxe] UDP packets checksum calculation whithin if_bxe
o kern/165488  net   [ppp] [panic] Fatal trap 12 jails and ppp , kernel wit
o kern/165305  net   [ip6] [request] Feature parity between IP_TOS and IPV6
o kern/165296  net   [vlan] [patch] Fix EVL_APPLY_VLID, update EVL_APPLY_PR
o kern/165181  net   [igb] igb freezes after about 2 weeks of uptime
o kern/165174  net   [patch] [tap] allow tap(4) to keep its address on clos
o kern/165152  net   [ip6] Does not work through the issue of ipv6 addresse
o kern/164495  net   [igb] connect double head igb to switch cause system t
o kern/164490  net   [pfil] Incorrect IP checksum on pfil pass from ip_outp
o kern/164475  net   [gre] gre misses RUNNING flag after a reboot
o kern/164265  net   [netinet] [patch] tcp_lro_rx computes wrong checksum i
o kern/163903  net   [igb] "igb0:tx(0)","bpf interface lock" v2.2.5 9-STABL
o kern/163481  net   freebsd do not add itself to ping route packet
o kern/162927  net   [tun] Modem-PPP error ppp[1538]: tun0: Phase: Clearing
o kern/162926  net   [ipfilter] Infinite loop in ipfilter with fragmented I
o kern/162558  net   [dummynet] [panic] seldom dummynet panics
o kern/162153  net   [em] intel em driver 7.2.4 don't compile
o kern/162110  net   [igb] [panic] RELENG_9 panics on boot in IGB driver -
o kern/162028  net   [ixgbe] [patch] misplaced #endif in ixgbe.c
o kern/161277  net   [em] [patch] BMC cannot receive IPMI traffic after loa
o kern/160873  net   [igb] igb(4) from HEAD fails to build on 7-STABLE
o kern/160750  net   Intel PRO/1000 connection breaks under load until rebo
o kern/160693  net   [gif] [em] Multicast packet are not passed from GIF0 t
o kern/160293  net   [ieee80211] ppanic] kernel panic during network setup
o kern/160206  net   [gif] gifX stops working after a while (IPv6 tunnel)
o kern/159817  net   [udp] write UDPv4: No buffer space available (code=55)
o kern/159629  net   [ipsec] [panic] kernel panic with IPsec in transport m
o kern/159621  net   [tcp] [panic] panic: soabort: so_count
o kern/159603  net   [netinet] [patch] in_ifscrubprefix() - network route c
o kern/159601  net   [netinet] [patch] in_scrubprefix() - loopback route re
o kern/159294  net   [em] em watchdog timeouts
o kern/159203  net   [wpi] Intel 3945ABG Wireless LAN not support IBSS
o kern/158930  net   [bpf] BPF element leak in ifp->bpf_if->bif_dlist
o kern/158726  net   [ip6] [patch] ICMPv6 Router Announcement flooding limi
o kern/158694  net   [ix] [lagg] ix0 is not working within lagg(4)
o kern/158665  net   [ip6] [panic] kernel pagefault in in6_setscope()
o kern/158635  net   [em] TSO breaks BPF packet captures with em driver
f kern/157802  net   [dummynet] [panic] kernel panic in dummynet
o kern/157785  net   amd64 + jail + ipfw + natd = very slow outbound traffi
o kern/157418  net   [em] em driver lockup during boot on Supermicro X9SCM-
o kern/157410  net   [ip6] IPv6 Router Advertisements Cause Excessive CPU U
o kern/157287  net   [re] [panic] INVARIANTS panic (Memory modified after f
o kern/157209  net   [ip6] [patch] locking error in rip6_input() (sys/netin
o kern/157200  net   [network.subr] [patch] stf(4) can not communicate betw
o kern/157182  net   [lagg] lagg interface not working together with epair
o kern/156877  net   [dummynet] [panic] dummynet move_pkt() null ptr derefe
o kern/156667  net   [em] em0 fails to init on CURRENT after March 17
o kern/156408  net   [vlan] Routing failure when using VLANs vs. Physical e
o kern/156328  net   [icmp]: host can ping other subnet but no have IP from
o kern/156317  net   [ip6] Wrong order of IPv6 NS DAD/MLD Report
o kern/156283  net   [ip6] [patch] nd6_ns_input - rtalloc_mpath does not re
o kern/156279  net   [if_bridge][divert][ipfw] unable to correctly re-injec
o kern/156226  net   [lagg]: failover does not announce the failover to swi
o kern/156030  net   [ip6] [panic] Crash in nd6_dad_start() due to null ptr
o kern/155772  net   ifconfig(8): ioctl (SIOCAIFADDR): File exists on direc
o kern/155680  net   [multicast] problems with multicast
s kern/155642  net   [request] Add driver for Realtek RTL8191SE/RTL8192SE W
o kern/155597  net   [panic] Kernel panics with "sbdrop" message
o kern/155420  net   [vlan] adding vlan break existent vlan
o kern/155177  net   [route] [panic] Panic when inject routes in kernel
p kern/155030  net   [igb] igb(4) DEVICE_POLLING does not work with carp(4)
o kern/155010  net   [msk] ntfs-3g via iscsi using msk driver cause kernel
o kern/154943  net   [gif] ifconfig gifX create on existing gifX clears IP
s kern/154851  net   [request]: Port brcm80211 driver from Linux to FreeBSD
o kern/154850  net   [netgraph] [patch] ng_ether fails to name nodes when t
o kern/154679  net   [em] Fatal trap 12: "em1 taskq" only at startup (8.1-R
o kern/154600  net   [tcp] [panic] Random kernel panics on tcp_output
o kern/154557  net   [tcp] Freeze tcp-session of the clients, if in the gat
o kern/154443  net   [if_bridge] Kernel module bridgestp.ko missing after u
o kern/154286  net   [netgraph] [panic] 8.2-PRERELEASE panic in netgraph
o kern/154255  net   [nfs] NFS not responding
o kern/154214  net   [stf] [panic] Panic when creating stf interface
o kern/154185  net   race condition in mb_dupcl
o kern/154169  net   [multicast] [ip6] Node Information Query multicast add
o kern/154134  net   [ip6] stuck kernel state in LISTEN on ipv6 daemon whic
o kern/154091  net   [netgraph] [panic] netgraph, unaligned mbuf?
o conf/154062  net   [vlan] [patch] change to way of auto-generatation of v
o kern/153937  net   [ral] ralink panics the system (amd64 freeBSDD 8.X) wh
o kern/153936  net   [ixgbe] [patch] MPRC workaround incorrectly applied to
o kern/153816  net   [ixgbe] ixgbe doesn't work properly with the Intel 10g
o kern/153772  net   [ixgbe] [patch] sysctls reference wrong XON/XOFF varia
o kern/153497  net   [netgraph] netgraph panic due to race conditions
o kern/153454  net   [patch] [wlan] [urtw] Support ad-hoc and hostap modes
o kern/153308  net   [em] em interface use 100% cpu
o kern/153244  net   [em] em(4) fails to send UDP to port 0xffff
o kern/152893  net   [netgraph] [panic] 8.2-PRERELEASE panic in netgraph
o kern/152853  net   [em] tftpd (and likely other udp traffic) fails over e
o kern/152828  net   [em] poor performance on 8.1, 8.2-PRE
o kern/152569  net   [net]: Multiple ppp connections and routing table prob
o kern/152235  net   [arp] Permanent local ARP entries are not properly upd
o kern/152141  net   [vlan] [patch] encapsulate vlan in ng_ether before out
o kern/152036  net   [libc] getifaddrs(3) returns truncated sockaddrs for n
o kern/151690  net   [ep] network connectivity won't work until dhclient is
o kern/151681  net   [nfs] NFS mount via IPv6 leads to hang on client with
o kern/151593  net   [igb] [panic] Kernel panic when bringing up igb networ
o kern/150920  net   [ixgbe][igb] Panic when packets are dropped with heade
o kern/150557  net   [igb] igb0: Watchdog timeout -- resetting
o kern/150251  net   [patch] [ixgbe] Late cable insertion broken
o kern/150249  net   [ixgbe] Media type detection broken
o bin/150224   net   ppp(8) does not reassign static IP after kill -KILL co
f kern/149969  net   [wlan] [ral] ralink rt2661 fails to maintain connectio
o kern/149937  net   [ipfilter] [patch] kernel panic in ipfilter IP fragmen
o kern/149643  net   [rum] device not sending proper beacon frames in ap mo
o kern/149609  net   [panic] reboot after adding second default route
o kern/149117  net   [inet] [patch] in_pcbbind: redundant test
o kern/149086  net   [multicast] Generic multicast join failure in 8.1
o kern/148018  net   [flowtable] flowtable crashes on ia64
o kern/147912  net   [boot] FreeBSD 8 Beta won't boot on Thinkpad i1300 11
o kern/147894  net   [ipsec] IPv6-in-IPv4 does not work inside an ESP-only
o kern/147155  net   [ip6] setfb not work with ipv6
o kern/146845  net   [libc] close(2) returns error 54 (connection reset by
f kern/146792  net   [flowtable] flowcleaner 100% cpu's core load
o kern/146719  net   [pf] [panic] PF or dumynet kernel panic
o kern/146534  net   [icmp6] wrong source address in echo reply
o kern/146427  net   [mwl] Additional virtual access points don't work on m
f kern/146394  net   [vlan] IP source address for outgoing connections
o bin/146377   net   [ppp] [tun] Interface doesn't clear addresses when PPP
o kern/146358  net   [vlan] wrong destination MAC address
o kern/146165  net   [wlan] [panic] Setting bssid in adhoc mode causes pani
o kern/146082  net   [ng_l2tp] a false invaliant check was performed in ng_
o kern/146037  net   [panic] mpd + CoA = kernel panic
o kern/145825  net   [panic] panic: soabort: so_count
o kern/145728  net   [lagg] Stops working lagg between two servers.
p kern/145600  net   TCP/ECN behaves different to CE/CWR than ns2 reference
f kern/144917  net   [flowtable] [panic] flowtable crashes system [regressi
o kern/144882  net   MacBookPro =>4.1 does not connect to BSD in hostap wit
o kern/144874  net   [if_bridge] [patch] if_bridge frees mbuf after pfil ho
o conf/144700  net   [rc.d] async dhclient breaks stuff for too many people
o kern/144616  net   [nat] [panic] ip_nat panic FreeBSD 7.2
f kern/144315  net   [ipfw] [panic] freebsd 8-stable reboot after add ipfw
o kern/144231  net   bind/connect/sendto too strict about sockaddr length
o kern/143846  net   [gif] bringing gif3 tunnel down causes gif0 tunnel to
s kern/143673  net   [stf] [request] there should be a way to support multi
s kern/143666  net   [ip6] [request] PMTU black hole detection not implemen
o kern/143622  net   [pfil] [patch] unlock pfil lock while calling firewall
o kern/143593  net   [ipsec] When using IPSec, tcpdump doesn't show outgoin
o kern/143591  net   [ral] RT2561C-based DLink card (DWL-510) fails to work
o kern/143208  net   [ipsec] [gif] IPSec over gif interface not working
o kern/143034  net   [panic] system reboots itself in tcp code [regression]
o kern/142877  net   [hang] network-related repeatable 8.0-STABLE hard hang
o kern/142774  net   Problem with outgoing connections on interface with mu
o kern/142772  net   [libc] lla_lookup: new lle malloc failed
f kern/142518  net   [em] [lagg] Problem on 8.0-STABLE with em and lagg
o kern/142018  net   [iwi] [patch] Possibly wrong interpretation of beacon-
o kern/141861  net   [wi] data garbled with WEP and wi(4) with Prism 2.5
f kern/141741  net   Etherlink III NIC won't work after upgrade to FBSD 8,
o kern/140742  net   rum(4) Two asus-WL167G adapters cannot talk to each ot
o kern/140682  net   [netgraph] [panic] random panic in netgraph
f kern/140634  net   [vlan] destroying if_lagg interface with if_vlan membe
o kern/140619  net   [ifnet] [patch] refine obsolete if_var.h comments desc
o kern/140346  net   [wlan] High bandwidth use causes loss of wlan connecti
o kern/140142  net   [ip6] [panic] FreeBSD 7.2-amd64 panic w/IPv6
o kern/140066  net   [bwi] install report for 8.0 RC 2 (multiple problems)
o kern/139565  net   [ipfilter] ipfilter ioctl SIOCDELST broken
o kern/139387  net   [ipsec] Wrong lenth of PF_KEY messages in promiscuous
o bin/139346   net   [patch] arp(8) add option to remove static entries lis
o kern/139268  net   [if_bridge] [patch] allow if_bridge to forward just VL
p kern/139204  net   [arp] DHCP server replies rejected, ARP entry lost bef
o kern/139117  net   [lagg] + wlan boot timing (EBUSY)
o kern/139058  net   [ipfilter] mbuf cluster leak on FreeBSD 7.2
o kern/138850  net   [dummynet] dummynet doesn't work correctly on a bridge
o kern/138782  net   [panic] sbflush_internal: cc 0 || mb 0xffffff004127b00
o kern/138688  net   [rum] possibly broken on 8 Beta 4 amd64: able to wpa a
o kern/138678  net   [lo] FreeBSD does not assign linklocal address to loop
o kern/138407  net   [gre] gre(4) interface does not come up after reboot
o kern/138332  net   [tun] [lor] ifconfig tun0 destroy causes LOR if_adata/
o kern/138266  net   [panic] kernel panic when udp benchmark test used as r
o kern/138177  net   [ipfilter] FreeBSD crashing repeatedly in ip_nat.c:257
f kern/138029  net   [bpf] [panic] periodically kernel panic and reboot
o kern/137881  net   [netgraph] [panic] ng_pppoe fatal trap 12
p bin/137841   net   [patch] wpa_supplicant(8) cannot verify SHA256 signed
p kern/137776  net   [rum] panic in rum(4) driver on 8.0-BETA2
o bin/137641   net   ifconfig(8): various problems with "vlan_device.vlan_i
o kern/137392  net   [ip] [panic] crash in ip_nat.c line 2577
o kern/137372  net   [ral] FreeBSD doesn't support wireless interface from
o kern/137089  net   [lagg] lagg falsely triggers IPv6 duplicate address de
o bin/136994   net   [patch] ifconfig(8) print carp mac address
o kern/136911  net   [netgraph] [panic] system panic on kldload ng_bpf.ko t
o kern/136618  net   [pf][stf] panic on cloning interface without unit numb
o kern/135502  net   [periodic] Warning message raised by rtfree function i
o kern/134583  net   [hang] Machine with jail freezes after random amount o
o kern/134531  net   [route] [panic] kernel crash related to routes/zebra
o kern/134157  net   [dummynet] dummynet loads cpu for 100% and make a syst
o kern/133969  net   [dummynet] [panic] Fatal trap 12: page fault while in
o kern/133968  net   [dummynet] [panic] dummynet kernel panic
o kern/133736  net   [udp] ip_id not protected ...
o kern/133595  net   [panic] Kernel Panic at pcpu.h:195
o kern/133572  net   [ppp] [hang] incoming PPTP connection hangs the system
o kern/133490  net   [bpf] [panic] 'kmem_map too small' panic on Dell r900
o kern/133235  net   [netinet] [patch] Process SIOCDLIFADDR command incorre
f kern/133213  net   arp and sshd errors on 7.1-PRERELEASE
o kern/133060  net   [ipsec] [pfsync] [panic] Kernel panic with ipsec + pfs
o kern/132889  net   [ndis] [panic] NDIS kernel crash on load BCM4321 AGN d
o conf/132851  net   [patch] rc.conf(5): allow to setfib(1) for service run
o kern/132734  net   [ifmib] [panic] panic in net/if_mib.c
o kern/132705  net   [libwrap] [patch] libwrap - infinite loop if hosts.all
o kern/132672  net   [ndis] [panic] ndis with rt2860.sys causes kernel pani
o kern/132554  net   [ipl] There is no ippool start script/ipfilter magic t
o kern/132354  net   [nat] Getting some packages to ipnat(8) causes crash
o kern/132277  net   [crypto] [ipsec] poor performance using cryptodevice f
o kern/131781  net   [ndis] ndis keeps dropping the link
o kern/131776  net   [wi] driver fails to init
o kern/131753  net   [altq] [panic] kernel panic in hfsc_dequeue
o kern/131601  net   [ipfilter] [panic] 7-STABLE panic in nat_finalise (tcp
o bin/131567   net   [socket] [patch] Update for regression/sockets/unix_cm
o bin/131365   net   route(8): route add changes interpretation of network
f kern/130820  net   [ndis] wpa_supplicant(8) returns 'no space on device'
o kern/130628  net   [nfs] NFS / rpc.lockd deadlock on 7.1-R
o conf/130555  net   [rc.d] [patch] No good way to set ipfilter variables a
o kern/130525  net   [ndis] [panic] 64 bit ar5008 ndisgen-erated
driver cau o kern/130311 net [wlan_xauth] [panic] hostapd restart causing kernel pa o kern/130109 net [ipfw] Can not set fib for packets originated from loc f kern/130059 net [panic] Leaking 50k mbufs/hour f kern/129719 net [nfs] [panic] Panic during shutdown, tcp_ctloutput: in o kern/129517 net [ipsec] [panic] double fault / stack overflow f kern/129508 net [carp] [panic] Kernel panic with EtherIP (may be relat o kern/129219 net [ppp] Kernel panic when using kernel mode ppp o kern/129197 net [panic] 7.0 IP stack related panic o bin/128954 net ifconfig(8) deletes valid routes o bin/128602 net [an] wpa_supplicant(8) crashes with an(4) o kern/128448 net [nfs] 6.4-RC1 Boot Fails if NFS Hostname cannot be res o bin/128295 net [patch] ifconfig(8) does not print TOE4 or TOE6 capabi o bin/128001 net wpa_supplicant(8), wlan(4), and wi(4) issues o kern/127826 net [iwi] iwi0 driver has reduced performance and connecti o kern/127815 net [gif] [patch] if_gif does not set vlan attributes from o kern/127724 net [rtalloc] rtfree: 0xc5a8f870 has 1 refs f bin/127719 net [arp] arp: Segmentation fault (core dumped) f kern/127528 net [icmp]: icmp socket receives icmp replies not owned by p kern/127360 net [socket] TOE socket options missing from sosetopt() o bin/127192 net routed(8) removes the secondary alias IP of interface f kern/127145 net [wi]: prism (wi) driver crash at bigger traffic o kern/126895 net [patch] [ral] Add antenna selection (marked as TBD) o kern/126874 net [vlan]: Zebra problem if ifconfig vlanX destroy o kern/126695 net rtfree messages and network disruption upon use of if_ o kern/126339 net [ipw] ipw driver drops the connection o kern/126075 net [inet] [patch] internet control accesses beyond end of o bin/125922 net [patch] Deadlock in arp(8) o kern/125920 net [arp] Kernel Routing Table loses Ethernet Link status o kern/125845 net [netinet] [patch] tcp_lro_rx() should make use of hard o kern/125258 net [socket] socket's SO_REUSEADDR option does not work o 
kern/125239 net [gre] kernel crash when using gre o kern/124341 net [ral] promiscuous mode for wireless device ral0 looses o kern/124225 net [ndis] [patch] ndis network driver sometimes loses net o kern/124160 net [libc] connect(2) function loops indefinitely o kern/124021 net [ip6] [panic] page fault in nd6_output() o kern/123968 net [rum] [panic] rum driver causes kernel panic with WPA. o kern/123892 net [tap] [patch] No buffer space available o kern/123890 net [ppp] [panic] crash & reboot on work with PPP low-spee o kern/123858 net [stf] [patch] stf not usable behind a NAT o kern/123796 net [ipf] FreeBSD 6.1+VPN+ipnat+ipf: port mapping does not o kern/123758 net [panic] panic while restarting net/freenet6 o bin/123633 net ifconfig(8) doesn't set inet and ether address in one o kern/123559 net [iwi] iwi periodically disassociates/associates [regre o bin/123465 net [ip6] route(8): route add -inet6 -interfac o kern/123463 net [ipsec] [panic] repeatable crash related to ipsec-tool o conf/123330 net [nsswitch.conf] Enabling samba wins in nsswitch.conf c o kern/123160 net [ip] Panic and reboot at sysctl kern.polling.enable=0 o kern/122989 net [swi] [panic] 6.3 kernel panic in swi1: net o kern/122954 net [lagg] IPv6 EUI64 incorrectly chosen for lagg devices f kern/122780 net [lagg] tcpdump on lagg interface during high pps wedge o kern/122685 net It is not visible passing packets in tcpdump(1) o kern/122319 net [wi] imposible to enable ad-hoc demo mode with Orinoco o kern/122290 net [netgraph] [panic] Netgraph related "kmem_map too smal o kern/122252 net [ipmi] [bge] IPMI problem with BCM5704 (does not work o kern/122033 net [ral] [lor] Lock order reversal in ral0 at bootup ieee o bin/121895 net [patch] rtsol(8)/rtsold(8) doesn't handle managed netw s kern/121774 net [swi] [panic] 6.3 kernel panic in swi1: net o kern/121555 net [panic] Fatal trap 12: current process = 12 (swi1: net o kern/121443 net [gif] [lor] icmp6_input/nd6_lookup o kern/121437 net [vlan] Routing to 
layer-2 address does not work on VLA o bin/121359 net [patch] [security] ppp(8): fix local stack overflow in o kern/121257 net [tcp] TSO + natd -> slow outgoing tcp traffic o kern/121181 net [panic] Fatal trap 3: breakpoint instruction fault whi o kern/120966 net [rum] kernel panic with if_rum and WPA encryption o kern/120566 net [request]: ifconfig(8) make order of arguments more fr o kern/120304 net [netgraph] [patch] netgraph source assumes 32-bit time o kern/120266 net [udp] [panic] gnugk causes kernel panic when closing U o bin/120060 net routed(8) deletes link-level routes in the presence of o kern/119945 net [rum] [panic] rum device in hostap mode, cause kernel o kern/119791 net [nfs] UDP NFS mount of aliased IP addresses from a Sol o kern/119617 net [nfs] nfs error on wpa network when reseting/shutdown f kern/119516 net [ip6] [panic] _mtx_lock_sleep: recursed on non-recursi o kern/119432 net [arp] route add -host -iface causes arp e o kern/119225 net [wi] 7.0-RC1 no carrier with Prism 2.5 wifi card [regr o kern/118727 net [netgraph] [patch] [request] add new ng_pf module o kern/117423 net [vlan] Duplicate IP on different interfaces o bin/117339 net [patch] route(8): loading routing management commands o bin/116643 net [patch] [request] fstat(1): add INET/INET6 socket deta o kern/116185 net [iwi] if_iwi driver leads system to reboot o kern/115239 net [ipnat] panic with 'kmem_map too small' using ipnat o kern/115019 net [netgraph] ng_ether upper hook packet flow stops on ad o kern/115002 net [wi] if_wi timeout. failed allocation (busy bit). 
ifco o kern/114915 net [patch] [pcn] pcn (sys/pci/if_pcn.c) ethernet driver f o kern/113432 net [ucom] WARNING: attempt to net_add_domain(netgraph) af o kern/112722 net [ipsec] [udp] IP v4 udp fragmented packet reject o kern/112686 net [patm] patm driver freezes System (FreeBSD 6.2-p4) i38 o bin/112557 net [patch] ppp(8) lock file should not use symlink name o kern/112528 net [nfs] NFS over TCP under load hangs with "impossible p o kern/111537 net [inet6] [patch] ip6_input() treats mbuf cluster wrong o kern/111457 net [ral] ral(4) freeze o kern/110284 net [if_ethersubr] Invalid Assumption in SIOCSIFADDR in et o kern/110249 net [kernel] [regression] [patch] setsockopt() error regre o kern/109470 net [wi] Orinoco Classic Gold PC Card Can't Channel Hop o bin/108895 net pppd(8): PPPoE dead connections on 6.2 [regression] o kern/107944 net [wi] [patch] Forget to unlock mutex-locks o conf/107035 net [patch] bridge(8): bridge interface given in rc.conf n o kern/106444 net [netgraph] [panic] Kernel Panic on Binding to an ip to o kern/106316 net [dummynet] dummynet with multipass ipfw drops packets o kern/105945 net Address can disappear from network interface s kern/105943 net Network stack may modify read-only mbuf chain copies o bin/105925 net problems with ifconfig(8) and vlan(4) [regression] o kern/104851 net [inet6] [patch] On link routes not configured when usi o kern/104751 net [netgraph] kernel panic, when getting info about my tr o kern/103191 net Unpredictable reboot o kern/103135 net [ipsec] ipsec with ipfw divert (not NAT) encodes a pac o kern/102540 net [netgraph] [patch] supporting vlan(4) by ng_fec(4) o conf/102502 net [netgraph] [patch] ifconfig name does't rename netgrap o kern/102035 net [plip] plip networking disables parallel port printing o kern/101948 net [ipf] [panic] Kernel Panic Trap No 12 Page Fault - cau o kern/100709 net [libc] getaddrinfo(3) should return TTL info o kern/100519 net [netisr] suggestion to fix suboptimal network polling o 
kern/98978 net [ipf] [patch] ipfilter drops OOW packets under 6.1-Rel o kern/98597 net [inet6] Bug in FreeBSD 6.1 IPv6 link-local DAD procedu o bin/98218 net wpa_supplicant(8) blacklist not working o kern/97306 net [netgraph] NG_L2TP locks after connection with failed o conf/97014 net [gif] gifconfig_gif? in rc.conf does not recognize IPv f kern/96268 net [socket] TCP socket performance drops by 3000% if pack o kern/95519 net [ral] ral0 could not map mbuf o kern/95288 net [pppd] [tty] [panic] if_ppp panic in sys/kern/tty_subr o kern/95277 net [netinet] [patch] IP Encapsulation mask_match() return o kern/95267 net packet drops periodically appear f kern/93378 net [tcp] Slow data transfer in Postfix and Cyrus IMAP (wo o kern/93019 net [ppp] ppp and tunX problems: no traffic after restarti o kern/92880 net [libc] [patch] almost rewritten inet_network(3) functi s kern/92279 net [dc] Core faults everytime I reboot, possible NIC issu o kern/91859 net [ndis] if_ndis does not work with Asus WL-138 s kern/91777 net [ipf] [patch] wrong behaviour with skip rule inside an o kern/91364 net [ral] [wep] WF-511 RT2500 Card PCI and WEP o kern/91311 net [aue] aue interface hanging o kern/87521 net [ipf] [panic] using ipfilter "auth" keyword leads to k o kern/87421 net [netgraph] [panic]: ng_ether + ng_eiface + if_bridge o kern/86871 net [tcp] [patch] allocation logic for PCBs in TIME_WAIT s o kern/86427 net [lor] Deadlock with FASTIPSEC and nat o kern/86103 net [ipf] Illegal NAT Traversal in IPFilter o kern/85780 net 'panic: bogus refcnt 0' in routing/ipv6 o bin/85445 net ifconfig(8): deprecated keyword to ifconfig inoperativ p kern/85320 net [gre] [patch] possible depletion of kernel stack in ip o bin/82975 net route change does not parse classfull network as given o kern/82881 net [netgraph] [panic] ng_fec(4) causes kernel panic after o kern/82468 net Using 64MB tcp send/recv buffers, trafficflow stops, i o bin/82185 net [patch] ndp(8) can delete the incorrect entry o kern/81095 
net IPsec connection stops working if associated network i o kern/78968 net FreeBSD freezes on mbufs exhaustion (network interface o kern/78090 net [ipf] ipf filtering on bridged packets doesn't work if o kern/77341 net [ip6] problems with IPV6 implementation s kern/77195 net [ipf] [patch] ipfilter ioctl SIOCGNATL does not match o kern/75873 net Usability problem with non-RFC-compliant IP spoof prot s kern/75407 net [an] an(4): no carrier after short time a kern/71474 net [route] route lookup does not skip interfaces marked d o kern/71469 net default route to internet magically disappears with mu o kern/70904 net [ipf] ipfilter ipnat problem with h323 proxy support o kern/68889 net [panic] m_copym, length > size of mbuf chain o kern/66225 net [netgraph] [patch] extend ng_eiface(4) control message o kern/65616 net IPSEC can't detunnel GRE packets after real ESP encryp s kern/60293 net [patch] FreeBSD arp poison patch a kern/56233 net IPsec tunnel (ESP) over IPv6: MTU computation is wrong s bin/41647 net ifconfig(8) doesn't accept lladdr along with inet addr o kern/39937 net ipstealth issue a kern/38554 net [patch] changing interface ipaddress doesn't seem to w o kern/34665 net [ipf] [hang] ipfilter rcmd proxy "hangs". o kern/31940 net ip queue length too short for >500kpps o kern/31647 net [libc] socket calls can return undocumented EINVAL o kern/30186 net [libc] getaddrinfo(3) does not handle incorrect servna o kern/27474 net [ipf] [ppp] Interactive use of user PPP and ipfilter c f kern/24959 net [patch] proper TCP_NOPUSH/TCP_CORK compatibility o conf/23063 net [arp] [patch] for static ARP tables in rc.network o kern/21998 net [socket] [patch] ident only for outgoing connections o kern/5877 net [socket] sb_cc counts control data as well as data dat 430 problems total. 
From owner-freebsd-net@FreeBSD.ORG Mon Nov 12 17:43:48 2012 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id C3F8154F; Mon, 12 Nov 2012 17:43:48 +0000 (UTC) (envelope-from bright@mu.org) Received: from elvis.mu.org (elvis.mu.org [192.203.228.196]) by mx1.freebsd.org (Postfix) with ESMTP id A66FC8FC0C; Mon, 12 Nov 2012 17:43:48 +0000 (UTC) Received: from [10.0.1.17] (c-67-180-208-218.hsd1.ca.comcast.net [67.180.208.218]) by elvis.mu.org (Postfix) with ESMTPSA id 480BE1A3CEB; Mon, 12 Nov 2012 09:43:48 -0800 (PST) References: <50A0A0EF.3020109@mu.org> <50A0A502.1030306@networx.ch> <50A0B8DA.9090409@mu.org> <50A0C0F4.8010706@networx.ch> In-Reply-To: <50A0C0F4.8010706@networx.ch> Mime-Version: 1.0 (1.0) Content-Transfer-Encoding: 7bit Content-Type: text/plain; charset=us-ascii Message-Id: X-Mailer: iPhone Mail (9B206) From: Alfred Perlstein Subject: Re: auto tuning tcp Date: Mon, 12 Nov 2012 09:43:46 -0800 To: Andre Oppermann Cc: "freebsd-net@freebsd.org" , Adrian Chadd , Peter Wemm X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 12 Nov 2012 17:43:48 -0000 On Nov 12, 2012, at 1:27 AM, Andre Oppermann wrote: > On 12.11.2012 09:52, Alfred Perlstein wrote: >> On 11/11/12 11:28 PM, Andre Oppermann wrote: >>> On 12.11.2012 08:10, Alfred Perlstein wrote: >>>> I noticed that TCBHASHSIZE does not autotune. >>>> >>>> What do you think of the following algorithm? >>>> >>>> Basically round down to next power of two based on nmbclusters / 64. >>> >>> Please wait out for a real fix of the various mbuf-whatever tuning >>> issue I'll propose shortly. This approach may become inapproriate. >>> Also the mbuf limits can be changed at runtime by sysctl. >>> >> What is the timeline you are asking for to wait? 
> > http://svnweb.freebsd.org/changeset/base/242910 Very cool! So instead of nmbclusters, will maxsockets work? Ideas/suggestions? -Alfred. From owner-freebsd-net@FreeBSD.ORG Mon Nov 12 18:01:10 2012 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id DE44A804 for ; Mon, 12 Nov 2012 18:01:10 +0000 (UTC) (envelope-from oppermann@networx.ch) Received: from c00l3r.networx.ch (c00l3r.networx.ch [62.48.2.2]) by mx1.freebsd.org (Postfix) with ESMTP id 3CA098FC16 for ; Mon, 12 Nov 2012 18:01:09 +0000 (UTC) Received: (qmail 18936 invoked from network); 12 Nov 2012 19:35:29 -0000 Received: from c00l3r.networx.ch (HELO [127.0.0.1]) ([62.48.2.2]) (envelope-sender ) by c00l3r.networx.ch (qmail-ldap-1.03) with SMTP for ; 12 Nov 2012 19:35:29 -0000 Message-ID: <50A13961.1030909@networx.ch> Date: Mon, 12 Nov 2012 19:01:05 +0100 From: Andre Oppermann User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:16.0) Gecko/20121010 Thunderbird/16.0.1 MIME-Version: 1.0 To: Alfred Perlstein Subject: Re: auto tuning tcp References: <50A0A0EF.3020109@mu.org> <50A0A502.1030306@networx.ch> <50A0B8DA.9090409@mu.org> <50A0C0F4.8010706@networx.ch> In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: "freebsd-net@freebsd.org" , Adrian Chadd , Peter Wemm X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 12 Nov 2012 18:01:11 -0000 On 12.11.2012 18:43, Alfred Perlstein wrote: > > > On Nov 12, 2012, at 1:27 AM, Andre Oppermann wrote: > >> On 12.11.2012 09:52, Alfred Perlstein wrote: >>> On 11/11/12 11:28 PM, Andre Oppermann wrote: >>>> On 12.11.2012 08:10, Alfred Perlstein wrote: >>>>> I noticed that TCBHASHSIZE does not autotune. 
>>>>> >>>>> What do you think of the following algorithm? >>>>> >>>>> Basically round down to next power of two based on nmbclusters / 64. >>>> >>>> Please wait out for a real fix of the various mbuf-whatever tuning >>>> issue I'll propose shortly. This approach may become inapproriate. >>>> Also the mbuf limits can be changed at runtime by sysctl. >>>> >>> What is the timeline you are asking for to wait? >> >> http://svnweb.freebsd.org/changeset/base/242910 > > Very cool! > > So instead of nmbclusters, will maxsockets work? Ideas/suggestions? I've already added the tunable "kern.maxmbufmem" which is in pages. That's probably not very convenient to work with. I can change it to a percentage of phymem/kva. Would that make you happy? -- Andre From owner-freebsd-net@FreeBSD.ORG Mon Nov 12 18:05:01 2012 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 2F54C8D9 for ; Mon, 12 Nov 2012 18:05:01 +0000 (UTC) (envelope-from dustinwenz@ebureau.com) Received: from internet02.ebureau.com (internet02.tru-signal.biz [65.127.24.21]) by mx1.freebsd.org (Postfix) with ESMTP id E2CE18FC14 for ; Mon, 12 Nov 2012 18:05:00 +0000 (UTC) Received: from service02.office.ebureau.com (internet06.ebureau.com [65.127.24.25]) by internet02.ebureau.com (Postfix) with ESMTP id 25B80E0C34E for ; Mon, 12 Nov 2012 11:57:02 -0600 (CST) Received: from localhost (localhost [127.0.0.1]) by service02.office.ebureau.com (Postfix) with ESMTP id 211CCDC91B1 for ; Mon, 12 Nov 2012 11:57:02 -0600 (CST) X-Virus-Scanned: amavisd-new at ebureau.com Received: from service02.office.ebureau.com ([127.0.0.1]) by localhost (internet06.ebureau.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 3eKmXcZcWB3x for ; Mon, 12 Nov 2012 11:57:01 -0600 (CST) Received: from square.office.iscompanies.com (square.office.iscompanies.com [10.10.20.22]) by service02.office.ebureau.com (Postfix) with ESMTPSA id 
9F95ADC91A2 for ; Mon, 12 Nov 2012 11:57:01 -0600 (CST) From: Dustin Wenz Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: quoted-printable Subject: Default ephemeral port range Message-Id: <87A2D317-77BA-4641-979D-0AE43247D99E@ebureau.com> Date: Mon, 12 Nov 2012 11:57:01 -0600 To: freebsd-net@freebsd.org Mime-Version: 1.0 (Mac OS X Mail 6.1 \(1498\)) X-Mailer: Apple Mail (2.1498) X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 12 Nov 2012 18:05:01 -0000 I'm trying to determine why the default ephemeral port range appears to be 10000 through 65535 in at least 8.1 through 9.1RC. Documentation regarding the lower bound on the range seems inconsistent. The FreeBSD website (http://wiki.freebsd.org/SystemTuning) suggests that net.inet.ip.portrange.first defaults to 49152, which I don't believe is accurate. The IANA recommends the range be 49152 through 65535 (http://tools.ietf.org/html/rfc6056). Is there any particular reason why net.inet.ip.portrange.first defaults to 10000?
- .Dustin From owner-freebsd-net@FreeBSD.ORG Mon Nov 12 18:48:02 2012 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id BFE8556F; Mon, 12 Nov 2012 18:48:02 +0000 (UTC) (envelope-from bright@mu.org) Received: from elvis.mu.org (elvis.mu.org [192.203.228.196]) by mx1.freebsd.org (Postfix) with ESMTP id 9F04F8FC13; Mon, 12 Nov 2012 18:48:02 +0000 (UTC) Received: from Alfreds-MacBook-Pro-5.local (c-67-180-208-218.hsd1.ca.comcast.net [67.180.208.218]) by elvis.mu.org (Postfix) with ESMTPSA id C33021A3C6B; Mon, 12 Nov 2012 10:48:01 -0800 (PST) Message-ID: <50A14460.9020504@mu.org> Date: Mon, 12 Nov 2012 10:48:00 -0800 From: Alfred Perlstein User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7; rv:16.0) Gecko/20121026 Thunderbird/16.0.2 MIME-Version: 1.0 To: Andre Oppermann Subject: Re: auto tuning tcp References: <50A0A0EF.3020109@mu.org> <50A0A502.1030306@networx.ch> <50A0B8DA.9090409@mu.org> <50A0C0F4.8010706@networx.ch> <50A13961.1030909@networx.ch> In-Reply-To: <50A13961.1030909@networx.ch> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: "freebsd-net@freebsd.org" , Adrian Chadd , Peter Wemm X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 12 Nov 2012 18:48:02 -0000 On 11/12/12 10:01 AM, Andre Oppermann wrote: > On 12.11.2012 18:43, Alfred Perlstein wrote: >> >> >> On Nov 12, 2012, at 1:27 AM, Andre Oppermann >> wrote: >> >>> On 12.11.2012 09:52, Alfred Perlstein wrote: >>>> On 11/11/12 11:28 PM, Andre Oppermann wrote: >>>>> On 12.11.2012 08:10, Alfred Perlstein wrote: >>>>>> I noticed that TCBHASHSIZE does not autotune. >>>>>> >>>>>> What do you think of the following algorithm? 
>>>>>> >>>>>> Basically round down to next power of two based on nmbclusters / 64. >>>>> >>>>> Please wait out for a real fix of the various mbuf-whatever tuning >>>>> issue I'll propose shortly. This approach may become inappropriate. >>>>> Also the mbuf limits can be changed at runtime by sysctl. >>>>> >>>> What is the timeline you are asking for to wait? >>> >>> http://svnweb.freebsd.org/changeset/base/242910 >> >> Very cool! >> >> So instead of nmbclusters, will maxsockets work? Ideas/suggestions? > > I've already added the tunable "kern.maxmbufmem" which is in pages. > That's probably not very convenient to work with. I can change it > to a percentage of phymem/kva. Would that make you happy? > It really makes sense to size the hash table in relation to sockets rather than buffers. If you are hashing "foo-objects", you want the hash size to be derived from the maximum number of "foo-objects" you'll see, not derived backwards from the number of "bar-objects" that "foo-objects" contain, right? We are hashing the sockets, not the clusters. Maybe I'm wrong? I'm open to ideas.
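For illustration only, here is a minimal user-space sketch of the sizing rule discussed in this thread: take some object count, divide it by 64, and round down to a power of two. The helper name hashsize_from_count, the divisor parameter, and the floor of 1 are assumptions made for this sketch; it is not the code from the tree or from any proposed patch, and whether the input should be nmbclusters or maxsockets is exactly the open question above.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not from the FreeBSD tree): return the largest
 * power of two that is <= count / divisor, with a floor of 1 so the
 * result is always a usable hash table size.
 */
static unsigned long
hashsize_from_count(unsigned long count, unsigned long divisor)
{
	unsigned long target, size;

	if (divisor == 0)
		return (1);
	target = count / divisor;
	size = 1;
	/* Double only while the doubled value still fits under target. */
	while (size <= target / 2)
		size *= 2;
	return (size);
}
```

With a made-up limit of 100000 sockets and a divisor of 64, the target is 1562 and the helper returns 1024. Keeping the size a power of two lets a bucket index be computed with a mask instead of a modulo.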
-Alfred From owner-freebsd-net@FreeBSD.ORG Mon Nov 12 18:49:26 2012 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 6B44262F for ; Mon, 12 Nov 2012 18:49:26 +0000 (UTC) (envelope-from cokeeffe@gmail.com) Received: from mail-wi0-f170.google.com (mail-wi0-f170.google.com [209.85.212.170]) by mx1.freebsd.org (Postfix) with ESMTP id E4CC18FC0C for ; Mon, 12 Nov 2012 18:49:25 +0000 (UTC) Received: by mail-wi0-f170.google.com with SMTP id hm9so2535121wib.1 for ; Mon, 12 Nov 2012 10:49:19 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=subject:mime-version:content-type:from:in-reply-to:date:cc :content-transfer-encoding:message-id:references:to:x-mailer; bh=/dvqoVHI/gGJ9HhG/m+Po6JxrkI0Z3PE8vKwMowwovk=; b=nbMlGowBWB7cdtA/ANFrAUHu5DUrWjQN9oOMpxynTrnroYwgjHdac+BLQc5TJHHYWw TdG0qQBZhMy5NafOXWKm5Ulwb/xFA1N+MO/P8V3NzHOLZmbu9vcn/4GiJl78shA2gmBB 8GxZDAjWJ3aNbudiiav7ZivqAfMozikzB4/xGQPovnzqNDPXqUDl7wir7gl+QdzcGo6G fLMi1tJYWR+a6rObDib1njqU9CGFjODNCdg/owMWbItmOKyPuXuFy+0pEbC61B5aZE71 01xIKnP4dz1nu4W5aDvPrCa3G2NDk1wuheMrLe+pmuAuOzV03Itb0injUGsgVsHtUBmu xkJg== Received: by 10.216.226.220 with SMTP id b70mr7758828weq.10.1352746159033; Mon, 12 Nov 2012 10:49:19 -0800 (PST) Received: from [10.10.10.18] ([109.78.27.254]) by mx.google.com with ESMTPS id gz3sm3442338wib.2.2012.11.12.10.49.18 (version=TLSv1/SSLv3 cipher=OTHER); Mon, 12 Nov 2012 10:49:18 -0800 (PST) Subject: Re: Default ephemeral port range Mime-Version: 1.0 (Apple Message framework v1278) Content-Type: text/plain; charset=us-ascii From: Colin O'Keeffe In-Reply-To: <87A2D317-77BA-4641-979D-0AE43247D99E@ebureau.com> Date: Mon, 12 Nov 2012 18:49:17 +0000 Content-Transfer-Encoding: quoted-printable Message-Id: <95686CBD-5A11-48BD-A556-5133F537C82E@gmail.com> References: <87A2D317-77BA-4641-979D-0AE43247D99E@ebureau.com> To: Dustin Wenz X-Mailer: Apple Mail (2.1278) Cc: 
freebsd-net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 12 Nov 2012 18:49:26 -0000 8.1 through 9.1RC will use net.inet.ip.portrange.hifirst (49152) to .hilast (65535) for ephemeral ports as far as I'm aware. net.inet.ip.portrange.first to .last are just a reference to available port numbers as per RFC6056. Correct me if I'm wrong, but netinet/in_pcb.c:490 indicates this is the case. -Colin On 12 Nov 2012, at 17:57, Dustin Wenz wrote: > I'm trying to determine why the default ephemeral port range appears to be 10000 through 65535 in at least 8.1 through 9.1RC. Documentation regarding the lower bound on the range seems inconsistent. The FreeBSD website (http://wiki.freebsd.org/SystemTuning) suggests that net.inet.ip.portrange.first defaults to 49152, which I don't believe is accurate. > > The IANA recommends the range be 49152 through 65535 (http://tools.ietf.org/html/rfc6056). Is there any particular reason why net.inet.ip.portrange.first defaults to 10000?
> > - .Dustin > > _______________________________________________ > freebsd-net@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-net > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" From owner-freebsd-net@FreeBSD.ORG Mon Nov 12 19:16:03 2012 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 8FCBC473 for ; Mon, 12 Nov 2012 19:16:03 +0000 (UTC) (envelope-from fodillemlinkarim@gmail.com) Received: from mail-ia0-f182.google.com (mail-ia0-f182.google.com [209.85.210.182]) by mx1.freebsd.org (Postfix) with ESMTP id 512688FC08 for ; Mon, 12 Nov 2012 19:16:02 +0000 (UTC) Received: by mail-ia0-f182.google.com with SMTP id x2so85832iad.13 for ; Mon, 12 Nov 2012 11:16:02 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=message-id:date:from:user-agent:mime-version:to:subject :content-type:content-transfer-encoding; bh=/L2GN+b19cWYplZHr/34e8VOaKqJfMh4TPJrW3NZYVM=; b=z8asEgO7x6WkqjNWwY9Gni2PJo5aBZnderBWDPGHlQLAfRP27eKPhKl1MEsCBl8rNU ird4YRN6WPNzIVMgbrAMdZH8jlj+gICBL5lTQ3DxsoIfOTbCmvi5qJsVGWa08HJTmw3q FbOjLW/8ptY3x4O6r/0GhxjHxcpGeP0cH+2TycpvelxzR42I9UWOscOXozxF4z2eYRXA PH+1ZSL6qTEohIuCyyeFWatr8F+Cx6ZBYuescIqrAZUc1r0lOqViE5plW0RDA3qBf437 2rwxssNBU/zZT7SuObVyktu3bnL8xkoT0lIRSINlQ2sHs42/AWNs9FcEjc5Qyod98/l6 y0kw== Received: by 10.50.220.199 with SMTP id py7mr8814416igc.34.1352747762324; Mon, 12 Nov 2012 11:16:02 -0800 (PST) Received: from [192.168.1.71] ([208.85.112.101]) by mx.google.com with ESMTPS id uz1sm6917845igb.16.2012.11.12.11.16.01 (version=SSLv3 cipher=OTHER); Mon, 12 Nov 2012 11:16:02 -0800 (PST) Message-ID: <50A14AEC.2040303@gmail.com> Date: Mon, 12 Nov 2012 14:15:56 -0500 From: Karim User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:11.0) Gecko/20120410 Thunderbird/11.0.1 MIME-Version: 1.0 To: freebsd-net@freebsd.org Subject: em/igb if_transmit (drbr) and ALTQ
Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 12 Nov 2012 19:16:03 -0000 Hi all, I have been following the current discussions on igb tx/rx locking with great interest and I thought I would point out (as was pointed out before in kern/138392) that any driver setting if_start to NULL and using if_transmit (multi-queue or not) will effectively break ALTQ. The reason for this can be found in tbr_timeout() in altq_subr.c, where the token bucket regulator goes through all ALTQ-registered interfaces and tries to 'kick' them into sending through if_start(). tbr_timeout() : ... for (ifp = TAILQ_FIRST(&V_ifnet); ifp; ifp = TAILQ_NEXT(ifp, if_list)) { /* read from if_snd unlocked */ if (!TBR_IS_ENABLED(&ifp->if_snd)) continue; active++; if (!IFQ_IS_EMPTY(&ifp->if_snd) && ifp->if_start != NULL) /* if_start is NULL if if_transmit is used in em/igb driver */ (*ifp->if_start)(ifp); } ... As you can see, if_start is NULL on those new multi-queue enabled drivers, which has the net effect of 'breaking' ALTQ's token bucket regulator. I am writing this because I am interested in your comments on how this can be fixed properly going forward. The whole range of suggestions, from 'don't compile with EM_MULTIQUEUE defined' to 'here is how you can make ALTQ use drbr', will help. Thank you, Karim.
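To make the effect described above concrete, here is a small user-space model of the loop in tbr_timeout(). It is a sketch under assumptions, not kernel code: struct model_ifnet, model_start() and model_tbr_timeout() are hypothetical stand-ins invented for this example, with an integer queue length in place of if_snd and a flag in place of TBR_IS_ENABLED().

```c
#include <stddef.h>

/* Simplified stand-in for struct ifnet; only the fields the loop inspects. */
struct model_ifnet {
	const char *name;
	int tbr_enabled;			/* stand-in for TBR_IS_ENABLED() */
	int snd_len;				/* packets waiting in the send queue */
	void (*if_start)(struct model_ifnet *);	/* NULL when only if_transmit is set */
	int kicked;				/* how often the start routine ran */
};

/* Legacy-style start routine: record the kick and drain the queue. */
static void
model_start(struct model_ifnet *ifp)
{
	ifp->kicked++;
	ifp->snd_len = 0;
}

/*
 * Same shape as the quoted loop: count interfaces with the regulator
 * enabled and kick those that have queued packets AND an if_start.
 */
static int
model_tbr_timeout(struct model_ifnet *ifs, size_t n)
{
	int active = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		struct model_ifnet *ifp = &ifs[i];

		if (!ifp->tbr_enabled)
			continue;
		active++;
		if (ifp->snd_len > 0 && ifp->if_start != NULL)
			(*ifp->if_start)(ifp);
	}
	return (active);
}
```

In this model, an interface with if_start set to model_start is drained on every timeout, while an interface with if_start == NULL is still counted as active but its queue is never touched, which is the stall described in the message.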
From owner-freebsd-net@FreeBSD.ORG Mon Nov 12 20:27:34 2012 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 8979E355 for ; Mon, 12 Nov 2012 20:27:34 +0000 (UTC) (envelope-from oppermann@networx.ch) Received: from c00l3r.networx.ch (c00l3r.networx.ch [62.48.2.2]) by mx1.freebsd.org (Postfix) with ESMTP id E5FE48FC14 for ; Mon, 12 Nov 2012 20:27:33 +0000 (UTC) Received: (qmail 20583 invoked from network); 12 Nov 2012 22:01:52 -0000 Received: from c00l3r.networx.ch (HELO [127.0.0.1]) ([62.48.2.2]) (envelope-sender ) by c00l3r.networx.ch (qmail-ldap-1.03) with SMTP for ; 12 Nov 2012 22:01:52 -0000 Message-ID: <50A15BB0.60102@networx.ch> Date: Mon, 12 Nov 2012 21:27:28 +0100 From: Andre Oppermann User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:16.0) Gecko/20121010 Thunderbird/16.0.1 MIME-Version: 1.0 To: Karim Subject: Re: em/igb if_transmit (drbr) and ALTQ References: <50A14AEC.2040303@gmail.com> In-Reply-To: <50A14AEC.2040303@gmail.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 12 Nov 2012 20:27:34 -0000 On 12.11.2012 20:15, Karim wrote: > Hi all, > > I have been following the current discussions on igb tx/rx locking with great interest and I though > I would point out (as it was before pointed out in kern/138392) that any driver setting if_start to > NULL and using if_transmit (multi-queue or not) will effectively break ALTQ. > > The reason for this can be found in tbr_timeout() in altq_subr.c where the token bucket regulator > goes through all ALTQ registered interface and tries to 'kick' them into sending through if_start(). > > tbr_timeout() : > > ... 
>
>      for (ifp = TAILQ_FIRST(&V_ifnet); ifp;
>          ifp = TAILQ_NEXT(ifp, if_list)) {
>              /* read from if_snd unlocked */
>              if (!TBR_IS_ENABLED(&ifp->if_snd))
>                      continue;
>              active++;
>              if (!IFQ_IS_EMPTY(&ifp->if_snd) &&
>                  ifp->if_start != NULL) /* if_start is NULL if if_transmit is
> used in em/igb driver */
>                      (*ifp->if_start)(ifp);
>      }
>
>      ...
>
> As you can see if_start is NULL on those new multi-queue enabled drivers which has for net effect to
> 'break' ALTQ's token bucket regulator.
>
> I am writing this because I am interested in your comments on how this can be fixed properly looking
> forward. The whole range of suggestions; from 'don't compile with EM_MULTIQUEUE defined' to 'here is
> how you can make ALTQ use drbr' will help.

This whole area needs some serious reconsideration for 10.0. Even without the if_start issue, ALTQ is somewhat broken because the DMA rings nowadays are just too deep.

See this recent message to this list for some thoughts:
http://lists.freebsd.org/pipermail/freebsd-net/2012-November/033780.html

On top of refining the stack/driver boundary, I'm thinking of a dedicated ethernet layer to consolidate all the ethernet extensions, to have a single place to go, and to prevent the drivers from re-inventing the wheel every time.

I'm working towards it and leaning into various drivers while trying to bring in the hybrid interrupt/polling mode with live-lock prevention.

It'll take a few weeks until I'm ready to show a stack/ethernet/driver prototype for discussion. Parts may surface earlier in my tcp_workqueue svn branch.
-- Andre From owner-freebsd-net@FreeBSD.ORG Mon Nov 12 21:47:13 2012 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id A753D122 for ; Mon, 12 Nov 2012 21:47:13 +0000 (UTC) (envelope-from fodillemlinkarim@gmail.com) Received: from mail-ia0-f182.google.com (mail-ia0-f182.google.com [209.85.210.182]) by mx1.freebsd.org (Postfix) with ESMTP id 6630A8FC14 for ; Mon, 12 Nov 2012 21:47:13 +0000 (UTC) Received: by mail-ia0-f182.google.com with SMTP id x2so211884iad.13 for ; Mon, 12 Nov 2012 13:47:12 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=message-id:date:from:user-agent:mime-version:to:cc:subject :references:in-reply-to:content-type:content-transfer-encoding; bh=ELNyghnvPiOVTeepFjaWCXMdbh4Ql3E00h5OmLojbq8=; b=F2KSx8BS+MSC+wDGSiOiTBh3og/Byn26eSkLYkJaydDyteAdi+JVSspIoxx7cXLz5h 4d3tlOKqZHyR79r4bNPX45sOXMnbN+QxdPfGMcE9OFRn7gA6VHiQ9fxiDAUBPxJvmEC5 TdT7Io/kOcPCkhWO29K6DaaK++r+OyWQRF0Kf6qAJoXT5xOl3Ubg6c0kCHvLDAiUNlby /MbuZ3TmMshs0FneBVYc4GEBBU69yQhJ4wwvc7dnigxi9R2nqf98wGdRo6t3Wv8Be7Bg N772zWFgZQftnjq5KgMCjLibm0bPB+LqMCUFPm5bWeBsNMzog4jFfH68jhpOgO0/4f5C jDDw== Received: by 10.50.188.136 with SMTP id ga8mr9315432igc.24.1352756832868; Mon, 12 Nov 2012 13:47:12 -0800 (PST) Received: from [192.168.1.71] ([208.85.112.101]) by mx.google.com with ESMTPS id x7sm7274530igk.8.2012.11.12.13.47.11 (version=SSLv3 cipher=OTHER); Mon, 12 Nov 2012 13:47:12 -0800 (PST) Message-ID: <50A16E5A.6080401@gmail.com> Date: Mon, 12 Nov 2012 16:47:06 -0500 From: Karim User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:11.0) Gecko/20120410 Thunderbird/11.0.1 MIME-Version: 1.0 To: Andre Oppermann Subject: Re: em/igb if_transmit (drbr) and ALTQ References: <50A14AEC.2040303@gmail.com> <50A15BB0.60102@networx.ch> In-Reply-To: <50A15BB0.60102@networx.ch> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: 
freebsd-net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 12 Nov 2012 21:47:13 -0000 On 12-11-12 03:27 PM, Andre Oppermann wrote: > On 12.11.2012 20:15, Karim wrote: >> Hi all, >> >> I have been following the current discussions on igb tx/rx locking >> with great interest and I though >> I would point out (as it was before pointed out in kern/138392) that >> any driver setting if_start to >> NULL and using if_transmit (multi-queue or not) will effectively >> break ALTQ. >> >> The reason for this can be found in tbr_timeout() in altq_subr.c >> where the token bucket regulator >> goes through all ALTQ registered interface and tries to 'kick' them >> into sending through if_start(). >> >> tbr_timeout() : >> >> ... >> >> for (ifp = TAILQ_FIRST(&V_ifnet); ifp; >> ifp = TAILQ_NEXT(ifp, if_list)) { >> /* read from if_snd unlocked */ >> if (!TBR_IS_ENABLED(&ifp->if_snd)) >> continue; >> active++; >> if (!IFQ_IS_EMPTY(&ifp->if_snd) && >> ifp->if_start != NULL) /* >> if_start is NULL if if_transmit is >> used in em/igb driver */ >> (*ifp->if_start)(ifp); >> } >> >> ... >> >> As you can see if_start is NULL on those new multi-queue enabled >> drivers which has for net effect to >> 'break' ALTQ's token bucket regulator. >> >> I am writing this because I am interested in your comments on how >> this can be fixed properly looking >> forward. The whole range of suggestions; from 'don't compile with >> EM_MULTIQUEUE defined' to 'here is >> how you can make ALTQ use drbr' will help. > > This whole area needs some serious re-consideration for 10.0. > Even without the if_start issue ALTQ is somewhat broken because > the DMA rings nowadays are just too deep. 
>
> See this recent message to this list for some thoughts:
> http://lists.freebsd.org/pipermail/freebsd-net/2012-November/033780.html
>
> On top of refining the stack/driver boundary I'm thinking of a
> dedicated ethernet layer to consolidate all the ethernet extensions,
> have a single place to go and to prevent the drivers from re-inventing
> the wheel all over every time.
>
> I'm working towards it and leaning into various drivers while trying
> to bring in the hybrid interrupt/polling mode with life-lock prevention.
>
> It'll take a few weeks until I'm ready to show a stack/ethernet/driver
> prototype for discussion. Parts may surface earlier in my tcp_workqueue
> svn branch.
>

Hi,

Glad to see someone looking into this :)

The addition of a common layer for queuing algorithms is an interesting idea, and it makes me wonder how alternate queuing techniques would be able to use this shim layer to leverage devices with multiple queues.

The current limitation (in terms of performance) with ALTQ going forward is its inability to use multiple queues, in part because it was left out when the current (drbr) implementation of multiqueue was made, but mainly because of its internal structure (using a single global IFQ lock).

Is it part of the plan to abstract multiqueuing from the alternate queuing algorithms, or will it still be left to ALTQ, through driver-managed queues for example, to manage the TX DMA ring lock?

Thanks,

Karim.
From owner-freebsd-net@FreeBSD.ORG Mon Nov 12 22:13:49 2012 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 1D7977F7 for ; Mon, 12 Nov 2012 22:13:49 +0000 (UTC) (envelope-from oppermann@networx.ch) Received: from c00l3r.networx.ch (c00l3r.networx.ch [62.48.2.2]) by mx1.freebsd.org (Postfix) with ESMTP id 73F558FC08 for ; Mon, 12 Nov 2012 22:13:48 +0000 (UTC) Received: (qmail 20976 invoked from network); 12 Nov 2012 23:48:06 -0000 Received: from c00l3r.networx.ch (HELO [127.0.0.1]) ([62.48.2.2]) (envelope-sender ) by c00l3r.networx.ch (qmail-ldap-1.03) with SMTP for ; 12 Nov 2012 23:48:06 -0000 Message-ID: <50A17497.10401@networx.ch> Date: Mon, 12 Nov 2012 23:13:43 +0100 From: Andre Oppermann User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:16.0) Gecko/20121010 Thunderbird/16.0.1 MIME-Version: 1.0 To: Karim Subject: Re: em/igb if_transmit (drbr) and ALTQ References: <50A14AEC.2040303@gmail.com> <50A15BB0.60102@networx.ch> <50A16E5A.6080401@gmail.com> In-Reply-To: <50A16E5A.6080401@gmail.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 12 Nov 2012 22:13:49 -0000 On 12.11.2012 22:47, Karim wrote: > On 12-11-12 03:27 PM, Andre Oppermann wrote: >> On 12.11.2012 20:15, Karim wrote: >>> Hi all, >>> >>> I have been following the current discussions on igb tx/rx locking with great interest and I though >>> I would point out (as it was before pointed out in kern/138392) that any driver setting if_start to >>> NULL and using if_transmit (multi-queue or not) will effectively break ALTQ. 
>>> >>> The reason for this can be found in tbr_timeout() in altq_subr.c where the token bucket regulator >>> goes through all ALTQ registered interface and tries to 'kick' them into sending through if_start(). >>> >>> tbr_timeout() : >>> >>> ... >>> >>> for (ifp = TAILQ_FIRST(&V_ifnet); ifp; >>> ifp = TAILQ_NEXT(ifp, if_list)) { >>> /* read from if_snd unlocked */ >>> if (!TBR_IS_ENABLED(&ifp->if_snd)) >>> continue; >>> active++; >>> if (!IFQ_IS_EMPTY(&ifp->if_snd) && >>> ifp->if_start != NULL) /* if_start is NULL if if_transmit is >>> used in em/igb driver */ >>> (*ifp->if_start)(ifp); >>> } >>> >>> ... >>> >>> As you can see if_start is NULL on those new multi-queue enabled drivers which has for net effect to >>> 'break' ALTQ's token bucket regulator. >>> >>> I am writing this because I am interested in your comments on how this can be fixed properly looking >>> forward. The whole range of suggestions; from 'don't compile with EM_MULTIQUEUE defined' to 'here is >>> how you can make ALTQ use drbr' will help. >> >> This whole area needs some serious re-consideration for 10.0. >> Even without the if_start issue ALTQ is somewhat broken because >> the DMA rings nowadays are just too deep. >> >> See this recent message to this list for some thoughts: >> http://lists.freebsd.org/pipermail/freebsd-net/2012-November/033780.html >> >> On top of refining the stack/driver boundary I'm thinking of a >> dedicated ethernet layer to consolidate all the ethernet extensions, >> have a single place to go and to prevent the drivers from re-inventing >> the wheel all over every time. >> >> I'm working towards it and leaning into various drivers while trying >> to bring in the hybrid interrupt/polling mode with life-lock prevention. >> >> It'll take a few weeks until I'm ready to show a stack/ethernet/driver >> prototype for discussion. Parts may surface earlier in my tcp_workqueue >> svn branch. 
>>
> Hi,
>
> Glad to see someone looking into this :)
>
> The addition of a common layer for queuing algorithm is an interesting idea and makes me wonder how
> alternate queuing techniques would be able to use this shim layer to leverage devices with multiple
> queues.
>
> The current limitation (in terms of performances) with ALTQ going forward is its current inability
> to use multiple queues, in part because it was left out when the current (drbr) implementation of
> multiqueues was made but mainly because of its current internal structure (using a single global IFQ
> lock).
>
> Is it part of the plan to abstract multiqueuing from alternate queuing algorithm or will it still be
> left to ALTQ, through driver managed queues for example, to manage the TX DMA ring lock?

I don't know yet exactly what it will look like. ALTQ, in modified form, will remain part of the functionality set. Most multi-queue network cards also support various modes of queue arbitration for different service classes. This may be leveraged for ALTQ as well.

There are many different variants of multi-queue usage, depending on the goal of the overall system. I am trying to allow them to co-exist and to be selectable at run-time while remaining at a sane complexity level.
-- Andre From owner-freebsd-net@FreeBSD.ORG Tue Nov 13 06:03:54 2012 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 8619372C; Tue, 13 Nov 2012 06:03:54 +0000 (UTC) (envelope-from bright@mu.org) Received: from elvis.mu.org (elvis.mu.org [192.203.228.196]) by mx1.freebsd.org (Postfix) with ESMTP id 623848FC13; Tue, 13 Nov 2012 06:03:54 +0000 (UTC) Received: from kruse-124.4.ixsystems.com (drawbridge.ixsystems.com [206.40.55.65]) by elvis.mu.org (Postfix) with ESMTPSA id B1BD01A3D25; Mon, 12 Nov 2012 22:03:53 -0800 (PST) Message-ID: <50A1E2E7.3090705@mu.org> Date: Mon, 12 Nov 2012 22:04:23 -0800 From: Alfred Perlstein User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7; rv:16.0) Gecko/20121026 Thunderbird/16.0.2 MIME-Version: 1.0 To: Andre Oppermann Subject: Re: auto tuning tcp References: <50A0A0EF.3020109@mu.org> <50A0A502.1030306@networx.ch> <50A0B8DA.9090409@mu.org> <50A0C0F4.8010706@networx.ch> <50A13961.1030909@networx.ch> <50A14460.9020504@mu.org> In-Reply-To: <50A14460.9020504@mu.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Content-Filtered-By: Mailman/MimeDel 2.1.14 Cc: "freebsd-net@freebsd.org" , Adrian Chadd , Peter Wemm X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 13 Nov 2012 06:03:54 -0000 On 11/12/12 10:48 AM, Alfred Perlstein wrote: > On 11/12/12 10:01 AM, Andre Oppermann wrote: >> >> I've already added the tunable "kern.maxmbufmem" which is in pages. >> That's probably not very convenient to work with. I can change it >> to a percentage of phymem/kva. Would that make you happy? >> > > It really makes sense to have the hash table be some relation to > sockets rather than buffers. 
> > If you are hashing "foo-objects" you want the hash to be some relation > to the max amount of "foo-objects" you'll see, not backwards derived > from the number of "bar-objects" that "foo-objects" contain, right? > > Because we are hashing the sockets, right? not clusters. > > Maybe I'm wrong? I'm open to ideas. Hey Andre, the following patch is what I was thinking (uncompiled/untested), it basically rounds up the maxsockets to a power of 2 and replaces the default 512 tcb hashsize. It might make sense to make the auto-tuning default to a minimum of 512. There are a number of other hashes with static sizes that could make use of this logic provided it's not upside-down. Any thoughts on this? Tune the tcp pcb hash based on maxsockets. Be more forgiving of poorly chosen tunables by finding a closer power of two rather than clamping down to 512. Index: tcp_subr.c =================================================================== --- tcp_subr.c (revision 242936) +++ tcp_subr.c (working copy) @@ -235,7 +235,7 @@ * variable net.inet.tcp.tcbhashsize */ #ifndef TCBHASHSIZE -#define TCBHASHSIZE 512 +#define TCBHASHSIZE 0 #endif /* @@ -282,6 +282,27 @@ return (0); } +/* + * Take a value and get the next power of 2 that doesn't overflow. + * Used to size the tcp_inpcb hash buckets. + */ +static int +maketcp_hashsize(int size) +{ + int hashsize; + + /* + * auto tune. + * get the next power of 2 higher than maxsockets. + */ + hashsize = 1 << fls(maxsockets); + /* catch overflow, and just go one power of 2 smaller */ + if (hashsize < maxsockets) { + hashsize = 1 << (fls(maxsockets) - 1); + } + return hashsize; +} + void tcp_init(void) { @@ -296,9 +317,20 @@ hashsize = TCBHASHSIZE; TUNABLE_INT_FETCH("net.inet.tcp.tcbhashsize", &hashsize); + if (hashsize == 0) { + /* auto tune based on maxsockets */ + hashsize = maketcp_hashsize(maxsockets); + } + /* + * Be forgiving of admins that don't know to make the tunable + * a power of two. 
+ */ if (!powerof2(hashsize)) { - printf("WARNING: TCB hash size not a power of 2\n"); - hashsize = 512; /* safe default */ + int oldhashsize = hashsize; + + hashsize = maketcp_hashsize(hashsize); + printf("%s: WARNING: TCB hash size not a power of 2, " + "fixed %d -> %d\n", __func__, oldhashsize, hashsize); } in_pcbinfo_init(&V_tcbinfo, "tcp", &V_tcb, hashsize, hashsize, "tcp_inpcb", tcp_inpcb_init, NULL, UMA_ZONE_NOFREE, From owner-freebsd-net@FreeBSD.ORG Tue Nov 13 06:10:38 2012 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id C70B1A1D; Tue, 13 Nov 2012 06:10:38 +0000 (UTC) (envelope-from bright@mu.org) Received: from elvis.mu.org (elvis.mu.org [192.203.228.196]) by mx1.freebsd.org (Postfix) with ESMTP id 9C2518FC12; Tue, 13 Nov 2012 06:10:38 +0000 (UTC) Received: from kruse-124.4.ixsystems.com (drawbridge.ixsystems.com [206.40.55.65]) by elvis.mu.org (Postfix) with ESMTPSA id 1AB871A3C1A; Mon, 12 Nov 2012 22:10:38 -0800 (PST) Message-ID: <50A1E47C.1030208@mu.org> Date: Mon, 12 Nov 2012 22:11:08 -0800 From: Alfred Perlstein User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7; rv:16.0) Gecko/20121026 Thunderbird/16.0.2 MIME-Version: 1.0 To: Andre Oppermann Subject: Re: auto tuning tcp References: <50A0A0EF.3020109@mu.org> <50A0A502.1030306@networx.ch> <50A0B8DA.9090409@mu.org> <50A0C0F4.8010706@networx.ch> <50A13961.1030909@networx.ch> <50A14460.9020504@mu.org> <50A1E2E7.3090705@mu.org> In-Reply-To: <50A1E2E7.3090705@mu.org> Content-Type: multipart/mixed; boundary="------------090100000306090603090504" Cc: "freebsd-net@freebsd.org" , Adrian Chadd , Peter Wemm X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 13 Nov 2012 06:10:38 -0000 This is a multi-part message in MIME format. 
--------------090100000306090603090504 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit On 11/12/12 10:04 PM, Alfred Perlstein wrote: > On 11/12/12 10:48 AM, Alfred Perlstein wrote: >> On 11/12/12 10:01 AM, Andre Oppermann wrote: >>> >>> I've already added the tunable "kern.maxmbufmem" which is in pages. >>> That's probably not very convenient to work with. I can change it >>> to a percentage of phymem/kva. Would that make you happy? >>> >> >> It really makes sense to have the hash table be some relation to >> sockets rather than buffers. >> >> If you are hashing "foo-objects" you want the hash to be some >> relation to the max amount of "foo-objects" you'll see, not backwards >> derived from the number of "bar-objects" that "foo-objects" contain, >> right? >> >> Because we are hashing the sockets, right? not clusters. >> >> Maybe I'm wrong? I'm open to ideas. > > Hey Andre, the following patch is what I was thinking > (uncompiled/untested), it basically rounds up the maxsockets to a > power of 2 and replaces the default 512 tcb hashsize. > > It might make sense to make the auto-tuning default to a minimum of 512. > > There are a number of other hashes with static sizes that could make > use of this logic provided it's not upside-down. > > Any thoughts on this? > > Tune the tcp pcb hash based on maxsockets. > Be more forgiving of poorly chosen tunables by finding a closer power > of two rather than clamping down to 512. > Index: tcp_subr.c > =================================================================== Sorry, GUI mangled the patch... attaching a plain text version. 
--------------090100000306090603090504
Content-Type: text/plain; charset=UTF-8; x-mac-type="0"; x-mac-creator="0"; name="tcp_auto_tune_hash.diff"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename="tcp_auto_tune_hash.diff"

Index: tcp_subr.c
===================================================================
--- tcp_subr.c (revision 242936)
+++ tcp_subr.c (working copy)
@@ -235,7 +235,7 @@
  * variable net.inet.tcp.tcbhashsize
  */
 #ifndef TCBHASHSIZE
-#define TCBHASHSIZE	512
+#define TCBHASHSIZE	0
 #endif
 
 /*
@@ -282,6 +282,27 @@
 	return (0);
 }
 
+/*
+ * Take a value and get the next power of 2 that doesn't overflow.
+ * Used to size the tcp_inpcb hash buckets.
+ */
+static int
+maketcp_hashsize(int size)
+{
+	int hashsize;
+
+	/*
+	 * auto tune.
+	 * get the next power of 2 higher than maxsockets.
+	 */
+	hashsize = 1 << fls(maxsockets);
+	/* catch overflow, and just go one power of 2 smaller */
+	if (hashsize < maxsockets) {
+		hashsize = 1 << (fls(maxsockets) - 1);
+	}
+	return hashsize;
+}
+
 void
 tcp_init(void)
 {
@@ -296,9 +317,20 @@
 
 	hashsize = TCBHASHSIZE;
 	TUNABLE_INT_FETCH("net.inet.tcp.tcbhashsize", &hashsize);
+	if (hashsize == 0) {
+		/* auto tune based on maxsockets */
+		hashsize = maketcp_hashsize(maxsockets);
+	}
+	/*
+	 * Be forgiving of admins that don't know to make the tunable
+	 * a power of two.
+	 */
 	if (!powerof2(hashsize)) {
-		printf("WARNING: TCB hash size not a power of 2\n");
-		hashsize = 512; /* safe default */
+		int oldhashsize = hashsize;
+
+		hashsize = maketcp_hashsize(hashsize);
+		printf("%s: WARNING: TCB hash size not a power of 2, "
+		    "fixed %d -> %d\n", __func__, oldhashsize, hashsize);
 	}
 	in_pcbinfo_init(&V_tcbinfo, "tcp", &V_tcb, hashsize, hashsize,
 	    "tcp_inpcb", tcp_inpcb_init, NULL, UMA_ZONE_NOFREE,

--------------090100000306090603090504--

From owner-freebsd-net@FreeBSD.ORG Tue Nov 13 06:18:03 2012 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 563BECDF for ; Tue, 13 Nov 2012 06:18:03 +0000 (UTC) (envelope-from s.khanchi@gmail.com) Received: from mail-ia0-f182.google.com (mail-ia0-f182.google.com [209.85.210.182]) by mx1.freebsd.org (Postfix) with ESMTP id 1982A8FC0C for ; Tue, 13 Nov 2012 06:18:02 +0000 (UTC) Received: by mail-ia0-f182.google.com with SMTP id x2so531906iad.13 for ; Mon, 12 Nov 2012 22:18:02 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:from:date:x-google-sender-auth:message-id :subject:to:content-type; bh=gvovs2lpmGrMFaa7zxsAZQTd5BSjcLR3qGEqx0nrB1U=; b=kCkuOwJ70j7Nnv6P2Fwcb15U6GMexeqz5I76YbtBdgZkRUtvqlTyW4/uBTo6KymPCX VOjvDeKwn+eZWJx2LvgPgLVHadLncWmO5oMmlMhOeYKtccRE57oNQ0ICqKt0yxvRsW3l POcmVq2Y6pQc/pvswdIRGttb3SjJ4Ssex53pDJ5UJ6L8xwXc2Q+fA+vwWuGifdqCfrjz kXJYubQNanrZv63iEYFp1GmpN2r7xdejIxs5HQFo8PXtpD3TPvmCLZ8AeXsBNsSrciuG 0d8DLuesqjtjC41MwJyhjbNNwn6k/GoTsFyBvyxef3C9Jmt+/b1orNfp1A3tLzq0ZYDW Pl3Q== Received: by 10.50.183.167 with SMTP id en7mr10004350igc.49.1352787482342; Mon, 12 Nov 2012 22:18:02 -0800 (PST) MIME-Version: 1.0 Sender: s.khanchi@gmail.com Received: by 10.64.101.40 with HTTP; Mon, 12 Nov 2012 22:17:42 -0800 (PST) From: h bagade Date: Tue, 13 Nov 2012 09:47:42 +0330 X-Google-Sender-Auth: De_X42azwPtP4JhPdqkCY4HOwws Message-ID: Subject:
setting ToS byte using ng_patch doesn't works well! To: freebsd-net@freebsd.org Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.14 X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 13 Nov 2012 06:18:03 -0000

Hi all,

I've encountered a problem using ng_patch, and I don't know how to track down its cause. I hope you can help me figure out which aspects I should focus on.

The problem is that in some cases ng_patch works well for setting the ToS byte, but sometimes it doesn't. For example:

netgraph settings:

    kldload ng_ipfw
    ngctl mkpeer ipfw: patch 300 in
    ngctl name ipfw:300 tos
    ngctl msg tos: setconfig {count=1 csum_flags=1 ops=[ {mode=1 value=0x05 length=1 offset=1}]}

--------------------------------
ipfw rule:

    ipfw add 20 netgraph 300 icmp from any to any

With the above settings I've got several different results. I've checked the settings on different hardware and different FreeBSD versions. In some configurations it works well, but in others I have a one-way ping connection or no ping connection at all. When I change the rule to set ToS to zero, all configurations work well.

I've run these tests to find out which factors affect ng_patch's behavior, but couldn't pin down a single cause. Can you suggest some factors to take into consideration when testing the ToS setting?
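One factor that may be worth checking, offered here as an assumption rather than a diagnosis confirmed anywhere in this thread, is the IPv4 header checksum. The ToS byte at offset 1 is covered by that checksum, so a header whose ToS byte is rewritten stays valid only if the checksum is refreshed afterwards, or if the new value happens to equal the old one (which would fit the observation that setting ToS to zero always works, if the test packets carried ToS 0 to begin with). The sketch below demonstrates just that arithmetic on a made-up 20-byte header; ip_cksum(), ip_fix_cksum() and tos_rewrite_valid() are hypothetical helper names, and nothing here models ng_patch or csum_flags themselves.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* RFC 1071 Internet checksum over an even number of bytes. */
static uint16_t
ip_cksum(const uint8_t *buf, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	assert(len % 2 == 0);
	for (i = 0; i < len; i += 2)
		sum += ((uint32_t)buf[i] << 8) | buf[i + 1];
	while (sum >> 16)		/* fold carries back into 16 bits */
		sum = (sum & 0xffff) + (sum >> 16);
	return ((uint16_t)~sum);
}

/* Recompute and store the header checksum (bytes 10 and 11). */
static void
ip_fix_cksum(uint8_t *hdr, size_t len)
{
	uint16_t c;

	hdr[10] = 0;
	hdr[11] = 0;
	c = ip_cksum(hdr, len);
	hdr[10] = (uint8_t)(c >> 8);
	hdr[11] = (uint8_t)(c & 0xff);
}

/*
 * Overwrite the ToS byte of a made-up ICMP header and report whether the
 * header still verifies (checksum over the whole header equals zero).
 */
static int
tos_rewrite_valid(uint8_t tos, int refresh_cksum)
{
	uint8_t hdr[20] = {
		0x45, 0x00, 0x00, 0x54,	/* version/IHL, ToS, total length */
		0x12, 0x34, 0x00, 0x00,	/* id, flags/fragment offset */
		0x40, 0x01, 0x00, 0x00,	/* TTL, protocol (ICMP), checksum */
		192, 168, 1, 1,		/* source address (illustrative) */
		192, 168, 1, 2		/* destination address (illustrative) */
	};

	ip_fix_cksum(hdr, sizeof(hdr));	/* start from a valid header */
	hdr[1] = tos;			/* the byte that offset=1 length=1 targets */
	if (refresh_cksum)
		ip_fix_cksum(hdr, sizeof(hdr));
	return (ip_cksum(hdr, sizeof(hdr)) == 0);
}
```

In this model, writing 0x05 without refreshing the checksum yields a header that a receiver would drop, while writing 0x00 over an existing 0x00 leaves it intact. Comparing captures taken on the sending and receiving hosts would show whether something similar happens on the failing configurations.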
From owner-freebsd-net@FreeBSD.ORG Tue Nov 13 06:23:50 2012 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 1CADAFA1 for ; Tue, 13 Nov 2012 06:23:50 +0000 (UTC) (envelope-from peter@wemm.org) Received: from mail-lb0-f182.google.com (mail-lb0-f182.google.com [209.85.217.182]) by mx1.freebsd.org (Postfix) with ESMTP id 8208D8FC0C for ; Tue, 13 Nov 2012 06:23:49 +0000 (UTC) Received: by mail-lb0-f182.google.com with SMTP id gg13so1948033lbb.13 for ; Mon, 12 Nov 2012 22:23:48 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=wemm.org; s=google; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=Apa+dvk62p8M+/CzypQV1QAl0oNIPUm2cKtXT0F1h48=; b=dBMF38gf0DJPwaWLfBraqHLClUS/c26DRhEJRGLJz2HOQzmZuyA+8uZQz/Ytuc9lnh O2sSGfQ9xNIrH7qAx+c/ItCJG5hE19wQ4uyEFg8tJoRxd39fqXe3c22Ol/MHGuQmfHLf p5xoXdn3spdFenGwIlp1JMw1j1FU0l3xFCxc0= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type:x-gm-message-state; bh=Apa+dvk62p8M+/CzypQV1QAl0oNIPUm2cKtXT0F1h48=; b=LbRvJRTkk3HVtkPaUQUPWB9ungg00ghrotEUn1dSIvNpp/+ZhtGJf+be9kGMucW7kd 9daEib1uQN1paRC/BtnUNpCcACtMbDn7tR5W9xocvJAwQ/D5RNfnxXZVhZ5MFktmNM8k lG2+AZ81RWbIo818zgDRmXd3bUy5tPRMs6DBX4uryDWbg9EajR168RqQkmezZfKLSgHl jlAdNZWUSpo/BJWN9TTENj/3hD6c7Qz+d2V4tSVefjMxLqqyTfAcJ+raWu6UfbmO6C4P VoGU0V6NDcs0oqfHk5CCUiCXZ7SW2uwq7g7uIdOX17ELQiFNmFHXTTaTXzMF2+MEgciu Neqw== MIME-Version: 1.0 Received: by 10.152.106.212 with SMTP id gw20mr20492825lab.8.1352787828250; Mon, 12 Nov 2012 22:23:48 -0800 (PST) Received: by 10.112.7.41 with HTTP; Mon, 12 Nov 2012 22:23:48 -0800 (PST) In-Reply-To: <50A1E47C.1030208@mu.org> References: <50A0A0EF.3020109@mu.org> <50A0A502.1030306@networx.ch> <50A0B8DA.9090409@mu.org> <50A0C0F4.8010706@networx.ch> <50A13961.1030909@networx.ch> 
<50A14460.9020504@mu.org> <50A1E2E7.3090705@mu.org> <50A1E47C.1030208@mu.org> Date: Mon, 12 Nov 2012 22:23:48 -0800 Message-ID: Subject: Re: auto tuning tcp From: Peter Wemm To: Alfred Perlstein Content-Type: text/plain; charset=ISO-8859-1 X-Gm-Message-State: ALoCoQkoAWfOaZ+Zex+5nk4b4wDsO4qRyb5US5O3iok5Le5UXCL/O5BxNKB3gnRDlAoZTEsYQ4Y0 Cc: "freebsd-net@freebsd.org" , Adrian Chadd X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 13 Nov 2012 06:23:50 -0000 On Mon, Nov 12, 2012 at 10:11 PM, Alfred Perlstein wrote: > On 11/12/12 10:04 PM, Alfred Perlstein wrote: >> >> On 11/12/12 10:48 AM, Alfred Perlstein wrote: >>> >>> On 11/12/12 10:01 AM, Andre Oppermann wrote: >>>> >>>> >>>> I've already added the tunable "kern.maxmbufmem" which is in pages. >>>> That's probably not very convenient to work with. I can change it >>>> to a percentage of phymem/kva. Would that make you happy? >>>> >>> >>> It really makes sense to have the hash table be some relation to sockets >>> rather than buffers. >>> >>> If you are hashing "foo-objects" you want the hash to be some relation to >>> the max amount of "foo-objects" you'll see, not backwards derived from the >>> number of "bar-objects" that "foo-objects" contain, right? >>> >>> Because we are hashing the sockets, right? not clusters. >>> >>> Maybe I'm wrong? I'm open to ideas. >> >> >> Hey Andre, the following patch is what I was thinking >> (uncompiled/untested), it basically rounds up the maxsockets to a power of 2 >> and replaces the default 512 tcb hashsize. >> >> It might make sense to make the auto-tuning default to a minimum of 512. >> >> There are a number of other hashes with static sizes that could make use >> of this logic provided it's not upside-down. >> >> Any thoughts on this? >> >> Tune the tcp pcb hash based on maxsockets. 
>> Be more forgiving of poorly chosen tunables by finding a closer power >> of two rather than clamping down to 512. >> Index: tcp_subr.c >> =================================================================== > > > Sorry, GUI mangled the patch... attaching a plain text version. > > Wait, you want to replace a hash with a flat array? Why even bother to call it a hash at that point? -- Peter Wemm - peter@wemm.org; peter@FreeBSD.org; peter@yahoo-inc.com; KI6FJV "All of this is for nothing if we don't go to the stars" - JMS/B5 "If Java had true garbage collection, most programs would delete themselves upon execution." -- Robert Sewell From owner-freebsd-net@FreeBSD.ORG Tue Nov 13 06:45:08 2012 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 837FB4CD; Tue, 13 Nov 2012 06:45:08 +0000 (UTC) (envelope-from bright@mu.org) Received: from elvis.mu.org (elvis.mu.org [192.203.228.196]) by mx1.freebsd.org (Postfix) with ESMTP id 58A4C8FC12; Tue, 13 Nov 2012 06:45:08 +0000 (UTC) Received: from kruse-124.4.ixsystems.com (drawbridge.ixsystems.com [206.40.55.65]) by elvis.mu.org (Postfix) with ESMTPSA id EFEA51A3C6A; Mon, 12 Nov 2012 22:45:07 -0800 (PST) Message-ID: <50A1EC92.9000507@mu.org> Date: Mon, 12 Nov 2012 22:45:38 -0800 From: Alfred Perlstein User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7; rv:16.0) Gecko/20121026 Thunderbird/16.0.2 MIME-Version: 1.0 To: Peter Wemm Subject: Re: auto tuning tcp References: <50A0A0EF.3020109@mu.org> <50A0A502.1030306@networx.ch> <50A0B8DA.9090409@mu.org> <50A0C0F4.8010706@networx.ch> <50A13961.1030909@networx.ch> <50A14460.9020504@mu.org> <50A1E2E7.3090705@mu.org> <50A1E47C.1030208@mu.org> In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: "freebsd-net@freebsd.org" , Adrian Chadd X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: 
Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 13 Nov 2012 06:45:08 -0000 On 11/12/12 10:23 PM, Peter Wemm wrote: > On Mon, Nov 12, 2012 at 10:11 PM, Alfred Perlstein wrote: >> On 11/12/12 10:04 PM, Alfred Perlstein wrote: >>> On 11/12/12 10:48 AM, Alfred Perlstein wrote: >>>> On 11/12/12 10:01 AM, Andre Oppermann wrote: >>>>> >>>>> I've already added the tunable "kern.maxmbufmem" which is in pages. >>>>> That's probably not very convenient to work with. I can change it >>>>> to a percentage of phymem/kva. Would that make you happy? >>>>> >>>> It really makes sense to have the hash table be some relation to sockets >>>> rather than buffers. >>>> >>>> If you are hashing "foo-objects" you want the hash to be some relation to >>>> the max amount of "foo-objects" you'll see, not backwards derived from the >>>> number of "bar-objects" that "foo-objects" contain, right? >>>> >>>> Because we are hashing the sockets, right? not clusters. >>>> >>>> Maybe I'm wrong? I'm open to ideas. >>> >>> Hey Andre, the following patch is what I was thinking >>> (uncompiled/untested), it basically rounds up the maxsockets to a power of 2 >>> and replaces the default 512 tcb hashsize. >>> >>> It might make sense to make the auto-tuning default to a minimum of 512. >>> >>> There are a number of other hashes with static sizes that could make use >>> of this logic provided it's not upside-down. >>> >>> Any thoughts on this? >>> >>> Tune the tcp pcb hash based on maxsockets. >>> Be more forgiving of poorly chosen tunables by finding a closer power >>> of two rather than clamping down to 512. >>> Index: tcp_subr.c >>> =================================================================== >> >> Sorry, GUI mangled the patch... attaching a plain text version. >> >> > Wait, you want to replace a hash with a flat array? Why even bother > to call it a hash at that point? 
>
>

If you are concerned about the space/time tradeoff I'm pretty happy with making it 1/2, 1/4th, 1/8th the size of maxsockets. (smaller?)

Would that work better?

The reason I chose to make it equal to max sockets was a space/time tradeoff, ideally a hash should have zero collisions and if a user has enough memory for 250,000 sockets, then surely they have enough memory for 256,000 pointers.

If you strongly disagree then I am fine with a more conservative setting, just note that effectively the hash table will require 1/2 the factor that we go smaller in additional traversals when we max out the number of sockets. Meaning if the table is 1/4 the size of max sockets, when we hit that many tcp connections I think we'll see an order of average 2 linked list traversals to find a node. At 1/8, then that number becomes 4.

I recall back in 2001 on a PII400 with a custom webserver I wrote having a huge benefit by upping this to 2^14 or maybe even 2^16, I forget, but suddenly my CPU went down a huge amount and I didn't have to worry about a load balancer or other tricks.
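
For illustration only (not part of the patch under discussion): the arithmetic above can be sketched in C. The sketch assumes keys spread uniformly over the buckets, so the average chain holds nentries/nbuckets entries and a successful lookup walks roughly half of a chain. The helper names are hypothetical.

```c
#include <assert.h>

/*
 * Hypothetical helpers, not from the patch: estimate lookup cost for a
 * chained hash table, assuming uniformly distributed keys.
 */

/* Average number of entries per bucket (the load factor). */
static double
avg_chain_length(unsigned long nentries, unsigned long nbuckets)
{
	assert(nbuckets > 0);
	return ((double)nentries / (double)nbuckets);
}

/* Rough average number of list traversals for a hit: half a chain. */
static double
avg_traversals(unsigned long nentries, unsigned long nbuckets)
{
	return (avg_chain_length(nentries, nbuckets) / 2.0);
}
```

Under these assumptions, a table 1/4 the size of the socket count gives about 2 traversals and a table 1/8 the size gives about 4, matching the estimate in the message.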
-Alfred From owner-freebsd-net@FreeBSD.ORG Tue Nov 13 07:11:37 2012 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 5360FB0A for ; Tue, 13 Nov 2012 07:11:37 +0000 (UTC) (envelope-from ozkan.kirik@gmail.com) Received: from mail-vc0-f182.google.com (mail-vc0-f182.google.com [209.85.220.182]) by mx1.freebsd.org (Postfix) with ESMTP id EF6E68FC15 for ; Tue, 13 Nov 2012 07:11:36 +0000 (UTC) Received: by mail-vc0-f182.google.com with SMTP id fw7so9429055vcb.13 for ; Mon, 12 Nov 2012 23:11:36 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=m3OtSpKU7X5MbFPdAkDgu/bu3v0Ad3QnVM0BIYsKY6g=; b=hfCekoddDYwFsEfR4Gl4H6nO3r86AhZvLYvYOGcjLmO2Bbc70cZQbgPeY7itzdeFfC 1oZ1OrPDiJKQFGfuKRv9iCD3tc/JgLxgBf5/CxDhg1ITHRvF+dhVSHrTkvH4GfmPaGOm rZU1SzY4ihTBBE/ucjGYKytVjder3Sh2hMtyeiyILCCaMd7+1FbJl5tcN2vhAvMQdGPv uIYMkENPMipjhDzLX9drs3RpcrFY8zL+Qzz4Ajjz4WQtTw0jPglzH41bOsqSRkpqh3/g p3PaCM0Zkb/BQxYUOB2wEUa4VSvjctOFYK5DCA+ezuo+jBzVRfoz+A7ZY9JxAYsxFamd pFbA== MIME-Version: 1.0 Received: by 10.220.238.148 with SMTP id ks20mr4274014vcb.5.1352790696375; Mon, 12 Nov 2012 23:11:36 -0800 (PST) Received: by 10.58.213.134 with HTTP; Mon, 12 Nov 2012 23:11:36 -0800 (PST) In-Reply-To: References: Date: Tue, 13 Nov 2012 09:11:36 +0200 Message-ID: Subject: Re: IPv6 NDP Proxy From: =?ISO-8859-1?Q?=D6zkan_KIRIK?= To: Thesaurarius Romae Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.14 Cc: freebsd-net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 13 Nov 2012 07:11:37 -0000 you can bridge all interfaces with if_bridge. 
Later, assign the IPv6 addresses to the bridge0 interface. On Sun, Nov 11, 2012 at 7:28 PM, Thesaurarius Romae < thesaurarius.romae@yandex.ru> wrote: > Hello all! > > I have a small problem - I need to give IPv6 addresses to several machines > on a network interfaces, but provider who gave me IPv6 /64 network wants to > see all the IPv6 hosts in the same L2 network. > > To be more exact, I have a physical server at hetzner.de, with interface > re0. I successfully configured IPv6 address on this interface and > everything works fine, but I also have VMs on interfaces tap0, vboxnet0 and > OpenVPN clients on tun0. I google about that subject and solution I found > is to use NDP proxy. But all the examples I found are for linux, while I > use FreeBSD and can't figure how to the same thing on it. > > Here's the linux-solution: > > http://www.stocksy.co.uk/articles/Networks/ipv6_for_xen_hosts_on_a_hetzner_leased_server_with_a_routed_ipv4_allocation > > I'm pretty sure that solution is easy to implement and understand, but I > guess my lack of IPv6-knowledge prevents me from figuring it. > > Thanks for your reply! > > P.S.: Please, CC me personally, since I'm not subscribed to the list. 
> _______________________________________________
> freebsd-net@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
>

From owner-freebsd-net@FreeBSD.ORG Tue Nov 13 07:49:41 2012 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 2DA887C3 for ; Tue, 13 Nov 2012 07:49:41 +0000 (UTC) (envelope-from sean@chittenden.org) Received: from mail01.lax1.stackjet.com (mon01.lax1.stackjet.com [174.136.104.178]) by mx1.freebsd.org (Postfix) with ESMTP id EC9468FC14 for ; Tue, 13 Nov 2012 07:49:40 +0000 (UTC) Received: from laptop-sean-wifi.local (173-228-12-182.dsl.dynamic.sonic.net [173.228.12.182]) (using TLSv1 with cipher AES128-SHA (128/128 bits)) (No client certificate requested) (Authenticated sender: sean@chittenden.org) by mail01.lax1.stackjet.com (Postfix) with ESMTPSA id A661D3E8D40 for ; Mon, 12 Nov 2012 23:42:15 -0800 (PST) From: Sean Chittenden Content-Type: text/plain; charset=iso-8859-1 Content-Transfer-Encoding: quoted-printable Subject: 0.0.0.0/8 oddities... Message-Id: Date: Mon, 12 Nov 2012 23:42:14 -0800 To: freebsd-net@freebsd.org Mime-Version: 1.0 (Mac OS X Mail 6.2 \(1499\)) X-Mailer: Apple Mail (2.1499) X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 13 Nov 2012 07:49:41 -0000

Hello. I ran into an interesting situation in what appears to be an exotic situation. Specifically, after reviewing RFC5735 again and searching for a datacenter-local or rack-local IP range (i.e., trying to provide services that are guaranteed to be provided in the same rack as the server), I settled on the 0.0.0.0/8 network.
Per §3 of RFC5735, it would appear that this network is valid:

https://tools.ietf.org/html/rfc5735#section-3

> 0.0.0.0/8 - Addresses in this block refer to source hosts on "this"
> network. Address 0.0.0.0/32 may be used as a source address for this
> host on this network; other addresses within 0.0.0.0/8 may be used to
> refer to specified hosts on this network ([RFC1122], Section 3.2.1.3).

And this works as expected, with regards to TCP services. But ICMP? Not so much. Is there a reason that ICMP would fail, but TCP (e.g. ssh) works? For example, I pulled 0.42.123.10 and 0.42.123.20 as IP addresses to use for NTP servers, but much to my surprise, I could ssh between the hosts, but I couldn't ping. Is this intentional? I understand that 0.0.0.0/32 == INADDR_ANY for source addresses, but it doesn't appear that there should be a restriction of inbound echoreq packets. According to tcpdump(1), the host is receiving echoreq packets, however no echorep packets are generated. As a workaround, I threw things into a more traditional RFC1918 network and things immediately worked for both SSH and ICMP.

?? Any thoughts as to why? It doesn't appear that the current behavior abides by RFC5735.
-sc -- Sean Chittenden sean@chittenden.org From owner-freebsd-net@FreeBSD.ORG Tue Nov 13 08:06:30 2012 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 4F54ECA5 for ; Tue, 13 Nov 2012 08:06:30 +0000 (UTC) (envelope-from oppermann@networx.ch) Received: from c00l3r.networx.ch (c00l3r.networx.ch [62.48.2.2]) by mx1.freebsd.org (Postfix) with ESMTP id A1C618FC14 for ; Tue, 13 Nov 2012 08:06:29 +0000 (UTC) Received: (qmail 23275 invoked from network); 13 Nov 2012 09:40:42 -0000 Received: from c00l3r.networx.ch (HELO [127.0.0.1]) ([62.48.2.2]) (envelope-sender ) by c00l3r.networx.ch (qmail-ldap-1.03) with SMTP for ; 13 Nov 2012 09:40:42 -0000 Message-ID: <50A1FF80.3040900@networx.ch> Date: Tue, 13 Nov 2012 09:06:24 +0100 From: Andre Oppermann User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:16.0) Gecko/20121010 Thunderbird/16.0.1 MIME-Version: 1.0 To: Alfred Perlstein Subject: Re: auto tuning tcp References: <50A0A0EF.3020109@mu.org> <50A0A502.1030306@networx.ch> <50A0B8DA.9090409@mu.org> <50A0C0F4.8010706@networx.ch> <50A13961.1030909@networx.ch> <50A14460.9020504@mu.org> <50A1E2E7.3090705@mu.org> <50A1E47C.1030208@mu.org> <50A1EC92.9000507@mu.org> In-Reply-To: <50A1EC92.9000507@mu.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: "freebsd-net@freebsd.org" , Adrian Chadd , Peter Wemm X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 13 Nov 2012 08:06:30 -0000 On 13.11.2012 07:45, Alfred Perlstein wrote: > On 11/12/12 10:23 PM, Peter Wemm wrote: >> On Mon, Nov 12, 2012 at 10:11 PM, Alfred Perlstein wrote: >>> On 11/12/12 10:04 PM, Alfred Perlstein wrote: >>>> On 11/12/12 10:48 AM, Alfred Perlstein wrote: >>>>> On 11/12/12 10:01 AM, Andre 
Oppermann wrote: >>>>>> >>>>>> I've already added the tunable "kern.maxmbufmem" which is in pages. >>>>>> That's probably not very convenient to work with. I can change it >>>>>> to a percentage of phymem/kva. Would that make you happy? >>>>>> >>>>> It really makes sense to have the hash table be some relation to sockets >>>>> rather than buffers. >>>>> >>>>> If you are hashing "foo-objects" you want the hash to be some relation to >>>>> the max amount of "foo-objects" you'll see, not backwards derived from the >>>>> number of "bar-objects" that "foo-objects" contain, right? >>>>> >>>>> Because we are hashing the sockets, right? not clusters. >>>>> >>>>> Maybe I'm wrong? I'm open to ideas. >>>> >>>> Hey Andre, the following patch is what I was thinking >>>> (uncompiled/untested), it basically rounds up the maxsockets to a power of 2 >>>> and replaces the default 512 tcb hashsize. >>>> >>>> It might make sense to make the auto-tuning default to a minimum of 512. >>>> >>>> There are a number of other hashes with static sizes that could make use >>>> of this logic provided it's not upside-down. >>>> >>>> Any thoughts on this? >>>> >>>> Tune the tcp pcb hash based on maxsockets. >>>> Be more forgiving of poorly chosen tunables by finding a closer power >>>> of two rather than clamping down to 512. >>>> Index: tcp_subr.c >>>> =================================================================== >>> >>> Sorry, GUI mangled the patch... attaching a plain text version. >>> >>> >> Wait, you want to replace a hash with a flat array? Why even bother >> to call it a hash at that point? >> >> > > If you are concerned about the space/time tradeoff I'm pretty happy with making it 1/2, 1/4th, 1/8th > the size of maxsockets. (smaller?) > > Would that work better? I'd go for 1/8 or even 1/16 with a lower bound of 512. More than that is excessive. 
> The reason I chose to make it equal to max sockets was a space/time tradeoff, ideally a hash should
> have zero collisions and if a user has enough memory for 250,000 sockets, then surely they have
> enough memory for 256,000 pointers.

I agree in general. Though not all large memory servers serve a
large number of connections. We have to find a tradeoff here.

Having a perfect hash would certainly be laudable. As long as the
average hash chain doesn't go beyond a few entries it's not a problem.

> If you strongly disagree then I am fine with a more conservative setting, just note that effectively
> the hash table will require 1/2 the factor that we go smaller in additional traversals when we max
> out the number of sockets. Meaning if the table is 1/4 the size of max sockets, when we hit that
> many tcp connections I think we'll see an order of average 2 linked list traversals to find a node.
> At 1/8, then that number becomes 4.

I'm fine with that and claim that if you expect N sockets that you
would also increase maxfiles/sockets to N*2 to have some headroom.

> I recall back in 2001 on a PII400 with a custom webserver I wrote having a huge benefit by upping
> this to 2^14 or maybe even 2^16, I forget, but suddenly my CPU went down a huge amount and I didn't
> have to worry about a load balancer or other tricks.

I can certainly believe that. A hash size of 512 is no good if
you have more than 4K connections.

PS: Please note that my patch for mbuf and maxfiles tuning is not yet
in HEAD, it's still sitting in my tcp_workqueue branch. I still have
to search for derived values that may get totally out of whack with
the new scaling scheme.
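
For illustration only (this is not the patch from the thread): the sizing rule being discussed, a fraction of maxsockets rounded up to a power of two and never below 512, could be sketched as follows. The function and macro names are hypothetical, and the divisor of 8 is just one of the values proposed above.

```c
#include <assert.h>
#include <limits.h>

#define HASH_FLOOR	512	/* lower bound suggested in the thread */
#define HASH_DIVISOR	8	/* table is 1/8 the size of maxsockets */

/* Round v up to the next power of two; v must be nonzero. */
static unsigned long
roundup_pow2(unsigned long v)
{
	unsigned long p = 1;

	/* Guard against shifting past the top bit. */
	assert(v > 0 && v <= (ULONG_MAX >> 1) + 1);
	while (p < v)
		p <<= 1;
	return (p);
}

/* Hypothetical helper: maxsockets/8, a power of two, at least 512. */
static unsigned long
tcb_hashsize_for(unsigned long maxsockets)
{
	unsigned long want = maxsockets / HASH_DIVISOR;

	if (want < HASH_FLOOR)
		want = HASH_FLOOR;
	return (roundup_pow2(want));
}
```

With maxsockets at 256000 this sketch yields 32768 buckets, so the average chain stays at about 8 entries even when every socket is in use.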
-- Andre From owner-freebsd-net@FreeBSD.ORG Tue Nov 13 08:17:56 2012 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 2360F364; Tue, 13 Nov 2012 08:17:56 +0000 (UTC) (envelope-from bright@mu.org) Received: from elvis.mu.org (elvis.mu.org [192.203.228.196]) by mx1.freebsd.org (Postfix) with ESMTP id F04FD8FC12; Tue, 13 Nov 2012 08:17:55 +0000 (UTC) Received: from kruse-124.4.ixsystems.com (drawbridge.ixsystems.com [206.40.55.65]) by elvis.mu.org (Postfix) with ESMTPSA id 46CBA1A3C1A; Tue, 13 Nov 2012 00:17:55 -0800 (PST) Message-ID: <50A20251.7010302@mu.org> Date: Tue, 13 Nov 2012 00:18:25 -0800 From: Alfred Perlstein User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7; rv:16.0) Gecko/20121026 Thunderbird/16.0.2 MIME-Version: 1.0 To: Andre Oppermann Subject: Re: auto tuning tcp References: <50A0A0EF.3020109@mu.org> <50A0A502.1030306@networx.ch> <50A0B8DA.9090409@mu.org> <50A0C0F4.8010706@networx.ch> <50A13961.1030909@networx.ch> <50A14460.9020504@mu.org> <50A1E2E7.3090705@mu.org> <50A1E47C.1030208@mu.org> <50A1EC92.9000507@mu.org> <50A1FF80.3040900@networx.ch> In-Reply-To: <50A1FF80.3040900@networx.ch> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: "freebsd-net@freebsd.org" , Adrian Chadd , Peter Wemm X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 13 Nov 2012 08:17:56 -0000 On 11/13/12 12:06 AM, Andre Oppermann wrote: > On 13.11.2012 07:45, Alfred Perlstein wrote: >> On 11/12/12 10:23 PM, Peter Wemm wrote: >>> On Mon, Nov 12, 2012 at 10:11 PM, Alfred Perlstein >>> wrote: >>>> On 11/12/12 10:04 PM, Alfred Perlstein wrote: >>>>> On 11/12/12 10:48 AM, Alfred Perlstein wrote: >>>>>> On 11/12/12 10:01 AM, Andre Oppermann wrote: >>>>>>> 
>>>>>>> I've already added the tunable "kern.maxmbufmem" which is in pages. >>>>>>> That's probably not very convenient to work with. I can change it >>>>>>> to a percentage of phymem/kva. Would that make you happy? >>>>>>> >>>>>> It really makes sense to have the hash table be some relation to >>>>>> sockets >>>>>> rather than buffers. >>>>>> >>>>>> If you are hashing "foo-objects" you want the hash to be some >>>>>> relation to >>>>>> the max amount of "foo-objects" you'll see, not backwards derived >>>>>> from the >>>>>> number of "bar-objects" that "foo-objects" contain, right? >>>>>> >>>>>> Because we are hashing the sockets, right? not clusters. >>>>>> >>>>>> Maybe I'm wrong? I'm open to ideas. >>>>> >>>>> Hey Andre, the following patch is what I was thinking >>>>> (uncompiled/untested), it basically rounds up the maxsockets to a >>>>> power of 2 >>>>> and replaces the default 512 tcb hashsize. >>>>> >>>>> It might make sense to make the auto-tuning default to a minimum >>>>> of 512. >>>>> >>>>> There are a number of other hashes with static sizes that could >>>>> make use >>>>> of this logic provided it's not upside-down. >>>>> >>>>> Any thoughts on this? >>>>> >>>>> Tune the tcp pcb hash based on maxsockets. >>>>> Be more forgiving of poorly chosen tunables by finding a closer power >>>>> of two rather than clamping down to 512. >>>>> Index: tcp_subr.c >>>>> =================================================================== >>>> >>>> Sorry, GUI mangled the patch... attaching a plain text version. >>>> >>>> >>> Wait, you want to replace a hash with a flat array? Why even bother >>> to call it a hash at that point? >>> >>> >> >> If you are concerned about the space/time tradeoff I'm pretty happy >> with making it 1/2, 1/4th, 1/8th >> the size of maxsockets. (smaller?) >> >> Would that work better? > > I'd go for 1/8 or even 1/16 with a lower bound of 512. More than > that is excessive. I'm OK with 1/8. 
All I'm really going for is trying to make it somewhat better than 512 when un-tuned. > >> The reason I chose to make it equal to max sockets was a space/time >> tradeoff, ideally a hash should >> have zero collisions and if a user has enough memory for 250,000 >> sockets, then surely they have >> enough memory for 256,000 pointers. > > I agree in general. Though not all large memory servers do serve a > large amount of connections. We have find a tradeoff here. > > Having a perfect hash would certainly be laudable. As long as the > average hash chain doesn't go beyond few entries it's not a problem. > >> If you strongly disagree then I am fine with a more conservative >> setting, just note that effectively >> the hash table will require 1/2 the factor that we go smaller in >> additional traversals when we max >> out the number of sockets. Meaning if the table is 1/4 the size of >> max sockets, when we hit that >> many tcp connections I think we'll see an order of average 2 linked >> list traversals to find a node. >> At 1/8, then that number becomes 4. > > I'm fine with that and claim that if you expect N sockets that you > would also increase maxfiles/sockets to N*2 to have some headroom. That is a good point. > >> I recall back in 2001 on a PII400 with a custom webserver I wrote >> having a huge benefit by upping >> this to 2^14 or maybe even 2^16, I forget, but suddenly my CPU went >> down a huge amount and I didn't >> have to worry about a load balancer or other tricks. > > I can certainly believe that. A hash size of 512 is no good if > you have more than 4K connections. > > PS: Please note that my patch for mbuf and maxfiles tuning is not yet > in HEAD, it's still sitting in my tcp_workqueue branch. I still have > to search for derived values that may get totally out of whack with > the new scaling scheme. > This is cool! Thank you for the feedback. Would you like me to put this on a user branch somewhere for you to merge into your perf branch? 
-Alfred From owner-freebsd-net@FreeBSD.ORG Tue Nov 13 08:22:55 2012 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 024CA5F8 for ; Tue, 13 Nov 2012 08:22:54 +0000 (UTC) (envelope-from oppermann@networx.ch) Received: from c00l3r.networx.ch (c00l3r.networx.ch [62.48.2.2]) by mx1.freebsd.org (Postfix) with ESMTP id 5D1158FC08 for ; Tue, 13 Nov 2012 08:22:53 +0000 (UTC) Received: (qmail 23372 invoked from network); 13 Nov 2012 09:57:07 -0000 Received: from c00l3r.networx.ch (HELO [127.0.0.1]) ([62.48.2.2]) (envelope-sender ) by c00l3r.networx.ch (qmail-ldap-1.03) with SMTP for ; 13 Nov 2012 09:57:07 -0000 Message-ID: <50A20359.9080906@networx.ch> Date: Tue, 13 Nov 2012 09:22:49 +0100 From: Andre Oppermann User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:16.0) Gecko/20121010 Thunderbird/16.0.1 MIME-Version: 1.0 To: Sean Chittenden Subject: Re: 0.0.0.0/8 oddities... References: In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 8bit Cc: freebsd-net@freebsd.org, gnn@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 13 Nov 2012 08:22:55 -0000 On 13.11.2012 08:42, Sean Chittenden wrote: > Hello. I ran in to an interesting situation in what appears to be an exotic situation. Specifically, after reviewing RFC5735 again and searching for a datacenter-local or rack-local IP range (i.e trying to provide services that are guaranteed to be provided in the same rack as the server), I settled on the 0.0.0.0/8 network. Per 3 of RFC5735, it would appear that this network is valid: > > https://tools.ietf.org/html/rfc5735#section-3 > >> 0.0.0.0/8 - Addresses in this block refer to source hosts on "this" >> network. 
Address 0.0.0.0/32 may be used as a source address for this
>> host on this network; other addresses within 0.0.0.0/8 may be used to
>> refer to specified hosts on this network ([RFC1122], Section 3.2.1.3).
>
> And this works as expected, with regards to TCP services. But ICMP? Not so much. Is there a reason that ICMP would fail, but TCP (e.g. ssh) works? For example, I pulled 0.42.123.10 and 0.42.123.20 as IP addresses to use for NTP servers, but much to my surprise, I could ssh between the hosts, but I couldn't ping. Is this intentional? I understand that 0.0.0.0/32 == INADDR_ANY for source addresses, but it doesn't appear that there should be a restriction of inbound echoreq packets. According to tcpdump(1), the host is receiving echoreq packets, however no echorep packets are generated. As a work around, I threw things in to a more traditional RFC1918 network and things immediately worked for both SSH and ICMP.

The check to drop ICMP replies to a source of 0.0.0.0/8 was added in
r120958 as part of a fix for link-local addresses. It was only applied
to ICMP, which is inconsistent, as you've found out.

> ?? Any thoughts as to why? It doesn't appear that the current behavior abides by RFC5735.

Reading this section and RFC1122 it is not entirely clear to me what
the allowed scope of 0.0.0.0/8 is. I do agree though that blocking it
only in ICMP is not useful if it is allowed in the normal IP input
path.

Can you please check how other OSes (Linux, Windows) deal with it?
You may also want to search for this question on NANOG, and if not
found raise it there.
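
For illustration only: this is not the code from r120958, just a sketch of what a test for membership in 0.0.0.0/8 looks like. The helper names are hypothetical, and addresses are assumed to be in host byte order.

```c
#include <assert.h>
#include <stdint.h>

/* Build a host-order IPv4 address from its four dotted-quad octets. */
static uint32_t
ipv4(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
	return (((uint32_t)a << 24) | ((uint32_t)b << 16) |
	    ((uint32_t)c << 8) | (uint32_t)d);
}

/*
 * Hypothetical helper: nonzero when a host-order IPv4 address falls
 * inside 0.0.0.0/8, i.e. when its first octet is zero.
 */
static int
in_zeronet(uint32_t addr)
{
	return ((addr >> 24) == 0);
}

/* The addresses mentioned in the thread, checked against the helper. */
static void
check_examples(void)
{
	assert(in_zeronet(ipv4(0, 42, 123, 10)));
	assert(in_zeronet(ipv4(0, 42, 123, 20)));
	assert(!in_zeronet(ipv4(10, 0, 0, 1)));	/* RFC1918 space */
}
```

A stack that applies such a test on only one path (for example ICMP replies) and not on the general IP input path would behave the way the report describes: TCP works while ping does not.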
-- Andre From owner-freebsd-net@FreeBSD.ORG Tue Nov 13 08:25:26 2012 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 8EAB86C3 for ; Tue, 13 Nov 2012 08:25:26 +0000 (UTC) (envelope-from oppermann@networx.ch) Received: from c00l3r.networx.ch (c00l3r.networx.ch [62.48.2.2]) by mx1.freebsd.org (Postfix) with ESMTP id E5EE48FC08 for ; Tue, 13 Nov 2012 08:25:25 +0000 (UTC) Received: (qmail 23388 invoked from network); 13 Nov 2012 09:59:39 -0000 Received: from c00l3r.networx.ch (HELO [127.0.0.1]) ([62.48.2.2]) (envelope-sender ) by c00l3r.networx.ch (qmail-ldap-1.03) with SMTP for ; 13 Nov 2012 09:59:39 -0000 Message-ID: <50A203F0.3020803@networx.ch> Date: Tue, 13 Nov 2012 09:25:20 +0100 From: Andre Oppermann User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:16.0) Gecko/20121010 Thunderbird/16.0.1 MIME-Version: 1.0 To: Alfred Perlstein Subject: Re: auto tuning tcp References: <50A0A0EF.3020109@mu.org> <50A0A502.1030306@networx.ch> <50A0B8DA.9090409@mu.org> <50A0C0F4.8010706@networx.ch> <50A13961.1030909@networx.ch> <50A14460.9020504@mu.org> <50A1E2E7.3090705@mu.org> <50A1E47C.1030208@mu.org> <50A1EC92.9000507@mu.org> <50A1FF80.3040900@networx.ch> <50A20251.7010302@mu.org> In-Reply-To: <50A20251.7010302@mu.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: "freebsd-net@freebsd.org" X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 13 Nov 2012 08:25:26 -0000 On 13.11.2012 09:18, Alfred Perlstein wrote: > On 11/13/12 12:06 AM, Andre Oppermann wrote: >> On 13.11.2012 07:45, Alfred Perlstein wrote: >>> If you are concerned about the space/time tradeoff I'm pretty happy with making it 1/2, 1/4th, 1/8th >>> the size of maxsockets. (smaller?) 
>>> >>> Would that work better? >> >> I'd go for 1/8 or even 1/16 with a lower bound of 512. More than >> that is excessive. > > I'm OK with 1/8. All I'm really going for is trying to make it somewhat better than 512 when un-tuned. > >> PS: Please note that my patch for mbuf and maxfiles tuning is not yet >> in HEAD, it's still sitting in my tcp_workqueue branch. I still have >> to search for derived values that may get totally out of whack with >> the new scaling scheme. >> > This is cool! Thank you for the feedback. > > Would you like me to put this on a user branch somewhere for you to merge into your perf branch? I can put it into my branch and also merge it to HEAD with a "Submitted by: alfred" line. -- Andre From owner-freebsd-net@FreeBSD.ORG Tue Nov 13 08:41:18 2012 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id AF24A9A2 for ; Tue, 13 Nov 2012 08:41:18 +0000 (UTC) (envelope-from bright@mu.org) Received: from elvis.mu.org (elvis.mu.org [192.203.228.196]) by mx1.freebsd.org (Postfix) with ESMTP id 8A1EA8FC0C for ; Tue, 13 Nov 2012 08:41:18 +0000 (UTC) Received: from kruse-124.4.ixsystems.com (drawbridge.ixsystems.com [206.40.55.65]) by elvis.mu.org (Postfix) with ESMTPSA id 605911A3C1A; Tue, 13 Nov 2012 00:41:18 -0800 (PST) Message-ID: <50A207CC.3060104@mu.org> Date: Tue, 13 Nov 2012 00:41:48 -0800 From: Alfred Perlstein User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7; rv:16.0) Gecko/20121026 Thunderbird/16.0.2 MIME-Version: 1.0 To: Andre Oppermann Subject: Re: auto tuning tcp References: <50A0A0EF.3020109@mu.org> <50A0A502.1030306@networx.ch> <50A0B8DA.9090409@mu.org> <50A0C0F4.8010706@networx.ch> <50A13961.1030909@networx.ch> <50A14460.9020504@mu.org> <50A1E2E7.3090705@mu.org> <50A1E47C.1030208@mu.org> <50A1EC92.9000507@mu.org> <50A1FF80.3040900@networx.ch> <50A20251.7010302@mu.org> <50A203F0.3020803@networx.ch> In-Reply-To: 
<50A203F0.3020803@networx.ch> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: "freebsd-net@freebsd.org" X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 13 Nov 2012 08:41:18 -0000

On 11/13/12 12:25 AM, Andre Oppermann wrote:
> On 13.11.2012 09:18, Alfred Perlstein wrote:
>> On 11/13/12 12:06 AM, Andre Oppermann wrote:
>>> On 13.11.2012 07:45, Alfred Perlstein wrote:
>>>> If you are concerned about the space/time tradeoff I'm pretty happy
>>>> with making it 1/2, 1/4th, 1/8th
>>>> the size of maxsockets. (smaller?)
>>>>
>>>> Would that work better?
>>>
>>> I'd go for 1/8 or even 1/16 with a lower bound of 512. More than
>>> that is excessive.
>>
>> I'm OK with 1/8. All I'm really going for is trying to make it
>> somewhat better than 512 when un-tuned.
> >
>>> PS: Please note that my patch for mbuf and maxfiles tuning is not yet
>>> in HEAD, it's still sitting in my tcp_workqueue branch. I still have
>>> to search for derived values that may get totally out of whack with
>>> the new scaling scheme.
>>>
>> This is cool! Thank you for the feedback.
>>
>> Would you like me to put this on a user branch somewhere for you to
>> merge into your perf branch?
>
> I can put it into my branch and also merge it to HEAD with
> a "Submitted by: alfred" line.
>

Thank you, that works. Note: it's not even compile tested at this point. I should be able to do so tomorrow.

Are there other hashes to look at? I noticed a few more:

UDBHASHSIZE
netinet/tcp_hostcache.c:#define TCP_HOSTCACHE_HASHSIZE 512
netinet/sctp_constants.h:#define SCTP_TCBHASHSIZE 1024
netinet/sctp_constants.h:#define SCTP_PCBHASHSIZE 256
netinet/tcp_syncache.c:#define TCP_SYNCACHE_HASHSIZE 512

Any of these look like good targets? I think most could be looked at. I've only glanced.
I can provide deltas. -Alfred From owner-freebsd-net@FreeBSD.ORG Tue Nov 13 10:21:22 2012 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id CCC5297B for ; Tue, 13 Nov 2012 10:21:22 +0000 (UTC) (envelope-from andre@freebsd.org) Received: from c00l3r.networx.ch (c00l3r.networx.ch [62.48.2.2]) by mx1.freebsd.org (Postfix) with ESMTP id 2BFE08FC16 for ; Tue, 13 Nov 2012 10:21:21 +0000 (UTC) Received: (qmail 23795 invoked from network); 13 Nov 2012 11:55:32 -0000 Received: from c00l3r.networx.ch (HELO [127.0.0.1]) ([62.48.2.2]) (envelope-sender ) by c00l3r.networx.ch (qmail-ldap-1.03) with SMTP for ; 13 Nov 2012 11:55:32 -0000 Message-ID: <50A21F1B.5090607@freebsd.org> Date: Tue, 13 Nov 2012 11:21:15 +0100 From: Andre Oppermann User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:16.0) Gecko/20121010 Thunderbird/16.0.1 MIME-Version: 1.0 To: Gleb Smirnoff Subject: Re: svn commit: r240494 - in head: contrib/pf/man contrib/pf/pfctl include sbin/pfctl sbin/pfctl/missing share/man/man4 share/man/man5 sys/conf sys/contrib/pf sys/modules/dummynet sys/modules/ipfw sys/... 
References: <201209141151.q8EBppm1014858@svn.freebsd.org> <20121113021140.GB260@dragon.NUXI.org> <20121113091713.GF27927@FreeBSD.org> In-Reply-To: <20121113091713.GF27927@FreeBSD.org> Content-Type: text/plain; charset=KOI8-R; format=flowed Content-Transfer-Encoding: 7bit Cc: svn-src-head@FreeBSD.org, freebsd-net@freebsd.org, svn-src-all@FreeBSD.org, src-committers@FreeBSD.org, David O'Brien X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 13 Nov 2012 10:21:22 -0000 On 13.11.2012 10:17, Gleb Smirnoff wrote: > On Mon, Nov 12, 2012 at 06:11:40PM -0800, David O'Brien wrote: > D> On Fri, Sep 14, 2012 at 11:51:51AM +0000, Gleb Smirnoff wrote: > D> > Log: > D> > o Create directory sys/netpfil, where all packet filters should > D> > reside, and move there ipfw(4) and pf(4). > D> > o Move most modified parts of pf out of contrib. > D> > D> Why didn't contrib/ipfilter/ move to sys/netpfil/ as well? > D> > D> Having 1/3 of our packet filters not there (sys/netpfil) might suggest we > D> shouldn't create sys/netpfil/ > > ipfilter is really selfcontained and is a contrib code. Though it can't decide whether to really live in contrib or as part of FreeBSD. Also it hasn't been updated in a long time and the official version has progressed quite a bit. IMHO the version we have should either go away and be replaced with a fresh up to date import through the vendor channel, or move to netpfil. Would be a great task for a junior kernel hacker. 
-- Andre From owner-freebsd-net@FreeBSD.ORG Tue Nov 13 10:24:35 2012 Return-Path: Delivered-To: freebsd-net@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 18DD5D03; Tue, 13 Nov 2012 10:24:35 +0000 (UTC) (envelope-from glebius@FreeBSD.org) Received: from cell.glebius.int.ru (glebius.int.ru [81.19.64.117]) by mx1.freebsd.org (Postfix) with ESMTP id 7FF6D8FC12; Tue, 13 Nov 2012 10:24:34 +0000 (UTC) Received: from cell.glebius.int.ru (localhost [127.0.0.1]) by cell.glebius.int.ru (8.14.5/8.14.5) with ESMTP id qADAOQXJ020351; Tue, 13 Nov 2012 14:24:26 +0400 (MSK) (envelope-from glebius@FreeBSD.org) Received: (from glebius@localhost) by cell.glebius.int.ru (8.14.5/8.14.5/Submit) id qADAOQlN020350; Tue, 13 Nov 2012 14:24:26 +0400 (MSK) (envelope-from glebius@FreeBSD.org) X-Authentication-Warning: cell.glebius.int.ru: glebius set sender to glebius@FreeBSD.org using -f Date: Tue, 13 Nov 2012 14:24:26 +0400 From: Gleb Smirnoff To: Andre Oppermann Subject: Re: svn commit: r240494 - in head: contrib/pf/man contrib/pf/pfctl include sbin/pfctl sbin/pfctl/missing share/man/man4 share/man/man5 sys/conf sys/contrib/pf sys/modules/dummynet sys/modules/ipfw sys/... 
Message-ID: <20121113102426.GA20289@FreeBSD.org> References: <201209141151.q8EBppm1014858@svn.freebsd.org> <20121113021140.GB260@dragon.NUXI.org> <20121113091713.GF27927@FreeBSD.org> <50A21F1B.5090607@freebsd.org> MIME-Version: 1.0 Content-Type: text/plain; charset=koi8-r Content-Disposition: inline In-Reply-To: <50A21F1B.5090607@freebsd.org> User-Agent: Mutt/1.5.21 (2010-09-15) Cc: svn-src-head@FreeBSD.org, freebsd-net@FreeBSD.org, svn-src-all@FreeBSD.org, src-committers@FreeBSD.org, David O'Brien X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 13 Nov 2012 10:24:35 -0000 On Tue, Nov 13, 2012 at 11:21:15AM +0100, Andre Oppermann wrote: A> On 13.11.2012 10:17, Gleb Smirnoff wrote: A> > On Mon, Nov 12, 2012 at 06:11:40PM -0800, David O'Brien wrote: A> > D> On Fri, Sep 14, 2012 at 11:51:51AM +0000, Gleb Smirnoff wrote: A> > D> > Log: A> > D> > o Create directory sys/netpfil, where all packet filters should A> > D> > reside, and move there ipfw(4) and pf(4). A> > D> > o Move most modified parts of pf out of contrib. A> > D> A> > D> Why didn't contrib/ipfilter/ move to sys/netpfil/ as well? A> > D> A> > D> Having 1/3 of our packet filters not there (sys/netpfil) might suggest we A> > D> shouldn't create sys/netpfil/ A> > A> > ipfilter is really selfcontained and is a contrib code. A> A> Though it can't decide whether to really live in contrib or A> as part of FreeBSD. Also it hasn't been updated in a long A> time and the official version has progressed quite a bit. A> A> IMHO the version we have should either go away and be replaced A> with a fresh up to date import through the vendor channel, or A> move to netpfil. A> A> Would be a great task for a junior kernel hacker. I don't see any reason to remove it since it builds fine and isn't any obstacle in further development of FreeBSD. 
Let it be there while it works. -- Totus tuus, Glebius. From owner-freebsd-net@FreeBSD.ORG Tue Nov 13 11:52:29 2012 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 2A36A23D for ; Tue, 13 Nov 2012 11:52:29 +0000 (UTC) (envelope-from thesaurarius.romae@yandex.ru) Received: from forward8.mail.yandex.net (forward8.mail.yandex.net [IPv6:2a02:6b8:0:202::3]) by mx1.freebsd.org (Postfix) with ESMTP id 489058FC12 for ; Tue, 13 Nov 2012 11:52:28 +0000 (UTC) Received: from smtp8.mail.yandex.net (smtp8.mail.yandex.net [77.88.61.54]) by forward8.mail.yandex.net (Yandex) with ESMTP id AE8FDF6139A for ; Tue, 13 Nov 2012 15:52:25 +0400 (MSK) Received: from smtp8.mail.yandex.net (localhost [127.0.0.1]) by smtp8.mail.yandex.net (Yandex) with ESMTP id 8ABCC1B60345 for ; Tue, 13 Nov 2012 15:52:23 +0400 (MSK) Received: from mail-vb0-f54.google.com (mail-vb0-f54.google.com [209.85.212.54]) by smtp8.mail.yandex.net (nwsmtp/Yandex) with ESMTP id qLl0cDIo-qMl0kpdo; Tue, 13 Nov 2012 15:52:23 +0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yandex.ru; s=mail; t=1352807543; bh=xPkqjh4s2rEH6TlGDPJgOF/3aK42IIvT/rZYE+Uz4To=; h=Received:X-Google-DKIM-Signature:MIME-Version:Received:Received: X-Originating-IP:Received:In-Reply-To:References:Date:Message-ID: Subject:From:To:Cc:Content-Type:X-Gm-Message-State; b=CIwaqZaKcHXkhpjRYAUo78LBGQSKM6V4+jGVWHem10gUdYquxdjUdV4MZBAr7rvIE RBU6f8ZIqf6bF/8Ott+GtVdQ/wcXFDPIywQA4GCUvqr0w4x4VVFcG9RYsbjK6Z+tNy cwOardeBpFaKdUg0h4g36fElieC7ucySEodL/uzw= Received: by mail-vb0-f54.google.com with SMTP id l1so9513088vba.13 for ; Tue, 13 Nov 2012 03:52:21 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20120113; h=mime-version:x-originating-ip:in-reply-to:references:date :message-id:subject:from:to:cc:content-type:x-gm-message-state; bh=xPkqjh4s2rEH6TlGDPJgOF/3aK42IIvT/rZYE+Uz4To=; 
b=jR6z03dyK8flSXphvlZLjm9DcevRZ/qTBy5qLE4J8V9nM7vmLh2nbRLsuiOeKpXw8y 9V4XoVjuNPrTez+YZuYJLaM7ReSaffx18Vf61wJgSAMJ/z1XlviNPrH58HxNaybGCKS7 D+kn/d+djjY7L7bvoK83ezdg6g+DlrMZYe0ZNxs8CwNVanK5oJ6RSKtGEPubopzsOKZO 8p5yD03MUkNHO0bQZrGhpj9CjK6f7H2cCktW8MB1sn43GIW8AA4j/xlPsdfTQErjnGTr WjGnrDZwbBY6LSGxHNFbnvkGnU8+w4bmVSC0eZkcyqTdfn2z+BG7eQsbL1mLesFs6fhH o+wA== MIME-Version: 1.0 Received: by 10.58.207.196 with SMTP id ly4mr25750474vec.6.1352807541583; Tue, 13 Nov 2012 03:52:21 -0800 (PST) Received: by 10.58.229.200 with HTTP; Tue, 13 Nov 2012 03:52:21 -0800 (PST) X-Originating-IP: [78.36.196.85] Received: by 10.58.229.200 with HTTP; Tue, 13 Nov 2012 03:52:21 -0800 (PST) In-Reply-To: References: Date: Tue, 13 Nov 2012 15:52:21 +0400 Message-ID: Subject: Re: IPv6 NDP Proxy From: Thesaurarius Romae To: Özkan KIRIK X-Gm-Message-State: ALoCoQnFfg9M4XWqSKk7Jexyoqc11Udr4b8xvd8um/+Fr26VHlicfQEkha6BggGNoi9efQkJF5Vh Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable X-Content-Filtered-By: Mailman/MimeDel 2.1.14 Cc: freebsd-net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 13 Nov 2012 11:52:29 -0000 That's... So simple that it sounds like a genius idea :) It works great, thanks! :) On Nov 13, 2012 10:11 AM, "Özkan KIRIK" wrote: > you can bridge all interfaces with if_bridge. > Later, assign the IPv6 addresses to the bridge0 interface. > > > > On Sun, Nov 11, 2012 at 7:28 PM, Thesaurarius Romae < > thesaurarius.romae@yandex.ru> wrote: > >> Hello all! >> >> I have a small problem - I need to give IPv6 addresses to several machines >> on a network interfaces, but provider who gave me IPv6 /64 network wants >> to >> see all the IPv6 hosts in the same L2 network.
>> >> To be more exact, I have a physical server at hetzner.de, with interface >> re0. I successfully configured IPv6 address on this interface and >> everything works fine, but I also have VMs on interfaces tap0, vboxnet0 >> and >> OpenVPN clients on tun0. I google about that subject and solution I found >> is to use NDP proxy. But all the examples I found are for linux, while I >> use FreeBSD and can't figure how to the same thing on it. >> >> Here's the linux-solution: >> >> http://www.stocksy.co.uk/articles/Networks/ipv6_for_xen_hosts_on_a_hetzner_leased_server_with_a_routed_ipv4_allocation >> >> I'm pretty sure that solution is easy to implement and understand, but I >> guess my lack of IPv6-knowledge prevents me from figuring it. >> >> Thanks for your reply! >> >> P.S.: Please, CC me personally, since I'm not subscribed to the list. >> _______________________________________________ >> freebsd-net@freebsd.org mailing list >> http://lists.freebsd.org/mailman/listinfo/freebsd-net >> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" >> > > From owner-freebsd-net@FreeBSD.ORG Tue Nov 13 12:43:44 2012 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id B44672CA for ; Tue, 13 Nov 2012 12:43:44 +0000 (UTC) (envelope-from oppermann@networx.ch) Received: from c00l3r.networx.ch (c00l3r.networx.ch [62.48.2.2]) by mx1.freebsd.org (Postfix) with ESMTP id 106488FC18 for ; Tue, 13 Nov 2012 12:43:43 +0000 (UTC) Received: (qmail 25012 invoked from network); 13 Nov 2012 14:17:55 -0000 Received: from c00l3r.networx.ch (HELO [127.0.0.1]) ([62.48.2.2]) (envelope-sender ) by c00l3r.networx.ch (qmail-ldap-1.03) with SMTP for ; 13 Nov 2012 14:17:55 -0000 Message-ID: <50A2407A.5080909@networx.ch> Date: Tue, 13 Nov 2012 13:43:38 +0100 From: Andre Oppermann User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:16.0) Gecko/20121010 Thunderbird/16.0.1
MIME-Version: 1.0 To: Alfred Perlstein Subject: Re: auto tuning tcp References: <50A0A0EF.3020109@mu.org> <50A0A502.1030306@networx.ch> <50A0B8DA.9090409@mu.org> <50A0C0F4.8010706@networx.ch> <50A13961.1030909@networx.ch> <50A14460.9020504@mu.org> <50A1E2E7.3090705@mu.org> <50A1E47C.1030208@mu.org> <50A1EC92.9000507@mu.org> <50A1FF80.3040900@networx.ch> <50A20251.7010302@mu.org> <50A203F0.3020803@networx.ch> <50A207CC.3060104@mu.org> In-Reply-To: <50A207CC.3060104@mu.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: "freebsd-net@freebsd.org" , tuexen@FreeBSD.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 13 Nov 2012 12:43:44 -0000 On 13.11.2012 09:41, Alfred Perlstein wrote: > On 11/13/12 12:25 AM, Andre Oppermann wrote: >> On 13.11.2012 09:18, Alfred Perlstein wrote: >>> On 11/13/12 12:06 AM, Andre Oppermann wrote: >>>> On 13.11.2012 07:45, Alfred Perlstein wrote: >>>>> If you are concerned about the space/time tradeoff I'm pretty happy with making it 1/2, 1/4th, >>>>> 1/8th >>>>> the size of maxsockets. (smaller?) >>>>> >>>>> Would that work better? >>>> >>>> I'd go for 1/8 or even 1/16 with a lower bound of 512. More than >>>> that is excessive. >>> >>> I'm OK with 1/8. All I'm really going for is trying to make it somewhat better than 512 when >>> un-tuned. >> > >>>> PS: Please note that my patch for mbuf and maxfiles tuning is not yet >>>> in HEAD, it's still sitting in my tcp_workqueue branch. I still have >>>> to search for derived values that may get totally out of whack with >>>> the new scaling scheme. >>>> >>> This is cool! Thank you for the feedback. >>> >>> Would you like me to put this on a user branch somewhere for you to merge into your perf branch? 
>> >> I can put it into my branch and also merge it to HEAD with >> a "Submitted by: alfred" line. >> > Thank you, that works. Note: it's not even compile tested at this point. > > I should be able to do so tomorrow. > > Are there other hashes to look at? I noticed a few more: > > UDBHASHSIZE Even busy UDP servers have only a small number of sockets open. > netinet/tcp_hostcache.c:#define TCP_HOSTCACHE_HASHSIZE 512 This is per host, not per connection or socket. So it should be fine and scales independently. > netinet/sctp_constants.h:#define SCTP_TCBHASHSIZE 1024 > netinet/sctp_constants.h:#define SCTP_PCBHASHSIZE 256 Michael has look at that. > netinet/tcp_syncache.c:#define TCP_SYNCACHE_HASHSIZE 512 Again this is not per connection or socket. It depends on the number of concurrent SYN's waiting on SYN/ACK-ACK for a listen socket. This should be fine and it has overflow protection. If a SYN entry is lost it reverts to syncookies. > Any of these look like good targets? I think most could be looked at. I've only glanced. I can > provide deltas.
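The sizing rule discussed above (1/8 of maxsockets with a lower bound of 512) can be sketched roughly as follows. This is only an illustration, not the patch from the thread: the function names tcb_hashsize_autotune() and roundup_pow2() are hypothetical, and rounding the result up to a power of two is an assumption made here so that the size can be used as a hash mask.

```c
#include <assert.h>
#include <stdint.h>

/* Round v up to the next power of two; valid for 1 <= v <= 2^31. */
static uint32_t
roundup_pow2(uint32_t v)
{
	uint32_t p = 1;

	assert(v >= 1 && v <= (UINT32_C(1) << 31));
	while (p < v)
		p <<= 1;
	return (p);
}

/*
 * Hypothetical helper (not from the thread's patch): derive a hash
 * table size from maxsockets.  Takes 1/8 of maxsockets, never goes
 * below 512, and rounds up to a power of two (an assumption, so the
 * result can serve as "size - 1" mask).
 */
static uint32_t
tcb_hashsize_autotune(uint32_t maxsockets)
{
	uint32_t n = maxsockets / 8;

	if (n < 512)
		n = 512;
	return (roundup_pow2(n));
}
```

With this rule an un-tuned system keeps the old floor of 512 buckets, and the table only grows once maxsockets exceeds 4096.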
-- Andre From owner-freebsd-net@FreeBSD.ORG Tue Nov 13 12:58:11 2012 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 64920DE1 for ; Tue, 13 Nov 2012 12:58:11 +0000 (UTC) (envelope-from tuexen@FreeBSD.org) Received: from mail-n.franken.de (drew.ipv6.franken.de [IPv6:2001:638:a02:a001:20e:cff:fe4a:feaa]) by mx1.freebsd.org (Postfix) with ESMTP id 1F9938FC1A for ; Tue, 13 Nov 2012 12:58:10 +0000 (UTC) Received: from [10.0.1.109] (unknown [212.201.121.94]) (Authenticated sender: macmic) by mail-n.franken.de (Postfix) with ESMTP id B294A1C0C069F; Tue, 13 Nov 2012 13:58:08 +0100 (CET) Subject: Re: auto tuning tcp Mime-Version: 1.0 (Apple Message framework v1283) Content-Type: text/plain; charset=iso-8859-1 From: Michael Tuexen In-Reply-To: <50A2407A.5080909@networx.ch> Date: Tue, 13 Nov 2012 13:58:07 +0100 Content-Transfer-Encoding: quoted-printable Message-Id: References: <50A0A0EF.3020109@mu.org> <50A0A502.1030306@networx.ch> <50A0B8DA.9090409@mu.org> <50A0C0F4.8010706@networx.ch> <50A13961.1030909@networx.ch> <50A14460.9020504@mu.org> <50A1E2E7.3090705@mu.org> <50A1E47C.1030208@mu.org> <50A1EC92.9000507@mu.org> <50A1FF80.3040900@networx.ch> <50A20251.7010302@mu.org> <50A203F0.3020803@networx.ch> <50A207CC.3060104@mu.org> <50A2407A.5080909@networx.ch> To: Andre Oppermann X-Mailer: Apple Mail (2.1283) Cc: "freebsd-net@freebsd.org" , Alfred Perlstein X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 13 Nov 2012 12:58:11 -0000 On Nov 13, 2012, at 1:43 PM, Andre Oppermann wrote: > On 13.11.2012 09:41, Alfred Perlstein wrote: >> On 11/13/12 12:25 AM, Andre Oppermann wrote: >>> On 13.11.2012 09:18, Alfred Perlstein wrote: >>>> On 11/13/12 12:06 AM, Andre Oppermann wrote: >>>>> On 13.11.2012 
07:45, Alfred Perlstein wrote: >>>>>> If you are concerned about the space/time tradeoff I'm pretty happy with making it 1/2, 1/4th, >>>>>> 1/8th >>>>>> the size of maxsockets. (smaller?) >>>>>> >>>>>> Would that work better? >>>>> >>>>> I'd go for 1/8 or even 1/16 with a lower bound of 512. More than >>>>> that is excessive. >>>> >>>> I'm OK with 1/8. All I'm really going for is trying to make it somewhat better than 512 when >>>> un-tuned. >>> > >>>>> PS: Please note that my patch for mbuf and maxfiles tuning is not yet >>>>> in HEAD, it's still sitting in my tcp_workqueue branch. I still have >>>>> to search for derived values that may get totally out of whack with >>>>> the new scaling scheme. >>>>> >>>> This is cool! Thank you for the feedback. >>>> >>>> Would you like me to put this on a user branch somewhere for you to merge into your perf branch? >>> >>> I can put it into my branch and also merge it to HEAD with >>> a "Submitted by: alfred" line. >>> >> Thank you, that works. Note: it's not even compile tested at this point. >> >> I should be able to do so tomorrow. >> >> Are there other hashes to look at? I noticed a few more: >> >> UDBHASHSIZE > > Even busy UDP servers have only a small number of sockets open. > >> netinet/tcp_hostcache.c:#define TCP_HOSTCACHE_HASHSIZE 512 > > This is per host, not per connection or socket. So it should be fine > and scales independently. > >> netinet/sctp_constants.h:#define SCTP_TCBHASHSIZE 1024 >> netinet/sctp_constants.h:#define SCTP_PCBHASHSIZE 256 > > Michael has look at that. I can take a look... I also wanted to make it configurable... Best regards Michael > >> netinet/tcp_syncache.c:#define TCP_SYNCACHE_HASHSIZE 512 > > Again this is not per connection or socket. It depends on the number > of concurrent SYN's waiting on SYN/ACK-ACK for a listen socket. This > should be fine and it has overflow protection.
If a SYN entry is lost > it reverts to syncookies. > >> Any of these look like good targets? I think most could be looked at. I've only glanced. I can >> provide deltas. > > -- > Andre > > From owner-freebsd-net@FreeBSD.ORG Tue Nov 13 18:54:55 2012 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 8BE13116; Tue, 13 Nov 2012 18:54:55 +0000 (UTC) (envelope-from sean@chittenden.org) Received: from mail01.lax1.stackjet.com (mon01.lax1.stackjet.com [174.136.104.178]) by mx1.freebsd.org (Postfix) with ESMTP id 65B448FC14; Tue, 13 Nov 2012 18:54:55 +0000 (UTC) Received: from [192.168.11.242] (c-71-202-44-241.hsd1.ca.comcast.net [71.202.44.241]) (using TLSv1 with cipher AES128-SHA (128/128 bits)) (No client certificate requested) (Authenticated sender: sean@chittenden.org) by mail01.lax1.stackjet.com (Postfix) with ESMTPSA id 4C3433E8D56; Tue, 13 Nov 2012 10:54:54 -0800 (PST) Content-Type: text/plain; charset=iso-8859-1 Mime-Version: 1.0 (Mac OS X Mail 6.2 \(1499\)) Subject: Re: 0.0.0.0/8 oddities... From: Sean Chittenden In-Reply-To: <50A20359.9080906@networx.ch> Date: Tue, 13 Nov 2012 10:54:53 -0800 Content-Transfer-Encoding: quoted-printable Message-Id: <7C614093-6408-49C6-8515-F6C09183453B@chittenden.org> References: <50A20359.9080906@networx.ch> To: Andre Oppermann X-Mailer: Apple Mail (2.1499) Cc: "freebsd-net@freebsd.org" , gnn@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 13 Nov 2012 18:54:55 -0000 >> Hello. I ran in to an interesting situation in what appears to be an exotic situation.
Specifically, after reviewing RFC5735 again and searching for a datacenter-local or rack-local IP range (i.e trying to provide services that are guaranteed to be provided in the same rack as the server), I settled on the 0.0.0.0/8 network. Per §3 of RFC5735, it would appear that this network is valid: >> >> https://tools.ietf.org/html/rfc5735#section-3 >> >>> 0.0.0.0/8 - Addresses in this block refer to source hosts on "this" >>> network. Address 0.0.0.0/32 may be used as a source address for this >>> host on this network; other addresses within 0.0.0.0/8 may be used to >>> refer to specified hosts on this network ([RFC1122], Section 3.2.1.3). >> >> And this works as expected, with regards to TCP services. But ICMP? Not so much. Is there a reason that ICMP would fail, but TCP (e.g. ssh) works? For example, I pulled 0.42.123.10 and 0.42.123.20 as IP addresses to use for NTP servers, but much to my surprise, I could ssh between the hosts, but I couldn't ping. Is this intentional? I understand that 0.0.0.0/32 == INADDR_ANY for source addresses, but it doesn't appear that there should be a restriction of inbound echoreq packets. According to tcpdump(1), the host is receiving echoreq packets, however no echorep packets are generated. As a work around, I threw things in to a more traditional RFC1918 network and things immediately worked for both SSH and ICMP. > > The check to drop ICMP replies to a source of 0.0.0.0/8 was added > in r120958 as part of a fix for link local addresses. It was only > applied to ICMP which is inconsistent as you've found out. > >> ?? Any thoughts as to why? It doesn't appear that the current behavior abides by RFC5735. > > Reading this section and RFC1122 it is not entirely clear to me > what the allowed scope of 0.0.0.0/8 is. I do agree though that > blocking it only in ICMP is not useful if it is allowed in the > normal IP input path.
> > Can you please check how other OS's (Linux, Windows) deal with it? I... err.. I don't have any Linux or windows boxes that I can play with for a few weeks, but I can stand one up later this month sometime. ELINUXFREE ? > You may also want to search for this question on NANOG, and if not > found raise it there. I looked around to see if this was changed recently, but I couldn't find any reference as such. The thought was to pluck an easy mnemonic for remembering and correlating network services that are guaranteed and required to be site-local and not available through an MPLS or VPN network (10/8, 192.168/16 and 172.16/12 are managed by BGP or some other IGP, I wanted something explicitly that would not match).
0.42.123.10 & 0.42.123.20 = site-local NTP server.
^
| ^^
| |  ^^^
| |  |   ^
| |  |   |
| |  |   +------ primary, .20 = secondary
| |  +--------- "port 123 services"
| +------------ "the answer to"
+--------------- site/data center local
There were a host of convenience things that came for free with this, including easy to identify what traffic should be on the segment, etc. DNS would be at 0.42.53.{10,20}, etc. Answering questions like "this data center's DNS server is at 172.29.167.4" is a PITA, and it doesn't need to be. I saw you just made a commit to enable this a few minutes ago, thank you. Any plans to MFC r242956? Thanks Andre.
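The mnemonic above (0.42.<port>.10 for the primary, .20 for the secondary) and the 0.0.0.0/8 membership test can be sketched in userland C as below. The helper names in_zeronet() and site_local_addr() are hypothetical and are not taken from the FreeBSD sources. Note the scheme as described only works for service ports that fit into one octet (53, 123 and so on).

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>

/* Return non-zero if the address (network byte order) lies in 0.0.0.0/8. */
static int
in_zeronet(struct in_addr a)
{
	return ((ntohl(a.s_addr) >> 24) == 0);
}

/*
 * Hypothetical helper: build 0.42.<port>.10 (primary) or
 * 0.42.<port>.20 (secondary).  Returns -1 if the port does not fit
 * into a single octet, 0 on success.
 */
static int
site_local_addr(unsigned int port, int secondary, struct in_addr *out)
{
	uint32_t last = secondary ? 20 : 10;

	if (port > 255)
		return (-1);
	out->s_addr = htonl((UINT32_C(42) << 16) | ((uint32_t)port << 8) | last);
	return (0);
}
```

For example, site_local_addr(53, 1, &a) would yield the secondary DNS address 0.42.53.20 from the message, and in_zeronet(a) confirms it falls in the block that the ICMP check used to reject.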
-sc -- Sean Chittenden seanc@FreeBSD.org From owner-freebsd-net@FreeBSD.ORG Tue Nov 13 19:30:44 2012 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 9E148E34; Tue, 13 Nov 2012 19:30:44 +0000 (UTC) (envelope-from melifaro@yandex-team.ru) Received: from forward2.mail.yandex.net (forward2.mail.yandex.net [IPv6:2a02:6b8:0:602::2]) by mx1.freebsd.org (Postfix) with ESMTP id 268358FC17; Tue, 13 Nov 2012 19:30:42 +0000 (UTC) Received: from smtpcorp1.mail.yandex.net (smtpcorp1.mail.yandex.net [77.88.47.195]) by forward2.mail.yandex.net (Yandex) with ESMTP id 4499D12A0F61; Tue, 13 Nov 2012 23:30:40 +0400 (MSK) Received: from smtpcorp1.mail.yandex.net (localhost [127.0.0.1]) by smtpcorp1.mail.yandex.net (Yandex) with ESMTP id 24D83A013B; Tue, 13 Nov 2012 23:30:40 +0400 (MSK) Received: from dhcp170-36-red.yandex.net (dhcp170-36-red.yandex.net [95.108.170.36]) by smtpcorp1.mail.yandex.net (nwsmtp/Yandex) with ESMTP id UdnO2wEU-UenmEQ6J; Tue, 13 Nov 2012 23:30:40 +0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yandex-team.ru; s=default; t=1352835040; bh=r8ywokDCACklQEe7w7/JgqMt6yepQMyxPhzvwuDEhKQ=; h=Message-ID:Date:From:User-Agent:MIME-Version:To:CC:Subject: Content-Type; b=ah1oS7mHSrB9NpDR9FhwJpOpBv1up2bAUeX8DfoJGAJweIKUV9eBeKiLh107GtrIC PaqBDB9nUGZD2kqJWXQPrUaAn4csjuzCmnhSvzkjoEho0GiN8bQ4+hcePulw2ZflD+ SZbSEveCFv4dU6J+jMyMDoNi6XR/lfCFmrnwHrSQ= Message-ID: <50A29F57.6090701@yandex-team.ru> Date: Tue, 13 Nov 2012 23:28:23 +0400 From: "Alexander V. 
Chernikov" User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:13.0) Gecko/20120627 Thunderbird/13.0.1 MIME-Version: 1.0 To: freebsd-ipfw@freebsd.org Subject: [CFT] ipfw SMP-ready dynamic states Content-Type: multipart/mixed; boundary="------------000406010204050104020709" Cc: "freebsd-net@freebsd.org" , Luigi Rizzo X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 13 Nov 2012 19:30:44 -0000 This is a multi-part message in MIME format. --------------000406010204050104020709 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Hello list! Currently most ipfw operations with dynamic states (keep-state, check-state, limit) are serialized via IPFW_DYN_LOCK() which is per-vnet mutex lock. As a result, performance is limited to the same ~650kpps as in routing (in several cases). Patch changes the following: * global lock is changed to per-bucket mutex * state expiration is done in ipfw_tick every 1s. No expiration is done on forwarding path * hash table resize is done automatically and does not cause all states to be lost The only (architectural) problem I see is unlocked V_dyn_count increments. So, we can do the following: 1) lock increments/decrements via some separate mutex 2) do nothing 3) take some combined approach: Generally, we don't need value to be _exact_. As a result, we count total number of states in every ipfw_tick run and set V_dyn_count to new value. New states still increment V_dyn_count unlocked. Performance: Synthetic traffic, ipfw with single allow ip from any to any rule: 2.4M. single keep-state ip from any to any: 2.2M. Some more tests should be taken (with large number of states, different types of traffic, etc), maybe I can do some next week. You need to run recent -current or merge r242631 and r242834 before applying this patch. 
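The locking change described above, together with option 3 for the counter, can be sketched in userland C roughly as follows. This is a simplified illustration and not the attached patch: it uses pthread mutexes in place of mtx(9), and the names struct table, table_insert() and table_tick() are hypothetical. The counter is updated without a lock on insert, as in the proposal, and is overwritten with an exact recount by the periodic tick.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>

struct state {
	struct state *next;
	uint32_t key;
};

struct bucket {
	pthread_mutex_t mtx;	/* protects head and the list behind it */
	struct state *head;
};

struct table {
	struct bucket *b;
	uint32_t nbuckets;	/* always a power of two */
	uint32_t count;		/* approximate; refreshed by table_tick() */
};

static int
table_init(struct table *t, uint32_t nbuckets)
{
	uint32_t i;

	assert(nbuckets != 0 && (nbuckets & (nbuckets - 1)) == 0);
	t->b = calloc(nbuckets, sizeof(*t->b));
	if (t->b == NULL)
		return (-1);
	for (i = 0; i < nbuckets; i++)
		pthread_mutex_init(&t->b[i].mtx, NULL);
	t->nbuckets = nbuckets;
	t->count = 0;
	return (0);
}

static int
table_insert(struct table *t, uint32_t key)
{
	struct bucket *bk = &t->b[key & (t->nbuckets - 1)];
	struct state *s = malloc(sizeof(*s));

	if (s == NULL)
		return (-1);
	s->key = key;
	pthread_mutex_lock(&bk->mtx);	/* only this bucket is serialized */
	s->next = bk->head;
	bk->head = s;
	pthread_mutex_unlock(&bk->mtx);
	/*
	 * Deliberately unlocked, mirroring the V_dyn_count idea: the
	 * value may drift under contention.  Portable multi-threaded
	 * userland code should use an atomic add here instead.
	 */
	t->count++;
	return (0);
}

/* Periodic pass: recount under each bucket lock and reset the counter. */
static uint32_t
table_tick(struct table *t)
{
	struct state *s;
	uint32_t i, n = 0;

	for (i = 0; i < t->nbuckets; i++) {
		pthread_mutex_lock(&t->b[i].mtx);
		for (s = t->b[i].head; s != NULL; s = s->next)
			n++;
		pthread_mutex_unlock(&t->b[i].mtx);
	}
	t->count = n;
	return (n);
}
```

Two lookups that hash to different buckets never contend on the same mutex, which is the point of replacing the single global lock; the tick is also the natural place to do expiration so that the forwarding path does not have to.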
--------------000406010204050104020709 Content-Type: text/plain; charset=UTF-8; name="ipfw_keepstate.diff" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="ipfw_keepstate.diff" Index: sys/netpfil/ipfw/ip_fw_sockopt.c =================================================================== --- sys/netpfil/ipfw/ip_fw_sockopt.c (revision 242524) +++ sys/netpfil/ipfw/ip_fw_sockopt.c (working copy) @@ -382,7 +382,7 @@ del_entry(struct ip_fw_chain *chain, uint32_t arg) continue; l = RULESIZE(rule); chain->static_len -= l; - ipfw_remove_dyn_children(rule); + ipfw_expire_dyn_rules(chain, rule, RESVD_SET); rule->x_next = chain->reap; chain->reap = rule; } @@ -925,7 +925,7 @@ ipfw_getrules(struct ip_fw_chain *chain, void *buf dst->timestamp += boot_seconds; bp += l; } - ipfw_get_dynamic(&bp, ep); /* protected by the dynamic lock */ + ipfw_get_dynamic(chain, &bp, ep); /* protected by the dynamic lock */ return (bp - (char *)buf); } Index: sys/netpfil/ipfw/ip_fw_private.h =================================================================== --- sys/netpfil/ipfw/ip_fw_private.h (revision 242632) +++ sys/netpfil/ipfw/ip_fw_private.h (working copy) @@ -175,7 +175,9 @@ enum { /* result for matching dynamic rules */ * and only to release the result of lookup_dyn_rule(). * Eventually we may implement it with a callback on the function. 
*/ -void ipfw_dyn_unlock(void); +struct ip_fw_chain; +void ipfw_expire_dyn_rules(struct ip_fw_chain *, struct ip_fw *, int); +void ipfw_dyn_unlock(ipfw_dyn_rule *q); struct tcphdr; struct mbuf *ipfw_send_pkt(struct mbuf *, struct ipfw_flow_id *, @@ -185,11 +187,11 @@ int ipfw_install_state(struct ip_fw *rule, ipfw_in ipfw_dyn_rule *ipfw_lookup_dyn_rule(struct ipfw_flow_id *pkt, int *match_direction, struct tcphdr *tcp); void ipfw_remove_dyn_children(struct ip_fw *rule); -void ipfw_get_dynamic(char **bp, const char *ep); +void ipfw_get_dynamic(struct ip_fw_chain *chain, char **bp, const char *ep); void ipfw_dyn_attach(void); /* uma_zcreate .... */ void ipfw_dyn_detach(void); /* uma_zdestroy ... */ -void ipfw_dyn_init(void); /* per-vnet initialization */ +void ipfw_dyn_init(struct ip_fw_chain *); /* per-vnet initialization */ void ipfw_dyn_uninit(int); /* per-vnet deinitialization */ int ipfw_dyn_len(void); @@ -259,6 +261,10 @@ struct sockopt; /* used by tcp_var.h */ #define IPFW_WLOCK(p) rw_wlock(&(p)->rwmtx) #define IPFW_WUNLOCK(p) rw_wunlock(&(p)->rwmtx) +#define IPFW_UH_LOCK_ASSERT(_chain) rw_assert(&(_chain)->uh_lock, RA_LOCKED) +#define IPFW_UH_RLOCK_ASSERT(_chain) rw_assert(&(_chain)->uh_lock, RA_RLOCKED) +#define IPFW_UH_WLOCK_ASSERT(_chain) rw_assert(&(_chain)->uh_lock, RA_WLOCKED) + #define IPFW_UH_RLOCK(p) rw_rlock(&(p)->uh_lock) #define IPFW_UH_RUNLOCK(p) rw_runlock(&(p)->uh_lock) #define IPFW_UH_WLOCK(p) rw_wlock(&(p)->uh_lock) Index: sys/netpfil/ipfw/ip_fw_dynamic.c =================================================================== --- sys/netpfil/ipfw/ip_fw_dynamic.c (revision 242834) +++ sys/netpfil/ipfw/ip_fw_dynamic.c (working copy) @@ -111,38 +111,33 @@ __FBSDID("$FreeBSD$"); * passes through the firewall. XXX check the latter!!! 
*/ +struct ipfw_dyn_bucket { + struct mtx mtx; /* Bucket protecting lock */ + ipfw_dyn_rule *head; /* Pointer to first rule */ +}; + /* * Static variables followed by global ones */ -static VNET_DEFINE(ipfw_dyn_rule **, ipfw_dyn_v); -static VNET_DEFINE(u_int32_t, dyn_buckets); +static VNET_DEFINE(struct ipfw_dyn_bucket *, ipfw_dyn_v); +static VNET_DEFINE(u_int32_t, dyn_buckets_max); static VNET_DEFINE(u_int32_t, curr_dyn_buckets); static VNET_DEFINE(struct callout, ipfw_timeout); #define V_ipfw_dyn_v VNET(ipfw_dyn_v) -#define V_dyn_buckets VNET(dyn_buckets) +#define V_dyn_buckets_max VNET(dyn_buckets_max) #define V_curr_dyn_buckets VNET(curr_dyn_buckets) #define V_ipfw_timeout VNET(ipfw_timeout) static uma_zone_t ipfw_dyn_rule_zone; -#ifndef __FreeBSD__ -DEFINE_SPINLOCK(ipfw_dyn_mtx); -#else -static struct mtx ipfw_dyn_mtx; /* mutex guarding dynamic rules */ -#endif -#define IPFW_DYN_LOCK_INIT() \ - mtx_init(&ipfw_dyn_mtx, "IPFW dynamic rules", NULL, MTX_DEF) -#define IPFW_DYN_LOCK_DESTROY() mtx_destroy(&ipfw_dyn_mtx) -#define IPFW_DYN_LOCK() mtx_lock(&ipfw_dyn_mtx) -#define IPFW_DYN_UNLOCK() mtx_unlock(&ipfw_dyn_mtx) -#define IPFW_DYN_LOCK_ASSERT() mtx_assert(&ipfw_dyn_mtx, MA_OWNED) +#define IPFW_BUCK_LOCK_INIT(b) \ + mtx_init(&(b)->mtx, "IPFW dynamic bucket", NULL, MTX_DEF) +#define IPFW_BUCK_LOCK_DESTROY(b) \ + mtx_destroy(&(b)->mtx) +#define IPFW_BUCK_LOCK(i) mtx_lock(&V_ipfw_dyn_v[(i)].mtx) +#define IPFW_BUCK_UNLOCK(i) mtx_unlock(&V_ipfw_dyn_v[(i)].mtx) +#define IPFW_BUCK_ASSERT(i) mtx_assert(&V_ipfw_dyn_v[(i)].mtx, MA_OWNED) -void -ipfw_dyn_unlock(void) -{ - IPFW_DYN_UNLOCK(); -} - /* * Timeouts for various events in handing dynamic rules. 
*/ @@ -171,10 +166,12 @@ static VNET_DEFINE(u_int32_t, dyn_short_lifetime); static VNET_DEFINE(u_int32_t, dyn_keepalive_interval); static VNET_DEFINE(u_int32_t, dyn_keepalive_period); static VNET_DEFINE(u_int32_t, dyn_keepalive); +static VNET_DEFINE(time_t, dyn_keepalive_last); #define V_dyn_keepalive_interval VNET(dyn_keepalive_interval) #define V_dyn_keepalive_period VNET(dyn_keepalive_period) #define V_dyn_keepalive VNET(dyn_keepalive) +#define V_dyn_keepalive_last VNET(dyn_keepalive_last) static VNET_DEFINE(u_int32_t, dyn_count); /* # of dynamic rules */ static VNET_DEFINE(u_int32_t, dyn_max); /* max # of dynamic rules */ @@ -182,14 +179,17 @@ static VNET_DEFINE(u_int32_t, dyn_max); /* max # #define V_dyn_count VNET(dyn_count) #define V_dyn_max VNET(dyn_max) +static void ipfw_dyn_tick(void *vnetx); +static void check_dyn_rules(struct ip_fw_chain *, struct ip_fw *, + int, int, int); #ifdef SYSCTL_NODE SYSBEGIN(f2) SYSCTL_DECL(_net_inet_ip_fw); SYSCTL_VNET_UINT(_net_inet_ip_fw, OID_AUTO, dyn_buckets, - CTLFLAG_RW, &VNET_NAME(dyn_buckets), 0, - "Number of dyn. buckets"); + CTLFLAG_RW, &VNET_NAME(dyn_buckets_max), 0, + "Max number of dyn. buckets"); SYSCTL_VNET_UINT(_net_inet_ip_fw, OID_AUTO, curr_dyn_buckets, CTLFLAG_RD, &VNET_NAME(curr_dyn_buckets), 0, "Current Number of dyn. buckets"); @@ -244,7 +244,7 @@ hash_packet6(struct ipfw_flow_id *id) * and we want to find both in the same bucket. */ static __inline int -hash_packet(struct ipfw_flow_id *id) +hash_packet(struct ipfw_flow_id *id, int buckets) { u_int32_t i; @@ -254,7 +254,7 @@ static __inline int else #endif /* INET6 */ i = (id->dst_ip) ^ (id->src_ip) ^ (id->dst_port) ^ (id->src_port); - i &= (V_curr_dyn_buckets - 1); + i &= (buckets - 1); return i; } @@ -292,118 +292,13 @@ print_dyn_rule_flags(struct ipfw_flow_id *id, int #define print_dyn_rule(id, dtype, prefix, postfix) \ print_dyn_rule_flags(id, dtype, LOG_DEBUG, prefix, postfix) -/** - * unlink a dynamic rule from a chain. 
prev is a pointer to - * the previous one, q is a pointer to the rule to delete, - * head is a pointer to the head of the queue. - * Modifies q and potentially also head. - */ -#define UNLINK_DYN_RULE(prev, head, q) { \ - ipfw_dyn_rule *old_q = q; \ - \ - /* remove a refcount to the parent */ \ - if (q->dyn_type == O_LIMIT) \ - q->parent->count--; \ - V_dyn_count--; \ - DEB(print_dyn_rule(&q->id, q->dyn_type, "unlink entry", "left");) \ - if (prev != NULL) \ - prev->next = q = q->next; \ - else \ - head = q = q->next; \ - uma_zfree(ipfw_dyn_rule_zone, old_q); } - #define TIME_LEQ(a,b) ((int)((a)-(b)) <= 0) -/** - * Remove dynamic rules pointing to "rule", or all of them if rule == NULL. - * - * If keep_me == NULL, rules are deleted even if not expired, - * otherwise only expired rules are removed. - * - * The value of the second parameter is also used to point to identify - * a rule we absolutely do not want to remove (e.g. because we are - * holding a reference to it -- this is the case with O_LIMIT_PARENT - * rules). The pointer is only used for comparison, so any non-null - * value will do. - */ -static void -remove_dyn_rule(struct ip_fw *rule, ipfw_dyn_rule *keep_me) -{ - static u_int32_t last_remove = 0; - -#define FORCE (keep_me == NULL) - - ipfw_dyn_rule *prev, *q; - int i, pass = 0, max_pass = 0; - - IPFW_DYN_LOCK_ASSERT(); - - if (V_ipfw_dyn_v == NULL || V_dyn_count == 0) - return; - /* do not expire more than once per second, it is useless */ - if (!FORCE && last_remove == time_uptime) - return; - last_remove = time_uptime; - - /* - * because O_LIMIT refer to parent rules, during the first pass only - * remove child and mark any pending LIMIT_PARENT, and remove - * them in a second pass. - */ -next_pass: - for (i = 0 ; i < V_curr_dyn_buckets ; i++) { - for (prev=NULL, q = V_ipfw_dyn_v[i] ; q ; ) { - /* - * Logic can become complex here, so we split tests. 
- */ - if (q == keep_me) - goto next; - if (rule != NULL && rule != q->rule) - goto next; /* not the one we are looking for */ - if (q->dyn_type == O_LIMIT_PARENT) { - /* - * handle parent in the second pass, - * record we need one. - */ - max_pass = 1; - if (pass == 0) - goto next; - if (FORCE && q->count != 0 ) { - /* XXX should not happen! */ - printf("ipfw: OUCH! cannot remove rule," - " count %d\n", q->count); - } - } else { - if (!FORCE && - !TIME_LEQ( q->expire, time_uptime )) - goto next; - } - if (q->dyn_type != O_LIMIT_PARENT || !q->count) { - UNLINK_DYN_RULE(prev, V_ipfw_dyn_v[i], q); - continue; - } -next: - prev=q; - q=q->next; - } - } - if (pass++ < max_pass) - goto next_pass; -} - -void -ipfw_remove_dyn_children(struct ip_fw *rule) -{ - IPFW_DYN_LOCK(); - remove_dyn_rule(rule, NULL /* force removal */); - IPFW_DYN_UNLOCK(); -} - /* - * Lookup a dynamic rule, locked version. + * Lookup a dynamic rule */ static ipfw_dyn_rule * -lookup_dyn_rule_locked(struct ipfw_flow_id *pkt, int *match_direction, +lookup_dyn_rule_locked(struct ipfw_flow_id *pkt, int i, int *match_direction, struct tcphdr *tcp) { /* @@ -414,23 +309,17 @@ static ipfw_dyn_rule * #define MATCH_FORWARD 1 #define MATCH_NONE 2 #define MATCH_UNKNOWN 3 - int i, dir = MATCH_NONE; + int dir = MATCH_NONE; ipfw_dyn_rule *prev, *q = NULL; - IPFW_DYN_LOCK_ASSERT(); + IPFW_BUCK_ASSERT(i); - if (V_ipfw_dyn_v == NULL) - goto done; /* not found */ - i = hash_packet(pkt); - for (prev = NULL, q = V_ipfw_dyn_v[i]; q != NULL;) { + for (prev = NULL, q = V_ipfw_dyn_v[i].head; q; prev = q, q = q->next) { if (q->dyn_type == O_LIMIT_PARENT && q->count) - goto next; - if (TIME_LEQ(q->expire, time_uptime)) { /* expire entry */ - UNLINK_DYN_RULE(prev, V_ipfw_dyn_v[i], q); continue; - } + if (pkt->proto != q->id.proto || q->dyn_type == O_LIMIT_PARENT) - goto next; + continue; if (IS_IP6_FLOW_ID(pkt)) { if (IN6_ARE_ADDR_EQUAL(&pkt->src_ip6, &q->id.src_ip6) && @@ -463,17 +352,14 @@ static ipfw_dyn_rule * break; } } 
-next: - prev = q; - q = q->next; } if (q == NULL) goto done; /* q = NULL, not found */ if (prev != NULL) { /* found and not in front */ prev->next = q->next; - q->next = V_ipfw_dyn_v[i]; - V_ipfw_dyn_v[i] = q; + q->next = V_ipfw_dyn_v[i].head; + V_ipfw_dyn_v[i].head = q; } if (pkt->proto == IPPROTO_TCP) { /* update state according to flags */ uint32_t ack; @@ -556,44 +442,123 @@ ipfw_lookup_dyn_rule(struct ipfw_flow_id *pkt, int struct tcphdr *tcp) { ipfw_dyn_rule *q; + int i; - IPFW_DYN_LOCK(); - q = lookup_dyn_rule_locked(pkt, match_direction, tcp); + i = hash_packet(pkt, V_curr_dyn_buckets); + + IPFW_BUCK_LOCK(i); + q = lookup_dyn_rule_locked(pkt, i, match_direction, tcp); if (q == NULL) - IPFW_DYN_UNLOCK(); + IPFW_BUCK_UNLOCK(i); /* NB: return table locked when q is not NULL */ return q; } -static void -realloc_dynamic_table(void) +/* + * Unlock bucket mtx + * @p - pointer to dynamic rule + */ +void +ipfw_dyn_unlock(ipfw_dyn_rule *q) { - IPFW_DYN_LOCK_ASSERT(); + IPFW_BUCK_UNLOCK(q->bucket); +} +static int +resize_dynamic_table(struct ip_fw_chain *chain, int nbuckets) +{ + int i, k, nbuckets_old; + ipfw_dyn_rule *q; + struct ipfw_dyn_bucket *dyn_v, *dyn_v_old; + + /* Check if given number is power of 2 and less than 64k */ + if (nbuckets > 65536) + return 1; + + if ((nbuckets & (nbuckets - 1)) != 0) + return -1; + + CTR3(KTR_NET, "%s: resize dynamic hash: %d -> %d", __func__, + V_curr_dyn_buckets, nbuckets); + + /* Allocate and initialize new hash */ + dyn_v = malloc(nbuckets * sizeof(ipfw_dyn_rule), M_IPFW, + M_WAITOK | M_ZERO); + + for (i = 0 ; i < nbuckets; i++) + IPFW_BUCK_LOCK_INIT(&dyn_v[i]); + /* - * Try reallocation, make sure we have a power of 2 and do - * not allow more than 64k entries. In case of overflow, - * default to 1024. 
+ * Call upper half lock, as get_map() do to ease + * read-only access to dynamic rules hash from sysctl */ + IPFW_UH_WLOCK(chain); - if (V_dyn_buckets > 65536) - V_dyn_buckets = 1024; - if ((V_dyn_buckets & (V_dyn_buckets-1)) != 0) { /* not a power of 2 */ - V_dyn_buckets = V_curr_dyn_buckets; /* reset */ - return; + /* Acquire chain write lock to permit hash access + * for main traffic path without additional locks + */ + IPFW_WLOCK(chain); + + /* Save old values */ + nbuckets_old = V_curr_dyn_buckets; + dyn_v_old = V_ipfw_dyn_v; + + /* Skip relinking if array is not set up */ + if (V_ipfw_dyn_v == NULL) + V_curr_dyn_buckets = 0; + + /* Re-link all dynamic states */ + for (i = 0 ; i < V_curr_dyn_buckets ; i++) { + while (V_ipfw_dyn_v[i].head != NULL) { + /* Remove from current chain */ + q = V_ipfw_dyn_v[i].head; + V_ipfw_dyn_v[i].head = q->next; + + /* Get new hash value */ + k = hash_packet(&q->id, nbuckets); + q->bucket = k; + /* Add to the new head */ + q->next = dyn_v[k].head; + dyn_v[k].head = q; + } } - V_curr_dyn_buckets = V_dyn_buckets; - if (V_ipfw_dyn_v != NULL) - free(V_ipfw_dyn_v, M_IPFW); - for (;;) { - V_ipfw_dyn_v = malloc(V_curr_dyn_buckets * sizeof(ipfw_dyn_rule *), - M_IPFW, M_NOWAIT | M_ZERO); - if (V_ipfw_dyn_v != NULL || V_curr_dyn_buckets <= 2) - break; - V_curr_dyn_buckets /= 2; + + /* Update current pointers/buckets values */ + V_curr_dyn_buckets = nbuckets; + V_ipfw_dyn_v = dyn_v; + + IPFW_WUNLOCK(chain); + + IPFW_UH_WUNLOCK(chain); + + /* Start periodic callout on initial creation */ + if (dyn_v_old == NULL) { + callout_reset_on(&V_ipfw_timeout, hz, ipfw_dyn_tick, curvnet, 0); + return (0); } + + /* Destroy all mutexes */ + for (i = 0 ; i < nbuckets_old ; i++) + IPFW_BUCK_LOCK_DESTROY(&dyn_v_old[i]); + + /* Free old hash */ + free(dyn_v_old, M_IPFW); + + return 0; } +#if 0 +void +ipfw_prepare_dynamic(struct ip_fw_chain *chain) +{ + + if (V_ipfw_dyn_v != NULL) + return; + + resize_dynamic_table(chain, V_curr_dyn_buckets); +} +#endif + 
/** * Install state of type 'type' for a dynamic session. * The hash table contains two type of rules: @@ -605,33 +570,26 @@ ipfw_lookup_dyn_rule(struct ipfw_flow_id *pkt, int * - "parent" rules for the above (O_LIMIT_PARENT). */ static ipfw_dyn_rule * -add_dyn_rule(struct ipfw_flow_id *id, u_int8_t dyn_type, struct ip_fw *rule) +add_dyn_rule(struct ipfw_flow_id *id, int i, u_int8_t dyn_type, struct ip_fw *rule) { ipfw_dyn_rule *r; - int i; - IPFW_DYN_LOCK_ASSERT(); + IPFW_BUCK_ASSERT(i); - if (V_ipfw_dyn_v == NULL || - (V_dyn_count == 0 && V_dyn_buckets != V_curr_dyn_buckets)) { - realloc_dynamic_table(); - if (V_ipfw_dyn_v == NULL) - return NULL; /* failed ! */ - } - i = hash_packet(id); - r = uma_zalloc(ipfw_dyn_rule_zone, M_NOWAIT | M_ZERO); if (r == NULL) { printf ("ipfw: sorry cannot allocate state\n"); return NULL; } - /* increase refcount on parent, and set pointer */ + /* + * refcount on parent is already incremented, so + * it is safe to use parent unlocked. + */ if (dyn_type == O_LIMIT) { ipfw_dyn_rule *parent = (ipfw_dyn_rule *)rule; if ( parent->dyn_type != O_LIMIT_PARENT) panic("invalid parent"); - parent->count++; r->parent = parent; rule = parent->rule; } @@ -644,8 +602,8 @@ static ipfw_dyn_rule * r->count = 0; r->bucket = i; - r->next = V_ipfw_dyn_v[i]; - V_ipfw_dyn_v[i] = r; + r->next = V_ipfw_dyn_v[i].head; + V_ipfw_dyn_v[i].head = r; V_dyn_count++; DEB(print_dyn_rule(id, dyn_type, "add dyn entry", "total");) return r; @@ -656,40 +614,40 @@ static ipfw_dyn_rule * * If the lookup fails, then install one. 
*/ static ipfw_dyn_rule * -lookup_dyn_parent(struct ipfw_flow_id *pkt, struct ip_fw *rule) +lookup_dyn_parent(struct ipfw_flow_id *pkt, int *pindex, struct ip_fw *rule) { ipfw_dyn_rule *q; - int i; + int i, is_v6; - IPFW_DYN_LOCK_ASSERT(); + is_v6 = IS_IP6_FLOW_ID(pkt); + i = hash_packet( pkt, V_curr_dyn_buckets ); + *pindex = i; + IPFW_BUCK_LOCK(i); + for (q = V_ipfw_dyn_v[i].head ; q != NULL ; q=q->next) + if (q->dyn_type == O_LIMIT_PARENT && + rule== q->rule && + pkt->proto == q->id.proto && + pkt->src_port == q->id.src_port && + pkt->dst_port == q->id.dst_port && + ( + (is_v6 && + IN6_ARE_ADDR_EQUAL(&(pkt->src_ip6), + &(q->id.src_ip6)) && + IN6_ARE_ADDR_EQUAL(&(pkt->dst_ip6), + &(q->id.dst_ip6))) || + (!is_v6 && + pkt->src_ip == q->id.src_ip && + pkt->dst_ip == q->id.dst_ip) + ) + ) { + q->expire = time_uptime + V_dyn_short_lifetime; + DEB(print_dyn_rule(pkt, q->dyn_type, + "lookup_dyn_parent found", "");) + return q; + } - if (V_ipfw_dyn_v) { - int is_v6 = IS_IP6_FLOW_ID(pkt); - i = hash_packet( pkt ); - for (q = V_ipfw_dyn_v[i] ; q != NULL ; q=q->next) - if (q->dyn_type == O_LIMIT_PARENT && - rule== q->rule && - pkt->proto == q->id.proto && - pkt->src_port == q->id.src_port && - pkt->dst_port == q->id.dst_port && - ( - (is_v6 && - IN6_ARE_ADDR_EQUAL(&(pkt->src_ip6), - &(q->id.src_ip6)) && - IN6_ARE_ADDR_EQUAL(&(pkt->dst_ip6), - &(q->id.dst_ip6))) || - (!is_v6 && - pkt->src_ip == q->id.src_ip && - pkt->dst_ip == q->id.dst_ip) - ) - ) { - q->expire = time_uptime + V_dyn_short_lifetime; - DEB(print_dyn_rule(pkt, q->dyn_type, - "lookup_dyn_parent found", "");) - return q; - } - } - return add_dyn_rule(pkt, O_LIMIT_PARENT, rule); + /* Add virtual limiting rule */ + return add_dyn_rule(pkt, i, O_LIMIT_PARENT, rule); } /** @@ -704,12 +662,15 @@ ipfw_install_state(struct ip_fw *rule, ipfw_insn_l { static int last_log; ipfw_dyn_rule *q; + int i; DEB(print_dyn_rule(&args->f_id, cmd->o.opcode, "install_state", "");) + + i = hash_packet(&args->f_id, V_curr_dyn_buckets); 
- IPFW_DYN_LOCK(); + IPFW_BUCK_LOCK(i); - q = lookup_dyn_rule_locked(&args->f_id, NULL, NULL); + q = lookup_dyn_rule_locked(&args->f_id, i, NULL, NULL); if (q != NULL) { /* should never occur */ DEB( @@ -718,26 +679,22 @@ ipfw_install_state(struct ip_fw *rule, ipfw_insn_l printf("ipfw: %s: entry already present, done\n", __func__); }) - IPFW_DYN_UNLOCK(); + IPFW_BUCK_UNLOCK(i); return (0); } - if (V_dyn_count >= V_dyn_max) - /* Run out of slots, try to remove any expired rule. */ - remove_dyn_rule(NULL, (ipfw_dyn_rule *)1); - if (V_dyn_count >= V_dyn_max) { if (last_log != time_uptime) { last_log = time_uptime; printf("ipfw: %s: Too many dynamic rules\n", __func__); } - IPFW_DYN_UNLOCK(); + IPFW_BUCK_UNLOCK(i); return (1); /* cannot install, notify caller */ } switch (cmd->o.opcode) { case O_KEEP_STATE: /* bidir rule */ - add_dyn_rule(&args->f_id, O_KEEP_STATE, rule); + add_dyn_rule(&args->f_id, i, O_KEEP_STATE, rule); break; case O_LIMIT: { /* limit number of sessions */ @@ -745,6 +702,7 @@ ipfw_install_state(struct ip_fw *rule, ipfw_insn_l ipfw_dyn_rule *parent; uint32_t conn_limit; uint16_t limit_mask = cmd->limit_mask; + int pindex; conn_limit = (cmd->conn_limit == IP_FW_TABLEARG) ? tablearg : cmd->conn_limit; @@ -778,46 +736,54 @@ ipfw_install_state(struct ip_fw *rule, ipfw_insn_l id.src_port = args->f_id.src_port; if (limit_mask & DYN_DST_PORT) id.dst_port = args->f_id.dst_port; - if ((parent = lookup_dyn_parent(&id, rule)) == NULL) { + + /* + * We have to release lock for previous bucket to + * avoid possible deadlock + */ + IPFW_BUCK_UNLOCK(i); + + if ((parent = lookup_dyn_parent(&id, &pindex, rule)) == NULL) { printf("ipfw: %s: add parent failed\n", __func__); - IPFW_DYN_UNLOCK(); + IPFW_BUCK_UNLOCK(pindex); return (1); } if (parent->count >= conn_limit) { - /* See if we can remove some expired rule. 
*/ - remove_dyn_rule(rule, parent); - if (parent->count >= conn_limit) { - if (V_fw_verbose && last_log != time_uptime) { - last_log = time_uptime; - char sbuf[24]; - last_log = time_uptime; - snprintf(sbuf, sizeof(sbuf), - "%d drop session", - parent->rule->rulenum); - print_dyn_rule_flags(&args->f_id, - cmd->o.opcode, - LOG_SECURITY | LOG_DEBUG, - sbuf, "too many entries"); - } - IPFW_DYN_UNLOCK(); - return (1); + if (V_fw_verbose && last_log != time_uptime) { + last_log = time_uptime; + char sbuf[24]; + last_log = time_uptime; + snprintf(sbuf, sizeof(sbuf), + "%d drop session", + parent->rule->rulenum); + print_dyn_rule_flags(&args->f_id, + cmd->o.opcode, + LOG_SECURITY | LOG_DEBUG, + sbuf, "too many entries"); } + IPFW_BUCK_UNLOCK(pindex); + return (1); } - add_dyn_rule(&args->f_id, O_LIMIT, (struct ip_fw *)parent); + /* Increment counter on parent */ + parent->count++; + IPFW_BUCK_UNLOCK(pindex); + + IPFW_BUCK_LOCK(i); + add_dyn_rule(&args->f_id, i, O_LIMIT, (struct ip_fw *)parent); break; } default: printf("ipfw: %s: unknown dynamic rule type %u\n", __func__, cmd->o.opcode); - IPFW_DYN_UNLOCK(); + IPFW_BUCK_UNLOCK(i); return (1); } /* XXX just set lifetime */ - lookup_dyn_rule_locked(&args->f_id, NULL, NULL); + lookup_dyn_rule_locked(&args->f_id, i, NULL, NULL); - IPFW_DYN_UNLOCK(); + IPFW_BUCK_UNLOCK(i); return (0); } @@ -996,24 +962,87 @@ ipfw_dyn_send_ka(struct mbuf **mtailp, ipfw_dyn_ru } /* - * This procedure is only used to handle keepalives. It is invoked - * every dyn_keepalive_period + * This procedure is used to perform various maintenance + * on dynamic hash list. Currently it is called every second.
*/ static void -ipfw_tick(void * vnetx) +ipfw_dyn_tick(void * vnetx) { - struct mbuf *m0, *m, *mnext, **mtailp; - struct ip *h; - int i; - ipfw_dyn_rule *q; + struct ip_fw_chain *chain; + int check_ka = 0; #ifdef VIMAGE struct vnet *vp = vnetx; #endif CURVNET_SET(vp); - if (V_dyn_keepalive == 0 || V_ipfw_dyn_v == NULL || V_dyn_count == 0) - goto done; + chain = &V_layer3_chain; + + /* Run keepalive checks every keepalive_interval iff ka is enabled */ + if ((V_dyn_keepalive_last + V_dyn_keepalive_interval <= time_uptime) && + (V_dyn_keepalive != 0)) { + V_dyn_keepalive_last = time_uptime; + check_ka = 1; + } + + check_dyn_rules(chain, NULL, RESVD_SET, check_ka, 1); + + callout_reset_on(&V_ipfw_timeout, hz, ipfw_dyn_tick, vnetx, 0); + + CURVNET_RESTORE(); +} + + +/* + * Walk thru all dynamic states doing generic maintenance: + * 1) free expired states + * 2) free all states based on deleted rule / set + * 3) send keepalives for states if needed + * + * @chain - pointer to current ipfw rules chain + * @rule - delete all states originated by given rule if != NULL + * @set - delete all states originated by any rule in set @set if != RESVD_SET + * @check_ka - perform checking/sending keepalives + * @timer - indicate call from timer routine. + * + * Timer routine must call this function unlocked to permit + * sending keepalives/resizing table. + * + * Others have to call the function with IPFW_UH_WLOCK held.
+ * + * Write lock is needed to ensure that unused parent rules + * are not freed by other instance (see stage 2, 3) + */ +static void +check_dyn_rules(struct ip_fw_chain *chain, struct ip_fw *rule, + int set, int check_ka, int timer) +{ + struct mbuf *m0, *m, *mnext, **mtailp; + struct ip *h; + int i, new_buckets = 0, max_buckets; + int expired = 0, expired_limits = 0, parents = 0, total = 0; + ipfw_dyn_rule *q, *q_prev, *q_next; + ipfw_dyn_rule *exp_head, **exptailp; + ipfw_dyn_rule *exp_lhead, **expltailp; + + KASSERT(V_ipfw_dyn_v != NULL, ("%s: dynamic table not allocated", + __func__)); + + /* Avoid possible LOR */ + KASSERT(!check_ka || timer, ("%s: keepalive check with lock held", + __func__)); + + if (V_dyn_count == 0) + return; + + /* Expired states */ + exp_head = NULL; + exptailp = &exp_head; + + /* Expired limit states */ + exp_lhead = NULL; + expltailp = &exp_lhead; + /* * We make a chain of packets to go out here -- not deferring * until after we drop the IPFW dynamic rule lock would result @@ -1022,27 +1051,202 @@ static void */ m0 = NULL; mtailp = &m0; - IPFW_DYN_LOCK(); + + /* Protect from hash resizing */ + if (timer != 0) + IPFW_UH_WLOCK(chain); + else { + IPFW_UH_WLOCK_ASSERT(chain); + } + +#define NEXT_RULE() { q_prev = q; q = q->next ; continue; } + + /* Stage 1: perform requested deletion */ for (i = 0 ; i < V_curr_dyn_buckets ; i++) { - for (q = V_ipfw_dyn_v[i] ; q ; q = q->next ) { - if (q->dyn_type == O_LIMIT_PARENT) - continue; - if (TIME_LEQ(q->expire, time_uptime)) - continue; /* too late, rule expired */ + IPFW_BUCK_LOCK(i); + for (q = V_ipfw_dyn_v[i].head, q_prev = q ; q ; ) { + /* account every rule */ + total++; - if (q->id.proto != IPPROTO_TCP) + /* Skip parent rules at all */ + if (q->dyn_type == O_LIMIT_PARENT) { + parents++; + NEXT_RULE(); + } + + /* + * Remove rules which are: + * 1) expired + * 2) created by given rule + * 3) created by any rule in given set + */ + if ((TIME_LEQ(q->expire, time_uptime)) || + ((rule != NULL) && 
(q->rule == rule)) || + ((set != RESVD_SET) && (q->rule->set == set))) { + /* Unlink q from current list */ + if (q == V_ipfw_dyn_v[i].head) + V_ipfw_dyn_v[i].head = q->next; + else + q_prev->next = q->next; + q->next = NULL; + + /* queue q to expire list */ + if (q->dyn_type != O_LIMIT) { + *exptailp = q; + exptailp = &(*exptailp)->next; + DEB(print_dyn_rule(&q->id, q->dyn_type, + "unlink entry", "left"); + ) + } else { + /* Separate list for limit rules */ + *expltailp = q; + expltailp = &(*expltailp)->next; + expired_limits++; + DEB(print_dyn_rule(&q->id, q->dyn_type, + "unlink limit entry", "left"); + ) + } + + q = q_prev->next; + expired++; continue; - if ( (q->state & BOTH_SYN) != BOTH_SYN) - continue; - if (TIME_LEQ(time_uptime + V_dyn_keepalive_interval, - q->expire)) - continue; /* too early */ + } - mtailp = ipfw_dyn_send_ka(mtailp, q); + /* + * Check if we need to send keepalive: + * we need to ensure if is time to do KA, + * this is established TCP session, and + * expire time is within keepalive interval + */ + if ((check_ka != 0) && (q->id.proto == IPPROTO_TCP) && + ((q->state & BOTH_SYN) == BOTH_SYN) && + (TIME_LEQ(q->expire, time_uptime + + V_dyn_keepalive_interval))) + mtailp = ipfw_dyn_send_ka(mtailp, q); + + NEXT_RULE(); } + IPFW_BUCK_UNLOCK(i); } - IPFW_DYN_UNLOCK(); + /* Stage 2: decrement counters from O_LIMIT parents */ + if (expired_limits != 0) { + /* + * XXX: Note that deleting set with more than one + * heavily-used LIMIT rules can result in overwhelming + * locking due to lack of per-hash value sorting + * + * We should probably think about: + * 1) pre-allocating hash of size, say, + * MAX(16, V_curr_dyn_buckets / 1024) + * 2) checking if expired_limits is large enough + * 3) If yes, init hash (or its part), re-link + * current list and start decrementing procedure in + * each bucket separately + */ + + /* + * Small optimization: do not unlock bucket until + * we see the next item resides in different bucket + */ + if (exp_lhead != NULL) 
{ + i = exp_lhead->parent->bucket; + IPFW_BUCK_LOCK(i); + } + for (q = exp_lhead; q != NULL; q = q->next) { + if (i != q->parent->bucket) { + IPFW_BUCK_UNLOCK(i); + i = q->parent->bucket; + IPFW_BUCK_LOCK(i); + } + + /* Decrease parent refcount */ + q->parent->count--; + } + if (exp_lhead != NULL) + IPFW_BUCK_UNLOCK(i); + } + + /* + * We protect ourselves from unused parent deletion by + * holding UH write lock. + */ + + /* Stage 3: remove unused parent rules */ + if ((parents != 0) && (expired != 0)) { + for (i = 0 ; i < V_curr_dyn_buckets ; i++) { + IPFW_BUCK_LOCK(i); + for (q = V_ipfw_dyn_v[i].head, q_prev = q ; q ; ) { + if (q->dyn_type != O_LIMIT_PARENT) + NEXT_RULE(); + + if (q->count != 0) + NEXT_RULE(); + + /* Parent rule without consumers */ + *exptailp = q; + exptailp = &(*exptailp)->next; + + DEB(print_dyn_rule(&q->id, q->dyn_type, + "unlink parent entry", "left"); + ) + + expired++; + + q = q->next; + } + IPFW_BUCK_UNLOCK(i); + } + } + +#undef NEXT_RULE + + /* + * Update total rules count. + * This can be slightly incorrect since we lock/unlock + * every bucket lock sequentially. + * + * However, this is a good and regularly updated estimate + * for the total rules count. + */ + V_dyn_count = total - expired; + + /* + * Check if we need to resize hash: + * if current number of states exceeds number of buckets in hash, + * grow hash size to the minimum power of 2 which is bigger than + * current states count. Limit hash size by 64k. + */ + max_buckets = (V_dyn_buckets_max > 65536) ?
65536 : V_dyn_buckets_max; + + if (V_dyn_count > V_curr_dyn_buckets * 2) { + new_buckets = V_curr_dyn_buckets; + while (new_buckets < V_dyn_count) { + new_buckets *= 2; + + if (new_buckets >= max_buckets) + break; + } + } + + if (timer != 0) + IPFW_UH_WUNLOCK(chain); + + /* Finally delete old states and limits if any */ + for (q = exp_head; q != NULL; q = q_next) { + q_next = q->next; + uma_zfree(ipfw_dyn_rule_zone, q); + } + + for (q = exp_lhead; q != NULL; q = q_next) { + q_next = q->next; + uma_zfree(ipfw_dyn_rule_zone, q); + } + + /* The rest of the code should be called from timer routine only */ + if (timer == 0) + return; + /* Send keepalive packets if any */ for (m = m0; m != NULL; m = mnext) { mnext = m->m_nextpkt; @@ -1055,34 +1259,48 @@ static void ip6_output(m, NULL, NULL, 0, NULL, NULL, NULL); #endif } -done: - callout_reset_on(&V_ipfw_timeout, V_dyn_keepalive_period * hz, - ipfw_tick, vnetx, 0); - CURVNET_RESTORE(); + + /* Run table resize without holding any locks */ + if (new_buckets != 0) + resize_dynamic_table(chain, new_buckets); } +/* + * Deletes all dynamic rules originated by given rule or all rules in + * given set. Specify RESVD_SET to indicate set should not be used. + * @chain - pointer to current ipfw rules chain + * @rule - delete all states originated by given rule if != NULL + * @set - delete all states originated by any rule in set @set if != RESVD_SET + * + * Function has to be called with IPFW_UH_WLOCK held.
+ */ void +ipfw_expire_dyn_rules(struct ip_fw_chain *chain, struct ip_fw *rule, int set) +{ + + check_dyn_rules(chain, rule, set, 0, 0); +} + +void ipfw_dyn_attach(void) { ipfw_dyn_rule_zone = uma_zcreate("IPFW dynamic rule", sizeof(ipfw_dyn_rule), NULL, NULL, NULL, NULL, UMA_ALIGN_PTR, 0); - - IPFW_DYN_LOCK_INIT(); } void ipfw_dyn_detach(void) { + uma_zdestroy(ipfw_dyn_rule_zone); - IPFW_DYN_LOCK_DESTROY(); } void -ipfw_dyn_init(void) +ipfw_dyn_init(struct ip_fw_chain *chain) { V_ipfw_dyn_v = NULL; - V_dyn_buckets = 256; /* must be power of 2 */ + V_dyn_buckets_max = 256; /* must be power of 2 */ V_curr_dyn_buckets = 256; /* must be power of 2 */ V_dyn_ack_lifetime = 300; @@ -1095,32 +1313,55 @@ void V_dyn_keepalive_interval = 20; V_dyn_keepalive_period = 5; V_dyn_keepalive = 1; /* do send keepalives */ + V_dyn_keepalive_last = time_uptime; V_dyn_max = 4096; /* max # of dynamic rules */ callout_init(&V_ipfw_timeout, CALLOUT_MPSAFE); - callout_reset_on(&V_ipfw_timeout, hz, ipfw_tick, curvnet, 0); + + resize_dynamic_table(chain, V_curr_dyn_buckets); } void ipfw_dyn_uninit(int pass) { - if (pass == 0) + int i; + + if (pass == 0) { callout_drain(&V_ipfw_timeout); - else { - if (V_ipfw_dyn_v != NULL) - free(V_ipfw_dyn_v, M_IPFW); + return; } + + if (V_ipfw_dyn_v != NULL) { + /* + * Skip deleting all dynamic states - + * uma_zdestroy() does this more efficiently; + */ + + /* Destroy all mutexes */ + for (i = 0 ; i < V_curr_dyn_buckets ; i++) + IPFW_BUCK_LOCK_DESTROY(&V_ipfw_dyn_v[i]); + free(V_ipfw_dyn_v, M_IPFW); + V_ipfw_dyn_v = NULL; + } } +/* + * Returns number of dynamic rules. + */ int ipfw_dyn_len(void) { + return (V_ipfw_dyn_v == NULL) ? 0 : (V_dyn_count * sizeof(ipfw_dyn_rule)); } +/* + * Fill given buffer with dynamic states. + * IPFW_UH_RLOCK has to be held while calling.
+ */ void -ipfw_get_dynamic(char **pbp, const char *ep) +ipfw_get_dynamic(struct ip_fw_chain *chain, char **pbp, const char *ep) { ipfw_dyn_rule *p, *last = NULL; char *bp; @@ -1130,9 +1371,11 @@ void return; bp = *pbp; - IPFW_DYN_LOCK(); - for (i = 0 ; i < V_curr_dyn_buckets; i++) - for (p = V_ipfw_dyn_v[i] ; p != NULL; p = p->next) { + IPFW_UH_RLOCK_ASSERT(chain); + + for (i = 0 ; i < V_curr_dyn_buckets; i++) { + IPFW_BUCK_LOCK(i); + for (p = V_ipfw_dyn_v[i].head ; p != NULL; p = p->next) { if (bp + sizeof *p <= ep) { ipfw_dyn_rule *dst = (ipfw_dyn_rule *)bp; @@ -1161,7 +1404,9 @@ void bp += sizeof(ipfw_dyn_rule); } } - IPFW_DYN_UNLOCK(); + IPFW_BUCK_UNLOCK(i); + } + if (last != NULL) /* mark last dynamic rule */ bzero(&last->next, sizeof(last)); *pbp = bp; Index: sys/netpfil/ipfw/ip_fw2.c =================================================================== --- sys/netpfil/ipfw/ip_fw2.c (revision 242524) +++ sys/netpfil/ipfw/ip_fw2.c (working copy) @@ -2046,7 +2046,7 @@ do { \ f->rulenum, f->id); cmd = ACTION_PTR(f); l = f->cmd_len - f->act_ofs; - ipfw_dyn_unlock(); + ipfw_dyn_unlock(q); cmdlen = 0; match = 1; break; @@ -2637,7 +2637,7 @@ vnet_ipfw_init(const void *unused) chain->id = rule->id = 1; IPFW_LOCK_INIT(chain); - ipfw_dyn_init(); + ipfw_dyn_init(chain); /* First set up some values that are compile time options */ V_ipfw_vnet_ready = 1; /* Open for business */ --------------000406010204050104020709-- From owner-freebsd-net@FreeBSD.ORG Tue Nov 13 20:15:43 2012 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 7817DF56; Tue, 13 Nov 2012 20:15:43 +0000 (UTC) (envelope-from bright@mu.org) Received: from elvis.mu.org (elvis.mu.org [192.203.228.196]) by mx1.freebsd.org (Postfix) with ESMTP id 571AB8FC0C; Tue, 13 Nov 2012 20:15:43 +0000 (UTC) Received: from kruse-124.4.ixsystems.com (drawbridge.ixsystems.com [206.40.55.65]) by elvis.mu.org (Postfix) with 
ESMTPSA id 8056C1A3C1A; Tue, 13 Nov 2012 12:15:37 -0800 (PST) Message-ID: <50A2AA89.9060309@mu.org> Date: Tue, 13 Nov 2012 12:16:09 -0800 From: Alfred Perlstein User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7; rv:16.0) Gecko/20121026 Thunderbird/16.0.2 MIME-Version: 1.0 To: "Alexander V. Chernikov" Subject: Re: [CFT] ipfw SMP-ready dynamic states References: <50A29F57.6090701@yandex-team.ru> In-Reply-To: <50A29F57.6090701@yandex-team.ru> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Content-Filtered-By: Mailman/MimeDel 2.1.14 Cc: freebsd-ipfw@freebsd.org, Luigi Rizzo , "freebsd-net@freebsd.org" X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 13 Nov 2012 20:15:43 -0000 Alexander, this is awesome. On 11/13/12 11:28 AM, Alexander V. Chernikov wrote: > Hello list! > > Currently most ipfw operations with dynamic states (keep-state, > check-state, limit) are serialized via IPFW_DYN_LOCK() which is > per-vnet mutex lock. > > As a result, performance is limited to the same ~650kpps as in routing > (in several cases). > > Patch changes the following: > * global lock is changed to per-bucket mutex > * state expiration is done in ipfw_tick every 1s. No expiration is > done on forwarding path > * hash table resize is done automatically and does not cause all > states to be lost > > The only (architectural) problem I see is unlocked V_dyn_count > increments. > So, we can do the following: > 1) lock increments/decrements via some separate mutex > 2) do nothing > 3) take some combined approach: > > Generally, we don't need value to be _exact_. > As a result, we count total number of states in every ipfw_tick run > and set V_dyn_count to new value. New states still increment > V_dyn_count unlocked. 
> What about using per-cpu PCPU counters, and then collecting them for display/reporting? -Alfred > > Performance: > > Synthetic traffic, ipfw with single allow ip from any to any rule: 2.4M. > single keep-state ip from any to any: 2.2M. > > Some more tests should be taken (with large number of states, > different types of traffic, etc), maybe I can do some next week. > > > You need to run recent -current or merge r242631 and r242834 before > applying this patch. > > > _______________________________________________ > freebsd-net@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-net > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" From owner-freebsd-net@FreeBSD.ORG Tue Nov 13 20:33:36 2012 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 1002A48B; Tue, 13 Nov 2012 20:33:36 +0000 (UTC) (envelope-from melifaro@FreeBSD.org) Received: from mail.ipfw.ru (unknown [IPv6:2a01:4f8:120:6141::2]) by mx1.freebsd.org (Postfix) with ESMTP id 96D368FC13; Tue, 13 Nov 2012 20:33:35 +0000 (UTC) Received: from v6.mpls.in ([2a02:978:2::5] helo=ws.su29.net) by mail.ipfw.ru with esmtpsa (TLSv1:CAMELLIA256-SHA:256) (Exim 4.76 (FreeBSD)) (envelope-from ) id 1TYNEB-0005aJ-5t; Wed, 14 Nov 2012 00:36:59 +0400 Message-ID: <50A2AE84.5040304@FreeBSD.org> Date: Wed, 14 Nov 2012 00:33:08 +0400 From: "Alexander V. Chernikov" User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:9.0) Gecko/20120121 Thunderbird/9.0 MIME-Version: 1.0 To: Alfred Perlstein Subject: Re: [CFT] ipfw SMP-ready dynamic states References: <50A29F57.6090701@yandex-team.ru> <50A2AA89.9060309@mu.org> In-Reply-To: <50A2AA89.9060309@mu.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: "Alexander V. 
Chernikov" , freebsd-ipfw@freebsd.org, Luigi Rizzo , "freebsd-net@freebsd.org" X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 13 Nov 2012 20:33:36 -0000 On 14.11.2012 00:16, Alfred Perlstein wrote: > Alexander, this is awesome. > > On 11/13/12 11:28 AM, Alexander V. Chernikov wrote: >> Hello list! >> >> Currently most ipfw operations with dynamic states (keep-state, >> check-state, limit) are serialized via IPFW_DYN_LOCK() which is >> per-vnet mutex lock. >> >> As a result, performance is limited to the same ~650kpps as in routing >> (in several cases). >> >> Patch changes the following: >> * global lock is changed to per-bucket mutex >> * state expiration is done in ipfw_tick every 1s. No expiration is >> done on forwarding path >> * hash table resize is done automatically and does not cause all >> states to be lost >> >> The only (architectural) problem I see is unlocked V_dyn_count >> increments. >> So, we can do the following: >> 1) lock increments/decrements via some separate mutex >> 2) do nothing >> 3) take some combined approach: >> >> Generally, we don't need value to be _exact_. >> As a result, we count total number of states in every ipfw_tick run >> and set V_dyn_count to new value. New states still increment >> V_dyn_count unlocked. >> > What about using per-cpu PCPU counters, and then collecting them for > display/reporting? We currently don't have working dynamic PCPU counters in our base system. However, there is a patch implementing such counters based on UMA. (And we're testing it on ipfw :) ). I hope it will be announced till the end of this month. > > -Alfred > > >> >> Performance: >> >> Synthetic traffic, ipfw with single allow ip from any to any rule: 2.4M. >> single keep-state ip from any to any: 2.2M. 
>> >> Some more tests should be taken (with large number of states, >> different types of traffic, etc), maybe I can do some next week. >> >> >> You need to run recent -current or merge r242631 and r242834 before >> applying this patch. >> >> >> _______________________________________________ >> freebsd-net@freebsd.org mailing list >> http://lists.freebsd.org/mailman/listinfo/freebsd-net >> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" > > _______________________________________________ > freebsd-ipfw@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-ipfw > To unsubscribe, send any mail to "freebsd-ipfw-unsubscribe@freebsd.org" > From owner-freebsd-net@FreeBSD.ORG Tue Nov 13 21:41:07 2012 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 2E9A43E3 for ; Tue, 13 Nov 2012 21:41:07 +0000 (UTC) (envelope-from guy.helmer@gmail.com) Received: from mail-ye0-f182.google.com (mail-ye0-f182.google.com [209.85.213.182]) by mx1.freebsd.org (Postfix) with ESMTP id D8E468FC12 for ; Tue, 13 Nov 2012 21:41:06 +0000 (UTC) Received: by mail-ye0-f182.google.com with SMTP id q9so292434yen.13 for ; Tue, 13 Nov 2012 13:41:00 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=from:content-type:content-transfer-encoding:subject:message-id:date :to:mime-version:x-mailer; bh=XQuJdMukwlqNaS+LgrxwRdbsDAM3TUPoccMxxxND89o=; b=PkzcEBja5NGB/K7Kk0svMtRulyV9+dIBQPoKXUhQ7ZMR1remcYvoonnPbUlEj2x59/ PyGAzYfCpbNA/n8/tMq91JiEVdFetVN9/FRYgGjPu10Lz2RDWiNWNc5Exc3mxtMyavNe Wct6Iqh+lfyJGPpMWcfzPSr4t51RW2kQSl8Kbi0hnDvDYLsAI0TXs3LLt74oopGESXVV tXE3+bCbcle3FDNcShNsNxy/v6EkntV3DkZVADKml/JOqRYXJYZlZdk030Ths3n4zZZy T6wU+Wxgc7SIE2H83cshw5G3Z7LIToAo8fGi/Iuveje+GU+MEoDIwyxyAxpTBAynoebQ yVNA== Received: by 10.100.250.2 with SMTP id x2mr6729629anh.14.1352842860747; Tue, 13 Nov 2012 13:41:00 -0800 (PST) Received: from 
guysmbp.dyn.palisadesys.com ([216.81.189.10]) by mx.google.com with ESMTPS id u22sm10999516yhl.2.2012.11.13.13.40.59 (version=TLSv1/SSLv3 cipher=OTHER); Tue, 13 Nov 2012 13:41:00 -0800 (PST) From: Guy Helmer Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: quoted-printable Subject: bpf hold buffer in-use flag Message-Id: <9C928117-2230-4F01-9B95-B6D945AF4416@gmail.com> Date: Tue, 13 Nov 2012 15:40:57 -0600 To: "freebsd-net@freebsd.org" Mime-Version: 1.0 (Mac OS X Mail 6.2 \(1499\)) X-Mailer: Apple Mail (2.1499) X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 13 Nov 2012 21:41:07 -0000 To try to completely resolve the race in bpfread(), I have put together these changes to add a flag to indicate when the hold buffer cannot be modified because it is in use. Since it's my first time using mtx_sleep() and wakeup(), I wanted to run these past the list to see if I can get any feedback on the approach. Index: bpf.c =================================================================== --- bpf.c (revision 242997) +++ bpf.c (working copy) @@ -819,6 +819,7 @@ bpfopen(struct cdev *dev, int flags, int fmt, stru * particular buffer method.
*/ bpf_buffer_init(d); + d->bd_hbuf_in_use = 0; d->bd_bufmode = BPF_BUFMODE_BUFFER; d->bd_sig = SIGIO; d->bd_direction = BPF_D_INOUT; @@ -872,6 +873,9 @@ bpfread(struct cdev *dev, struct uio *uio, int iof callout_stop(&d->bd_callout); timed_out = (d->bd_state == BPF_TIMED_OUT); d->bd_state = BPF_IDLE; + while (d->bd_hbuf_in_use) + mtx_sleep(&d->bd_hbuf_in_use, &d->bd_lock, + PRINET|PCATCH, "bd_hbuf", 0); /* * If the hold buffer is empty, then do a timed sleep, which * ends when the timeout expires or when enough packets @@ -940,27 +944,27 @@ bpfread(struct cdev *dev, struct uio *uio, int iof /* * At this point, we know we have something in the hold slot. */ + d->bd_hbuf_in_use = 1; BPFD_UNLOCK(d); /* * Move data from hold buffer into user space. * We know the entire buffer is transferred since * we checked above that the read buffer is bpf_bufsize bytes. - * - * XXXRW: More synchronization needed here: what if a second thread - * issues a read on the same fd at the same time? Don't want this - * getting invalidated. + * + * We do not have to worry about simultaneous reads because + * we waited for sole access to the hold buffer above. */ error = bpf_uiomove(d, d->bd_hbuf, d->bd_hlen, uio); BPFD_LOCK(d); - if (d->bd_hbuf != NULL) { - /* Free the hold buffer only if it is still valid. */ - d->bd_fbuf = d->bd_hbuf; - d->bd_hbuf = NULL; - d->bd_hlen = 0; - bpf_buf_reclaimed(d); - } + KASSERT(d->bd_hbuf != NULL, ("bpfread: lost bd_hbuf")); + d->bd_fbuf = d->bd_hbuf; + d->bd_hbuf = NULL; + d->bd_hlen = 0; + bpf_buf_reclaimed(d); + d->bd_hbuf_in_use = 0; + wakeup(&d->bd_hbuf_in_use); BPFD_UNLOCK(d); return (error); @@ -1114,6 +1118,9 @@ reset_d(struct bpf_d *d) BPFD_LOCK_ASSERT(d); + while (d->bd_hbuf_in_use) + mtx_sleep(&d->bd_hbuf_in_use, &d->bd_lock, PRINET, + "bd_hbuf", 0); if ((d->bd_hbuf != NULL) && (d->bd_bufmode != BPF_BUFMODE_ZBUF || bpf_canfreebuf(d))) { /* Free the hold buffer.
*/ @@ -1254,6 +1261,9 @@ bpfioctl(struct cdev *dev, u_long cmd, caddr_t add BPFD_LOCK(d); n = d->bd_slen; + while (d->bd_hbuf_in_use) + mtx_sleep(&d->bd_hbuf_in_use, &d->bd_lock, + PRINET, "bd_hbuf", 0); if (d->bd_hbuf) n += d->bd_hlen; BPFD_UNLOCK(d); @@ -1967,6 +1977,9 @@ filt_bpfread(struct knote *kn, long hint) ready = bpf_ready(d); if (ready) { kn->kn_data = d->bd_slen; + while (d->bd_hbuf_in_use) + mtx_sleep(&d->bd_hbuf_in_use, &d->bd_lock, + PRINET, "bd_hbuf", 0); if (d->bd_hbuf) kn->kn_data += d->bd_hlen; } else if (d->bd_rtout > 0 && d->bd_state == BPF_IDLE) { @@ -2299,6 +2312,9 @@ catchpacket(struct bpf_d *d, u_char *pkt, u_int pk * spot to do it. */ if (d->bd_fbuf == NULL && bpf_canfreebuf(d)) { + while (d->bd_hbuf_in_use) + mtx_sleep(&d->bd_hbuf_in_use, &d->bd_lock, + PRINET, "bd_hbuf", 0); d->bd_fbuf = d->bd_hbuf; d->bd_hbuf = NULL; d->bd_hlen = 0; @@ -2341,6 +2357,9 @@ catchpacket(struct bpf_d *d, u_char *pkt, u_int pk ++d->bd_dcount; return; } + while (d->bd_hbuf_in_use) + mtx_sleep(&d->bd_hbuf_in_use, &d->bd_lock, + PRINET, "bd_hbuf", 0); ROTATE_BUFFERS(d); do_wakeup = 1; curlen = 0; Index: bpf.h =================================================================== --- bpf.h (revision 242997) +++ bpf.h (working copy) @@ -1235,7 +1235,8 @@ SYSCTL_DECL(_net_bpf); /* * Rotate the packet buffers in descriptor d. Move the store buffer into the * hold slot, and the free buffer ino the store slot. Zero the length of the - * new store buffer. Descriptor lock should be held. + * new store buffer. Descriptor lock should be held. Hold buffer must + * not be marked "in use".
*/ #define ROTATE_BUFFERS(d) do { \ (d)->bd_hbuf = (d)->bd_sbuf; \ Index: bpf_buffer.c =================================================================== --- bpf_buffer.c (revision 242997) +++ bpf_buffer.c (working copy) @@ -189,6 +189,9 @@ bpf_buffer_ioctl_sblen(struct bpf_d *d, u_int *i) return (EINVAL); } + while (d->bd_hbuf_in_use) + mtx_sleep(&d->bd_hbuf_in_use, &d->bd_lock, + PRINET, "bd_hbuf", 0); /* Free old buffers if set */ if (d->bd_fbuf != NULL) free(d->bd_fbuf, M_BPF); Index: bpfdesc.h =================================================================== --- bpfdesc.h (revision 242997) +++ bpfdesc.h (working copy) @@ -63,6 +63,7 @@ struct bpf_d { caddr_t bd_sbuf; /* store slot */ caddr_t bd_hbuf; /* hold slot */ caddr_t bd_fbuf; /* free slot */ + int bd_hbuf_in_use; /* don't rotate buffers */ int bd_slen; /* current length of store buffer */ int bd_hlen; /* current length of hold buffer */ From owner-freebsd-net@FreeBSD.ORG Tue Nov 13 21:48:20 2012 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id EDD349C0 for ; Tue, 13 Nov 2012 21:48:20 +0000 (UTC) (envelope-from dustinwenz@ebureau.com) Received: from internet02.ebureau.com (internet02.tru-signal.biz [65.127.24.21]) by mx1.freebsd.org (Postfix) with ESMTP id ACBAD8FC14 for ; Tue, 13 Nov 2012 21:48:20 +0000 (UTC) Received: from service02.office.ebureau.com (internet06.ebureau.com [65.127.24.25]) by internet02.ebureau.com (Postfix) with ESMTP id 746C8E0D307; Tue, 13 Nov 2012 15:48:14 -0600 (CST) Received: from localhost (localhost [127.0.0.1]) by service02.office.ebureau.com (Postfix) with ESMTP id 4D8C0DFE769; Tue,
13 Nov 2012 15:48:14 -0600 (CST) X-Virus-Scanned: amavisd-new at ebureau.com Received: from service02.office.ebureau.com ([127.0.0.1]) by localhost (internet06.ebureau.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id pugrGhKL8RKp; Tue, 13 Nov 2012 15:48:13 -0600 (CST) Received: from square.office.ebureau.com (square.office.iscompanies.com [10.10.20.22]) by service02.office.ebureau.com (Postfix) with ESMTPSA id A11D1DFE74B; Tue, 13 Nov 2012 15:48:13 -0600 (CST) Content-Type: text/plain; charset=us-ascii Mime-Version: 1.0 (Mac OS X Mail 6.1 \(1498\)) Subject: Re: Default ephemeral port range From: Dustin Wenz In-Reply-To: <95686CBD-5A11-48BD-A556-5133F537C82E@gmail.com> Date: Tue, 13 Nov 2012 15:48:13 -0600 Content-Transfer-Encoding: quoted-printable Message-Id: <2EEDF65D-C235-48A7-9464-82475C26E9DD@ebureau.com> References: <87A2D317-77BA-4641-979D-0AE43247D99E@ebureau.com> <95686CBD-5A11-48BD-A556-5133F537C82E@gmail.com> To: Colin O'Keeffe X-Mailer: Apple Mail (2.1498) Cc: freebsd-net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 13 Nov 2012 21:48:21 -0000 Thanks for the information; It would seem that when I invoke the connect() system call, it picks a client port in the portrange.first-last range and not necessarily in portrange.hifirst-hilast. Is this expected behavior, or a bug in connect()? - .Dustin On Nov 12, 2012, at 12:49 PM, Colin O'Keeffe wrote: > 8.1 through 9.1RC will use net.inet.ip.portrange.hifirst (49152) to .hilast (65535) for ephemeral ports as far as I'm aware. net.inet.ip.portrange.first to .last are just a reference to available port numbers as per RFC6056 > > Correct me if I'm wrong but netinet/in_pcb.c:490 indicates this is the case.
> > -Colin > > On 12 Nov 2012, at 17:57, Dustin Wenz wrote: > >> I'm trying to determine why the default ephemeral port range appears to be 10000 through 65535 in at least 8.1 through 9.1RC. Documentation regarding the lower bound on the range seems inconsistent. The FreeBSD website (http://wiki.freebsd.org/SystemTuning) suggests that net.inet.ip.portrange.first defaults to 49152, which I don't believe is accurate. >> >> The IANA recommends the range be 49152 through 65535 (http://tools.ietf.org/html/rfc6056). Is there any particular reason why net.inet.ip.portrange.first defaults to 10000? >> >> - .Dustin >> >> _______________________________________________ >> freebsd-net@freebsd.org mailing list >> http://lists.freebsd.org/mailman/listinfo/freebsd-net >> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" > > _______________________________________________ > freebsd-net@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-net > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" From owner-freebsd-net@FreeBSD.ORG Wed Nov 14 05:32:27 2012 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 8EC39D1C for ; Wed, 14 Nov 2012 05:32:27 +0000 (UTC) (envelope-from eugen@grosbein.net) Received: from eg.sd.rdtc.ru (eg.sd.rdtc.ru [IPv6:2a03:3100:c:13::5]) by mx1.freebsd.org (Postfix) with ESMTP id E173F8FC08 for ; Wed, 14 Nov 2012 05:32:26 +0000 (UTC) Received: from eg.sd.rdtc.ru (localhost [127.0.0.1]) by eg.sd.rdtc.ru (8.14.5/8.14.5) with ESMTP id qAE5WIva041919; Wed, 14 Nov 2012 12:32:19 +0700 (NOVT) (envelope-from eugen@grosbein.net) Message-ID: <50A32CDD.5000109@grosbein.net> Date: Wed, 14 Nov 2012 12:32:13 +0700 From: Eugene Grosbein User-Agent: Mozilla/5.0 (X11; U; FreeBSD i386; ru-RU; rv:1.9.2.13) Gecko/20110112 Thunderbird/3.1.7 MIME-Version: 1.0 To: Dustin Wenz
Subject: Re: Default ephemeral port range References: <87A2D317-77BA-4641-979D-0AE43247D99E@ebureau.com> <95686CBD-5A11-48BD-A556-5133F537C82E@gmail.com> <2EEDF65D-C235-48A7-9464-82475C26E9DD@ebureau.com> In-Reply-To: <2EEDF65D-C235-48A7-9464-82475C26E9DD@ebureau.com> Content-Type: text/plain; charset=KOI8-R Content-Transfer-Encoding: 8bit Cc: freebsd-net@freebsd.org, Colin O'Keeffe X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 14 Nov 2012 05:32:27 -0000 14.11.2012 04:48, Dustin Wenz : > Thanks for the information; > > It would seem that when I invoke the connect() system call, it picks a client port in the portrange.first-last range and not necessarily in portrange.hifirst-hilast. Is this expected behavior, or a bug in connect()? Please read ip(4) manual page on IP_PORTRANGE. You can choose one of three ranges for your socket (default range, low and high). 
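A minimal sketch of what selecting a range looks like from userland, assuming the IP_PORTRANGE socket option and its IP_PORTRANGE_HIGH value as described in ip(4). The helper names use_high_port_range() and ephemeral_port() are made up for illustration and are not part of any FreeBSD API; on systems that do not define IP_PORTRANGE the first helper only reports that the option is unavailable.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <assert.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: ask the kernel to pick this socket's local port
 * from the "high" range (net.inet.ip.portrange.hifirst..hilast) instead
 * of the default range.
 * Returns 0 on success, -1 if setsockopt() fails, and 1 if the platform
 * does not define IP_PORTRANGE (it is a BSD socket option).
 */
int
use_high_port_range(int sock)
{
#ifdef IP_PORTRANGE
	int range = IP_PORTRANGE_HIGH;

	assert(sock >= 0);
	if (setsockopt(sock, IPPROTO_IP, IP_PORTRANGE, &range,
	    sizeof(range)) != 0)
		return (-1);
	return (0);
#else
	assert(sock >= 0);
	return (1);
#endif
}

/*
 * Hypothetical helper: bind to port 0 on the loopback address so the
 * kernel assigns an ephemeral port, then return that port in host byte
 * order (or -1 on error).
 */
int
ephemeral_port(int sock)
{
	struct sockaddr_in sin;
	socklen_t len = sizeof(sin);

	memset(&sin, 0, sizeof(sin));
	sin.sin_family = AF_INET;
	sin.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	sin.sin_port = 0;	/* 0 means "kernel, choose for me" */

	if (bind(sock, (struct sockaddr *)&sin, sizeof(sin)) != 0)
		return (-1);
	if (getsockname(sock, (struct sockaddr *)&sin, &len) != 0)
		return (-1);
	return (ntohs(sin.sin_port));
}
```

Calling use_high_port_range() before connect() or a port-0 bind() should make the kernel draw the local port from portrange.hifirst..hilast; without it the default portrange.first..last range applies, which is consistent with the connect() behavior reported above.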
Eugene Grosbein From owner-freebsd-net@FreeBSD.ORG Wed Nov 14 05:45:21 2012 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 94803FCE; Wed, 14 Nov 2012 05:45:21 +0000 (UTC) (envelope-from lists@rewt.org.uk) Received: from abby.lhr1.as41113.net (abby.lhr1.as41113.net [91.208.177.20]) by mx1.freebsd.org (Postfix) with ESMTP id 4EE0C8FC0C; Wed, 14 Nov 2012 05:45:20 +0000 (UTC) Received: from [172.16.11.21] (bella.stf.rewt.org.uk [91.208.177.62]) (using TLSv1 with cipher ECDHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by abby.lhr1.as41113.net (Postfix) with ESMTPS id 3Y1ZRY1jQDz13Mp; Wed, 14 Nov 2012 05:45:13 +0000 (GMT) Message-ID: <50A32FE7.2010206@rewt.org.uk> Date: Wed, 14 Nov 2012 05:45:11 +0000 From: Joe Holden User-Agent: Thunderbird 2.0.0.24 (Windows/20100228) MIME-Version: 1.0 To: Sean Chittenden Subject: Re: 0.0.0.0/8 oddities... References: <50A20359.9080906@networx.ch> <7C614093-6408-49C6-8515-F6C09183453B@chittenden.org> In-Reply-To: <7C614093-6408-49C6-8515-F6C09183453B@chittenden.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 8bit Cc: "freebsd-net@freebsd.org" , gnn@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 14 Nov 2012 05:45:21 -0000 Sean Chittenden wrote: >>> Hello. I ran in to an interesting situation in what appears to be an exotic situation. Specifically, after reviewing RFC5735 again and searching for a datacenter-local or rack-local IP range (i.e trying to provide services that are guaranteed to be provided in the same rack as the server), I settled on the 0.0.0.0/8 network. 
Per 3 of RFC5735, it would appear that this network is valid: >>> >>> https://tools.ietf.org/html/rfc5735#section-3 >>> >>>> 0.0.0.0/8 - Addresses in this block refer to source hosts on "this" >>>> network. Address 0.0.0.0/32 may be used as a source address for this >>>> host on this network; other addresses within 0.0.0.0/8 may be used to >>>> refer to specified hosts on this network ([RFC1122], Section 3.2.1.3). >>> And this works as expected, with regards to TCP services. But ICMP? Not so much. Is there a reason that ICMP would fail, but TCP (e.g. ssh) works? For example, I pulled 0.42.123.10 and 0.42.123.20 as IP addresses to use for NTP servers, but much to my surprise, I could ssh between the hosts, but I couldn't ping. Is this intentional? I understand that 0.0.0.0/32 == INADDR_ANY for source addresses, but it doesn't appear that there should be a restriction of inbound echoreq packets. According to tcpdump(1), the host is receiving echoreq packets, however no echorep packets are generated. As a work around, I threw things in to a more traditional RFC1918 network and things immediately worked for both SSH and ICMP. >> The check to drop ICMP replies to a source of 0.0.0.0/8 was added >> in r120958 as part of a fix for link local addresses. It was only >> applied to ICMP which is inconsistent as you've found out. >> >>> ?? Any thoughts as to why? It doesn't appear that the current behavior abides by RFC5735. >> Reading this section and RFC1122 it is not entirely clear to me >> what the allowed scope of 0.0.0.0/8 is. I do agree though that >> blocking it only in ICMP is not useful if it is allowed in the >> normal IP input path. >> >> Can you please check how other OS's (Linux, Windows) deal with it? > > -- > Sean Chittenden > seanc@FreeBSD.org 0/8 is not supposed to be used, as per the rfc. As such it doesn't work on most systems (Linux, network appliance vendors included) so this working *should* be a bug, IMO. 
If you want address space there is plenty in RFC1918, or if you can't use that (shoot whoever did the addressing), there are others you could use (eg the new CGN space) From owner-freebsd-net@FreeBSD.ORG Wed Nov 14 07:06:07 2012 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 00B51136; Wed, 14 Nov 2012 07:06:06 +0000 (UTC) (envelope-from sean@chittenden.org) Received: from mail01.lax1.stackjet.com (mon01.lax1.stackjet.com [174.136.104.178]) by mx1.freebsd.org (Postfix) with ESMTP id D0CFB8FC19; Wed, 14 Nov 2012 07:06:06 +0000 (UTC) Received: from laptop-sean-wifi.local (173-228-12-182.dsl.dynamic.sonic.net [173.228.12.182]) (using TLSv1 with cipher AES128-SHA (128/128 bits)) (No client certificate requested) (Authenticated sender: sean@chittenden.org) by mail01.lax1.stackjet.com (Postfix) with ESMTPSA id 7F1D63E8D5B; Tue, 13 Nov 2012 23:06:05 -0800 (PST) Content-Type: text/plain; charset=iso-8859-1 Mime-Version: 1.0 (Mac OS X Mail 6.2 \(1499\)) Subject: Re: 0.0.0.0/8 oddities... From: Sean Chittenden In-Reply-To: <50A32FE7.2010206@rewt.org.uk> Date: Tue, 13 Nov 2012 23:06:04 -0800 Content-Transfer-Encoding: quoted-printable Message-Id: <7BE7E643-FB13-45DE-BA40-257B8ADFAA98@chittenden.org> References: <50A20359.9080906@networx.ch> <7C614093-6408-49C6-8515-F6C09183453B@chittenden.org> <50A32FE7.2010206@rewt.org.uk> To: Joe Holden X-Mailer: Apple Mail (2.1499) Cc: "freebsd-net@freebsd.org" , gnn@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 14 Nov 2012 07:06:07 -0000 >>>> Hello. I ran in to an interesting situation in what appears to be an exotic situation.
Specifically, after reviewing RFC5735 again and searching for a datacenter-local or rack-local IP range (i.e trying to provide services that are guaranteed to be provided in the same rack as the server), I settled on the 0.0.0.0/8 network. Per §3 of RFC5735, it would appear that this network is valid: >>>> >>>> https://tools.ietf.org/html/rfc5735#section-3 >>>> >>>>> 0.0.0.0/8 - Addresses in this block refer to source hosts on "this" >>>>> network. Address 0.0.0.0/32 may be used as a source address for this >>>>> host on this network; other addresses within 0.0.0.0/8 may be used to >>>>> refer to specified hosts on this network ([RFC1122], Section 3.2.1.3). >>>> And this works as expected, with regards to TCP services. But ICMP? Not so much. Is there a reason that ICMP would fail, but TCP (e.g. ssh) works? For example, I pulled 0.42.123.10 and 0.42.123.20 as IP addresses to use for NTP servers, but much to my surprise, I could ssh between the hosts, but I couldn't ping. Is this intentional? I understand that 0.0.0.0/32 == INADDR_ANY for source addresses, but it doesn't appear that there should be a restriction of inbound echoreq packets. According to tcpdump(1), the host is receiving echoreq packets, however no echorep packets are generated. As a work around, I threw things in to a more traditional RFC1918 network and things immediately worked for both SSH and ICMP. >>> The check to drop ICMP replies to a source of 0.0.0.0/8 was added >>> in r120958 as part of a fix for link local addresses. It was only >>> applied to ICMP which is inconsistent as you've found out. >>> >>>> ?? Any thoughts as to why? It doesn't appear that the current behavior abides by RFC5735. >>> Reading this section and RFC1122 it is not entirely clear to me >>> what the allowed scope of 0.0.0.0/8 is. I do agree though that >>> blocking it only in ICMP is not useful if it is allowed in the >>> normal IP input path.
>>> >>> Can you please check how other OS's (Linux, Windows) deal with it? > > 0/8 is not supposed to be used, as per the rfc. As such it doesn't work on most systems (Linux, network appliance vendors included) so this working *should* be a bug, IMO. Where does it say that it shouldn't be used? Which RFC & §? There are plenty of RFCs and I haven't exhaustively read things, so I reserve the right to be wrong & corrected, but I haven't seen anything that says, "do not use 0.0.0.0/8." 0.0.0.0/32, yes, that's a reserved and special IP address, but the remainder of the /8? It's a stretch to argue that it can't be used. -sc -- Sean Chittenden sean@chittenden.org From owner-freebsd-net@FreeBSD.ORG Wed Nov 14 07:06:55 2012 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id D1BC91DE for ; Wed, 14 Nov 2012 07:06:55 +0000 (UTC) (envelope-from fernando@gont.com.ar) Received: from web01.jbserver.net (web01.jbserver.net [93.186.182.34]) by mx1.freebsd.org (Postfix) with ESMTP id 8E68A8FC13 for ; Wed, 14 Nov 2012 07:06:54 +0000 (UTC) Received: from [186.134.15.187] (helo=[192.168.123.122]) by web01.jbserver.net with esmtpsa (TLSv1:DHE-RSA-CAMELLIA256-SHA:256) (Exim 4.80.1) (envelope-from ) id 1TYWZw-00060T-LX; Wed, 14 Nov 2012 07:36:05 +0100 Message-ID: <50A338FB.9060602@gont.com.ar> Date: Wed, 14 Nov 2012 03:23:55 -0300 From: Fernando Gont User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:16.0) Gecko/20121028 Thunderbird/16.0.2 MIME-Version: 1.0 To: Dustin Wenz Subject: Re: Default ephemeral port range References: <87A2D317-77BA-4641-979D-0AE43247D99E@ebureau.com> In-Reply-To: <87A2D317-77BA-4641-979D-0AE43247D99E@ebureau.com> X-Enigmail-Version: 1.4.5 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: freebsd-net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking
and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 14 Nov 2012 07:06:55 -0000 On 11/12/2012 02:57 PM, Dustin Wenz wrote: > I'm trying to determine why the default ephemeral port range appears > to be 10000 through 65535 in at least 8.1 through 9.1RC. I had produced the patch that extended the ephemeral port range in FreeBSD. My original patch extended the ephemeral port range to 1024-65535. However, it was noted that X uses ports in the range 1024-10000, and hence it was better to exclude that port range from the ephemeral port range. > The IANA recommends the range be 49152 through 65535 > (http://tools.ietf.org/html/rfc6056). IANA *used* to recommend that range. In RFC 6056 we recommend implementations to use the largest possible port range -- ideally 1024-65535. > Is there any particular reason > why net.inet.ip.portrange.first defaults to 10000? Please see above. Cheers, -- Fernando Gont e-mail: fernando@gont.com.ar || fgont@si6networks.com PGP Fingerprint: 7809 84F5 322E 45C7 F1C9 3945 96EE A9EF D076 FFF1 From owner-freebsd-net@FreeBSD.ORG Wed Nov 14 07:21:28 2012 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id C4DE93D9 for ; Wed, 14 Nov 2012 07:21:28 +0000 (UTC) (envelope-from lists@rewt.org.uk) Received: from abby.lhr1.as41113.net (unknown [IPv6:2001:b70:201:2::22]) by mx1.freebsd.org (Postfix) with ESMTP id 5C0A28FC13 for ; Wed, 14 Nov 2012 07:21:27 +0000 (UTC) Received: from [172.16.11.21] (bella.stf.rewt.org.uk [91.208.177.62]) (using TLSv1 with cipher ECDHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by abby.lhr1.as41113.net (Postfix) with ESMTPS id 3Y1cZZ2qTBz13Mp; Wed, 14 Nov 2012 07:21:26 +0000 (GMT) Message-ID: <50A34675.2020709@rewt.org.uk> Date: Wed, 14 Nov 2012 07:21:25 +0000 From: Joe Holden User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64;
rv:16.0) Gecko/20121026 Thunderbird/16.0.2 MIME-Version: 1.0 To: Sean Chittenden Subject: Re: 0.0.0.0/8 oddities... References: <50A20359.9080906@networx.ch> <7C614093-6408-49C6-8515-F6C09183453B@chittenden.org> <50A32FE7.2010206@rewt.org.uk> <7BE7E643-FB13-45DE-BA40-257B8ADFAA98@chittenden.org> In-Reply-To: <7BE7E643-FB13-45DE-BA40-257B8ADFAA98@chittenden.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 8bit Cc: "freebsd-net@freebsd.org" X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 14 Nov 2012 07:21:28 -0000 On 14/11/2012 07:06, Sean Chittenden wrote: >>>>> Hello. I ran in to an interesting situation in what appears to be an exotic situation. Specifically, after reviewing RFC5735 again and searching for a datacenter-local or rack-local IP range (i.e trying to provide services that are guaranteed to be provided in the same rack as the server), I settled on the 0.0.0.0/8 network. Per 3 of RFC5735, it would appear that this network is valid: >>>>> >>>>> https://tools.ietf.org/html/rfc5735#section-3 >>>>> >>>>>> 0.0.0.0/8 - Addresses in this block refer to source hosts on "this" >>>>>> network. Address 0.0.0.0/32 may be used as a source address for this >>>>>> host on this network; other addresses within 0.0.0.0/8 may be used to >>>>>> refer to specified hosts on this network ([RFC1122], Section 3.2.1.3). >>>>> And this works as expected, with regards to TCP services. But ICMP? Not so much. Is there a reason that ICMP would fail, but TCP (e.g. ssh) works? For example, I pulled 0.42.123.10 and 0.42.123.20 as IP addresses to use for NTP servers, but much to my surprise, I could ssh between the hosts, but I couldn't ping. Is this intentional? 
I understand that 0.0.0.0/32 == INADDR_ANY for source addresses, but it doesn't appear that there should be a restriction of inbound echoreq packets. According to tcpdump(1), the host is receiving echoreq packets, however no echorep packets are generated. As a work around, I threw things in to a more traditional RFC1918 network and things immediately worked for both SSH and ICMP. >>>> The check to drop ICMP replies to a source of 0.0.0.0/8 was added >>>> in r120958 as part of a fix for link local addresses. It was only >>>> applied to ICMP which is inconsistent as you've found out. >>>> >>>>> ?? Any thoughts as to why? It doesn't appear that the current behavior abides by RFC5735. >>>> Reading this section and RFC1122 it is not entirely clear to me >>>> what the allowed scope of 0.0.0.0/8 is. I do agree though that >>>> blocking it only in ICMP is not useful if it is allowed in the >>>> normal IP input path. >>>> >>>> Can you please check how other OS's (Linux, Windows) deal with it? >> >> 0/8 is not supposed to be used, as per the rfc. As such it doesn't work on most systems (Linux, network appliance vendors included) so this working *should* be a bug, IMO. > > Where does it say that it shouldn't be used? Which RFC & ? There are plenty of RFCs and I haven't exhaustively read things, so I reserve the right to be wrong & corrected, but I haven't seen anything that says, "do not use 0.0.0.0/8." 0.0.0.0/32, yes, that's a reserved and special IP address, but the remainder of the /8? It's a stretch to argue that it can't be used. > > -sc > > -- > Sean Chittenden > sean@chittenden.org There are several, including the one you referenced where it references the other addresses can only be used as a source address. It is vague but accepted that 0/8 isn't usable as anything other than that. Regardless, why are you trying to do something that is unsupported by pretty much every vendor/operator/os? 
From owner-freebsd-net@FreeBSD.ORG Wed Nov 14 07:25:21 2012 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id AD2F95BC for ; Wed, 14 Nov 2012 07:25:21 +0000 (UTC) (envelope-from sean@chittenden.org) Received: from mail01.lax1.stackjet.com (mon01.lax1.stackjet.com [174.136.104.178]) by mx1.freebsd.org (Postfix) with ESMTP id 889EC8FC16 for ; Wed, 14 Nov 2012 07:25:21 +0000 (UTC) Received: from laptop-sean-wifi.local (173-228-12-182.dsl.dynamic.sonic.net [173.228.12.182]) (using TLSv1 with cipher AES128-SHA (128/128 bits)) (No client certificate requested) (Authenticated sender: sean@chittenden.org) by mail01.lax1.stackjet.com (Postfix) with ESMTPSA id 495B83E8D42; Tue, 13 Nov 2012 23:25:21 -0800 (PST) Content-Type: text/plain; charset=iso-8859-1 Mime-Version: 1.0 (Mac OS X Mail 6.2 \(1499\)) Subject: Re: 0.0.0.0/8 oddities... From: Sean Chittenden In-Reply-To: <50A34675.2020709@rewt.org.uk> Date: Tue, 13 Nov 2012 23:25:20 -0800 Content-Transfer-Encoding: quoted-printable Message-Id: <082A52DA-3C04-46B7-A0C6-2F1CD814C01C@chittenden.org> References: <50A20359.9080906@networx.ch> <7C614093-6408-49C6-8515-F6C09183453B@chittenden.org> <50A32FE7.2010206@rewt.org.uk> <7BE7E643-FB13-45DE-BA40-257B8ADFAA98@chittenden.org> <50A34675.2020709@rewt.org.uk> To: Joe Holden X-Mailer: Apple Mail (2.1499) Cc: "freebsd-net@freebsd.org" X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 14 Nov 2012 07:25:21 -0000 >>>>> The check to drop ICMP replies to a source of 0.0.0.0/8 was added >>>>> in r120958 as part of a fix for link local addresses. It was only >>>>> applied to ICMP which is inconsistent as you've found out. >>>>> >>>>>> ?? Any thoughts as to why?
It doesn't appear that the current behavior abides by RFC5735. >>>>> Reading this section and RFC1122 it is not entirely clear to me >>>>> what the allowed scope of 0.0.0.0/8 is. I do agree though that >>>>> blocking it only in ICMP is not useful if it is allowed in the >>>>> normal IP input path. >>>>> >>>>> Can you please check how other OS's (Linux, Windows) deal with it? >>> >>> 0/8 is not supposed to be used, as per the rfc. As such it doesn't work on most systems (Linux, network appliance vendors included) so this working *should* be a bug, IMO. >> >> Where does it say that it shouldn't be used? Which RFC & §? There are plenty of RFCs and I haven't exhaustively read things, so I reserve the right to be wrong & corrected, but I haven't seen anything that says, "do not use 0.0.0.0/8." 0.0.0.0/32, yes, that's a reserved and special IP address, but the remainder of the /8? It's a stretch to argue that it can't be used. > > There are several, including the one you referenced where it references the other addresses can only be used as a source address. It is vague but accepted that 0/8 isn't usable as anything other than that. Can you be more specific? I read "other addresses within 0.0.0.0/8 may be used to refer to specified hosts on this network" as an indication that use of 0/8 is intended to be supported. > Regardless, why are you trying to do something that is unsupported by pretty much every vendor/operator/os? Status quo is fine and dandy if it's rational, backed up with a justification and can be understood, but I'm not seeing anything that suggests there's a good reason which indicates 0/8 shouldn't be used or supported.
-sc -- Sean Chittenden sean@chittenden.org From owner-freebsd-net@FreeBSD.ORG Wed Nov 14 07:32:10 2012 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 6E4996D1 for ; Wed, 14 Nov 2012 07:32:10 +0000 (UTC) (envelope-from lists@rewt.org.uk) Received: from abby.lhr1.as41113.net (unknown [IPv6:2001:b70:201:2::22]) by mx1.freebsd.org (Postfix) with ESMTP id 2C1FF8FC14 for ; Wed, 14 Nov 2012 07:32:09 +0000 (UTC) Received: from [172.16.11.21] (bella.stf.rewt.org.uk [91.208.177.62]) (using TLSv1 with cipher ECDHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by abby.lhr1.as41113.net (Postfix) with ESMTPS id 3Y1cpw70Mdz13Mp; Wed, 14 Nov 2012 07:32:08 +0000 (GMT) Message-ID: <50A348F8.1050805@rewt.org.uk> Date: Wed, 14 Nov 2012 07:32:08 +0000 From: Joe Holden User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:16.0) Gecko/20121026 Thunderbird/16.0.2 MIME-Version: 1.0 To: Sean Chittenden Subject: Re: 0.0.0.0/8 oddities... References: <50A20359.9080906@networx.ch> <7C614093-6408-49C6-8515-F6C09183453B@chittenden.org> <50A32FE7.2010206@rewt.org.uk> <7BE7E643-FB13-45DE-BA40-257B8ADFAA98@chittenden.org> <50A34675.2020709@rewt.org.uk> <082A52DA-3C04-46B7-A0C6-2F1CD814C01C@chittenden.org> In-Reply-To: <082A52DA-3C04-46B7-A0C6-2F1CD814C01C@chittenden.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 8bit Cc: "freebsd-net@freebsd.org" X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 14 Nov 2012 07:32:10 -0000 On 14/11/2012 07:25, Sean Chittenden wrote: >>>>>> The check to drop ICMP replies to a source of 0.0.0.0/8 was added >>>>>> in r120958 as part of a fix for link local addresses. 
It was only >>>>>> applied to ICMP which is inconsistent as you've found out. >>>>>> >>>>>>> ?? Any thoughts as to why? It doesn't appear that the current behavior abides by RFC5735. >>>>>> Reading this section and RFC1122 it is not entirely clear to me >>>>>> what the allowed scope of 0.0.0.0/8 is. I do agree though that >>>>>> blocking it only in ICMP is not useful if it is allowed in the >>>>>> normal IP input path. >>>>>> >>>>>> Can you please check how other OS's (Linux, Windows) deal with it? >>>> >>>> 0/8 is not supposed to be used, as per the rfc. As such it doesn't work on most systems (Linux, network appliance vendors included) so this working *should* be a bug, IMO. >>> >>> Where does it say that it shouldn't be used? Which RFC & ? There are plenty of RFCs and I haven't exhaustively read things, so I reserve the right to be wrong & corrected, but I haven't seen anything that says, "do not use 0.0.0.0/8." 0.0.0.0/32, yes, that's a reserved and special IP address, but the remainder of the /8? It's a stretch to argue that it can't be used. >> >> There are several, including the one you referenced where it references the other addresses can only be used as a source address. It is vague but accepted that 0/8 isn't usable as anything other than that. > > Can you be more specific? I read "other addresses within 0.0.0.0/8 may be used to refer to specified hosts on this network" as an indication that use of 0/8 is intended to be supported. > >> Regardless, why are you trying to do something that is unsupported by pretty much every vendor/operator/os? > > Status quo is fine and dandy if it's rational, backed up with a justification and can be understood, but I'm not seeing anything that suggests there's a good reason which indicates 0/8 shouldn't be used or supported. -sc > It's official registration is for "self identification", "this" network doesn't mean the connected network. 
All in all, even allowing an address in 0/8 to be configured is a bug based on both a) the various RFCs and intended use and b) that's how everyone else accepts that it should work anyway, so RFC is irrelevant in that case. From owner-freebsd-net@FreeBSD.ORG Wed Nov 14 07:37:00 2012 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 577F17A8 for ; Wed, 14 Nov 2012 07:37:00 +0000 (UTC) (envelope-from lists@rewt.org.uk) Received: from abby.lhr1.as41113.net (abby.lhr1.as41113.net [91.208.177.20]) by mx1.freebsd.org (Postfix) with ESMTP id 0ED078FC14 for ; Wed, 14 Nov 2012 07:36:59 +0000 (UTC) Received: from [172.16.11.21] (bella.stf.rewt.org.uk [91.208.177.62]) (using TLSv1 with cipher ECDHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by abby.lhr1.as41113.net (Postfix) with ESMTPS id 3Y1cwV1GMhz13Mp; Wed, 14 Nov 2012 07:36:58 +0000 (GMT) Message-ID: <50A34A19.5030005@rewt.org.uk> Date: Wed, 14 Nov 2012 07:36:57 +0000 From: Joe Holden User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:16.0) Gecko/20121026 Thunderbird/16.0.2 MIME-Version: 1.0 To: Sean Chittenden Subject: Re: 0.0.0.0/8 oddities... 
References: <50A20359.9080906@networx.ch> <7C614093-6408-49C6-8515-F6C09183453B@chittenden.org> <50A32FE7.2010206@rewt.org.uk> <7BE7E643-FB13-45DE-BA40-257B8ADFAA98@chittenden.org> <50A34675.2020709@rewt.org.uk> <082A52DA-3C04-46B7-A0C6-2F1CD814C01C@chittenden.org> <50A348F8.1050805@rewt.org.uk> In-Reply-To: <50A348F8.1050805@rewt.org.uk> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 8bit Cc: "freebsd-net@freebsd.org" X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 14 Nov 2012 07:37:00 -0000 On 14/11/2012 07:32, Joe Holden wrote: > On 14/11/2012 07:25, Sean Chittenden wrote: >>>>>>> The check to drop ICMP replies to a source of 0.0.0.0/8 was added >>>>>>> in r120958 as part of a fix for link local addresses. It was only >>>>>>> applied to ICMP which is inconsistent as you've found out. >>>>>>> >>>>>>>> ?? Any thoughts as to why? It doesn't appear that the current >>>>>>>> behavior abides by RFC5735. >>>>>>> Reading this section and RFC1122 it is not entirely clear to me >>>>>>> what the allowed scope of 0.0.0.0/8 is. I do agree though that >>>>>>> blocking it only in ICMP is not useful if it is allowed in the >>>>>>> normal IP input path. >>>>>>> >>>>>>> Can you please check how other OS's (Linux, Windows) deal with it? >>>>> >>>>> 0/8 is not supposed to be used, as per the rfc. As such it doesn't >>>>> work on most systems (Linux, network appliance vendors included) so >>>>> this working *should* be a bug, IMO. >>>> >>>> Where does it say that it shouldn't be used? Which RFC & ? There >>>> are plenty of RFCs and I haven't exhaustively read things, so I >>>> reserve the right to be wrong & corrected, but I haven't seen >>>> anything that says, "do not use 0.0.0.0/8." 
0.0.0.0/32, yes, that's >>>> a reserved and special IP address, but the remainder of the /8? It's >>>> a stretch to argue that it can't be used. >>> >>> There are several, including the one you referenced where it >>> references the other addresses can only be used as a source address. >>> It is vague but accepted that 0/8 isn't usable as anything other than >>> that. >> >> Can you be more specific? I read "other addresses within 0.0.0.0/8 may >> be used to refer to specified hosts on this network" as an indication >> that use of 0/8 is intended to be supported. >> >>> Regardless, why are you trying to do something that is unsupported by >>> pretty much every vendor/operator/os? >> >> Status quo is fine and dandy if it's rational, backed up with a >> justification and can be understood, but I'm not seeing anything that >> suggests there's a good reason which indicates 0/8 shouldn't be used >> or supported. -sc >> > It's official registration is for "self identification", "this" network > doesn't mean the connected network. > > All in all, even allowing an address in 0/8 to be configured is a bug > based on both a) the various RFCs and intended use and b) that's how > everyone else accepts that it should work anyway, so RFC is irrelevant > in that case. 
> Actually, after testing it doesn't look like there is any special handling for other ranges either, going to need a seasoned net developer to weigh in on this one From owner-freebsd-net@FreeBSD.ORG Wed Nov 14 07:48:59 2012 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 6A26C94F for ; Wed, 14 Nov 2012 07:48:59 +0000 (UTC) (envelope-from sean@chittenden.org) Received: from mail01.lax1.stackjet.com (mon01.lax1.stackjet.com [174.136.104.178]) by mx1.freebsd.org (Postfix) with ESMTP id 476C08FC13 for ; Wed, 14 Nov 2012 07:48:59 +0000 (UTC) Received: from laptop-sean-wifi.local (173-228-12-182.dsl.dynamic.sonic.net [173.228.12.182]) (using TLSv1 with cipher AES128-SHA (128/128 bits)) (No client certificate requested) (Authenticated sender: sean@chittenden.org) by mail01.lax1.stackjet.com (Postfix) with ESMTPSA id B05F33E8D59; Tue, 13 Nov 2012 23:48:52 -0800 (PST) Content-Type: text/plain; charset=iso-8859-1 Mime-Version: 1.0 (Mac OS X Mail 6.2 \(1499\)) Subject: Re: 0.0.0.0/8 oddities... 
From: Sean Chittenden In-Reply-To: <50A348F8.1050805@rewt.org.uk> Date: Tue, 13 Nov 2012 23:48:51 -0800 Content-Transfer-Encoding: quoted-printable Message-Id: References: <50A20359.9080906@networx.ch> <7C614093-6408-49C6-8515-F6C09183453B@chittenden.org> <50A32FE7.2010206@rewt.org.uk> <7BE7E643-FB13-45DE-BA40-257B8ADFAA98@chittenden.org> <50A34675.2020709@rewt.org.uk> <082A52DA-3C04-46B7-A0C6-2F1CD814C01C@chittenden.org> <50A348F8.1050805@rewt.org.uk> To: Joe Holden X-Mailer: Apple Mail (2.1499) Cc: "freebsd-net@freebsd.org" X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 14 Nov 2012 07:48:59 -0000

>>> Regardless, why are you trying to do something that is unsupported by pretty much every vendor/operator/os?
>>
>> Status quo is fine and dandy if it's rational, backed up with a justification and can be understood, but I'm not seeing anything that suggests there's a good reason which indicates 0/8 shouldn't be used or supported. -sc
>
> It's official registration is for "self identification", "this" network doesn't mean the connected network.
>
> All in all, even allowing an address in 0/8 to be configured is a bug based on both a) the various RFCs and intended use and b) that's how everyone else accepts that it should work anyway, so RFC is irrelevant in that case.

I think that's incorrect. 127/8 is used for hosts local to a physical server and 0/8 was intended for hosts "local to a network." In my definition, "this network" is data center-local, however there's nothing preventing that IP address range from being rack-local either, etc. 0.0.0.0/32 is a shortcut for saying "me on this network," which makes sense in the context of the wording in RFC 5735. Again, section 3 paragraph 1:

   0.0.0.0/8 - Addresses in this block refer to source hosts on "this"
   network. Address 0.0.0.0/32 may be used as a source address for this
   host on this network; other addresses within 0.0.0.0/8 may be used to
   refer to specified hosts on this network ([RFC1122], Section 3.2.1.3).

In environments where DNS is an extra service that requires justification and would be an additional service that has to be secured, exclusive use of well known IP addresses is both convenient and useful, and the 0/8 network seems to have been defined for exactly this purpose. I admit the address range isn't in wide use atm, but I don't see a reason for it to not be.

The fix Andre made appears to be correct, and IMO, should be merged in to -head and MFC'ed.

http://www.secnetix.de/~olli/FreeBSD/svnews/index.py?r=242956

Cheers (& thank you Andre for making the commit). -sc

--
Sean Chittenden
sean@chittenden.org

From owner-freebsd-net@FreeBSD.ORG Wed Nov 14 08:05:18 2012 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 02DF8B30 for ; Wed, 14 Nov 2012 08:05:18 +0000 (UTC) (envelope-from lists@rewt.org.uk) Received: from abby.lhr1.as41113.net (unknown [IPv6:2001:b70:201:2::22]) by mx1.freebsd.org (Postfix) with ESMTP id 9DE668FC08 for ; Wed, 14 Nov 2012 08:05:17 +0000 (UTC) Received: from [172.16.11.21] (bella.stf.rewt.org.uk [91.208.177.62]) (using TLSv1 with cipher ECDHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by abby.lhr1.as41113.net (Postfix) with ESMTPS id 3Y1dY53sRNz13Mp; Wed, 14 Nov 2012 08:05:12 +0000 (GMT) Message-ID: <50A350B6.7060600@rewt.org.uk> Date: Wed, 14 Nov 2012 08:05:10 +0000 From: Joe Holden User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:16.0) Gecko/20121026 Thunderbird/16.0.2 MIME-Version: 1.0 To: Sean Chittenden Subject: Re: 0.0.0.0/8 oddities...
References: <50A20359.9080906@networx.ch> <7C614093-6408-49C6-8515-F6C09183453B@chittenden.org> <50A32FE7.2010206@rewt.org.uk> <7BE7E643-FB13-45DE-BA40-257B8ADFAA98@chittenden.org> <50A34675.2020709@rewt.org.uk> <082A52DA-3C04-46B7-A0C6-2F1CD814C01C@chittenden.org> <50A348F8.1050805@rewt.org.uk> In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: "freebsd-net@freebsd.org" X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 14 Nov 2012 08:05:18 -0000 On 14/11/2012 07:48, Sean Chittenden wrote: >>>> Regardless, why are you trying to do something that is unsupported by pretty much every vendor/operator/os? >>> >>> Status quo is fine and dandy if it's rational, backed up with a justification and can be understood, but I'm not seeing anything that suggests there's a good reason which indicates 0/8 shouldn't be used or supported. -sc >> >> It's official registration is for "self identification", "this" network doesn't mean the connected network. >> >> All in all, even allowing an address in 0/8 to be configured is a bug based on both a) the various RFCs and intended use and b) that's how everyone else accepts that it should work anyway, so RFC is irrelevant in that case. > > I think that's incorrect. 127/8 is used for hosts local to a physical server and 0/8 was intended for hosts "local to a network." In my definition, "this network" is data center-local, however there's nothing preventing that IP address range from being rack-local either, etc. 0.0.0.0/32 is a shortcut for saying "me on this network," which makes sense in the context of the wording in RFC 5735. Again, section 3 paragraph 1: > > 0.0.0.0/8 - Addresses in this block refer to source hosts on "this" > network. 
Address 0.0.0.0/32 may be used as a source address for this > host on this network; other addresses within 0.0.0.0/8 may be used to > refer to specified hosts on this network ([RFC1122], Section 3.2.1.3). > > In environments where DNS is an extra service that requires justification and would be an additional service that has to be secured, exclusive use of well known IP addresses is both convenient and useful, and the 0/8 network seems to have been defined for exactly this purpose. I admit the address range isn't in wide use atm, but I don't see a reason for it to not be. > > The fix Andre made appears to be correct, and IMO, should be merged in to -head and MFC'ed. > > http://www.secnetix.de/~olli/FreeBSD/svnews/index.py?r=242956 > > Cheers (& thank you Andre for making the commit). -sc > > -- > Sean Chittenden > sean@chittenden.org > It is quite clearly for self identification, major vendors and other reference OSes, eg Linux/Windows don't allow it, thus this is *wrong*. What we now have is a stack that's even more non-rfc/best practice compliant because *you* couldn't pick a sensible address like everyone else. There are 10 distinct ranges to choose from that whilst not being advisable, are not special cases and thus are valid. Andre, There are plenty of reserved ranges that *are* valid and *are* usable by other vendors/systems. Based on the rfc wording and vendor documents saying it should be a source address only and everyone else treating it as a special range (as it should be, like 224/4) and others such as link local, this should be reverted. 
From owner-freebsd-net@FreeBSD.ORG Wed Nov 14 09:35:13 2012 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 04538381 for ; Wed, 14 Nov 2012 09:35:13 +0000 (UTC) (envelope-from andre@freebsd.org) Received: from c00l3r.networx.ch (c00l3r.networx.ch [62.48.2.2]) by mx1.freebsd.org (Postfix) with ESMTP id 5324D8FC13 for ; Wed, 14 Nov 2012 09:35:11 +0000 (UTC) Received: (qmail 30213 invoked from network); 14 Nov 2012 11:09:13 -0000 Received: from c00l3r.networx.ch (HELO [127.0.0.1]) ([62.48.2.2]) (envelope-sender ) by c00l3r.networx.ch (qmail-ldap-1.03) with SMTP for ; 14 Nov 2012 11:09:13 -0000 Message-ID: <50A365C9.4010902@freebsd.org> Date: Wed, 14 Nov 2012 10:35:05 +0100 From: Andre Oppermann User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:16.0) Gecko/20121010 Thunderbird/16.0.1 MIME-Version: 1.0 To: Sean Chittenden Subject: Re: 0.0.0.0/8 oddities... References: <50A20359.9080906@networx.ch> <7C614093-6408-49C6-8515-F6C09183453B@chittenden.org> <50A32FE7.2010206@rewt.org.uk> <7BE7E643-FB13-45DE-BA40-257B8ADFAA98@chittenden.org> <50A34675.2020709@rewt.org.uk> <082A52DA-3C04-46B7-A0C6-2F1CD814C01C@chittenden.org> <50A348F8.1050805@rewt.org.uk> In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: "freebsd-net@freebsd.org" , Joe Holden X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 14 Nov 2012 09:35:13 -0000 On 14.11.2012 08:48, Sean Chittenden wrote: >>>> Regardless, why are you trying to do something that is unsupported by pretty much every vendor/operator/os? 
>>> >>> Status quo is fine and dandy if it's rational, backed up with a justification and can be understood, but I'm not seeing anything that suggests there's a good reason which indicates 0/8 shouldn't be used or supported. -sc >> >> It's official registration is for "self identification", "this" network doesn't mean the connected network. >> >> All in all, even allowing an address in 0/8 to be configured is a bug based on both a) the various RFCs and intended use and b) that's how everyone else accepts that it should work anyway, so RFC is irrelevant in that case. > > I think that's incorrect. 127/8 is used for hosts local to a physical server and 0/8 was intended for hosts "local to a network." In my definition, "this network" is data center-local, however there's nothing preventing that IP address range from being rack-local either, etc. 0.0.0.0/32 is a shortcut for saying "me on this network," which makes sense in the context of the wording in RFC 5735. Again, section 3 paragraph 1: > > 0.0.0.0/8 - Addresses in this block refer to source hosts on "this" > network. Address 0.0.0.0/32 may be used as a source address for this > host on this network; other addresses within 0.0.0.0/8 may be used to > refer to specified hosts on this network ([RFC1122], Section 3.2.1.3). > > In environments where DNS is an extra service that requires justification and would be an additional service that has to be secured, exclusive use of well known IP addresses is both convenient and useful, and the 0/8 network seems to have been defined for exactly this purpose. I admit the address range isn't in wide use atm, but I don't see a reason for it to not be. > > The fix Andre made appears to be correct, and IMO, should be merged in to -head and MFC'ed. > > http://www.secnetix.de/~olli/FreeBSD/svnews/index.py?r=242956 I agree, but I want to check how Linux and Windows behave first. 
-- Andre From owner-freebsd-net@FreeBSD.ORG Wed Nov 14 09:59:26 2012 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id D1EE78AA; Wed, 14 Nov 2012 09:59:26 +0000 (UTC) (envelope-from lists@rewt.org.uk) Received: from abby.lhr1.as41113.net (unknown [IPv6:2001:b70:201:2::22]) by mx1.freebsd.org (Postfix) with ESMTP id 6270A8FC16; Wed, 14 Nov 2012 09:59:25 +0000 (UTC) Received: from [172.16.11.21] (bella.stf.rewt.org.uk [91.208.177.62]) (using TLSv1 with cipher ECDHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by abby.lhr1.as41113.net (Postfix) with ESMTPS id 3Y1h4r33bFz13N1; Wed, 14 Nov 2012 09:59:24 +0000 (GMT) Message-ID: <50A36B7B.50106@rewt.org.uk> Date: Wed, 14 Nov 2012 09:59:23 +0000 From: Joe Holden User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:16.0) Gecko/20121026 Thunderbird/16.0.2 MIME-Version: 1.0 To: Andre Oppermann Subject: Re: 0.0.0.0/8 oddities... References: <50A20359.9080906@networx.ch> <7C614093-6408-49C6-8515-F6C09183453B@chittenden.org> <50A32FE7.2010206@rewt.org.uk> <7BE7E643-FB13-45DE-BA40-257B8ADFAA98@chittenden.org> <50A34675.2020709@rewt.org.uk> <082A52DA-3C04-46B7-A0C6-2F1CD814C01C@chittenden.org> <50A348F8.1050805@rewt.org.uk> <50A365C9.4010902@freebsd.org> In-Reply-To: <50A365C9.4010902@freebsd.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: "freebsd-net@freebsd.org" , Sean Chittenden X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 14 Nov 2012 09:59:27 -0000 On 14/11/2012 09:35, Andre Oppermann wrote: > On 14.11.2012 08:48, Sean Chittenden wrote: >>>>> Regardless, why are you trying to do something that is unsupported >>>>> by pretty much every vendor/operator/os? 
>>>> >>>> Status quo is fine and dandy if it's rational, backed up with a >>>> justification and can be understood, but I'm not seeing anything >>>> that suggests there's a good reason which indicates 0/8 shouldn't be >>>> used or supported. -sc >>> >>> It's official registration is for "self identification", "this" >>> network doesn't mean the connected network. >>> >>> All in all, even allowing an address in 0/8 to be configured is a bug >>> based on both a) the various RFCs and intended use and b) that's how >>> everyone else accepts that it should work anyway, so RFC is >>> irrelevant in that case. >> >> I think that's incorrect. 127/8 is used for hosts local to a physical >> server and 0/8 was intended for hosts "local to a network." In my >> definition, "this network" is data center-local, however there's >> nothing preventing that IP address range from being rack-local either, >> etc. 0.0.0.0/32 is a shortcut for saying "me on this network," which >> makes sense in the context of the wording in RFC 5735. Again, section >> 3 paragraph 1: >> >> 0.0.0.0/8 - Addresses in this block refer to source hosts on "this" >> network. Address 0.0.0.0/32 may be used as a source address for this >> host on this network; other addresses within 0.0.0.0/8 may be used to >> refer to specified hosts on this network ([RFC1122], Section >> 3.2.1.3). >> >> In environments where DNS is an extra service that requires >> justification and would be an additional service that has to be >> secured, exclusive use of well known IP addresses is both convenient >> and useful, and the 0/8 network seems to have been defined for exactly >> this purpose. I admit the address range isn't in wide use atm, but I >> don't see a reason for it to not be. >> >> The fix Andre made appears to be correct, and IMO, should be merged in >> to -head and MFC'ed. >> >> http://www.secnetix.de/~olli/FreeBSD/svnews/index.py?r=242956 > > I agree, but I want to check how Linux and Windows behave first. 
>
Andre,

On Linux it correctly returns invalid argument, on Winsock it's explicitly invalid[1], and on every network vendor I have tested it on, it is invalid.

Enabling this not only breaks compatibility with *everything* else, but also hasn't been tested, and the ramifications for applications haven't been checked either.

I suggest the user pick an appropriate range from one of the 10 listed as reserved/special, so that we retain the same behaviour as all the other platforms.

[1] http://msdn.microsoft.com/en-gb/library/windows/desktop/ms738586(v=vs.85).aspx (MS has the closest to "proper" behaviour IMO; Linux behaves in a similar fashion but doesn't prevent the user from adding a 0/8 address to an interface, it just doesn't work)

From owner-freebsd-net@FreeBSD.ORG Wed Nov 14 10:14:02 2012 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 83C46E27 for ; Wed, 14 Nov 2012 10:14:02 +0000 (UTC) (envelope-from dhartmei@insomnia.benzedrine.cx) Received: from insomnia.benzedrine.cx (106-30.3-213.fix.bluewin.ch [213.3.30.106]) by mx1.freebsd.org (Postfix) with ESMTP id B91EA8FC12 for ; Wed, 14 Nov 2012 10:13:59 +0000 (UTC) Received: from insomnia.benzedrine.cx (localhost.benzedrine.cx [127.0.0.1]) by insomnia.benzedrine.cx (8.14.1/8.13.4) with ESMTP id qAE9mu39010429 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Wed, 14 Nov 2012 10:48:57 +0100 (MET) Received: (from dhartmei@localhost) by insomnia.benzedrine.cx (8.14.1/8.12.10/Submit) id qAE9muZl020126; Wed, 14 Nov 2012 10:48:56 +0100 (MET) Date: Wed, 14 Nov 2012 10:48:56 +0100 From: Daniel Hartmeier To: Sean Chittenden Subject: Re: 0.0.0.0/8 oddities...
Message-ID: <20121114094856.GA19022@insomnia.benzedrine.cx> References: <50A20359.9080906@networx.ch> <7C614093-6408-49C6-8515-F6C09183453B@chittenden.org> <50A32FE7.2010206@rewt.org.uk> <7BE7E643-FB13-45DE-BA40-257B8ADFAA98@chittenden.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <7BE7E643-FB13-45DE-BA40-257B8ADFAA98@chittenden.org> User-Agent: Mutt/1.5.12-2006-07-14 Cc: freebsd-net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 14 Nov 2012 10:14:02 -0000

On Tue, Nov 13, 2012 at 11:06:04PM -0800, Sean Chittenden wrote:

> Where does it say that it shouldn't be used? Which RFC & §? There are plenty of RFCs and I haven't exhaustively read things, so I reserve the right to be wrong & corrected, but I haven't seen anything that says, "do not use 0.0.0.0/8." 0.0.0.0/32, yes, that's a reserved and special IP address, but the remainder of the /8? It's a stretch to argue that it can't be used.

RFC1122 Section 3.2.1.3 (which RFC5735 references directly)

(a) { 0, 0 }

    This host on this network. MUST NOT be sent, except as a source address as part of an initialization procedure by which the host learns its own IP address.

    See also Section 3.3.6 for a non-standard use of {0,0}.

(b) { 0, <Host-number> }

    Specified host on this network. It MUST NOT be sent, except as a source address as part of an initialization procedure by which the host learns its full IP address.

So a sender MUST NOT use 0.0/16 or 0/8 as destination, ever...
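
As a rough illustration of the two RFC 1122 cases quoted above (this is not code from the thread, and not how the FreeBSD stack implements any check), the rule can be sketched in Python with the standard `ipaddress` module. The function names `classify_rfc1122` and `allowed_as_destination` are made up for this example, and only cases (a) and (b) are modelled; every other special-purpose range is ignored.

```python
import ipaddress

# 0.0.0.0/8 is the "this network" block discussed in RFC 1122 section 3.2.1.3.
THIS_NETWORK = ipaddress.ip_network("0.0.0.0/8")
THIS_HOST = ipaddress.IPv4Address("0.0.0.0")


def classify_rfc1122(addr: str) -> str:
    """Classify an IPv4 address against cases (a) and (b) only (hypothetical helper)."""
    ip = ipaddress.IPv4Address(addr)
    if ip == THIS_HOST:
        return "this-host"       # case (a): { 0, 0 }
    if ip in THIS_NETWORK:
        return "specified-host"  # case (b): { 0, <Host-number> }
    return "other"               # outside 0/8; not covered by this sketch


def allowed_as_destination(addr: str) -> bool:
    """Under the quoted rule, 0/8 addresses may only appear as a source address."""
    return classify_rfc1122(addr) == "other"


if __name__ == "__main__":
    for candidate in ("0.0.0.0", "0.1.2.3", "10.0.0.1"):
        print(candidate, classify_rfc1122(candidate), allowed_as_destination(candidate))
```

Under this reading, anything in 0/8 is rejected as a destination regardless of whether an interface has such an address configured, which is the behaviour the thread is arguing about.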
Daniel From owner-freebsd-net@FreeBSD.ORG Wed Nov 14 10:15:17 2012 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 6BEC5F47; Wed, 14 Nov 2012 10:15:17 +0000 (UTC) (envelope-from lists@rewt.org.uk) Received: from abby.lhr1.as41113.net (unknown [IPv6:2001:b70:201:2::22]) by mx1.freebsd.org (Postfix) with ESMTP id 146A28FC14; Wed, 14 Nov 2012 10:15:16 +0000 (UTC) Received: from [172.16.11.21] (bella.stf.rewt.org.uk [91.208.177.62]) (using TLSv1 with cipher ECDHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by abby.lhr1.as41113.net (Postfix) with ESMTPS id 3Y1hR74cs5z13Mp; Wed, 14 Nov 2012 10:15:15 +0000 (GMT) Message-ID: <50A36F31.3010706@rewt.org.uk> Date: Wed, 14 Nov 2012 10:15:13 +0000 From: Joe Holden User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:16.0) Gecko/20121026 Thunderbird/16.0.2 MIME-Version: 1.0 To: Andre Oppermann Subject: Re: 0.0.0.0/8 oddities... 
References: <50A20359.9080906@networx.ch> <7C614093-6408-49C6-8515-F6C09183453B@chittenden.org> <50A32FE7.2010206@rewt.org.uk> <7BE7E643-FB13-45DE-BA40-257B8ADFAA98@chittenden.org> <50A34675.2020709@rewt.org.uk> <082A52DA-3C04-46B7-A0C6-2F1CD814C01C@chittenden.org> <50A348F8.1050805@rewt.org.uk> <50A365C9.4010902@freebsd.org> <50A36B7B.50106@rewt.org.uk> In-Reply-To: <50A36B7B.50106@rewt.org.uk> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: "freebsd-net@freebsd.org" , Sean Chittenden X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 14 Nov 2012 10:15:17 -0000 On 14/11/2012 09:59, Joe Holden wrote: > On 14/11/2012 09:35, Andre Oppermann wrote: >> On 14.11.2012 08:48, Sean Chittenden wrote: >>>>>> Regardless, why are you trying to do something that is unsupported >>>>>> by pretty much every vendor/operator/os? >>>>> >>>>> Status quo is fine and dandy if it's rational, backed up with a >>>>> justification and can be understood, but I'm not seeing anything >>>>> that suggests there's a good reason which indicates 0/8 shouldn't be >>>>> used or supported. -sc >>>> >>>> It's official registration is for "self identification", "this" >>>> network doesn't mean the connected network. >>>> >>>> All in all, even allowing an address in 0/8 to be configured is a bug >>>> based on both a) the various RFCs and intended use and b) that's how >>>> everyone else accepts that it should work anyway, so RFC is >>>> irrelevant in that case. >>> >>> I think that's incorrect. 127/8 is used for hosts local to a physical >>> server and 0/8 was intended for hosts "local to a network." In my >>> definition, "this network" is data center-local, however there's >>> nothing preventing that IP address range from being rack-local either, >>> etc. 
0.0.0.0/32 is a shortcut for saying "me on this network," which >>> makes sense in the context of the wording in RFC 5735. Again, section >>> 3 paragraph 1: >>> >>> 0.0.0.0/8 - Addresses in this block refer to source hosts on "this" >>> network. Address 0.0.0.0/32 may be used as a source address for >>> this >>> host on this network; other addresses within 0.0.0.0/8 may be >>> used to >>> refer to specified hosts on this network ([RFC1122], Section >>> 3.2.1.3). >>> >>> In environments where DNS is an extra service that requires >>> justification and would be an additional service that has to be >>> secured, exclusive use of well known IP addresses is both convenient >>> and useful, and the 0/8 network seems to have been defined for exactly >>> this purpose. I admit the address range isn't in wide use atm, but I >>> don't see a reason for it to not be. >>> >>> The fix Andre made appears to be correct, and IMO, should be merged in >>> to -head and MFC'ed. >>> >>> http://www.secnetix.de/~olli/FreeBSD/svnews/index.py?r=242956 >> >> I agree, but I want to check how Linux and Windows behave first. >> > Andre, > > On Linux it correctly returns invalid argument, on Winsock its > explicitly invalid[1], on every network vendor I have tested it on, it > is invalid. > > Enabling this not only breaks compatibility with *everything* else, but > also hasn't been tested and the ramifications on applications hasn't > been checked, either. > > Suggest user use an appropriate range from one of the 10 listed as > reserved/special and retain the same behaviour as all the other platforms. 
>
> [1]
> http://msdn.microsoft.com/en-gb/library/windows/desktop/ms738586(v=vs.85).aspx
> (MS has the closest to "proper" behaviour IMO, Linux also behaves in a
> similar fashion however doesn't prevent the user from adding a 0/8
> address to an interface, it just doesn't work)

The other thing to note (which is the whole reason for this thread in the first place) is the incorrect handling of 0.0.0.0 by the stack: trying to send traffic to 0.0.0.0 should *not* end up with traffic going to the default route. At best it should return an error, or at least be an alias for 127.0.0.1, like on Linux.

From owner-freebsd-net@FreeBSD.ORG Wed Nov 14 10:31:16 2012 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id B38CD3C4 for ; Wed, 14 Nov 2012 10:31:16 +0000 (UTC) (envelope-from andre@freebsd.org) Received: from c00l3r.networx.ch (c00l3r.networx.ch [62.48.2.2]) by mx1.freebsd.org (Postfix) with ESMTP id 194AC8FC16 for ; Wed, 14 Nov 2012 10:31:15 +0000 (UTC) Received: (qmail 30477 invoked from network); 14 Nov 2012 12:05:17 -0000 Received: from c00l3r.networx.ch (HELO [127.0.0.1]) ([62.48.2.2]) (envelope-sender ) by c00l3r.networx.ch (qmail-ldap-1.03) with SMTP for ; 14 Nov 2012 12:05:17 -0000 Message-ID: <50A372EC.5080002@freebsd.org> Date: Wed, 14 Nov 2012 11:31:08 +0100 From: Andre Oppermann User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:16.0) Gecko/20121010 Thunderbird/16.0.1 MIME-Version: 1.0 To: freebsd-net@freebsd.org Subject: Looking for bge(4) , bce(4) and igb(4) cards Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-current@freebsd.org, freebsd-stable@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list Reply-To: andre@freebsd.org List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date:
Wed, 14 Nov 2012 10:31:16 -0000

Hello

I am currently working on a number of drivers for popular network cards, extending them with automatic hybrid interrupt/polling ithread processing with livelock prevention (so that the driver can't consume all CPU when under heavy load or attack).

To test this properly I need the actual hardware as PCIe network cards:

 bge(4) Broadcom BCM57xx/BCM590x
 bce(4) Broadcom NetXtreme II (BCM5706/5708/5709/5716)
 igb(4) Intel PRO/1000 i82575, i82576, i82580, i210, i350

If you have one of these and can spare it I'd be very glad if you could send it to me. I'm located in Switzerland/Europe. I can reply to you privately to give you my shipping address.

Of course if you have any other PCIe Gigabit Ethernet cards with a driver in FreeBSD I'm interested in receiving one as well. Of particular interest are:

 em(4) Intel i82571 to i82573
 lem(4) Intel i82540 to i82546
 age(4) Atheros L1 GigE
 ??? anything else 1GigE with PCIe

The same goes for 10 Gigabit Ethernet, but the setup is a bit more involved and I haven't done that yet, but will do soon (the issue being expensive SFP+ optics):

 bxe(4) Broadcom BCM5771x 10GigE
 cxgbe(4) Chelsio T4 10GigE
 ixgbe(4) Intel i82598 and i82599 10GigE
 mxge(4) Myricom Myri10G
 qlxgb(4) QLogic 3200 and 8200 10GigE
 sfxge(4) Solarflare

Many thanks for your support!
-- Andre From owner-freebsd-net@FreeBSD.ORG Wed Nov 14 15:47:44 2012 Return-Path: Delivered-To: freebsd-net@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id AF48F99B; Wed, 14 Nov 2012 15:47:44 +0000 (UTC) (envelope-from glebius@FreeBSD.org) Received: from cell.glebius.int.ru (glebius.int.ru [81.19.64.117]) by mx1.freebsd.org (Postfix) with ESMTP id 08C7F8FC13; Wed, 14 Nov 2012 15:47:43 +0000 (UTC) Received: from cell.glebius.int.ru (localhost [127.0.0.1]) by cell.glebius.int.ru (8.14.5/8.14.5) with ESMTP id qAEFlgnX031731; Wed, 14 Nov 2012 19:47:42 +0400 (MSK) (envelope-from glebius@FreeBSD.org) Received: (from glebius@localhost) by cell.glebius.int.ru (8.14.5/8.14.5/Submit) id qAEFlfYO031730; Wed, 14 Nov 2012 19:47:41 +0400 (MSK) (envelope-from glebius@FreeBSD.org) X-Authentication-Warning: cell.glebius.int.ru: glebius set sender to glebius@FreeBSD.org using -f Date: Wed, 14 Nov 2012 19:47:41 +0400 From: Gleb Smirnoff To: "Alexander V. Chernikov" Subject: Re: [CFT] ipfw SMP-ready dynamic states Message-ID: <20121114154741.GE29772@nginx.com> References: <50A29F57.6090701@yandex-team.ru> MIME-Version: 1.0 Content-Type: text/plain; charset=koi8-r Content-Disposition: inline In-Reply-To: <50A29F57.6090701@yandex-team.ru> User-Agent: Mutt/1.5.21 (2010-09-15) Cc: freebsd-ipfw@FreeBSD.org, Luigi Rizzo , "freebsd-net@freebsd.org" X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 14 Nov 2012 15:47:44 -0000 On Tue, Nov 13, 2012 at 11:28:23PM +0400, Alexander V. Chernikov wrote: A> So, we can do the following: A> 1) lock increments/decrements via some separate mutex A> 2) do nothing A> 3) take some combined approach: 4) Take it via uma_zone_getcur(ipfw_dyn_rule_zone); -- Totus tuus, Glebius. 
From owner-freebsd-net@FreeBSD.ORG Wed Nov 14 17:53:28 2012 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id AC86FB5D for ; Wed, 14 Nov 2012 17:53:28 +0000 (UTC) (envelope-from dustinwenz@ebureau.com) Received: from internet02.ebureau.com (internet02.tru-signal.biz [65.127.24.21]) by mx1.freebsd.org (Postfix) with ESMTP id 6AAC98FC15 for ; Wed, 14 Nov 2012 17:53:27 +0000 (UTC) Received: from service02.office.ebureau.com (internet06.ebureau.com [65.127.24.25]) by internet02.ebureau.com (Postfix) with ESMTP id 3B55AE0EFCA; Wed, 14 Nov 2012 11:53:27 -0600 (CST) Received: from localhost (localhost [127.0.0.1]) by service02.office.ebureau.com (Postfix) with ESMTP id 3659EE1B54C; Wed, 14 Nov 2012 11:53:27 -0600 (CST) X-Virus-Scanned: amavisd-new at ebureau.com Received: from service02.office.ebureau.com ([127.0.0.1]) by localhost (internet06.ebureau.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id AejPERn3s0s1; Wed, 14 Nov 2012 11:53:25 -0600 (CST) Received: from square.office.ebureau.com (square.office.iscompanies.com [10.10.20.22]) by service02.office.ebureau.com (Postfix) with ESMTPSA id A43C8E1B538; Wed, 14 Nov 2012 11:53:25 -0600 (CST) Content-Type: text/plain; charset=us-ascii Mime-Version: 1.0 (Mac OS X Mail 6.1 \(1498\)) Subject: Re: Default ephemeral port range From: Dustin Wenz In-Reply-To: <50A338FB.9060602@gont.com.ar> Date: Wed, 14 Nov 2012 11:53:25 -0600 Content-Transfer-Encoding: quoted-printable Message-Id: References: <87A2D317-77BA-4641-979D-0AE43247D99E@ebureau.com> <50A338FB.9060602@gont.com.ar> To: Fernando Gont X-Mailer: Apple Mail (2.1498) Cc: freebsd-net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 14 Nov 2012 17:53:28 -0000 On Nov 14, 2012, at 
12:23 AM, Fernando Gont wrote: > On 11/12/2012 02:57 PM, Dustin Wenz wrote: >> I'm trying to determine why the default ephemeral port range appears >> to be 10000 through 65535 in at least 8.1 through 9.1RC. > > I had produced the patch that extended the ephemeral port range in > FreeBSD. My original patch extended the ephemeral port range to > 1024-65535. However, it was noted that X uses ports in the range > 1024-10000, and hence it was better to exclude that port range from the > ephemeral port range. > > >> The IANA recommends the range be 49152 through 65535 >> (http://tools.ietf.org/html/rfc6056). > > IANA *used* to recommend that range. In RFC 6056 we recommend > implementations to use the largest possible port range -- ideally > 1024-65535. > Ah; that clarifies things quite a bit. There seems to be a lot of incorrect/outdated information online about this. The suggestion from Eugene is also useful. I should be able to use setsockopt() with IP_PORTRANGE_HIGH if I wanted to use only the high range. I probably don't want to do that in most cases, but it's good to understand what the differences are. Thanks for the help!
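
To put the ranges discussed in this exchange side by side, here is a small Python sketch that counts how many ports each one offers. It is not from the thread: the names `RANGES` and `port_count` are made up for illustration, and the labels are only shorthand for the ranges quoted above.

```python
# Inclusive port ranges mentioned in this thread.
# 65535 is the highest valid TCP/UDP port number.
RANGES = {
    "former IANA recommendation": (49152, 65535),
    "FreeBSD default (8.1 through 9.1RC)": (10000, 65535),
    "largest range suggested by RFC 6056": (1024, 65535),
}


def port_count(first, last):
    """Return the number of ports in the inclusive range [first, last]."""
    return last - first + 1


for name, (first, last) in RANGES.items():
    print(f"{name}: {first}-{last} -> {port_count(first, last)} ports")
```

The wider the range, the more candidate source ports there are to pick from, which is the reason given above for preferring the largest practical range.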
- .Dustin From owner-freebsd-net@FreeBSD.ORG Wed Nov 14 20:03:36 2012 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id EA416213 for ; Wed, 14 Nov 2012 20:03:36 +0000 (UTC) (envelope-from rejithomas.d@gmail.com) Received: from mail-vb0-f54.google.com (mail-vb0-f54.google.com [209.85.212.54]) by mx1.freebsd.org (Postfix) with ESMTP id 9930A8FC12 for ; Wed, 14 Nov 2012 20:03:36 +0000 (UTC) Received: by mail-vb0-f54.google.com with SMTP id l1so1096472vba.13 for ; Wed, 14 Nov 2012 12:03:35 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:date:message-id:subject:from:to:content-type; bh=wGF8ogYp0Azt1hMXq8RzrkrO08Avbwwn2AcJNbTz3Y8=; b=KqtQr0LWdUbPsil7xtQ4E2OB1rQZAK69nFlt6D5S7veMj/pb5WqcNQlvzN8tIDRTgX fIywWhLKu4nnMq+5GRRPI/apMnU1R+tyqn3vU051meKu5xKJsKq88G3EyE+UqB0h3cup iHwqq2myiUUnZoN8DPIui/rAnqqi2Qnch5MJjsgLUgQ4L2t2xt7gv/EI54KMi9Sc6bio Nz/lPT9r+XGll8GebvLD62Jeej9Iw2u3OcFhTSFEM+QtfWSIMNcxjamMqWkpOb2u9mjL UgNmgCKfRtCV/LeYcyLu/yhuPmdz5zpqva2Bmj1MVdb4fdD3ZLgF6H2/AH3cB3RLG+Xt LAxQ== MIME-Version: 1.0 Received: by 10.220.227.70 with SMTP id iz6mr12713300vcb.45.1352923415604; Wed, 14 Nov 2012 12:03:35 -0800 (PST) Received: by 10.58.144.196 with HTTP; Wed, 14 Nov 2012 12:03:35 -0800 (PST) Date: Thu, 15 Nov 2012 01:33:35 +0530 Message-ID: Subject: Help wrt LOR in icmp6_rip6_input From: Reji Thomas To: freebsd-net@freebsd.org Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.14 X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 14 Nov 2012 20:03:37 -0000 Hi, This is regarding a lock order reversal which is already reported in http://ipv4.sources.zabbadoz.net/freebsd/lor/134.html. 
Pasting the witness backtrace here:

lock order reversal
 1st 0xc1787144 inp (raw6inp) @ sys/netinet6/icmp6.c:1895
 2nd 0xc1788090 inp (rawinp) @ sys/netinet6/icmp6.c:1895
KDB: stack backtrace:
kdb_backtrace(c07dcab1,c1788090,c07f043f,c07e928d,c07ec854) at kdb_backtrace+0x2e
witness_checkorder(c1788090,9,c07ec854,767,12b) at witness_checkorder+0x6c3
_mtx_lock_flags(c1788090,0,c07ec854,767,c25d5658) at _mtx_lock_flags+0x8a
icmp6_rip6_input(cc9fcbec,28,38,1,0) at icmp6_rip6_input+0xb6
icmp6_input(cc9fcc94,cc9fcc34,3a,0,0) at icmp6_input+0xdd4
ip6_input(c25d5600,0,c07e3c86,e8,c08cf944) at ip6_input+0xee7

Is this a valid issue or a spurious one? It seems to be flagged when the icmp6_rip6_input code traverses the pcb list, which contains both v4 and v6 pcbs. How is the witness lock order established, other than the static one? Is it established by the first locking order encountered (where the LOP_NEWORDER flag is passed)?

Regards
Reji
From owner-freebsd-net@FreeBSD.ORG Wed Nov 14 21:13:53 2012 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 1C945EC3 for ; Wed, 14 Nov 2012 21:13:53 +0000 (UTC) (envelope-from postnet@dragas.dyndns.org) Received: from mail.dragas.org (unknown [IPv6:2001:41d0:2:ca45::3]) by mx1.freebsd.org (Postfix) with ESMTP id A8F108FC0C for ; Wed, 14 Nov 2012 21:13:52 +0000 (UTC) Received: from localhost (localhost.localdomain [127.0.0.1]) by mail.dragas.org (Postfix) with ESMTP id C7CB6E4408 for ; Wed, 14 Nov 2012 22:13:51 +0100 (CET) Received: from mail.dragas.org ([127.0.0.1]) by localhost (dragas.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 7v-1J6C0WpUo for ; Wed, 14 Nov 2012 22:13:46 +0100 (CET) Received: from dragasmbp.lan (unknown [87.238.26.165]) (using TLSv1 with cipher AES128-SHA (128/128 bits)) (No client certificate requested) by mail.dragas.org (Postfix) with ESMTPSA id 77CA2E449D for ; Wed, 14 Nov 2012 22:13:46 +0100 (CET) 
From: Stefano Marinelli Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: quoted-printable Subject: "Weighted round robin" for LAGG - anyhow? Message-Id: <646662D0-E66C-4168-B351-482B06ACF26A@dragas.dyndns.org> Date: Wed, 14 Nov 2012 22:13:45 +0100 To: freebsd-net@freebsd.org Mime-Version: 1.0 (Mac OS X Mail 6.2 \(1499\)) X-Mailer: Apple Mail (2.1499) X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 14 Nov 2012 21:13:53 -0000 Hello everybody, I've been running some experiments to improve my ADSL speed. The idea is to bond two ADSL lines: create two OpenVPN TAP channels connected to a remote (well-connected) server and do round-robin LAGG aggregation on both nodes. The remote end will NAT. This works but, by design of the round-robin configuration, I only get twice the speed of the slowest link. "loadbalance" mode is much slower. The first ADSL line runs at roughly 2.2 Mbit/sec, the second at about 1.4 Mbit/sec, so the gain is minimal. It's useful only for fault tolerance. I thought about doing some sort of "weighted round-robin", giving a weight to each of the two tap interfaces. Something like "send 2 packets to the first one, then one to the second one". I found some patches [1] for Linux (I'm compiling a kernel right now), but no information about FreeBSD. My tests show that FreeBSD has better bonding throughput (on Linux I get just a bit more than 2.3 Mbit/sec; on FreeBSD I can almost reach the theoretical 2.8 Mbit/sec). Has anyone ever tried something like that?
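
The "send 2 packets to the first one, then one to the second one" idea can be sketched in a few lines of Python. This is only an illustration, not if_lagg code and not the Linux patch: the interface names `tap0` and `tap1`, the 2:1 weights, and both function names are assumptions, and the throughput estimate assumes equal-sized packets and a scheduler that never skips a busy link, so the most loaded link relative to its speed sets the pace.

```python
from itertools import cycle, islice


def weighted_round_robin(weights):
    """Return an endless iterator over port names, each repeated
    `weight` times per cycle. Weights must be positive integers."""
    schedule = [port for port, weight in weights.items() for _ in range(weight)]
    if not schedule:
        raise ValueError("at least one port needs a positive weight")
    return cycle(schedule)


def aggregate_throughput(speeds, weights):
    """Idealized aggregate rate: one full cycle sends sum(weights) packets
    and takes as long as the slowest link needs for its own share."""
    total = sum(weights.values())
    return total * min(speeds[port] / weights[port] for port in weights)


# Hypothetical setup: tap0 rides the ~2.2 Mbit/s line, tap1 the ~1.4 Mbit/s line.
speeds = {"tap0": 2.2, "tap1": 1.4}

print(list(islice(weighted_round_robin({"tap0": 2, "tap1": 1}), 6)))
print(round(aggregate_throughput(speeds, {"tap0": 1, "tap1": 1}), 2))  # plain round-robin
print(round(aggregate_throughput(speeds, {"tap0": 2, "tap1": 1}), 2))  # 2:1 weighting
```

Under this simplified model plain round-robin is capped at twice the slower link (2.8 Mbit/sec), while a 2:1 split would be capped by the faster link at about 3.3 Mbit/sec; real results would depend on packet sizes, reordering, and TCP behaviour.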
I tried to have a look at the if_lagg source code, but I think it'd be quite difficult for me to do something like that, even trying to adapt those Linux patches. [1] http://sourceforge.net/projects/bonding/forums/forum/77912/topic/2048022 Thank you, Stefano From owner-freebsd-net@FreeBSD.ORG Wed Nov 14 22:45:02 2012 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id BA2F4A26 for ; Wed, 14 Nov 2012 22:45:02 +0000 (UTC) (envelope-from julian@freebsd.org) Received: from vps1.elischer.org (vps1.elischer.org [204.109.63.16]) by mx1.freebsd.org (Postfix) with ESMTP id 6D50F8FC12 for ; Wed, 14 Nov 2012 22:45:02 +0000 (UTC) Received: from JRE-MBP-2.local (c-50-143-149-146.hsd1.ca.comcast.net [50.143.149.146]) (authenticated bits=0) by vps1.elischer.org (8.14.5/8.14.5) with ESMTP id qAEMivWD006971 (version=TLSv1/SSLv3 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO); Wed, 14 Nov 2012 14:44:58 -0800 (PST) (envelope-from julian@freebsd.org) Message-ID: <50A41EE4.5060006@freebsd.org> Date: Wed, 14 Nov 2012 14:44:52 -0800 From: Julian Elischer User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:16.0) Gecko/20121026 Thunderbird/16.0.2 MIME-Version: 1.0 To: Stefano Marinelli Subject: Re: "Weighted round robin" for LAGG - anyhow? 
References: <646662D0-E66C-4168-B351-482B06ACF26A@dragas.dyndns.org> In-Reply-To: <646662D0-E66C-4168-B351-482B06ACF26A@dragas.dyndns.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 14 Nov 2012 22:45:02 -0000 On 11/14/12 1:13 PM, Stefano Marinelli wrote: > Hello everybody, > I've been trying to do some experiments to improve my ADSL speed. The idea is to bond two ADSLs, create two OpenVPN TAP channels connected to a remote (fast connected) server and do a round-robin LAGG aggregation on both nodes. The remote end will NAT. The operation is successful but, as by design of the R-R configuration, I just get the slowest link speed * 2. "loadbalance" mode is way slower. > The first ADSL is more or less 2.2 Mbit/sec, the second one about 1.4 MBit/sec so the gain is minimal. It's useful just for fault-tolerance. > I thought about doing some sort of "weighted round-robin", giving a weight to the two tap interfaces. Something like "send 2 packets to the first one, then one to the second one". > I found some patches [1] for Linux (I'm compiling a kernel right now), but no information about FreeBSD. My tests show that FreeBSD has a better bonding throughput (in Linux, I just get a bit more than 2.3 Mbit/sec, on FreeBSD I can almost get the theoretical 2.8Mbit/sec). > Has anyone ever tried something like that? I tried to have a look at the if_lagg source code, but I think it'd be quite difficult for me to do something like that, even trying to adapt those Linux patches. 
> > [1] http://sourceforge.net/projects/bonding/forums/forum/77912/topic/2048022 > > Thank you, > Stefano > You can use mpd (or ppp) in multilink mode to encapsulate your outgoing links over two different TCP paths to your other server, where you can undo it. mpd (and ppp, I think) will allow you to select among a couple of different multiplexing schemes, including one where each packet is cut up and sent in pieces, so that the slower link would be sending 1/3 of the packet and the faster link would be sending 2/3 of the packet. From owner-freebsd-net@FreeBSD.ORG Thu Nov 15 06:18:33 2012 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id F1239B82; Thu, 15 Nov 2012 06:18:32 +0000 (UTC) (envelope-from adrian.chadd@gmail.com) Received: from mail-ob0-f182.google.com (mail-ob0-f182.google.com [209.85.214.182]) by mx1.freebsd.org (Postfix) with ESMTP id 888E38FC12; Thu, 15 Nov 2012 06:18:32 +0000 (UTC) Received: by mail-ob0-f182.google.com with SMTP id 16so1656611obc.13 for ; Wed, 14 Nov 2012 22:18:32 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type; bh=4CcpiinliKUIH0pZHxGMUjpYQKH32I9qfbto6cmhCeA=; b=JtJ+RMLddkv4ck1uGfr99lxlTIFaGFa+5tzXMyPTQvQlmjCbnqsYl0u1cyMVpkCios EkoJIYqt/50HSwzhDbhoDrQKnzcY+G1VWlNiPAZP0z3GQ8Gi46HKnzXHtWu6rqEnnA1z /qRCThNafKAIIXr+4P4uDWrI4K9mE5cpr3TrucLLaodpyRkExRm9SubgySP2tn7Er9+/ Ul+ncDroaxwZb+gCcS1bmcUoQJE6+FKsJ6uIWgVfI95BMzEPGmLZpC2hkOEBXcIOcjdh ngnuYT146O7ukr5kHmmjS50SWulG8QdimXqCazHAJWF6PRjxIiTciV9I+6DWWltJlPL8 hLFA== MIME-Version: 1.0 Received: by 10.60.11.197 with SMTP id 
s5mr104246oeb.29.1352960312126; Wed, 14 Nov 2012 22:18:32 -0800 (PST) Sender: adrian.chadd@gmail.com Received: by 10.76.27.65 with HTTP; Wed, 14 Nov 2012 22:18:31 -0800 (PST) In-Reply-To: <201210291115.23845.zec@fer.hr> References: <201210291115.23845.zec@fer.hr> Date: Wed, 14 Nov 2012 22:18:31 -0800 X-Google-Sender-Auth: sJMFxyn6ZdMsU2s-vEvBpK8LauI Message-ID: Subject: Re: VIMAGE crashes on 9.x with hotplug net80211 devices From: Adrian Chadd To: Marko Zec Content-Type: multipart/mixed; boundary=e89a8fb202eaccab6a04ce829ea2 Cc: freebsd-net@freebsd.org, Hans Petter Selasky , freebsd-hackers@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 15 Nov 2012 06:18:33 -0000 --e89a8fb202eaccab6a04ce829ea2 Content-Type: text/plain; charset=ISO-8859-1 Hi, Here's what I have thus far. Please ignore the device_printf() change. This works for me, both for hotplug cardbus wireless devices as well as (inadvertently!) a USB bluetooth device. What do you think? 
Adrian --e89a8fb202eaccab6a04ce829ea2 Content-Type: application/octet-stream; name="20121114-vimage-1.diff" Content-Disposition: attachment; filename="20121114-vimage-1.diff" Content-Transfer-Encoding: base64 X-Attachment-Id: f_h9jhpwt20 SW5kZXg6IHN5cy9rZXJuL3N1YnJfYnVzLmMKPT09PT09PT09PT09PT09PT09PT09PT09PT09PT09 PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PQotLS0gc3lzL2tlcm4vc3Vicl9i dXMuYwkocmV2aXNpb24gMjQxNDkxKQorKysgc3lzL2tlcm4vc3Vicl9idXMuYwkod29ya2luZyBj b3B5KQpAQCAtMzksNiArMzksNyBAQAogI2luY2x1ZGUgPHN5cy9tYWxsb2MuaD4KICNpbmNsdWRl IDxzeXMvbW9kdWxlLmg+CiAjaW5jbHVkZSA8c3lzL211dGV4Lmg+CisjaW5jbHVkZSA8c3lzL3Bj cHUuaD4KICNpbmNsdWRlIDxzeXMvcG9sbC5oPgogI2luY2x1ZGUgPHN5cy9wcm9jLmg+CiAjaW5j bHVkZSA8c3lzL2NvbmR2YXIuaD4KQEAgLTUzLDYgKzU0LDggQEAKICNpbmNsdWRlIDxzeXMvYnVz Lmg+CiAjaW5jbHVkZSA8c3lzL2ludGVycnVwdC5oPgogCisjaW5jbHVkZSA8bmV0L3ZuZXQuaD4K KwogI2luY2x1ZGUgPG1hY2hpbmUvc3RkYXJnLmg+CiAKICNpbmNsdWRlIDx2bS91bWEuaD4KQEAg LTIzMTAsNiArMjMxMyw3IEBACiAJdmFfbGlzdCBhcDsKIAlpbnQgcmV0dmFsOwogCisJcHJpbnRm KCJbJWxsZF0gIiwgKGxvbmcgbG9uZyBpbnQpIGN1cnRocmVhZC0+dGRfdGlkKTsKIAlyZXR2YWwg PSBkZXZpY2VfcHJpbnRfcHJldHR5bmFtZShkZXYpOwogCXZhX3N0YXJ0KGFwLCBmbXQpOwogCXJl dHZhbCArPSB2cHJpbnRmKGZtdCwgYXApOwpAQCAtMjcxNiw3ICsyNzIwLDcgQEAKIGludAogZGV2 aWNlX3Byb2JlX2FuZF9hdHRhY2goZGV2aWNlX3QgZGV2KQogewotCWludCBlcnJvcjsKKwlpbnQg ZXJyb3IsIGlzX2RlZmF1bHRfdm5ldDsKIAogCUdJQU5UX1JFUVVJUkVEOwogCkBAIC0yNzI1LDcg KzI3MjksMTggQEAKIAkJcmV0dXJuICgwKTsKIAllbHNlIGlmIChlcnJvciAhPSAwKQogCQlyZXR1 cm4gKGVycm9yKTsKLQlyZXR1cm4gKGRldmljZV9hdHRhY2goZGV2KSk7CisKKwkvKgorCSAqIE9u bHkgc2V0IHRoZSBkZWZhdWx0IHZuZXQgdG8gdm5ldDAgaWYgdGhlIGN1cnJlbnQKKwkgKiB2bmV0 IGlzbid0IHZuZXQwLgorCSAqLworCWlzX2RlZmF1bHRfdm5ldCA9ICEhIElTX0RFRkFVTFRfVk5F VChjdXJ2bmV0KTsKKwlpZiAoISBpc19kZWZhdWx0X3ZuZXQpCisJCUNVUlZORVRfU0VUX1FVSUVU KHZuZXQwKTsKKwllcnJvciA9IGRldmljZV9hdHRhY2goZGV2KTsKKwlpZiAoISBpc19kZWZhdWx0 X3ZuZXQpCisJCUNVUlZORVRfUkVTVE9SRSgpOworCXJldHVybiBlcnJvcjsKIH0KIAogLyoqCklu 
ZGV4OiBzeXMva2Vybi9zdWJyX3dpdG5lc3MuYwo9PT09PT09PT09PT09PT09PT09PT09PT09PT09 PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09Ci0tLSBzeXMva2Vybi9zdWJy X3dpdG5lc3MuYwkocmV2aXNpb24gMjQxNDkxKQorKysgc3lzL2tlcm4vc3Vicl93aXRuZXNzLmMJ KHdvcmtpbmcgY29weSkKQEAgLTMxOSw2ICszMTksMjggQEAKIAlyZXR1cm4gKGEtPmZyb20gPT0g Yi0+ZnJvbSAmJiBhLT50byA9PSBiLT50byk7CiB9CiAKKy8qCisgKiBXaGV0aGVyIHRvIHBhbmlj IG9yIG5vdCB3aGVuIGEgd2l0bmVzcyBjb25kaXRpb24gb2NjdXJzLgorICovCitzdGF0aWMgaW50 IHdpdG5lc3NfZG9wYW5pYyA9IDE7CisKKy8qCisgKiBIYW5kbGUgd2hldGhlciB0byBwYW5pYyBv ciBtZXJlbHkgcHJpbnQgYW4gaW5mb3JtYXRpdmUgbWVzc2FnZS4KKyAqLworc3RhdGljIHZvaWQK K3dpdG5lc3NfcGFuaWMoY29uc3QgY2hhciAqZm10LCAuLi4pCit7CisJdmFfbGlzdCBhcDsKKwor CXZhX3N0YXJ0KGFwLCBmbXQpOworCXZwcmludGYoZm10LCBhcCk7CisJdmFfZW5kKGFwKTsKKwor CS8qIFhYWCBpdCdkIGJlIG5pY2UgdG8gbWFpbnRhaW4gdGhlIGNvcnJlY3QgcGFuaWNzdHIgKi8K KwlpZiAod2l0bmVzc19kb3BhbmljKQorCQlwYW5pYygid2l0bmVzc1xuIik7Cit9CisKIHN0YXRp YyBpbnQJX2lzaXRteXgoc3RydWN0IHdpdG5lc3MgKncxLCBzdHJ1Y3Qgd2l0bmVzcyAqdzIsIGlu dCBybWFzaywKIAkJICAgIGNvbnN0IGNoYXIgKmZuYW1lKTsKICNpZmRlZiBLREIKQEAgLTQwNSw2 ICs0MjcsOSBAQAogVFVOQUJMRV9JTlQoImRlYnVnLndpdG5lc3Mua2RiIiwgJndpdG5lc3Nfa2Ri KTsKIFNZU0NUTF9JTlQoX2RlYnVnX3dpdG5lc3MsIE9JRF9BVVRPLCBrZGIsIENUTEZMQUdfUlcs ICZ3aXRuZXNzX2tkYiwgMCwgIiIpOwogCitUVU5BQkxFX0lOVCgiZGVidWcud2l0bmVzcy5wYW5p YyIsICZ3aXRuZXNzX2RvcGFuaWMpOworU1lTQ1RMX0lOVChfZGVidWdfd2l0bmVzcywgT0lEX0FV VE8sIHBhbmljLCBDVExGTEFHX1JXLCAmd2l0bmVzc19kb3BhbmljLCAwLCAiIik7CisKIC8qCiAg KiBXaGVuIEtEQiBpcyBlbmFibGVkIGFuZCB3aXRuZXNzX3RyYWNlIGlzIDEsIGl0IHdpbGwgY2F1 c2UgdGhlIHN5c3RlbQogICogdG8gcHJpbnQgYSBzdGFjayB0cmFjZToKQEAgLTcyMiwxOCArNzQ3 LDYgQEAKICAqLwogc3RhdGljIGludCB3aXRuZXNzX3NwaW5fd2FybiA9IDA7CiAKLS8qIFRyaW0g dXNlbGVzcyBnYXJiYWdlIGZyb20gZmlsZW5hbWVzLiAqLwotc3RhdGljIGNvbnN0IGNoYXIgKgot Zml4dXBfZmlsZW5hbWUoY29uc3QgY2hhciAqZmlsZSkKLXsKLQotCWlmIChmaWxlID09IE5VTEwp Ci0JCXJldHVybiAoTlVMTCk7Ci0Jd2hpbGUgKHN0cm5jbXAoZmlsZSwgIi4uLyIsIDMpID09IDAp 
Ci0JCWZpbGUgKz0gMzsKLQlyZXR1cm4gKGZpbGUpOwotfQotCiAvKgogICogVGhlIFdJVE5FU1Mt ZW5hYmxlZCBkaWFnbm9zdGljIGNvZGUuICBOb3RlIHRoYXQgdGhlIHdpdG5lc3MgY29kZSBkb2Vz CiAgKiBhc3N1bWUgdGhhdCB0aGUgZWFybHkgYm9vdCBpcyBzaW5nbGUtdGhyZWFkZWQgYXQgbGVh c3QgdW50aWwgYWZ0ZXIgdGhpcwpAQCAtODI0LDE1ICs4MzcsMTUgQEAKIAljbGFzcyA9IExPQ0tf Q0xBU1MobG9jayk7CiAJaWYgKChsb2NrLT5sb19mbGFncyAmIExPX1JFQ1VSU0FCTEUpICE9IDAg JiYKIAkgICAgKGNsYXNzLT5sY19mbGFncyAmIExDX1JFQ1VSU0FCTEUpID09IDApCi0JCXBhbmlj KCIlczogbG9jayAoJXMpICVzIGNhbiBub3QgYmUgcmVjdXJzYWJsZSIsIF9fZnVuY19fLAorCQl3 aXRuZXNzX3BhbmljKCIlczogbG9jayAoJXMpICVzIGNhbiBub3QgYmUgcmVjdXJzYWJsZSIsIF9f ZnVuY19fLAogCQkgICAgY2xhc3MtPmxjX25hbWUsIGxvY2stPmxvX25hbWUpOwogCWlmICgobG9j ay0+bG9fZmxhZ3MgJiBMT19TTEVFUEFCTEUpICE9IDAgJiYKIAkgICAgKGNsYXNzLT5sY19mbGFn cyAmIExDX1NMRUVQQUJMRSkgPT0gMCkKLQkJcGFuaWMoIiVzOiBsb2NrICglcykgJXMgY2FuIG5v dCBiZSBzbGVlcGFibGUiLCBfX2Z1bmNfXywKKwkJd2l0bmVzc19wYW5pYygiJXM6IGxvY2sgKCVz KSAlcyBjYW4gbm90IGJlIHNsZWVwYWJsZSIsIF9fZnVuY19fLAogCQkgICAgY2xhc3MtPmxjX25h bWUsIGxvY2stPmxvX25hbWUpOwogCWlmICgobG9jay0+bG9fZmxhZ3MgJiBMT19VUEdSQURBQkxF KSAhPSAwICYmCiAJICAgIChjbGFzcy0+bGNfZmxhZ3MgJiBMQ19VUEdSQURBQkxFKSA9PSAwKQot CQlwYW5pYygiJXM6IGxvY2sgKCVzKSAlcyBjYW4gbm90IGJlIHVwZ3JhZGFibGUiLCBfX2Z1bmNf XywKKwkJd2l0bmVzc19wYW5pYygiJXM6IGxvY2sgKCVzKSAlcyBjYW4gbm90IGJlIHVwZ3JhZGFi bGUiLCBfX2Z1bmNfXywKIAkJICAgIGNsYXNzLT5sY19uYW1lLCBsb2NrLT5sb19uYW1lKTsKIAog CS8qCkBAIC04NDksNyArODYyLDcgQEAKIAkJcGVuZGluZ19sb2Nrc1twZW5kaW5nX2NudF0ud2hf bG9jayA9IGxvY2s7CiAJCXBlbmRpbmdfbG9ja3NbcGVuZGluZ19jbnQrK10ud2hfdHlwZSA9IHR5 cGU7CiAJCWlmIChwZW5kaW5nX2NudCA+IFdJVE5FU1NfUEVORExJU1QpCi0JCQlwYW5pYygiJXM6 IHBlbmRpbmcgbG9ja3MgbGlzdCBpcyB0b28gc21hbGwsIGJ1bXAgaXRcbiIsCisJCQl3aXRuZXNz X3BhbmljKCIlczogcGVuZGluZyBsb2NrcyBsaXN0IGlzIHRvbyBzbWFsbCwgYnVtcCBpdFxuIiwK IAkJCSAgICBfX2Z1bmNfXyk7CiAJfSBlbHNlCiAJCWxvY2stPmxvX3dpdG5lc3MgPSBlbnJvbGwo dHlwZSwgY2xhc3MpOwpAQCAtODY0LDcgKzg3Nyw3IEBACiAJY2xhc3MgPSBMT0NLX0NMQVNTKGxv 
Y2spOwogCiAJaWYgKHdpdG5lc3NfY29sZCkKLQkJcGFuaWMoImxvY2sgKCVzKSAlcyBkZXN0cm95 ZWQgd2hpbGUgd2l0bmVzc19jb2xkIiwKKwkJd2l0bmVzc19wYW5pYygibG9jayAoJXMpICVzIGRl c3Ryb3llZCB3aGlsZSB3aXRuZXNzX2NvbGQiLAogCQkgICAgY2xhc3MtPmxjX25hbWUsIGxvY2st PmxvX25hbWUpOwogCiAJLyogWFhYOiBuZWVkIHRvIHZlcmlmeSB0aGF0IG5vIG9uZSBob2xkcyB0 aGUgbG9jayAqLwpAQCAtOTM5LDcgKzk1Miw3IEBACiAgCX0KICAJdy0+d19kaXNwbGF5ZWQgPSAx OwogCWlmICh3LT53X2ZpbGUgIT0gTlVMTCAmJiB3LT53X2xpbmUgIT0gMCkKLQkJcHJudCgiIC0t IGxhc3QgYWNxdWlyZWQgQCAlczolZFxuIiwgZml4dXBfZmlsZW5hbWUody0+d19maWxlKSwKKwkJ cHJudCgiIC0tIGxhc3QgYWNxdWlyZWQgQCAlczolZFxuIiwgdy0+d19maWxlLAogCQkgICAgdy0+ d19saW5lKTsKIAllbHNlCiAJCXBybnQoIiAtLSBuZXZlciBhY3F1aXJlZFxuIik7CkBAIC0xMDE1 LDYgKzEwMjgsMTggQEAKIH0KICNlbmRpZiAvKiBEREIgKi8KIAorLyogVHJpbSB1c2VsZXNzIGdh cmJhZ2UgZnJvbSBmaWxlbmFtZXMuICovCitzdGF0aWMgY29uc3QgY2hhciAqCitmaXh1cF9maWxl bmFtZShjb25zdCBjaGFyICpmaWxlKQoreworCisJaWYgKGZpbGUgPT0gTlVMTCkKKwkJcmV0dXJu IChOVUxMKTsKKwl3aGlsZSAoc3RybmNtcChmaWxlLCAiLi4vIiwgMykgPT0gMCkKKwkJZmlsZSAr PSAzOworCXJldHVybiAoZmlsZSk7Cit9CisKIGludAogd2l0bmVzc19kZWZpbmVvcmRlcihzdHJ1 Y3QgbG9ja19vYmplY3QgKmxvY2sxLCBzdHJ1Y3QgbG9ja19vYmplY3QgKmxvY2syKQogewpAQCAt MTA3NSw5ICsxMTAwLDggQEAKIAkJICogYWxsIHNwaW4gbG9ja3MuCiAJCSAqLwogCQlpZiAodGQt PnRkX2NyaXRuZXN0ICE9IDAgJiYgIWtkYl9hY3RpdmUpCi0JCQlwYW5pYygiYmxvY2thYmxlIHNs ZWVwIGxvY2sgKCVzKSAlcyBAICVzOiVkIiwKLQkJCSAgICBjbGFzcy0+bGNfbmFtZSwgbG9jay0+ bG9fbmFtZSwKLQkJCSAgICBmaXh1cF9maWxlbmFtZShmaWxlKSwgbGluZSk7CisJCQl3aXRuZXNz X3BhbmljKCJibG9ja2FibGUgc2xlZXAgbG9jayAoJXMpICVzIEAgJXM6JWQiLAorCQkJICAgIGNs YXNzLT5sY19uYW1lLCBsb2NrLT5sb19uYW1lLCBmaXh1cF9maWxlbmFtZShmaWxlKSwgbGluZSk7 CiAKIAkJLyoKIAkJICogSWYgdGhpcyBpcyB0aGUgZmlyc3QgbG9jayBhY3F1aXJlZCB0aGVuIGp1 c3QgcmV0dXJuIGFzCkBAIC0xMTE1LDIwICsxMTM5LDE4IEBACiAJCWlmICgobG9jazEtPmxpX2Zs YWdzICYgTElfRVhDTFVTSVZFKSAhPSAwICYmCiAJCSAgICAoZmxhZ3MgJiBMT1BfRVhDTFVTSVZF KSA9PSAwKSB7CiAJCQlwcmludGYoInNoYXJlZCBsb2NrIG9mICglcykgJXMgQCAlczolZFxuIiwK 
LQkJCSAgICBjbGFzcy0+bGNfbmFtZSwgbG9jay0+bG9fbmFtZSwKLQkJCSAgICBmaXh1cF9maWxl bmFtZShmaWxlKSwgbGluZSk7CisJCQkgICAgY2xhc3MtPmxjX25hbWUsIGxvY2stPmxvX25hbWUs IGZpeHVwX2ZpbGVuYW1lKGZpbGUpLCBsaW5lKTsKIAkJCXByaW50Zigid2hpbGUgZXhjbHVzaXZl bHkgbG9ja2VkIGZyb20gJXM6JWRcbiIsCiAJCQkgICAgZml4dXBfZmlsZW5hbWUobG9jazEtPmxp X2ZpbGUpLCBsb2NrMS0+bGlfbGluZSk7Ci0JCQlwYW5pYygic2hhcmUtPmV4Y2wiKTsKKwkJCXdp dG5lc3NfcGFuaWMoInNoYXJlLT5leGNsIik7CiAJCX0KIAkJaWYgKChsb2NrMS0+bGlfZmxhZ3Mg JiBMSV9FWENMVVNJVkUpID09IDAgJiYKIAkJICAgIChmbGFncyAmIExPUF9FWENMVVNJVkUpICE9 IDApIHsKIAkJCXByaW50ZigiZXhjbHVzaXZlIGxvY2sgb2YgKCVzKSAlcyBAICVzOiVkXG4iLAot CQkJICAgIGNsYXNzLT5sY19uYW1lLCBsb2NrLT5sb19uYW1lLAotCQkJICAgIGZpeHVwX2ZpbGVu YW1lKGZpbGUpLCBsaW5lKTsKKwkJCSAgICBjbGFzcy0+bGNfbmFtZSwgbG9jay0+bG9fbmFtZSwg Zml4dXBfZmlsZW5hbWUoZmlsZSksIGxpbmUpOwogCQkJcHJpbnRmKCJ3aGlsZSBzaGFyZSBsb2Nr ZWQgZnJvbSAlczolZFxuIiwKIAkJCSAgICBmaXh1cF9maWxlbmFtZShsb2NrMS0+bGlfZmlsZSks IGxvY2sxLT5saV9saW5lKTsKLQkJCXBhbmljKCJleGNsLT5zaGFyZSIpOworCQkJd2l0bmVzc19w YW5pYygiZXhjbC0+c2hhcmUiKTsKIAkJfQogCQlyZXR1cm47CiAJfQpAQCAtMTE4MCwxMiArMTIw MiwxMSBAQAogCQkJICAgICJhY3F1aXJpbmcgZHVwbGljYXRlIGxvY2sgb2Ygc2FtZSB0eXBlOiBc IiVzXCJcbiIsIAogCQkJICAgIHctPndfbmFtZSk7CiAJCQlwcmludGYoIiAxc3QgJXMgQCAlczol ZFxuIiwgcGxvY2stPmxpX2xvY2stPmxvX25hbWUsCi0JCQkgICAgZml4dXBfZmlsZW5hbWUocGxv Y2stPmxpX2ZpbGUpLCBwbG9jay0+bGlfbGluZSk7Ci0JCQlwcmludGYoIiAybmQgJXMgQCAlczol ZFxuIiwgbG9jay0+bG9fbmFtZSwKLQkJCSAgICBmaXh1cF9maWxlbmFtZShmaWxlKSwgbGluZSk7 CisJCQkgICAgICAgZml4dXBfZmlsZW5hbWUocGxvY2stPmxpX2ZpbGUpLCBwbG9jay0+bGlfbGlu ZSk7CisJCQlwcmludGYoIiAybmQgJXMgQCAlczolZFxuIiwgbG9jay0+bG9fbmFtZSwgZml4dXBf ZmlsZW5hbWUoZmlsZSksIGxpbmUpOwogCQkJd2l0bmVzc19kZWJ1Z2dlcigxKTsKLQkJfSBlbHNl Ci0JCQltdHhfdW5sb2NrX3NwaW4oJndfbXR4KTsKKwkJICAgIH0gZWxzZQorCQkJICAgIG10eF91 bmxvY2tfc3Bpbigmd19tdHgpOwogCQlyZXR1cm47CiAJfQogCW10eF9hc3NlcnQoJndfbXR4LCBN QV9PV05FRCk7CkBAIC0xMzIzLDI0ICsxMzQ0LDE5IEBACiAJCQlpZiAoaSA8IDApIHsKIAkJCQlw 
cmludGYoIiAxc3QgJXAgJXMgKCVzKSBAICVzOiVkXG4iLAogCQkJCSAgICBsb2NrMS0+bGlfbG9j aywgbG9jazEtPmxpX2xvY2stPmxvX25hbWUsCi0JCQkJICAgIHcxLT53X25hbWUsIGZpeHVwX2Zp bGVuYW1lKGxvY2sxLT5saV9maWxlKSwKLQkJCQkgICAgbG9jazEtPmxpX2xpbmUpOworCQkJCSAg ICB3MS0+d19uYW1lLCBmaXh1cF9maWxlbmFtZShsb2NrMS0+bGlfZmlsZSksIGxvY2sxLT5saV9s aW5lKTsKIAkJCQlwcmludGYoIiAybmQgJXAgJXMgKCVzKSBAICVzOiVkXG4iLCBsb2NrLAotCQkJ CSAgICBsb2NrLT5sb19uYW1lLCB3LT53X25hbWUsCi0JCQkJICAgIGZpeHVwX2ZpbGVuYW1lKGZp bGUpLCBsaW5lKTsKKwkJCQkgICAgbG9jay0+bG9fbmFtZSwgdy0+d19uYW1lLCBmaXh1cF9maWxl bmFtZShmaWxlKSwgbGluZSk7CiAJCQl9IGVsc2UgewogCQkJCXByaW50ZigiIDFzdCAlcCAlcyAo JXMpIEAgJXM6JWRcbiIsCiAJCQkJICAgIGxvY2syLT5saV9sb2NrLCBsb2NrMi0+bGlfbG9jay0+ bG9fbmFtZSwKIAkJCQkgICAgbG9jazItPmxpX2xvY2stPmxvX3dpdG5lc3MtPndfbmFtZSwKLQkJ CQkgICAgZml4dXBfZmlsZW5hbWUobG9jazItPmxpX2ZpbGUpLAotCQkJCSAgICBsb2NrMi0+bGlf bGluZSk7CisJCQkJICAgIGZpeHVwX2ZpbGVuYW1lKGxvY2syLT5saV9maWxlKSwgbG9jazItPmxp X2xpbmUpOwogCQkJCXByaW50ZigiIDJuZCAlcCAlcyAoJXMpIEAgJXM6JWRcbiIsCiAJCQkJICAg IGxvY2sxLT5saV9sb2NrLCBsb2NrMS0+bGlfbG9jay0+bG9fbmFtZSwKLQkJCQkgICAgdzEtPndf bmFtZSwgZml4dXBfZmlsZW5hbWUobG9jazEtPmxpX2ZpbGUpLAotCQkJCSAgICBsb2NrMS0+bGlf bGluZSk7CisJCQkJICAgIHcxLT53X25hbWUsIGZpeHVwX2ZpbGVuYW1lKGxvY2sxLT5saV9maWxl KSwgbG9jazEtPmxpX2xpbmUpOwogCQkJCXByaW50ZigiIDNyZCAlcCAlcyAoJXMpIEAgJXM6JWRc biIsIGxvY2ssCi0JCQkJICAgIGxvY2stPmxvX25hbWUsIHctPndfbmFtZSwKLQkJCQkgICAgZml4 dXBfZmlsZW5hbWUoZmlsZSksIGxpbmUpOworCQkJCSAgICBsb2NrLT5sb19uYW1lLCB3LT53X25h bWUsIGZpeHVwX2ZpbGVuYW1lKGZpbGUpLCBsaW5lKTsKIAkJCX0KIAkJCXdpdG5lc3NfZGVidWdn ZXIoMSk7CiAJCQlyZXR1cm47CkBAIC0xNDM1LDI5ICsxNDUxLDI0IEBACiAJY2xhc3MgPSBMT0NL X0NMQVNTKGxvY2spOwogCWlmICh3aXRuZXNzX3dhdGNoKSB7CiAJCWlmICgobG9jay0+bG9fZmxh Z3MgJiBMT19VUEdSQURBQkxFKSA9PSAwKQotCQkJcGFuaWMoInVwZ3JhZGUgb2Ygbm9uLXVwZ3Jh ZGFibGUgbG9jayAoJXMpICVzIEAgJXM6JWQiLAotCQkJICAgIGNsYXNzLT5sY19uYW1lLCBsb2Nr LT5sb19uYW1lLAotCQkJICAgIGZpeHVwX2ZpbGVuYW1lKGZpbGUpLCBsaW5lKTsKKwkJCXdpdG5l 
c3NfcGFuaWMoInVwZ3JhZGUgb2Ygbm9uLXVwZ3JhZGFibGUgbG9jayAoJXMpICVzIEAgJXM6JWQi LAorCQkJICAgIGNsYXNzLT5sY19uYW1lLCBsb2NrLT5sb19uYW1lLCBmaXh1cF9maWxlbmFtZShm aWxlKSwgbGluZSk7CiAJCWlmICgoY2xhc3MtPmxjX2ZsYWdzICYgTENfU0xFRVBMT0NLKSA9PSAw KQotCQkJcGFuaWMoInVwZ3JhZGUgb2Ygbm9uLXNsZWVwIGxvY2sgKCVzKSAlcyBAICVzOiVkIiwK LQkJCSAgICBjbGFzcy0+bGNfbmFtZSwgbG9jay0+bG9fbmFtZSwKLQkJCSAgICBmaXh1cF9maWxl bmFtZShmaWxlKSwgbGluZSk7CisJCQl3aXRuZXNzX3BhbmljKCJ1cGdyYWRlIG9mIG5vbi1zbGVl cCBsb2NrICglcykgJXMgQCAlczolZCIsCisJCQkgICAgY2xhc3MtPmxjX25hbWUsIGxvY2stPmxv X25hbWUsIGZpeHVwX2ZpbGVuYW1lKGZpbGUpLCBsaW5lKTsKIAl9CiAJaW5zdGFuY2UgPSBmaW5k X2luc3RhbmNlKGN1cnRocmVhZC0+dGRfc2xlZXBsb2NrcywgbG9jayk7CiAJaWYgKGluc3RhbmNl ID09IE5VTEwpCi0JCXBhbmljKCJ1cGdyYWRlIG9mIHVubG9ja2VkIGxvY2sgKCVzKSAlcyBAICVz OiVkIiwKLQkJICAgIGNsYXNzLT5sY19uYW1lLCBsb2NrLT5sb19uYW1lLAotCQkgICAgZml4dXBf ZmlsZW5hbWUoZmlsZSksIGxpbmUpOworCQl3aXRuZXNzX3BhbmljKCJ1cGdyYWRlIG9mIHVubG9j a2VkIGxvY2sgKCVzKSAlcyBAICVzOiVkIiwKKwkJICAgIGNsYXNzLT5sY19uYW1lLCBsb2NrLT5s b19uYW1lLCBmaXh1cF9maWxlbmFtZShmaWxlKSwgbGluZSk7CiAJaWYgKHdpdG5lc3Nfd2F0Y2gp IHsKIAkJaWYgKChpbnN0YW5jZS0+bGlfZmxhZ3MgJiBMSV9FWENMVVNJVkUpICE9IDApCi0JCQlw YW5pYygidXBncmFkZSBvZiBleGNsdXNpdmUgbG9jayAoJXMpICVzIEAgJXM6JWQiLAotCQkJICAg IGNsYXNzLT5sY19uYW1lLCBsb2NrLT5sb19uYW1lLAotCQkJICAgIGZpeHVwX2ZpbGVuYW1lKGZp bGUpLCBsaW5lKTsKKwkJCXdpdG5lc3NfcGFuaWMoInVwZ3JhZGUgb2YgZXhjbHVzaXZlIGxvY2sg KCVzKSAlcyBAICVzOiVkIiwKKwkJCSAgICBjbGFzcy0+bGNfbmFtZSwgbG9jay0+bG9fbmFtZSwg Zml4dXBfZmlsZW5hbWUoZmlsZSksIGxpbmUpOwogCQlpZiAoKGluc3RhbmNlLT5saV9mbGFncyAm IExJX1JFQ1VSU0VNQVNLKSAhPSAwKQotCQkJcGFuaWMoInVwZ3JhZGUgb2YgcmVjdXJzZWQgbG9j ayAoJXMpICVzIHI9JWQgQCAlczolZCIsCisJCQl3aXRuZXNzX3BhbmljKCJ1cGdyYWRlIG9mIHJl Y3Vyc2VkIGxvY2sgKCVzKSAlcyByPSVkIEAgJXM6JWQiLAogCQkJICAgIGNsYXNzLT5sY19uYW1l LCBsb2NrLT5sb19uYW1lLAotCQkJICAgIGluc3RhbmNlLT5saV9mbGFncyAmIExJX1JFQ1VSU0VN QVNLLAotCQkJICAgIGZpeHVwX2ZpbGVuYW1lKGZpbGUpLCBsaW5lKTsKKwkJCSAgICBpbnN0YW5j 
ZS0+bGlfZmxhZ3MgJiBMSV9SRUNVUlNFTUFTSywgZml4dXBfZmlsZW5hbWUoZmlsZSksIGxpbmUp OwogCX0KIAlpbnN0YW5jZS0+bGlfZmxhZ3MgfD0gTElfRVhDTFVTSVZFOwogfQpAQCAtMTQ3NSwy OSArMTQ4NiwyNCBAQAogCWNsYXNzID0gTE9DS19DTEFTUyhsb2NrKTsKIAlpZiAod2l0bmVzc193 YXRjaCkgewogCQlpZiAoKGxvY2stPmxvX2ZsYWdzICYgTE9fVVBHUkFEQUJMRSkgPT0gMCkKLQkJ cGFuaWMoImRvd25ncmFkZSBvZiBub24tdXBncmFkYWJsZSBsb2NrICglcykgJXMgQCAlczolZCIs Ci0JCQkgICAgY2xhc3MtPmxjX25hbWUsIGxvY2stPmxvX25hbWUsCi0JCQkgICAgZml4dXBfZmls ZW5hbWUoZmlsZSksIGxpbmUpOworCQl3aXRuZXNzX3BhbmljKCJkb3duZ3JhZGUgb2Ygbm9uLXVw Z3JhZGFibGUgbG9jayAoJXMpICVzIEAgJXM6JWQiLAorCQkJICAgIGNsYXNzLT5sY19uYW1lLCBs b2NrLT5sb19uYW1lLCBmaXh1cF9maWxlbmFtZShmaWxlKSwgbGluZSk7CiAJCWlmICgoY2xhc3Mt PmxjX2ZsYWdzICYgTENfU0xFRVBMT0NLKSA9PSAwKQotCQkJcGFuaWMoImRvd25ncmFkZSBvZiBu b24tc2xlZXAgbG9jayAoJXMpICVzIEAgJXM6JWQiLAotCQkJICAgIGNsYXNzLT5sY19uYW1lLCBs b2NrLT5sb19uYW1lLAotCQkJICAgIGZpeHVwX2ZpbGVuYW1lKGZpbGUpLCBsaW5lKTsKKwkJCXdp dG5lc3NfcGFuaWMoImRvd25ncmFkZSBvZiBub24tc2xlZXAgbG9jayAoJXMpICVzIEAgJXM6JWQi LAorCQkJICAgIGNsYXNzLT5sY19uYW1lLCBsb2NrLT5sb19uYW1lLCBmaXh1cF9maWxlbmFtZShm aWxlKSwgbGluZSk7CiAJfQogCWluc3RhbmNlID0gZmluZF9pbnN0YW5jZShjdXJ0aHJlYWQtPnRk X3NsZWVwbG9ja3MsIGxvY2spOwogCWlmIChpbnN0YW5jZSA9PSBOVUxMKQotCQlwYW5pYygiZG93 bmdyYWRlIG9mIHVubG9ja2VkIGxvY2sgKCVzKSAlcyBAICVzOiVkIiwKLQkJICAgIGNsYXNzLT5s Y19uYW1lLCBsb2NrLT5sb19uYW1lLAotCQkgICAgZml4dXBfZmlsZW5hbWUoZmlsZSksIGxpbmUp OworCQl3aXRuZXNzX3BhbmljKCJkb3duZ3JhZGUgb2YgdW5sb2NrZWQgbG9jayAoJXMpICVzIEAg JXM6JWQiLAorCQkgICAgY2xhc3MtPmxjX25hbWUsIGxvY2stPmxvX25hbWUsIGZpeHVwX2ZpbGVu YW1lKGZpbGUpLCBsaW5lKTsKIAlpZiAod2l0bmVzc193YXRjaCkgewogCQlpZiAoKGluc3RhbmNl LT5saV9mbGFncyAmIExJX0VYQ0xVU0lWRSkgPT0gMCkKLQkJCXBhbmljKCJkb3duZ3JhZGUgb2Yg c2hhcmVkIGxvY2sgKCVzKSAlcyBAICVzOiVkIiwKLQkJCSAgICBjbGFzcy0+bGNfbmFtZSwgbG9j ay0+bG9fbmFtZSwKLQkJCSAgICBmaXh1cF9maWxlbmFtZShmaWxlKSwgbGluZSk7CisJCQl3aXRu ZXNzX3BhbmljKCJkb3duZ3JhZGUgb2Ygc2hhcmVkIGxvY2sgKCVzKSAlcyBAICVzOiVkIiwKKwkJ 
CSAgICBjbGFzcy0+bGNfbmFtZSwgbG9jay0+bG9fbmFtZSwgZml4dXBfZmlsZW5hbWUoZmlsZSks IGxpbmUpOwogCQlpZiAoKGluc3RhbmNlLT5saV9mbGFncyAmIExJX1JFQ1VSU0VNQVNLKSAhPSAw KQotCQkJcGFuaWMoImRvd25ncmFkZSBvZiByZWN1cnNlZCBsb2NrICglcykgJXMgcj0lZCBAICVz OiVkIiwKKwkJCXdpdG5lc3NfcGFuaWMoImRvd25ncmFkZSBvZiByZWN1cnNlZCBsb2NrICglcykg JXMgcj0lZCBAICVzOiVkIiwKIAkJCSAgICBjbGFzcy0+bGNfbmFtZSwgbG9jay0+bG9fbmFtZSwK LQkJCSAgICBpbnN0YW5jZS0+bGlfZmxhZ3MgJiBMSV9SRUNVUlNFTUFTSywKLQkJCSAgICBmaXh1 cF9maWxlbmFtZShmaWxlKSwgbGluZSk7CisJCQkgICAgaW5zdGFuY2UtPmxpX2ZsYWdzICYgTElf UkVDVVJTRU1BU0ssIGZpeHVwX2ZpbGVuYW1lKGZpbGUpLCBsaW5lKTsKIAl9CiAJaW5zdGFuY2Ut PmxpX2ZsYWdzICY9IH5MSV9FWENMVVNJVkU7CiB9CkBAIC0xNTA2LDExICsxNTEyLDExIEBACiB3 aXRuZXNzX3VubG9jayhzdHJ1Y3QgbG9ja19vYmplY3QgKmxvY2ssIGludCBmbGFncywgY29uc3Qg Y2hhciAqZmlsZSwgaW50IGxpbmUpCiB7CiAJc3RydWN0IGxvY2tfbGlzdF9lbnRyeSAqKmxvY2tf bGlzdCwgKmxsZTsKLQlzdHJ1Y3QgbG9ja19pbnN0YW5jZSAqaW5zdGFuY2U7CisJc3RydWN0IGxv Y2tfaW5zdGFuY2UgKmluc3RhbmNlID0gTlVMTDsKIAlzdHJ1Y3QgbG9ja19jbGFzcyAqY2xhc3M7 CiAJc3RydWN0IHRocmVhZCAqdGQ7CiAJcmVnaXN0ZXJfdCBzOwotCWludCBpLCBqOworCWludCBp ID0gMCwgajsKIAogCWlmICh3aXRuZXNzX2NvbGQgfHwgbG9jay0+bG9fd2l0bmVzcyA9PSBOVUxM IHx8IHBhbmljc3RyICE9IE5VTEwpCiAJCXJldHVybjsKQEAgLTE1MzcsNyArMTU0Myw3IEBACiAJ ICogZXZlbnR1YWwgcmVnaXN0ZXIgbG9ja3MgYW5kIHJlbW92ZSB0aGVtLgogCSAqLwogCWlmICh3 aXRuZXNzX3dhdGNoID4gMCkKLQkJcGFuaWMoImxvY2sgKCVzKSAlcyBub3QgbG9ja2VkIEAgJXM6 JWQiLCBjbGFzcy0+bGNfbmFtZSwKKwkJd2l0bmVzc19wYW5pYygibG9jayAoJXMpICVzIG5vdCBs b2NrZWQgQCAlczolZCIsIGNsYXNzLT5sY19uYW1lLAogCQkgICAgbG9jay0+bG9fbmFtZSwgZml4 dXBfZmlsZW5hbWUoZmlsZSksIGxpbmUpOwogCWVsc2UKIAkJcmV0dXJuOwpAQCAtMTU1MCwxNiAr MTU1NiwxNSBAQAogCQkgICAgbG9jay0+bG9fbmFtZSwgZml4dXBfZmlsZW5hbWUoZmlsZSksIGxp bmUpOwogCQlwcmludGYoIndoaWxlIGV4Y2x1c2l2ZWx5IGxvY2tlZCBmcm9tICVzOiVkXG4iLAog CQkgICAgZml4dXBfZmlsZW5hbWUoaW5zdGFuY2UtPmxpX2ZpbGUpLCBpbnN0YW5jZS0+bGlfbGlu ZSk7Ci0JCXBhbmljKCJleGNsLT51c2hhcmUiKTsKKwkJd2l0bmVzc19wYW5pYygiZXhjbC0+dXNo 
YXJlIik7CiAJfQogCWlmICgoaW5zdGFuY2UtPmxpX2ZsYWdzICYgTElfRVhDTFVTSVZFKSA9PSAw ICYmIHdpdG5lc3Nfd2F0Y2ggPiAwICYmCiAJICAgIChmbGFncyAmIExPUF9FWENMVVNJVkUpICE9 IDApIHsKIAkJcHJpbnRmKCJleGNsdXNpdmUgdW5sb2NrIG9mICglcykgJXMgQCAlczolZFxuIiwg Y2xhc3MtPmxjX25hbWUsCiAJCSAgICBsb2NrLT5sb19uYW1lLCBmaXh1cF9maWxlbmFtZShmaWxl KSwgbGluZSk7Ci0JCXByaW50Zigid2hpbGUgc2hhcmUgbG9ja2VkIGZyb20gJXM6JWRcbiIsCi0J CSAgICBmaXh1cF9maWxlbmFtZShpbnN0YW5jZS0+bGlfZmlsZSksCisJCXByaW50Zigid2hpbGUg c2hhcmUgbG9ja2VkIGZyb20gJXM6JWRcbiIsIGZpeHVwX2ZpbGVuYW1lKGluc3RhbmNlLT5saV9m aWxlKSwKIAkJICAgIGluc3RhbmNlLT5saV9saW5lKTsKLQkJcGFuaWMoInNoYXJlLT51ZXhjbCIp OworCQl3aXRuZXNzX3BhbmljKCJzaGFyZS0+dWV4Y2wiKTsKIAl9CiAJLyogSWYgd2UgYXJlIHJl Y3Vyc2VkLCB1bnJlY3Vyc2UuICovCiAJaWYgKChpbnN0YW5jZS0+bGlfZmxhZ3MgJiBMSV9SRUNV UlNFTUFTSykgPiAwKSB7CkBAIC0xNTczLDcgKzE1NzgsNyBAQAogCWlmICgoaW5zdGFuY2UtPmxp X2ZsYWdzICYgTElfTk9SRUxFQVNFKSAhPSAwICYmIHdpdG5lc3Nfd2F0Y2ggPiAwKSB7CiAJCXBy aW50ZigiZm9yYmlkZGVuIHVubG9jayBvZiAoJXMpICVzIEAgJXM6JWRcbiIsIGNsYXNzLT5sY19u YW1lLAogCQkgICAgbG9jay0+bG9fbmFtZSwgZml4dXBfZmlsZW5hbWUoZmlsZSksIGxpbmUpOwot CQlwYW5pYygibG9jayBtYXJrZWQgbm9yZWxlYXNlIik7CisJCXdpdG5lc3NfcGFuaWMoImxvY2sg bWFya2VkIG5vcmVsZWFzZSIpOwogCX0KIAogCS8qIE90aGVyd2lzZSwgcmVtb3ZlIHRoaXMgaXRl bSBmcm9tIHRoZSBsaXN0LiAqLwpAQCAtMTYyOCw3ICsxNjMzLDcgQEAKIAkJCQl3aXRuZXNzX2xp c3RfbG9jaygmbGxlLT5sbF9jaGlsZHJlbltpXSwgcHJpbnRmKTsKIAkJCQkKIAkJCX0KLQkJcGFu aWMoIlRocmVhZCAlcCBjYW5ub3QgZXhpdCB3aGlsZSBob2xkaW5nIHNsZWVwbG9ja3NcbiIsIHRk KTsKKwkJd2l0bmVzc19wYW5pYygiVGhyZWFkICVwIGNhbm5vdCBleGl0IHdoaWxlIGhvbGRpbmcg c2xlZXBsb2Nrc1xuIiwgdGQpOwogCX0KIAl3aXRuZXNzX2xvY2tfbGlzdF9mcmVlKGxsZSk7CiB9 CkBAIC0xNzA5LDcgKzE3MTQsNyBAQAogCX0gZWxzZQogCQlzY2hlZF91bnBpbigpOwogCWlmIChm bGFncyAmIFdBUk5fUEFOSUMgJiYgbikKLQkJcGFuaWMoIiVzIiwgX19mdW5jX18pOworCQl3aXRu ZXNzX3BhbmljKCIlcyIsIF9fZnVuY19fKTsKIAllbHNlCiAJCXdpdG5lc3NfZGVidWdnZXIobik7 CiAJcmV0dXJuIChuKTsKQEAgLTE3NTUsNyArMTc2MCw3IEBACiAJfSBlbHNlIGlmICgobG9ja19j 
bGFzcy0+bGNfZmxhZ3MgJiBMQ19TTEVFUExPQ0spKQogCQl0eXBlbGlzdCA9ICZ3X3NsZWVwOwog CWVsc2UKLQkJcGFuaWMoImxvY2sgY2xhc3MgJXMgaXMgbm90IHNsZWVwIG9yIHNwaW4iLAorCQl3 aXRuZXNzX3BhbmljKCJsb2NrIGNsYXNzICVzIGlzIG5vdCBzbGVlcCBvciBzcGluIiwKIAkJICAg IGxvY2tfY2xhc3MtPmxjX25hbWUpOwogCiAJbXR4X2xvY2tfc3Bpbigmd19tdHgpOwpAQCAtMTc4 Niw3ICsxNzkxLDcgQEAKIAl3LT53X3JlZmNvdW50Kys7CiAJbXR4X3VubG9ja19zcGluKCZ3X210 eCk7CiAJaWYgKGxvY2tfY2xhc3MgIT0gdy0+d19jbGFzcykKLQkJcGFuaWMoCisJCXdpdG5lc3Nf cGFuaWMoCiAJCQkibG9jayAoJXMpICVzIGRvZXMgbm90IG1hdGNoIGVhcmxpZXIgKCVzKSBsb2Nr IiwKIAkJCWRlc2NyaXB0aW9uLCBsb2NrX2NsYXNzLT5sY19uYW1lLAogCQkJdy0+d19jbGFzcy0+ bGNfbmFtZSk7CkBAIC0xOTIwLDcgKzE5MjUsNyBAQAogCWlmICghd2l0bmVzc19sb2NrX3R5cGVf ZXF1YWwocGFyZW50LCBjaGlsZCkpIHsKIAkJaWYgKHdpdG5lc3NfY29sZCA9PSAwKQogCQkJbXR4 X3VubG9ja19zcGluKCZ3X210eCk7Ci0JCXBhbmljKCIlczogcGFyZW50IFwiJXNcIiAoJXMpIGFu ZCBjaGlsZCBcIiVzXCIgKCVzKSBhcmUgbm90ICIKKwkJd2l0bmVzc19wYW5pYygiJXM6IHBhcmVu dCBcIiVzXCIgKCVzKSBhbmQgY2hpbGQgXCIlc1wiICglcykgYXJlIG5vdCAiCiAJCSAgICAidGhl IHNhbWUgbG9jayB0eXBlIiwgX19mdW5jX18sIHBhcmVudC0+d19uYW1lLAogCQkgICAgcGFyZW50 LT53X2NsYXNzLT5sY19uYW1lLCBjaGlsZC0+d19uYW1lLAogCQkgICAgY2hpbGQtPndfY2xhc3Mt PmxjX25hbWUpOwpAQCAtMjEwMiw4ICsyMTA3LDggQEAKIAlpZiAobG9jay0+bG9fd2l0bmVzcy0+ d19uYW1lICE9IGxvY2stPmxvX25hbWUpCiAJCXBybnQoIiAoJXMpIiwgbG9jay0+bG9fd2l0bmVz cy0+d19uYW1lKTsKIAlwcm50KCIgciA9ICVkICglcCkgbG9ja2VkIEAgJXM6JWRcbiIsCi0JICAg IGluc3RhbmNlLT5saV9mbGFncyAmIExJX1JFQ1VSU0VNQVNLLCBsb2NrLAotCSAgICBmaXh1cF9m aWxlbmFtZShpbnN0YW5jZS0+bGlfZmlsZSksIGluc3RhbmNlLT5saV9saW5lKTsKKwkgICAgaW5z dGFuY2UtPmxpX2ZsYWdzICYgTElfUkVDVVJTRU1BU0ssIGxvY2ssIGZpeHVwX2ZpbGVuYW1lKGlu c3RhbmNlLT5saV9maWxlKSwKKwkgICAgaW5zdGFuY2UtPmxpX2xpbmUpOwogfQogCiAjaWZkZWYg RERCCkBAIC0yMTc0LDEzICsyMTc5LDYgQEAKIAlzdHJ1Y3QgbG9ja19pbnN0YW5jZSAqaW5zdGFu Y2U7CiAJc3RydWN0IGxvY2tfY2xhc3MgKmNsYXNzOwogCi0JLyoKLQkgKiBUaGlzIGZ1bmN0aW9u IGlzIHVzZWQgaW5kZXBlbmRlbnRseSBpbiBsb2NraW5nIGNvZGUgdG8gZGVhbCB3aXRoCi0JICog 
R2lhbnQsIFNDSEVEVUxFUl9TVE9QUEVEKCkgY2hlY2sgY2FuIGJlIHJlbW92ZWQgaGVyZSBhZnRl ciBHaWFudAotCSAqIGlzIGdvbmUuCi0JICovCi0JaWYgKFNDSEVEVUxFUl9TVE9QUEVEKCkpCi0J CXJldHVybjsKIAlLQVNTRVJUKHdpdG5lc3NfY29sZCA9PSAwLCAoIiVzOiB3aXRuZXNzX2NvbGQi LCBfX2Z1bmNfXykpOwogCWlmIChsb2NrLT5sb193aXRuZXNzID09IE5VTEwgfHwgd2l0bmVzc193 YXRjaCA9PSAtMSB8fCBwYW5pY3N0ciAhPSBOVUxMKQogCQlyZXR1cm47CkBAIC0yMTk0LDcgKzIx OTIsNyBAQAogCX0KIAlpbnN0YW5jZSA9IGZpbmRfaW5zdGFuY2UobG9ja19saXN0LCBsb2NrKTsK IAlpZiAoaW5zdGFuY2UgPT0gTlVMTCkKLQkJcGFuaWMoIiVzOiBsb2NrICglcykgJXMgbm90IGxv Y2tlZCIsIF9fZnVuY19fLAorCQl3aXRuZXNzX3BhbmljKCIlczogbG9jayAoJXMpICVzIG5vdCBs b2NrZWQiLCBfX2Z1bmNfXywKIAkJICAgIGNsYXNzLT5sY19uYW1lLCBsb2NrLT5sb19uYW1lKTsK IAkqZmlsZXAgPSBpbnN0YW5jZS0+bGlfZmlsZTsKIAkqbGluZXAgPSBpbnN0YW5jZS0+bGlfbGlu ZTsKQEAgLTIyMDcsMTMgKzIyMDUsNiBAQAogCXN0cnVjdCBsb2NrX2luc3RhbmNlICppbnN0YW5j ZTsKIAlzdHJ1Y3QgbG9ja19jbGFzcyAqY2xhc3M7CiAKLQkvKgotCSAqIFRoaXMgZnVuY3Rpb24g aXMgdXNlZCBpbmRlcGVuZGVudGx5IGluIGxvY2tpbmcgY29kZSB0byBkZWFsIHdpdGgKLQkgKiBH aWFudCwgU0NIRURVTEVSX1NUT1BQRUQoKSBjaGVjayBjYW4gYmUgcmVtb3ZlZCBoZXJlIGFmdGVy IEdpYW50Ci0JICogaXMgZ29uZS4KLQkgKi8KLQlpZiAoU0NIRURVTEVSX1NUT1BQRUQoKSkKLQkJ cmV0dXJuOwogCUtBU1NFUlQod2l0bmVzc19jb2xkID09IDAsICgiJXM6IHdpdG5lc3NfY29sZCIs IF9fZnVuY19fKSk7CiAJaWYgKGxvY2stPmxvX3dpdG5lc3MgPT0gTlVMTCB8fCB3aXRuZXNzX3dh dGNoID09IC0xIHx8IHBhbmljc3RyICE9IE5VTEwpCiAJCXJldHVybjsKQEAgLTIyMjcsNyArMjIx OCw3IEBACiAJfQogCWluc3RhbmNlID0gZmluZF9pbnN0YW5jZShsb2NrX2xpc3QsIGxvY2spOwog CWlmIChpbnN0YW5jZSA9PSBOVUxMKQotCQlwYW5pYygiJXM6IGxvY2sgKCVzKSAlcyBub3QgbG9j a2VkIiwgX19mdW5jX18sCisJCXdpdG5lc3NfcGFuaWMoIiVzOiBsb2NrICglcykgJXMgbm90IGxv Y2tlZCIsIF9fZnVuY19fLAogCQkgICAgY2xhc3MtPmxjX25hbWUsIGxvY2stPmxvX25hbWUpOwog CWxvY2stPmxvX3dpdG5lc3MtPndfZmlsZSA9IGZpbGU7CiAJbG9jay0+bG9fd2l0bmVzcy0+d19s aW5lID0gbGluZTsKQEAgLTIyMzksNyArMjIzMCw3IEBACiB3aXRuZXNzX2Fzc2VydChzdHJ1Y3Qg bG9ja19vYmplY3QgKmxvY2ssIGludCBmbGFncywgY29uc3QgY2hhciAqZmlsZSwgaW50IGxpbmUp 
CiB7CiAjaWZkZWYgSU5WQVJJQU5UX1NVUFBPUlQKLQlzdHJ1Y3QgbG9ja19pbnN0YW5jZSAqaW5z dGFuY2U7CisJc3RydWN0IGxvY2tfaW5zdGFuY2UgKmluc3RhbmNlID0gTlVMTDsKIAlzdHJ1Y3Qg bG9ja19jbGFzcyAqY2xhc3M7CiAKIAlpZiAobG9jay0+bG9fd2l0bmVzcyA9PSBOVUxMIHx8IHdp dG5lc3Nfd2F0Y2ggPCAxIHx8IHBhbmljc3RyICE9IE5VTEwpCkBAIC0yMjUwLDE1ICsyMjQxLDE0 IEBACiAJZWxzZSBpZiAoKGNsYXNzLT5sY19mbGFncyAmIExDX1NQSU5MT0NLKSAhPSAwKQogCQlp bnN0YW5jZSA9IGZpbmRfaW5zdGFuY2UoUENQVV9HRVQoc3BpbmxvY2tzKSwgbG9jayk7CiAJZWxz ZSB7Ci0JCXBhbmljKCJMb2NrICglcykgJXMgaXMgbm90IHNsZWVwIG9yIHNwaW4hIiwKKwkJd2l0 bmVzc19wYW5pYygiTG9jayAoJXMpICVzIGlzIG5vdCBzbGVlcCBvciBzcGluISIsCiAJCSAgICBj bGFzcy0+bGNfbmFtZSwgbG9jay0+bG9fbmFtZSk7CiAJfQogCXN3aXRjaCAoZmxhZ3MpIHsKIAlj YXNlIExBX1VOTE9DS0VEOgogCQlpZiAoaW5zdGFuY2UgIT0gTlVMTCkKLQkJCXBhbmljKCJMb2Nr ICglcykgJXMgbG9ja2VkIEAgJXM6JWQuIiwKLQkJCSAgICBjbGFzcy0+bGNfbmFtZSwgbG9jay0+ bG9fbmFtZSwKLQkJCSAgICBmaXh1cF9maWxlbmFtZShmaWxlKSwgbGluZSk7CisJCQl3aXRuZXNz X3BhbmljKCJMb2NrICglcykgJXMgbG9ja2VkIEAgJXM6JWQuIiwKKwkJCSAgICBjbGFzcy0+bGNf bmFtZSwgbG9jay0+bG9fbmFtZSwgZml4dXBfZmlsZW5hbWUoZmlsZSksIGxpbmUpOwogCQlicmVh azsKIAljYXNlIExBX0xPQ0tFRDoKIAljYXNlIExBX0xPQ0tFRCB8IExBX1JFQ1VSU0VEOgpAQCAt MjI3MCwzNSArMjI2MCwyOSBAQAogCWNhc2UgTEFfWExPQ0tFRCB8IExBX1JFQ1VSU0VEOgogCWNh c2UgTEFfWExPQ0tFRCB8IExBX05PVFJFQ1VSU0VEOgogCQlpZiAoaW5zdGFuY2UgPT0gTlVMTCkg ewotCQkJcGFuaWMoIkxvY2sgKCVzKSAlcyBub3QgbG9ja2VkIEAgJXM6JWQuIiwKLQkJCSAgICBj bGFzcy0+bGNfbmFtZSwgbG9jay0+bG9fbmFtZSwKLQkJCSAgICBmaXh1cF9maWxlbmFtZShmaWxl KSwgbGluZSk7CisJCQl3aXRuZXNzX3BhbmljKCJMb2NrICglcykgJXMgbm90IGxvY2tlZCBAICVz OiVkLiIsCisJCQkgICAgY2xhc3MtPmxjX25hbWUsIGxvY2stPmxvX25hbWUsIGZpeHVwX2ZpbGVu YW1lKGZpbGUpLCBsaW5lKTsKIAkJCWJyZWFrOwogCQl9CiAJCWlmICgoZmxhZ3MgJiBMQV9YTE9D S0VEKSAhPSAwICYmCiAJCSAgICAoaW5zdGFuY2UtPmxpX2ZsYWdzICYgTElfRVhDTFVTSVZFKSA9 PSAwKQotCQkJcGFuaWMoIkxvY2sgKCVzKSAlcyBub3QgZXhjbHVzaXZlbHkgbG9ja2VkIEAgJXM6 JWQuIiwKLQkJCSAgICBjbGFzcy0+bGNfbmFtZSwgbG9jay0+bG9fbmFtZSwKLQkJCSAgICBmaXh1 
cF9maWxlbmFtZShmaWxlKSwgbGluZSk7CisJCQl3aXRuZXNzX3BhbmljKCJMb2NrICglcykgJXMg bm90IGV4Y2x1c2l2ZWx5IGxvY2tlZCBAICVzOiVkLiIsCisJCQkgICAgY2xhc3MtPmxjX25hbWUs IGxvY2stPmxvX25hbWUsIGZpeHVwX2ZpbGVuYW1lKGZpbGUpLCBsaW5lKTsKIAkJaWYgKChmbGFn cyAmIExBX1NMT0NLRUQpICE9IDAgJiYKIAkJICAgIChpbnN0YW5jZS0+bGlfZmxhZ3MgJiBMSV9F WENMVVNJVkUpICE9IDApCi0JCQlwYW5pYygiTG9jayAoJXMpICVzIGV4Y2x1c2l2ZWx5IGxvY2tl ZCBAICVzOiVkLiIsCi0JCQkgICAgY2xhc3MtPmxjX25hbWUsIGxvY2stPmxvX25hbWUsCi0JCQkg ICAgZml4dXBfZmlsZW5hbWUoZmlsZSksIGxpbmUpOworCQkJd2l0bmVzc19wYW5pYygiTG9jayAo JXMpICVzIGV4Y2x1c2l2ZWx5IGxvY2tlZCBAICVzOiVkLiIsCisJCQkgICAgY2xhc3MtPmxjX25h bWUsIGxvY2stPmxvX25hbWUsIGZpeHVwX2ZpbGVuYW1lKGZpbGUpLCBsaW5lKTsKIAkJaWYgKChm bGFncyAmIExBX1JFQ1VSU0VEKSAhPSAwICYmCiAJCSAgICAoaW5zdGFuY2UtPmxpX2ZsYWdzICYg TElfUkVDVVJTRU1BU0spID09IDApCi0JCQlwYW5pYygiTG9jayAoJXMpICVzIG5vdCByZWN1cnNl ZCBAICVzOiVkLiIsCi0JCQkgICAgY2xhc3MtPmxjX25hbWUsIGxvY2stPmxvX25hbWUsCi0JCQkg ICAgZml4dXBfZmlsZW5hbWUoZmlsZSksIGxpbmUpOworCQkJd2l0bmVzc19wYW5pYygiTG9jayAo JXMpICVzIG5vdCByZWN1cnNlZCBAICVzOiVkLiIsCisJCQkgICAgY2xhc3MtPmxjX25hbWUsIGxv Y2stPmxvX25hbWUsIGZpeHVwX2ZpbGVuYW1lKGZpbGUpLCBsaW5lKTsKIAkJaWYgKChmbGFncyAm IExBX05PVFJFQ1VSU0VEKSAhPSAwICYmCiAJCSAgICAoaW5zdGFuY2UtPmxpX2ZsYWdzICYgTElf UkVDVVJTRU1BU0spICE9IDApCi0JCQlwYW5pYygiTG9jayAoJXMpICVzIHJlY3Vyc2VkIEAgJXM6 JWQuIiwKLQkJCSAgICBjbGFzcy0+bGNfbmFtZSwgbG9jay0+bG9fbmFtZSwKLQkJCSAgICBmaXh1 cF9maWxlbmFtZShmaWxlKSwgbGluZSk7CisJCQl3aXRuZXNzX3BhbmljKCJMb2NrICglcykgJXMg cmVjdXJzZWQgQCAlczolZC4iLAorCQkJICAgIGNsYXNzLT5sY19uYW1lLCBsb2NrLT5sb19uYW1l LCBmaXh1cF9maWxlbmFtZShmaWxlKSwgbGluZSk7CiAJCWJyZWFrOwogCWRlZmF1bHQ6Ci0JCXBh bmljKCJJbnZhbGlkIGxvY2sgYXNzZXJ0aW9uIGF0ICVzOiVkLiIsCi0JCSAgICBmaXh1cF9maWxl bmFtZShmaWxlKSwgbGluZSk7CisJCXdpdG5lc3NfcGFuaWMoIkludmFsaWQgbG9jayBhc3NlcnRp b24gYXQgJXM6JWQuIiwgZml4dXBfZmlsZW5hbWUoZmlsZSksIGxpbmUpOwogCiAJfQogI2VuZGlm CS8qIElOVkFSSUFOVF9TVVBQT1JUICovCkBAIC0yMzIzLDcgKzIzMDcsNyBAQAogCX0KIAlpbnN0 
YW5jZSA9IGZpbmRfaW5zdGFuY2UobG9ja19saXN0LCBsb2NrKTsKIAlpZiAoaW5zdGFuY2UgPT0g TlVMTCkKLQkJcGFuaWMoIiVzOiBsb2NrICglcykgJXMgbm90IGxvY2tlZCIsIF9fZnVuY19fLAor CQl3aXRuZXNzX3BhbmljKCIlczogbG9jayAoJXMpICVzIG5vdCBsb2NrZWQiLCBfX2Z1bmNfXywK IAkJICAgIGNsYXNzLT5sY19uYW1lLCBsb2NrLT5sb19uYW1lKTsKIAogCWlmIChzZXQpCkluZGV4 OiBzeXMvbmV0L2lmLmMKPT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09 PT09PT09PT09PT09PT09PT09PT09PT09PQotLS0gc3lzL25ldC9pZi5jCShyZXZpc2lvbiAyNDE0 OTEpCisrKyBzeXMvbmV0L2lmLmMJKHdvcmtpbmcgY29weSkKQEAgLTgyMSw3ICs4MjEsOSBAQAog aWZfZGV0YWNoKHN0cnVjdCBpZm5ldCAqaWZwKQogewogCisJQ1VSVk5FVF9TRVRfUVVJRVQoaWZw LT5pZl92bmV0KTsKIAlpZl9kZXRhY2hfaW50ZXJuYWwoaWZwLCAwKTsKKwlDVVJWTkVUX1JFU1RP UkUoKTsKIH0KIAogc3RhdGljIHZvaWQK --e89a8fb202eaccab6a04ce829ea2-- From owner-freebsd-net@FreeBSD.ORG Thu Nov 15 17:37:28 2012 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 1B4C4C68; Thu, 15 Nov 2012 17:37:28 +0000 (UTC) (envelope-from zec@fer.hr) Received: from mail.zvne.fer.hr (mail.zvne.fer.hr [161.53.66.5]) by mx1.freebsd.org (Postfix) with ESMTP id 5987F8FC08; Thu, 15 Nov 2012 17:37:26 +0000 (UTC) Received: from munja.zvne.fer.hr (161.53.66.248) by mail.zvne.fer.hr (161.53.66.5) with Microsoft SMTP Server id 14.2.318.4; Thu, 15 Nov 2012 18:36:14 +0100 Received: from sluga.fer.hr ([161.53.66.244]) by munja.zvne.fer.hr with Microsoft SMTPSVC(6.0.3790.4675); Thu, 15 Nov 2012 18:36:14 +0100 Received: from localhost ([161.53.19.8]) by sluga.fer.hr over TLS secured channel with Microsoft SMTPSVC(6.0.3790.4675); Thu, 15 Nov 2012 18:36:14 +0100 From: Marko Zec To: Adrian Chadd Subject: Re: VIMAGE crashes on 9.x with hotplug net80211 devices Date: Thu, 15 Nov 2012 18:36:04 +0100 User-Agent: KMail/1.9.10 References: <201210291115.23845.zec@fer.hr> In-Reply-To: MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="Boundary-00=_EgSpQL+nIDcE64N" Message-ID: 
<201211151836.04709.zec@fer.hr> X-OriginalArrivalTime: 15 Nov 2012 17:36:14.0549 (UTC) FILETIME=[B5EE8050:01CDC357] Cc: freebsd-net@freebsd.org, Hans Petter Selasky , freebsd-hackers@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 15 Nov 2012 17:37:28 -0000 --Boundary-00=_EgSpQL+nIDcE64N Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Content-Disposition: inline On Thursday 15 November 2012 07:18:31 Adrian Chadd wrote: > Hi, > > Here's what I have thus far. Please ignore the device_printf() change. > > This works for me, both for hotplug cardbus wireless devices as well > as (inadvertently!) a USB bluetooth device. > > What do you think? It looks like you've hit the right spot to set curvnet context in device_probe_and_attach(). Could you try out a slightly revised version (attached) - this one also removes the now redundant curvnet setting from linker routines (kldload / kldunload), and adds a few extra bits which might be necessary for a broader range of drivers to work. Note that I haven't tested this myself as I don't have a -CURRENT machine ATM, but a similar patch for 8.3 apparently works fine, though I don't have hotpluggable network cards to play with (neither cardbus nor USB)...
Cheers, Marko --Boundary-00=_EgSpQL+nIDcE64N Content-Type: text/x-diff; charset="iso 8859-15"; name="hotplug_vnet_20121115.diff" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="hotplug_vnet_20121115.diff" Index: sys/kern/subr_bus.c =================================================================== --- sys/kern/subr_bus.c (revision 243091) +++ sys/kern/subr_bus.c (working copy) @@ -53,6 +53,8 @@ #include #include +#include + #include #include @@ -2735,7 +2737,11 @@ return (0); else if (error != 0) return (error); - return (device_attach(dev)); + + CURVNET_SET_QUIET(vnet0); + error = device_attach(dev); + CURVNET_RESTORE(); + return (error); } /** Index: sys/kern/kern_linker.c =================================================================== --- sys/kern/kern_linker.c (revision 243091) +++ sys/kern/kern_linker.c (working copy) @@ -53,8 +53,6 @@ #include #include -#include - #include #include "linker_if.h" @@ -1019,12 +1017,6 @@ return (error); /* - * It is possible that kldloaded module will attach a new ifnet, - * so vnet context must be set when this ocurs. - */ - CURVNET_SET(TD_TO_VNET(td)); - - /* * If file does not contain a qualified name or any dot in it * (kldname.ko, or kldname.ver.ko) treat it as an interface * name. 
@@ -1041,7 +1033,7 @@ error = linker_load_module(kldname, modname, NULL, NULL, &lf); if (error) { KLD_UNLOCK(); - goto done; + return (error); } lf->userrefs++; if (fileid != NULL) @@ -1055,9 +1047,6 @@ #else KLD_UNLOCK(); #endif - -done: - CURVNET_RESTORE(); return (error); } @@ -1095,7 +1084,6 @@ if ((error = priv_check(td, PRIV_KLD_UNLOAD)) != 0) return (error); - CURVNET_SET(TD_TO_VNET(td)); KLD_LOCK(); lf = linker_find_file_by_id(fileid); if (lf) { @@ -1137,7 +1125,6 @@ #else KLD_UNLOCK(); #endif - CURVNET_RESTORE(); return (error); } Index: sys/netgraph/bluetooth/socket/ng_btsocket.c =================================================================== --- sys/netgraph/bluetooth/socket/ng_btsocket.c (revision 243091) +++ sys/netgraph/bluetooth/socket/ng_btsocket.c (working copy) @@ -46,6 +46,8 @@ #include #include +#include + #include #include #include @@ -285,4 +287,4 @@ return (error); } /* ng_btsocket_modevent */ -DOMAIN_SET(ng_btsocket_); +VNET_DOMAIN_SET(ng_btsocket_); Index: sys/net/if.c =================================================================== --- sys/net/if.c (revision 243091) +++ sys/net/if.c (working copy) @@ -504,6 +504,7 @@ ifp->if_flags |= IFF_DYING; /* XXX: Locking */ + CURVNET_SET_QUIET(ifp->if_vnet); IFNET_WLOCK(); KASSERT(ifp == ifnet_byindex_locked(ifp->if_index), ("%s: freeing unallocated ifnet", ifp->if_xname)); @@ -511,9 +512,9 @@ ifindex_free_locked(ifp->if_index); IFNET_WUNLOCK(); - if (!refcount_release(&ifp->if_refcount)) - return; - if_free_internal(ifp); + if (refcount_release(&ifp->if_refcount)) + if_free_internal(ifp); + CURVNET_RESTORE(); } /* @@ -793,7 +794,9 @@ if_detach(struct ifnet *ifp) { + CURVNET_SET_QUIET(ifp->if_vnet); if_detach_internal(ifp, 0); + CURVNET_RESTORE(); } static void --Boundary-00=_EgSpQL+nIDcE64N-- From owner-freebsd-net@FreeBSD.ORG Thu Nov 15 19:16:12 2012 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) 
with ESMTP id D1ABCF7F; Thu, 15 Nov 2012 19:16:12 +0000 (UTC) (envelope-from adrian.chadd@gmail.com) Received: from mail-pa0-f54.google.com (mail-pa0-f54.google.com [209.85.220.54]) by mx1.freebsd.org (Postfix) with ESMTP id 91D4F8FC0C; Thu, 15 Nov 2012 19:16:12 +0000 (UTC) Received: by mail-pa0-f54.google.com with SMTP id kp6so1394560pab.13 for ; Thu, 15 Nov 2012 11:16:12 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type; bh=LzMa1B6/y8A8sezs+XidoLqSDprcGpzwIZPFFeIXyRg=; b=l16sRC/OV4FsCFGcaU0H0Li5U6jUZIxA7OSxi7Fd49/PYRL2Pb9hCx87z1kMwg4/Zg qv4FEtnTMpq0tS7wzy5AiN1H26Yu3CBjdXqLQ6ZHy99LT4onqmgMr+XXAexW0jNX4kwY Fm+x3CssdII0RONHk7FhXn2h+VrivL3C5Ra3wmCzKaxirI/s9K3vMqoRUx7eGncWe7pg jSzJX1rUMc4+HZEAmq8rO5C7q2Mpq+ytUQaaiWFhVPUiyL9/qc1yne9FzffiVRSRJmnt e9BuYbb5nfEorjSJGvl/zvPZ/KbR0zXKM8lzDq2G7mLtmUOuZkDs/zdWXPYRNPrRJWJ8 89CQ== MIME-Version: 1.0 Received: by 10.68.209.166 with SMTP id mn6mr1878624pbc.95.1353006972359; Thu, 15 Nov 2012 11:16:12 -0800 (PST) Sender: adrian.chadd@gmail.com Received: by 10.68.124.130 with HTTP; Thu, 15 Nov 2012 11:16:12 -0800 (PST) In-Reply-To: <201211151836.04709.zec@fer.hr> References: <201210291115.23845.zec@fer.hr> <201211151836.04709.zec@fer.hr> Date: Thu, 15 Nov 2012 11:16:12 -0800 X-Google-Sender-Auth: BzHBpD3CrgErucYcfhLLRpVLBUI Message-ID: Subject: Re: VIMAGE crashes on 9.x with hotplug net80211 devices From: Adrian Chadd To: Marko Zec Content-Type: text/plain; charset=ISO-8859-1 Cc: freebsd-net@freebsd.org, Hans Petter Selasky , freebsd-hackers@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 15 Nov 2012 19:16:12 -0000 Hans brings up a very good point for USB - they split if_alloc and if_attach across 
two different threads. So this works for non-USB devices, but not for USB devices. Hans, does each device implement its own workqueue for this kind of delayed action, or is there some generic work queue that is doing this work? Adrian On 15 November 2012 09:36, Marko Zec wrote: > On Thursday 15 November 2012 07:18:31 Adrian Chadd wrote: >> Hi, >> >> Here's what I have thus far. Please ignore the device_printf() change. >> >> This works for me, both for hotplug cardbus wireless devices as well >> as (inadvertently!) a USB bluetooth device. >> >> What do you think? > > It looks like you've hit the right spot to set curvnet context in > device_probe_and_attach(). > > Could you try out a slightly revised version (attached) - this one also > removes the now redundant curvnet setting from linker routines (kldload / > kldunload), and adds a few extra bits which might be necessary for a > broader range of drivers to work. > > Note that I haven't tested this myself as I don't have a -CURRENT machine > ATM, but a similar patch for 8.3 apparently works fine, though I don't have > hotpluggable network cards to play with (neither cardbus nor USB)...
> > Cheers, > > Marko From owner-freebsd-net@FreeBSD.ORG Thu Nov 15 19:35:36 2012 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 9876A990; Thu, 15 Nov 2012 19:35:36 +0000 (UTC) (envelope-from hselasky@c2i.net) Received: from swip.net (mailfe07.c2i.net [212.247.154.194]) by mx1.freebsd.org (Postfix) with ESMTP id AA4548FC16; Thu, 15 Nov 2012 19:35:35 +0000 (UTC) X-T2-Spam-Status: No, hits=-1.0 required=5.0 tests=ALL_TRUSTED Received: from [176.74.213.204] (account mc467741@c2i.net HELO laptop015.hselasky.homeunix.org) by mailfe07.swip.net (CommuniGate Pro SMTP 5.4.4) with ESMTPA id 344145324; Thu, 15 Nov 2012 20:30:26 +0100 From: Hans Petter Selasky To: freebsd-hackers@freebsd.org Subject: Re: VIMAGE crashes on 9.x with hotplug net80211 devices Date: Thu, 15 Nov 2012 20:32:06 +0100 User-Agent: KMail/1.13.7 (FreeBSD/9.1-PRERELEASE; KDE/4.8.4; amd64; ; ) References: <201211151836.04709.zec@fer.hr> In-Reply-To: X-Face: 'mmZ:T{)),Oru^0c+/}w'`gU1$ubmG?lp!=R4Wy\ELYo2)@'UZ24N@ =?iso-8859-1?q?d2+AyewRX=7DmAm=3BYp=0A=09=7CU=5B?=@, _z/([?1bCfM{_"B<.J>mICJCHAzzGHI{y7{%JVz%R~yJHIji`y> =?iso-8859-1?q?Y=7Dk1C4TfysrsUI=0A=09-=25GU9V5=5DiUZF=26nRn9mJ=27=3F=26?=>O MIME-Version: 1.0 Content-Type: Text/Plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Message-Id: <201211152032.06181.hselasky@c2i.net> Cc: freebsd-net@freebsd.org, Adrian Chadd , Marko Zec X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 15 Nov 2012 19:35:36 -0000 On Thursday 15 November 2012 20:16:12 Adrian Chadd wrote: > Hans brings up a very good point for USB - they split if_alloc and > if_attach across two different threads. > > So this works for non-USB devices, but not for USB devices. 
> > Hans, does each device implement its own workqueue for this kind of > delayed action, or is there some generic work queue that is doing this > work? > > Hi, I think a new thread is created for this stuff. It is inside the USB subsystem, but I would consider it a big *hack* to add VNET specific stuff in there. Isn't it possible to have curvnet return "vnet0" when nothing else is set? --HPS From owner-freebsd-net@FreeBSD.ORG Thu Nov 15 21:49:49 2012 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 07013725; Thu, 15 Nov 2012 21:49:49 +0000 (UTC) (envelope-from zec@fer.hr) Received: from mail.zvne.fer.hr (mail.zvne.fer.hr [161.53.66.5]) by mx1.freebsd.org (Postfix) with ESMTP id 773698FC16; Thu, 15 Nov 2012 21:49:48 +0000 (UTC) Received: from munja.zvne.fer.hr (161.53.66.248) by mail.zvne.fer.hr (161.53.66.5) with Microsoft SMTP Server id 14.2.318.4; Thu, 15 Nov 2012 22:49:45 +0100 Received: from sluga.fer.hr ([161.53.66.244]) by munja.zvne.fer.hr with Microsoft SMTPSVC(6.0.3790.4675); Thu, 15 Nov 2012 22:49:41 +0100 Received: from localhost ([161.53.19.8]) by sluga.fer.hr over TLS secured channel with Microsoft SMTPSVC(6.0.3790.4675); Thu, 15 Nov 2012 22:49:41 +0100 From: Marko Zec To: Hans Petter Selasky , Adrian Chadd Subject: Re: VIMAGE crashes on 9.x with hotplug net80211 devices Date: Thu, 15 Nov 2012 22:49:35 +0100 User-Agent: KMail/1.9.10 References: <201211152032.06181.hselasky@c2i.net> In-Reply-To: <201211152032.06181.hselasky@c2i.net> MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-ID: <201211152249.36161.zec@fer.hr> X-OriginalArrivalTime: 15 Nov 2012 21:49:41.0438 (UTC) FILETIME=[1DF19DE0:01CDC37B] Cc: freebsd-hackers@freebsd.org, freebsd-net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP
with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 15 Nov 2012 21:49:49 -0000 On Thursday 15 November 2012 20:32:06 Hans Petter Selasky wrote: > On Thursday 15 November 2012 20:16:12 Adrian Chadd wrote: > > Hans brings up a very good point for USB - they split if_alloc and > > if_attach across two different threads. Fine, so maybe one of the following options could work: 1) pass the vnet context embedded in some other already available struct when forwarding the request from the 1st to the 2nd thread; or 2) if we can safely assume that device attach events can only occur in the context of vnet0 (and I think we can), place a few CURVNET_SET(vnet0) macros wherever necessary in the 2nd USB "attach" thread. > > So this works for non-USB devices, but not for USB devices. Could you post a sample backtrace for me to look at? > > Hans, does each device implement its own workqueue for this kind of > > delayed action, or is there some generic work queue that is doing this > > work? > > Hi, > > I think a new thread is created for this stuff. It is inside the USB > subsystem, but I would consider it a big *hack* to add VNET specific > stuff in there. > > Isn't it possible to have curvnet return "vnet0" when nothing else is > set? No! This was already discussed on several occasions, including earlier in this thread: with curvnet pointing by default to vnet0, it would be essentially impossible to detect, trace and debug leakages between vnets.
Marko From owner-freebsd-net@FreeBSD.ORG Thu Nov 15 22:03:16 2012 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 4F4C2CB3; Thu, 15 Nov 2012 22:03:16 +0000 (UTC) (envelope-from postnet@dragas.dyndns.org) Received: from mail.dragas.org (unknown [IPv6:2001:41d0:2:ca45::3]) by mx1.freebsd.org (Postfix) with ESMTP id DA3068FC12; Thu, 15 Nov 2012 22:03:15 +0000 (UTC) Received: from localhost (localhost.localdomain [127.0.0.1]) by mail.dragas.org (Postfix) with ESMTP id 3C54BE4365; Thu, 15 Nov 2012 23:03:14 +0100 (CET) Received: from mail.dragas.org ([127.0.0.1]) by localhost (dragas.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id YbLQPx2TBzpH; Thu, 15 Nov 2012 23:03:09 +0100 (CET) Received: from dragasmbp.lan (unknown [87.238.26.165]) (using TLSv1 with cipher AES128-SHA (128/128 bits)) (No client certificate requested) by mail.dragas.org (Postfix) with ESMTPSA id 18C1EE448E; Thu, 15 Nov 2012 23:03:09 +0100 (CET) Content-Type: text/plain; charset=us-ascii Mime-Version: 1.0 (Mac OS X Mail 6.2 \(1499\)) Subject: Re: "Weighted round robin" for LAGG - anyhow? From: Stefano Marinelli In-Reply-To: <50A41EE4.5060006@freebsd.org> Date: Thu, 15 Nov 2012 23:03:08 +0100 Content-Transfer-Encoding: quoted-printable Message-Id: References: <646662D0-E66C-4168-B351-482B06ACF26A@dragas.dyndns.org> <50A41EE4.5060006@freebsd.org> To: Julian Elischer X-Mailer: Apple Mail (2.1499) Cc: freebsd-net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 15 Nov 2012 22:03:16 -0000 Hi, > you can use mpd (or ppp) in multilink mode to encapsulate your outgoing links via two different tcp paths to your other server where you can undo it.. I didn't know about mpd (or ppp multilink).
Actually, it seems to be exactly what I need. Better than lagg, for this situation. The problem is: I never dealt with PPP links (not more than "connect" on my modem), so I'm studying that for now. I've been able to set up a dual link pptp connection (still can't figure out how to create just a TCP or UDP flow, but I hope I'll sort it out). I'm going to do some tests. Thank you for your suggestion! Stefano From owner-freebsd-net@FreeBSD.ORG Thu Nov 15 23:10:49 2012 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 299D77B6; Thu, 15 Nov 2012 23:10:49 +0000 (UTC) (envelope-from wollman@hergotha.csail.mit.edu) Received: from hergotha.csail.mit.edu (wollman-1-pt.tunnel.tserv4.nyc4.ipv6.he.net [IPv6:2001:470:1f06:ccb::2]) by mx1.freebsd.org (Postfix) with ESMTP id C63C48FC08; Thu, 15 Nov 2012 23:10:48 +0000 (UTC) Received: from hergotha.csail.mit.edu (localhost [127.0.0.1]) by hergotha.csail.mit.edu (8.14.5/8.14.5) with ESMTP id qAFNAlle013379; Thu, 15 Nov 2012 18:10:47 -0500 (EST) (envelope-from wollman@hergotha.csail.mit.edu) Received: (from wollman@localhost) by hergotha.csail.mit.edu (8.14.5/8.14.4/Submit) id qAFNAlhY013376; Thu, 15 Nov 2012 18:10:47 -0500 (EST) (envelope-from wollman) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-ID: <20645.30327.506353.158003@hergotha.csail.mit.edu> Date: Thu, 15 Nov 2012 18:10:47 -0500 From: Garrett Wollman To: freebsd-net@freebsd.org, freebsd-fs@freebsd.org Subject: NFS over SCTP -- is anyone likely to implement this?
X-Mailer: VM 7.17 under 21.4 (patch 22) "Instant Classic" XEmacs Lucid X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.2.7 (hergotha.csail.mit.edu [127.0.0.1]); Thu, 15 Nov 2012 18:10:47 -0500 (EST) X-Spam-Status: No, score=-1.0 required=5.0 tests=ALL_TRUSTED autolearn=disabled version=3.3.2 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on hergotha.csail.mit.edu X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 15 Nov 2012 23:10:49 -0000 I'm working on (of all things) a Puppet module to configure NFS servers, and I'm wondering if anyone expects to implement NFS over SCTP on FreeBSD. -GAWollman From owner-freebsd-net@FreeBSD.ORG Fri Nov 16 04:53:31 2012 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id B4C4219D for ; Fri, 16 Nov 2012 04:53:31 +0000 (UTC) (envelope-from postmaster@mailpod.hostingplatform.com) Received: from atl4mhob13.myregisteredsite.com (atl4mhob13.myregisteredsite.com [209.17.115.51]) by mx1.freebsd.org (Postfix) with ESMTP id 605068FC12 for ; Fri, 16 Nov 2012 04:53:31 +0000 (UTC) Received: from mailpod1.hostingplatform.com (mailpod1.networksolutionsemail.com [206.188.198.65]) by atl4mhob13.myregisteredsite.com (8.14.4/8.14.4) with ESMTP id qAG4rPwX000772 for ; Thu, 15 Nov 2012 23:53:25 -0500 Received: (qmail 8537 invoked by uid 0); 16 Nov 2012 04:53:25 -0000 Received: (qmail 24219 invoked by uid 0); 15 Nov 2012 21:50:18 -0000 Received: from unknown (HELO atl4mhib33.myregisteredsite.com) (209) by 0 with SMTP; 15 Nov 2012 21:50:18 -0000 Received: from mx2.freebsd.org (mx2.freebsd.org [69.147.83.53]) by atl4mhib33.myregisteredsite.com (8.14.4/8.14.4) with ESMTP id qAFLoHrY019905 for ; Thu, 15 Nov 2012 16:50:17 -0500 Received: from 
hub.freebsd.org (hub.FreeBSD.org [8.8.178.136]) by mx2.freebsd.org (Postfix) with ESMTP id 76FE53B6A9B; Thu, 15 Nov 2012 21:50:03 +0000 (UTC) Received: from hub.freebsd.org (hub.freebsd.org [8.8.178.136]) by hub.freebsd.org (Postfix) with ESMTP id B9D61851; Thu, 15 Nov 2012 21:50:03 +0000 (UTC) (envelope-from owner-freebsd-hackers@freebsd.org) Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 07013725; Thu, 15 Nov 2012 21:49:49 +0000 (UTC) (envelope-from zec@fer.hr) Received: from mail.zvne.fer.hr (mail.zvne.fer.hr [161.53.66.5]) by mx1.freebsd.org (Postfix) with ESMTP id 773698FC16; Thu, 15 Nov 2012 21:49:48 +0000 (UTC) Received: from munja.zvne.fer.hr (161.53.66.248) by mail.zvne.fer.hr (161.53.66.5) with Microsoft SMTP Server id 14.2.318.4; Thu, 15 Nov 2012 22:49:45 +0100 Received: from sluga.fer.hr ([161.53.66.244]) by munja.zvne.fer.hr with Microsoft SMTPSVC(6.0.3790.4675); Thu, 15 Nov 2012 22:49:41 +0100 Received: from localhost ([161.53.19.8]) by sluga.fer.hr over TLS secured channel with Microsoft SMTPSVC(6.0.3790.4675); Thu, 15 Nov 2012 22:49:41 +0100 From: Marko Zec To: Hans Petter Selasky , Adrian Chadd Subject: Re: VIMAGE crashes on 9.x with hotplug net80211 devices Date: Thu, 15 Nov 2012 22:49:35 +0100 User-Agent: KMail/1.9.10 References: <201211152032.06181.hselasky@c2i.net> In-Reply-To: <201211152032.06181.hselasky@c2i.net> MIME-Version: 1.0 Content-Disposition: inline Message-ID: <201211152249.36161.zec@fer.hr> X-OriginalArrivalTime: 15 Nov 2012 21:49:41.0438 (UTC) FILETIME=[1DF19DE0:01CDC37B] X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Errors-To: owner-freebsd-hackers@freebsd.org Sender: owner-freebsd-hackers@freebsd.org X-SpamScore: 0 X-MailHub-Apparently-To: mjm@michaelmeltzer.com X-MailHub-Forwarded: Yes Cc: 
freebsd-hackers@freebsd.org, freebsd-net@freebsd.org X-BeenThere: freebsd-net@freebsd.org List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 16 Nov 2012 04:53:31 -0000 On Thursday 15 November 2012 20:32:06 Hans Petter Selasky wrote: > On Thursday 15 November 2012 20:16:12 Adrian Chadd wrote: > > Hans brings up a very good point for USB - they split if_alloc and > > if_attach across two different threads. Fine, so maybe one of the following options could work: 1) pass the vnet context embedded in some other already available struct when forwarding the request from the 1st to the 2nd thread; or 2) if we can safely assume that device attach events can only occur in the context of vnet0 (and I think we can), place a few CURVNET_SET(vnet0) macros wherever necessary in the 2nd USB "attach" thread. > > So this works for non-USB devices, but not for USB devices. Could you post a sample backtrace for me to look at? > > Hans, does each device implement its own workqueue for this kind of > > delayed action, or is there some generic work queue that is doing this > > work? > > Hi, > > I think a new thread is created for this stuff. It is inside the USB > subsystem, but would consider this a big *hack* to add VNET specific > stuff in there. > > Isn't it possible to have curvnet return "vnet0" when nothing else is > set? No! This has already been discussed on several occasions, including earlier in this thread: with curvnet pointing by default to vnet0, it would be essentially impossible to detect, trace and debug leakages between vnets.
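As a rough illustration of option 2, and of why a NULL default is useful, here is a minimal userland sketch in C. It is not kernel code: struct vnet, vnet0, curvnet, CURVNET_SET() and CURVNET_RESTORE() are simplified stand-ins defined locally to model the kernel facilities of the same names (in the kernel curvnet is per-thread; here it is a plain global), and if_attach_model() and the two usb_attach_*() functions are hypothetical names invented for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's struct vnet (assumption). */
struct vnet { int id; };

static struct vnet vnet0_storage = { 0 };
static struct vnet *const vnet0 = &vnet0_storage;

/* No default context: NULL until a caller explicitly sets one. */
static struct vnet *curvnet = NULL;

/* Simplified models of the kernel macros (assumption, not the real ones). */
#define CURVNET_SET(arg)  struct vnet *saved_vnet = curvnet; curvnet = (arg)
#define CURVNET_RESTORE() curvnet = saved_vnet

/* Hypothetical stack code that needs a vnet context. */
static int
if_attach_model(void)
{
	if (curvnet == NULL)
		return (-1);	/* the real kernel would fault here */
	return (curvnet->id);
}

/* Second "attach" thread with no context set: the failure is visible. */
static int
usb_attach_without_context(void)
{
	return (if_attach_model());
}

/* Option 2: wrap the attach in CURVNET_SET(vnet0) / CURVNET_RESTORE(). */
static int
usb_attach_with_vnet0(void)
{
	int rv;

	CURVNET_SET(vnet0);
	assert(curvnet == vnet0);
	rv = if_attach_model();
	CURVNET_RESTORE();	/* back to the previous (NULL) context */
	return (rv);
}
```

With curvnet left NULL by default, usb_attach_without_context() fails visibly, which is what makes a missing context detectable. If curvnet silently defaulted to vnet0, both paths would succeed and a missing CURVNET_SET() would go unnoticed.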
Marko _______________________________________________ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to "freebsd-hackers-unsubscribe@freebsd.org" From owner-freebsd-net@FreeBSD.ORG Sat Nov 17 18:23:56 2012 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id DE40A438; Sat, 17 Nov 2012 18:23:56 +0000 (UTC) (envelope-from adrian.chadd@gmail.com) Received: from mail-pa0-f54.google.com (mail-pa0-f54.google.com [209.85.220.54]) by mx1.freebsd.org (Postfix) with ESMTP id A44EE8FC0C; Sat, 17 Nov 2012 18:23:56 +0000 (UTC) Received: by mail-pa0-f54.google.com with SMTP id kp6so2684355pab.13 for ; Sat, 17 Nov 2012 10:23:56 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type :content-transfer-encoding; bh=LI1X5TTHWJBVTZ1RlyuTiIcUq/ly5+Rz0aL2SMuW23E=; b=jOFxvz7cY+UdRoko3xJxiPg4HGwc2CLUkiRVqa346PbB0/mGbJA0CHu80aFPvyp4yo BsTRAavhft+pknHWOkj3WKReWwF9+L9e99egJys9x0fGddt5sMe4TV1rv0lW/i3ZB3tU So11aoGydfVaROQuI2+/tBUpsDj4fyo0egABZndQD2x86FIg6+Pt9/OqZAcoWZY6o5XP 2MUkEA6fPMDpFQN04XjzIxBVowohs2RMQo6C9yRj1VA2v90h5sRGxPbCC63YmURoBgxv +YE+vKTAWd128lYdXXh7ieFtjr3VeVha0V9KHnQSTyw81ePvSm42CvtFfftuH07HRdcs e01g== MIME-Version: 1.0 Received: by 10.68.137.198 with SMTP id qk6mr26045807pbb.60.1353176636269; Sat, 17 Nov 2012 10:23:56 -0800 (PST) Sender: adrian.chadd@gmail.com Received: by 10.68.124.130 with HTTP; Sat, 17 Nov 2012 10:23:56 -0800 (PST) In-Reply-To: References: Date: Sat, 17 Nov 2012 10:23:56 -0800 X-Google-Sender-Auth: gnDE4nlhRv48QKefE4HFzrExOUY Message-ID: Subject: Re: netisr panic? 
From: Adrian Chadd To: Ian FREISLICH Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Cc: FreeBSD Net , freebsd-current@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 17 Nov 2012 18:23:57 -0000 Check what mtod() is doing. mbuf.h:#define mtod(m, t) ((t)((m)->m_data)) .. so if m->m_data is NULL, bam. The question is why is m_data NULL here. Someone mbuf cluey is going to have to answer that. I don't know whether the MH_dat stuff is being treated as valid but m_data isn't being updated, or something. Adrian On 17 November 2012 10:13, Ian FREISLICH wrote: > Adrian Chadd wrote: >> It's a NULL ponter deref. This is my line 484 in if_ethersubr.c: >> >> eh = mtod(m, struct ether_header *); >> >> >> .. if that's yours, see if eh is NULL? > > (kgdb) frame 7 > #7 0xffffffff8050f534 in ether_nh_input (m=0xfffffe012521e700) > at /usr/src/sys/net/if_ethersubr.c:484 > 484 eh = mtod(m, struct ether_header *); > (kgdb) print eh > No symbol "eh" in current context. > (kgdb) print *m > $2 = {m_hdr = {mh_next = 0x100000000000000, mh_nextpkt = 0x10000000000, > mh_data = 0x0, mh_len = 60, mh_flags = 4259842, mh_type = 0, > pad = "\000\000\000\000\000"}, M_dat = {MH = {MH_pkthdr = { > rcvif = 0xfffffe000a1c2000, header = 0xffffffff, len = 60, flowid = 0, > csum_flags = 3840, csum_data = 65535, tso_segsz = 0, PH_vt = { > vt_vtag = 4, vt_nrecs = 4}, tags = {slh_first = 0x3c000000}}, > MH_dat = {MH_ext = { > ext_buf = 0x69e5498600000000
, ext_free = 0x10602, ext_arg1 = 0xc000000070000, ext_arg2 = 0x100, > ext_size = 2048, ref_cnt = 0xfffffe0125236d8c, ext_type = 6}, > MH_databuf = "\000\000\000\000\206Iеi\002\006\001\000\000\000\000\000\000\000\a\000\000\000\f\000\000\001\000\000\000\000\000\000\000\b\000\000\000\000\000\000\214m#%\001юяя\006", '\0' }}, > M_databuf = "\000 \034\n\000юяяяяяя\000\000\000\000<\000\000\000\000\000\000\000\000\017\000\000яя\000\000\000\000\004\000\000\000\000\000\000\000\000<\000\000\000\000\000\000\000\000\206Iеi\002\006\001\000\000\000\000\000\000\000\a\000\000\000\f\000\000\001\000\000\000\000\000\000\000\b\000\000\000\000\000\000\214m#%\001юяя\006", '\0' }} > > > Ian > > -- > Ian Freislich > > _______________________________________________ > freebsd-current@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-current > To unsubscribe, send any mail to "freebsd-current-unsubscribe@freebsd.org" From owner-freebsd-net@FreeBSD.ORG Sat Nov 17 21:31:19 2012 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 3F052A31; Sat, 17 Nov 2012 21:31:19 +0000 (UTC) (envelope-from rwatson@FreeBSD.org) Received: from cyrus.watson.org (cyrus.watson.org [65.122.17.42]) by mx1.freebsd.org (Postfix) with ESMTP id E904D8FC12; Sat, 17 Nov 2012 21:31:18 +0000 (UTC) Received: from fledge.watson.org (fledge.watson.org [65.122.17.41]) by cyrus.watson.org (Postfix) with ESMTPS id 481E746B0D; Sat, 17 Nov 2012 16:31:18 -0500 (EST) Date: Sat, 17 Nov 2012 21:31:18 +0000 (GMT) From: Robert Watson X-X-Sender: robert@fledge.watson.org To: Adrian Chadd Subject: Re: netisr panic?
In-Reply-To: Message-ID: References: User-Agent: Alpine 2.00 (BSF 1167 2008-08-23) MIME-Version: 1.0 Content-Type: MULTIPART/MIXED; BOUNDARY="621616949-1601276811-1353187878=:94966" Cc: FreeBSD Net , Ian FREISLICH , freebsd-current@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 17 Nov 2012 21:31:19 -0000 This message is in MIME format. The first part should be readable text, while the remaining parts are likely unreadable without MIME-aware tools. --621616949-1601276811-1353187878=:94966 Content-Type: TEXT/PLAIN; charset=utf-8; format=flowed Content-Transfer-Encoding: 8BIT Panics along these lines often occur if there is a concurrency bug in a device driver such that it modifies an mbuf after dispatching to the network stack. E.g., by freeing it, reusing it, an errant dereference, etc. Not guaranteed, but that is where I'd start. Robert On Sat, 17 Nov 2012, Adrian Chadd wrote: > Check what mtod() is doing. > > mbuf.h:#define mtod(m, t) ((t)((m)->m_data)) > > .. so if m->m_data is NULL, bam. > > The question is why is m_data NULL here. Someone mbuf cluey is going > to have to answer that. I don't know whether the MH_dat stuff is being > treated as valid but m_data isn't being updated, or something. > > > Adrian > > On 17 November 2012 10:13, Ian FREISLICH wrote: >> Adrian Chadd wrote: >>> It's a NULL ponter deref. This is my line 484 in if_ethersubr.c: >>> >>> eh = mtod(m, struct ether_header *); >>> >>> >>> .. if that's yours, see if eh is NULL? >> >> (kgdb) frame 7 >> #7 0xffffffff8050f534 in ether_nh_input (m=0xfffffe012521e700) >> at /usr/src/sys/net/if_ethersubr.c:484 >> 484 eh = mtod(m, struct ether_header *); >> (kgdb) print eh >> No symbol "eh" in current context. 
>> (kgdb) print *m >> $2 = {m_hdr = {mh_next = 0x100000000000000, mh_nextpkt = 0x10000000000, >> mh_data = 0x0, mh_len = 60, mh_flags = 4259842, mh_type = 0, >> pad = "\000\000\000\000\000"}, M_dat = {MH = {MH_pkthdr = { >> rcvif = 0xfffffe000a1c2000, header = 0xffffffff, len = 60, flowid = 0, >> csum_flags = 3840, csum_data = 65535, tso_segsz = 0, PH_vt = { >> vt_vtag = 4, vt_nrecs = 4}, tags = {slh_first = 0x3c000000}}, >> MH_dat = {MH_ext = { >> ext_buf = 0x69e5498600000000
, ext_free = 0x10602, ext_arg1 = 0xc000000070000, ext_arg2 = 0x100, >> ext_size = 2048, ref_cnt = 0xfffffe0125236d8c, ext_type = 6}, >> MH_databuf = "\000\000\000\000\206Iеi\002\006\001\000\000\000\000\000\000\000\a\000\000\000\f\000\000\001\000\000\000\000\000\000\000\b\000\000\000\000\000\000\214m#%\001юяя\006", '\0' }}, >> M_databuf = "\000 \034\n\000юяяяяяя\000\000\000\000<\000\000\000\000\000\000\000\000\017\000\000яя\000\000\000\000\004\000\000\000\000\000\000\000\000<\000\000\000\000\000\000\000\000\206Iеi\002\006\001\000\000\000\000\000\000\000\a\000\000\000\f\000\000\001\000\000\000\000\000\000\000\b\000\000\000\000\000\000\214m#%\001юяя\006", '\0' }} >> >> >> Ian >> >> -- >> Ian Freislich >> >> _______________________________________________ >> freebsd-current@freebsd.org mailing list >> http://lists.freebsd.org/mailman/listinfo/freebsd-current >> To unsubscribe, send any mail to "freebsd-current-unsubscribe@freebsd.org" > _______________________________________________ > freebsd-current@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-current > To unsubscribe, send any mail to "freebsd-current-unsubscribe@freebsd.org" --621616949-1601276811-1353187878=:94966-- From owner-freebsd-net@FreeBSD.ORG Sat Nov 17 22:01:20 2012 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 99D40DFB; Sat, 17 Nov 2012 22:01:20 +0000 (UTC) (envelope-from postnet@dragas.dyndns.org) Received: from mail.dragas.org (unknown [IPv6:2001:41d0:2:ca45::3]) by mx1.freebsd.org (Postfix) with ESMTP id 30D718FC08; Sat, 17 Nov 2012 22:01:20 +0000 (UTC) Received: from localhost (localhost.localdomain [127.0.0.1]) by mail.dragas.org (Postfix) with ESMTP id 7191BE01C4; Sat, 17 Nov 2012 23:01:19 +0100 (CET) Received: from mail.dragas.org ([127.0.0.1]) by localhost (dragas.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 2Mk4YLrG3vQX; Sat, 17 Nov 2012 
23:01:13 +0100 (CET) Received: from dragasmbp.lan (unknown [87.238.26.165]) (using TLSv1 with cipher AES128-SHA (128/128 bits)) (No client certificate requested) by mail.dragas.org (Postfix) with ESMTPSA id 670F6E4251; Sat, 17 Nov 2012 23:01:13 +0100 (CET) Content-Type: text/plain; charset=us-ascii Mime-Version: 1.0 (Mac OS X Mail 6.2 \(1499\)) Subject: Re: "Weighted round robin" for LAGG - anyhow? From: Stefano Marinelli In-Reply-To: <50A41EE4.5060006@freebsd.org> Date: Sat, 17 Nov 2012 23:01:12 +0100 Content-Transfer-Encoding: quoted-printable Message-Id: References: <646662D0-E66C-4168-B351-482B06ACF26A@dragas.dyndns.org> <50A41EE4.5060006@freebsd.org> To: Julian Elischer X-Mailer: Apple Mail (2.1499) Cc: freebsd-net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 17 Nov 2012 22:01:20 -0000 Hi, > you can use mpd (or ppp) in multilink mode to encapsulate your outgoing links via two different tcp paths to your other server where you can undo it.. > mpd (and ppp I think) will allow you to select a couple of different multiplexing schemes, including one where each packet is cut up and sent, so that the slower link would be sending 1/3 of the packet and the faster link would be sending 2/3 of the packet. Just to confirm: I had great success with mpd. I have been able to create two pptp tunnels (to two different IPs on the same machine). With the right link speeds set, I can get almost the summed speed of the two links. Amazing. I now have to face some problems (I'd love to remove one of the two IPs, but I don't seem to be able to create a proper separate route. Maybe I can try playing with ports and setfib, which I am already using for other tasks on that router). Thank you very much for this hint.
MPD is surely much more efficient than lagg, with different speeds. Stefano From owner-freebsd-net@FreeBSD.ORG Sat Nov 17 23:13:49 2012 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 89335C33; Sat, 17 Nov 2012 23:13:49 +0000 (UTC) (envelope-from vijju.singh@gmail.com) Received: from mail-ia0-f182.google.com (mail-ia0-f182.google.com [209.85.210.182]) by mx1.freebsd.org (Postfix) with ESMTP id 1C67F8FC08; Sat, 17 Nov 2012 23:13:48 +0000 (UTC) Received: by mail-ia0-f182.google.com with SMTP id x2so3255307iad.13 for ; Sat, 17 Nov 2012 15:13:48 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=references:mime-version:in-reply-to:content-type :content-transfer-encoding:message-id:cc:x-mailer:from:subject:date :to; bh=0vAbS1fimPPUkbq4dodbUJI/vliWdMN5stgTfMhfY9M=; b=zWmCnGdiBineyN26MaNo2bHycdlYCs/VmU4HzYEe4Gjkx2TQAKkzvvU2eD5QlOaSZu fGkVjSEWX+TFSLwLLxXe54mpFf2xocZ6en83AW96QAVyq/5eC/Psbv24d6WieRTFoVie yMJnpDokFWOX/Nk60jpRq4vAKdxBHPyZFho79Lk2avtHpAKfDH4ehZq8AYmzcqRcRtG2 V/hmRnfQ/DudfDZtp9PVSfwLYlpRNEu8Yt7lHRVn4JIg5YvSElXMcH9bAm4efa8D+lvi x+odgWo9tZXENkxX1jsaF7mztEjSGC62oY6nbsKCtEFgoo+Gq1n+vmayxIudX6AFjvxh BoVA== Received: by 10.50.37.196 with SMTP id a4mr3047999igk.16.1353194028332; Sat, 17 Nov 2012 15:13:48 -0800 (PST) Received: from [192.168.1.64] (108-64-226-69.lightspeed.sntcca.sbcglobal.net. [108.64.226.69]) by mx.google.com with ESMTPS id u4sm3914586igw.6.2012.11.17.15.13.44 (version=SSLv3 cipher=OTHER); Sat, 17 Nov 2012 15:13:46 -0800 (PST) References: Mime-Version: 1.0 (1.0) In-Reply-To: Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Message-Id: <55449D50-389B-4080-9B6F-A7CC0C0A2D1E@gmail.com> X-Mailer: iPhone Mail (10A523) From: Vijay Singh Subject: Re: netisr panic?
Date: Sat, 17 Nov 2012 15:13:41 -0800 To: Robert Watson Cc: FreeBSD Net , Adrian Chadd , Ian FREISLICH , "freebsd-current@freebsd.org" X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 17 Nov 2012 23:13:49 -0000 Or cluster allocation failed, and only the mbuf was used. Sent from my iPhone On Nov 17, 2012, at 1:31 PM, Robert Watson wrote: > Panics along these lines often occur if there is a concurrency bug in a device driver such that it modifies an mbuf after dispatching to the network stack. E.g., by freeing it, reusing it, an errant dereference, etc. Not guaranteed, but that is where I'd start. > > Robert > > On Sat, 17 Nov 2012, Adrian Chadd wrote: > >> Check what mtod() is doing. >> >> mbuf.h:#define mtod(m, t) ((t)((m)->m_data)) >> >> .. so if m->m_data is NULL, bam. >> >> The question is why is m_data NULL here. Someone mbuf cluey is going >> to have to answer that. I don't know whether the MH_dat stuff is being >> treated as valid but m_data isn't being updated, or something. >> >> >> Adrian >> >> On 17 November 2012 10:13, Ian FREISLICH wrote: >>> Adrian Chadd wrote: >>>> It's a NULL ponter deref. This is my line 484 in if_ethersubr.c: >>>> >>>> eh = mtod(m, struct ether_header *); >>>> >>>> >>>> .. if that's yours, see if eh is NULL? >>> >>> (kgdb) frame 7 >>> #7 0xffffffff8050f534 in ether_nh_input (m=0xfffffe012521e700) >>> at /usr/src/sys/net/if_ethersubr.c:484 >>> 484 eh = mtod(m, struct ether_header *); >>> (kgdb) print eh >>> No symbol "eh" in current context.
>>> (kgdb) print *m >>> $2 = {m_hdr = {mh_next = 0x100000000000000, mh_nextpkt = 0x10000000000, >>> mh_data = 0x0, mh_len = 60, mh_flags = 4259842, mh_type = 0, >>> pad = "\000\000\000\000\000"}, M_dat = {MH = {MH_pkthdr = { >>> rcvif = 0xfffffe000a1c2000, header = 0xffffffff, len = 60, flowid = 0, >>> csum_flags = 3840, csum_data = 65535, tso_segsz = 0, PH_vt = { >>> vt_vtag = 4, vt_nrecs = 4}, tags = {slh_first = 0x3c000000}}, >>> MH_dat = {MH_ext = { >>> ext_buf = 0x69e5498600000000
, ext_free = 0x10602, ext_arg1 = 0xc000000070000, ext_arg2 = 0x100, >>> ext_size = 2048, ref_cnt = 0xfffffe0125236d8c, ext_type = 6}, >>> MH_databuf = "\000\000\000\000\206Iеi\002\006\001\000\000\000\000\000\000\000\a\000\000\000\f\000\000\001\000\000\000\000\000\000\000\b\000\000\000\000\000\000\214m#%\001юяя\006", '\0' }}, >>> M_databuf = "\000 \034\n\000юяяяяяя\000\000\000\000<\000\000\000\000\000\000\000\000\017\000\000яя\000\000\000\000\004\000\000\000\000\000\000\000\000<\000\000\000\000\000\000\000\000\206Iеi\002\006\001\000\000\000\000\000\000\000\a\000\000\000\f\000\000\001\000\000\000\000\000\000\000\b\000\000\000\000\000\000\214m#%\001юяя\006", '\0' }} >>> >>> >>> Ian >>> >>> -- >>> Ian Freislich >>> >>> _______________________________________________ >>> freebsd-current@freebsd.org mailing list >>> http://lists.freebsd.org/mailman/listinfo/freebsd-current >>> To unsubscribe, send any mail to "freebsd-current-unsubscribe@freebsd.org" >> _______________________________________________ >> freebsd-current@freebsd.org mailing list >> http://lists.freebsd.org/mailman/listinfo/freebsd-current >> To unsubscribe, send any mail to "freebsd-current-unsubscribe@freebsd.org" > _______________________________________________ > freebsd-net@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-net > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"