From owner-freebsd-net@FreeBSD.ORG Sun Nov 27 01:00:02 2005
Date: Sun, 27 Nov 2005 03:59:43 +0300
From: Gleb Smirnoff
To: ru@FreeBSD.org, Vsevolod Lobko, rwatson@FreeBSD.org
Cc: net@FreeBSD.org
Subject: parallelizing ipfw table
Message-ID: <20051127005943.GR25711@cell.sick.ru>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="7LkOrbQMr4cezO2T"
Content-Disposition: inline
User-Agent: Mutt/1.5.6i

--7LkOrbQMr4cezO2T
Content-Type: text/plain; charset=koi8-r
Content-Disposition: inline

Colleagues,

in ipfw(4) we have reader/writer locking semantics. The ipfw lookups performed on packet forwarding obtain a reader lock on the ipfw chain, while altering the chain requires writer access to the chain.
So, on a multiprocessor, multi-interface box we achieve parallelism almost without any contention. However, ipfw tables lock the RADIX trie on every lookup. At first glance, the radix.c:rn_lookup() function is reentrant. This means that we can do two RADIX lookups in parallel.

So, I suggest eliminating the RADIX trie locking in ipfw and relying on the locking that is already present in ipfw. This will:

- reduce the number of mutex operations for each packet
- remove contention from parallel ipfw_chk() lookups

A patch demonstrating the idea is attached. It is not tested yet; read below.

The patch moves the tables array into the ip_fw_chain structure. This is not necessary now, but in the future we may have multiple independent chains in ipfw, which is why I try to avoid using &layer3_chain in functions that are deeper in the call graph. I try to pass the chain pointer in from the caller.

The only problem is the caching in table lookup. This "hack" makes the lookup function modify the table structure. We need to remove the caching to make the lookup_table() function fully lockless and reentrant at the same time. The attached patch doesn't remove the caching, since it only demonstrates the original idea.

-- 
Totus tuus, Glebius.
GLEBIUS-RIPN GLEB-RIPE

--7LkOrbQMr4cezO2T
Content-Type: text/plain; charset=koi8-r
Content-Disposition: attachment; filename="ip_fw.table.unlock"

RCS file: /home/ncvs/src/sys/netinet/ip_fw2.c,v
Working file: ip_fw2.c
head: 1.115
branch:
locks: strict
access list:
symbolic names:
	RELENG_6_0_0_RELEASE: 1.106.2.3
	RELENG_6_0: 1.106.2.3.0.2
	RELENG_6_0_BP: 1.106.2.3
	RELENG_6: 1.106.0.2
	RELENG_6_BP: 1.106
	RELENG_5_4_0_RELEASE: 1.70.2.10
	RELENG_5_4: 1.70.2.10.0.2
	RELENG_5_4_BP: 1.70.2.10
	RELENG_4_11_0_RELEASE: 1.6.2.23
	RELENG_4_11: 1.6.2.23.0.2
	RELENG_4_11_BP: 1.6.2.23
	RELENG_5_3_0_RELEASE: 1.70.2.7
	RELENG_5_3: 1.70.2.7.0.2
	RELENG_5_3_BP: 1.70.2.7
	RELENG_5: 1.70.0.2
	RELENG_5_BP: 1.70
	RELENG_4_10_0_RELEASE: 1.6.2.21
	RELENG_4_10: 1.6.2.21.0.2
	RELENG_4_10_BP: 1.6.2.21
	RELENG_5_2_1_RELEASE: 1.51.2.1
	RELENG_5_2_0_RELEASE: 1.51.2.1
	RELENG_5_2: 1.51.0.2
	RELENG_5_2_BP: 1.51
	RELENG_4_9_0_RELEASE: 1.6.2.18
	RELENG_4_9: 1.6.2.18.0.2
	RELENG_4_9_BP: 1.6.2.18
	RELENG_5_1_0_RELEASE: 1.28.2.1
	RELENG_5_1: 1.28.0.2
	RELENG_5_1_BP: 1.28
	RELENG_4_8_0_RELEASE: 1.6.2.11
	RELENG_4_8: 1.6.2.11.0.2
	RELENG_4_8_BP: 1.6.2.11
	RELENG_5_0_0_RELEASE: 1.19.2.1
	RELENG_5_0: 1.19.0.2
	RELENG_5_0_BP: 1.19
	RELENG_4_7_0_RELEASE: 1.6.2.3
	RELENG_4_7: 1.6.2.3.0.2
	RELENG_4_7_BP: 1.6.2.3
	RELENG_4: 1.6.0.2
keyword substitution: kv
total revisions: 161; selected revisions: 161
description:
----------------------------
revision 1.115
date: 2005/11/10 22:10:39; author: suz; state: Exp; lines: +8 -2
fixed a bug that uRPF does not work properly for an IPv6 packet bound for the sending machine itself (this is a bug introduced due to a change in ip6_input.c:Rev.1.83)

Pointed out by: Sean McNeil and J.R.Oldroyd
MFC after: 3 days
----------------------------
revision 1.114
date: 2005/11/02 13:46:31; author: andre; state: Exp; lines: +1 -1
Retire MT_HEADER mbuf type and change its users to use MT_DATA.

Having an additional MT_HEADER mbuf type is superfluous and redundant as nothing depends on it. It only adds a layer of confusion.
The distinction between header mbuf's and data mbuf's is solely done through the m->m_flags M_PKTHDR flag.

Non-native code is not changed in this commit. For compatibility MT_HEADER is mapped to MT_DATA.

Sponsored by: TCP/IP Optimization Fundraise 2005
----------------------------
revision 1.113
date: 2005/09/27 18:10:42; author: mlaier; state: Exp; lines: +1 -1
Remove bridge(4) from the tree. if_bridge(4) is a full functional replacement and has additional features which make it superior.

Discussed on: -arch
Reviewed by: thompsa
X-MFC-after: never (RELENG_6 as transition period)
----------------------------
revision 1.112
date: 2005/09/19 22:29:21; author: andre; state: Exp; lines: +24 -24
Use monotonic 'time_uptime' instead of 'time_second' as timebase for timeouts.
----------------------------
revision 1.111
date: 2005/09/14 07:53:54; author: bz; state: Exp; lines: +30 -6
Fix panic when kernel compiled without INET6 by rejecting IPv6 opcodes which are behind #if(n)def INET6 now.

PR: kern/85826
MFC after: 3 days
----------------------------
revision 1.110
date: 2005/09/04 17:33:40; author: sam; state: Exp; lines: +1 -0
clear lock on error in O_LIMIT case of install_state

Submitted by: Ted Unangst
MFC after: 3 days
----------------------------
revision 1.109
date: 2005/08/14 18:20:33; author: bz; state: Exp; lines: +7 -1
Fix broken build of rev. 1.108 in case of no INET6 and IPFIREWALL compiled into kernel.

Spotted and tested by: Michal Mertl
----------------------------
revision 1.108
date: 2005/08/13 11:02:33; author: bz; state: Exp; lines: +283 -67
* Add dynamic sysctl for net.inet6.ip6.fw.
* Correct handling of IPv6 Extension Headers.
* Add unreach6 code.
* Add logging for IPv6.
Submitted by: sysctl handling derived from patch from ume needed for ip6fw
Obtained from: is_icmp6_query and send_reject6 derived from similar functions of netinet6,ip6fw
Reviewed by: ume, gnn; silence on ipfw@
Test setup provided by: CK Software GmbH
MFC after: 6 days
----------------------------
revision 1.107
date: 2005/07/26 00:19:58; author: ume; state: Exp; lines: +3 -0
include scope6_var.h for in6_clearscope().
----------------------------
revision 1.106
date: 2005/07/03 15:42:22; author: mlaier; state: Exp; lines: +17 -20
branches: 1.106.2;
Remove ambiguity from hlen. IPv4 is now indicated by is_ipv4 and we need a proper hlen value for IPv6 to implement O_REJECT and O_LOG.

Reviewed by: glebius, brooks, gnn
Approved by: re (scottl)
----------------------------
revision 1.105
date: 2005/06/29 21:36:49; author: simon; state: Exp; lines: +20 -17
Fix ipfw packet matching errors with address tables.

The ipfw tables lookup code caches the result of the last query. The kernel may process multiple packets concurrently, performing several concurrent table lookups. Due to an insufficient locking, a cached result can become corrupted that could cause some addresses to be incorrectly matched against a lookup table.

Submitted by: ru
Reviewed by: csjp, mlaier
Security: CAN-2005-2019
Security: FreeBSD-SA-05:13.ipfw

Correct bzip2 permission race condition vulnerability.

Obtained from: Steve Grubb via RedHat
Security: CAN-2005-0953
Security: FreeBSD-SA-05:14.bzip2
Approved by: obrien

Correct TCP connection stall denial of service vulnerability.

A TCP packets with the SYN flag set is accepted for established connections, allowing an attacker to overwrite certain TCP options.
Submitted by: Noritoshi Demizu
Reviewed by: andre, Mohan Srinivasan
Security: CAN-2005-2068
Security: FreeBSD-SA-05:15.tcp
Approved by: re (security blanket), cperciva
----------------------------
revision 1.104
date: 2005/06/16 14:55:58; author: mlaier; state: Exp; lines: +52 -18
In verify_rev_path6():
- do not use static memory as we are under a shared lock only
- properly rtfree routes allocated with rtalloc
- rename to verify_path6()
- implement the full functionality of the IPv4 version

Also make O_ANTISPOOF work with IPv6.

Reviewed by: gnn
Approved by: re (blanket)
----------------------------
revision 1.103
date: 2005/06/16 13:20:36; author: mlaier; state: Exp; lines: +47 -47
Fix indentation in INET6 section in preperation of more serious work.

Approved by: re (blanket ip6fw removal)
----------------------------
revision 1.102
date: 2005/06/12 16:27:10; author: mlaier; state: Exp; lines: +13 -10
When doing matching based on dst_ip/src_ip make sure we are really looking on an IPv4 packet as these variables are uninitialized if not. This used to allow arbitrary IPv6 packets depending on the value in the uninitialized variables.

Some opcodes (most noteably O_REJECT) do not support IPv6 at all right now.

Reviewed by: brooks, glebius
Security: IPFW might pass IPv6 packets depending on stack contents.
Approved by: re (blanket)
----------------------------
revision 1.101
date: 2005/06/10 12:28:17; author: green; state: Exp; lines: +32 -8
Modify send_pkt() to return the generated packet and have the caller do the subsequent ip_output() in IPFW. In ipfw_tick(), the keep-alive packets must be generated from the data that resides under the stateful lock, but they must not be sent at that time, as this would cause a lock order reversal with the normal ordering (interface's lock, then locks belonging to the pfil hooks).

In practice, this caused deadlocks when using IPFW and if_bridge(4) together to do stateful transparent filtering.
MFC after: 1 week
----------------------------
revision 1.100
date: 2005/06/04 19:04:31; author: green; state: Exp; lines: +3 -0
Better explain, then actually implement the IPFW ALTQ-rule first-match policy. It may be used to provide more detailed classification of traffic without actually having to decide its fate at the time of classification.

MFC after: 1 week
----------------------------
revision 1.99
date: 2005/06/03 01:10:28; author: mlaier; state: Exp; lines: +7 -0
Add support for IPv4 only rules to IPFW2 now that it supports IPv6 as well. This is the last requirement before we can retire ip6fw.

Reviewed by: dwhite, brooks(earlier version)
Submitted by: dwhite (manpage)
Silence from: -ipfw
----------------------------
revision 1.98
date: 2005/05/28 07:46:44; author: tanimura; state: Exp; lines: +5 -0
Let OSPFv3 go through ipfw. Some more additional checks would be desirable, though.
----------------------------
revision 1.97
date: 2005/05/04 13:12:52; author: glebius; state: Exp; lines: +0 -4
IPFW version 2 is the only option in HEAD and RELENG_5. Thus, cleanup unnecessary now ifdefs.
----------------------------
revision 1.96
date: 2005/04/26 18:10:21; author: brooks; state: Exp; lines: +5 -9
Introduce a struct icmphdr which contains the type, code, and cksum fields of an ICMP packet. Use this to allow ipfw to pullup only these values since it does not use the rest of the packet and it was failed on ICMP packets because they were not long enough.

struct icmp should probably be modified to use these at some point, but that will break a fair bit of code so it can wait for another day.

On the off chance that adding this struct breaks something in ports, bump __FreeBSD_version.
Reported by: Randy Bush
Tested by: Randy Bush
----------------------------
revision 1.95
date: 2005/04/19 10:04:38; author: phk; state: Exp; lines: +1 -1
typo
----------------------------
revision 1.94
date: 2005/04/19 09:56:14; author: phk; state: Exp; lines: +18 -3
Make IPFIREWALL compile without INET6
----------------------------
revision 1.93
date: 2005/04/18 18:35:05; author: brooks; state: Exp; lines: +333 -32
Add IPv6 support to IPFW and Dummynet.

Submitted by: Mariano Tortoriello and Raffaele De Lorenzo (via luigi)
----------------------------
revision 1.92
date: 2005/04/15 00:47:44; author: brooks; state: Exp; lines: +107 -90
Centralized finding the protocol header in IP packets in preperation for IPv6 support. The header in IPv6 is more complex then in IPv4 so we want to handle skipping over it in one location.

Submitted by: Mariano Tortoriello and Raffaele De Lorenzo (via luigi)
----------------------------
revision 1.91
date: 2005/03/01 12:01:17; author: glebius; state: Exp; lines: +1 -1
Use NET_CALLOUT_MPSAFE macro.
----------------------------
revision 1.90
date: 2005/02/06 11:13:59; author: glebius; state: Exp; lines: +5 -0
Jump to common action checks after doing specific once. This fixes adding of divert rules, which I break in previous commit.

Pointy hat to: glebius
----------------------------
revision 1.89
date: 2005/02/05 12:06:33; author: glebius; state: Exp; lines: +23 -0
Add a ng_ipfw node, implementing a quick and simple interface between ipfw(4) and netgraph(4) facilities.

Reviewed by: andre, brooks, julian
----------------------------
revision 1.88
date: 2005/01/31 00:48:39; author: csjp; state: Exp; lines: +7 -2
Change the state allocator from using regular malloc to using a UMA zone instead. This should eliminate a bit of the locking overhead associated with with malloc and reduce the memory consumption associated with each new state.
Reviewed by: rwatson, andre
Silence on: ipfw@
MFC after: 1 week
----------------------------
revision 1.87
date: 2005/01/14 09:00:46; author: glebius; state: Exp; lines: +23 -26
o Clean up interface between ip_fw_chk() and its callers:
  - ip_fw_chk() returns action as function return value. Field retval is removed from args structure. Action is not flag any more. It is one of integer constants.
  - Any action-specific cookies are returned either in new "cookie" field in args structure (dummynet, future netgraph glue), or in mbuf tag attached to packet (divert, tee, some future action).
o Convert parsing of return value from ip_fw_chk() in ipfw_check_{in,out}() to a switch structure, so that the functions are more readable, and a future actions can be added with less modifications.

Approved by: andre
MFC after: 2 months
----------------------------
revision 1.86
date: 2005/01/07 01:45:44; author: imp; state: Exp; lines: +1 -1
/* -> /*- for license, minor formatting changes
----------------------------
revision 1.85
date: 2004/12/10 02:17:18; author: csjp; state: Exp; lines: +69 -29
This commit adds a shared locking mechanism very similar to the mechanism used by pfil. This shared locking mechanism will remove a nasty lock order reversal which occurs when ucred based rules are used which results in hard locks while mpsafenet=1.

So this removes the debug.mpsafenet=0 requirement when using ucred based rules with IPFW.

It should be noted that this locking mechanism does not guarantee fairness between read and write locks, and that it will favor firewall chain readers over writers. This seemed acceptable since write operations to firewall chains protected by this lock tend to be less frequent than reads.

Reviewed by: andre, rwatson
Tested by: myself, seanc
Silence on: ipfw@
MFC after: 1 month
----------------------------
revision 1.84
date: 2004/11/02 22:22:21; author: andre; state: Exp; lines: +0 -5
Remove RFC1644 T/TCP support from the TCP side of the network stack.
A complete rationale and discussion is given in this message and the resulting discussion:

 http://docs.freebsd.org/cgi/mid.cgi?4177C8AD.6060706

Note that this commit removes only the functional part of T/TCP from the tcp_* related functions in the kernel. Other features introduced with RFC1644 are left intact (socket layer changes, sendmsg(2) on connection oriented protocols) and are meant to be reused by a simpler and less intrusive reimplemention of the previous T/TCP functionality.

Discussed on: -arch
----------------------------
revision 1.83
date: 2004/10/22 19:18:06; author: andre; state: Exp; lines: +1 -1
When printing the initialization string and IPDIVERT is not compiled into the kernel refer to it as "loadable" instead of "disabled".
----------------------------
revision 1.82
date: 2004/10/19 21:14:57; author: andre; state: Exp; lines: +2 -4
Convert IPDIVERT into a loadable module. This makes use of the dynamic loadability of protocols. The call to divert_packet() is done through a function pointer. All semantics of IPDIVERT remain intact.

If IPDIVERT is not loaded ipfw will refuse to install divert rules and natd will complain about 'protocol not supported'. Once it is loaded both will work and accept rules and open the divert socket. The module can only be unloaded if no divert sockets are open. It does not close any divert sockets when an unload is requested but will return EBUSY instead.
----------------------------
revision 1.81
date: 2004/10/03 00:47:15; author: green; state: Exp; lines: +23 -0
Add support to IPFW for matching by TCP data length.
----------------------------
revision 1.80
date: 2004/10/03 00:26:35; author: green; state: Exp; lines: +20 -1
Add support to IPFW for classification based on "diverted" status (that is, input via a divert socket).
----------------------------
revision 1.79
date: 2004/10/03 00:17:46; author: green; state: Exp; lines: +44 -0
Add to IPFW the ability to do ALTQ classification/tagging.
----------------------------
revision 1.78
date: 2004/09/30 17:42:00; author: green; state: Exp; lines: +5 -0
Validate the action pointer to be within the rule size, so that trying to add corrupt ipfw rules would not potentially panic the system or worse.
----------------------------
revision 1.77
date: 2004/09/29 04:54:33; author: mlaier; state: Exp; lines: +32 -11
Add an additional struct inpcb * argument to pfil(9) in order to enable passing along socket information. This is required to work around a LOR with the socket code which results in an easy reproducible hard lockup with debug.mpsafenet=1. This commit does *not* fix the LOR, but enables us to do so later. The missing piece is to turn the filter locking into a leaf lock and will follow in a seperate (later) commit.

This will hopefully be MT5'ed in order to fix the problem for RELENG_5 in forseeable future.

Suggested by: rwatson
A lot of work by: csjp (he'd be even more helpful w/o mentor-reviews ;)
Reviewed by: rwatson, csjp
Tested by: -pf, -ipfw, LINT, csjp and myself
MFC after: 3 days

LOR IDs: 14 - 17 (not fixed yet)
----------------------------
revision 1.76
date: 2004/09/13 19:27:23; author: andre; state: Exp; lines: +4 -0
Do not allow 'ipfw fwd' command when IPFIREWALL_FORWARD is not compiled into the kernel. Return EINVAL instead.
----------------------------
revision 1.75
date: 2004/09/05 20:06:50; author: glebius; state: Exp; lines: +5 -2
Recover normal behavior: return EINVAL to attempt to add a divert rule when module is built without IPDIVERT.

Silence from: andre
Approved by: julian (mentor)
----------------------------
revision 1.74
date: 2004/08/26 14:18:30; author: ru; state: Exp; lines: +2 -0
Revert the last change to sys/modules/ipfw/Makefile and fix a standalone module build in a better way.
Silence from: andre
MFC after: 3 days
----------------------------
revision 1.73
date: 2004/08/19 23:31:40; author: andre; state: Exp; lines: +1 -1
When unloading ipfw module use callout_drain() to make absolutely sure that all callouts are stopped and finished. Move it before IPFW_LOCK() to avoid deadlocking when draining callouts.
----------------------------
revision 1.72
date: 2004/08/19 17:59:26; author: andre; state: Exp; lines: +0 -2
Do not unconditionally ignore IPDIVERT and IPFIREWALL_FORWARD when building the ipfw KLD.

For IPFIREWALL_FORWARD this does not have any side effects. If the module has it but not the kernel it just doesn't do anything.

For IPDIVERT the KLD will be unloadable if the kernel doesn't have IPDIVERT compiled in too. However this is the least disturbing behaviour. The user can just recompile either module or the kernel to match the other one. The access to the machine is not denied if ipfw refuses to load.
----------------------------
revision 1.71
date: 2004/08/19 17:38:47; author: andre; state: Exp; lines: +3 -0
Bring back the sysctl 'net.inet.ip.fw.enable' to unbreak the startup scripts and to be able to disable ipfw if it was compiled directly into the kernel.
----------------------------
revision 1.70
date: 2004/08/17 22:05:54; author: andre; state: Exp; lines: +14 -51
branches: 1.70.2;
Convert ipfw to use PFIL_HOOKS.

This is change is transparent to userland and preserves the ipfw ABI. The ipfw core packet inspection and filtering functions have not been changed, only how ipfw is invoked is different.

However there are many changes how ipfw is and its add-on's are handled:

In general ipfw is now called through the PFIL_HOOKS and most associated magic, that was in ip_input() or ip_output() previously, is now done in ipfw_check_[in|out]() in the ipfw PFIL handler.

IPDIVERT is entirely handled within the ipfw PFIL handlers. A packet to be diverted is checked if it is fragmented, if yes, ip_reass() gets in for reassembly.
If not, or all fragments arrived and the packet is complete, divert_packet is called directly. For 'tee' no reassembly attempt is made and a copy of the packet is sent to the divert socket unmodified. The original packet continues its way through ip_input/output().

ipfw 'forward' is done via m_tag's. The ipfw PFIL handlers tag the packet with the new destination sockaddr_in. A check if the new destination is a local IP address is made and the m_flags are set appropriately. ip_input() and ip_output() have some more work to do here. For ip_input() the m_flags are checked and a packet for us is directly sent to the 'ours' section for further processing. Destination changes on the input path are only tagged and the 'srcrt' flag to ip_forward() is set to disable destination checks and ICMP replies at this stage. The tag is going to be handled on output. ip_output() again checks for m_flags and the 'ours' tag. If found, the packet will be dropped back to the IP netisr where it is going to be picked up by ip_input() again and the directly sent to the 'ours' section. When only the destination changes, the route's 'dst' is overwritten with the new destination from the forward m_tag. Then it jumps back at the route lookup again and skips the firewall check because it has been marked with M_SKIP_FIREWALL. ipfw 'forward' has to be compiled into the kernel with 'option IPFIREWALL_FORWARD' to enable it.

DUMMYNET is entirely handled within the ipfw PFIL handlers. A packet for a dummynet pipe or queue is directly sent to dummynet_io(). Dummynet will then inject it back into ip_input/ip_output() after it has served its time. Dummynet packets are tagged and will continue from the next rule when they hit the ipfw PFIL handlers again after re-injection.

BRIDGING and IPFW_ETHER are not changed yet and use ipfw_chk() directly as they did before. Later this will be changed to dedicated ETHER PFIL_HOOKS.

More detailed changes to the code:

conf/files
  Add netinet/ip_fw_pfil.c.
conf/options
  Add IPFIREWALL_FORWARD option.

modules/ipfw/Makefile
  Add ip_fw_pfil.c.

net/bridge.c
  Disable PFIL_HOOKS if ipfw for bridging is active. Bridging ipfw is still directly invoked to handle layer2 headers and packets would get a double ipfw when run through PFIL_HOOKS as well.

netinet/ip_divert.c
  Removed divert_clone() function. It is no longer used.

netinet/ip_dummynet.[ch]
  Neither the route 'ro' nor the destination 'dst' need to be stored while in dummynet transit. Structure members and associated macros are removed.

netinet/ip_fastfwd.c
  Removed all direct ipfw handling code and replace it with the new 'ipfw forward' handling code.

netinet/ip_fw.h
  Removed 'ro' and 'dst' from struct ip_fw_args.

netinet/ip_fw2.c
  (Re)moved some global variables and the module handling.

netinet/ip_fw_pfil.c
  New file containing the ipfw PFIL handlers and module initialization.

netinet/ip_input.c
  Removed all direct ipfw handling code and replace it with the new 'ipfw forward' handling code. ip_forward() does not longer require the 'next_hop' struct sockaddr_in argument. Disable early checks if 'srcrt' is set.

netinet/ip_output.c
  Removed all direct ipfw handling code and replace it with the new 'ipfw forward' handling code.

netinet/ip_var.h
  Add ip_reass() as general function. (Used from ipfw PFIL handlers for IPDIVERT.)

netinet/raw_ip.c
  Directly check if ipfw and dummynet control pointers are active.

netinet/tcp_input.c
  Rework the 'ipfw forward' to local code to work with the new way of forward tags.

netinet/tcp_sack.c
  Remove include 'opt_ipfw.h' which is not needed here.

sys/mbuf.h
  Remove m_claim_next() macro which was exclusively for ipfw 'forward' and is no longer needed.

Approved by: re (scottl)
----------------------------
revision 1.69
date: 2004/08/12 22:05:47; author: csjp; state: Exp; lines: +9 -1
Add the ability to associate ipfw rules with a specific prison ID.
Since the only thing truly unique about a prison is it's ID, I figured this would be the most granular way of handling this.

This commit makes the following changes:

- Adds tokenizing and parsing for the ``jail'' command line option to the ipfw(8) userspace utility.
- Append the ipfw opcode list with O_JAIL.
- While Iam here, add a comment informing others that if they want to add additional opcodes, they should append them to the end of the list to avoid ABI breakage.
- Add ``fw_prid'' to the ipfw ucred cache structure.
- When initializing ucred cache, if the process is jailed, set fw_prid to the prison ID, otherwise set it to -1.
- Update man page to reflect these changes.

This change was a strong motivator behind the ucred caching mechanism in ipfw.

A sample usage of this new functionality could be:

    ipfw add count ip from any to any jail 2

It should be noted that because ucred based constraints are only implemented for TCP and UDP packets, the same applies for jail associations.

Conceptual head nod by: pjd
Reviewed by: rwatson
Approved by: bmilekic (mentor)
----------------------------
revision 1.68
date: 2004/08/11 11:41:11; author: andre; state: Exp; lines: +4 -4
Only invoke verify_path() for verrevpath and versrcreach when we have an IP packet.
----------------------------
revision 1.67
date: 2004/08/09 16:12:10; author: andre; state: Exp; lines: +11 -0
New ipfw option "antispoof":

For incoming packets, the packet's source address is checked if it belongs to a directly connected network. If the network is directly connected, then the interface the packet came on in is compared to the interface the network is connected to. When incoming interface and directly connected interface are not the same, the packet does not match.
Usage example:

    ipfw add deny ip from any to any not antispoof in

Manpage education by: ru
----------------------------
revision 1.66
date: 2004/07/21 19:55:14; author: andre; state: Exp; lines: +6 -0
Extend versrcreach by checking against the rt_flags for RTF_REJECT and RTF_BLACKHOLE as well.

To quote the submitter:

The uRPF loose-check implementation by the industry vendors, at least on Cisco and possibly Juniper, will fail the check if the route of the source address is pointed to Null0 (on Juniper, discard or reject route). What this means is, even if uRPF Loose-check finds the route, if the route is pointed to blackhole, uRPF loose-check must fail. This allows people to utilize uRPF loose-check mode as a pseudo-packet-firewall without using any manual filtering configuration -- one can simply inject a IGP or BGP prefix with next-hop set to a static route that directs to null/discard facility. This results in uRPF Loose-check failing on all packets with source addresses that are within the range of the nullroute.

Submitted by: James Jun
----------------------------
revision 1.65
date: 2004/07/17 02:40:13; author: jmallett; state: Exp; lines: +0 -12
Make M_SKIP_FIREWALL a global (and semantic) flag, preventing anything from using M_PROTO6 and possibly shooting someone's foot, as well as allowing the firewall to be used in multiple passes, or with a packet classifier frontend, that may need to explicitly allow a certain packet. Presently this is handled in the ipfw_chk code as before, though I have run with it moved to upper layers, and possibly it should apply to ipfilter and pf as well, though this has not been investigated.

Discussed with: luigi, rwatson
----------------------------
revision 1.64
date: 2004/07/15 08:26:07; author: phk; state: Exp; lines: +1 -0
Do a pass over all modules in the kernel and make them return EOPNOTSUPP for unknown events.
A number of modules return EINVAL in this instance, and I have left those alone for now and instead taught MOD_QUIESCE to accept this as "didn't do anything".
----------------------------
revision 1.63
date: 2004/06/24 02:01:48; author: rwatson; state: Exp; lines: +4 -1
When asserting non-Giant locks in the network stack, also assert Giant if debug.mpsafenet=0, as any points that require synchronization in the SMPng world also required it in the Giant-world:

- inpcb locks (including IPv6)
- inpcbinfo locks (including IPv6)
- dummynet subsystem lock
- ipfw2 subsystem lock
----------------------------
revision 1.62
date: 2004/06/11 22:17:14; author: csjp; state: Exp; lines: +77 -30
Modify ip fw so that whenever UID or GID constraints exist in a ruleset, the pcb is looked up once per ipfw_chk() activation. This is done by extracting the required information out of the PCB and caching it to the ipfw_chk() stack. This should greatly reduce PCB looking contention and speed up the processing of UID/GID based firewall rules (especially with large UID/GID rulesets).

Some very basic benchmarks were taken which compares the number of in_pcblookup_hash(9) activations to the number of firewall rules containing UID/GID based contraints before and after this patch.

The results can be viewed here:

o http://people.freebsd.org/~csjp/ip_fw_pcb.png

Reviewed by: andre, luigi, rwatson
Approved by: bmilekic (mentor)
----------------------------
revision 1.61
date: 2004/06/10 20:20:37; author: ru; state: Exp; lines: +4 -1
init_tables() must be run after sys/net/route.c:route_init().
----------------------------
revision 1.60
date: 2004/06/09 20:10:38; author: ru; state: Exp; lines: +324 -1
Introduce a new feature to IPFW2: lookup tables. These are useful for handling large sparse address sets. Initial implementation by Vsevolod Lobko , refined by me.
MFC after: 1 week
----------------------------
revision 1.59
date: 2004/05/30 17:57:45; author: phk; state: Exp; lines: +1 -0
Add some missing includes which are masked by the one on death-row in
----------------------------
revision 1.58
date: 2004/05/25 15:02:12; author: csjp; state: Exp; lines: +4 -0
Add a super-user check to ipfw_ctl() to make sure that the calling process is a non-prison root. The security.jail.allow_raw_sockets sysctl variable is disabled by default, however if the user enables raw sockets in prisons, prison-root should not be able to interact with firewall rule sets.

Approved by: rwatson, bmilekic (mentor)
----------------------------
revision 1.57
date: 2004/04/23 14:27:27; author: andre; state: Exp; lines: +31 -7
Add the option versrcreach to verify that a valid route to the source address of a packet exists in the routing table. The default route is ignored because it would match everything and render the check pointless.

This option is very useful for routers with a complete view of the Internet (BGP) in the routing table to reject packets with spoofed or unrouteable source addresses.

Example:

    ipfw add 1000 deny ip from any to any not versrcreach

also known in Cisco-speak as:

    ip verify unicast source reachable-via any

Reviewed by: luigi
----------------------------
revision 1.56
date: 2004/02/25 19:55:28; author: mlaier; state: Exp; lines: +25 -5
Re-remove MT_TAGs. The problems with dummynet have been fixed now.

Tested by: -current, bms(mentor), me
Approved by: bms(mentor), sam
----------------------------
revision 1.55
date: 2004/02/18 00:04:51; author: mlaier; state: Exp; lines: +5 -25
Backout MT_TAG removal (i.e. bring back MT_TAGs) for now, as dummynet is not working properly with the patch in place.
Approved by: bms(mentor) ---------------------------- revision 1.54 date: 2004/02/13 19:14:15; author: mlaier; state: Exp; lines: +25 -5 This set of changes eliminates the use of MT_TAG "pseudo mbufs", replacing them mostly with packet tags (one case is handled by using an mbuf flag since the linkage between "caller" and "callee" is direct and there's no need to incur the overhead of a packet tag). This is (mostly) work from: sam Silence from: -arch Approved by: bms(mentor), sam, rwatson ---------------------------- revision 1.53 date: 2003/12/24 18:22:04; author: ume; state: Exp; lines: +1 -1 NULL is not 0. Submitted by: "Bjoern A. Zeeb" ---------------------------- revision 1.52 date: 2003/12/16 18:21:47; author: maxim; state: Exp; lines: +1 -1 o IN_MULTICAST wants an address in host byte order. PR: kern/60304 Submitted by: demon MFC after: 1 week ---------------------------- revision 1.51 date: 2003/12/02 00:23:45; author: sam; state: Exp; lines: +1 -0 branches: 1.51.2; Include opt_ipsec.h so IPSEC/FAST_IPSEC is defined and the appropriate code is compiled in to support the O_IPSEC operator. Previously no support was included and ipsec rules were always matching. Note that we do not return an error when an ipsec rule is added and the kernel does not have IPsec support compiled in; this is done intentionally but we may want to revisit this (document this in the man page). PR: 58899 Submitted by: Bjoern A. Zeeb Approved by: re (rwatson) ---------------------------- revision 1.50 date: 2003/11/27 09:40:13; author: andre; state: Exp; lines: +7 -13 Fix verify_rev_path() function. The author of this function tried to cut corners which completely broke down when the routing table locking was introduced. 
Reviewed by: sam (mentor) Approved by: re (rwatson) ---------------------------- revision 1.49 date: 2003/11/24 03:57:03; author: sam; state: Exp; lines: +9 -5 Correct a problem where ipfw-generated packets were being returned for ipfw processing w/o an indication the packets were generated by ipfw--and so should not be processed (this manifested itself as a LOR.) The flag bit in the mbuf that was used to mark the packets was not listed in M_COPYFLAGS so if a packet had a header prepended (as done by IPsec) the flag was lost. Correct this by defining a new M_PROTO6 flag and use it to mark packets that need this processing. Reviewed by: bms Approved by: re (rwatson) MFC after: 2 weeks ---------------------------- revision 1.48 date: 2003/11/23 18:13:41; author: sam; state: Exp; lines: +1 -1 Use MPSAFE callouts only when debug.mpsafenet is 1. Both timer routines potentially transmit packets that may enter KAME IPsec w/o Giant if the callouts are marked MPSAFE. Reviewed by: ume Approved by: re (rwatson) ---------------------------- revision 1.47 date: 2003/11/20 20:07:37; author: andre; state: Exp; lines: +6 -3 Introduce tcp_hostcache and remove the tcp specific metrics from the routing table. Move all usage and references in the tcp stack from the routing table metrics to the tcp hostcache. It caches measured parameters of past tcp sessions to provide better initial start values for following connections from or to the same source or destination. Depending on the network parameters to/from the remote host this can lead to significant speedups for new tcp connections after the first one because they inherit and shortcut the learning curve. tcp_hostcache is designed for multiple concurrent access in SMP environments with high contention and is hash indexed by remote ip address. It removes significant locking requirements from the tcp stack with regard to the routing table. 
Reviewed by: sam (mentor), bms Reviewed by: -net, -current, core@kame.net (IPv6 parts) Approved by: re (scottl) ---------------------------- revision 1.46 date: 2003/11/20 19:47:30; author: andre; state: Exp; lines: +1 -1 Remove RTF_PRCLONING from routing table and adjust users of it accordingly. The define is left intact for ABI compatibility with userland. This is a pre-step for the introduction of tcp_hostcache. The network stack remains fully useable with this change. Reviewed by: sam (mentor), bms Reviewed by: -net, -current, core@kame.net (IPv6 parts) Approved by: re (scottl) ---------------------------- revision 1.45 date: 2003/11/20 10:28:33; author: maxim; state: Exp; lines: +2 -2 Fix an arguments order in check_uidgid() call. PR: kern/59314 Submitted by: Andrey V. Shytov Approved by: re (rwatson, jhb) ---------------------------- revision 1.44 date: 2003/11/14 21:48:56; author: andre; state: Exp; lines: +1 -6 Remove the global one-level rtcache variable and associated complex locking and rework ip_rtaddr() to do its own rtlookup. Adopt all its callers to this and make ip_output() callable with NULL rt pointer. Reviewed by: sam (mentor) ---------------------------- revision 1.43 date: 2003/11/07 23:26:57; author: sam; state: Exp; lines: +60 -40 Move uid/gid checking logic out of line and lock inpcb usage. This has a LOR between IPFW inpcb locks but I'm committing it now as the lesser of two evils (the other being unlocked use of in_pcblookup). Supported by: FreeBSD Foundation ---------------------------- revision 1.42 date: 2003/11/07 20:25:47; author: ume; state: Exp; lines: +1 -1 use ipsec_getnhist() instead of obsoleted ipsec_gethist(). Submitted by: "Bjoern A. Zeeb" Reviewed by: Ari Suutari (ipfw@) ---------------------------- revision 1.41 date: 2003/10/31 18:32:11; author: brooks; state: Exp; lines: +9 -8 Replace the if_name and if_unit members of struct ifnet with new members if_xname, if_dname, and if_dunit. 
if_xname is the name of the interface and if_dname/unit are the driver name and instance. This change paves the way for interface renaming and enhanced pseudo device creation and configuration semantics. Approved By: re (in principle) Reviewed By: njl, imp Tested On: i386, amd64, sparc64 Obtained From: NetBSD (if_xname) ---------------------------- revision 1.40 date: 2003/10/16 02:00:12; author: mckusick; state: Exp; lines: +7 -4 Malloc buckets of size 128 have been having their 64-byte offset trashed after being freed. This has caused several panics including kern/42277 related to soft updates. Jim Kuhn tracked the problem down to ipfw limit rule processing. In the expiry of dynamic rules, it is possible for an O_LIMIT_PARENT rule to be removed when it still has live children. When the children eventually do expire, a pointer to the (long gone) parent is dereferenced and a count decremented. Since this memory can, and is, allocated for other purposes (in the case of kern/42277 an inodedep structure), chaos ensues. The offset in question in inodedep is the offset of the 16 bit count field in the ipfw2 ipfw_dyn_rule. Submitted by: Jim Kuhn Reviewed by: "Evgueni V. Gavrilov" Reviewed by: Ben Pfountz MFC after: 1 week ---------------------------- revision 1.39 date: 2003/09/17 22:06:47; author: sam; state: Exp; lines: +2 -1 Bandaid locking change: mark static rule mutex recursive so re-entry when sending an ICMP packet doesn't cause a panic. A better solution is needed; possibly deferring the transmit to a dedicated thread. Observed by: "Aaron Wohl" ---------------------------- revision 1.38 date: 2003/09/17 00:56:50; author: sam; state: Exp; lines: +309 -164 Add locking.
o change timeout to MPSAFE callout o restructure rule deletion to deal with locking requirements o replace static buffer used for ipfw control operations with malloc'd storage Sponsored by: FreeBSD Foundation ---------------------------- revision 1.37 date: 2003/07/15 23:07:34; author: luigi; state: Exp; lines: +27 -23 Allow set 31 to be used for rules other than 65535. Set 31 is still special because rules belonging to it are not deleted by the "ipfw flush" command, but must be deleted explicitly with "ipfw delete set 31" or by individual rule numbers. This implements a flexible form of "persistent rules" which you might want to have available even after an "ipfw flush". Note that this change does not violate POLA, because you could not use set 31 in a ruleset before this change. sbin/ipfw changes to allow manipulation of set 31 will follow shortly. Suggested by: Paul Richards ---------------------------- revision 1.36 date: 2003/07/12 05:54:17; author: luigi; state: Exp; lines: +1 -1 Implement comments embedded into ipfw2 instructions. Since we already had 'O_NOP' instructions which always match, all I needed to do is allow the NOP command to have arbitrary length (i.e. move its label in a different part of the switch() which validates instructions). The kernel must know nothing about comments, everything else is done in userland (which will be described in the upcoming ipfw2.c commit). ---------------------------- revision 1.35 date: 2003/07/08 07:44:42; author: luigi; state: Exp; lines: +13 -17 Merge the handlers of O_IP_SRC_MASK and O_IP_DST_MASK opcodes, and support matching a list of addr/mask pairs so one can write more efficient rulesets which were not possible before e.g. add 100 skipto 1000 not src-ip 10.0.0.0/8,127.0.0.1/8,192.168.0.0/16 The change is fully backward compatible. ipfw2 and manpage commit to follow.
MFC after: 3 days ---------------------------- revision 1.34 date: 2003/07/04 21:42:32; author: luigi; state: Exp; lines: +16 -0 Implement the 'ipsec' option to match packets coming out of an ipsec tunnel. Should work with both regular and fast ipsec (mutually exclusive). See manpage for more details. Submitted by: Ari Suutari (ari.suutari@syncrontech.com) Revised by: sam MFC after: 1 week ---------------------------- revision 1.33 date: 2003/06/28 14:16:53; author: luigi; state: Exp; lines: +1 -1 whitespace fix ---------------------------- revision 1.32 date: 2003/06/23 21:18:56; author: luigi; state: Exp; lines: +4 -4 Remove whitespace at end of line. ---------------------------- revision 1.31 date: 2003/06/22 17:33:19; author: luigi; state: Exp; lines: +29 -12 Add support for multiple values and ranges for the "iplen", "ipttl", "ipid" options. This feature has been requested by several users. On passing, fix some minor bugs in the parser. This change is fully backward compatible so if you have an old /sbin/ipfw and a new kernel you are not in trouble (but you need to update /sbin/ipfw if you want to use the new features). Document the changes in the manpage. Now you can write things like ipfw add skipto 1000 iplen 0-500 which some people were asking to give preferential treatment to short packets. The 'MFC after' is just set as a reminder, because I still need to merge the Alpha/Sparc64 fixes for ipfw2 (which unfortunately change the size of certain kernel structures; not that it matters a lot since ipfw2 is entirely optional and not the default...) PR: bin/48015 MFC after: 1 week ---------------------------- revision 1.30 date: 2003/06/04 01:17:37; author: ticso; state: Exp; lines: +15 -6 Change handling to support strong alignment architectures such as alpha and sparc64. 
PR: alpha/50658 Submitted by: rizzo Tested on: alpha ---------------------------- revision 1.29 date: 2003/06/02 23:54:09; author: kbyanc; state: Exp; lines: +6 -3 Account for packets processed at layer-2 (i.e. net.link.ether.ipfw=1). MFC after: 2 weeks ---------------------------- revision 1.28 date: 2003/03/15 01:13:00; author: cjc; state: Exp; lines: +50 -0 branches: 1.28.2; Add a 'verrevpath' option that verifies the interface that a packet comes in on is the same interface that we would route out of to get to the packet's source address. Essentially automates an anti-spoofing check using the information in the routing table. Experimental. The usage and rule format for the feature may still be subject to change. ---------------------------- revision 1.27 date: 2003/02/19 05:47:34; author: imp; state: Exp; lines: +2 -2 Back out M_* changes, per decision of the TRB. Approved by: trb ---------------------------- revision 1.26 date: 2003/02/17 13:39:57; author: maxim; state: Exp; lines: +2 -2 o Fix ipfw uid rules: socheckuid() returns 0 when uid matches a socket cr_uid. Note: we do not have socheckuid() in RELENG_4, ip_fw2.c uses its own macro for a similar purpose that is why ipfw2 in RELENG_4 processes uid rules correctly. I will MFC the diff for code consistency. Reported by: Oleg Baranov Reviewed by: luigi MFC after: 1 month ---------------------------- revision 1.25 date: 2003/01/21 08:56:03; author: alfred; state: Exp; lines: +2 -2 Remove M_TRYWAIT/M_WAITOK/M_WAIT. Callers should use 0. Merge M_NOWAIT/M_DONTWAIT into a single flag M_NOWAIT. ---------------------------- revision 1.24 date: 2003/01/20 11:58:34; author: maxim; state: Exp; lines: +2 -0 If the first action is O_LOG adjust a pointer to the real one, unbreaks skipto + log rules. 
Reported by: Wiktor Niesiobedzki MFC after: 1 week ---------------------------- revision 1.23 date: 2003/01/14 19:35:33; author: dillon; state: Exp; lines: +3 -3 Introduce the ability to flag a sysctl for operation at secure level 2 or 3 in addition to secure level 1. The mask supports up to a secure level of 8 but only add defines through CTLFLAG_SECURE3 for now. As per the missive in the log entry for 1.11 of ip_fw2.c which added the secure flag to the IPFW sysctl's in the first place, change the secure level requirement from 1 to 3 now that we have support for it. Reviewed by: imp With Design Suggestions by: imp ---------------------------- revision 1.22 date: 2002/12/27 17:43:25; author: iedowse; state: Exp; lines: +8 -2 Bridged packets are supplied to the firewall with their IP header in network byte order, but icmp_error() expects the IP header to be in host order and the code here did not perform the necessary swapping for the bridged case. This bug causes an "icmp_error: bad length" panic when certain length IP packets (e.g. ip_len == 0x100) are rejected by the firewall with an ICMP response. MFC after: 3 days ---------------------------- revision 1.21 date: 2002/12/24 13:45:23; author: maxim; state: Exp; lines: +16 -15 o De-anonymize dummynet(4) and ipfw(4) messages: prepend 'dummynet: ' and 'ipfw: ' prefixes to them. PR: kern/41609 ---------------------------- revision 1.20 date: 2002/12/15 09:44:02; author: maxim; state: Exp; lines: +1 -1 o Fix byte order logging issue: sa.sin_port is already in host byte order. PR: kern/45964 Submitted by: Sascha Blank Reviewed by: luigi MFC after: 1 week ---------------------------- revision 1.19 date: 2002/11/20 19:07:27; author: luigi; state: Exp; lines: +0 -1 branches: 1.19.2; Move fw_one_pass from ip_fw2.c to ip_input.c so that neither bridge.c nor if_ethersubr.c depend on IPFIREWALL. Restore the use of fw_one_pass in if_ethersubr.c ipfw.8 will be updated with a separate commit.
Approved by: re ---------------------------- revision 1.18 date: 2002/10/29 08:53:14; author: maxim; state: Exp; lines: +1 -1 Lower a priority of "session drop" messages. Requested by: Eugene Grosbein MFC after: 3 days ---------------------------- revision 1.17 date: 2002/10/24 18:04:44; author: mux; state: Exp; lines: +2 -6 Fix ipfw2 panics on 64-bit platforms. Quoting luigi: In order to make the userland code fully 64-bit clean it may be necessary to commit other changes that may or may not cause a minor change in the ABI. Reviewed by: luigi ---------------------------- revision 1.16 date: 2002/10/24 18:01:53; author: luigi; state: Exp; lines: +2 -2 src and dst address were erroneously swapped in SRC_SET and DST_SET commands. Use the correct one. Also affects ipfw2 in -stable. ---------------------------- revision 1.15 date: 2002/10/23 10:07:55; author: maxim; state: Exp; lines: +23 -24 Kill EOL spaces. Approved by: luigi MFC after: 1 week ---------------------------- revision 1.14 date: 2002/10/23 10:05:19; author: maxim; state: Exp; lines: +1 -1 Use syslog for messages about dropped sessions, do not flood a console. Suggested by: Eugene Grosbein Approved by: luigi MFC after: 1 week ---------------------------- revision 1.13 date: 2002/10/19 11:31:50; author: mux; state: Exp; lines: +3 -3 Several malloc() calls were passing the M_DONTWAIT flag which is an mbuf allocation flag. Use the correct M_NOWAIT malloc() flag. Fortunately, both were defined to 1, so this commit is a no-op. ---------------------------- revision 1.12 date: 2002/10/16 01:54:44; author: sam; state: Exp; lines: +1 -1 Replace aux mbufs with packet tags: o instead of a list of mbufs use a list of m_tag structures a la openbsd o for netgraph et. al. 
extend the stock openbsd m_tag to include a 32-bit ABI/module number cookie o for openbsd compatibility define a well-known cookie MTAG_ABI_COMPAT and use this in defining openbsd-compatible m_tag_find and m_tag_get routines o rewrite KAME use of aux mbufs in terms of packet tags o eliminate the most heavily used aux mbufs by adding an additional struct inpcb parameter to ip_output and ip6_output to allow the IPsec code to locate the security policy to apply to outbound packets o bump __FreeBSD_version so code can be conditionalized o fixup ipfilter's call to ip_output based on __FreeBSD_version Reviewed by: julian, luigi (silent), -arch, -net, darren Approved by: julian, silence from everyone else Obtained from: openbsd (mostly) MFC after: 1 month ---------------------------- revision 1.11 date: 2002/08/25 03:50:17; author: cjc; state: Exp; lines: +6 -3 Lock the sysctl(8) knobs that turn ip{,6}fw(8) firewalling and firewall logging on and off when at elevated securelevel(8). It would be nice to be able to only lock these at securelevel >= 3, like rules are, but there is no such functionality at present. I don't see reason to be adding features to securelevel(8) with MAC being merged into 5.0. PR: kern/39396 Reviewed by: luigi MFC after: 1 week ---------------------------- revision 1.10 date: 2002/08/19 04:45:01; author: luigi; state: Exp; lines: +4 -6 Raise limit for port lists to 30 entries/ranges. Remove a duplicate "logging" message, and identify the firewall as ipfw2 in the boot message. ---------------------------- revision 1.9 date: 2002/08/16 10:31:47; author: luigi; state: Exp; lines: +116 -47 sys/netinet/ip_fw2.c: Implement the M_SKIP_FIREWALL bit in m_flags to avoid loops for firewall-generated packets (the constant has to go in sys/mbuf.h). Better comments on keepalive generation, and enforce dyn_rst_lifetime and dyn_fin_lifetime to be less than dyn_keepalive_period. 
Enforce limits (up to 64k) on the number of dynamic buckets, and retry allocation with smaller sizes. Raise default number of dynamic rules to 4096. Improved handling of set of rules -- now you can atomically enable/disable multiple sets, move rules from one set to another, and swap sets. sbin/ipfw/ipfw2.c: userland support for "noerror" pipe attribute. userland support for sets of rules. minor improvements on rule parsing and printing. sbin/ipfw/ipfw.8: more documentation on ipfw2 extensions, differences from ipfw1 (so we can use the same manpage for both), stateful rules, and some additional examples. Feedback and more examples needed here. ---------------------------- revision 1.8 date: 2002/08/13 19:13:23; author: phk; state: Exp; lines: +1 -1 remove spurious printf ---------------------------- revision 1.7 date: 2002/08/10 04:37:32; author: luigi; state: Exp; lines: +95 -18 One bugfix and one new feature. The bugfix (ipfw2.c) makes the handling of port numbers with a dash in the name, e.g. ftp-data, consistent with old ipfw: use \\ before the - to consider it as part of the name and not a range separator. The new feature (all this description will go in the manpage): each rule now belongs to one of 32 different sets, which can be optionally specified in the following form: ipfw add 100 set 23 allow ip from any to any If "set N" is not specified, the rule belongs to set 0. Individual sets can be disabled, enabled, and deleted with the commands: ipfw disable set N ipfw enable set N ipfw delete set N Enabling/disabling of a set is atomic. Rules belonging to a disabled set are skipped during packet matching, and they are not listed unless you use the '-S' flag in the show/list commands. Note that dynamic rules, once created, are always active until they expire or their parent rule is deleted. Set 31 is reserved for the default rule and cannot be disabled. All sets are enabled by default. 
The enable/disable status of the sets can be shown with the command ipfw show sets Hopefully, this feature will make life easier to those who want to have atomic ruleset addition/deletion/tests. Examples: To add a set of rules atomically: ipfw disable set 18 ipfw add ... set 18 ... # repeat as needed ipfw enable set 18 To delete a set of rules atomically: ipfw disable set 18 ipfw delete set 18 ipfw enable set 18 To test a ruleset and disable it and regain control if something goes wrong: ipfw disable set 18 ipfw add ... set 18 ... # repeat as needed ipfw enable set 18 ; echo "done "; sleep 30 && ipfw disable set 18 here if everything goes well, you press control-C before the "sleep" terminates, and your ruleset will be left active. Otherwise, e.g. if you cannot access your box, the ruleset will be disabled after the sleep terminates. I think there is only one more thing that one might want, namely a command to assign all rules in set X to set Y, so one can test a ruleset using the above mechanisms, and once it is considered acceptable, make it part of an existing ruleset. ---------------------------- revision 1.6 date: 2002/07/24 02:41:19; author: luigi; state: Exp; lines: +2 -1 branches: 1.6.2; Only log things if net.inet.ip.fw.verbose is set ---------------------------- revision 1.5 date: 2002/07/14 23:47:18; author: luigi; state: Exp; lines: +145 -104 Implement keepalives for dynamic rules, so they will not expire just because you leave your session idle. Also, put in a fix for 64-bit architectures (to be revised). In detail: ip_fw.h * Reorder fields in struct ip_fw to avoid alignment problems on 64-bit machines. This only masks the problem, I am still not sure whether I am doing something wrong in the code or there is a problem elsewhere (e.g. different alignment of structures between userland and kernel because of pragmas etc.)
* added fields in dyn_rule to store ack numbers, so we can generate keepalives when the dynamic rule is about to expire ip_fw2.c * use a local function, send_pkt(), to generate TCP RST for Reset rules; * save about 250 bytes by cleaning up the various snprintf() in ipfw_log() ... * ... and use twice as many bytes to implement keepalives (this seems to be working, but i have not tested it extensively). Keepalives are generated once every 5 seconds for the last 20 seconds of the lifetime of a dynamic rule for an established TCP flow. The packets are sent to both sides, so if at least one of the endpoints is responding, the timeout is refreshed and the rule will not expire. You can disable this feature with sysctl net.inet.ip.fw.dyn_keepalive=0 (the default is 1, to have them enabled). MFC after: 1 day (just kidding... I will supply an updated version of ipfw2 for RELENG_4 tomorrow). ---------------------------- revision 1.4 date: 2002/07/08 22:46:01; author: luigi; state: Exp; lines: +266 -278 No functional changes, but: Following Darren's suggestion, make Dijkstra happy and rewrite the ipfw_chk() main loop removing a lot of goto's and using instead a variable to store match status. Add a lot of comments to explain what instructions are supposed to do and how -- this should ease auditing of the code and make people more confident with it. In terms of code size: the entire file takes about 12700 bytes of text, about 3K of which are for the main function, ipfw_chk(), and 2K (ouch!) for ipfw_log(). ---------------------------- revision 1.3 date: 2002/07/05 22:43:06; author: luigi; state: Exp; lines: +110 -36 Implement the last 2-3 missing instructions for ipfw, now it should support all the instructions of the old ipfw. Fix some bugs in the user interface, /sbin/ipfw. Please check this code against your rulesets, so i can fix the remaining bugs (if any, i think they will be mostly in /sbin/ipfw). 
Once we have done a bit of testing, this code is ready to be MFC'ed, together with a bunch of other changes (glue to ipfw, and also the removal of some global variables) which have been in -current for a couple of weeks now. MFC after: 7 days ---------------------------- revision 1.2 date: 2002/06/28 08:36:26; author: dfr; state: Exp; lines: +1 -1 Fix warning. Reviewed by: luigi ---------------------------- revision 1.1 date: 2002/06/27 23:02:17; author: luigi; state: Exp; The new ipfw code. This code makes use of variable-size kernel representation of rules (exactly the same concept of BPF instructions, as used in the BSDI's firewall), which makes firewall operation a lot faster, and the code more readable and easier to extend and debug. The interface with the rest of the system is unchanged, as witnessed by this commit. The only extra kernel files that I am touching are if_fw.h and ip_dummynet.c, which is quite tied to ipfw. In userland I only had to touch those programs which manipulate the internal representation of firewall rules. The code is almost entirely new (and I believe I have written the vast majority of those sections which were taken from the former ip_fw.c), so rather than modifying the old ip_fw.c I decided to create a new file, sys/netinet/ip_fw2.c . Same for the user interface, which is in sbin/ipfw/ipfw2.c (it still compiles to /sbin/ipfw). The old files are still there, and will be removed in due time. I have not renamed the header file because it would have required making a one-line change to a number of kernel files. In terms of user interface, the new "ipfw" is supposed to accept the old syntax for ipfw rules (and produce the same output with "ipfw show"). Only a couple of the old options (out of some 30 of them) have not been implemented, but they will be soon. On the other hand, the new code has some very powerful extensions.
First, you can put "or" connectives between match fields (and soon also between options), and write things like ipfw add allow ip from { 1.2.3.4/27 or 5.6.7.8/30 } 10-23,25,1024-3000 to any This should make rulesets slightly more compact (and lines longer!), by condensing 2 or more of the old rules into single ones. Also, as an example of how easy the rules can be extended, I have implemented an 'address set' match pattern, where you can specify an IP address in a format like this: 10.20.30.0/26{18,44,33,22,9} which will match the set of hosts listed in braces belonging to the subnet 10.20.30.0/26 . The match is done using a bitmap, so it is essentially a constant time operation requiring a handful of CPU instructions (and a very small amount of memory -- for a full /24 subnet, the instruction only consumes 40 bytes). Again, in this commit I have focused on functionality and tried to minimize changes to the other parts of the system. Some performance improvement can be achieved with minor changes to the interface of ip_fw_chk_t. This will be done later when this code is settled. The code is meant to compile unmodified on RELENG_4 (once the PACKET_TAG_* changes have been merged), for this reason you will see #ifdef __FreeBSD_version in a couple of places. This should minimize errors when (hopefully soon) it will be time to do the MFC. ---------------------------- revision 1.6.2.23 date: 2004/09/13 07:21:17; author: maxim; state: Exp; lines: +5 -2 MFC rev.1.75: recover normal behavior: return EINVAL to attempt to add a divert rule when module is built without IPDIVERT. ---------------------------- revision 1.6.2.22 date: 2004/06/16 06:57:49; author: ru; state: Exp; lines: +335 -0 MFC: IPFW2 lookup tables. ---------------------------- revision 1.6.2.21 date: 2004/04/02 17:15:44; author: andre; state: Exp; lines: +13 -16 MFC of rev 1.50, fix refcount leak in verify_rev_path() function.
Reviewed by: sam ---------------------------- revision 1.6.2.20 date: 2004/01/22 08:50:26; author: maxim; state: Exp; lines: +1 -0 MFC rev. 1.51: include opt_ipsec.h so IPSEC/FAST_IPSEC is defined and the appropriate code is compiled in to support the O_IPSEC operator. ---------------------------- revision 1.6.2.19 date: 2003/12/23 12:23:57; author: maxim; state: Exp; lines: +1 -1 MFC rev. 1.52: IN_MULTICAST wants an address in host byte order. ---------------------------- revision 1.6.2.18 date: 2003/10/17 11:01:03; author: scottl; state: Exp; lines: +7 -4 MFC Rev 1.40 to fix use-after-free problems with dynamic rules. Submitted by: mckusick Approved by: re (murray) ---------------------------- revision 1.6.2.17 date: 2003/10/01 07:07:07; author: bms; state: Exp; lines: +3 -3 MFC: The knobs controlling ipfw2 and ip6fw were not protected when running at an elevated securelevel. Fix this behaviour and function as documented. PR: kern/39396 Approved by: re (rwatson) ---------------------------- revision 1.6.2.16 date: 2003/07/17 06:03:39; author: luigi; state: Exp; lines: +66 -45 MFC: sync ipfw2 with the version in -current, including: * implement a '-n' option to do a syntax-check only of ipfw2 rules; * allow spaces after commas in ipfw rules; * support for comma-separated address lists e.g. ipfw add allow ip from not 10.0.0.0/8, 192.168.0.0/16, 1.2.3.4 to me (note the possibility to put a 'not' in front of the entire list, which was not possible with "or blocks"); * allow comments in ipfw rules which are stored together with rules and appear upon an 'ipfw show': ipfw add allow udp from any to any 53 // nameserver * allow set 31 to be used for ordinary (non-default) rules, but with the special feature that rules in set 31 cannot be disabled and are not affected by a 'flush' command (so they must be deleted explicitly). This permits a flexible form of "persistent" rules which should survive across firewall reloads.
* allow ranges to be specified in the "ipfw show" and "ipfw list" commands (the same ought to be done for "ipfw delete"): ipfw show 100-1000 2000 3000-5500 I believe the kernel side of these changes is entirely backward compatible with the old /sbin/ipfw[2], though of course you need to update the userland command to use the new features. ---------------------------- revision 1.6.2.15 date: 2003/06/28 16:12:13; author: luigi; state: Exp; lines: +94 -23 MFC: sync ipfw2 (kernel, userland, manpage) with the version in -current. Among other things, this includes the following: + pass to the preprocessor all command-line options after -p (except the last one, the ruleset file) + add the "verrevpath" option + support strong alignment architectures such as alpha and sparc64; + support multiple values and ranges for "iplen", "ipttl", "ipid" options. + support range notations such as 1.2.3.4/24{5,6,7,10-20,60-90} for sets of IP addresses The changes (also those in sys/netinet/ip_dummynet.c) are all IPFW2-specific, which is entirely optional in RELENG_4 so there are no ABI issues for those using the standard ipfw[1]. Note, however, that ipfw2 users MUST REBUILD /sbin/ipfw together with the new kernel. ---------------------------- revision 1.6.2.14 date: 2003/06/23 22:00:21; author: luigi; state: Exp; lines: +16 -15 MFC: add subsystem names in front of the messages generated by ipfw[2] and dummynet. There ought to be a better way to do this, something similar to device_printf() which automatically generates the subsystem names! ---------------------------- revision 1.6.2.13 date: 2003/06/18 02:55:08; author: kbyanc; state: Exp; lines: +6 -3 MFC revision 1.29: Account for packets processed at layer-2 (i.e. net.link.ether.ipfw=1). ---------------------------- revision 1.6.2.12 date: 2003/04/08 10:42:32; author: maxim; state: Exp; lines: +2 -2 MFC rev. 1.26: make socheckuid() consistent across branches.
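[Editorial illustration, not part of the CVS log.] The address-set notation mentioned under revision 1.6.2.15 (for example 1.2.3.4/24{5,6,7,10-20,60-90}) is described in the revision 1.1 message as a bitmap match that runs in essentially constant time. A minimal sketch of that idea follows. The struct layout and the names addr_set, addr_set_add and addr_set_match are assumptions made for illustration only; they are not the kernel's actual instruction format or API.

```c
#include <stdint.h>

/*
 * Hypothetical layout for illustration; NOT the real ip_fw2.c instruction.
 * One bit per host inside the subnet; 8 words cover up to 256 hosts (/24).
 */
struct addr_set {
	uint32_t base;     /* subnet base address, host byte order */
	uint32_t nhosts;   /* host slots in the subnet (must be <= 256) */
	uint32_t bits[8];  /* membership bitmap */
};

/* Mark host number 'host' (offset from base) as a member of the set. */
static void
addr_set_add(struct addr_set *s, uint32_t host)
{
	if (host < s->nhosts)
		s->bits[host / 32] |= (uint32_t)1 << (host % 32);
}

/* Return 1 if 'addr' (host byte order) is in the set, 0 otherwise. */
static int
addr_set_match(const struct addr_set *s, uint32_t addr)
{
	/* Unsigned subtraction: addresses below base wrap to a huge offset. */
	uint32_t off = addr - s->base;

	if (off >= s->nhosts)
		return (0);
	return ((int)((s->bits[off / 32] >> (off % 32)) & 1));
}
```

With base 10.20.30.0, nhosts 64 and hosts 18, 44, 33, 22 and 9 added, this mirrors the 10.20.30.0/26{18,44,33,22,9} example: the lookup costs one subtraction, one bounds check and one bit test, whatever the number of listed hosts. The sketch's struct is 40 bytes, in line with the figure quoted for a full /24 in revision 1.1, though the kernel's real layout may differ.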
---------------------------- revision 1.6.2.11 date: 2003/01/27 13:45:20; author: maxim; state: Exp; lines: +2 -0 MFC rev. 1.24: if the first action is O_LOG adjust a pointer to the real one, unbreaks skipto + log rules. ---------------------------- revision 1.6.2.10 date: 2003/01/23 21:06:45; author: sam; state: Exp; lines: +1 -1 MFC: m_tag support Replace aux mbufs with packet tags: o instead of a list of mbufs use a list of m_tag structures a la openbsd o for netgraph et. al. extend the stock openbsd m_tag to include a 32-bit ABI/module number cookie o for openbsd compatibility define a well-known cookie MTAG_ABI_COMPAT and use this in defining openbsd-compatible m_tag_find and m_tag_get routines o rewrite KAME use of aux mbufs in terms of packet tags o eliminate the most heavily used aux mbufs by adding an additional struct inpcb parameter to ip_output and ip6_output to allow the IPsec code to locate the security policy to apply to outbound packets o bump __FreeBSD_version so code can be conditionalized o fixup ipfilter's call to ip_output based on __FreeBSD_version ---------------------------- revision 1.6.2.9 date: 2003/01/20 02:23:08; author: iedowse; state: Exp; lines: +8 -2 MFC: Bridged packets need to have their IP header converted to host byte order before being passed to icmp_error(). ---------------------------- revision 1.6.2.8 date: 2002/12/23 10:02:40; author: maxim; state: Exp; lines: +1 -1 MFC rev. 1.20: fix byte order logging issue: sa.sin_port is already in host byte order. ---------------------------- revision 1.6.2.7 date: 2002/11/21 01:27:30; author: luigi; state: Exp; lines: +0 -1 MFC: obey to fw_one_pass in bridge and layer 2 firewalling (the latter only affects ipfw2 users). Move fw_one_pass from ip_fw[2].c to ip_input.c to avoid depending on IPFIREWALL. 
----------------------------
revision 1.6.2.6
date: 2002/11/18 21:32:37; author: luigi; state: Exp; lines: +3 -3
MFC 1.12: replace incorrect usage of M_DONTWAIT with M_NOWAIT
----------------------------
revision 1.6.2.5
date: 2002/11/04 15:57:21; author: maxim; state: Exp; lines: +24 -25
MFC revs. 1.14, 1.15, 1.18:

o Use syslog for messages about dropped sessions, do not flood a console.
o Kill EOL spaces.
----------------------------
revision 1.6.2.4
date: 2002/10/24 18:24:13; author: luigi; state: Exp; lines: +2 -2
MFC: use correct address in DST_SET and SRC_SET rules.
----------------------------
revision 1.6.2.3
date: 2002/08/21 05:34:07; author: luigi; state: Exp; lines: +4 -6
Sync the file with the one in -current: allow up to 30 ports/ranges
in port specifications, clean up formatting of the boot string.

No functional or ABI changes.
----------------------------
revision 1.6.2.2
date: 2002/08/16 11:03:11; author: luigi; state: Exp; lines: +181 -35
Synchronize ipfw2 with the version in -current (adding sets of rules,
prevention of loops in keepalive generation, better defaults on size
of dynamic rule table).

For documentation, please refer to the ipfw manpage in -current (which
I am going to MFC as soon as I have completed the section listing
differences between ipfw1-stable and ipfw2).
In particular have a look at the sections "PACKET FLOW",
"IPFW2 ENHANCEMENTS" and "EXAMPLES" to see if your ruleset can be
simplified with the new commands.
----------------------------
revision 1.6.2.1
date: 2002/07/24 03:21:23; author: luigi; state: Exp; lines: +0 -1
Bring ipfw2 into the -stable tree. This will give more people a chance
to test it, and hopefully accelerate the transition from the old to the
new ipfw code.

NOTE: THIS COMMIT WILL NOT CHANGE THE FIREWALL YOU USE, NOR A SINGLE BIT
IN YOUR KERNEL AND BINARIES.
YOU WILL KEEP USING YOUR OLD "ipfw" UNLESS YOU:

+ add "options IPFW2" (undocumented) to your kernel config file;
+ compile and install sbin/ipfw and lib/libalias with
        make -DIPFW2

in other words, you must really want it.

On the other hand, I believe you do really want to use this new code.
In addition to being twice as fast in processing individual rules, you can
use more powerful match patterns such as

        ... ip from 1.2.3.0/24{50,6,27,158} to ...
        ... ip from { 1.2.3.4/26 or 5.6.7.8/22 } to ...
        ... ip from any 5-7,9-66,1020-3000,4000-5000 to ...

i.e. match sparse sets of IP addresses in constant time; use "or"
connectives between match patterns; have multiple port ranges; etc.,
which I believe will dramatically reduce your ruleset size.

As an additional bonus, "keep-state" rules will now send keepalives
when the rule is about to expire, so you will not have your remote login
sessions die while you are idle.

The syntax is backward compatible with the old ipfw.
A manual page documenting the extensions has yet to be completed.
----------------------------
revision 1.19.2.1
date: 2002/12/15 13:57:43; author: maxim; state: Exp; lines: +1 -1
MFC rev. 1.20: fix byte order logging issue: sa.sin_port is already in
host byte order.

Approved by: re (rwatson)
----------------------------
revision 1.28.2.1
date: 2003/06/04 02:19:36; author: ticso; state: Exp; lines: +15 -6
MFC: Change handling to support strong alignment architectures such as
alpha and sparc64.
src/sbin/ipfw/ipfw2.c (1.24)
src/sys/netinet/ip_dummynet.c (1.64)
src/sys/netinet/ip_fw.h (1.77)
src/sys/netinet/ip_fw2.c (1.30)

Approved by: re (scottl)
----------------------------
revision 1.51.2.1
date: 2003/12/23 12:25:56; author: maxim; state: Exp; lines: +1 -1
MFC rev. 1.52: IN_MULTICAST wants an address in host byte order.

Approved by: re (scottl)
----------------------------
revision 1.70.2.14
date: 2005/06/29 21:38:48; author: simon; state: Exp; lines: +20 -17
Correct ipfw packet matching errors with address tables.

Security: CAN-2005-2019
Security: FreeBSD-SA-05:13.ipfw

Correct bzip2 denial of service and permission race vulnerabilities.

Obtained from: Redhat, Steve Grubb via RedHat
Security: CAN-2005-0953, CAN-2005-1260
Security: FreeBSD-SA-05:14.bzip2
Approved by: obrien

Correct TCP connection stall denial of service vulnerability.

TCP packets with the SYN flag set are accepted for established
connections, allowing an attacker to overwrite certain TCP options.

Security: CAN-2005-2068
Security: FreeBSD-SA-05:15.tcp
Approved by: cperciva
----------------------------
revision 1.70.2.13
date: 2005/06/17 23:56:59; author: green; state: Exp; lines: +32 -8
MFC: r1.101

Invoke the transmission of IPFW's stateful keep-alives once the locks
have been dropped, thus preventing a deadlock between IPFW and the ifnet.
----------------------------
revision 1.70.2.12
date: 2005/06/17 23:30:32; author: green; state: Exp; lines: +3 -0
MFC: ipfw.8 r.1.172, ip_fw2.c r.1.100

Properly document the IPFW ALTQ first-match behavior that was intended,
as well as actually implementing it.
----------------------------
revision 1.70.2.11
date: 2005/05/12 15:11:30; author: green; state: Exp; lines: +92 -1
MFC: IPFW ALTQ(4) classification support, diverted traffic match rules,
and the TCP packet data length match rule.
----------------------------
revision 1.70.2.10
date: 2005/02/06 16:16:20; author: csjp; state: Exp; lines: +7 -2
branches: 1.70.2.10.2;
MFC v1.88

FreeBSD src repository

Modified files:
  sys/netinet ip_fw2.c
Log:
Change the state allocator from using regular malloc to using
a UMA zone instead. This should eliminate a bit of the locking
overhead associated with malloc and reduce the memory
consumption associated with each new state.

Reviewed by: rwatson, andre
Silence on: ipfw@
MFC after: 1 week
----------------------------
revision 1.70.2.9
date: 2005/01/31 23:26:35; author: imp; state: Exp; lines: +1 -1
MFC: /*- and related license changes
----------------------------
revision 1.70.2.8
date: 2005/01/07 23:09:39; author: csjp; state: Exp; lines: +69 -29
MFC v1.85

Log:
This commit adds a shared locking mechanism very similar to the
mechanism used by pfil. This shared locking mechanism will remove
a nasty lock order reversal which occurs when ucred based rules
are used which results in hard locks while mpsafenet=1.

So this removes the debug.mpsafenet=0 requirement when using
ucred based rules with IPFW.

It should be noted that this locking mechanism does not guarantee
fairness between read and write locks, and that it will favor
firewall chain readers over writers. This seemed acceptable since
write operations to firewall chains protected by this lock tend to
be less frequent than reads.

Reviewed by: andre, rwatson
Tested by: myself, seanc
Silence on: ipfw@
MFC after: 1 month
----------------------------
revision 1.70.2.7
date: 2004/10/13 22:07:05; author: green; state: Exp; lines: +5 -0
MFC r1.78: further rule verification (against corrupt rules added by root).

Approved by: re
----------------------------
revision 1.70.2.6
date: 2004/10/03 17:04:40; author: mlaier; state: Exp; lines: +32 -11
MFC pfil API change:

Add an additional struct inpcb * argument to pfil(9) in order to enable
passing along socket information. This is required to work around a LOR
with the socket code which results in an easily reproducible hard lockup
with debug.mpsafenet=1. This commit does *not* fix the LOR, but enables
us to do so later. The missing piece is to turn the filter locking into
a leaf lock and will follow in a separate (later) commit.

Suggested by: rwatson
A lot of work by: csjp
LOR IDs: 14 - 17 (not fixed yet)
Approved by: re (scottl)
----------------------------
revision 1.70.2.5
date: 2004/09/16 18:02:22; author: andre; state: Exp; lines: +4 -0
MFC 1.76: Return EINVAL when IPFIREWALL_FORWARD is not compiled into
the kernel.

Approved by: re (scottl)
----------------------------
revision 1.70.2.4
date: 2004/09/13 07:19:55; author: maxim; state: Exp; lines: +5 -2
MFC rev.1.75: recover normal behavior: return EINVAL to attempt to add
a divert rule when module is built without IPDIVERT.

Approved by: re (kensmith)
----------------------------
revision 1.70.2.3
date: 2004/08/26 14:41:43; author: ru; state: Exp; lines: +2 -0
Fix a standalone module build.

Approved by: re (scottl)
----------------------------
revision 1.70.2.2
date: 2004/08/20 02:02:05; author: kensmith; state: Exp; lines: +1 -1
Almost an oops, local mirror hadn't caught up with the last version of
this file yet...

MFC of rev. 1.73 - definitely stop callout.

Work done by: andre
Noticed my oops: andre
Approved by: re
----------------------------
revision 1.70.2.1
date: 2004/08/20 01:40:42; author: kensmith; state: Exp; lines: +3 -2
MFC balance of ipfw fixes. Revs being MFC-ed:

  ip_fw.h rev 1.90
  ip_fw2.c rev 1.71-1.73
  ip_fw_pfil.c rev 1.2-1.3
  ip_input.c rev 1.285
  tcp_subr.c rev 1.202

Work done by: andre
Approved by: re
----------------------------
revision 1.70.2.10.2.1
date: 2005/06/29 21:41:03; author: simon; state: Exp; lines: +20 -17
Correct ipfw packet matching errors with address tables.

Security: CAN-2005-2019
Security: FreeBSD-SA-05:13.ipfw

Correct bzip2 denial of service and permission race vulnerabilities.

Obtained from: Redhat, Steve Grubb via RedHat
Security: CAN-2005-0953, CAN-2005-1260
Security: FreeBSD-SA-05:14.bzip2
Approved by: obrien

Correct TCP connection stall denial-of-service vulnerabilities.

MFC: rev 1.270 of tcp_input.c, rev 1.25 of tcp_seq.h by ps:

When a TCP packet containing a timestamp is received, inadequate
checking of sequence numbers is performed, allowing an attacker to
artificially increase the internal "recent" timestamp for a connection.

TCP packets with the SYN flag set are accepted for established
connections, allowing an attacker to overwrite certain TCP options.

Security: CAN-2005-0356, CAN-2005-2068
Security: FreeBSD-SA-05:15.tcp
Approved by: so (cperciva)
----------------------------
revision 1.106.2.5
date: 2005/11/14 22:33:35; author: suz; state: Exp; lines: +8 -2
MFC 1.115
fixed a bug where uRPF does not work properly for an IPv6 packet bound
for the sending machine itself (this is a bug introduced due to a change
in ip6_input.c:Rev.1.81.2.2)
----------------------------
revision 1.106.2.4
date: 2005/11/04 20:26:14; author: ume; state: Exp; lines: +3 -0
MFC: scope cleanup. with this change

- most of the kernel code will not care about the actual encoding of
  scope zone IDs and won't touch "s6_addr16[1]" directly.
- similarly, most of the kernel code will not care about link-local
  scoped addresses as a special case.
- scope boundary check will be stricter. For example, the current
  *BSD code allows a packet with src=::1 and dst=(some global IPv6
  address) to be sent outside of the node, if the application does:

        s = socket(AF_INET6);
        bind(s, "::1");
        sendto(s, some_global_IPv6_addr);

  This is clearly wrong, since ::1 is only meaningful within a single
  node, but the current implementation of the *BSD kernel cannot
  reject this attempt.

sys/net/if_gif.c: 1.53
sys/net/if_spppsubr.c: 1.120
sys/netinet/icmp6.h: 1.19
sys/netinet/ip_carp.c: 1.28,1.29
sys/netinet/ip_fw2.c: 1.107
sys/netinet/tcp_subr.c: 1.230,1.231,1.235
sys/netinet/tcp_usrreq.c: 1.125
sys/netinet6/ah_core.c: 1.26
sys/netinet6/icmp6.c: 1.63,1.64
sys/netinet6/in6.c: 1.52
sys/netinet6/in6.h: 1.38
sys/netinet6/in6_cksum.c: 1.11
sys/netinet6/in6_ifattach.c: 1.27
sys/netinet6/in6_pcb.c: 1.63
sys/netinet6/in6_proto.c: 1.33
sys/netinet6/in6_src.c: 1.31,1.32
sys/netinet6/in6_var.h: 1.22
sys/netinet6/ip6_forward.c: 1.29
sys/netinet6/ip6_input.c: 1.83
sys/netinet6/ip6_mroute.c: 1.30
sys/netinet6/ip6_output.c: 1.95
sys/netinet6/ip6_var.h: 1.33
sys/netinet6/ipsec.c: 1.43
sys/netinet6/mld6.c: 1.21
sys/netinet6/nd6.c: 1.50
sys/netinet6/nd6_nbr.c: 1.30
sys/netinet6/nd6_rtr.c: 1.27
sys/netinet6/raw_ip6.c: 1.54
sys/netinet6/route6.c: 1.12
sys/netinet6/scope6.c: 1.13,1.14,1.15
sys/netinet6/scope6_var.h: 1.5
sys/netinet6/udp6_output.c: 1.23
sys/netinet6/udp6_usrreq.c: 1.55
sys/netkey/key.c: 1.72,1.73
----------------------------
revision 1.106.2.3
date: 2005/09/17 13:43:36; author: bz; state: Exp; lines: +30 -6
MFC: rev. 1.111

Fix panic when kernel compiled without INET6 by rejecting
IPv6 opcodes which are behind #if(n)def INET6 now.

PR: kern/85826
Approved by: re (scottl)
----------------------------
revision 1.106.2.2
date: 2005/09/08 22:49:23; author: sam; state: Exp; lines: +1 -0
MFC 1.110: clear lock on error in O_LIMIT case of install_state

Approved by: re (scottl)
----------------------------
revision 1.106.2.1
date: 2005/08/20 08:36:57; author: bz; state: Exp; lines: +289 -67
MFC:
rev. 1.108, 1.109 src/sys/netinet/ip_fw2.c
rev. 1.101        src/sys/netinet/ip_fw.h
rev. 1.77         src/sbin/ipfw/ipfw2.c
rev. 1.176        src/sbin/ipfw/ipfw.8

* Add dynamic sysctl for net.inet6.ip6.fw.
* Correct handling of IPv6 Extension Headers.
* Add unreach6 code.
* Add logging for IPv6.
* Fix build without INET6 and IPFIREWALL compiled into kernel.[1]

Submitted by: sysctl handling derived from patch from ume needed for ip6fw
Obtained from: is_icmp6_query and send_reject6 derived from similar
               functions of netinet6,ip6fw
Reviewed by: ume, gnn; silence on ipfw@
Spotted and tested by: Michal Mertl [1]
Approved by: re (kensmith)
=============================================================================

--7LkOrbQMr4cezO2T--