From owner-freebsd-net@FreeBSD.ORG Mon Nov 15 10:43:38 2004
Return-Path:
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 5ECC316A4CE;
	Mon, 15 Nov 2004 10:43:38 +0000 (GMT)
Received: from relay.bestcom.ru (relay.bestcom.ru [217.72.144.5])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 8695A43D49;
	Mon, 15 Nov 2004 10:43:37 +0000 (GMT)
	(envelope-from glebius@freebsd.org)
Received: from cell.sick.ru (root@cell.sick.ru [217.72.144.68])
	by relay.bestcom.ru (8.13.1/8.12.9) with ESMTP id iAFAhW3V079147
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=FAIL);
	Mon, 15 Nov 2004 13:43:33 +0300 (MSK)
	(envelope-from glebius@freebsd.org)
Received: from cell.sick.ru (glebius@localhost [127.0.0.1])
	by cell.sick.ru (8.12.11/8.12.8) with ESMTP id iAFAhW6r093546
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
	Mon, 15 Nov 2004 13:43:32 +0300 (MSK)
	(envelope-from glebius@freebsd.org)
Received: (from glebius@localhost)
	by cell.sick.ru (8.12.11/8.12.11/Submit) id iAFAhVtI093545;
	Mon, 15 Nov 2004 13:43:31 +0300 (MSK)
	(envelope-from glebius@freebsd.org)
X-Authentication-Warning: cell.sick.ru: glebius set sender to glebius@freebsd.org using -f
Date: Mon, 15 Nov 2004 13:43:31 +0300
From: Gleb Smirnoff
To: julian@freebsd.org, archie@freebsd.org
Message-ID: <20041115104331.GA93477@cell.sick.ru>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="MGYHOYXEY6WxJCY8"
Content-Disposition: inline
User-Agent: Mutt/1.5.6i
X-Virus-Scanned: clamd / ClamAV version devel-20041013, clamav-milter version 0.75l on 127.0.0.1
X-Virus-Status: Clean
cc: maxim@freebsd.org
cc: rwatson@freebsd.org
cc: net@freebsd.org
Subject: divert(4) socket isn't connection oriented
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date:
	Mon, 15 Nov 2004 10:43:38 -0000

--MGYHOYXEY6WxJCY8
Content-Type: text/plain; charset=koi8-r
Content-Disposition: inline

  Hi!

  I've spent several days digging into the interaction between the
divert protocol and ng_ksocket, and I've found some oddities there.

  Look at div_output(): it tells an incoming packet from an outgoing
one by the presence of a sockaddr_in structure. Depending on that, the
packet is passed either to ip_input() or to ip_output(). This
sockaddr_in was previously supplied by the divert interface to the
application at the end of divert_packet(), with the help of
sbappendaddr_locked().

  Look at ng_ksocket_incoming2(), near 'pru_soreceive'. When ng_ksocket
takes data from a socket, it saves the supplied sockaddr in an m_tag
attached to the packet. After the packet travels through netgraph(4),
it may return to ng_ksocket in ng_ksocket_rcvdata(), and this sockaddr
will be used in the call to sosend().

  It is important that ng_ksocket does not save the sockaddr if the
socket is connected (see ng_ksocket_incoming2(), near 'pru_soreceive').
And this is correct! If a generic socket is connected, all data must be
sent to the connection destination.

  The problem is that a divert(4) socket is always marked as connected
(see the end of div_attach()), and thus ng_ksocket_rcvdata() does not
supply a sockaddr in the call to sosend(). So div_output() _always_
sends data to ip_output()! This is strange and odd, but it works:
packets flow in both directions. This setup is working:

/usr/sbin/ngctl -f- <<-SEQ
	mkpeer echo dummy dummy
	name .:dummy echo_div
	mkpeer echo_div: ksocket echo inet/raw/divert
	name echo_div:echo div_sock
	rmhook dummy
	msg div_sock: bind inet/0.0.0.0:8888
SEQ
ipfw add 1 divert 8888 all from any to any via fxp0 in
ping -c 1 www.com

  Since it works, the problem was not noticed quickly. Real problems
occur when a multicast packet arrives on the interface: it is diverted
to ng_ksocket, returned, and div_output() sends it to ip_output(). In
ip_output() it is ip_mloopback()ed and if_simloop()ed. A copy of the
packet enters the divert socket, duplicated...
a forever loop and total freeze.

  Removing the 'always connected' status from divert sockets fixes the
problem, and incoming packets go into ip_input(), not ip_output().

  Please review the attached patch. It:

- removes SS_ISCONNECTED from so->so_state
- removes the pru_disconnect method, since it won't be called
- removes the pru_abort method, since it must not be called on a
  divert socket. It was used indirectly via the disconnect method.

  With this patch the codepath of packets is correct, and incoming
packets do not enter ip_output() anymore.

  An alternative may be adding a kludge in ng_ksocket_incoming2(), so
that the sockaddr is always saved if the socket is a divert socket.
I don't like it.

  Awaiting your feedback!

-- 
Totus tuus, Glebius.
GLEBIUS-RIPN GLEB-RIPE

--MGYHOYXEY6WxJCY8
Content-Type: text/plain; charset=koi8-r
Content-Disposition: attachment; filename="ip_divert.c.diff"

Index: ip_divert.c
===================================================================
RCS file: /home/ncvs/src/sys/netinet/ip_divert.c,v
retrieving revision 1.109
diff -u -r1.109 ip_divert.c
--- ip_divert.c	12 Nov 2004 22:17:42 -0000	1.109
+++ ip_divert.c	15 Nov 2004 10:14:37 -0000
@@ -415,12 +415,7 @@
 	inp->inp_ip_p = proto;
 	inp->inp_vflag |= INP_IPV4;
 	inp->inp_flags |= INP_HDRINCL;
-	/* The socket is always "connected" because
-	   we always know "where" to send the packet */
 	INP_UNLOCK(inp);
-	SOCK_LOCK(so);
-	so->so_state |= SS_ISCONNECTED;
-	SOCK_UNLOCK(so);
 	return 0;
 }
 
@@ -442,32 +437,6 @@
 }
 
 static int
-div_abort(struct socket *so)
-{
-	struct inpcb *inp;
-
-	INP_INFO_WLOCK(&divcbinfo);
-	inp = sotoinpcb(so);
-	if (inp == 0) {
-		INP_INFO_WUNLOCK(&divcbinfo);
-		return EINVAL;	/* ??? possible? panic instead? */
-	}
-	INP_LOCK(inp);
-	soisdisconnected(so);
-	in_pcbdetach(inp);
-	INP_INFO_WUNLOCK(&divcbinfo);
-	return 0;
-}
-
-static int
-div_disconnect(struct socket *so)
-{
-	if ((so->so_state & SS_ISCONNECTED) == 0)
-		return ENOTCONN;
-	return div_abort(so);
-}
-
-static int
 div_bind(struct socket *so, struct sockaddr *nam, struct thread *td)
 {
 	struct inpcb *inp;
@@ -662,12 +631,10 @@
 #endif
 
 struct pr_usrreqs div_usrreqs = {
-	.pru_abort =		div_abort,
 	.pru_attach =		div_attach,
 	.pru_bind =		div_bind,
 	.pru_control =		in_control,
 	.pru_detach =		div_detach,
-	.pru_disconnect =	div_disconnect,
 	.pru_peeraddr =		div_peeraddr,
 	.pru_send =		div_send,
 	.pru_shutdown =		div_shutdown,

--MGYHOYXEY6WxJCY8--
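
The direction decision described in the message can be modelled in a few lines of user-space C. This is only a sketch under stated assumptions, not kernel code: `toy_socket`, `toy_packet`, `toy_ksocket_receive()`, `toy_div_output()` and `toy_round_trip()` are hypothetical names invented for illustration. They stand in, very loosely, for the behaviour the mail attributes to ng_ksocket_incoming2(), ng_ksocket_rcvdata() and div_output().

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration only, not kernel structures. */
enum toy_path { TOY_IP_INPUT, TOY_IP_OUTPUT };

struct toy_socket {
	bool connected;		/* models SS_ISCONNECTED in so->so_state */
};

struct toy_packet {
	bool has_addr;		/* models the sockaddr saved in an m_tag */
	struct sockaddr_in addr;
};

/*
 * Receive side as described in the mail: the source address is
 * remembered only when the socket is NOT connected.
 */
static void
toy_ksocket_receive(const struct toy_socket *so, struct toy_packet *pkt,
    const struct sockaddr_in *from)
{
	assert(so != NULL && pkt != NULL);
	pkt->has_addr = false;
	if (!so->connected && from != NULL) {
		pkt->addr = *from;
		pkt->has_addr = true;
	}
}

/*
 * Send side as described in the mail: a packet that carries a sockaddr
 * is treated as incoming, a packet without one as outgoing.
 */
static enum toy_path
toy_div_output(const struct toy_packet *pkt)
{
	return (pkt->has_addr ? TOY_IP_INPUT : TOY_IP_OUTPUT);
}

/* One round trip: divert socket -> netgraph -> back to the divert socket. */
static enum toy_path
toy_round_trip(bool connected)
{
	struct toy_socket so = { .connected = connected };
	struct sockaddr_in from = { .sin_family = AF_INET };
	struct toy_packet pkt;

	toy_ksocket_receive(&so, &pkt, &from);
	return (toy_div_output(&pkt));
}
```

In this model, toy_round_trip(true) corresponds to the always-connected divert socket and yields TOY_IP_OUTPUT, while toy_round_trip(false) corresponds to the socket after the proposed patch and yields TOY_IP_INPUT. The multicast loop itself is not modelled here.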