From: Ed Maste
Date: Tue, 14 Apr 2009 16:45:17 +0000 (UTC)
To: src-committers@freebsd.org, svn-src-all@freebsd.org,
    svn-src-stable@freebsd.org, svn-src-stable-6@freebsd.org
Subject: svn commit: r191063 - in stable/6/sys: . contrib/pf dev/cxgb netinet
Message-Id: <200904141645.n3EGjH6g013779@svn.freebsd.org>
List-Id: SVN commit messages for only the 6-stable src tree

Author: emaste
Date: Tue Apr 14 16:45:17 2009
New Revision: 191063
URL: http://svn.freebsd.org/changeset/base/191063

Log:
  MFC r171746 by csjp

  Summary:

  - We disallow multicast operations on divert sockets. It really doesn't make
    semantic sense to allow this, since typically you would set multicast
    parameters on multicast end points.
    NOTE: As a part of this change, we actually disallow multicast options on
    any socket that IS a divert socket OR IS NOT a SOCK_RAW or SOCK_DGRAM
    family.
  - We check to see if there are any socket options that have been specified
    on the socket, and if there are (which is very uncommon and also probably
    doesn't make sense to support) we duplicate the mbuf carrying the options.
  - We then drop the INP/INFO locks over the call to ip_output(). It should be
    noted that since we no longer support multicast operations on divert
    sockets and we have duplicated any socket options, we no longer need the
    reference to the pcb to be coherent.
  - Finally, we replaced the call to ip_input() to use netisr queuing. This
    should remove the recursive entry into the IP stack from divert.

  (The ip_output.c changes come from in_mcast.c in head.)

Modified:
  stable/6/sys/   (props changed)
  stable/6/sys/contrib/pf/   (props changed)
  stable/6/sys/dev/cxgb/   (props changed)
  stable/6/sys/netinet/ip_divert.c
  stable/6/sys/netinet/ip_output.c

Modified: stable/6/sys/netinet/ip_divert.c
==============================================================================
--- stable/6/sys/netinet/ip_divert.c	Tue Apr 14 16:24:56 2009	(r191062)
+++ stable/6/sys/netinet/ip_divert.c	Tue Apr 14 16:45:17 2009	(r191063)
@@ -61,6 +61,7 @@
 #include
 #include
+#include
 #include
 #include
@@ -303,6 +304,7 @@ div_output(struct socket *so, struct mbu
 	struct m_tag *mtag;
 	struct divert_tag *dt;
 	int error = 0;
+	struct mbuf *options;
 
 	/*
 	 * An mbuf may hasn't come from userland, but we pretend
@@ -359,6 +361,8 @@ div_output(struct socket *so, struct mbu
 			if (((ip->ip_hl != (sizeof (*ip) >> 2)) && inp->inp_options) ||
 			     ((u_short)ntohs(ip->ip_len) > m->m_pkthdr.len)) {
 				error = EINVAL;
+				INP_UNLOCK(inp);
+				INP_INFO_WUNLOCK(&divcbinfo);
 				m_freem(m);
 			} else {
 				/* Convert fields to host order for ip_output() */
@@ -371,15 +375,46 @@ div_output(struct socket *so, struct mbu
 #ifdef MAC
 				mac_create_mbuf_from_inpcb(inp, m);
 #endif
-				error = ip_output(m,
-					    inp->inp_options, NULL,
-					    ((so->so_options & SO_DONTROUTE) ?
-					    IP_ROUTETOIF : 0) |
-					    IP_ALLOWBROADCAST | IP_RAWOUTPUT,
-					    inp->inp_moptions, NULL);
+				/*
+				 * Get ready to inject the packet into ip_output().
+				 * Just in case socket options were specified on the
+				 * divert socket, we duplicate them. This is done
+				 * to avoid having to hold the PCB locks over the call
+				 * to ip_output(), as doing this results in a number of
+				 * lock ordering complexities.
+				 *
+				 * Note that we set the multicast options argument for
+				 * ip_output() to NULL since it should be invariant that
+				 * they are not present.
+				 */
+				KASSERT(inp->inp_moptions == NULL,
+				    ("multicast options set on a divert socket"));
+				options = NULL;
+				/*
+				 * XXXCSJP: It is unclear to me whether or not it makes
+				 * sense for divert sockets to have options. However,
+				 * for now we will duplicate them with the INP locks
+				 * held so we can use them in ip_output() without
+				 * requring a reference to the pcb.
+				 */
+				if (inp->inp_options != NULL) {
+					options = m_dup(inp->inp_options, M_DONTWAIT);
+					if (options == NULL)
+						error = ENOBUFS;
+				}
+				INP_UNLOCK(inp);
+				INP_INFO_WUNLOCK(&divcbinfo);
+				if (error == ENOBUFS) {
+					m_freem(m);
+					return (error);
+				}
+				error = ip_output(m, options, NULL,
+				    ((so->so_options & SO_DONTROUTE) ?
+				    IP_ROUTETOIF : 0) | IP_ALLOWBROADCAST |
+				    IP_RAWOUTPUT, NULL, NULL);
+				if (options != NULL)
+					m_freem(options);
 			}
-			INP_UNLOCK(inp);
-			INP_INFO_WUNLOCK(&divcbinfo);
 		} else {
 			dt->info |= IP_FW_DIVERT_LOOPBACK_FLAG;
 			if (m->m_pkthdr.rcvif == NULL) {
@@ -404,8 +439,8 @@ div_output(struct socket *so, struct mbu
 			mac_create_mbuf_from_socket(so, m);
 			SOCK_UNLOCK(so);
 #endif
-			/* Send packet to input processing */
-			ip_input(m);
+			/* Send packet to input processing via netisr */
+			netisr_queue(NETISR_IP, m);
 		}
 
 	return error;

Modified: stable/6/sys/netinet/ip_output.c
==============================================================================
--- stable/6/sys/netinet/ip_output.c	Tue Apr 14 16:24:56 2009	(r191062)
+++ stable/6/sys/netinet/ip_output.c	Tue Apr 14 16:45:17 2009	(r191063)
@@ -1710,6 +1710,16 @@ ip_setmoptions(struct inpcb *inp, struct
 	int ifindex;
 	int s;
 
+	/*
+	 * If socket is neither of type SOCK_RAW or SOCK_DGRAM,
+	 * or is a divert socket, reject it.
+	 * XXX Unlocked read of inp_socket believed OK.
+	 */
+	if (inp->inp_socket->so_proto->pr_protocol == IPPROTO_DIVERT ||
+	    (inp->inp_socket->so_proto->pr_type != SOCK_RAW &&
+	     inp->inp_socket->so_proto->pr_type != SOCK_DGRAM))
+		return (EOPNOTSUPP);
+
 	switch (sopt->sopt_name) {
 	/* store an index number for the vif you wanna use in the send */
 	case IP_MULTICAST_VIF:
@@ -1995,6 +2005,16 @@ ip_getmoptions(struct inpcb *inp, struct
 
 	INP_LOCK(inp);
 	imo = inp->inp_moptions;
+	/*
+	 * If socket is neither of type SOCK_RAW or SOCK_DGRAM,
+	 * or is a divert socket, reject it.
+	 */
+	if (inp->inp_socket->so_proto->pr_protocol == IPPROTO_DIVERT ||
+	    (inp->inp_socket->so_proto->pr_type != SOCK_RAW &&
+	     inp->inp_socket->so_proto->pr_type != SOCK_DGRAM)) {
+		INP_UNLOCK(inp);
+		return (EOPNOTSUPP);
+	}
 
 	error = 0;
 	switch (sopt->sopt_name) {
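
For readers unfamiliar with the locking change described in the log, the
following is a rough userland sketch of the same idea: copy the shared
options while the lock is held, release the lock, and only then call the
slow output routine with the private copy. It is not kernel code and not
part of the commit. struct pcb, send_with_copied_options(), demo_pcb and
demo_output() are hypothetical names invented for this illustration, and a
pthread mutex stands in for the INP/INFO locks.

```c
#include <errno.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a PCB: option bytes guarded by a lock. */
struct pcb {
	pthread_mutex_t	 lock;
	char		*options;	/* may be NULL */
	size_t		 optlen;
};

/* Hypothetical output routine type, standing in for ip_output(). */
typedef int (*output_fn)(const char *pkt, const char *opts, size_t optlen);

/*
 * Duplicate the options under the lock, drop the lock, then call the
 * output routine with the private copy.  Returns ENOBUFS if the copy
 * could not be allocated, otherwise whatever the output routine returns.
 */
int
send_with_copied_options(struct pcb *p, const char *pkt, output_fn out)
{
	char *copy = NULL;
	size_t len = 0;
	int error = 0;

	pthread_mutex_lock(&p->lock);
	if (p->options != NULL && p->optlen > 0) {
		len = p->optlen;
		copy = malloc(len);
		if (copy == NULL)
			error = ENOBUFS;
		else
			memcpy(copy, p->options, len);
	}
	pthread_mutex_unlock(&p->lock);	/* lock is not held past this point */

	if (error != 0)
		return (error);
	error = out(pkt, copy, len);
	free(copy);			/* free(NULL) is a no-op */
	return (error);
}

/* Demo state: one PCB and an output routine that checks the lock is free. */
struct pcb demo_pcb = { PTHREAD_MUTEX_INITIALIZER, NULL, 0 };

int
demo_output(const char *pkt, const char *opts, size_t optlen)
{
	(void)pkt;
	(void)opts;
	/* Fails if the caller were still holding the PCB lock. */
	if (pthread_mutex_trylock(&demo_pcb.lock) != 0)
		return (-1);
	pthread_mutex_unlock(&demo_pcb.lock);
	return ((int)optlen);	/* report how many option bytes arrived */
}
```

Calling send_with_copied_options(&demo_pcb, "pkt", demo_output) returns the
number of option bytes that were copied, or 0 when no options are set. The
point of the copy is that the output routine never needs the lock, which is
how the commit avoids holding the PCB locks across ip_output().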