From owner-cvs-src@FreeBSD.ORG Thu Feb 10 15:07:14 2005 Return-Path: Delivered-To: cvs-src@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id AE82A16A4D1 for ; Thu, 10 Feb 2005 15:07:14 +0000 (GMT) Received: from c00l3r.networx.ch (c00l3r.networx.ch [62.48.2.2]) by mx1.FreeBSD.org (Postfix) with ESMTP id 1D0C443D49 for ; Thu, 10 Feb 2005 15:07:13 +0000 (GMT) (envelope-from oppermann@networx.ch) Received: (qmail 30970 invoked from network); 10 Feb 2005 14:45:33 -0000 Received: from unknown (HELO networx.ch) ([62.48.0.53]) (envelope-sender ) by c00l3r.networx.ch (qmail-ldap-1.03) with SMTP for ; 10 Feb 2005 14:45:33 -0000 Message-ID: <420B78A0.3D500650@networx.ch> Date: Thu, 10 Feb 2005 16:07:12 +0100 From: Andre Oppermann X-Mailer: Mozilla 4.8 [en] (Windows NT 5.0; U) X-Accept-Language: en MIME-Version: 1.0 To: Gleb Smirnoff References: <200502051206.j15C6YOY015206@repoman.freebsd.org> <420B3CDA.9033810C@networx.ch> <20050210110334.GB21237@cell.sick.ru> <420B6D38.F9C6DDAF@networx.ch> <20050210143626.GA23361@cell.sick.ru> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit cc: cvs-src@freebsd.org cc: src-committers@freebsd.org cc: cvs-all@freebsd.org Subject: Re: cvs commit: src/sbin/ipfw ipfw2.c src/sys/netgraph ng_ipfw.cng_ipfw.h src/sys/netinet ip_fw.h ip_fw2.c ip_fw_pfil.c X-BeenThere: cvs-src@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: CVS commit messages for the src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 10 Feb 2005 15:07:14 -0000 Gleb Smirnoff wrote: > > Andre, > > On Thu, Feb 10, 2005 at 03:18:32PM +0100, Andre Oppermann wrote: > A> > Andre, all other edge netgraph nodes does not queue packet for ISR. > A> > "Edge node" stands for a node which is an interface between netgraph and > A> > other networking subsystem. It has been proved in practice, that it is not > A> > needed. And in theory, there is no way ro recurse. You say, that you can > A> > describe a normal setup which leads to recursion. I can't. So pls describe it. > A> > A> There is a major difference to the other edge nodes. When ng_ipfw is > A> being entered the pfil lock is still held we are in the middle of > A> ip_input/ouput() processing. For example I can quite readily imagine > A> someone ng-diverting all SMTP TCP sessions into some kind tunneling > A> encapsulation, perhaps through ng_ppp->ng_l2tp which causes the packet > A> so re-enter ip_input() again. Then it may be that I have another > A> ng-diversion either intentionally or accidentially causing my stack > A> to grow until it explodes. In addition the pfil read lock is held > A> all the time and even recursed on. Given a certain traffic level it > A> may load to deadlock or prevent any pfil hooks changes for a long time. > A> This is dangerous territory and netgraph is not bound in depth or path > A> a packet may take. > A> ng_ipfw is special in this regard compared to other netgraph edges > A> precisely because it gets called from the middle of ip_input/output(). > A> And because of that you have to do some special treatment. I wouldn't > A> complain if there were a guarnatee that packets going out via ng_ipfw > A> absolutely come back through it again, but there isn't. You can set > A> it up as one-way thing. > > You haven't gave an example of normal setup, bringing recursion. Neither > ng_ppp nor ng_l2tp can reinject packet to IP stack. Looking for recursion, > you should look only at edge nodes. ng_ppp and ng_l2tp are protocol implementing > nodes, not edge ones. Probably you meant that packet diverted via ng_ipfw > traverses ng_ppp|ng_l2tp and comes on ng_iface node, which implements system > interface. I will doubtfully count this setup as "normal", but anyway there is > no recursion. ng_iface, as a well designed edge node, queues packets for ISR. Ok, I was looking for such a clear explanation. I'm not an netgraph expert at all and need you to tell me what happens in there to get the full picture. Thanks for this writeup. So far I was assuming the worst case. Are you sure ng_iface does not and never will do direct dispatch? > Possible recursion, you are speaking about should be solved in other direction. > Every edge node should queue packets, when they exit netgraph. And they all do. > If you find one that does not, we will fix it not ng_ipfw. How does the return path for ng_ipfw do it before it sends the packet back to ip_input/output()? And different question: If netgraph doesn't queue the packet why can't you just pick it up again and let it continue directly from ip_ipfw_pfil? E.g. why does it have to reinject the packet? > A> > A> The other thing is the passing back of errors from netgraph. Only > A> > A> certain kinds of errors should be reported back and others converted > A> > A> to some default error. It is very confusing for an application > A> > A> developer to get a very (from his POV) non-sensical error message > A> > A> like ENOTCONN when writing on a socket. He doesn't have knowledge > A> > A> of the ipfw/netgraph stuff that happens in the kernel and it makes > A> > A> debugging extremely confusing. In the he blames FreeBSDs socket > A> > A> implementation whereas it was only some error in setting up the > A> > A> netgraph by the administrator. > A> > > A> > I have already replied this in net@ list. OK, I'll ask again: Do you want > A> > ng_ipfw_rcvdata() to end with "return (0)"? > A> > A> No. I want it to pass EACCES, ENOMEM/ENOBUFS and everything else as > A> ENXIO (including if hook not connected). In a previous email I said > A> ESRCH was ok, but it really doesn't make sense and is pretty confusing > A> to an application writer. ENXIO is much better and not to be confused > A> with possible programming errors from application side. > > I have asked you in mail: "Do you want this ugly construction at and of > ng_ipfw_input()?" > > NG_SEND_DATA_ONLY(error, hook, m); > > if (error == EACCES || error == ENOMEM || error == ENOBUFS) > return (error); > else > return (0); > > Do you want it? No, I want this: NG_SEND_DATA_ONLY(error, hook, m); if (error == EACCES || error == ENOMEM || error == ENOBUFS) return (error); else if (error != 0) return (ENXIO); else return (0); or alternatively: switch (error) { case 0: return (0); break; /* Not reached. */ case EACCES: case ENOMEM: case ENOBUFS: return (error); break; /* Not reached. */ default: return (ENXIO); break; /* Not reached. */ } > Ok, let's treat question about substituting ESRCH with ENXIO separately. I can > accept it, just to meet your wishes. Please, ask Ruslan for review, since he > showed interest to this node particularly, and uses netgraph in general. -- Andre