Date:      Thu, 11 Feb 1999 09:15:29 -0800
From:      "Justin C. Walker" <justin@apple.com>
To:        Chris Csanady <ccsanady@friley-185-205.res.iastate.edu>
Cc:        freebsd-current@FreeBSD.ORG, freebsd-net@FreeBSD.ORG
Subject:   Re: Serious mbuf cluster leak..
Message-ID:  <199902111715.JAA00622@walker3.apple.com>
In-Reply-To: "Your message of Wed, 10 Feb 1999 17:49:20 CST."<19990210234920.2A11B6@friley-185-205.res.iastate.edu>

Ouch.  I can say that our implementation doesn't seem to suffer from this problem.  Could be there's an issue in the use of the PRUS_* flags vs. the socket state we use.  The code in my kernel looks like:


in sosend():

        if (dontroute)
                so->so_options |= SO_DONTROUTE;
        if (resid > 0)
                so->so_state |= SS_MORETOCOME;
        s = splnet();                               /* XXX */
        error = (*so->so_proto->pr_usrreq)(so,
            (flags & MSG_OOB) ? PRU_SENDOOB : PRU_SEND,
            top, addr, control);
        splx(s);
        if (dontroute)
                so->so_options &= ~SO_DONTROUTE;
        so->so_state &= ~SS_MORETOCOME;


and in tcp_output():

        if (len) {
                if (len == tp->t_maxseg)
                        goto send;
                if (!(so->so_state & SS_MORETOCOME)) {
                        if ((idle || tp->t_flags & TF_NODELAY) &&
                            len + off >= so->so_snd.sb_cc)
                                goto send;
                }
                if (tp->t_force)
                        goto send;
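
To make the effect of the SS_MORETOCOME test concrete, here is a small user-space sketch of just the checks quoted above. It is not kernel code: struct send_state and should_send() are hypothetical names, the fields merely stand in for the kernel variables named in the comments, and the real tcp_output() has further send conditions that are not modeled here.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel state consulted by the excerpt. */
struct send_state {
        size_t len;        /* bytes that could be sent now */
        size_t off;        /* offset of the first unsent byte in the buffer */
        size_t sb_cc;      /* bytes queued in the send buffer (so_snd.sb_cc) */
        size_t maxseg;     /* maximum segment size (t_maxseg) */
        bool idle;         /* the excerpt's "idle" variable */
        bool nodelay;      /* TF_NODELAY is set */
        bool moretocome;   /* SS_MORETOCOME is set */
        bool force;        /* t_force is set */
};

/*
 * Returns true where the quoted checks would "goto send".
 * Only the tests shown in the excerpt are modeled (assumption).
 */
static bool
should_send(const struct send_state *s)
{
        if (s->len == 0)
                return false;           /* the excerpt only covers len != 0 */
        if (s->len == s->maxseg)
                return true;            /* a full segment always goes out */
        if (!s->moretocome &&
            (s->idle || s->nodelay) &&
            s->len + s->off >= s->sb_cc)
                return true;            /* this send would empty the buffer */
        if (s->force)
                return true;
        return false;
}
```

With idle set, len = 512, off = 0 and sb_cc = 512, should_send() returns true when moretocome is clear and false when it is set. That is the deferral the flag is meant to produce for a short chunk in the middle of a large write.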


We've subjected this to countless (well, some, I'm sure, can count them :-}) hours of thrashing for web server, file server, and other-server types of uses, and haven't seen any (reports of) leakage like this.


I'll look more closely at the results we see, to verify that we don't have a problem.


Regards,


Justin


From: Chris Csanady <ccsanady@friley-185-205.res.iastate.edu>
Date: 1999-02-11 01:51:17 -0800
To: freebsd-current@FreeBSD.ORG
Subject: Re: Serious mbuf cluster leak..
Cc: freebsd-net@FreeBSD.ORG
In-reply-to: "Your message of Wed, 10 Feb 1999 17:49:20 CST."<19990210234920.2A11B6@friley-185-205.res.iastate.edu>
X-Mailer: exmh version 2.0.2 2/24/98
X-Loop: FreeBSD.org

After a while, I have determined the cause of the leak to be the
following commit.  Although I can't seem to find any reason why
it would cause this behavior, reverting these files fixes it.
Any thoughts?


fenner      1999/01/20 09:32:01 PST

  Modified files:
    sys/kern             uipc_socket.c
    sys/netinet          tcp_output.c tcp_usrreq.c tcp_var.h
    sys/sys              protosw.h
  Log:
  Add a flag, passed to pru_send routines, PRUS_MORETOCOME.  This
  flag means that there is more data to be put into the socket buffer.
  Use it in TCP to reduce the interaction between mbuf sizes and the
  Nagle algorithm.

  Based on:     "Justin C. Walker" <justin@apple.com>'s description of Apple's
                fix for this problem.

  Revision  Changes    Path
  1.50      +4 -2      src/sys/kern/uipc_socket.c
  1.32      +3 -2      src/sys/netinet/tcp_output.c
  1.40      +7 -2      src/sys/netinet/tcp_usrreq.c
  1.49      +18 -17    src/sys/netinet/tcp_var.h
  1.26      +2 -1      src/sys/sys/protosw.h
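
The log message above describes the hint as a per-call flag rather than a socket state bit. The following user-space sketch shows that shape only; it is not the committed code. chunked_send(), send_hook, count_flagged() and the numeric value given to SKETCH_PRUS_MORETOCOME are assumptions made for illustration.

```c
#include <stddef.h>

/* Hypothetical flag value for this sketch; not taken from protosw.h. */
#define SKETCH_PRUS_MORETOCOME 0x1

/* Hypothetical stand-in for a pru_send routine. */
typedef void (*send_hook)(size_t chunk, int flags, void *arg);

/*
 * Hand `total` bytes to `hook` in pieces of at most `chunk_max` bytes.
 * Every call except the last carries SKETCH_PRUS_MORETOCOME.
 * Returns the number of calls made.
 */
static size_t
chunked_send(size_t total, size_t chunk_max, send_hook hook, void *arg)
{
        size_t resid = total;
        size_t calls = 0;

        if (chunk_max == 0)
                return 0;
        while (resid > 0) {
                size_t chunk = resid < chunk_max ? resid : chunk_max;

                resid -= chunk;
                hook(chunk, resid > 0 ? SKETCH_PRUS_MORETOCOME : 0, arg);
                calls++;
        }
        return calls;
}

/* Example hook: counts the calls that were told more data is coming. */
static void
count_flagged(size_t chunk, int flags, void *arg)
{
        (void)chunk;
        if (flags & SKETCH_PRUS_MORETOCOME)
                ++*(size_t *)arg;
}
```

For a 5000-byte write handed over in 2048-byte pieces, chunked_send() makes three calls and count_flagged() sees the flag on the first two. The flag travels with each call, so nothing has to be set on the socket beforehand and cleared afterwards.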


>I have been seeing a nasty cluster leak in both 3.0 stable and 4.0
>current as of today.  Until now, I thought maybe it was something in
>my driver, although after much careful looking over my code, it
>simply does not look possible.  Also, I downgraded to current of
>Dec 12, and the problem disappears.
>
>The odd thing is that the clusters that leak don't seem to be
>attached to mbufs.  Or at least there is not a 1-1 ratio.  Following
>is netstat output after a while of running netpipe in streaming
>mode.  (NPtcp -s; see ftp://ftp.scl.ameslab.gov/pub/netpipe)  Also,
>the leak only becomes apparent when the send write size is very
>large--several hundred K to several megabytes.
>
>Does anyone have any idea what this may be?  I really am not sure
>where to look aside from trying progressively newer kernels.  Also,
>I only have alphas to test on right now..
>
>puck:~> netstat -m
>211/416 mbufs in use:
>        116 mbufs allocated to data
>        95 mbufs allocated to packet headers
>1674/1688/2048 mbuf clusters in use (current/peak/max)
>3480 Kbytes allocated to network (97% in use)
>0 requests for memory denied
>0 requests for memory delayed
>0 calls to protocol drain routines
>
>
>Chris Csanady

>
>To Unsubscribe: send mail to majordomo@FreeBSD.org
>with "unsubscribe freebsd-net" in the body of the message







