From: "Petri Helenius"
To: "Jin Guojun [DSD]"
Date: Thu, 10 Apr 2003 23:51:13 +0300
Subject: Re: tcp_output starving -- is due to mbuf get delay?
Message-ID: <05b601c2ffa2$ed87a5b0$932a40c1@PHE>
References: <3E94A22D.174321F0@lbl.gov> <3E94A8C4.3A196E42@lbl.gov>
List-Id: Technical Discussions relating to FreeBSD

There was a discussion of mballoc performance on freebsd-net about a month
ago, but it has since died without a conclusion.

Pete

----- Original Message -----
From: "Jin Guojun [DSD]"
Sent: Thursday, April 10, 2003 2:12 AM
Subject: Re: tcp_output starving -- is due to mbuf get delay?


> Some details were left out --
>
> The machine is a 2 GHz Intel P4 with 1 GB of memory, so the delay is not
> caused by either the CPU or a lack of memory.
>
>         -Jin
>
> "Jin Guojun [DSD]" wrote:
>
> > When testing a GigE path that has a 67 ms RTT, the maximum TCP throughput
> > is limited to 250 Mb/s. Tracing the problem, I found that tcp_output() is
> > starving even though snd_wnd and snd_cwnd are fully open. The snd_cc is
> > never filled beyond 4.05 MB even though snd_hiwat is 10 MB and snd_sbmax
> > is 8 MB. That is, sosend() never stops at sbwait(). So the only place that
> > can slow things down is the mbuf allocation in sosend(). The attached
> > trace file shows that each MGET and MCLGET takes significant time --
> > around 8 us during slow start, gradually increasing after that to a range
> > of 18 to 648 us.
> > Each packet Tx on GigE takes 12 us. If the average mbuf allocation takes
> > 18 us, then the performance will be reduced to 40%; in fact it is down to
> > 25%, which means a higher average delay.
> >
> > I have changed NMBCLUSTERS from 2446 to 6566 to 10240, and nothing
> > improved.
> >
> > Can anyone tell me what factors would cause MGET/MCLGET to wait?
> > Is there any way to make MGET/MCLGET not wait?
> >
> >         -Jin
> >
> > ----------- system info -------------
> >
> > kern.ipc.maxsockbuf: 10485760
> > net.inet.tcp.sendspace: 8388608
> > kern.ipc.nmbclusters: 10240
> > kern.ipc.mbuf_wait: 32
> > kern.ipc.mbtypes: 2606 322 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> > kern.ipc.nmbufs: 40960
> >
> > -------------- code trace and explanation ----------
> >
> > sosend()
> > {
> >     ...
> >     if (space < resid + clen &&
> >         (atomic || space < so->so_snd.sb_lowat || space < clen)) {
> >         if (so->so_state & SS_NBIO)
> >             snderr(EWOULDBLOCK);
> >         sbunlock(&so->so_snd);
> >         error = sbwait(&so->so_snd);  /***** never come down to here ****/
> >         splx(s);
> >         if (error)
> >             goto out;
> >         goto restart;
> >     }
> >     splx(s);
> >     mp = &top;
> >     space -= clen;
> >     do {
> >         if (uio == NULL) {
> >             /*
> >              * Data is prepackaged in "top".
> >              */
> >             resid = 0;
> >             if (flags & MSG_EOR)
> >                 top->m_flags |= M_EOR;
> >         } else do {
> >             if (top == 0) {
> >                 microtime(&t1);
> >                 MGETHDR(m, M_WAIT, MT_DATA);
> >                 if (m == NULL) {
> >                     error = ENOBUFS;
> >                     goto release;
> >                 }
> >                 mlen = MHLEN;
> >                 m->m_pkthdr.len = 0;
> >                 m->m_pkthdr.rcvif = (struct ifnet *)0;
> >             } else {
> >                 MGET(m, M_WAIT, MT_DATA);
> >                 if (m == NULL) {
> >                     error = ENOBUFS;
> >                     goto release;
> >                 }
> >                 mlen = MLEN;
> >             }
> >             if (resid >= MINCLSIZE) {
> >                 MCLGET(m, M_WAIT);
> >                 if ((m->m_flags & M_EXT) == 0)
> >                     goto nopages;
> >                 mlen = MCLBYTES;
> >                 len = min(min(mlen, resid), space);
> >             } else {
> > nopages:
> >                 len = min(min(mlen, resid), space);
> >                 /*
> >                  * For datagram protocols, leave room
> >                  * for protocol headers in first mbuf.
> >                  */
> >                 if (atomic && top == 0 && len < mlen)
> >                     MH_ALIGN(m, len);
> >             }
> >             microtime(&t2);
> >             td = time_diff(&t2, &t1);
> >             if ((td > 5 && (++tcnt & 31) == 0) || td > 50)
> >                 log( ... "td %d %d\n", td, tcnt);
> >
> >     ...
> >
> > } /* end of sosend */
> >
> _______________________________________________
> freebsd-performance@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-performance
> To unsubscribe, send any mail to "freebsd-performance-unsubscribe@freebsd.org"
>
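
The throughput arithmetic in the quoted message (12 us per packet on the
wire, 18 us per mbuf allocation, 40% expected, 25% observed) can be written
out as a small model. This is only a rough sketch: it assumes one allocation
delay per packet and that the delay does not overlap with transmission at
all. The helper names send_efficiency() and implied_stall_us() are made up
for illustration and are not part of the FreeBSD kernel.

```c
#include <assert.h>

/*
 * Fraction of line rate achieved if every packet costs tx_us on the wire
 * plus stall_us of extra, non-overlapped delay (for example, time spent
 * waiting for an mbuf). Hypothetical helper for illustration only.
 */
double
send_efficiency(double tx_us, double stall_us)
{
	assert(tx_us > 0.0 && stall_us >= 0.0);
	return tx_us / (tx_us + stall_us);
}

/*
 * Inverse of the above: the per-packet stall implied by an observed
 * efficiency, where 0 < efficiency <= 1. Hypothetical helper.
 */
double
implied_stall_us(double tx_us, double efficiency)
{
	assert(tx_us > 0.0 && efficiency > 0.0 && efficiency <= 1.0);
	return tx_us * (1.0 / efficiency - 1.0);
}
```

With the figures from the message, send_efficiency(12.0, 18.0) gives 0.40,
and implied_stall_us(12.0, 0.25) gives 36 us. Under this simple model, the
observed 250 Mb/s would correspond to an average stall of roughly twice the
18 us figure.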