Date:      Tue, 15 Dec 1998 11:06:00 -0800
From:      "Justin C. Walker" <justin@apple.com>
To:        freebsd-net@FreeBSD.ORG
Subject:   Re: MLEN < write length < MINCLSIZE "bug"
Message-ID:  <19981215110600.D652@apple.com>
In-Reply-To: <Pine.BSF.4.05.9812150824460.22888-100000@alive.znep.com>; from Marc Slemko on Tue, Dec 15, 1998 at 08:30:12AM -0800
References:  <199812151555.PAA07456@netrinsics.com> <Pine.BSF.4.05.9812150824460.22888-100000@alive.znep.com>

If I can horn in on the discussion, we have seen the problems you're
debating, particularly when using request/response interactions over
TCP.  After beating our heads against a few walls, we came to the
following conclusion: this is due (as has been observed before in this
thread) to the separation between the job that the socket layer is
doing, and the job that the TCP (or, generally, transport protocol)
layer is doing.

One way to smooth over the bump in the road is to provide a hint to
the lower layer, so we have a socket state bit (SS_MORETOCOME) that we
turn on (in so_state) just before calling the protocol send routine
(PRU_SEND, ...), and turn off after the call returns.  The hint is
turned on only if resid is positive.
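
A minimal user-space sketch of that bracketing, not the actual kernel
change: the struct, the flag value, and the names mock_socket,
mock_pru_send and send_chunk are all assumptions made up for
illustration. It only shows the hint being set before the protocol send
call when resid is positive, and cleared once the call returns.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bit value; the post does not give the real assignment. */
#define SS_MORETOCOME 0x4000

/* Minimal stand-in for the kernel's struct socket (assumption). */
struct mock_socket {
    int so_state;             /* socket state bits */
    int state_seen_by_proto;  /* so_state as observed inside the send call */
};

/* Stand-in for the protocol send routine (PRU_SEND); records what it saw. */
static int mock_pru_send(struct mock_socket *so)
{
    so->state_seen_by_proto = so->so_state;
    return 0;
}

/*
 * Hand one chunk to the protocol.  resid is the number of bytes of the
 * user's write still to be copied after this chunk.
 */
static int send_chunk(struct mock_socket *so, long resid)
{
    int error;

    assert(so != NULL);
    if (resid > 0)
        so->so_state |= SS_MORETOCOME;   /* more of this write will follow */
    error = mock_pru_send(so);
    so->so_state &= ~SS_MORETOCOME;      /* hint is valid only during the call */
    return error;
}
```

With resid > 0 the mock protocol routine sees the bit set; on the last
chunk of a write (resid == 0) it does not, and in both cases the bit is
clear again after send_chunk() returns.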

In tcp_output(), in the 'if (len) { ... }' block (following the comment on
silly window avoidance), we check for the bit:

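        if (len) {
                /* A full-sized segment can always go out. */
                if (len == tp->t_maxseg)
                        goto send;
                /*
                 * Apply the idle/TF_NODELAY test only when the socket
                 * layer has not promised more data from this write.
                 */
                if (!(so->so_state & SS_MORETOCOME)) {
                        if ((idle || (tp->t_flags & TF_NODELAY)) &&
                            len + off >= so->so_snd.sb_cc)
                                goto send;
                }
                /* Forced output (t_force set). */
                if (tp->t_force)
                        goto send;
                /* At least half of the largest window the peer has offered. */
                if (len >= tp->max_sndwnd / 2)
                        goto send;
                /* Retransmission: snd_nxt is behind snd_max. */
                if (SEQ_LT(tp->snd_nxt, tp->snd_max))
                        goto send;
        }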

Essentially, if there's more to come, we hold off sending; and we only
believe there's more to come if the user has committed to it (in the
form of a write request).  This seems to smooth out (some of) the
bumps caused by the user buffer/mbuf/cluster size differences and the
request/response effects on the TCP state machines.

Regards,

Justin

On Tue, Dec 15, 1998 at 08:30:12AM -0800, Marc Slemko wrote:
> (-stable removed from the cc list, since this isn't particular to stable
> in any way)
> 
> On Tue, 15 Dec 1998, Michael Robinson wrote:
> 
> > Bill Fenner <fenner@parc.xerox.com> writes:
> > >You misunderstand.  The fix is to accumulate mbufs in a chain until either
> > >a) The protocol gets all of the data that it wanted, or
> > >b) All of the data that the user has provided has been copied into mbufs.
> > >
> > >(b) is what sosend() used to do.  The URL referenced (the one with
> > >"vanj88" in it) describes why sosend() was changed to use only a single
> > >mbuf at a time, but this performance problem was not envisioned at
> > >the time.
> > 
> > Ok, I misunderstood.  But I still disagree it's a bug.  Or, more precisely,
> > it would be a bug if the socket API and the TCP protocol were seen as one
> > inseparable entity, which is not the case.
> 
> No, it really is a bug.
> 
> It is inherently broken to write multiple packets for one write() when the
> size of the write is far less than the MTU (well, the "effective MTU")
> unless you have extreme extenuating circumstances.
> 
> It may not be a bug covered by any spec, but for people trying to write
> useful network apps it shoots them in the head.  It is still a bug.
> 
> 
> To Unsubscribe: send mail to majordomo@FreeBSD.org
> with "unsubscribe freebsd-net" in the body of the message

-- 
Justin C. Walker, Curmudgeon-At-Large   *
Institute for General Semantics         |
Manager, CoreOS Networking              |   Men are from Earth.
Apple Computer, Inc.                    |   Women are from Earth.
2 Infinite Loop                         |   Deal with it.
Cupertino, CA 95014                     |
*---------------------------------------*------------------------------------*
