From owner-freebsd-current  Sun Feb 14 22:19:26 1999
Return-Path: <owner-freebsd-current@FreeBSD.ORG>
Received: (from majordom@localhost)
          by hub.freebsd.org (8.8.8/8.8.8) id WAA10052
          for freebsd-current-outgoing; Sun, 14 Feb 1999 22:19:26 -0800 (PST)
          (envelope-from owner-freebsd-current@FreeBSD.ORG)
Received: from godzilla.zeta.org.au (godzilla.zeta.org.au [203.26.10.9])
          by hub.freebsd.org (8.8.8/8.8.8) with ESMTP id WAA10046
          for <freebsd-current@FreeBSD.ORG>; Sun, 14 Feb 1999 22:19:22 -0800 (PST)
          (envelope-from bde@godzilla.zeta.org.au)
Received: (from bde@localhost)
	by godzilla.zeta.org.au (8.8.7/8.8.7) id RAA24136;
	Mon, 15 Feb 1999 17:19:12 +1100
Date: Mon, 15 Feb 1999 17:19:12 +1100
From: Bruce Evans <bde@zeta.org.au>
Message-Id: <199902150619.RAA24136@godzilla.zeta.org.au>
To: dillon@apollo.backplane.com, freebsd-current@FreeBSD.ORG
Subject: Re: Weird piecemeal reads over socketpair() pipe breaks up small writes into even smaller reads.
Sender: owner-freebsd-current@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

>    This isn't a 'bug', per se, but it bothers me that a small 128 byte
>    write() is being somehow broken apart into two smaller read()s.  It isn't
>    efficient, and it shouldn't be happening.

Breaking apart write() into read()s would be a BUG :-).

Breaking apart read() into read()s seems to be caused by my rescheduling
changes in kern_subr.c.

>    fcntl(fds[0], F_SETFL, O_NONBLOCK);

Here you permit non-atomic reads for block sizes <= PIPE_BUF, so you should
be prepared to get them.

>    if (fork() == 0) {
>        sleep(1);
>        write(fds[1], buf, sizeof(buf));
>        _exit(1);
>    }

The write() apparently begins by copyout()ing only 96 bytes, and if the
need_resched() condition is true, then the process doing the write will
block.

>    select(fds[0] + 1, &rfds, NULL, NULL, NULL);
>    while ((n = read(fds[0], buf, sizeof(buf))) > 0)
>        printf("read %d\n", n);
>    return(0);

When the writer blocks, the reader runs and uses a buggy loop to read
only the first chunk of input.

On an otherwise idle system, the need_resched() condition seems to be
true always.  I would have expected the synchronisation provided by the
sleep(1) to bias need_resched() in the opposite direction.  A reschedule
has been done, normally just after the previous hardclock() call, just
before the writer wakes up, so another one should not be done soon
(until after the next hardclock() call).

Bruce

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-current" in the body of the message