From: Bruce Evans
Date: Mon, 16 Jun 2003 22:07:22 +1000 (EST)
To: Don Lewis
cc: current@freebsd.org
cc: tjr@freebsd.org
Subject: Re: qmail uses 100% cpu after FreeBSD-5.0 to 5.1 upgrade
Message-ID: <20030616212958.O28213@gamplex.bde.org>
In-Reply-To: <200306161109.h5GB9MM7048819@gw.catspoiler.org>
References: <200306161109.h5GB9MM7048819@gw.catspoiler.org>

On Mon, 16 Jun 2003, Don Lewis wrote:

> On 16 Jun, Bruce Evans wrote:
> > In my review of 1.87, I forgot to ask you how atomic the close is with part
> > of it moved out to fifo_inactive().  I think it's important that all
> > traces of the old open have gone away (as far as applications can tell)
> > when the last close returns.
>
> I hadn't taken queued data into consideration.  Now that I've looked at
> this more closely, there are other problems in both the old and new
> code.  If a process calls fcntl(fd, F_SETOWN, ...) on one end of the
> fifo, that should be undone when that end of the fifo is closed.
> In the old implementation, that only happens when both ends of the fifo are
> closed and the sockets are deleted.

F_SETOWN (and associated signal delivery) is even more broken than that :-].
This fcntl() should be applied to the file (though not just the file
descriptor), so its effect should be limited to fd's open in the file
instance and go away when all these are closed.  However, F_SETOWN (and
associated signal delivery) actually applies to the socket for fifos.  It
doesn't work quite right for ttys either.  F_SETOWN apparently isn't used
in ways complicated enough to require it to work right.

> >> Now there are two questions that I can't answer:
> >>
> >>     Why is my analysis of select() and the SS_CANTRCVMORE flag
> >>     incorrect in 5.1-current with version 1.87 or 1.88 of
> >>     fifo_vnops.c.
> >
> > I think it is correct, assuming that something writes to the fifo.
> > Writing might be part of synchronization but actually reading the
> > data should not be necessary since the last close must discard the
> > data (POSIX spec).
>
> It sure looks to me like SS_CANTRCVMORE is always set when the write end
> of the fifo is closed, no matter whether the sockets were freshly
> allocated by a fifo_open() call on the read end of the fifo, or because
> the last writer closed the write end of the fifo.  It sure looks
> like select() should immediately return if this flag is set, but it is
> not returning ...

Alfred changed the semantics for 5.x.  I thought that you knew this.  I
finally gave up resisting this change after a lot of email :-).  In 5.x,
SS_CANTRCVMORE often has no effect for fifos (it still works normally for
sockets).  fifo_poll() normally calls soo_poll() with POLLIN converted to
POLLINIGNEOF.  This causes soo_poll() (sopoll()) to skip the usual
SS_CANTRCVMORE check (which is inside soreadable()) and check the watermark
instead, so that select() on a fifo normally waits for data even when the
fifo is open in nonblocking mode and SS_CANTRCVMORE is set.
Blocking in select() even in nonblocking mode is usually what is wanted, but
is not what is wanted for detecting EOF.  4.8 handles EOF detection (== all
writers going away in the context of fifos) better at a cost of providing
no good way to wait for the first writer.  We changed it since all other
systems seem to do it like 5.x and few applications understand this.

> Actually, something seems broken.  I modified my little test program to
> actually read the data, which works just fine, but select() still blocks
> when the writer closes the fifo, so there doesn't seem to be a way to
> detect the EOF.

Hmm, we may have changed too much.  EOF can be detected using poll() instead
of select() and setting POLLIN and POLLINIGNEOF in the poll flags (this stops
fifo_poll() clearing POLLIN -- see the comment), but POLLINIGNEOF is not
documented at the application level and is probably never used there.  I
suspect that other systems have more magic to handle EOF.  I tried to avoid
such magic since I think the state of the fifo should be the same when there
are no writers (and no data) no matter how the state of having no writers
was reached (otherwise I think the state depends too much on races between
open() for reading and close() by the last writer).  POSIX is clear enough
on this for read/write but fuzzy for select/poll.

Bruce
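
The poll() approach from the last paragraph could be sketched roughly as
follows.  This is only an illustration, not code from fifo_vnops.c or from
the test program discussed in this thread: read_or_eof() is a hypothetical
helper, the #ifndef fallback is an assumption for systems that do not
define POLLINIGNEOF, and the EOF behaviour is the 5.x one described above.
Other releases and other systems may treat the flag differently.

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/* Assumption: where POLLINIGNEOF is not defined, request plain POLLIN. */
#ifndef POLLINIGNEOF
#define POLLINIGNEOF 0
#endif

/*
 * Hypothetical helper: wait up to timeout_ms for fd to become readable
 * or reach EOF, then read from it.
 * Returns >0 for bytes read, 0 for EOF, -1 for timeout or error.
 */
static ssize_t
read_or_eof(int fd, void *buf, size_t len, int timeout_ms)
{
	struct pollfd pfd;
	int n;

	assert(fd >= 0);
	pfd.fd = fd;
	/* Setting both flags is what the message says keeps POLLIN alive. */
	pfd.events = POLLIN | POLLINIGNEOF;
	pfd.revents = 0;

	n = poll(&pfd, 1, timeout_ms);
	if (n <= 0)
		return (-1);		/* 0 == timeout, -1 == poll() error */
	return (read(fd, buf, len));	/* 0 here means all writers are gone */
}
```

A return of 0 is the EOF that select() does not report in the case above.
The sketch folds timeout and error into -1, so a real caller would probably
want to tell those apart.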