From: Matthew Dillon
Date: Thu, 23 Sep 1999 10:31:54 -0700 (PDT)
Message-Id: <199909231731.KAA28739@apollo.backplane.com>
To: Christopher Sedore
Cc: freebsd-hackers@FreeBSD.ORG
Subject: Re: mbufs, external storage, and MFREE

:I have the following question:  Let's say that I have a block of user
:memory which I've mapped into the kernel, and would like to send on a
:network socket.  I'd like to simply grab an mbuf, point to the memory as
:external storage, and queue it up for transmission.  This would work fine,
:except that when MFREE gets called, I have to write a deallocator that
:maintains a table of all the different cases where I've done this, and do
:a reverse mapping back to the original block, and then deal with sending
:more, unmapping, etc.  In other words, having MFREE call a deallocator
:with just the data pointer and the size is inconvenient (actually, it
:would make my scenario quite inefficient given the number of mappings back
:to the original block that would have to be done).
:
:Am I missing another mechanism to handle this?  Does it not come up enough
:to matter?
:
:-Chris

    This is almost precisely the mechanism that the sendfile() system call
    uses.  In that case it maps VMIO-backed data rather than user memory,
    but it is a very similar problem.

    There has been talk of implementing this type of mechanism not only
    for sockets, but for file read()/write() as well.
    In fact, John Dyson had delved into the issue with his vfs.ioopt
    stuff before he ran out of time.

    The one problem with using direct VM page mappings is that currently
    there is no way for the socket to prevent the underlying data from
    being modified in the middle of a transmission.  Likewise, for
    vfs.ioopt, there is no way to prevent the data the user ostensibly
    read() into his 'private' buffer from changing out from under him if
    the underlying file is modified.

    For user memory, the only way such a mechanism can currently be
    implemented is by obtaining the underlying pages and busy'ing them
    for the duration of their use by the system, causing anyone trying to
    access them while the system operation is in progress to block.

    This can cause a potential problem with TCP, in that the mbuf data you
    send to TCP sticks around until it gets pushed out the door *and*
    acknowledged by the other end.  That is, the data is not released when
    read() or write() returns; instead it goes directly into TCP's
    outgoing queue and stays busy there.  If the TCP connection hangs,
    any process that touches those pages may hang with it.

                                        -Matt
                                        Matthew Dillon
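
    The two points in this thread (a deallocator that gets only a data
    pointer and a size, and storage that must stay pinned until TCP is
    completely done with it) can be illustrated with a small user-space
    model.  This is only a sketch under stated assumptions: none of the
    names below (ext_buf, ext_buf_attach, ext_buf_ref, ext_buf_release,
    user_mapping, mapping_done) are FreeBSD kernel interfaces.  They are
    hypothetical, and the reference count merely stands in for "still
    queued or unacknowledged".

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical user-space model; these are NOT FreeBSD kernel APIs. */

/* Invoked exactly once, when the last reference to the storage is dropped. */
typedef void (*ext_free_fn)(void *data, size_t size, void *arg);

struct ext_buf {
    void        *data;     /* start of the externally owned region */
    size_t       size;     /* length of the region in bytes */
    unsigned     refs;     /* outstanding references (queued, unacked, ...) */
    ext_free_fn  free_cb;  /* owner's release callback */
    void        *arg;      /* opaque owner context handed back to free_cb */
};

/* State the owner wants back on release, so no lookup table is needed. */
struct user_mapping {
    int id;        /* which user block this mapping belongs to */
    int released;  /* set once the storage is no longer referenced */
};

/* Wrap an existing region; the caller holds the initial reference. */
static struct ext_buf *
ext_buf_attach(void *data, size_t size, ext_free_fn cb, void *arg)
{
    struct ext_buf *eb = malloc(sizeof(*eb));

    if (eb == NULL)
        return NULL;
    eb->data = data;
    eb->size = size;
    eb->refs = 1;
    eb->free_cb = cb;
    eb->arg = arg;
    return eb;
}

/* Take another reference, e.g. for a copy kept for retransmission. */
static void
ext_buf_ref(struct ext_buf *eb)
{
    eb->refs++;
}

/*
 * Drop one reference.  Returns 1 if this call released the storage
 * (the callback ran and the descriptor was freed), 0 otherwise.
 */
static int
ext_buf_release(struct ext_buf *eb)
{
    assert(eb->refs > 0);
    if (--eb->refs > 0)
        return 0;
    eb->free_cb(eb->data, eb->size, eb->arg);
    free(eb);
    return 1;
}

/* Example owner callback: the context pointer identifies the mapping. */
static void
mapping_done(void *data, size_t size, void *arg)
{
    struct user_mapping *m = arg;

    (void)data;
    (void)size;
    m->released = 1;   /* here the owner would unmap or unbusy the pages */
}
```

    A caller would attach a region with ext_buf_attach(), take an extra
    reference with ext_buf_ref() for each copy still held for
    retransmission, and call ext_buf_release() as each holder finishes.
    The owner's callback runs only on the final release, which is the
    model's version of the hang risk above: if one holder never lets go,
    the pages are never released.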