From owner-freebsd-arch@FreeBSD.ORG Mon Apr 24 06:33:12 2006
Message-Id: <200604240633.k3O6XUJ0042841@chez.mckusick.com>
To: arch@freebsd.org
X-URL: http://WWW.McKusick.COM/
Date: Sun, 23 Apr 2006 23:33:30 -0700
From: Kirk McKusick
Subject: Linus Torvalds on FreeBSD's Use of Copy-on-write
Reply-To: Kirk McKusick
List-Id: Discussion related to FreeBSD architecture

Anyone working on zero-copy sockets care to respond to this?

http://developers.slashdot.org/developers/06/04/21/1536213.shtml

Linus Torvalds made reference to some possible future extensions. This included vmsplice(), a system call since implemented by Jens Axboe "to basically do a 'write to the buffer', but using the reference counting and VM traversal to actually fill the buffer." Reviewing the implications of using such a system call led to a comparison with FreeBSD's ZERO_COPY_SOCKET which uses COW (copy on write).

Linus explained that while this may look good on specific benchmarks, it actually introduces extra overhead: "the thing is, the cost of marking things COW is not just the cost of the initial page table invalidate: it's also the cost of the fault eventually when you _do_ write to the page, even if at that point you decide that the page is no longer shared, and the fault can just mark the page writable again."

He went on to explain, "The COW approach does generate some really nice benchmark numbers, because the way you benchmark this thing is that you never actually write to the user page in the first place, so you end up having a nice benchmark loop that has to do the TLB invalidate just the _first_ time, and never has to do any work ever again later on."

Linus didn't pull any punches when he summarized: "I claim that Mach people (and apparently FreeBSD) are incompetent idiots. Playing games with VM is bad. memory copies are _also_ bad, but quite frankly, memory copies often have _less_ downside than VM games, and bigger caches will only continue to drive that point home."
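
The cost argument quoted above can be made concrete with a toy model. What follows is only a rough sketch, not a measurement and not anyone's actual implementation: the function names and the cost values (arbitrary units for a copy, a page-table/TLB invalidate, and a write fault) are made up for illustration. It assumes a page is marked COW only when it is currently writable, and that a write fault makes it writable again.

```python
def copy_cost(sends, copy):
    """Total cost when every send copies the buffer (hypothetical units)."""
    return sends * copy


def cow_cost(sends, rewrite_between_sends, invalidate, fault):
    """Total cost when every send marks the buffer copy-on-write.

    invalidate: assumed cost of making the page read-only (page table + TLB).
    fault:      assumed cost of the write fault that makes it writable again.
    """
    total = 0
    writable = True
    for _ in range(sends):
        if writable:
            # The page has to be write-protected before it can be shared.
            total += invalidate
            writable = False
        if rewrite_between_sends:
            # The application touches the buffer, takes the fault, and the
            # page becomes writable again, so the next send invalidates again.
            total += fault
            writable = True
    return total


if __name__ == "__main__":
    sends = 1000
    # Made-up relative costs, chosen only to show the shape of the argument.
    print("copy every time:      ", copy_cost(sends, copy=10))
    print("COW, buffer untouched:", cow_cost(sends, False, invalidate=5, fault=20))
    print("COW, buffer rewritten:", cow_cost(sends, True, invalidate=5, fault=20))
```

With these invented numbers the benchmark-style loop that never rewrites the buffer pays for a single invalidate, while the loop that refills the buffer before each send pays for an invalidate and a fault every time. Whether that exceeds the cost of a plain copy on real hardware depends on the actual ratios, which this model does not claim to know.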