Date: Sun, 11 Jun 2006 14:25:29 +0100 (BST)
From: Robert Watson
To: arch@FreeBSD.org
Subject: MFC of socket/protocol reference improvements
Message-ID: <20060611141632.Y26634@fledge.watson.org>

Dear All,

I'm in the process of evaluating a possible MFC to the RELENG_6 branch of the socket/protocol reference changes I made in April 2006. Over the past few months, these changes have been gradually refined, and a number of bugs (of varying severity) have been fixed. These changes are important because they close a significant number of races, reduce locking overhead, improve parallelism, and lay the groundwork for future improvements in the socket and protocol code. However, they also make significant changes to a number of important network protocols (such as TCP), and change the semantics of the socket/protocol interface (protosw.h).
I think these changes are important to our short- and long-term goals of improving network stack performance and architecture. My original plan was to start looking in detail at the MFC after about three months of settling time, which is about three weeks from now. I still plan to do this. A few specific points for discussion:

(1) Normally, RELENG_* has significant constraints on changes to the kernel APIs used by loadable modules -- especially for device drivers. In the past, we've not made a lot of changes to the protocol switch interface, and historically it hasn't been a run-time extensible interface. However, Andre has recently made changes to allow IP protocols, such as IP divert, to be loaded at run time, and these will be affected. Do we consider modules programmed against these interfaces to be "breakable" -- i.e., they require a recompile and/or changes in the RELENG_6 branch?

(2) More testing would really be appreciated. I caught a subtle bug in the handling of the retransmit timer in the context of my changes by accident, as a result of close analysis of TCP traces through a firewall -- I noticed some "odd" packets that were, with a bit of time, tracked to their source. However, this sort of problem is easy to miss. Any help determining whether there are other regressions in TCP behavior would be greatly appreciated. While we've now hammered on the new code quite a bit, and it fixes some known panics in RELENG_6, I would categorize these changes as high risk, as they touch quite sensitive and heavily deployed code. Getting the 7.x code tested in diverse high-load environments before the MFC would be very good.

I'm still in the process of looking at further refinements of the socket/protocol relationship, which may be candidates for future merging, and will depend on these changes. Among other things, I've been looking at further evolving the notion of socket close vs.
socket detach, which are currently conflated notions, leading to both a lack of clarity and a lack of flexibility in the current API. In turn, that has presented a problem for experimenting with alternative locking strategies, such as vertical integration of locks between the socket and protocol layers. Getting these changes into RELENG_6 will depend on these earlier changes being merged.

Robert N M Watson
Computer Laboratory
University of Cambridge