From owner-freebsd-stable@FreeBSD.ORG Wed Jun 28 09:17:01 2006
X-Original-To: freebsd-stable@freebsd.org
Delivered-To: freebsd-stable@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 09CB616A408
	for ; Wed, 28 Jun 2006 09:17:01 +0000 (UTC)
	(envelope-from rwatson@FreeBSD.org)
Received: from cyrus.watson.org (cyrus.watson.org [209.31.154.42])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 9FF9D43D94
	for ; Wed, 28 Jun 2006 09:17:00 +0000 (GMT)
	(envelope-from rwatson@FreeBSD.org)
Received: from fledge.watson.org (fledge.watson.org [209.31.154.41])
	by cyrus.watson.org (Postfix) with ESMTP id 57D9846BF6;
	Wed, 28 Jun 2006 05:17:00 -0400 (EDT)
Date: Wed, 28 Jun 2006 10:17:00 +0100 (BST)
From: Robert Watson
X-X-Sender: robert@fledge.watson.org
To: Stanislaw Halik
In-Reply-To: <20060627134134.GA23337@tehran.lain.pl>
Message-ID: <20060628101405.I50845@fledge.watson.org>
References: <20060627045310.GA6324@tehran.lain.pl>
	<20060627140946.J273@fledge.watson.org>
	<20060627134134.GA23337@tehran.lain.pl>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Cc: freebsd-stable@freebsd.org
Subject: Re: trap 12: supervisor write, page not present on 6.1-STABLE Tue May 16 2006
X-BeenThere: freebsd-stable@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Production branch of FreeBSD source code
X-List-Received-Date: Wed, 28 Jun 2006 09:17:01 -0000

On Tue, 27 Jun 2006, Stanislaw Halik wrote:

> On Tue, Jun 27, 2006, Robert Watson wrote:
>>> 6.1-STABLE crashed on me. I'm providing a backtrace. Could any of you,
>>> experienced people, suggest me if it's a hardware problem or is it an
>>> error inside the OS?
>> This is a known bug in the TCP code; a large set of outstanding changes is
>> present in 7.x that will fix the problem when merged.
>> However, I recently had push-back on merging the larger batch of changes,
>> so am looking at merging a workaround that will also correct the problem
>> without the larger set of architectural changes. I hope to have a chance
>> to look at that in detail this weekend.
>
> I'm glad to know that it isn't either unknown or hardware-related. Thank you
> for your prompt reply!

Per my earlier e-mail, I had hoped to merge a larger set of changes from HEAD
that resolve the underlying problem here (that inpcb's can be detached from a
socket while the socket is still in use), but right now I'm deferring merging
those changes as they are somewhat risky (as they are large). Instead, I've
produced a candidate work-around patch, now attached to kern/97095. This does
not fix the underlying problem, but seeks to narrow the window for the race to
be exercised by avoiding caching a volatile pointer across user memory
copying, which under load can result in blocking I/O. I would be quite
interested in knowing if this resolves the problem in practice -- if so, it's
a definite short-term merge candidate to reduce the symptoms of this problem
until the proper fix can be merged.
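
For readers who want the pattern in isolation, here is a minimal userland
sketch of the idea, not the kernel code. The names fake_pcb, fake_socket and
fake_copyin are hypothetical stand-ins, the "blocking copy" is simulated with
a flag, and the NULL check is only part of the illustration; the patch itself
adds no such check.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's inpcb and socket; not the real structures. */
struct fake_pcb {
	int tos;
};

struct fake_socket {
	struct fake_pcb *so_pcb;	/* may be cleared while the caller sleeps */
};

/*
 * Simulates a user memory copy that may block.  While the caller "sleeps",
 * another thread could detach the pcb; here that is driven by a flag so the
 * effect is deterministic.
 */
static int
fake_copyin(struct fake_socket *so, int detach_during_copy, int *optval)
{
	assert(so != NULL);
	if (detach_during_copy)
		so->so_pcb = NULL;
	*optval = 0x10;
	return (0);
}

/*
 * Racy pattern: the pcb pointer is cached before the blocking copy.
 * Returns 1 if the cached pointer no longer matches the socket (stale),
 * 0 otherwise.  The cached pointer is only compared, never dereferenced.
 */
static int
cached_pointer_is_stale(struct fake_socket *so, int detach_during_copy)
{
	struct fake_pcb *pcb = so->so_pcb;	/* cached too early */
	int optval;

	(void)fake_copyin(so, detach_during_copy, &optval);
	return (pcb != so->so_pcb);
}

/*
 * Narrowed pattern: fetch the pointer only after the copy has completed.
 * Returns 0 on success, -1 if the pcb is already gone.
 */
static int
set_tos_after_copy(struct fake_socket *so, int detach_during_copy)
{
	struct fake_pcb *pcb;
	int optval;

	(void)fake_copyin(so, detach_during_copy, &optval);
	pcb = so->so_pcb;			/* fetched after the copy */
	if (pcb == NULL)
		return (-1);
	pcb->tos = optval;
	return (0);
}
```

Fetching the pointer late does not close the race, since a detach can still
happen between the fetch and the use; it only removes the long sleep from
inside that window.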

http://www.watson.org/~robert/freebsd/netperf/20060628-ip_ctloutput.diff

Thanks,

Robert N M Watson
Computer Laboratory
University of Cambridge

Index: ip_output.c
===================================================================
RCS file: /home/ncvs/src/sys/netinet/ip_output.c,v
retrieving revision 1.242.2.9
diff -u -r1.242.2.9 ip_output.c
--- ip_output.c	4 Jun 2006 10:19:34 -0000	1.242.2.9
+++ ip_output.c	28 Jun 2006 09:03:14 -0000
@@ -1154,7 +1154,7 @@
 	struct socket *so;
 	struct sockopt *sopt;
 {
-	struct inpcb *inp = sotoinpcb(so);
+	struct inpcb *inp;
 	int error, optval;
 
 	error = optval = 0;
@@ -1187,6 +1187,7 @@
 				m_free(m);
 				break;
 			}
+			inp = sotoinpcb(so);
 			INP_LOCK(inp);
 			error = ip_pcbopts(inp, sopt->sopt_name, m);
 			INP_UNLOCK(inp);
@@ -1209,6 +1210,7 @@
 			if (error)
 				break;
 
+			inp = sotoinpcb(so);
 			switch (sopt->sopt_name) {
 			case IP_TOS:
 				inp->inp_ip_tos = optval;
@@ -1274,6 +1276,7 @@
 		case IP_MULTICAST_LOOP:
 		case IP_ADD_MEMBERSHIP:
 		case IP_DROP_MEMBERSHIP:
+			inp = sotoinpcb(so);
 			error = ip_setmoptions(inp, sopt);
 			break;
 
@@ -1283,6 +1286,7 @@
 			if (error)
 				break;
 
+			inp = sotoinpcb(so);
 			INP_LOCK(inp);
 			switch (optval) {
 			case IP_PORTRANGE_DEFAULT:
@@ -1325,6 +1329,7 @@
 			req = mtod(m, caddr_t);
 			len = m->m_len;
 			optname = sopt->sopt_name;
+			inp = sotoinpcb(so);
 			error = ipsec4_set_policy(inp, optname, req, len, priv);
 			m_freem(m);
 			break;
@@ -1341,6 +1346,7 @@
 		switch (sopt->sopt_name) {
 		case IP_OPTIONS:
 		case IP_RETOPTS:
+			inp = sotoinpcb(so);
 			if (inp->inp_options)
 				error = sooptcopyout(sopt,
 						     mtod(inp->inp_options,
@@ -1362,6 +1368,7 @@
 		case IP_FAITH:
 		case IP_ONESBCAST:
 		case IP_DONTFRAG:
+			inp = sotoinpcb(so);
 			switch (sopt->sopt_name) {
 
 			case IP_TOS:
@@ -1427,6 +1434,7 @@
 		case IP_MULTICAST_LOOP:
 		case IP_ADD_MEMBERSHIP:
 		case IP_DROP_MEMBERSHIP:
+			inp = sotoinpcb(so);
 			error = ip_getmoptions(inp, sopt);
 			break;
 
@@ -1441,7 +1449,8 @@
 				req = mtod(m, caddr_t);
 				len = m->m_len;
 			}
-			error = ipsec4_get_policy(sotoinpcb(so), req, len, &m);
+			inp = sotoinpcb(so);
+			error = ipsec4_get_policy(inp, req, len, &m);
 			if (error == 0)
 				error = soopt_mcopyout(sopt, m); /* XXX */
 			if (error == 0)