Date: Wed, 28 Jun 2006 10:17:00 +0100 (BST)
From: Robert Watson <rwatson@FreeBSD.org>
To: Stanislaw Halik <sthalik@tehran.lain.pl>
Cc: freebsd-stable@freebsd.org
Subject: Re: trap 12: supervisor write, page not present on 6.1-STABLE Tue May 16 2006
Message-ID: <20060628101405.I50845@fledge.watson.org>
In-Reply-To: <20060627134134.GA23337@tehran.lain.pl>
References: <20060627045310.GA6324@tehran.lain.pl> <20060627140946.J273@fledge.watson.org> <20060627134134.GA23337@tehran.lain.pl>

On Tue, 27 Jun 2006, Stanislaw Halik wrote:

> On Tue, Jun 27, 2006, Robert Watson wrote:
>
>>> 6.1-STABLE crashed on me. I'm providing a backtrace. Could any of you,
>>> experienced people, suggest me if it's a hardware problem or is it an
>>> error inside the OS?
>
>> This is a known bug in the TCP code; a large set of outstanding changes is
>> present in 7.x that will fix the problem when merged. However, I recently
>> had push-back on merging the larger batch of changes, so am looking at
>> merging a workaround that will also correct the problem without the larger
>> set of architectural changes. I hope to have a chance to look at that in
>> detail this weekend.
>
> I'm glad to know that it isn't either unknown or hardware-related. Thank you
> for your prompt reply!

Per my earlier e-mail, I had hoped to merge a larger set of changes from HEAD
that resolve the underlying problem here (that inpcb's can be detached from a
socket while the socket is still in use), but right now I'm deferring merging
those changes as they are somewhat risky (as they are large).

Instead, I've produced a candidate work-around patch, now attached to
kern/97095. This does not fix the underlying problem, but seeks to narrow the
window for the race to be exercised by avoiding caching a volatile pointer
across user memory copying, which under load can result in blocking I/O.

I would be quite interested in knowing if this resolves the problem in
practice -- if so, it's a definite short-term merge candidate to reduce the
symptoms of this problem until the proper fix can be merged.

http://www.watson.org/~robert/freebsd/netperf/20060628-ip_ctloutput.diff

Thanks,

Robert N M Watson
Computer Laboratory
University of Cambridge

Index: ip_output.c
===================================================================
RCS file: /home/ncvs/src/sys/netinet/ip_output.c,v
retrieving revision 1.242.2.9
diff -u -r1.242.2.9 ip_output.c
--- ip_output.c	4 Jun 2006 10:19:34 -0000	1.242.2.9
+++ ip_output.c	28 Jun 2006 09:03:14 -0000
@@ -1154,7 +1154,7 @@
 	struct socket *so;
 	struct sockopt *sopt;
 {
-	struct inpcb *inp = sotoinpcb(so);
+	struct inpcb *inp;
 	int error, optval;
 
 	error = optval = 0;
@@ -1187,6 +1187,7 @@
 				m_free(m);
 				break;
 			}
+			inp = sotoinpcb(so);
 			INP_LOCK(inp);
 			error = ip_pcbopts(inp, sopt->sopt_name, m);
 			INP_UNLOCK(inp);
@@ -1209,6 +1210,7 @@
 			if (error)
 				break;
 
+			inp = sotoinpcb(so);
 			switch (sopt->sopt_name) {
 			case IP_TOS:
 				inp->inp_ip_tos = optval;
@@ -1274,6 +1276,7 @@
 		case IP_MULTICAST_LOOP:
 		case IP_ADD_MEMBERSHIP:
 		case IP_DROP_MEMBERSHIP:
+			inp = sotoinpcb(so);
 			error = ip_setmoptions(inp, sopt);
 			break;
 
@@ -1283,6 +1286,7 @@
 			if (error)
 				break;
 
+			inp = sotoinpcb(so);
 			INP_LOCK(inp);
 			switch (optval) {
 			case IP_PORTRANGE_DEFAULT:
@@ -1325,6 +1329,7 @@
 			req = mtod(m, caddr_t);
 			len = m->m_len;
 			optname = sopt->sopt_name;
+			inp = sotoinpcb(so);
 			error = ipsec4_set_policy(inp, optname, req, len, priv);
 			m_freem(m);
 			break;
@@ -1341,6 +1346,7 @@
 		switch (sopt->sopt_name) {
 		case IP_OPTIONS:
 		case IP_RETOPTS:
+			inp = sotoinpcb(so);
 			if (inp->inp_options)
 				error = sooptcopyout(sopt,
 						     mtod(inp->inp_options,
@@ -1362,6 +1368,7 @@
 		case IP_FAITH:
 		case IP_ONESBCAST:
 		case IP_DONTFRAG:
+			inp = sotoinpcb(so);
 			switch (sopt->sopt_name) {
 
 			case IP_TOS:
@@ -1427,6 +1434,7 @@
 		case IP_MULTICAST_LOOP:
 		case IP_ADD_MEMBERSHIP:
 		case IP_DROP_MEMBERSHIP:
+			inp = sotoinpcb(so);
 			error = ip_getmoptions(inp, sopt);
 			break;
 
@@ -1441,7 +1449,8 @@
 				req = mtod(m, caddr_t);
 				len = m->m_len;
 			}
-			error = ipsec4_get_policy(sotoinpcb(so), req, len, &m);
+			inp = sotoinpcb(so);
+			error = ipsec4_get_policy(inp, req, len, &m);
 			if (error == 0)
 				error = soopt_mcopyout(sopt, m); /* XXX */
 			if (error == 0)
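
The pattern the patch changes (fetching the inpcb pointer before a copy from
user memory that may sleep, versus fetching it immediately before use) can be
illustrated with a small user-space sketch. Everything below is hypothetical
and is not code from the patch or the kernel: the demo_* types and functions
are stand-ins for struct inpcb, struct socket, sotoinpcb() and the sleeping
copy-in, and the "other thread" is modelled by a hook that runs during the
simulated sleep. The NULL check in set_tos_refetch() is an addition for this
sketch only; it does not appear in the diff above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's struct inpcb and struct socket. */
struct demo_pcb {
	int tos;
};

struct demo_socket {
	struct demo_pcb *so_pcb;	/* can change while the caller sleeps */
};

/* Two static PCBs so the simulation never touches freed memory. */
static struct demo_pcb demo_original;
static struct demo_pcb demo_replacement;

/* Stand-in for sotoinpcb(): read the current pointer from the socket. */
static struct demo_pcb *
demo_sotopcb(struct demo_socket *so)
{
	return (so->so_pcb);
}

/* Models another thread detaching the PCB and a different one appearing. */
static void
demo_detach_hook(struct demo_socket *so)
{
	so->so_pcb = &demo_replacement;
}

/*
 * Stand-in for a copy from user memory that may sleep. The optional hook
 * runs "during the sleep" to model concurrent activity.
 */
static int
demo_copyin(struct demo_socket *so, int *dst, int src,
    void (*during_sleep)(struct demo_socket *))
{
	if (during_sleep != NULL)
		during_sleep(so);
	*dst = src;
	return (0);
}

/*
 * Old pattern: fetch the pointer first, then sleep. Returns 1 if the cached
 * pointer no longer matches the socket afterwards, 0 otherwise. This only
 * compares pointers; the kernel code would have dereferenced the stale one.
 */
static int
cached_pointer_goes_stale(struct demo_socket *so,
    void (*during_sleep)(struct demo_socket *))
{
	struct demo_pcb *inp;
	int optval;

	assert(so != NULL);
	inp = demo_sotopcb(so);			/* cached before the sleep */
	(void)demo_copyin(so, &optval, 0, during_sleep);
	return (inp != demo_sotopcb(so));
}

/*
 * Patched pattern: sleep first, then fetch the pointer right before use.
 * The NULL check is an assumption of this sketch, not part of the patch.
 */
static int
set_tos_refetch(struct demo_socket *so, int value,
    void (*during_sleep)(struct demo_socket *))
{
	struct demo_pcb *inp;
	int error, optval;

	assert(so != NULL);
	error = demo_copyin(so, &optval, value, during_sleep);
	if (error)
		return (error);
	inp = demo_sotopcb(so);			/* fetched after the sleep */
	if (inp == NULL)
		return (-1);
	inp->tos = optval;
	return (0);
}
```

With a socket whose so_pcb starts at &demo_original, calling
cached_pointer_goes_stale(&so, demo_detach_hook) reports that the cached
pointer went stale, while set_tos_refetch(&so, 0x10, demo_detach_hook) writes
to whichever PCB is current after the simulated sleep. As the message says,
this only narrows the window: nothing in the sketch stops the pointer from
changing between the re-fetch and the use.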