From: Julien Charbon
Date: Thu, 24 Sep 2015 14:13:24 +0200
Subject: Re: Can tcp_close() be called in INP_TIMEWAIT case
To: Palle Girgensohn, John Baldwin, George Neville-Neil
Cc: Konstantin Belousov, freebsd-net@freebsd.org
Message-ID: <5603E8E4.5030406@freebsd.org>
In-Reply-To: <5602BB7A.9010504@freebsd.org>
List-Id: Networking and TCP/IP with FreeBSD

 Hi -net,

On 23/09/15 16:47, Julien Charbon wrote:
>  Thanks to Palle, I got access to the kernel dump, and the result is
> more interesting than expected:  Somehow the kernel reaches a state
> in tcp_detach() where:
>
>  INP_TIMEWAIT && INP_DROPPED && tp != NULL
>
>  In detail:
>
>  - inp is in TIMEWAIT state
>  - inp has been dropped by in_pcbdrop()
>  - inp->inp_ppcb (a struct tcptw) is not NULL
>
>  All the related structures look good in the coredump: socket, inp,
> and tcptw, thus no sign of any memory corruption (so far).
>
>  And for the kernel, this state is _not_ OK.  Fortunately, there are
> only two functions that set the INP_DROPPED flag:
>
>  - tcp_twclose() and,
>  - tcp_close()
>
>  If tcp_twclose() is called, inp->inp_ppcb is set to NULL and the struct
> tcptw is freed (all good, no assertion).
>
>  If tcp_close() is called, inp->inp_ppcb is left untouched (less OK,
> potential assertion).
>
>  Almost all tcp_close() calls (or calls to tcp_close() parents) use a
> pattern like:
>
>         if (inp->inp_flags & INP_TIMEWAIT) {
>                 /* Don't call tcp_close(), just return */
>                 return;
>         }
>
>         /* Call tcp_close() */
>         tcp_close();
>
>  But not _all_ tcp_close() calls.
>
>  Thus the most important point here is:  Either this assertion is wrong,
> or tcp_close() in INP_TIMEWAIT state should not happen.
>
>  This assertion and the current tcp_close() behavior have been here for
> a long time, thus I would like old beards^W^W^W more experienced TCP
> stack developers to give an opinion/refresh their memories on this very
> specific case.

 So the issue is:

 - tcp_close() is called for some reason with an inp in INP_TIMEWAIT
   state and sets the INP_DROPPED flag,

 - tcp_detach() is called when the last reference on the socket is
   dropped,

 and then in_pcbfree() can be called twice instead of once:

 1. First in tcp_detach():

static void
tcp_detach(struct socket *so, struct inpcb *inp)
{
        struct tcpcb *tp;

        tp = intotcpcb(inp);

        if (inp->inp_flags & INP_TIMEWAIT) {
                if (inp->inp_flags & INP_DROPPED) {
                        in_pcbdetach(inp);
                        in_pcbfree(inp);    <--
                }

 2. Second when the tcptw expires, here:

void
tcp_twclose(struct tcptw *tw, int reuse)
{
        struct socket *so;
        struct inpcb *inp;

        inp = tw->tw_inpcb;
        tcp_tw_2msl_stop(tw, reuse);
        inp->inp_ppcb = NULL;
        in_pcbdrop(inp);

        so = inp->inp_socket;
        if (so != NULL) {
                ...
        } else {
                in_pcbfree(inp);    <--
        }

 This behavior is backed by Palle's kernel panic backtraces and coredumps.

 o Solutions:

 Long:  Forbid calling tcp_close() when inp is in INP_TIMEWAIT state,
the TCP stack rule being:

 - if !INP_TIMEWAIT:  Call tcp_close()
 - if  INP_TIMEWAIT:  Call tcp_twclose() (or call nothing, the tcptw will
   expire/be recycled anyway)

 Short:  if INP_TIMEWAIT & INP_DROPPED:  Do not call in_pcbfree(inp) in
tcp_detach() unless the tcptw is already discarded.

 The long solution seems cleaner, and it is backed by the old
tcp_detach() comments and by most of the current tcp_close() calls, but
I won't take that longer path without -net approval first.

 Thanks.

--
Julien

"For every complex problem there is an answer that is clear, simple, and
wrong" -- H. L. Mencken
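
 For readers who want to replay the sequence described above outside the
kernel, here is a minimal userland sketch.  It is a toy model, not FreeBSD
code:  every toy_* name, the flag values and the free_calls counter are
assumptions made up for illustration, and only the branches quoted in this
message are modelled.  toy_run() replays "close in TIMEWAIT, then detach,
then tcptw expiry" and returns how many times in_pcbfree() would have run;
guard_close stands for the long solution and short_fix for the short one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy flags named after the kernel ones; the values are arbitrary. */
enum { TOY_INP_TIMEWAIT = 0x1, TOY_INP_DROPPED = 0x2 };

struct toy_inpcb {
	int   flags;
	void *ppcb;        /* stands in for inp->inp_ppcb (the tcptw) */
	bool  has_socket;  /* stands in for inp->inp_socket != NULL */
	int   free_calls;  /* times in_pcbfree() would have been called */
};

/* tcp_close() as described: sets INP_DROPPED, leaves ppcb untouched. */
static void
toy_tcp_close(struct toy_inpcb *inp)
{
	inp->flags |= TOY_INP_DROPPED;
}

/* Long solution: callers never reach tcp_close() in TIMEWAIT state. */
static void
toy_close_guarded(struct toy_inpcb *inp)
{
	if (inp->flags & TOY_INP_TIMEWAIT)
		return;
	toy_tcp_close(inp);
}

/* tcp_detach(), limited to the TIMEWAIT && DROPPED branch quoted above. */
static void
toy_tcp_detach(struct toy_inpcb *inp, bool short_fix)
{
	inp->has_socket = false;	/* last socket reference is gone */
	if ((inp->flags & TOY_INP_TIMEWAIT) &&
	    (inp->flags & TOY_INP_DROPPED)) {
		/* Short solution: tcptw still live, let its expiry free inp. */
		if (short_fix && inp->ppcb != NULL)
			return;
		inp->free_calls++;
	}
}

/* tcp_twclose(): clears ppcb, drops inp, frees it if no socket is left. */
static void
toy_tcp_twclose(struct toy_inpcb *inp)
{
	inp->ppcb = NULL;
	inp->flags |= TOY_INP_DROPPED;
	if (!inp->has_socket)
		inp->free_calls++;
}

/* Replay: close while in TIMEWAIT, then detach, then tcptw expiry. */
static int
toy_run(bool guard_close, bool short_fix)
{
	int tw;	/* dummy object standing in for the struct tcptw */
	struct toy_inpcb inp = {
		.flags = TOY_INP_TIMEWAIT, .ppcb = &tw,
		.has_socket = true, .free_calls = 0
	};

	if (guard_close)
		toy_close_guarded(&inp);
	else
		toy_tcp_close(&inp);
	toy_tcp_detach(&inp, short_fix);
	toy_tcp_twclose(&inp);
	return (inp.free_calls);
}
```

 In this model toy_run(false, false) counts two frees (the reported double
in_pcbfree()), while either fix alone brings the count back to one.  It
says nothing about locking or about the tcp_detach() branches not quoted
here.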