Skip site navigation (1)Skip section navigation (2)
Date:      Tue, 11 Oct 2016 09:20:17 +0200
From:      Julien Charbon <jch@freebsd.org>
To:        Slawa Olhovchenkov <slw@zxy.spb.ru>
Cc:        Konstantin Belousov <kostikbel@gmail.com>, freebsd-stable@FreeBSD.org, hiren panchasara <hiren@strugglingcoder.info>
Subject:   Re: 11.0 stuck on high network load
Message-ID:  <8143cd8f-c007-2378-b004-b2b037402d03@freebsd.org>
In-Reply-To: <20161010173531.GI6177@zxy.spb.ru>
References:  <20160926172159.GA54003@zxy.spb.ru> <62453d9c-b1e4-1129-70ff-654dacea37f9@gmail.com> <20160928115909.GC54003@zxy.spb.ru> <a0425aad-a421-05bc-c1a8-c6fe06b83833@freebsd.org> <20161006111043.GH54003@zxy.spb.ru> <1431484c-c00e-24c5-bd76-714be8ae5ed5@freebsd.org> <20161010133220.GU54003@zxy.spb.ru> <23f1200e-383e-befb-b76d-c88b3e1287b0@freebsd.org> <20161010142941.GV54003@zxy.spb.ru> <52d634aa-639c-bef7-1f10-c46dbadc4d85@freebsd.org> <20161010173531.GI6177@zxy.spb.ru>

next in thread | previous in thread | raw e-mail | index | archive | help
This is an OpenPGP/MIME signed message (RFC 4880 and 3156)
--oq9rDkJ6isRwA1ckJwhFJjfq55qKQCU3W
Content-Type: multipart/mixed; boundary="6bRp3pkLFdj50Nvhlaf2QLjbLWh7J9Aq4";
 protected-headers="v1"
From: Julien Charbon <jch@freebsd.org>
To: Slawa Olhovchenkov <slw@zxy.spb.ru>
Cc: Konstantin Belousov <kostikbel@gmail.com>, freebsd-stable@FreeBSD.org,
 hiren panchasara <hiren@strugglingcoder.info>
Message-ID: <8143cd8f-c007-2378-b004-b2b037402d03@freebsd.org>
Subject: Re: 11.0 stuck on high network load
References: <20160926172159.GA54003@zxy.spb.ru>
 <62453d9c-b1e4-1129-70ff-654dacea37f9@gmail.com>
 <20160928115909.GC54003@zxy.spb.ru>
 <a0425aad-a421-05bc-c1a8-c6fe06b83833@freebsd.org>
 <20161006111043.GH54003@zxy.spb.ru>
 <1431484c-c00e-24c5-bd76-714be8ae5ed5@freebsd.org>
 <20161010133220.GU54003@zxy.spb.ru>
 <23f1200e-383e-befb-b76d-c88b3e1287b0@freebsd.org>
 <20161010142941.GV54003@zxy.spb.ru>
 <52d634aa-639c-bef7-1f10-c46dbadc4d85@freebsd.org>
 <20161010173531.GI6177@zxy.spb.ru>
In-Reply-To: <20161010173531.GI6177@zxy.spb.ru>

--6bRp3pkLFdj50Nvhlaf2QLjbLWh7J9Aq4
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: quoted-printable


 Hi Slawa,

On 10/10/16 7:35 PM, Slawa Olhovchenkov wrote:
> On Mon, Oct 10, 2016 at 05:44:21PM +0200, Julien Charbon wrote:
>>>> can check the current other usages of goto findpcb in tcp_input().  =
The
>>>> rational here being:
>>>>
>>>>  - Behavior before the patch:  If the inp we found was deleted then =
goto
>>>> findpcb.
>>>>  - Behavior after the patch:  If the inp we found was deleted or dro=
pped
>>>> then goto findpcb.
>>>>
>>>>  I just prefer having the same behavior applied everywhere:  If
>>>> tcp_input() loses the inp lock race and the inp was deleted or dropp=
ed
>>>> then retry to find a new inpcb to deliver to.
>>>>
>>>>  But you are right dropping the packet here will also fix the issue.=

>>>>
>>>>  Then the review process becomes quite helpful because people can ar=
gue:
>>>>  Dropping here is better because "blah", or goto findpcb is better
>>>> because "bluh", etc.  And at the review end you have a nice final pa=
tch.
>>>>
>>>> https://reviews.freebsd.org/D8211
>>>
>>> I am not sure, I am see to
>>>
>>> sys/netinet/in_pcb.h:#define    INP_DROPPED             0x04000000 /*=

>> protocol drop flag */
>>>
>>> and think this is a flag 'all packets must be droped'
>>
>>  Hm, I believe this flag means "this inp has been dropped by the TCP
>> stack, so don't use it anymore".  Actually this flag is better describ=
ed
>> in the function that sets it:
>>
>> "(INP_DROPPED) is used by TCP to mark an inpcb as unused and avoid
>> future packet delivery or event notification when a socket remains ope=
n
>> but TCP has closed."
>>
>> https://github.com/freebsd/freebsd/blob/release/11.0.0/sys/netinet/in_=
pcb.c#L1320
>>
>> /*
>>  * in_pcbdrop() removes an inpcb from hashed lists, releasing its
>> address and
>>  * port reservation, and preventing it from being returned by inpcb lo=
okups.
>>  *
>>  * It is used by TCP to mark an inpcb as unused and avoid future packe=
t
>>  * delivery or event notification when a socket remains open but TCP h=
as
>>  * closed.  This might occur as a result of a shutdown()-initiated TCP=
 close
>>  * or a RST on the wire, and allows the port binding to be reused whil=
e
>> still
>>  * maintaining the invariant that so_pcb always points to a valid inpc=
b
>> until
>>  * in_pcbdetach().
>>  *
>>  */
>> void
>> in_pcbdrop(struct inpcb *inp)
>> {
>>   inp->inp_flags |=3D INP_DROPPED;
>>   ...
>>
>>  The classical example where "goto findpcb" is useful:  You receive a
>> new connection request with a TCP SYN packet and this packet is unluck=
y
>> and reached a inp being dropped:
>>
>>  - with "goto findpcb" approach, the next lookup will most likely find=

>> the LISTEN inp and start the TCP hand-shake as usual
>>  - with "drop the packet" approach, the TCP client will need to
>> re-transmit a TCP SYN packet
>>
>>  It is not because a packet was unlucky once that it deserves to be
>> dropped. :)
>=20
> Thanks for explaining, very helpfull.
> In this situation (TCP SYN with same 4-tuple as existing socket)
> allocate new PCB is best. But for this we must destroy current PCB. I
> am think INP_WUNLOCK(inp) don't destroy it and in_pcblookup_mbuf find
> it again (I am think in_pcblookup_mbuf find this PCB on first turn).
> I am assume for classical example in_pcbrele_wlocked(inp) free and
> destroy current PCB for possibility in_pcblookup_mbuf allocate new
> one.

 Astute question:  Here, the same inp cannot be find again by
in_pcblookup_mbuf(), the explanation is a bit long though:

in_pcbdrop() does two things under INP_WLOCK lock:
https://github.com/freebsd/freebsd/blob/release/11.0.0/sys/netinet/in_pcb=
=2Ec#L1334

1. Add INP_DROPPED flag
2. Remove the inp from the TCP hash table

 And once removed from the TCP hash table, in_pcblookup_mbuf() will
return NULL when doing the same TCP hash table lookup again.

 It means that under a INP_WLOCK lock, these two changes are atomic:

 - Either an inp does not have INP_DROPPED flag and can be in TCP hash ta=
ble
 - Either an inp has INP_DROPPED flag and is not in TCP hash table

 But when you don't have the INP_WLOCK lock, then you can witness the
intermediate state where a inp is still in TCP hash table while a thread
is adding the INP_DROPPED flag.

 Nothing unusual here.

 Then threads are competing for the INP_WLOCK lock.  For the example,
let's say the thread A wants to run tcp_input()/in_pcblookup_mbuf() and
racing for this INP_WLOCK:

https://github.com/freebsd/freebsd/blob/release/11.0.0/sys/netinet/in_pcb=
=2Ec#L1964

 And thread B wants to run tcp_timer_2msl()/tcp_close()/in_pcbdrop() and
racing for this INP_WLOCK:

https://github.com/freebsd/freebsd/blob/release/11.0.0/sys/netinet/tcp_ti=
mer.c#L323

 That leads to two cases:

 o Thread A wins the race:

  Thread A will continue tcp_input() as usal and INP_DROPPED flags is
not set and inp is still in TCP hash table.
  Thread B is waiting on thread A to release INP_WLOCK after finishing
tcp_input() processing, and thread B will continue
tcp_timer_2msl()/tcp_close()/in_pcbdrop() processing.

 o Thread B wins the race:

  Thread B runs tcp_timer_2msl()/tcp_close()/in_pcbdrop() and inp
INP_DROPPED is set and inp being removed from TCP hash table.
  In parallel, thread A has found the inp in TCP hash before is was
removed, and waiting on the found inp INP_WLOCK lock.
  Once thread B has released the INP_WLOCK lock, thread A gets this lock
and sees the INP_DROPPED flag and do "goto findpcb" but here because the
inp is not more in TCP hash table and it will not be find again by
in_pcblookup_mbuf().

 Hopefully I am clear enough here.

--
Julien


--6bRp3pkLFdj50Nvhlaf2QLjbLWh7J9Aq4--

--oq9rDkJ6isRwA1ckJwhFJjfq55qKQCU3W
Content-Type: application/pgp-signature; name="signature.asc"
Content-Description: OpenPGP digital signature
Content-Disposition: attachment; filename="signature.asc"

-----BEGIN PGP SIGNATURE-----
Comment: GPGTools - https://gpgtools.org

iQEcBAEBCgAGBQJX/JK2AAoJEKVlQ5Je6dhxNxkH/RhKPtBgDzP+NpnwwumbRhEr
PrbYsnFOKL60VpPZfbvY4vz7TNclgmaaS2VIRcuJphuJr2k6MfkETyFd7QSBqSQV
vg4tAvNVXbNqG+Ubvrc2KvY/QeK+FbJc1aYlzZ9xS5trHLbf0uRjoIg+KsC+gyBA
VPLizfIGdN+8tHFs0FKAuirfPgiqNBBgUDaTCqh0l8DOsRp3/QLPs05sok6UJAtF
ooaba6b4WyFe1DCQawMvJM4JDZIrGMy3qjzZYQppM8FxYbbl3yCAqcTDV0wgksXF
TViCJYuRb+xYbQNdvnOSPZUBGbjsZYa8UXq4fmqva9urD7UllFpiDzKJ8uPJBIw=
=5V+B
-----END PGP SIGNATURE-----

--oq9rDkJ6isRwA1ckJwhFJjfq55qKQCU3W--



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?8143cd8f-c007-2378-b004-b2b037402d03>