Skip site navigation (1)Skip section navigation (2)
Date:      Tue, 26 Jul 2022 11:03:47 +0200
From:      Kristof Provost <kp@FreeBSD.org>
To:        John Baldwin <jhb@FreeBSD.org>
Cc:        src-committers@FreeBSD.org, dev-commits-src-all@FreeBSD.org, dev-commits-src-main@FreeBSD.org
Subject:   Re: git: 151abc80cde7 - main - if_vlan: avoid hash table thrashing when adding and removing entries
Message-ID:  <057F4F07-EFE1-4F67-B15D-857F42C5588D@FreeBSD.org>
In-Reply-To: <1ad901aa-ac87-3725-de1e-3c6a42c50cc2@FreeBSD.org>
References:  <202207222319.26MNJEVY032683@gitrepo.freebsd.org> <1ad901aa-ac87-3725-de1e-3c6a42c50cc2@FreeBSD.org>

next in thread | previous in thread | raw e-mail | index | archive | help

--=_MailMate_5ECFD0FD-8C38-43CD-ACD5-6EDDAE48A5DF_=
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: quoted-printable

On 25 Jul 2022, at 18:20, John Baldwin wrote:
> On 7/22/22 4:19 PM, Kristof Provost wrote:
>> The branch main has been updated by kp:
>>
>> URL: =

>> https://cgit.FreeBSD.org/src/commit/?id=3D151abc80cde778bc18b91c334d07=
fbd52bbb38fb
>>
>> commit 151abc80cde778bc18b91c334d07fbd52bbb38fb
>> Author:     Kristof Provost <kp@FreeBSD.org>
>> AuthorDate: 2022-07-22 17:17:04 +0000
>> Commit:     Kristof Provost <kp@FreeBSD.org>
>> CommitDate: 2022-07-22 17:18:41 +0000
>>
>>      if_vlan: avoid hash table thrashing when adding and removing =

>> entries
>>          vlan_remhash() uses incorrect value for b.
>>          When using the default value for VLAN_DEF_HWIDTH (4), the =

>> VLAN hash-list table
>>      expands from 16 chains to 32 chains as the 129th entry is added. =

>> trunk->hwidth
>>      becomes 5. Say a few more entries are added and there are now =

>> 135 entries.
>>      trunk-hwidth will still be 5. If an entry is removed, =

>> vlan_remhash() will
>>      calculate a value of 32 for b. refcnt will be decremented to =

>> 134. The if
>>      comparison at line 473 will return true and vlan_growhash() will =

>> be called. The
>>      VLAN hash-list table will be compressed from 32 chains wide to =

>> 16 chains wide.
>>      hwidth will become 4. This is an error, and it can be seen when =

>> a new VLAN is
>>      added. The table will again be expanded. If an entry is then =

>> removed, again
>>      the table is contracted.
>>          If the number of VLANS stays in the range of 128-512, each =

>> time an insert
>>      follows a remove, the table will expand. Each time a remove =

>> follows an
>>      insert, the table will be contracted.
>>          The fix is simple. The line 473 should test that the number =

>> of entries has
>>      decreased such that the table should be contracted using what =

>> would be the new
>>      value of hwidth. line 467 should be:
>>                  b =3D 1 << (trunk->hwidth - 1);
>>          PR:             265382
>>      Reviewed by:    kp
>>      MFC after:      2 weeks
>>      Sponsored by:   NetApp, Inc.
>
> This does mean that there are some odd edge cases (e.g. if you remove =

> the 513th entry
> only to turn around and re-add it) that can still thrash even with =

> this fixed.  Not
> sure if those are worth worrying about though?  If they were, perhaps =

> you could
> imagine some sort of "dampener" where you have to remove some number N =

> of entries
> to actually rehash to a smaller table.  But that is probably rather =

> complex to
> implement and probably not worth it?
>
Good point, although I think I agree it=E2=80=99s probably not worth worr=
ying =

too much about this.

There=E2=80=99s a tangentially related issue I also want to look at one o=
f =

these days, namely that the current `trunk->refcnt > (b * b) / 2` check =

results in far too many entries in each hash row. For example, until 128 =

vlan interfaces we stick with width =3D 4, so end up with an average of 8=
 =

entries in each row.
That means that for every inbound packet we scan through a linked list =

of 8 entries to find the correct vlan interface. We should tweak the =

check to reduce this, as all it=E2=80=99ll cost is a tiny bit of memory. =
(Or =

we should just default VLAN_ARRAY on, paying a bit more memory for no =

linked-list scanning).

Kristof
--=_MailMate_5ECFD0FD-8C38-43CD-ACD5-6EDDAE48A5DF_=
Content-Type: text/html; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

<!DOCTYPE html>
<html>
<head>
<meta http-equiv=3D"Content-Type" content=3D"text/xhtml; charset=3Dutf-8"=
>
</head>
<body><div style=3D"font-family: sans-serif;"><div class=3D"markdown" sty=
le=3D"white-space: normal;">
<p dir=3D"auto">On 25 Jul 2022, at 18:20, John Baldwin wrote:</p>
</div><div class=3D"plaintext" style=3D"white-space: normal;"><blockquote=
 style=3D"margin: 0 0 5px; padding-left: 5px; border-left: 2px solid #136=
BCE; color: #136BCE;"><p dir=3D"auto">On 7/22/22 4:19 PM, Kristof Provost=
 wrote:</p>
<blockquote style=3D"margin: 0 0 5px; padding-left: 5px; border-left: 2px=
 solid #136BCE; border-left-color: #4B89CF; color: #4B89CF;"><p dir=3D"au=
to">The branch main has been updated by kp:</p>
<p dir=3D"auto">URL: <a href=3D"https://cgit.FreeBSD.org/src/commit/?id=3D=
151abc80cde778bc18b91c334d07fbd52bbb38fb">https://cgit.FreeBSD.org/src/co=
mmit/?id=3D151abc80cde778bc18b91c334d07fbd52bbb38fb</a></p>
<p dir=3D"auto">commit 151abc80cde778bc18b91c334d07fbd52bbb38fb
<br>
Author:     Kristof Provost &lt;kp@FreeBSD.org&gt;
<br>
AuthorDate: 2022-07-22 17:17:04 +0000
<br>
Commit:     Kristof Provost &lt;kp@FreeBSD.org&gt;
<br>
CommitDate: 2022-07-22 17:18:41 +0000</p>
<p dir=3D"auto">     if_vlan: avoid hash table thrashing when adding and =
removing entries
<br>
         vlan_remhash() uses incorrect value for b.
<br>
         When using the default value for VLAN_DEF_HWIDTH (4), the VLAN h=
ash-list table
<br>
     expands from 16 chains to 32 chains as the 129th entry is added. tru=
nk-&gt;hwidth
<br>
     becomes 5. Say a few more entries are added and there are now 135 en=
tries.
<br>
     trunk-hwidth will still be 5. If an entry is removed, vlan_remhash()=
 will
<br>
     calculate a value of 32 for b. refcnt will be decremented to 134. Th=
e if
<br>
     comparison at line 473 will return true and vlan_growhash() will be =
called. The
<br>
     VLAN hash-list table will be compressed from 32 chains wide to 16 ch=
ains wide.
<br>
     hwidth will become 4. This is an error, and it can be seen when a ne=
w VLAN is
<br>
     added. The table will again be expanded. If an entry is then removed=
, again
<br>
     the table is contracted.
<br>
         If the number of VLANS stays in the range of 128-512, each time =
an insert
<br>
     follows a remove, the table will expand. Each time a remove follows =
an
<br>
     insert, the table will be contracted.
<br>
         The fix is simple. The line 473 should test that the number of e=
ntries has
<br>
     decreased such that the table should be contracted using what would =
be the new
<br>
     value of hwidth. line 467 should be:
<br>
                 b =3D 1 &lt;&lt; (trunk-&gt;hwidth - 1);
<br>
         PR:             265382
<br>
     Reviewed by:    kp
<br>
     MFC after:      2 weeks
<br>
     Sponsored by:   NetApp, Inc.</p>
</blockquote><p dir=3D"auto">This does mean that there are some odd edge =
cases (e.g. if you remove the 513th entry
<br>
only to turn around and re-add it) that can still thrash even with this f=
ixed.  Not
<br>
sure if those are worth worrying about though?  If they were, perhaps you=
 could
<br>
imagine some sort of "dampener" where you have to remove some number N of=
 entries
<br>
to actually rehash to a smaller table.  But that is probably rather compl=
ex to
<br>
implement and probably not worth it?</p>
<br></blockquote></div>
<div class=3D"markdown" style=3D"white-space: normal;">
<p dir=3D"auto">Good point, although I think I agree it=E2=80=99s probabl=
y not worth worrying too much about this.</p>
<p dir=3D"auto">There=E2=80=99s a tangentially related issue I also want =
to look at one of these days, namely that the current <code>trunk-&gt;ref=
cnt &gt; (b * b) / 2</code> check results in far too many entries in each=
 hash row. For example, until 128 vlan interfaces we stick with width =3D=
 4, so end up with an average of 8 entries in each row.<br>
That means that for every inbound packet we scan through a linked list of=
 8 entries to find the correct vlan interface. We should tweak the check =
to reduce this, as all it=E2=80=99ll cost is a tiny bit of memory. (Or we=
 should just default VLAN_ARRAY on, paying a bit more memory for no linke=
d-list scanning).</p>
<p dir=3D"auto">Kristof</p>

</div></div></body>

</html>

--=_MailMate_5ECFD0FD-8C38-43CD-ACD5-6EDDAE48A5DF_=--



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?057F4F07-EFE1-4F67-B15D-857F42C5588D>