Date: Thu, 10 Nov 2011 13:13:59 +0330
From: Hooman Fazaeli <hoomanfazaeli@gmail.com>
To: Adrian Chadd <adrian@freebsd.org>
Cc: pyunyh@gmail.com, freebsd-net@freebsd.org, Jason Wolfe <nitroboost@gmail.com>, Jack Vogel <jfvogel@gmail.com>, Emil Muratov <gpm@hotplug.ru>
Subject: Re: Intel 82574L interface wedging on em 7.1.9/7.2.3 when MSIX enabled
Message-ID: <4EBB9CDF.9090300@gmail.com>
In-Reply-To: <CAJ-Vmok8kX9F5eXTghx_s7diNiLTWY1-eMDUdCOUHQCz6zW%2BPg@mail.gmail.com>
References: <CAAAm0r0RXEJo4UiKS=Ui0e5OQTg6sg-xcYf3mYB5%2Bvk8i8557w@mail.gmail.com>
 <CAAAm0r1DKvoL9=Ket9up=4%2B5xiCzTTZJK99FhF9jcCA28B0M%2BA@mail.gmail.com>
 <CAAAm0r3XdsMHZh%2BP_NF-txZasdExzwZ8ymmGQgGhJQds0fOiBQ@mail.gmail.com>
 <CAAAm0r1iS3z-7CBJ=xYDf%2BJOA1Q2nU0O54Twbyb7FjvgWHjKVw@mail.gmail.com>
 <4EA7E203.3020306@sepehrs.com>
 <CAAAm0r3Nr2t8cCetPkFnLQ-3KwqHw_0SpqbtvYPRUkSP=9n8CA@mail.gmail.com>
 <4EA80818.3030504@sentex.net>
 <4EA80F88.4000400@hotplug.ru>
 <4EA82715.2000404@gmail.com>
 <4EA8FA40.7010504@hotplug.ru>
 <4EA91836.2040508@gmail.com>
 <4EA959EE.2070806@hotplug.ru>
 <4EAD116A.8090006@gmail.com>
 <CAAAm0r3qm=nQQuAmZDD4k4X8K-xW6_kM9TukRT=1GoG9dYR3zw@mail.gmail.com>
 <4EAE58A2.9040803@gmail.com>
 <CAAAm0r0uoPPEQbq5rHkFr6ZLp-WJ4YVjDVvxxV6y%2BUh4eEKDEA@mail.gmail.com>
 <4EB96511.50701@gmail.com>
 <CAJ-Vmomf-wxb8dY7YF7qT_FGK5d-YLPU3BkPOeHnOtKZ%2BUrYeQ@mail.gmail.com>
 <4EBA3F22.2060204@gmail.com>
 <CAJ-Vmok8kX9F5eXTghx_s7diNiLTWY1-eMDUdCOUHQCz6zW%2BPg@mail.gmail.com>

On 11/10/2011 3:39 AM, Adrian Chadd wrote:
> There's no locking around the OACTIVE flag set/clear, right?
> Is it possible that multiple TX threads are fiddling with OACTIVE and
> then it's not being properly cleared and tx kicked?
>
>
> Adrian

If we check for OACTIVE periodically (for instance, in local_timer),
then under a transient resource shortage the driver will eventually
end up with OACTIVE cleared.

Under frequent resource shortages, the driver may remain OACTIVE
longer than it is ~OACTIVE, or it may constantly toggle, but there is
not much the driver can do about this, and simple locking around
OACTIVE set/clear does not change the situation. The problem _is_ low
resources, and the only fix is to increase them.

The problems we should focus on here are two things:

1- The driver _must_ be able to recover from OACTIVE after transient
   resource shortages.
2- It is desirable to do this as fast as possible.

Doing recovery in local_timer accommodates the first need, but it is
very far from the second.

One possible solution for 2 would be to defer setting OACTIVE until N
consecutive transmissions fail (i.e., N == 75% of
(if_snd.ifq_maxlen - if_snd.ifq_len)). The overhead is a little wasted
cpu time in longer OACTIVE states. We still need local_timer to
recover from these states.
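
To make the deferral idea concrete, here is a rough sketch in plain C.
It is not em(4) code: struct tx_state, tx_try(), tx_local_timer() and
oactive_threshold() are hypothetical names made up for illustration,
the "oactive" field only models IFF_DRV_OACTIVE, and free_desc stands
in for whatever resource runs short. Clamping N to at least 1 is also
an assumption on my part.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified TX state; not the real em(4) structures. */
struct tx_state {
	int  free_desc;      /* free TX descriptors (the scarce resource) */
	int  consec_fail;    /* consecutive failed transmit attempts */
	int  fail_threshold; /* N: failures tolerated before OACTIVE is set */
	bool oactive;        /* models the IFF_DRV_OACTIVE flag */
};

/* N == 75% of (ifq_maxlen - ifq_len), clamped to >= 1 (assumption). */
static int
oactive_threshold(int ifq_maxlen, int ifq_len)
{
	int n = (3 * (ifq_maxlen - ifq_len)) / 4;

	return (n < 1 ? 1 : n);
}

/* Try to send one packet; returns true if a descriptor was consumed. */
static bool
tx_try(struct tx_state *sc)
{
	assert(sc->fail_threshold >= 1);

	if (sc->oactive)
		return (false);         /* stack should not be calling us */

	if (sc->free_desc > 0) {
		sc->free_desc--;
		sc->consec_fail = 0;    /* a success ends the failure streak */
		return (true);
	}

	/* Shortage: set OACTIVE only after N failures in a row. */
	if (++sc->consec_fail >= sc->fail_threshold)
		sc->oactive = true;
	return (false);
}

/* Periodic recovery: clear OACTIVE once resources are available again. */
static void
tx_local_timer(struct tx_state *sc)
{
	if (sc->oactive && sc->free_desc > 0) {
		sc->oactive = false;
		sc->consec_fail = 0;
	}
}
```

With ifq_maxlen = 8 and ifq_len = 4 this gives N = 3, so a short burst
of failures is absorbed without touching OACTIVE, and the timer remains
the backstop once the flag does get set.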