Skip site navigation (1)Skip section navigation (2)
Date:      Thu, 13 Jan 2022 16:27:23 +0100
From:      Wojciech Macek <wma@semihalf.com>
To:        Mark Johnston <markj@freebsd.org>
Cc:        Wojciech Macek <wma@freebsd.org>, src-committers <src-committers@freebsd.org>,  "<dev-commits-src-all@freebsd.org>" <dev-commits-src-all@freebsd.org>, dev-commits-src-main@freebsd.org
Subject:   Re: git: 2e72208b6c62 - main - ip_mroute: do not call epoch_waitwhen lock is taken
Message-ID:  <CANsEV8c_WGG%2BCdtvsAo0PC7qrdBiAMLsWhPwVQ_vmWouhHBHBQ@mail.gmail.com>
In-Reply-To: <Yd9K0bT4kgWDHlLd@nuc>
References:  <202201111019.20BAJjCg058259@gitrepo.freebsd.org> <Yd9K0bT4kgWDHlLd@nuc>

next in thread | previous in thread | raw e-mail | index | archive | help
--000000000000593d8f05d578538d
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

Thanks, i'll take a look if moving mrouter_done outside the lock would work=
.

Wojtek

=C5=9Br., 12 sty 2022, 22:40 u=C5=BCytkownik Mark Johnston <markj@freebsd.o=
rg>
napisa=C5=82:

> On Tue, Jan 11, 2022 at 10:19:45AM +0000, Wojciech Macek wrote:
> > The branch main has been updated by wma:
> >
> > URL:
> https://cgit.FreeBSD.org/src/commit/?id=3D2e72208b6c622505323ed48dc58830f=
c307392b1
> >
> > commit 2e72208b6c622505323ed48dc58830fc307392b1
> > Author:     Wojciech Macek <wma@FreeBSD.org>
> > AuthorDate: 2022-01-11 10:08:35 +0000
> > Commit:     Wojciech Macek <wma@FreeBSD.org>
> > CommitDate: 2022-01-11 10:19:32 +0000
> >
> >     ip_mroute: do not call epoch_waitwhen lock is taken
> >
> >     mrouter_done is called with RAW IP lock taken. Some annoying
> >     printfs are visible on the console if INVARIANTS option is enabled.
> >
> >     Provide atomic-based mechanism which counts enters and exits from/t=
o
> >     critical section in ip_input and ip_output.
> >     Before de-initialization of function pointers ensure (with busy-wai=
t)
> >     that mrouter de-initialization is visible to all readers and that w=
e
> don't
> >     remove pointers (like ip_mforward etc.) in the middle of packet
> processing.
>
> This doesn't address the problem that the warning is complaining about,
> which is that a non-sleepable lock is held while performing an expensive
> operation.  Now we are silently spinning instead, and there's no
> guarantee of forward progress since threads may be frequently entering
> the MROUTER_RLOCK-protected section in ip_input().  And the change
> converts a per-CPU atomic operation (in epoch_enter()) to a global
> atomic operation, potentially hurting performance.  The atomics are also
> missing requisite barriers.
>
> I think a better solution is to not hold the raw inpcb lock while
> calling the ip_mrouter_done() callback.  That lock does not provide
> synchronization for anything related to ip_mroute.c.  Also note that
> X_ip_mrouter_done() can sleep in taskqueue_drain(), which is another
> reason to avoid holding the lock there.
>
> > ---
> >  sys/netinet/ip_mroute.h | 9 +++++----
> >  sys/netinet/raw_ip.c    | 1 +
> >  2 files changed, 6 insertions(+), 4 deletions(-)
> >
> > diff --git a/sys/netinet/ip_mroute.h b/sys/netinet/ip_mroute.h
> > index 65c5bdd3a025..faf5b8c72a46 100644
> > --- a/sys/netinet/ip_mroute.h
> > +++ b/sys/netinet/ip_mroute.h
> > @@ -365,11 +365,12 @@ extern int      (*ip_mrouter_set)(struct socket *=
,
> struct sockopt *);
> >  extern int   (*ip_mrouter_get)(struct socket *, struct sockopt *);
> >  extern int   (*ip_mrouter_done)(void);
> >  extern int   (*mrt_ioctl)(u_long, caddr_t, int);
> > +extern int   ip_mrouter_critical_section_cnt;
> >
> > -#define      MROUTER_RLOCK_TRACKER   struct epoch_tracker mrouter_et
> > -#define      MROUTER_RLOCK() epoch_enter_preempt(net_epoch_preempt,
> &mrouter_et)
> > -#define      MROUTER_RUNLOCK()
>  epoch_exit_preempt(net_epoch_preempt, &mrouter_et)
> > -#define      MROUTER_WAIT()  epoch_wait_preempt(net_epoch_preempt)
> > +#define      MROUTER_RLOCK_TRACKER
>
> Why keep this definition at all?
>
> > +#define      MROUTER_RLOCK()
>  atomic_add_int(&ip_mrouter_critical_section_cnt, 1)
> > +#define      MROUTER_RUNLOCK()
>  atomic_subtract_int(&ip_mrouter_critical_section_cnt, 1)
> > +#define      MROUTER_WAIT()          do {} while
> (atomic_load_int(&ip_mrouter_critical_section_cnt) !=3D 0)
> >
> >  #endif /* _KERNEL */
> >
> > diff --git a/sys/netinet/raw_ip.c b/sys/netinet/raw_ip.c
> > index 7c495745806e..dc49c36f25ad 100644
> > --- a/sys/netinet/raw_ip.c
> > +++ b/sys/netinet/raw_ip.c
> > @@ -120,6 +120,7 @@ VNET_DEFINE(struct socket *, ip_mrouter);
> >  int (*ip_mrouter_set)(struct socket *, struct sockopt *);
> >  int (*ip_mrouter_get)(struct socket *, struct sockopt *);
> >  int (*ip_mrouter_done)(void);
> > +int ip_mrouter_critical_section_cnt;
> >  int (*ip_mforward)(struct ip *, struct ifnet *, struct mbuf *,
> >                  struct ip_moptions *);
> >  int (*mrt_ioctl)(u_long, caddr_t, int);
>

--000000000000593d8f05d578538d
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

<div dir=3D"auto">Thanks, i&#39;ll take a look if moving mrouter_done outsi=
de the lock would work.<div dir=3D"auto"><br></div><div dir=3D"auto">Wojtek=
</div></div><br><div class=3D"gmail_quote"><div dir=3D"ltr" class=3D"gmail_=
attr">=C5=9Br., 12 sty 2022, 22:40 u=C5=BCytkownik Mark Johnston &lt;<a hre=
f=3D"mailto:markj@freebsd.org">markj@freebsd.org</a>&gt; napisa=C5=82:<br><=
/div><blockquote class=3D"gmail_quote" style=3D"margin:0 0 0 .8ex;border-le=
ft:1px #ccc solid;padding-left:1ex">On Tue, Jan 11, 2022 at 10:19:45AM +000=
0, Wojciech Macek wrote:<br>
&gt; The branch main has been updated by wma:<br>
&gt; <br>
&gt; URL: <a href=3D"https://cgit.FreeBSD.org/src/commit/?id=3D2e72208b6c62=
2505323ed48dc58830fc307392b1" rel=3D"noreferrer noreferrer" target=3D"_blan=
k">https://cgit.FreeBSD.org/src/commit/?id=3D2e72208b6c622505323ed48dc58830=
fc307392b1</a><br>
&gt; <br>
&gt; commit 2e72208b6c622505323ed48dc58830fc307392b1<br>
&gt; Author:=C2=A0 =C2=A0 =C2=A0Wojciech Macek &lt;wma@FreeBSD.org&gt;<br>
&gt; AuthorDate: 2022-01-11 10:08:35 +0000<br>
&gt; Commit:=C2=A0 =C2=A0 =C2=A0Wojciech Macek &lt;wma@FreeBSD.org&gt;<br>
&gt; CommitDate: 2022-01-11 10:19:32 +0000<br>
&gt; <br>
&gt;=C2=A0 =C2=A0 =C2=A0ip_mroute: do not call epoch_waitwhen lock is taken=
<br>
&gt;=C2=A0 =C2=A0 =C2=A0<br>
&gt;=C2=A0 =C2=A0 =C2=A0mrouter_done is called with RAW IP lock taken. Some=
 annoying<br>
&gt;=C2=A0 =C2=A0 =C2=A0printfs are visible on the console if INVARIANTS op=
tion is enabled.<br>
&gt;=C2=A0 =C2=A0 =C2=A0<br>
&gt;=C2=A0 =C2=A0 =C2=A0Provide atomic-based mechanism which counts enters =
and exits from/to<br>
&gt;=C2=A0 =C2=A0 =C2=A0critical section in ip_input and ip_output.<br>
&gt;=C2=A0 =C2=A0 =C2=A0Before de-initialization of function pointers ensur=
e (with busy-wait)<br>
&gt;=C2=A0 =C2=A0 =C2=A0that mrouter de-initialization is visible to all re=
aders and that we don&#39;t<br>
&gt;=C2=A0 =C2=A0 =C2=A0remove pointers (like ip_mforward etc.) in the midd=
le of packet processing.<br>
<br>
This doesn&#39;t address the problem that the warning is complaining about,=
<br>
which is that a non-sleepable lock is held while performing an expensive<br=
>
operation.=C2=A0 Now we are silently spinning instead, and there&#39;s no<b=
r>
guarantee of forward progress since threads may be frequently entering<br>
the MROUTER_RLOCK-protected section in ip_input().=C2=A0 And the change<br>
converts a per-CPU atomic operation (in epoch_enter()) to a global<br>
atomic operation, potentially hurting performance.=C2=A0 The atomics are al=
so<br>
missing requisite barriers.<br>
<br>
I think a better solution is to not hold the raw inpcb lock while<br>
calling the ip_mrouter_done() callback.=C2=A0 That lock does not provide<br=
>
synchronization for anything related to ip_mroute.c.=C2=A0 Also note that<b=
r>
X_ip_mrouter_done() can sleep in taskqueue_drain(), which is another<br>
reason to avoid holding the lock there.<br>
<br>
&gt; ---<br>
&gt;=C2=A0 sys/netinet/ip_mroute.h | 9 +++++----<br>
&gt;=C2=A0 sys/netinet/raw_ip.c=C2=A0 =C2=A0 | 1 +<br>
&gt;=C2=A0 2 files changed, 6 insertions(+), 4 deletions(-)<br>
&gt; <br>
&gt; diff --git a/sys/netinet/ip_mroute.h b/sys/netinet/ip_mroute.h<br>
&gt; index 65c5bdd3a025..faf5b8c72a46 100644<br>
&gt; --- a/sys/netinet/ip_mroute.h<br>
&gt; +++ b/sys/netinet/ip_mroute.h<br>
&gt; @@ -365,11 +365,12 @@ extern int=C2=A0 =C2=A0 =C2=A0 (*ip_mrouter_set)=
(struct socket *, struct sockopt *);<br>
&gt;=C2=A0 extern int=C2=A0 =C2=A0(*ip_mrouter_get)(struct socket *, struct=
 sockopt *);<br>
&gt;=C2=A0 extern int=C2=A0 =C2=A0(*ip_mrouter_done)(void);<br>
&gt;=C2=A0 extern int=C2=A0 =C2=A0(*mrt_ioctl)(u_long, caddr_t, int);<br>
&gt; +extern int=C2=A0 =C2=A0ip_mrouter_critical_section_cnt;<br>
&gt;=C2=A0 <br>
&gt; -#define=C2=A0 =C2=A0 =C2=A0 MROUTER_RLOCK_TRACKER=C2=A0 =C2=A0struct =
epoch_tracker mrouter_et<br>
&gt; -#define=C2=A0 =C2=A0 =C2=A0 MROUTER_RLOCK() epoch_enter_preempt(net_e=
poch_preempt, &amp;mrouter_et)<br>
&gt; -#define=C2=A0 =C2=A0 =C2=A0 MROUTER_RUNLOCK()=C2=A0 =C2=A0 =C2=A0 =C2=
=A0epoch_exit_preempt(net_epoch_preempt, &amp;mrouter_et)<br>
&gt; -#define=C2=A0 =C2=A0 =C2=A0 MROUTER_WAIT()=C2=A0 epoch_wait_preempt(n=
et_epoch_preempt)<br>
&gt; +#define=C2=A0 =C2=A0 =C2=A0 MROUTER_RLOCK_TRACKER<br>
<br>
Why keep this definition at all?<br>
<br>
&gt; +#define=C2=A0 =C2=A0 =C2=A0 MROUTER_RLOCK()=C2=A0 =C2=A0 =C2=A0 =C2=
=A0 =C2=A0atomic_add_int(&amp;ip_mrouter_critical_section_cnt, 1)<br>
&gt; +#define=C2=A0 =C2=A0 =C2=A0 MROUTER_RUNLOCK()=C2=A0 =C2=A0 =C2=A0 =C2=
=A0atomic_subtract_int(&amp;ip_mrouter_critical_section_cnt, 1)<br>
&gt; +#define=C2=A0 =C2=A0 =C2=A0 MROUTER_WAIT()=C2=A0 =C2=A0 =C2=A0 =C2=A0=
 =C2=A0 do {} while (atomic_load_int(&amp;ip_mrouter_critical_section_cnt) =
!=3D 0)<br>
&gt;=C2=A0 <br>
&gt;=C2=A0 #endif /* _KERNEL */<br>
&gt;=C2=A0 <br>
&gt; diff --git a/sys/netinet/raw_ip.c b/sys/netinet/raw_ip.c<br>
&gt; index 7c495745806e..dc49c36f25ad 100644<br>
&gt; --- a/sys/netinet/raw_ip.c<br>
&gt; +++ b/sys/netinet/raw_ip.c<br>
&gt; @@ -120,6 +120,7 @@ VNET_DEFINE(struct socket *, ip_mrouter);<br>
&gt;=C2=A0 int (*ip_mrouter_set)(struct socket *, struct sockopt *);<br>
&gt;=C2=A0 int (*ip_mrouter_get)(struct socket *, struct sockopt *);<br>
&gt;=C2=A0 int (*ip_mrouter_done)(void);<br>
&gt; +int ip_mrouter_critical_section_cnt;<br>
&gt;=C2=A0 int (*ip_mforward)(struct ip *, struct ifnet *, struct mbuf *,<b=
r>
&gt;=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 struct i=
p_moptions *);<br>
&gt;=C2=A0 int (*mrt_ioctl)(u_long, caddr_t, int);<br>
</blockquote></div>

--000000000000593d8f05d578538d--



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?CANsEV8c_WGG%2BCdtvsAo0PC7qrdBiAMLsWhPwVQ_vmWouhHBHBQ>