Date: Thu, 13 Jan 2022 16:27:23 +0100 From: Wojciech Macek <wma@semihalf.com> To: Mark Johnston <markj@freebsd.org> Cc: Wojciech Macek <wma@freebsd.org>, src-committers <src-committers@freebsd.org>, "<dev-commits-src-all@freebsd.org>" <dev-commits-src-all@freebsd.org>, dev-commits-src-main@freebsd.org Subject: Re: git: 2e72208b6c62 - main - ip_mroute: do not call epoch_waitwhen lock is taken Message-ID: <CANsEV8c_WGG%2BCdtvsAo0PC7qrdBiAMLsWhPwVQ_vmWouhHBHBQ@mail.gmail.com> In-Reply-To: <Yd9K0bT4kgWDHlLd@nuc> References: <202201111019.20BAJjCg058259@gitrepo.freebsd.org> <Yd9K0bT4kgWDHlLd@nuc>
next in thread | previous in thread | raw e-mail | index | archive | help
--000000000000593d8f05d578538d Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Thanks, i'll take a look if moving mrouter_done outside the lock would work= . Wojtek =C5=9Br., 12 sty 2022, 22:40 u=C5=BCytkownik Mark Johnston <markj@freebsd.o= rg> napisa=C5=82: > On Tue, Jan 11, 2022 at 10:19:45AM +0000, Wojciech Macek wrote: > > The branch main has been updated by wma: > > > > URL: > https://cgit.FreeBSD.org/src/commit/?id=3D2e72208b6c622505323ed48dc58830f= c307392b1 > > > > commit 2e72208b6c622505323ed48dc58830fc307392b1 > > Author: Wojciech Macek <wma@FreeBSD.org> > > AuthorDate: 2022-01-11 10:08:35 +0000 > > Commit: Wojciech Macek <wma@FreeBSD.org> > > CommitDate: 2022-01-11 10:19:32 +0000 > > > > ip_mroute: do not call epoch_waitwhen lock is taken > > > > mrouter_done is called with RAW IP lock taken. Some annoying > > printfs are visible on the console if INVARIANTS option is enabled. > > > > Provide atomic-based mechanism which counts enters and exits from/t= o > > critical section in ip_input and ip_output. > > Before de-initialization of function pointers ensure (with busy-wai= t) > > that mrouter de-initialization is visible to all readers and that w= e > don't > > remove pointers (like ip_mforward etc.) in the middle of packet > processing. > > This doesn't address the problem that the warning is complaining about, > which is that a non-sleepable lock is held while performing an expensive > operation. Now we are silently spinning instead, and there's no > guarantee of forward progress since threads may be frequently entering > the MROUTER_RLOCK-protected section in ip_input(). And the change > converts a per-CPU atomic operation (in epoch_enter()) to a global > atomic operation, potentially hurting performance. The atomics are also > missing requisite barriers. > > I think a better solution is to not hold the raw inpcb lock while > calling the ip_mrouter_done() callback. That lock does not provide > synchronization for anything related to ip_mroute.c. Also note that > X_ip_mrouter_done() can sleep in taskqueue_drain(), which is another > reason to avoid holding the lock there. > > > --- > > sys/netinet/ip_mroute.h | 9 +++++---- > > sys/netinet/raw_ip.c | 1 + > > 2 files changed, 6 insertions(+), 4 deletions(-) > > > > diff --git a/sys/netinet/ip_mroute.h b/sys/netinet/ip_mroute.h > > index 65c5bdd3a025..faf5b8c72a46 100644 > > --- a/sys/netinet/ip_mroute.h > > +++ b/sys/netinet/ip_mroute.h > > @@ -365,11 +365,12 @@ extern int (*ip_mrouter_set)(struct socket *= , > struct sockopt *); > > extern int (*ip_mrouter_get)(struct socket *, struct sockopt *); > > extern int (*ip_mrouter_done)(void); > > extern int (*mrt_ioctl)(u_long, caddr_t, int); > > +extern int ip_mrouter_critical_section_cnt; > > > > -#define MROUTER_RLOCK_TRACKER struct epoch_tracker mrouter_et > > -#define MROUTER_RLOCK() epoch_enter_preempt(net_epoch_preempt, > &mrouter_et) > > -#define MROUTER_RUNLOCK() > epoch_exit_preempt(net_epoch_preempt, &mrouter_et) > > -#define MROUTER_WAIT() epoch_wait_preempt(net_epoch_preempt) > > +#define MROUTER_RLOCK_TRACKER > > Why keep this definition at all? > > > +#define MROUTER_RLOCK() > atomic_add_int(&ip_mrouter_critical_section_cnt, 1) > > +#define MROUTER_RUNLOCK() > atomic_subtract_int(&ip_mrouter_critical_section_cnt, 1) > > +#define MROUTER_WAIT() do {} while > (atomic_load_int(&ip_mrouter_critical_section_cnt) !=3D 0) > > > > #endif /* _KERNEL */ > > > > diff --git a/sys/netinet/raw_ip.c b/sys/netinet/raw_ip.c > > index 7c495745806e..dc49c36f25ad 100644 > > --- a/sys/netinet/raw_ip.c > > +++ b/sys/netinet/raw_ip.c > > @@ -120,6 +120,7 @@ VNET_DEFINE(struct socket *, ip_mrouter); > > int (*ip_mrouter_set)(struct socket *, struct sockopt *); > > int (*ip_mrouter_get)(struct socket *, struct sockopt *); > > int (*ip_mrouter_done)(void); > > +int ip_mrouter_critical_section_cnt; > > int (*ip_mforward)(struct ip *, struct ifnet *, struct mbuf *, > > struct ip_moptions *); > > int (*mrt_ioctl)(u_long, caddr_t, int); > --000000000000593d8f05d578538d Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable <div dir=3D"auto">Thanks, i'll take a look if moving mrouter_done outsi= de the lock would work.<div dir=3D"auto"><br></div><div dir=3D"auto">Wojtek= </div></div><br><div class=3D"gmail_quote"><div dir=3D"ltr" class=3D"gmail_= attr">=C5=9Br., 12 sty 2022, 22:40 u=C5=BCytkownik Mark Johnston <<a hre= f=3D"mailto:markj@freebsd.org">markj@freebsd.org</a>> napisa=C5=82:<br><= /div><blockquote class=3D"gmail_quote" style=3D"margin:0 0 0 .8ex;border-le= ft:1px #ccc solid;padding-left:1ex">On Tue, Jan 11, 2022 at 10:19:45AM +000= 0, Wojciech Macek wrote:<br> > The branch main has been updated by wma:<br> > <br> > URL: <a href=3D"https://cgit.FreeBSD.org/src/commit/?id=3D2e72208b6c62= 2505323ed48dc58830fc307392b1" rel=3D"noreferrer noreferrer" target=3D"_blan= k">https://cgit.FreeBSD.org/src/commit/?id=3D2e72208b6c622505323ed48dc58830= fc307392b1</a><br> > <br> > commit 2e72208b6c622505323ed48dc58830fc307392b1<br> > Author:=C2=A0 =C2=A0 =C2=A0Wojciech Macek <wma@FreeBSD.org><br> > AuthorDate: 2022-01-11 10:08:35 +0000<br> > Commit:=C2=A0 =C2=A0 =C2=A0Wojciech Macek <wma@FreeBSD.org><br> > CommitDate: 2022-01-11 10:19:32 +0000<br> > <br> >=C2=A0 =C2=A0 =C2=A0ip_mroute: do not call epoch_waitwhen lock is taken= <br> >=C2=A0 =C2=A0 =C2=A0<br> >=C2=A0 =C2=A0 =C2=A0mrouter_done is called with RAW IP lock taken. Some= annoying<br> >=C2=A0 =C2=A0 =C2=A0printfs are visible on the console if INVARIANTS op= tion is enabled.<br> >=C2=A0 =C2=A0 =C2=A0<br> >=C2=A0 =C2=A0 =C2=A0Provide atomic-based mechanism which counts enters = and exits from/to<br> >=C2=A0 =C2=A0 =C2=A0critical section in ip_input and ip_output.<br> >=C2=A0 =C2=A0 =C2=A0Before de-initialization of function pointers ensur= e (with busy-wait)<br> >=C2=A0 =C2=A0 =C2=A0that mrouter de-initialization is visible to all re= aders and that we don't<br> >=C2=A0 =C2=A0 =C2=A0remove pointers (like ip_mforward etc.) in the midd= le of packet processing.<br> <br> This doesn't address the problem that the warning is complaining about,= <br> which is that a non-sleepable lock is held while performing an expensive<br= > operation.=C2=A0 Now we are silently spinning instead, and there's no<b= r> guarantee of forward progress since threads may be frequently entering<br> the MROUTER_RLOCK-protected section in ip_input().=C2=A0 And the change<br> converts a per-CPU atomic operation (in epoch_enter()) to a global<br> atomic operation, potentially hurting performance.=C2=A0 The atomics are al= so<br> missing requisite barriers.<br> <br> I think a better solution is to not hold the raw inpcb lock while<br> calling the ip_mrouter_done() callback.=C2=A0 That lock does not provide<br= > synchronization for anything related to ip_mroute.c.=C2=A0 Also note that<b= r> X_ip_mrouter_done() can sleep in taskqueue_drain(), which is another<br> reason to avoid holding the lock there.<br> <br> > ---<br> >=C2=A0 sys/netinet/ip_mroute.h | 9 +++++----<br> >=C2=A0 sys/netinet/raw_ip.c=C2=A0 =C2=A0 | 1 +<br> >=C2=A0 2 files changed, 6 insertions(+), 4 deletions(-)<br> > <br> > diff --git a/sys/netinet/ip_mroute.h b/sys/netinet/ip_mroute.h<br> > index 65c5bdd3a025..faf5b8c72a46 100644<br> > --- a/sys/netinet/ip_mroute.h<br> > +++ b/sys/netinet/ip_mroute.h<br> > @@ -365,11 +365,12 @@ extern int=C2=A0 =C2=A0 =C2=A0 (*ip_mrouter_set)= (struct socket *, struct sockopt *);<br> >=C2=A0 extern int=C2=A0 =C2=A0(*ip_mrouter_get)(struct socket *, struct= sockopt *);<br> >=C2=A0 extern int=C2=A0 =C2=A0(*ip_mrouter_done)(void);<br> >=C2=A0 extern int=C2=A0 =C2=A0(*mrt_ioctl)(u_long, caddr_t, int);<br> > +extern int=C2=A0 =C2=A0ip_mrouter_critical_section_cnt;<br> >=C2=A0 <br> > -#define=C2=A0 =C2=A0 =C2=A0 MROUTER_RLOCK_TRACKER=C2=A0 =C2=A0struct = epoch_tracker mrouter_et<br> > -#define=C2=A0 =C2=A0 =C2=A0 MROUTER_RLOCK() epoch_enter_preempt(net_e= poch_preempt, &mrouter_et)<br> > -#define=C2=A0 =C2=A0 =C2=A0 MROUTER_RUNLOCK()=C2=A0 =C2=A0 =C2=A0 =C2= =A0epoch_exit_preempt(net_epoch_preempt, &mrouter_et)<br> > -#define=C2=A0 =C2=A0 =C2=A0 MROUTER_WAIT()=C2=A0 epoch_wait_preempt(n= et_epoch_preempt)<br> > +#define=C2=A0 =C2=A0 =C2=A0 MROUTER_RLOCK_TRACKER<br> <br> Why keep this definition at all?<br> <br> > +#define=C2=A0 =C2=A0 =C2=A0 MROUTER_RLOCK()=C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0atomic_add_int(&ip_mrouter_critical_section_cnt, 1)<br> > +#define=C2=A0 =C2=A0 =C2=A0 MROUTER_RUNLOCK()=C2=A0 =C2=A0 =C2=A0 =C2= =A0atomic_subtract_int(&ip_mrouter_critical_section_cnt, 1)<br> > +#define=C2=A0 =C2=A0 =C2=A0 MROUTER_WAIT()=C2=A0 =C2=A0 =C2=A0 =C2=A0= =C2=A0 do {} while (atomic_load_int(&ip_mrouter_critical_section_cnt) = !=3D 0)<br> >=C2=A0 <br> >=C2=A0 #endif /* _KERNEL */<br> >=C2=A0 <br> > diff --git a/sys/netinet/raw_ip.c b/sys/netinet/raw_ip.c<br> > index 7c495745806e..dc49c36f25ad 100644<br> > --- a/sys/netinet/raw_ip.c<br> > +++ b/sys/netinet/raw_ip.c<br> > @@ -120,6 +120,7 @@ VNET_DEFINE(struct socket *, ip_mrouter);<br> >=C2=A0 int (*ip_mrouter_set)(struct socket *, struct sockopt *);<br> >=C2=A0 int (*ip_mrouter_get)(struct socket *, struct sockopt *);<br> >=C2=A0 int (*ip_mrouter_done)(void);<br> > +int ip_mrouter_critical_section_cnt;<br> >=C2=A0 int (*ip_mforward)(struct ip *, struct ifnet *, struct mbuf *,<b= r> >=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 struct i= p_moptions *);<br> >=C2=A0 int (*mrt_ioctl)(u_long, caddr_t, int);<br> </blockquote></div> --000000000000593d8f05d578538d--
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?CANsEV8c_WGG%2BCdtvsAo0PC7qrdBiAMLsWhPwVQ_vmWouhHBHBQ>