Message-ID: <20210226151040.6b9df8ac@kicinski-fedora-pc1c0hjn.dhcp.thefacebook.com>
Date: Fri, 26 Feb 2021 15:10:40 -0800
From: Jakub Kicinski <kuba@...nel.org>
To: Wei Wang <weiwan@...gle.com>
Cc: Martin Zaharinov <micron10@...il.com>,
Alexander Duyck <alexanderduyck@...com>,
Eric Dumazet <edumazet@...gle.com>,
"David S . Miller" <davem@...emloft.net>,
netdev <netdev@...r.kernel.org>, Paolo Abeni <pabeni@...hat.com>,
Hannes Frederic Sowa <hannes@...essinduktion.org>,
Alexander Duyck <alexander.duyck@...il.com>
Subject: Re: [PATCH net] net: fix race between napi kthread mode and busy poll

On Fri, 26 Feb 2021 14:24:29 -0800 Wei Wang wrote:
> > I'm not sure this takes care of rapid:
> >
> > dev_set_threaded(0)
> > # NAPI gets sent to sirq
> > dev_set_threaded(1)
> >
> > since subsequent set_threaded(1) doesn't spawn the thread "afresh".
>
> I think the race between softirq and kthread could be purely dependent
> on the SCHED bit. In napi_schedule_prep(), we check if SCHED bit is
> set. And we only call ____napi_schedule() when SCHED bit is not set.
> In ____napi_schedule(), we either wake up kthread, or raise softirq,
> never both.
> So as long as we don't wake up the kthread when creating it, there
> should not be a chance of race between softirq and kthread.

But we don't destroy the thread when dev_set_threaded(0) is called, nor
do we make sure that it gets parked; we just clear NAPI_STATE_THREADED
and that's it.

The thread could be running long after NAPI_STATE_THREADED was cleared,
and long after it gave up NAPI_STATE_SCHED, e.g. if some heavy softirq
processing kicks in at the very moment we re-enable BH.
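
To make the scenario above concrete, here is a toy userspace model in
C11. It is not kernel code: the bit names mirror the ones discussed in
this thread, but the toy_* names, the struct, and the simplified
dispatch logic are assumptions made purely for illustration. It
sketches how, once NAPI_STATE_THREADED is cleared, a schedule goes to
softirq while a lingering kthread that only tests SCHED would still
consider itself entitled to poll.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Toy model only; these bits stand in for the kernel's state bits. */
#define TOY_SCHED    (1u << 0)  /* models NAPI_STATE_SCHED */
#define TOY_THREADED (1u << 1)  /* models NAPI_STATE_THREADED */

enum toy_poller { TOY_NONE, TOY_SOFTIRQ, TOY_KTHREAD };

struct toy_napi {
    atomic_uint state;
    bool kthread_exists;  /* created once, never destroyed or parked */
};

static void toy_init(struct toy_napi *n)
{
    atomic_init(&n->state, 0u);
    n->kthread_exists = false;
}

/* Models dev_set_threaded(): turning it off only clears the bit. */
static void toy_set_threaded(struct toy_napi *n, bool on)
{
    if (on) {
        n->kthread_exists = true;
        atomic_fetch_or(&n->state, TOY_THREADED);
    } else {
        atomic_fetch_and(&n->state, ~TOY_THREADED);
    }
}

/* Models napi_schedule_prep(): succeeds only if SCHED was clear. */
static bool toy_schedule_prep(struct toy_napi *n)
{
    unsigned int old = atomic_fetch_or(&n->state, TOY_SCHED);

    return !(old & TOY_SCHED);
}

/* Models ____napi_schedule(): picks kthread or softirq, never both. */
static enum toy_poller toy_schedule(struct toy_napi *n)
{
    if (!toy_schedule_prep(n))
        return TOY_NONE;
    if ((atomic_load(&n->state) & TOY_THREADED) && n->kthread_exists)
        return TOY_KTHREAD;
    return TOY_SOFTIRQ;
}

/* Models a thread loop that tests only SCHED, whoever set it. */
static bool toy_kthread_would_poll(struct toy_napi *n)
{
    return n->kthread_exists &&
           (atomic_load(&n->state) & TOY_SCHED) != 0;
}
```

In this model, after toy_set_threaded(&n, true) followed by
toy_set_threaded(&n, false), toy_schedule(&n) returns TOY_SOFTIRQ while
toy_kthread_would_poll(&n) is still true, which is the overlap
described above.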
> > > while (!kthread_should_stop() && !napi_disable_pending(napi)) {
> > > - if (test_bit(NAPI_STATE_SCHED, &napi->state)) {
> > > + unsigned long val = READ_ONCE(napi->state);
> > > +
> > > + if (val & NAPIF_STATE_SCHED &&
> > > + !(val & NAPIF_STATE_SCHED_BUSY_POLL)) {
> >
> > Again, not protected from the napi_disable() case AFAICT.
>
> Hmmm..... Yes. I think you are right. I missed that napi_disable()
> also grabs the SCHED bit. In this case, I think we have to use the
> SCHED_THREADED bit. The SCHED_BUSY_POLL bit is not enough to protect
> the race between napi_disable() and napi_threaded_poll(). :(
> Sorry, I missed this point when evaluating both solutions. I will have
> to switch to use the SCHED_THREADED bit.

Alright, AFAICT SCHED_THREADED doesn't suffer from either of the
problems I brought up here.
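
As a rough illustration of why a separate SCHED_THREADED bit avoids
both problems, the following toy userspace model in C11 has the thread
poll only when the scheduler explicitly handed the instance to it. This
is a sketch of the idea as discussed in this thread, not the actual
patch: the demo_* names and the simplified logic are assumptions, and
details such as when exactly the real code sets or clears the bit are
left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Toy model only; these bits stand in for the kernel's state bits. */
#define DEMO_SCHED          (1u << 0)  /* models NAPI_STATE_SCHED */
#define DEMO_THREADED       (1u << 1)  /* models NAPI_STATE_THREADED */
#define DEMO_SCHED_THREADED (1u << 2)  /* models the SCHED_THREADED bit */

enum demo_poller { DEMO_NONE, DEMO_SOFTIRQ, DEMO_KTHREAD };

struct demo_napi {
    atomic_uint state;
    bool kthread_exists;
};

static void demo_init(struct demo_napi *n)
{
    atomic_init(&n->state, 0u);
    n->kthread_exists = false;
}

static void demo_set_threaded(struct demo_napi *n, bool on)
{
    if (on) {
        n->kthread_exists = true;
        atomic_fetch_or(&n->state, DEMO_THREADED);
    } else {
        atomic_fetch_and(&n->state, ~DEMO_THREADED);
    }
}

/* Models any owner taking SCHED: napi_disable(), busy poll, schedule. */
static bool demo_grab_sched(struct demo_napi *n)
{
    unsigned int old = atomic_fetch_or(&n->state, DEMO_SCHED);

    return !(old & DEMO_SCHED);
}

/* Only the path that wakes the kthread marks SCHED_THREADED. */
static enum demo_poller demo_schedule(struct demo_napi *n)
{
    if (!demo_grab_sched(n))
        return DEMO_NONE;
    if ((atomic_load(&n->state) & DEMO_THREADED) && n->kthread_exists) {
        atomic_fetch_or(&n->state, DEMO_SCHED_THREADED);
        return DEMO_KTHREAD;
    }
    return DEMO_SOFTIRQ;
}

/* The thread polls only if it was explicitly handed the instance. */
static bool demo_kthread_may_poll(struct demo_napi *n)
{
    return (atomic_load(&n->state) & DEMO_SCHED_THREADED) != 0;
}

/* Models completion: ownership and the hand-off mark are dropped. */
static void demo_complete(struct demo_napi *n)
{
    atomic_fetch_and(&n->state, ~(DEMO_SCHED | DEMO_SCHED_THREADED));
}
```

In this model, a demo_grab_sched() on behalf of napi_disable() or busy
poll leaves demo_kthread_may_poll() false, and so does the rapid
set_threaded(0), schedule, set_threaded(1) sequence, because in both
cases SCHED was taken without the thread being handed the instance.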