Message-ID: <BN8PR15MB27873FF52B109480173366B8BD9E9@BN8PR15MB2787.namprd15.prod.outlook.com>
Date: Thu, 25 Feb 2021 01:22:08 +0000
From: Alexander Duyck <alexanderduyck@...com>
To: Jakub Kicinski <kuba@...nel.org>
CC: Eric Dumazet <edumazet@...gle.com>, Wei Wang <weiwan@...gle.com>,
"David S . Miller" <davem@...emloft.net>,
netdev <netdev@...r.kernel.org>, Paolo Abeni <pabeni@...hat.com>,
Hannes Frederic Sowa <hannes@...essinduktion.org>,
Martin Zaharinov <micron10@...il.com>
Subject: RE: [PATCH net] net: fix race between napi kthread mode and busy poll
> -----Original Message-----
> From: Jakub Kicinski <kuba@...nel.org>
> Sent: Wednesday, February 24, 2021 4:21 PM
> To: Alexander Duyck <alexanderduyck@...com>
> Cc: Eric Dumazet <edumazet@...gle.com>; Wei Wang
> <weiwan@...gle.com>; David S . Miller <davem@...emloft.net>; netdev
> <netdev@...r.kernel.org>; Paolo Abeni <pabeni@...hat.com>; Hannes
> Frederic Sowa <hannes@...essinduktion.org>; Martin Zaharinov
> <micron10@...il.com>
> Subject: Re: [PATCH net] net: fix race between napi kthread mode and busy
> poll
>
> On Thu, 25 Feb 2021 00:11:34 +0000 Alexander Duyck wrote:
> > > > We were trying to not pollute the list (with about 40 different
> > > > emails so far)
> > > >
> > > > (Note this was not something I initiated, I only hit Reply all
> > > > button)
> > > >
> > > > OK, I will shut up, since you seem to take over this matter, and
> > > > it is 1am here in France.
> > >
> > > Are you okay with adding a SCHED_THREADED bit for threaded NAPI to
> > > be set in addition to SCHED? At least that way the bit is associated
> > > with its user. IIUC, since the extra clear_bit() in busy poll was
> > > okay, a new set_bit() should be too?
> >
> > The problem with adding a bit for SCHED_THREADED is that you would
> > have to heavily modify napi_schedule_prep so that it would add the
> > bit. That is why I went with adding the bit to the busy poll logic
> > instead: it added no additional overhead. Adding another atomic
> > bit-setting operation, or heavily modifying the existing one, would
> > add considerable overhead, as it means either adding a complicated
> > conditional check to all NAPI calls or adding an atomic operation to
> > the threaded NAPI path.
>
> I wasn't thinking of modifying the main schedule logic, just the
> threaded parts:
>
> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> index ddf4cfc12615..6953005d06af 100644
> --- a/include/linux/netdevice.h
> +++ b/include/linux/netdevice.h
> @@ -360,6 +360,7 @@ enum {
>  	NAPI_STATE_IN_BUSY_POLL,	/* sk_busy_loop() owns this NAPI */
>  	NAPI_STATE_PREFER_BUSY_POLL,	/* prefer busy-polling over softirq processing*/
>  	NAPI_STATE_THREADED,		/* The poll is performed inside its own thread*/
> +	NAPI_STATE_SCHED_THREAD,	/* Thread owns the NAPI and will poll */
>  };
>
>  enum {
> diff --git a/net/core/dev.c b/net/core/dev.c
> index 6c5967e80132..23e53f971478 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -4294,6 +4294,7 @@ static inline void ____napi_schedule(struct softnet_data *sd,
>  	 */
>  	thread = READ_ONCE(napi->thread);
>  	if (thread) {
> +		set_bit(NAPI_STATE_SCHED_THREAD, &napi->state);
>  		wake_up_process(thread);
>  		return;
>  	}
> @@ -6486,7 +6487,8 @@ bool napi_complete_done(struct napi_struct *n, int work_done)
>  	WARN_ON_ONCE(!(val & NAPIF_STATE_SCHED));
>
>  	new = val & ~(NAPIF_STATE_MISSED | NAPIF_STATE_SCHED |
> -		      NAPIF_STATE_PREFER_BUSY_POLL);
> +		      NAPIF_STATE_PREFER_BUSY_POLL |
> +		      NAPIF_STATE_SCHED_THREAD);
>
>  	/* If STATE_MISSED was set, leave STATE_SCHED set,
>  	 * because we will call napi->poll() one more time.
> @@ -6971,7 +6973,9 @@ static int napi_thread_wait(struct napi_struct *napi)
>  	set_current_state(TASK_INTERRUPTIBLE);
>
>  	while (!kthread_should_stop() && !napi_disable_pending(napi)) {
> -		if (test_bit(NAPI_STATE_SCHED, &napi->state)) {
> +		if (test_bit(NAPI_STATE_SCHED_THREAD, &napi->state)) {
> +			WARN_ON(!test_bit(NAPI_STATE_SCHED,
> +					  &napi->state));
>  			WARN_ON(!list_empty(&napi->poll_list));
>  			__set_current_state(TASK_RUNNING);
>  			return 0;
Yeah, that was the patch Wei had done earlier. Eric complained about the extra set_bit atomic operation in the threaded path. That is when I came up with the idea of just adding a bit to the busy poll logic so that the only extra cost in the threaded path was having to check 2 bits instead of 1.