Message-ID: <CANn89iK-AQeDvdV-FSjqA6QPm=tSUJTqMZ2z8D1Dw401n-xPYg@mail.gmail.com>
Date: Thu, 18 Mar 2021 10:22:23 +0100
From: Eric Dumazet <edumazet@...gle.com>
To: Lijun Pan <ljp@...ux.ibm.com>
Cc: netdev <netdev@...r.kernel.org>, Jakub Kicinski <kuba@...nel.org>,
David Miller <davem@...emloft.net>, tlfalcon@...ux.ibm.com,
Alexei Starovoitov <ast@...nel.org>,
Daniel Borkmann <daniel@...earbox.net>,
Andrii Nakryiko <andriin@...com>, Wei Wang <weiwan@...gle.com>,
Cong Wang <cong.wang@...edance.com>,
Taehee Yoo <ap420073@...il.com>,
shemminger@...ux-foundation.org
Subject: Re: [PATCH net] net: core: avoid napi_disable to cause deadlock
On Thu, Mar 18, 2021 at 9:04 AM Lijun Pan <ljp@...ux.ibm.com> wrote:
>
> There are chances that napi_disable is called twice by NIC driver.
???
Please fix the buggy driver, or explain why it can not be fixed.
> This could generate deadlock. For example,
> the first napi_disable will spin until NAPI_STATE_SCHED is cleared
> by napi_complete_done, then set it again.
> When napi_disable is called the second time, it will loop infinitely
> because no dev->poll will be running to clear NAPI_STATE_SCHED.
>
>         CPU0                            CPU1
>    napi_disable
>    test_and_set_bit
>    (napi_complete_done clears
>     NAPI_STATE_SCHED, ret 0,
>     and set NAPI_STATE_SCHED)
>                                    napi_disable
>                                    test_and_set_bit
>                                    (ret 1 and loop infinitely because
>                                     no napi instance is scheduled to
>                                     clear NAPI_STATE_SCHED bit)
>
> Check the napi state bit to see whether napi is already disabled,
> and if so exit the call early to avoid spinning infinitely.
>
> Fixes: bea3348eef27 ("[NET]: Make NAPI polling independent of struct net_device objects.")
> Signed-off-by: Lijun Pan <ljp@...ux.ibm.com>
> ---
> net/core/dev.c | 18 ++++++++++++++++++
> 1 file changed, 18 insertions(+)
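
For readers following the thread, the hang described in the quoted
changelog can be modelled in userspace. The sketch below is not kernel
code: struct fake_napi, fake_napi_disable() and fake_napi_enable() are
hypothetical stand-ins, a C11 atomic_flag plays the role of the
NAPI_STATE_SCHED bit, and the spin is bounded by max_spins so the demo
terminates where the real loop would not. It assumes napi_disable()
waits by repeating test_and_set_bit() on NAPI_STATE_SCHED, as the
changelog describes.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model of the one bit of NAPI state discussed above. */
struct fake_napi {
	atomic_flag sched;	/* stands in for NAPI_STATE_SCHED */
};

/*
 * Model of the wait in napi_disable(): keep doing test-and-set until
 * the previous value was 0. The spin is bounded so that the demo can
 * report failure instead of hanging.
 * Returns true if the bit was acquired, false if max_spins ran out.
 */
static bool fake_napi_disable(struct fake_napi *n, int max_spins)
{
	for (int i = 0; i < max_spins; i++) {
		if (!atomic_flag_test_and_set(&n->sched))
			return true;	/* bit was clear, we now own it */
	}
	return false;			/* nobody cleared the bit */
}

/* Model of napi_enable(): release the bit taken by the disable. */
static void fake_napi_enable(struct fake_napi *n)
{
	atomic_flag_clear(&n->sched);
}

/* Walks through the sequence from the changelog. */
void demo_double_disable(void)
{
	struct fake_napi n = { ATOMIC_FLAG_INIT };

	/* First disable: bit is clear, so it is taken immediately. */
	assert(fake_napi_disable(&n, 1000));

	/* Second disable without an enable: the bit never clears. */
	assert(!fake_napi_disable(&n, 1000));

	/* A balanced enable/disable pair works again. */
	fake_napi_enable(&n);
	assert(fake_napi_disable(&n, 1000));
}
```

In this model the second call can only succeed if something clears the
bit in between, which is why the reply above asks for the driver's
disable/enable calls to be balanced rather than for a change in the
core.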