netdev - Re: [PATCH net v2] net: core: make napi

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <CAOhMmr7-GOQuF7TRQ28c9QC=ccLSCRk-TztxJTGYe-ZE_Afdpg@mail.gmail.com>
Date:   Wed, 14 Apr 2021 12:31:32 -0500
From:   Lijun Pan <lijunp213@...il.com>
To:     Yunsheng Lin <linyunsheng@...wei.com>
Cc:     netdev@...r.kernel.org
Subject: Re: [PATCH net v2] net: core: make napi_disable more robust

On Wed, Apr 14, 2021 at 3:45 AM Yunsheng Lin <linyunsheng@...wei.com> wrote:
>
> On 2021/4/14 16:08, Lijun Pan wrote:
> > There are chances that napi_disable can be called twice by NIC driver.
> > This could generate deadlock. For example,
> > the first napi_disable will spin until NAPI_STATE_SCHED is cleared
> > by napi_complete_done, then set it again.
> > When napi_disable is called the second time, it will loop infinitely
> > because no dev->poll will be running to clear NAPI_STATE_SCHED.
> >
> > Though it is driver writer's responsibility to make sure it being
> > called only once, making napi_disable more robust does not hurt, not
> > to say it can prevent a buggy driver from crashing a system.
> > So, we check the napi state bit to make sure that if napi is already
> > disabled, we exit the call early enough to avoid spinning infinitely.
> >
> > Fixes: bea3348eef27 ("[NET]: Make NAPI polling independent of struct net_device objects.")
> > Signed-off-by: Lijun Pan <lijunp213@...il.com>
> > ---
> > v2: justify that this patch makes napi_disable more robust.
> >
> >  net/core/dev.c | 18 ++++++++++++++++++
> >  1 file changed, 18 insertions(+)
> >
> > diff --git a/net/core/dev.c b/net/core/dev.c
> > index 1f79b9aa9a3f..fa0aa212b7bb 100644
> > --- a/net/core/dev.c
> > +++ b/net/core/dev.c
> > @@ -6830,6 +6830,24 @@ EXPORT_SYMBOL(netif_napi_add);
> >  void napi_disable(struct napi_struct *n)
> >  {
> >       might_sleep();
> > +
> > +     /* make sure napi_disable() runs only once,
> > +      * When napi is disabled, the state bits are like:
> > +      * NAPI_STATE_SCHED (set by previous napi_disable)
> > +      * NAPI_STATE_NPSVC (set by previous napi_disable)
> > +      * NAPI_STATE_DISABLE (cleared by previous napi_disable)
> > +      * NAPI_STATE_PREFER_BUSY_POLL (cleared by previous napi_complete_done)
> > +      * NAPI_STATE_MISSED (cleared by previous napi_complete_done)
> > +      */
> > +
> > +     if (napi_disable_pending(n))
> > +             return;
> > +     if (test_bit(NAPI_STATE_SCHED, &n->state) &&
> > +         test_bit(NAPI_STATE_NPSVC, &n->state) &&
> > +         !test_bit(NAPI_STATE_MISSED, &n->state) &&
> > +         !test_bit(NAPI_STATE_PREFER_BUSY_POLL, &n->state))
> > +             return;
>
> The NAPI_STATE_DISABLE is cleared at the end of napi_disable(),
> and if a buggy driver/hw triggers a interrupt and driver calls
> napi_schedule_irqoff(), which may set NAPI_STATE_MISSED
> if NAPI_STATE_SCHED is set(in napi_schedule_prep()), the above
> checking does not seem to handle it?

What I described in the commit message is the napi_disable() being
called from the same instance, same cpu. e.g.,
funcA {
    napi_disable();
    ...
    funcB{
        if (blah)
            napi_disable();
            ...
    }
    funcC;
}

The scenario you mentioned above seems to have napi already enabled
and scheduled, such that napi_schedule_prep() would set NAPI_STATE_MISSED.
The two scenarios are different per my understanding. Is there a way
that your scenario will finally call into my scenario?
Let me know if I understand you correctly.

Maybe testing NAPI_STATE_MISSED bit is not needed
because this bit is not that reliable.

Lijun