lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Thu, 30 Mar 2023 13:44:16 +0200
From:   Eric Dumazet <edumazet@...gle.com>
To:     Paolo Abeni <pabeni@...hat.com>
Cc:     Jason Xing <kerneljasonxing@...il.com>,
        "David S . Miller" <davem@...emloft.net>,
        Jakub Kicinski <kuba@...nel.org>,
        Jason Xing <kernelxing@...cent.com>, netdev@...r.kernel.org,
        eric.dumazet@...il.com
Subject: Re: [PATCH net-next 4/4] net: optimize ____napi_schedule() to avoid
 extra NET_RX_SOFTIRQ

On Thu, Mar 30, 2023 at 1:39 PM Paolo Abeni <pabeni@...hat.com> wrote:
>
> On Thu, 2023-03-30 at 17:50 +0800, Jason Xing wrote:
> > On Wed, Mar 29, 2023 at 7:53 AM Eric Dumazet <edumazet@...gle.com> wrote:
> > >
> > > ____napi_schedule() adds a napi into current cpu softnet_data poll_list,
> > > then raises NET_RX_SOFTIRQ to make sure net_rx_action() will process it.
> > >
> > > Idea of this patch is to not raise NET_RX_SOFTIRQ when being called indirectly
> > > from net_rx_action(), because we can process poll_list from this point,
> > > without going to full softirq loop.
> > >
> > > This needs a change in net_rx_action() to make sure we restart
> > > its main loop if sd->poll_list was updated without NET_RX_SOFTIRQ
> > > being raised.
> > >
> > > Signed-off-by: Eric Dumazet <edumazet@...gle.com>
> > > Cc: Jason Xing <kernelxing@...cent.com>
> > > ---
> > >  net/core/dev.c | 22 ++++++++++++++++++----
> > >  1 file changed, 18 insertions(+), 4 deletions(-)
> > >
> > > diff --git a/net/core/dev.c b/net/core/dev.c
> > > index f34ce93f2f02e7ec71f5e84d449fa99b7a882f0c..0c4b21291348d4558f036fb05842dab023f65dc3 100644
> > > --- a/net/core/dev.c
> > > +++ b/net/core/dev.c
> > > @@ -4360,7 +4360,11 @@ static inline void ____napi_schedule(struct softnet_data *sd,
> > >         }
> > >
> > >         list_add_tail(&napi->poll_list, &sd->poll_list);
> > > -       __raise_softirq_irqoff(NET_RX_SOFTIRQ);
> > > +       /* If not called from net_rx_action()
> > > +        * we have to raise NET_RX_SOFTIRQ.
> > > +        */
> > > +       if (!sd->in_net_rx_action)
> > > +               __raise_softirq_irqoff(NET_RX_SOFTIRQ);
> > >  }
> > >
> > >  #ifdef CONFIG_RPS
> > > @@ -6648,6 +6652,7 @@ static __latent_entropy void net_rx_action(struct softirq_action *h)
> > >         LIST_HEAD(list);
> > >         LIST_HEAD(repoll);
> > >
> > > +start:
> > >         sd->in_net_rx_action = true;
> > >         local_irq_disable();
> > >         list_splice_init(&sd->poll_list, &list);
> > > @@ -6659,9 +6664,18 @@ static __latent_entropy void net_rx_action(struct softirq_action *h)
> > >                 skb_defer_free_flush(sd);
> > >
> > >                 if (list_empty(&list)) {
> > > -                       sd->in_net_rx_action = false;
> > > -                       if (!sd_has_rps_ipi_waiting(sd) && list_empty(&repoll))
> > > -                               goto end;
> > > +                       if (list_empty(&repoll)) {
> > > +                               sd->in_net_rx_action = false;
> > > +                               barrier();
> > > +                               /* We need to check if ____napi_schedule()
> > > +                                * had refilled poll_list while
> > > +                                * sd->in_net_rx_action was true.
> > > +                                */
> > > +                               if (!list_empty(&sd->poll_list))
> > > +                                       goto start;
> >
> > I noticed that since we decide to go back and restart this loop, it
> > would be better to check the time_limit. More than that,
> > skb_defer_free_flush() can consume some time which is supposed to take
> > into account.
>
> Note that we can have a __napi_schedule() invocation with sd-
> >in_net_rx_action only after executing the napi_poll() call below and
> thus after the related time check (that is - after performing at least
> one full iteration of the main for(;;) loop).
>
> I don't think another check right here is needed.

I was about to say the same thing.

Back to READ_ONCE() and WRITE_ONCE(), I originally had them in my tree,
then simply realized they were not needed and confusing as a matter of fact.

barrier() is the right thing here, and only one is needed, because
others are implicit,
because before hitting reads, we must call out of line functions.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ