[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAL+tcoA-fwcq7LRPx7z=42YC_CNsrNMoUznCPf8OkkSdWyiJAA@mail.gmail.com>
Date: Thu, 6 Feb 2025 00:17:02 +0800
From: Jason Xing <kerneljasonxing@...il.com>
To: Eric Dumazet <edumazet@...gle.com>
Cc: "David S . Miller" <davem@...emloft.net>, Jakub Kicinski <kuba@...nel.org>,
Paolo Abeni <pabeni@...hat.com>, netdev@...r.kernel.org,
Kuniyuki Iwashima <kuniyu@...zon.com>, Simon Horman <horms@...nel.org>, eric.dumazet@...il.com
Subject: Re: [PATCH net-next] net: flush_backlog() small changes
On Thu, Feb 6, 2025 at 12:00 AM Eric Dumazet <edumazet@...gle.com> wrote:
>
> On Wed, Feb 5, 2025 at 4:22 PM Jason Xing <kerneljasonxing@...il.com> wrote:
> >
> > Hi Eric,
> >
> > On Tue, Feb 4, 2025 at 10:49 PM Eric Dumazet <edumazet@...gle.com> wrote:
> > >
> > > Add READ_ONCE() around reads of skb->dev->reg_state, because
> > > this field can be changed from other threads/cpus.
> > >
> > > Instead of calling dev_kfree_skb_irq() and kfree_skb()
> > > while interrupts are masked and locks held,
> > > use a temporary list and use __skb_queue_purge_reason()
> > >
> > > Use SKB_DROP_REASON_DEV_READY drop reason to better
> > > describe why these skbs are dropped.
> > >
> > > Signed-off-by: Eric Dumazet <edumazet@...gle.com>
> > > ---
> > > net/core/dev.c | 12 ++++++++----
> > > 1 file changed, 8 insertions(+), 4 deletions(-)
> > >
> > > diff --git a/net/core/dev.c b/net/core/dev.c
> > > index c0021cbd28fc11e4c4eb6184d98a2505fa674871..cd31e78a7d8a2229e3dc17d08bb638f862148823 100644
> > > --- a/net/core/dev.c
> > > +++ b/net/core/dev.c
> > > @@ -6119,16 +6119,18 @@ EXPORT_SYMBOL(netif_receive_skb_list);
> > > static void flush_backlog(struct work_struct *work)
> > > {
> > > struct sk_buff *skb, *tmp;
> > > + struct sk_buff_head list;
> > > struct softnet_data *sd;
> > >
> > > + __skb_queue_head_init(&list);
> > > local_bh_disable();
> > > sd = this_cpu_ptr(&softnet_data);
> > >
> > > backlog_lock_irq_disable(sd);
> > > skb_queue_walk_safe(&sd->input_pkt_queue, skb, tmp) {
> > > - if (skb->dev->reg_state == NETREG_UNREGISTERING) {
> > > + if (READ_ONCE(skb->dev->reg_state) == NETREG_UNREGISTERING) {
> > > __skb_unlink(skb, &sd->input_pkt_queue);
> > > - dev_kfree_skb_irq(skb);
> > > + __skb_queue_tail(&list, skb);
> >
> > I wonder why we cannot simply replace the above function with
> > 'dev_kfree_skb_irq_reason(skb, SKB_DROP_REASON_DEV_READY);'?
>
> Because this horribly expensive thing pushes packets to another perc-cpu list,
> and raises a softirq to perform the freeing later from BH.
Agreed about this case. How about changing kfree_skb_reason(skb, ...)?
>
>
> >
> > > rps_input_queue_head_incr(sd);
> > > }
> > > }
> > > @@ -6136,14 +6138,16 @@ static void flush_backlog(struct work_struct *work)
> > >
> > > local_lock_nested_bh(&softnet_data.process_queue_bh_lock);
> > > skb_queue_walk_safe(&sd->process_queue, skb, tmp) {
> > > - if (skb->dev->reg_state == NETREG_UNREGISTERING) {
> > > + if (READ_ONCE(skb->dev->reg_state) == NETREG_UNREGISTERING) {
> > > __skb_unlink(skb, &sd->process_queue);
> > > - kfree_skb(skb);
> > > + __skb_queue_tail(&list, skb);
> >
> > Same here.
>
> Please read the changelog, I think you missed the point.
I meant why not directly use kfree_skb_reason(skb, ...) here? It's
simpler, right? Then don't bother to use a temporary list.
>
> >
> > > rps_input_queue_head_incr(sd);
> > > }
> > > }
> > > local_unlock_nested_bh(&softnet_data.process_queue_bh_lock);
> > > local_bh_enable();
> > > +
> > > + __skb_queue_purge_reason(&list, SKB_DROP_REASON_DEV_READY);
> >
> > I'm also worried that dev_kfree_skb_irq() is not the same as
> > kfree_skb_reason() because of the following commit:
> > commit 7df5cb75cfb8acf96c7f2342530eb41e0c11f4c3
> > Author: Subash Abhinov Kasiviswanathan <quic_subashab@...cinc.com>
> > Date: Thu Jul 23 11:31:48 2020 -0600
> >
> > dev: Defer free of skbs in flush_backlog
> >
> > IRQs are disabled when freeing skbs in input queue.
> > Use the IRQ safe variant to free skbs here.
> >
> > Fixes: 145dd5f9c88f ("net: flush the softnet backlog in process context")
> > Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@...eaurora.org>
> > Signed-off-by: David S. Miller <davem@...emloft.net>
>
> The point of this patch is to no longer attempt the kfree_skb() while being
> in hard or soft irq blocking sections.
>
> Therefore call efficient kfree_skb() instead of expensive fallbacks
> that were designed
> for callers in hard irq contexts.
Thanks for the explanation!
Thanks,
Jason
Powered by blists - more mailing lists