Message-ID: <willemdebruijn.kernel.39472926ed88e@gmail.com>
Date: Sat, 31 Jan 2026 12:41:42 -0500
From: Willem de Bruijn <willemdebruijn.kernel@...il.com>
To: Jakub Kicinski <kuba@...nel.org>,
fengwei_yin@...ux.alibaba.com,
Willem de Bruijn <willemdebruijn.kernel@...il.com>
Cc: davem@...emloft.net,
edumazet@...gle.com,
pabeni@...hat.com,
horms@...nel.org,
netdev@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH] net: procfs: Fix RCU stall and soft lockup in ptype_seq_next()

Jakub Kicinski wrote:
> On Wed, 28 Jan 2026 15:03:59 +0800 fengwei_yin@...ux.alibaba.com wrote:
> > The root cause is in ptype_seq_next(): when iterating over packet
> > types, it's possible that a packet type entry (pt) has been removed,
> > its dev set to NULL, and pt->af_packet_net is not initialized.
> > In that case, the function may return the same 'nxt' pointer indefinitely.
> > This results in an infinite loop under RCU read-side critical section,
> > causing an RCU stall and eventually a soft lockup.
> >
> > Fix the issue by properly handling the case where 'nxt' points to
> > an empty list, ensuring forward progress in the iterator.
>
> > @@ -247,7 +247,7 @@ static void *ptype_seq_next(struct seq_file *seq, void *v, loff_t *pos)
> >
> > if (pt->af_packet_net) {
> > net_ptype_all:
> > - if (nxt != &net->ptype_all && nxt != &net->ptype_specific)
> > + if (!list_empty(nxt) && nxt != &net->ptype_all && nxt != &net->ptype_specific)
> > goto found;
> >
> > if (nxt == &net->ptype_all) {
> > @@ -267,6 +267,9 @@ static void *ptype_seq_next(struct seq_file *seq, void *v, loff_t *pos)
> > return NULL;
> > nxt = ptype_base[hash].next;
> > }
> > +
> > + if (list_empty(nxt))
> > + return NULL;
> > found:
> > return list_entry(nxt, struct packet_type, list);
> > }
>
> I'm not sure this fix works, TBH, we're dealing with an RCU list here.
> The elements are not deleted with list_del_init(), so they won't
> look "empty".
>
> If the pt entries are under RCU protection I think the issue is that
> af_packet is clearing pt->dev before waiting for the grace period to
> expire.
>
> Willem, is there a reason for that or just convenience?

That would be wrong. Do we see it doing that somewhere?

These handlers should get removed with dev_remove_pack. Or
__dev_remove_pack and observe the RCU grace period some other way.

I can review these, but was not aware of any abuses.

> --
> pw-bot: cr
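
Editorial illustration (not part of the original message): the point above, that entries removed from an RCU list "won't look empty", can be sketched in userspace C. This is a minimal sketch under stated assumptions: the `demo_` helpers are hypothetical stand-ins that only mirror the pointer updates of the kernel's list_del_init() and list_del_rcu(), `DEMO_POISON` is a placeholder rather than the kernel's real LIST_POISON2 value, and no actual RCU or concurrency is modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's struct list_head. */
struct demo_list_head {
	struct demo_list_head *next, *prev;
};

/* Placeholder poison value; the kernel's LIST_POISON2 differs. */
#define DEMO_POISON ((struct demo_list_head *)0x122)

void demo_init(struct demo_list_head *h)
{
	h->next = h;
	h->prev = h;
}

/* Same test as the kernel's list_empty(): does next point back at itself? */
bool demo_list_empty(const struct demo_list_head *h)
{
	return h->next == h;
}

/* Insert entry right after head. */
void demo_list_add(struct demo_list_head *entry, struct demo_list_head *head)
{
	entry->next = head->next;
	entry->prev = head;
	head->next->prev = entry;
	head->next = entry;
}

/* Unlink entry from its neighbours without touching entry's own pointers. */
void demo_unlink(struct demo_list_head *entry)
{
	entry->next->prev = entry->prev;
	entry->prev->next = entry->next;
}

/* Mirrors list_del_init(): unlink, then make the entry self-referential. */
void demo_del_init(struct demo_list_head *entry)
{
	demo_unlink(entry);
	demo_init(entry);
}

/*
 * Mirrors list_del_rcu(): unlink, keep entry->next intact so that readers
 * still standing on the entry can walk on, and poison only entry->prev.
 */
void demo_del_rcu(struct demo_list_head *entry)
{
	demo_unlink(entry);
	entry->prev = DEMO_POISON;
}

/* Does an entry removed the RCU way pass a list_empty() style check? */
bool removed_entry_looks_empty_rcu(void)
{
	struct demo_list_head head, a, b;

	demo_init(&head);
	demo_list_add(&a, &head);
	demo_list_add(&b, &head);	/* head -> b -> a -> head */
	demo_del_rcu(&a);		/* a.next still points at head */
	return demo_list_empty(&a);
}

/* Same question for an entry removed the list_del_init() way. */
bool removed_entry_looks_empty_init(void)
{
	struct demo_list_head head, a, b;

	demo_init(&head);
	demo_list_add(&a, &head);
	demo_list_add(&b, &head);
	demo_del_init(&a);		/* a.next now points at a itself */
	return demo_list_empty(&a);
}
```

In this sketch, removed_entry_looks_empty_rcu() returns false and removed_entry_looks_empty_init() returns true, which is why a list_empty(nxt) check like the one in the proposed patch would not be expected to notice an entry that was unlinked with the RCU variant.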