Message-ID: <20250124173133.71b4df3c@kernel.org>
Date: Fri, 24 Jan 2025 17:31:33 -0800
From: Jakub Kicinski <kuba@...nel.org>
To: Toke Høiland-Jørgensen <toke@...hat.com>
Cc: Mina Almasry <almasrymina@...gle.com>, davem@...emloft.net,
netdev@...r.kernel.org, edumazet@...gle.com, pabeni@...hat.com,
andrew+netdev@...n.ch, horms@...nel.org, hawk@...nel.org,
ilias.apalodimas@...aro.org, asml.silence@...il.com, kaiyuanz@...gle.com,
willemb@...gle.com, mkarsten@...terloo.ca, jdamato@...tly.com
Subject: Re: [PATCH net] net: page_pool: don't try to stash the napi id
On Fri, 24 Jan 2025 23:18:08 +0100 Toke Høiland-Jørgensen wrote:
> > The reading paths in page_pool.c don't hold the lock, no? Only the
> > reading paths in page_pool_user.c seem to do.
> >
> > I could not immediately wrap my head around why pool->p.napi can be
> > accessed in page_pool_napi_local with no lock, but needs to be
> > protected in the code in page_pool_user.c. It seems
> > READ_ONCE/WRITE_ONCE protection is good enough to make sure
> > page_pool_napi_local doesn't race with
> > page_pool_disable_direct_recycling in a way that can crash (the
> > reading code either sees a valid pointer or NULL). Why is that not
> > good enough to also synchronize the accesses between
> > page_pool_disable_direct_recycling and page_pool_nl_fill? I.e., drop
> > the locking?
>
> It actually seems that this is *not* currently the case. See the
> discussion here:
>
> https://lore.kernel.org/all/8734h8qgmz.fsf@toke.dk/
>
> IMO (as indicated in the message linked above), we should require users
> to destroy the page pool before freeing the NAPI memory, rather than add
> additional synchronisation.
Agreed in general but this is a slightly different case.
This sequence should be legal IMHO:

	page_pool_disable_direct_recycling()
	napi_disable()
	netif_napi_del()
	# free NAPI
	page_pool_destroy()

I'm not saying it's a good idea, but since
page_pool_disable_direct_recycling() detaches the NAPI,
logically someone could assume the above works.

I agree with you on datapath accesses, as discussed in the thread you
linked. But here the reader is not under RCU, so the RCU sync in NAPI
destruction does not protect us from the reader stalling for a long time.
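
To make the concern concrete, here is a rough userspace model, not kernel
code. Every name in it (model_pool, model_detach_napi, model_read_napi_id,
model_teardown_sequence) is made up for illustration, and a pthread mutex
stands in for whatever lock the netlink reader takes. The idea it sketches:
with a bare READ_ONCE() a reader could load the pointer, stall, and
dereference it after the NAPI memory is gone, whereas a reader that loads and
dereferences under the same lock the detach takes either finishes before the
detach or sees NULL.

```c
/* Userspace model only: all names here are hypothetical, not kernel API. */
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

struct model_napi {
	unsigned int id;
};

struct model_pool {
	pthread_mutex_t lock;		/* stands in for the lock readers take */
	struct model_napi *napi;	/* stands in for pool->p.napi */
};

/* Detach: after this returns, no locked reader still holds the old pointer. */
static void model_detach_napi(struct model_pool *pool)
{
	pthread_mutex_lock(&pool->lock);
	pool->napi = NULL;
	pthread_mutex_unlock(&pool->lock);
}

/* Reader: the load and the dereference share one critical section. */
static unsigned int model_read_napi_id(struct model_pool *pool)
{
	unsigned int id = 0;

	pthread_mutex_lock(&pool->lock);
	if (pool->napi)
		id = pool->napi->id;
	pthread_mutex_unlock(&pool->lock);

	return id;
}

/*
 * Walk the teardown order from the mail: detach first, free the NAPI,
 * destroy the pool last. Reports the id seen before and after the detach.
 * Returns 0 on success, -1 if the allocation fails.
 */
static int model_teardown_sequence(unsigned int *before, unsigned int *after)
{
	struct model_pool pool;
	struct model_napi *napi = malloc(sizeof(*napi));

	if (!napi)
		return -1;

	napi->id = 42;
	pthread_mutex_init(&pool.lock, NULL);
	pool.napi = napi;

	*before = model_read_napi_id(&pool);

	model_detach_napi(&pool);	/* models page_pool_disable_direct_recycling() */
	free(napi);			/* models napi_disable(), netif_napi_del(), free */

	*after = model_read_napi_id(&pool);	/* late reader sees NULL, not freed memory */

	pthread_mutex_destroy(&pool.lock);	/* models page_pool_destroy() */
	return 0;
}
```

Calling model_teardown_sequence() from a small main() exercises the order
above. The model is single-threaded, so it only shows the invariant the lock
provides, not the race itself.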