Message-ID: <20241119-magnetic-striped-pig-bcffa9@leitao>
Date: Tue, 19 Nov 2024 02:22:06 -0800
From: Breno Leitao <leitao@...ian.org>
To: Herbert Xu <herbert@...dor.apana.org.au>
Cc: "David S. Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>,
Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
Simon Horman <horms@...nel.org>,
Stephen Hemminger <stephen@...workplumber.org>,
netdev@...r.kernel.org, linux-kernel@...r.kernel.org,
paulmck@...nel.org, Cong Wang <cong.wang@...edance.com>
Subject: Re: [PATCH net 1/2] netpoll: Use rcu_access_pointer() in
__netpoll_setup
Hello Herbert,
On Tue, Nov 19, 2024 at 11:28:33AM +0800, Herbert Xu wrote:
> On Mon, Nov 18, 2024 at 03:15:17AM -0800, Breno Leitao wrote:
> > The ndev->npinfo pointer in __netpoll_setup() is RCU-protected but is being
> > accessed directly for a NULL check. While no RCU read lock is held in this
> > context, we should still use proper RCU primitives for consistency and
> > correctness.
> >
> > Replace the direct NULL check with rcu_access_pointer(), which is the
> > appropriate primitive when only checking for NULL without dereferencing
> > the pointer. This function provides the necessary ordering guarantees
> > without requiring RCU read-side protection.
> >
> > Signed-off-by: Breno Leitao <leitao@...ian.org>
> > Fixes: 8fdd95ec162a ("netpoll: Allow netpoll_setup/cleanup recursion")
> > ---
> > net/core/netpoll.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/net/core/netpoll.c b/net/core/netpoll.c
> > index aa49b92e9194babab17b2e039daf092a524c5b88..45fb60bc4803958eb07d4038028269fc0c19622e 100644
> > --- a/net/core/netpoll.c
> > +++ b/net/core/netpoll.c
> > @@ -626,7 +626,7 @@ int __netpoll_setup(struct netpoll *np, struct net_device *ndev)
> > goto out;
> > }
> >
> > - if (!ndev->npinfo) {
> > + if (!rcu_access_pointer(ndev->npinfo)) {
> > npinfo = kmalloc(sizeof(*npinfo), GFP_KERNEL);
> > if (!npinfo) {
> > err = -ENOMEM;
>
> This is completely bogus. Think about it, we are setting ndev->npinfo,
> meaning that we must have some form of synchronisation over this that
> guarantees us to be the only writer.
Correct. __netpoll_setup() should have the RTNL lock held. In the most
common case, it is done through:
	netpoll_setup() {
		rtnl_lock();
		...
		__netpoll_setup()
		...
		rtnl_unlock();
	}
> So why does it need RCU protection for reading?
Good question. As I understand it, this makes the accesses to the RCU
pointer explicit. In fact, the same function that this patch changes
(__netpoll_setup) later uses rtnl_dereference(), and it does so under the
same RTNL lock.
Moreover, looking at the RCU documentation, there is an explicit example
of this in Documentation/RCU/Design/Requirements/Requirements.rst, in
the "Performance and Scalability" section. It says:
	spin_lock(&gp_lock);
	p = rcu_access_pointer(gp);
	if (!p) {
		spin_unlock(&gp_lock);
		return false;
	}
> Assuming that this code isn't completely bonkers, then the correct
> primitive to use should be rcu_dereference_protected.
I looked at rcu_dereference_protected() as well, and I thought it is
used when you are dereferencing the pointer, which is a more expensive
approach. In the code above, the code basically needs to check whether the
pointer is assigned or not. Looking at the code, having
rcu_access_pointer() inside the update lock seems to be a common pattern,
and that is why I chose it.
On the other hand, I understand we want to call an RCU primitive with
the _protected() context, so I looked for a possible
`rcu_access_pointer_protected()`, but this does not exist. Anyway,
I am happy to change it, if it is the correct API.
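To make the two options concrete, here is a minimal userspace sketch of
both NULL checks. Everything prefixed with mock_ is a hypothetical
stand-in that I made up for illustration, not the kernel definition: the
real rcu_access_pointer() and rcu_dereference_protected() also carry
sparse annotations, and the real lockdep check only exists in debug
builds. In the kernel, rtnl_dereference(p) is shorthand for
rcu_dereference_protected(p, lockdep_rtnl_is_held()).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-ins (hypothetical, NOT the kernel definitions). */
struct mock_npinfo { int refcnt; };
struct mock_net_device { struct mock_npinfo *npinfo; };

/* Models "the caller holds RTNL". */
static bool mock_rtnl_held;

static void mock_rtnl_lock(void)   { mock_rtnl_held = true; }
static void mock_rtnl_unlock(void) { mock_rtnl_held = false; }

/* Models rcu_access_pointer(): fetch the value for comparison only. */
#define mock_rcu_access_pointer(p) (p)

/*
 * Models rcu_dereference_protected(p, c): same load, but it states (and
 * here asserts) the condition that makes the access safe.
 */
#define mock_rcu_dereference_protected(p, c) (assert(c), (p))

/* Variant 1: NULL check as in the posted patch. */
static bool mock_needs_alloc_access(struct mock_net_device *ndev)
{
	return !mock_rcu_access_pointer(ndev->npinfo);
}

/* Variant 2: NULL check as suggested in review. */
static bool mock_needs_alloc_protected(struct mock_net_device *ndev)
{
	return !mock_rcu_dereference_protected(ndev->npinfo, mock_rtnl_held);
}
```

Both variants return the same answer when the lock is held. The
difference in this sketch is that variant 2 also checks that the caller
really holds the update-side lock, which variant 1 cannot express.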
> Fixes header should be set to the commit that introduced the broken
> RCU marking:
>
> commit 5fbee843c32e5de2d8af68ba0bdd113bb0af9d86
> Author: Cong Wang <amwang@...hat.com>
> Date: Tue Jan 22 21:29:39 2013 +0000
>
> netpoll: add RCU annotation to npinfo field
When 8fdd95ec162a was created, npinfo was already an RCU pointer, although
without the RCU annotation that came later (5fbee843c). That is the
reason I chose to fix 8fdd95ec162a.
For instance, checking out 8fdd95ec162a, at the end of
__netpoll_setup() I see the following RCU usage, indicating that
ndev->npinfo was an RCU-protected pointer:
	/* last thing to do is link it to the net device structure */
	rcu_assign_pointer(ndev->npinfo, npinfo);
Thanks for the feedback and the good pointers,
--breno