[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAMArcTUYz=fv-Uisk-9xFDZBKwt0Es+dm97iZaRjs_q4_5d-Mw@mail.gmail.com>
Date: Mon, 12 May 2025 10:05:00 +0900
From: Taehee Yoo <ap420073@...il.com>
To: Jakub Kicinski <kuba@...nel.org>
Cc: Mina Almasry <almasrymina@...gle.com>, davem@...emloft.net, pabeni@...hat.com,
edumazet@...gle.com, horms@...nel.org, sdf@...ichev.me,
netdev@...r.kernel.org, asml.silence@...il.com, dw@...idwei.uk,
skhawaja@...gle.com, kaiyuanz@...gle.com, jdamato@...tly.com
Subject: Re: [PATCH net v3] net: devmem: fix kernel panic when netlink socket
close after module unload
On Sat, May 10, 2025 at 7:32 AM Jakub Kicinski <kuba@...nel.org> wrote:
>
Hi Jakub,
Thanks a lot for the review!
> On Fri, 9 May 2025 12:43:42 -0700 Mina Almasry wrote:
> > > @@ -117,9 +124,6 @@ void net_devmem_unbind_dmabuf(struct net_devmem_dmabuf_binding *binding)
> > > unsigned long xa_idx;
> > > unsigned int rxq_idx;
> > >
> > > - if (binding->list.next)
> > > - list_del(&binding->list);
> > > -
> >
> > Unfortunately if you're going to delete this, then you need to do
> > list_del in _all_ the callers of net_devmem_unbind_dmabuf, and I think
> > there is a callsite in netdev_nl_bind_rx_doit that is missed?
> >
> > But also, it may rough to continually have to remember to always do
> > list_del when we do unbind. AFAIR Jakub asked for uniformity in the
> > bind/unbind functions. Can we instead do the list_add inside of
> > net_devmem_bind_dmabuf? So net_devmem_bind_dmabuf can take the struct
> > list_head as an arg and do the list add, then the unbind can do the
> > list_del, so it is uniform, but we don't have to remember to do
> > list_add/del everytime we call bind/unbind.
> >
> > Also, I suspect that clean up can be a separate patch.
>
> Right. Let's leave it for a cleanup. And you can also inline
> net_devmem_unset_dev() in that case. My ask was to separate
> devmem logic from socket logic more clearly but the "new lock"
> approach doesn't really go in such direction. It's good enough
> for the fix tho.
Thanks!
I will make net_devmem_unset_dev() the inline function.
>
> > > + struct mutex lock;
> > >
> > > /* The user holds a ref (via the netlink API) for as long as they want
> > > * the binding to remain alive. Each page pool using this binding holds
> > > diff --git a/net/core/netdev-genl.c b/net/core/netdev-genl.c
> > > index dae9f0d432fb..bd5d58604ec0 100644
> > > --- a/net/core/netdev-genl.c
> > > +++ b/net/core/netdev-genl.c
> > > @@ -979,14 +979,27 @@ void netdev_nl_sock_priv_destroy(struct netdev_nl_sock *priv)
> > > {
> > > struct net_devmem_dmabuf_binding *binding;
> > > struct net_devmem_dmabuf_binding *temp;
> > > + netdevice_tracker dev_tracker;
> > > struct net_device *dev;
> > >
> > > mutex_lock(&priv->lock);
> > > list_for_each_entry_safe(binding, temp, &priv->bindings, list) {
> > > + list_del(&binding->list);
> > > +
> > > + mutex_lock(&binding->lock);
> > > dev = binding->dev;
> > > + if (!dev) {
> > > + mutex_unlock(&binding->lock);
> > > + net_devmem_unbind_dmabuf(binding);
> > > + continue;
> > > + }
> > > + netdev_hold(dev, &dev_tracker, GFP_KERNEL);
> > > + mutex_unlock(&binding->lock);
> > > +
> >
> > Consider writing the above lines as something like:
> >
> > mutex_lock(&binding->lock);
> > if (binding->dev) {
> > netdev_hold(binding->dev, &dev_tracker, GPF_KERNEL);
> > }
> >
> > net_devmem_unbind_dmabuf(binding);
> >
> > if (binding->dev) {
> > netdev_put(binding->dev, &dev_tracker);
> > }
> > mutex_unlock(&binding->lock);
> >
> > i.e., don't duplicate the net_devmem_unbind_dmabuf(binding); call.
>
> I think it's fine as is.
>
> > Other than that, I could not find issues. I checked lock ordering. The
> > lock hierarchy is:
> >
> > priv->lock
> > binding->lock
> > netdev_lock(dev)
>
> Did you mean:
>
> priv->lock
> netdev_lock(dev)
> binding->lock
Thanks a lot!
Taehee Yoo
Powered by blists - more mailing lists