[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20250416172711.0c6a1da8@kernel.org>
Date: Wed, 16 Apr 2025 17:27:11 -0700
From: Jakub Kicinski <kuba@...nel.org>
To: Mina Almasry <almasrymina@...gle.com>
Cc: Stanislav Fomichev <stfomichev@...il.com>, Taehee Yoo
<ap420073@...il.com>, davem@...emloft.net, pabeni@...hat.com,
edumazet@...gle.com, andrew+netdev@...n.ch, horms@...nel.org,
asml.silence@...il.com, dw@...idwei.uk, sdf@...ichev.me,
skhawaja@...gle.com, simona.vetter@...ll.ch, kaiyuanz@...gle.com,
netdev@...r.kernel.org
Subject: Re: [PATCH net] net: devmem: fix kernel panic when socket close
after module unload
On Wed, 16 Apr 2025 08:47:14 -0700 Mina Almasry wrote:
> > Right, tho a bit of work and tricky handling will be necessary to get
> > that right. We're not holding a ref on binding->dev.
> >
> > I think we need to invert the socket mutex vs instance lock ordering.
> > Make the priv mutex protect the binding->list and binding->dev.
> > For that to work the binding needs to also store a pointer to its
> > owning socket?
> >
> > Then in both uninstall paths (from socket and from netdev unreg) we can
> > take the socket mutex, delete from list, clear the ->dev pointer,
> > unlock, release the ref on the binding.
>
> I don't like that the ref obtained by the socket can be released by
> both the socket and the netdev unreg :( It creates a weird mental
> model where the ref owned by the socket can actually be dropped by the
> netdev unreg path and then the socket close needs to detect that
> something else dropped its ref. It also creates a weird scenario where
> the device got unregistered and reregistered (I assume that's
> possible? Or no?) and the socket is alive and the device is registered
> but actually the binding is not active.
I agree. But it's be best I could come up with (and what we ended up
with in io-uring)...
> > The socket close path would probably need to lock the socket, look at
> > the first entry, if entry has ->dev call netdev_hold(), release the
> > socket, lock the netdev, lock the socket again, look at the ->dev, if
> > NULL we raced - done. If not NULL release the socket, call unbind.
> > netdev_put(). Restart this paragraph.
> >
> > I can't think of an easier way.
> >
>
> How about, roughly:
>
> - the binding holds a ref on dev, making sure that the dev is alive
> until the last ref on the binding is dropped and the binding is freed.
> - The ref owned by the socket is only ever dropped by the socket close.
> - When we netdev_lock(binding->dev) to later do a
> net_devmem_dmabuf_unbind, we must first grab another ref on the
> binding->dev, so that it doesn't get freed if the unbind drops the
> last ref.
Right now you can't hold a reference on a netdevice forever.
You have to register a notifier and when NETDEV_UNREGISTER triggers
you must give up the reference you took. Also, fun note, it is illegal
to take an "additional reference". You must re-lookup the device or
otherwise safely ensure device is not getting torn down.
See netdev_wait_allrefs_any(), that blocks whoever called unregister
until all refs are reclaimed.
> I think that would work too?
>
> Can you remind me why we do a dev_memory_provider_uninstall on a
> device unregister? If the device gets unregistered then re-registered
> (again, I'm kinda assuming that is possible, I'm not sure)
It's not legal right now. I think there's a BUG_ON() somewhere.
> I expect it
> to still be memory provider bound, because the netlink socket is still
> alive and the userspace is still expecting a live binding. Maybe
> delete the dev_memory_provider_uninstall code I added on unregister,
> and sorry I put it there...? Or is there some reason I'm forgetting
> that we have to uninstall the memory provider on unregister?
IIRC bound_rxqs will point to freed memory once the netdev is gone.
If we had a ref on the netdev then yeah we could possibly potentially
keep the queues around. But holding refs on a netdev is.. a topic for
another time. I'm trying to limit amount of code we'd need to revert
if the instance locking turns out to be fundamentally broken :S
Powered by blists - more mailing lists