[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20221123181800.1e41e8c8@kernel.org>
Date: Wed, 23 Nov 2022 18:18:00 -0800
From: Jakub Kicinski <kuba@...nel.org>
To: Ido Schimmel <idosch@...sch.org>
Cc: Yang Yingliang <yangyingliang@...wei.com>,
Leon Romanovsky <leon@...nel.org>, netdev@...r.kernel.org,
jiri@...dia.com, davem@...emloft.net, edumazet@...gle.com,
pabeni@...hat.com
Subject: Re: [PATCH net] net: devlink: fix UAF in
devlink_compat_running_version()
On Wed, 23 Nov 2022 21:18:14 +0200 Ido Schimmel wrote:
> > I used the fix code proposed by Jakub, but it didn't work correctly, so
> > I tried to correct and improve it, and need some devlink helper.
> >
> > Anyway, it is a nsim problem, if we want fix this without touch devlink,
> > I think we can add a 'registered' field in struct nsim_dev, and it can be
> > checked in nsim_get_devlink_port() like this:
>
> I read the discussion and it's not clear to me why this is a netdevsim
> specific problem. The fundamental problem seems to be that it is
> possible to hold a reference on a devlink instance before it's
> registered and that devlink_free() will free the instance regardless of
> its current reference count because it expects devlink_unregister() to
> block. In this case, the instance was never registered, so
> devlink_unregister() is not called.
>
> ethtool was able to get a reference on the devlink instance before it
> was registered because netdevsim registers its netdevs before
> registering its devlink instance. However, netdevsim is not the only one
> doing this: funeth, ice, prestera, mlx4, mlxsw, nfp and potentially
> others do the same thing.
>
> When you think about it, it's strange that it's even possible for
> ethtool to reach the driver when the netdev used in the request is long
> gone, but it's not holding a reference on the netdev (it's holding a
> reference on the devlink instance instead) and
> devlink_compat_running_version() is called without RTNL.
Indeed. We did a bit of a flip-flop with the devlink locking rules
and the fact that the instance is reachable before it is registered
is a leftover from a previous restructuring :(
Hence my preference to get rid of the ordering at the driver level
than to try to patch it up in the code. Dunno if that's convincing.
Powered by blists - more mailing lists