Message-ID: <ZBixdlVsR5dl3J7Y@nvidia.com>
Date: Mon, 20 Mar 2023 16:18:14 -0300
From: Jason Gunthorpe <jgg@...dia.com>
To: Leon Romanovsky <leon@...nel.org>
Cc: Patrisious Haddad <phaddad@...dia.com>,
"David S. Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>,
Jakub Kicinski <kuba@...nel.org>, linux-rdma@...r.kernel.org,
netdev@...r.kernel.org, Paolo Abeni <pabeni@...hat.com>,
Saeed Mahameed <saeedm@...dia.com>
Subject: Re: [PATCH rdma-next v1 2/3] RDMA/mlx5: Handling dct common resource
destruction upon firmware failure
On Thu, Mar 16, 2023 at 03:39:27PM +0200, Leon Romanovsky wrote:
> From: Patrisious Haddad <phaddad@...dia.com>
>
> Previously when destroying a DCT, if the firmware function for the
> destruction failed, the common resource would have been destroyed
> either way, since it was destroyed before the firmware object.
> Which leads to kernel warning "refcount_t: underflow" which indicates
> possible use-after-free.
> Which is triggered when we try to destroy the common resource for the
> second time and execute refcount_dec_and_test(&common->refcount).
>
> So, currently before destroying the common resource we check its
> refcount and continue with the destruction only if it isn't zero.

This seems super sketchy.

If the destruction fails, why not set the refcount back to 1?
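
To make the suggestion concrete, here is a minimal user-space sketch of the idea, not the actual mlx5 code. The names `common_res`, `fw_destroy` and `destroy_dct` are hypothetical stand-ins, and C11 atomics are used in place of the kernel's refcount_t. It assumes a single thread and leaves out waiting for other reference holders.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the common resource; not the kernel struct. */
struct common_res {
	atomic_int refcount;
};

/* Hypothetical firmware destroy command: 0 on success, negative on failure. */
static int fw_destroy(bool fail)
{
	return fail ? -5 : 0;
}

/*
 * Drop the initial reference, then ask the (simulated) firmware to
 * destroy the object. If the firmware refuses, put the reference back
 * so that a later retry decrements from 1 again instead of underflowing.
 */
int destroy_dct(struct common_res *c, bool fw_fails)
{
	int err;

	assert(c != NULL);

	/* Common-resource teardown: drop the initial reference. */
	atomic_fetch_sub(&c->refcount, 1);

	err = fw_destroy(fw_fails);
	if (err) {
		/* The firmware object is still alive: restore the reference. */
		atomic_store(&c->refcount, 1);
		return err;
	}

	return 0;
}
```

With the count starting at 1, a failed call leaves it at 1 and a later successful call brings it to 0, so the second attempt never sees a zero count and no "is it already zero?" check is needed.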
Jason