lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <3cf0890b-4eb0-e70e-cd9c-2ecc3d496263@hpe.com>
Date:   Fri, 2 Jun 2023 14:11:36 -0700 (PDT)
From:   Paul Cassella <cassella@....com>
To:     Ira Weiny <ira.weiny@...el.com>
CC:     <dan.j.williams@...el.com>,
        Yongqiang Liu <liuyongqiang13@...wei.com>,
        <linux-kernel@...r.kernel.org>, <nvdimm@...ts.linux.dev>,
        <vishal.l.verma@...el.com>, <dave.jiang@...el.com>,
        <akpm@...ux-foundation.org>, <joao.m.martins@...cle.com>,
        <zhangxiaoxu5@...wei.com>, <linux-cxl@...r.kernel.org>
Subject: Re: [PATCH] dax/hmem: Fix refcount leak in dax_hmem_probe()

On Fri, 2 Jun 2023, Ira Weiny wrote:
> Paul Cassella wrote:
> > On Sat, 3 Dec 2022, Ira Weiny wrote:
> > > On Sat, Dec 03, 2022 at 09:58:58AM +0000, Yongqiang Liu wrote:

> > > > We should always call dax_region_put() whenever devm_create_dev_dax()
> > > > succeed or fail to avoid refcount leak of dax_region. Move the return
> > > > value check after dax_region_put().

> > > I think dax_region_put is called from dax_region_unregister() automatically on
> > > tear down.

> > Note the reference dax_region_unregister() will be putting is the one 
> > devm_create_dev_dax() takes by kref_get(&dax_region->kref).   I think 
> > dax_hmem_probe() needs to put its reference in the error case, as in the 
> > successful case.

> Looking at this again I'm inclined to agree that something is wrong.  But
> I'm not sure this patch fixes it. anything.

> When you say:
> 
> > ...   I think 
> > dax_hmem_probe() needs to put its reference in the error case, as in the 
> > successful case.
> 
> ... which kref_get() is dax_hmem_probe() letting go?

Isn't it letting go of the initial kref_init() reference from 
alloc_dax_region()?

Sorry, I had gone through the code a little carelessly yesterday.  Now I 
think that kref_init() reference is the one that dax_hmem_probe() is 
dropping in the success case, and which still needs to be dropped in the 
error case.

(If so, I think the alloc_dax_region() path that returns NULL on 
devm_add_action_or_reset() failure, releasing the kref_get reference, will 
leak the kref_init reference and the region.)



-- 
Paul Cassella


> Here is what I see with the current code.
> 
> dax_hmem_probe()
> 	alloc_dax_region()
> 		kref_get(&dax_region->kref)
> 		devm_add_action_or_reset(... dax_region_unregister, ...)
> 				... kref_put() later...
> 
> 	devm_create_dev_dax()
> 		...may return error...
> 		kref_get()
> 		[dev_dax_release() set to call kref_put() later]
> 		...may return error...
> 
> 	if not error
> 		dax_region_put() => kref_put()
> 
> I think this is an extra unneeded put???
> 
> Dan I see this pattern repeated in cxl and pmem.  Is the intent to remove
> the need for dax_region_unregister() to be called when the platform device
> unwinds because the platform device is not intended to own the dax_region
> after success?  If so it seems like the device managed call should be
> removed too.  Not just calling dax_region_put()?  :-/
> 
> But wouldn't that cause an issue with the sysfs entries created?
> 
> > 
> > Consider, devm_create_dev_dax() has error paths that return without 
> > involving dax_region_unregister(), prior to kref_get() and device_add().  
> > dax_hmem_probe() is clearly responsible for freeing the region in those 
> > cases.
> 
> No the devm_add_action_or_reset(... dax_region_unregister, ...) will cause
> dax_region_unregister() to release the reference when the platform device
> unwinds.
> 
> devm_create_dev_dax() configures a reference release through the
> dev_dax->type release.  So when the dev_dax device is put the dax_region
> will be released through dev_dax_release()->dax_region_put().
> 
> > 
> > 
> > dax_hmem_probe() drops its own reference in the successful case because 
> > (per the comment) "child dev_dax instances now own the lifetime of the 
> > dax_region".  That ownership is the reference that the error-case 
> > dax_region_unregister() is dropping.
> 
> No dax_region_unregister() is not just an error case flow.
> 
> >
> > dax_hmem_probe()'s initial reference 
> > also needs to be dropped in the error case, as in the successful case.
> > 
> 
> I don't follow this.  Doesn't this now result in an invalid reference
> release in dax_region_unregister() when the platform device is unwound?
> Furthermore, that reference is required I think for the sysfs entries.
> 
> > 
> > > > Fixes: c01044cc8191 ("ACPI: HMAT: refactor hmat_register_target_device to hmem_register_device")
> > > 
> > > I'm also not sure how this patch is related to this fix.
> > > 
> > > Ira
> > > 
> > > > Signed-off-by: Yongqiang Liu <liuyongqiang13@...wei.com>
> > > > ---
> > > >  drivers/dax/hmem/hmem.c | 5 ++---
> > > >  1 file changed, 2 insertions(+), 3 deletions(-)
> > > > 
> > > > diff --git a/drivers/dax/hmem/hmem.c b/drivers/dax/hmem/hmem.c
> > > > index 1bf040dbc834..09f5cd7b6c8e 100644
> > > > --- a/drivers/dax/hmem/hmem.c
> > > > +++ b/drivers/dax/hmem/hmem.c
> > > > @@ -36,12 +36,11 @@ static int dax_hmem_probe(struct platform_device *pdev)
> > > >  		.size = region_idle ? 0 : resource_size(res),
> > > >  	};
> > > >  	dev_dax = devm_create_dev_dax(&data);
> > > > -	if (IS_ERR(dev_dax))
> > > > -		return PTR_ERR(dev_dax);
> > > >  
> > > >  	/* child dev_dax instances now own the lifetime of the dax_region */
> > 
> > This comment should probably be updated now.  :)
> > 
> 
> I think removed...
> 
> Dan what do you think of this patch?  Am I seriously off the rails here?
> I worry about this code being around for so long.  But since it is in tear
> down perhaps it is just a race which has never been lost.
> 
> Ira
> 
> ---- 8< ---------------------
> 
> From f63969c761b04fb5e646e7ba7df77a48bc26ba1c Mon Sep 17 00:00:00 2001
> From: Ira Weiny <ira.weiny@...el.com>
> Date: Fri, 2 Jun 2023 11:17:10 -0700
> Subject: [PATCH] dax: Avoid extra kref_put() of dax_regions
> 
> In alloc_dax_region() sysfs attribute groups are created against the
> parent object the dax_region is being created under.  A reference to the
> dax_region was thus obtained for the lifetime of the parent device via
> kobj_get() and released via dax_region_unregister().
> 
> The ownership of the dax_region was intended to be transferred to the
> device dax device but this transfer is not necessary and could be
> problematic if the sysfs entries are used after the dax device is
> unwound but before the parent device is.
> 
> For the dax device the dax_region reference taken during creation in
> devm_create_dev_dax() is sufficient to ensure the dax_region lifetime
> for both the parent device and the dax device.
> 
> Remove the extraneous dax_region_put() from all call paths with this
> pattern.
> 
> Not-Yet-Signed-off-by: Ira Weiny <ira.weiny@...el.com>
> ---
>  drivers/dax/cxl.c       | 3 ---
>  drivers/dax/hmem/hmem.c | 3 ---
>  drivers/dax/pmem.c      | 7 +------
>  3 files changed, 1 insertion(+), 12 deletions(-)
> 
> diff --git a/drivers/dax/cxl.c b/drivers/dax/cxl.c
> index ccdf8de85bd5..d3c3654842ba 100644
> --- a/drivers/dax/cxl.c
> +++ b/drivers/dax/cxl.c
> @@ -31,9 +31,6 @@ static int cxl_dax_region_probe(struct device *dev)
>  	dev_dax = devm_create_dev_dax(&data);
>  	if (IS_ERR(dev_dax))
>  		return PTR_ERR(dev_dax);
> -
> -	/* child dev_dax instances now own the lifetime of the dax_region */
> -	dax_region_put(dax_region);
>  	return 0;
>  }
>  
> diff --git a/drivers/dax/hmem/hmem.c b/drivers/dax/hmem/hmem.c
> index e5fe8b39fb94..d22d56964120 100644
> --- a/drivers/dax/hmem/hmem.c
> +++ b/drivers/dax/hmem/hmem.c
> @@ -41,9 +41,6 @@ static int dax_hmem_probe(struct platform_device *pdev)
>  	dev_dax = devm_create_dev_dax(&data);
>  	if (IS_ERR(dev_dax))
>  		return PTR_ERR(dev_dax);
> -
> -	/* child dev_dax instances now own the lifetime of the dax_region */
> -	dax_region_put(dax_region);
>  	return 0;
>  }
>  
> diff --git a/drivers/dax/pmem.c b/drivers/dax/pmem.c
> index f050ea78bb83..efbdaac51e5f 100644
> --- a/drivers/dax/pmem.c
> +++ b/drivers/dax/pmem.c
> @@ -65,12 +65,7 @@ static struct dev_dax *__dax_pmem_probe(struct device *dev)
>  		.pgmap = &pgmap,
>  		.size = range_len(&range),
>  	};
> -	dev_dax = devm_create_dev_dax(&data);
> -
> -	/* child dev_dax instances now own the lifetime of the dax_region */
> -	dax_region_put(dax_region);
> -
> -	return dev_dax;
> +	return devm_create_dev_dax(&data);
>  }
>  
>  static int dax_pmem_probe(struct device *dev)
> -- 
> 2.40.0
> 
> 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ