lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <64f80177c2c21_1e8e7829487@iweiny-mobl.notmuch>
Date:   Tue, 5 Sep 2023 21:35:03 -0700
From:   Ira Weiny <ira.weiny@...el.com>
To:     Jonathan Cameron <Jonathan.Cameron@...wei.com>,
        Ira Weiny <ira.weiny@...el.com>
CC:     Dan Williams <dan.j.williams@...el.com>,
        Navneet Singh <navneet.singh@...el.com>,
        Fan Ni <fan.ni@...sung.com>,
        Davidlohr Bueso <dave@...olabs.net>,
        Dave Jiang <dave.jiang@...el.com>,
        Alison Schofield <alison.schofield@...el.com>,
        Vishal Verma <vishal.l.verma@...el.com>,
        <linux-cxl@...r.kernel.org>, <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH RFC v2 14/18] dax/region: Support DAX device creation on
 dynamic DAX regions

Jonathan Cameron wrote:
> On Mon, 28 Aug 2023 22:21:05 -0700
> Ira Weiny <ira.weiny@...el.com> wrote:
> 
> > Dynamic Capacity (DC) DAX regions have a list of extents which define
> > the memory of the region which is available.
> > 
> > Now that DAX region extents are fully realized support DAX device
> > creation on dynamic regions by adjusting the allocation algorithms
> > to account for the extents.  Remember also references must be held on
> > the extents until the DAX devices are done with the memory.
> > 
> > Redefine the region available size to include only extent space.  Reuse
> > the size allocation algorithm by defining sub-resources for each extent
> > and limiting range allocation to those extents which have space.  Do not
> > support direct mapping of DAX devices on dynamic devices.
> > 
> > Enhance DAX device range objects to hold references on the extents until
> > the DAX device is destroyed.
> > 
> > NOTE: At this time all extents within a region are created equally.
> > However, labels are associated with extents which can be used with
> > future DAX device labels to group which extents are used.
> 
> This sound like a bad place to start to me as we are enabling something
> that is probably 'wrong' in the long term as opposed to just not enabling it
> until we have appropriate support.

I disagree.  I don't think the kernel should be trying to process tags at
the lower level.

> I'd argue better to just reject any extents with different labels for now.

Again I disagree.  This is less restrictive.  The idea is that labels can
be changed such that user space can ultimately decided which extents
should be used for which devices.  I have some work on that already.
(Basically it becomes quite easy to assign a label to a dax device and
have the extent search use only dax extents which match that label.)

> 
> As this is an RFC meh ;)

Sure!  ;-)

> 
> Whilst this looks fine to me, I'm rather out of my depth wrt to the DAX
> side of things so take that with a pinch of salt.

NP

> 
> Jonathan
> 
> 
> > 
> > Signed-off-by: Ira Weiny <ira.weiny@...el.com>
> > ---
> >  drivers/dax/bus.c         | 157 +++++++++++++++++++++++++++++++++++++++-------
> >  drivers/dax/cxl.c         |  44 +++++++++++++
> >  drivers/dax/dax-private.h |   5 ++
> >  3 files changed, 182 insertions(+), 24 deletions(-)
> > 
> > diff --git a/drivers/dax/bus.c b/drivers/dax/bus.c
> > index ea7ae82b4687..a9ea6a706702 100644
> > --- a/drivers/dax/bus.c
> > +++ b/drivers/dax/bus.c
> 
> ...
> 
> 
> > @@ -1183,7 +1290,7 @@ static ssize_t mapping_store(struct device *dev, struct device_attribute *attr,
> >  	to_alloc = range_len(&r);
> >  	if (alloc_is_aligned(dev_dax, to_alloc))
> >  		rc = alloc_dev_dax_range(&dax_region->res, dev_dax, r.start,
> > -					 to_alloc);
> > +					 to_alloc, NULL);
> >  	device_unlock(dev);
> >  	device_unlock(dax_region->dev);
> >  
> > @@ -1400,8 +1507,10 @@ struct dev_dax *devm_create_dev_dax(struct dev_dax_data *data)
> >  	device_initialize(dev);
> >  	dev_set_name(dev, "dax%d.%d", dax_region->id, dev_dax->id);
> >  
> > +	dev_WARN_ONCE(parent, is_dynamic(dax_region) && data->size,
> > +		      "Dynamic DAX devices are created initially with 0 size");
> 
> dev_info() maybe more appropriate?

Unless I'm mistaken this can happen from userspace but only if something
in the code changes later.  Because the dax layer is trying to support
non-dynamic regions (which dynamic may be a bad name), I was worried that
the creation with a size might slip through...

> Is this common enough that we need the
> _ONCE?

once is because it could end up spamming a log later if something got
coded up wrong.

> 
> 
> >  	rc = alloc_dev_dax_range(&dax_region->res, dev_dax, dax_region->res.start,
> > -				 data->size);
> > +				 data->size, NULL);
> >  	if (rc)
> >  		goto err_range;
> >  
> > diff --git a/drivers/dax/cxl.c b/drivers/dax/cxl.c
> > index 44cbd28668f1..6394a3531e25 100644
> > --- a/drivers/dax/cxl.c
> > +++ b/drivers/dax/cxl.c
> ...
> 
> 
> >  static int cxl_dax_region_create_extent(struct dax_region *dax_region,
> >  					struct cxl_dr_extent *cxl_dr_ext)
> >  {
> > @@ -45,11 +80,20 @@ static int cxl_dax_region_create_extent(struct dax_region *dax_region,
> >  	/* device manages the dr_extent on success */
> >  	kref_init(&dr_extent->ref);
> >  
> > +	rc = dax_region_add_resource(dax_region, dr_extent,
> > +				     cxl_dr_ext->hpa_offset,
> > +				     cxl_dr_ext->hpa_length);
> > +	if (rc) {
> > +		kfree(dr_extent);
> 
> goto for these and single unwinding block?

Yea.  Done.

> 
> > +		return rc;
> > +	}
> > +
> >  	rc = dax_region_ext_create_dev(dax_region, dr_extent,
> >  				       cxl_dr_ext->hpa_offset,
> >  				       cxl_dr_ext->hpa_length,
> >  				       cxl_dr_ext->label);
> >  	if (rc) {
> > +		dax_region_rm_resource(dr_extent);
> >  		kfree(dr_extent);
> as above.

Done.

> 
> >  		return rc;
> >  	}
> > diff --git a/drivers/dax/dax-private.h b/drivers/dax/dax-private.h
> > index 250babd6e470..ad73b53aa802 100644
> > --- a/drivers/dax/dax-private.h
> > +++ b/drivers/dax/dax-private.h
> > @@ -44,12 +44,16 @@ struct dax_region {
> >  /*
> >   * struct dax_region_extent - extent data defined by the low level region
> >   * driver.
> > + * @region: cache of dax_region
> > + * @res: cache of resource tree for this extent
> >   * @private_data: lower level region driver data
> 
> Not sure 'lower level' is well defined here. Is "region driver data"
> not enough?

For me it was not.  I'll have to sleep on it.  Technically there is no
dax_region 'driver' but only a dax_region device.

> 
> >   * @ref: track number of dax devices which are using this extent
> >   * @get: get reference to low level data
> >   * @put: put reference to low level data
> >   */
> >  struct dax_region_extent {
> > +	struct dax_region *region;
> > +	struct resource *res;
> >  	void *private_data;
> >  	struct kref ref;
> >  	void (*get)(struct dax_region_extent *dr_extent);
> > @@ -131,6 +135,7 @@ struct dev_dax {
> >  		unsigned long pgoff;
> >  		struct range range;
> >  		struct dax_mapping *mapping;
> > +		struct dax_region_extent *dr_extent;
> 
> Huh. Seems that ranges is in the kernel doc but not the
> bits that make that up.  Maybe good to add the docs
> whilst here?

oh.  sure.  took me a couple of reads of this sentence.

I'm going to think on this too.

Ira

> 
> >  	} *ranges;
> >  };
> >  
> > 
> 


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ