lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <67a2a4d2519_2d383a29411@iweiny-mobl.notmuch>
Date: Tue, 4 Feb 2025 17:37:54 -0600
From: Ira Weiny <ira.weiny@...el.com>
To: Dan Williams <dan.j.williams@...el.com>, Ira Weiny <ira.weiny@...el.com>,
	Dave Jiang <dave.jiang@...el.com>, Davidlohr Bueso <dave@...olabs.net>,
	Jonathan Cameron <jonathan.cameron@...wei.com>, Alison Schofield
	<alison.schofield@...el.com>, Vishal Verma <vishal.l.verma@...el.com>
CC: <linux-cxl@...r.kernel.org>, <linux-kernel@...r.kernel.org>, Ira Weiny
	<ira.weiny@...el.com>
Subject: Re: [PATCH RFC 2/2] cxl/memdev: Remove temporary variables from
 cxl_memdev_state

Dan Williams wrote:
> Ira Weiny wrote:

Perhaps this would have been good to add to the commit message.

<quote>

The net-net of this change is to make partition set up 2 distinct steps.

	1) query the device for total, ram, and pmem partition size information
	2) create partitions using that information

While doing so it avoids storing the total, ram, pmem sizes tuple in favor of a
stack variable cxl_dev_info.

</quote>

[snip]

> > diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
> > index 998e1df36db673c47c4e87b957df9c29bf3f291a..44618746ad79b0459501bb3001518f6b7d2ceaba 100644
> > --- a/drivers/cxl/core/mbox.c
> > +++ b/drivers/cxl/core/mbox.c
> > @@ -1068,6 +1068,7 @@ EXPORT_SYMBOL_NS_GPL(cxl_mem_get_event_records, "CXL");
> >  /**
> >   * cxl_mem_get_partition_info - Get partition info
> >   * @mds: The driver data for the operation
> > + * @dev_info: Device info results
> >   *
> >   * Retrieve the current partition info for the device specified.  The active
> >   * values are the current capacity in bytes.  If not 0, the 'next' values are
> > @@ -1075,9 +1076,10 @@ EXPORT_SYMBOL_NS_GPL(cxl_mem_get_event_records, "CXL");
> >   *
> >   * Return: 0 if no error: or the result of the mailbox command.
> >   *
> > - * See CXL @8.2.9.5.2.1 Get Partition Info
> > + * See CXL 3.1 @8.2.9.9.2.1 Get Partition Info
> >   */
> > -static int cxl_mem_get_partition_info(struct cxl_memdev_state *mds)
> > +static int cxl_mem_get_partition_info(struct cxl_memdev_state *mds,
> > +				      struct cxl_mem_dev_info *dev_info)
> 
> I was hoping this would get further away from new in/out arguments and
> look at centralizing all partition enumeration into one routine.

I don't understand?  get partition info is only required if the partition align
bytes is not 0.  IOW if the device allows for partitions to be changed.
cxl_mem_get_partition_info() is only called IFF that extra query is required.
So this does centralize byte information queries into one routine.  It leaves
creating partitions to the device driver which moves us toward these being
mailbox only calls...

What I did fail to do is change mds to a mailbox.  So add this hunk to this
call with the corresponding change in the caller.

diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
index 44618746ad79..873793dab68e 100644
--- a/drivers/cxl/core/mbox.c
+++ b/drivers/cxl/core/mbox.c
@@ -1067,7 +1067,7 @@ EXPORT_SYMBOL_NS_GPL(cxl_mem_get_event_records, "CXL");
 
 /**
  * cxl_mem_get_partition_info - Get partition info
- * @mds: The driver data for the operation
+ * @mbox: Mailbox to query
  * @dev_info: Device info results
  *
  * Retrieve the current partition info for the device specified.  The active
@@ -1078,10 +1078,9 @@ EXPORT_SYMBOL_NS_GPL(cxl_mem_get_event_records, "CXL");
  *
  * See CXL 3.1 @8.2.9.9.2.1 Get Partition Info
  */
-static int cxl_mem_get_partition_info(struct cxl_memdev_state *mds,
+static int cxl_mem_get_partition_info(struct cxl_mailbox *mbox,
                                      struct cxl_mem_dev_info *dev_info)
 {
-       struct cxl_mailbox *cxl_mbox = &mds->cxlds.cxl_mbox;
        struct cxl_mbox_get_partition_info pi;
        struct cxl_mbox_cmd mbox_cmd;
        int rc;
@@ -1091,7 +1090,7 @@ static int cxl_mem_get_partition_info(struct cxl_memdev_state *mds,
                .size_out = sizeof(pi),
                .payload_out = &pi,
        };
-       rc = cxl_internal_send_cmd(cxl_mbox, &mbox_cmd);
+       rc = cxl_internal_send_cmd(mbox, &mbox_cmd);
        if (rc)
                return rc;
 

> That
> was the intent of cxl_mem_dpa_fetch() to capture all generic Memory
> Expander DPA boundary information.

The problem with cxl_mem_dpa_fetch() is it mixes in the partition info mailbox
call after you have already supposedly set up volatile/persistent byte
settings.

IOW I don't think it is clean to have the cxl_mem_get_partition_info() call
hidden away in cxl_mem_dpa_fetch().

> 
> >  {
> >  	struct cxl_mailbox *cxl_mbox = &mds->cxlds.cxl_mbox;
> >  	struct cxl_mbox_get_partition_info pi;
> > @@ -1093,9 +1095,9 @@ static int cxl_mem_get_partition_info(struct cxl_memdev_state *mds)
> >  	if (rc)
> >  		return rc;
> >  
> > -	mds->active_volatile_bytes =
> > +	dev_info->volatile_bytes =
> >  		le64_to_cpu(pi.active_volatile_cap) * CXL_CAPACITY_MULTIPLIER;
> > -	mds->active_persistent_bytes =
> > +	dev_info->persistent_bytes =
> >  		le64_to_cpu(pi.active_persistent_cap) * CXL_CAPACITY_MULTIPLIER;
> >  
> >  	return 0;
> > @@ -1104,18 +1106,24 @@ static int cxl_mem_get_partition_info(struct cxl_memdev_state *mds)
> >  /**
> >   * cxl_dev_state_identify() - Send the IDENTIFY command to the device.
> >   * @mds: The driver data for the operation
> > + * @dev_info: Device info results
> >   *
> >   * Return: 0 if identify was executed successfully or media not ready.
> >   *
> > - * This will dispatch the identify command to the device and on success populate
> > - * structures to be exported to sysfs.
> > + * This will dispatch the identify and partition info commands to the device
> > + * and on success populate structures required for the memory device to
> > + * operate.
> > + *
> > + * See CXL 3.1 @8.2.9.9.1.1 Identify Memory Device
> >   */
> > -int cxl_dev_state_identify(struct cxl_memdev_state *mds)
> > +int cxl_dev_state_identify(struct cxl_memdev_state *mds,
> > +			   struct cxl_mem_dev_info *dev_info)
> 
> This function is tiny,

It's 47 lines long?

> and most of its work is just transfering
> parameters from device to an intermediate temporary object (@dev_info).

Yes, by design.

> I would say just subsume this function in its only caller and skip the
> @dev_info transfer step.

There are 2 callers including cxl_test.

Further clean up here would be to add the other values to the dev_info tuple
and make this a true mailbox call.  Or split out the lsa and other values
queries into another call.  But that seemed overkill at this point.  No one is
needing those to be separated out.

> 
> >  {
> >  	struct cxl_mailbox *cxl_mbox = &mds->cxlds.cxl_mbox;
> > -	/* See CXL 2.0 Table 175 Identify Memory Device Output Payload */
> > +	struct device *dev = cxl_mbox->host;
> >  	struct cxl_mbox_identify id;
> >  	struct cxl_mbox_cmd mbox_cmd;
> > +	u64 partition_align_bytes;
> >  	u32 val;
> >  	int rc;
> >  
> > @@ -1131,14 +1139,22 @@ int cxl_dev_state_identify(struct cxl_memdev_state *mds)
> >  	if (rc < 0)
> >  		return rc;
> >  
> > -	mds->total_bytes =
> > +	dev_info->total_bytes =
> >  		le64_to_cpu(id.total_capacity) * CXL_CAPACITY_MULTIPLIER;
> > -	mds->volatile_only_bytes =
> > +	dev_info->volatile_bytes =
> >  		le64_to_cpu(id.volatile_capacity) * CXL_CAPACITY_MULTIPLIER;
> > -	mds->persistent_only_bytes =
> > +	dev_info->persistent_bytes =
> >  		le64_to_cpu(id.persistent_capacity) * CXL_CAPACITY_MULTIPLIER;
> > -	mds->partition_align_bytes =
> > +
> > +	partition_align_bytes =
> >  		le64_to_cpu(id.partition_align) * CXL_CAPACITY_MULTIPLIER;
> > +	if (partition_align_bytes != 0) {
> > +		rc = cxl_mem_get_partition_info(mds, dev_info);
> > +		if (rc) {
> > +			dev_err(dev, "Failed to query partition information\n");
> > +			return rc;
> > +		}
> > +	}
> >  
> >  	mds->lsa_size = le32_to_cpu(id.lsa_size);
> >  	memcpy(mds->firmware_version, id.fw_revision,
> > @@ -1237,7 +1253,7 @@ int cxl_mem_sanitize(struct cxl_memdev *cxlmd, u16 cmd)
> >  	return rc;
> >  }
> >  
> > -static void add_part(struct cxl_dpa_info *info, u64 start, u64 size, enum cxl_partition_mode mode)
> > +void cxl_add_partition(struct cxl_dpa_info *info, u64 start, u64 size, enum cxl_partition_mode mode)
> >  {
> >  	int i = info->nr_partitions;
> >  
> > @@ -1251,40 +1267,7 @@ static void add_part(struct cxl_dpa_info *info, u64 start, u64 size, enum cxl_pa
> >  	info->part[i].mode = mode;
> >  	info->nr_partitions++;
> >  }
> > -
> > -int cxl_mem_dpa_fetch(struct cxl_memdev_state *mds, struct cxl_dpa_info *info)
> > -{
> > -	struct cxl_dev_state *cxlds = &mds->cxlds;
> > -	struct device *dev = cxlds->dev;
> > -	int rc;
> > -
> > -	if (!cxlds->media_ready) {
> > -		info->size = 0;
> > -		return 0;
> > -	}
> > -
> > -	info->size = mds->total_bytes;
> > -
> > -	if (mds->partition_align_bytes == 0) {
> > -		add_part(info, 0, mds->volatile_only_bytes, CXL_PARTMODE_RAM);
> > -		add_part(info, mds->volatile_only_bytes,
> > -			 mds->persistent_only_bytes, CXL_PARTMODE_PMEM);
> > -		return 0;
> > -	}
> > -
> > -	rc = cxl_mem_get_partition_info(mds);
> > -	if (rc) {
> > -		dev_err(dev, "Failed to query partition information\n");
> > -		return rc;
> > -	}
> > -
> > -	add_part(info, 0, mds->active_volatile_bytes, CXL_PARTMODE_RAM);
> > -	add_part(info, mds->active_volatile_bytes, mds->active_persistent_bytes,
> > -		 CXL_PARTMODE_PMEM);
> > -
> > -	return 0;
> > -}
> > -EXPORT_SYMBOL_NS_GPL(cxl_mem_dpa_fetch, "CXL");
> > +EXPORT_SYMBOL_NS_GPL(cxl_add_partition, "CXL");
> 
> I am not seeing the justification for both 'struct cxl_dpa_info' and
> 'struct cxl_mem_dev_info'.

cxl_mem_dev_info is a convenient way to pass the (total, ram, pmem) byte tuple
around and it gets declared on the stack so it is never stored.

I'm open to different names...

[snip]

> > diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> > index fa30e763a0ce4fd78441f486ce08c81e21207e29..b69ba756e7f722a327c42fb57f843635cf8367a2 100644
> > --- a/drivers/cxl/pci.c
> > +++ b/drivers/cxl/pci.c
> > @@ -903,6 +903,7 @@ __ATTRIBUTE_GROUPS(cxl_rcd);
> >  static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> >  {
> >  	struct pci_host_bridge *host_bridge = pci_find_host_bridge(pdev->bus);
> > +	struct cxl_mem_dev_info dev_info = { 0 };
> >  	struct cxl_dpa_info range_info = { 0 };
> >  	struct cxl_memdev_state *mds;
> >  	struct cxl_dev_state *cxlds;
> > @@ -989,13 +990,17 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> >  	if (rc)
> >  		return rc;
> >  
> > -	rc = cxl_dev_state_identify(mds);
> > +	rc = cxl_dev_state_identify(mds, &dev_info);
> >  	if (rc)
> >  		return rc;
> >  
> > -	rc = cxl_mem_dpa_fetch(mds, &range_info);
> > -	if (rc)
> > -		return rc;
> > +	if (cxlds->media_ready) {
> 
> Why does media_ready affect partition boundary enumeration?

We want the driver to load without any partitions so that the device can be
queried.

See:

   commit e764f12208b99ac7892c4e3f6bf88d71ca71036f
   Author: Dave Jiang <dave.jiang@...el.com>
   Date:   Thu May 18 16:38:20 2023 -0700
   
       cxl: Move cxl_await_media_ready() to before capacity info retrieval
   
       Move cxl_await_media_ready() to cxl_pci probe before driver starts issuing
       IDENTIFY and retrieving memory device information to ensure that the
       device is ready to provide the information. Allow cxl_pci_probe() to succeed
       even if media is not ready. Cache the media failure in cxlds and don't ask
       the device for any media information.
   
       The rationale for proceeding in the !media_ready case is to allow for
       mailbox operations to interrogate and/or remediate the device. After
       media is repaired then rebinding the cxl_pci driver is expected to
       restart the capacity scan.
   
       Suggested-by: Dan Williams <dan.j.williams@...el.com>
       Fixes: b39cb1052a5c ("cxl/mem: Register CXL memX devices")
       Reviewed-by: Ira Weiny <ira.weiny@...el.com>
       Signed-off-by: Dave Jiang <dave.jiang@...el.com>
       Link: https://lore.kernel.org/r/168445310026.3251520.8124296540679268206.stgit@djiang5-mobl3      
       [djbw: fixup cxl_test]
       Signed-off-by: Dan Williams <dan.j.williams@...el.com>

I explored getting rid of media_ready but I think that is a bigger fish to fry
than I have time for ATM.  And you have ideas there which I need to explore
more.

> 
> > +		range_info.size = dev_info.total_bytes;
> > +		cxl_add_partition(&range_info, 0, dev_info.volatile_bytes,
> > +				  CXL_PARTMODE_RAM);
> > +		cxl_add_partition(&range_info, dev_info.volatile_bytes,
> > +				  dev_info.persistent_bytes, CXL_PARTMODE_PMEM);
> > +	}
> 
> Why remove the cxl_mem_dpa_fetch() helper in favor of open-coding these
> cxl_add_partition() calls?

After removing the ugly hidden get partition info call in cxl_mem_dpa_fetch()
cxl_mem_dpa_fetch boiled down to 2 cxl_add_partition() calls.  That was very
tiny.

More importantly this exported a very clean cxl_add_parition() call for any
driver (ie type 2) to call without creating special structures to pass to the
cxl core.

I can put cxl_mem_dpa_fetch() back if you want...  I had changed it to
cxl_mem_create_partitions() to make it more clear what it was actually doing
before realizing it was so small it was probably best removed.

It would be used both here and cxl_test.

[snip]

> > @@ -1534,13 +1535,17 @@ static int cxl_mock_mem_probe(struct platform_device *pdev)
> >  		return rc;
> >  
> >  	cxlds->media_ready = true;
> > -	rc = cxl_dev_state_identify(mds);
> > +	rc = cxl_dev_state_identify(mds, &dev_info);
> >  	if (rc)
> >  		return rc;
> >  
> > -	rc = cxl_mem_dpa_fetch(mds, &range_info);
> > -	if (rc)
> > -		return rc;
> > +	if (cxlds->media_ready) {
> > +		range_info.size = dev_info.total_bytes;
> > +		cxl_add_partition(&range_info, 0, dev_info.volatile_bytes,
> > +				  CXL_PARTMODE_RAM);
> > +		cxl_add_partition(&range_info, dev_info.volatile_bytes,
> > +				  dev_info.persistent_bytes, CXL_PARTMODE_PMEM);
> > +	}
> 
> ...and here is duplicated logic that goes away if this stays centralized
> in cxl_mem_dpa_fetch().

True but it was so little code.  And exporting cxl_add_partition() seemed much
cleaner as an example for type 2 devices.

Ira

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ