Message-ID: <ZOi7ZOtb9j7TYib5@rric.localdomain>
Date:   Fri, 25 Aug 2023 16:32:04 +0200
From:   Robert Richter <rrichter@....com>
To:     Dan Williams <dan.j.williams@...el.com>
Cc:     Terry Bowman <terry.bowman@....com>, alison.schofield@...el.com,
        vishal.l.verma@...el.com, ira.weiny@...el.com, bwidawsk@...nel.org,
        dave.jiang@...el.com, Jonathan.Cameron@...wei.com,
        linux-cxl@...r.kernel.org, linux-kernel@...r.kernel.org,
        bhelgaas@...gle.com
Subject: Re: [PATCH v8 03/14] cxl/hdm: Use stored Component Register mappings
 to map HDM decoder capability

On 07.08.23 20:00:41, Dan Williams wrote:
> Robert Richter wrote:
> > On 03.07.23 17:39:38, Dan Williams wrote:

> > [51/51] Linking target ndctl/ndctl
> > 1/6 ndctl:cxl / cxl-topology.sh             OK              18.59s
> > 2/6 ndctl:cxl / cxl-region-sysfs.sh         OK              12.26s
> > 3/6 ndctl:cxl / cxl-labels.sh               OK              28.25s
> > 4/6 ndctl:cxl / cxl-create-region.sh        OK              19.56s
> > 5/6 ndctl:cxl / cxl-xor-region.sh           OK              10.57s
> > 6/6 ndctl:cxl / cxl-security.sh             OK               9.77s
> > 
> > Ok:                 6
> > Expected Fail:      0
> > Fail:               0
> > Unexpected Pass:    0
> > Skipped:            0
> > Timeout:            0
> > 
> > Full log written to /root/ndctl/build/meson-logs/testlog.txt
> 
> So it turns out it is not cxl_test that is failing; instead it is the
> tests noticing a regression of the VH base case. I am running
> cxl-topology.sh in a QEMU environment with a local device defined, and
> that local device is hitting a probe regression.
> 
> When cxl_test goes to count the expected disabled devices it sees one
> more because the QEMU device is also disabled.
> 
> ++ jq 'map(select(.state == "disabled")) | length'
> + count=5
> + (( count == 4 ))
> + err 170
> ++ basename /root/git/ndctl/test/cxl-topology.sh
> + echo test/cxl-topology.sh: failed at line 170
> 
> ...i.e. only 4 devices are expected to be disabled, not the 5th one that
> was not on the cxl_test bus. I assume you are running without any other
> CXL devices?
> 
> I will look into making that a more formalized case so that failure is
> not QEMU configuration dependent, but please make sure that QEMU CXL
> configs do not regress.

I could finally reproduce and root-cause the issue.

I have isolated this to operations on the QEMU mem0 device, without
cxl_test involved:

root@...u:~# cxl list
[
  {
    "memdev":"mem0",
    "pmem_size":536870912,
    "serial":0,
    "host":"0000:35:00.0"
  }
]
root@...u:~# echo mem0 >/sys/bus/cxl/drivers/cxl_mem/unbind
[   65.957423] nvdimm nmem0: trace
[   66.001986] nd_bus ndbus0: nvdimm.remove(nmem0)
[   66.041053] cxl_mem mem0: disconnect mem0 from port1
root@...u:~# echo mem0 >/sys/bus/cxl/drivers/cxl_mem/bind
[   75.076881] cxl_mem mem0: scan: iter: mem0 dport_dev: 0000:34:00.0 parent: pci0000:34
[   75.077349] cxl_mem mem0: found already registered port port1:pci0000:34
[   75.077653] cxl_mem mem0: host-bridge: pci0000:34
[   75.078003] cxl_pci 0000:35:00.0: Failed to request region 0x00000000fe841110-0x00000000fe84119f
[   75.078415] cxl_port endpoint2: Failed to map HDM capability.
[   75.078683] cxl_port endpoint2: probe: -12
[   75.078911] cxl_port: probe of endpoint2 failed with error -12
[   75.079223] cxl_mem mem0: endpoint2 added to port1
[   75.079469] cxl_mem mem0: endpoint2 failed probe
[   75.079706] cxl_mem mem0: probe: -6
[   75.079943] cxl_mem mem0: disconnect mem0 from port1
bash: echo: write error: No such device or address
root@...u:~# cxl list -i
[
  {
    "memdev":"mem0",
    "pmem_size":536870912,
    "serial":0,
    "host":"0000:35:00.0",
    "state":"disabled"
  }
]

The same issue also causes the ndctl cxl tests with the cxl_test
module to fail.
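
As a reading aid for the return codes in the log above, here is a
minimal sketch that names the negative codes seen there (-12 from the
cxl_port probe, -6 from the cxl_mem probe). The helper name
probe_rc_name() is hypothetical and not part of the kernel; the errno
values are the Linux ones.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/*
 * Hypothetical helper (not a kernel function): translate a kernel-style
 * negative return code into its symbolic errno name.
 */
static const char *probe_rc_name(int rc)
{
	/* Kernel-style return codes are zero or negative. */
	assert(rc <= 0);

	switch (-rc) {
	case 0:
		return "success";
	case ENXIO:	/* 6 on Linux: "No such device or address" */
		return "ENXIO";
	case ENOMEM:	/* 12 on Linux */
		return "ENOMEM";
	case ENODEV:	/* 19 on Linux */
		return "ENODEV";
	default:
		return "unknown";
	}
}
```

So "probe: -12" on endpoint2 is -ENOMEM from the failed region
request, and "probe: -6" on mem0 is -ENXIO, which is what bash reports
as "No such device or address".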

> > > > +static struct cxl_register_map *cxl_port_get_comp_map(struct cxl_port *port)
> > > > +{
> > > > +	/*
> > > > +	 * HDM capability applies to Endpoints, USPs and VH Host
> > > > +	 * Bridges. The Endpoint's component register mappings are
> > > > +	 * located in the cxlds.
> > > > +	 */
> > > > +	if (is_cxl_endpoint(port)) {
> > > > +		struct cxl_memdev *memdev = to_cxl_memdev(port->uport_dev);
> > > > +
> > > > +		return &memdev->cxlds->comp_map;
> > > 
> > > ...but why? The issue here is that the @dev argument in that map is the
> > > memdev parent PCI device. However, in this context the @dev for devm
> > > operations wants to be &port->dev.

Right, I fixed this by changing the function interface to pass @dev
for the devm operation. See below.

> > 
> > The cxl_pci driver stores the comp_map of the endpoint in the cxlds
> > structure, struct cxl_port is not yet available at this point.
> 
> When you say "this point" you mean "that point" in cxl_pci, right? I
> initially took that to mean literally "this" point in the quoted code
> above.

Yes, exactly. The mapping is in cxlds->comp_map, not port->comp_map.

> 
> > See patch #2 of this series ("cxl/pci: Store the endpoint's Component
> > Register mappings in struct cxl_dev_state").
> > 
> > > 
> > > > +	}
> > > > +
> > > > +	return &port->comp_map;
> > > 
> > > ...so this is fine, and folding in the following resolves the test
> > > failure.
> > > 
> > > diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c
> > > index b0f59e63e0d2..6f111f487795 100644
> > > --- a/drivers/cxl/core/hdm.c
> > > +++ b/drivers/cxl/core/hdm.c
> > > @@ -125,22 +125,6 @@ static bool should_emulate_decoders(struct cxl_endpoint_dvsec_info *info)
> > >         return true;
> > >  }
> > >  
> > > -static struct cxl_register_map *cxl_port_get_comp_map(struct cxl_port *port)
> > > -{
> > > -       /*
> > > -        * HDM capability applies to Endpoints, USPs and VH Host
> > > -        * Bridges. The Endpoint's component register mappings are
> > > -        * located in the cxlds.
> > > -        */
> > > -       if (is_cxl_endpoint(port)) {
> > > -               struct cxl_memdev *memdev = to_cxl_memdev(port->uport_dev);
> > > -
> > > -               return &memdev->cxlds->comp_map;
> > > -       }
> > > -
> > > -       return &port->comp_map;
> > > -}
> > > -
> > >  /**
> > >   * devm_cxl_setup_hdm - map HDM decoder component registers
> > >   * @port: cxl_port to map
> > > @@ -160,8 +144,7 @@ struct cxl_hdm *devm_cxl_setup_hdm(struct cxl_port *port,
> > >         cxlhdm->port = port;
> > >         dev_set_drvdata(dev, cxlhdm);
> > >  
> > > -       comp_map = cxl_port_get_comp_map(port);
> > > -
> > > +       comp_map = &port->comp_map;
> > 
> > Can you check if the following works instead, I think the
> > pre-initialization is missing in cxl_mock_mem_probe() for
> > cxl_test:
> > 
> > 
> > diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
> > index d6d067fbee97..4c4e33de4d74 100644
> > --- a/drivers/cxl/core/mbox.c
> > +++ b/drivers/cxl/core/mbox.c
> > @@ -1333,6 +1333,8 @@ struct cxl_memdev_state *cxl_memdev_state_create(struct device *dev)
> >  	mutex_init(&mds->mbox_mutex);
> >  	mutex_init(&mds->event.log_lock);
> >  	mds->cxlds.dev = dev;
> > +	mds->cxlds.comp_map.dev = dev;
> > +	mds->cxlds.comp_map.resource = CXL_RESOURCE_NONE;
> 
> This has the same problem. @dev is specifying the lifetime of the
> mapping. The lifetime of the mapping needs to be relative to the driver
> using the registers. So if the cxl_port driver is mapping the component
> registers the only valid device in the comp_map is &port->dev.
> 
> I notice that cxl_port_get_comp_map() takes the endpoint port as an
> argument. That endpoint port was instantiated with @cxlmd, but it seems
> not with @cxlmd->cxlds->comp_map information which is available. Let's
> just fix
> that. I.e. move devm_cxl_add_endpoint() into the core and make it
> initialize @endpoint->comp_map with @cxlds->comp_map while switching
> @dev in the @endpoint->comp_map to be @endpoint->dev, and not
> @cxlds->dev.

That does not work, as we need the RAS cap early, as soon as cxl_pci
is bound. Multiple devices use the comp_map, so for proper devm
release we need to assign a different device depending on the user.
That is, the RAS cap mapping is released when the cxl_pci driver
unbinds, and the HDM mapping with the memdev release. The fix I have
chosen is to extend cxl_map_component_regs() to take the devm device,
so that each mapping is released together with the device using it.
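
To illustrate the lifetime argument, here is a small userspace model,
not kernel code. All names (toy_device, toy_region, toy_map_regs(),
toy_release_all()) are hypothetical, and the "busy" flag only stands
in for an exclusively requested MMIO region. It sketches why the
device passed to a devm-style mapping helper decides when the region
becomes free again.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_HELD 8

/* Hypothetical stand-in for an exclusively requested register region. */
struct toy_region {
	bool busy;
};

/* Hypothetical stand-in for struct device with a devm-style release list. */
struct toy_device {
	const char *name;
	struct toy_region *held[TOY_MAX_HELD];
	size_t nr_held;
};

/*
 * Map @region and tie its lifetime to @host, the way a devm helper ties a
 * mapping to the device it is given. Returns 0 on success, or -1 if the
 * region is still held (compare "Failed to request region" in the log).
 */
static int toy_map_regs(struct toy_device *host, struct toy_region *region)
{
	assert(host && region);

	if (region->busy || host->nr_held == TOY_MAX_HELD)
		return -1;

	region->busy = true;
	host->held[host->nr_held++] = region;
	return 0;
}

/* Driver unbind: release everything whose lifetime was tied to @dev. */
static void toy_release_all(struct toy_device *dev)
{
	assert(dev);

	while (dev->nr_held)
		dev->held[--dev->nr_held]->busy = false;
}
```

In this model, mapping the HDM region with the PCI device as host and
then releasing only the endpoint leaves the region busy, so the next
map attempt fails. With the endpoint as host, releasing the endpoint
frees the region and a rebind can map it again.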

> 
> ...but it turns out that is the second bug in this patch. As I went to
> try to reproduce this failure for a single port VH configuration I
> notice another bug, a regression of this fix:
> 
>    7bba261e0aa6 cxl/port: Scan single-target ports for decoders
> 
> ...because there is no requirement for single port switches /
> host-bridges to have HDM decoders per the specification, and the
> original patch is turning HDM decoders not present as a hard failure.

But if the HDM is not present, -ENODEV is still returned, isn't it?
cxl_switch_port_probe() does not fail in that case.
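
The distinction I mean can be sketched as follows. This is a
simplified userspace model under assumptions, not the driver code: the
toy_* names and struct fields are hypothetical, and the single-dport
fallback is my reading of the behavior introduced by the "Scan
single-target ports for decoders" commit referenced above.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical port model; not the kernel's struct cxl_port. */
struct toy_port {
	bool has_hdm;	/* HDM decoder capability is present */
	bool map_fails;	/* simulate an error while mapping the capability */
	int nr_dports;	/* number of downstream ports enumerated */
};

/* Returns 0 on success, -ENODEV if there is no HDM, -ENOMEM if mapping fails. */
static int toy_setup_hdm(const struct toy_port *port)
{
	if (!port->has_hdm)
		return -ENODEV;
	if (port->map_fails)
		return -ENOMEM;
	return 0;
}

/*
 * Probe sketch: a missing HDM capability (-ENODEV) is tolerated for a
 * single-dport port, while any other setup error is a hard failure.
 */
static int toy_switch_port_probe(const struct toy_port *port)
{
	int rc;

	assert(port);

	rc = toy_setup_hdm(port);
	if (rc == 0)
		return 0;		/* decoders can be enumerated */
	if (rc != -ENODEV)
		return rc;		/* real mapping error */
	if (port->nr_dports == 1)
		return 0;		/* assumed passthrough fallback */
	return -ENXIO;			/* multi-dport port without HDM */
}
```

With this shape, only an error other than -ENODEV from the HDM setup
makes a single-port switch or host bridge fail to probe.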

> 
> >  	mds->cxlds.type = CXL_DEVTYPE_CLASSMEM;
> >  
> >  	return mds;
> > -- 
> > 2.30.2
> > 
> > The cxl_pci driver always has something valid or fails otherwise.
> 
> Understood, just copy that map at __devm_cxl_add_port() time. I am
> thinking devm_cxl_add_endpoint() moves into core/port.c and it calls
> __devm_cxl_add_port() directly with a new parameter to take a passed in
> @comp_map or @component_reg_phys for the switch port case.

Please take a look at the cxl_map_component_regs() update in v9. We
will send it out today. The changes compared to v8 are
straightforward.

Thanks,

-Robert

> 
> > 
> > If that works the change should be merged into patch #2.
> > 
> > Thanks,
> > 
> > -Robert
> > 
> > 
> > >         if (comp_map->resource == CXL_RESOURCE_NONE) {
> > >                 if (info && info->mem_enabled) {
> > >                         cxlhdm->decoder_count = info->ranges;
> > > 
> > > 
> > > Am I missing why the cxlds->comp_map needs to be reused?
> > > 
> > > Note that I am out and may not be able to respond further until next
> > > week.
> 
> 
