Message-ID: <ZurAiwt7t2WWVrJM@PC2K9PVX.TheFacebook.com>
Date: Wed, 18 Sep 2024 13:59:07 +0200
From: Gregory Price <gourry@...rry.net>
To: Dan Williams <dan.j.williams@...el.com>
Cc: linux-cxl@...r.kernel.org, linux-kernel@...r.kernel.org,
dave@...olabs.net, jonathan.cameron@...wei.com,
dave.jiang@...el.com, alison.schofield@...el.com,
vishal.l.verma@...el.com, ira.weiny@...el.com, rrichter@....com,
terry.bowman@....com
Subject: Re: [PATCH] cxl/core/port: defer probe when memdev fails to find
correct port
On Fri, Sep 13, 2024 at 09:32:48PM -0700, Dan Williams wrote:
> Gregory Price wrote:
> > Depending on device/hierarchy readiness, it can be possible for the
> > async probe process to attempt to register an endpoint before the
> > entire port hierarchy is ready. This currently fails with -ENXIO.
> >
> > Return -EPROBE_DEFER to try again later automatically (which is
> > what the local comments already say we should do anyway).
>
> I want to make sure this is not papering over some other issue. Can you
> post the final topology when this works (cxl list -BPET)? My working
> theory is that you have 2 devices that share an intermediate port.
> Otherwise, I am having a hard time understanding why the
> cxl_bus_rescan() in cxl_acpi_probe() does not obviate the explicit
> EPROBE_DEFER.
>
Sorry for the delay, here is the topology:
[
  {
    "bus":"root0",
    "provider":"ACPI.CXL",
    "nr_dports":4,
    "dports":[
      {
        "dport":"pci0000:e0",
        "alias":"ACPI0016:00",
        "id":7
      },
      {
        "dport":"pci0000:00",
        "alias":"ACPI0016:01",
        "id":0
      },
      {
        "dport":"pci0000:c0",
        "alias":"ACPI0016:02",
        "id":6
      },
      {
        "dport":"pci0000:20",
        "alias":"ACPI0016:03",
        "id":1
      }
    ],
    "ports:root0":[
      {
        "port":"port1",
        "host":"pci0000:e0",
        "depth":1,
        "decoders_committed":2,
        "nr_dports":4,
        "dports":[
          {
            "dport":"0000:e0:07.2",
            "alias":"device:16",
            "id":114
          },
          {
            "dport":"0000:e0:01.1",
            "alias":"device:02",
            "id":0
          },
          {
            "dport":"0000:e0:01.3",
            "alias":"device:05",
            "id":2
          },
          {
            "dport":"0000:e0:07.1",
            "alias":"device:0d",
            "id":113
          }
        ],
        "endpoints:port1":[
          {
            "endpoint":"endpoint5",
            "host":"mem0",
            "parent_dport":"0000:e0:01.1",
            "depth":2,
            "decoders_committed":1
          }
        ]
      },
      {
        "port":"port3",
        "host":"pci0000:c0",
        "depth":1,
        "decoders_committed":2,
        "nr_dports":1,
        "dports":[
          {
            "dport":"0000:c0:01.1",
            "alias":"device:c3",
            "id":0
          }
        ],
        "endpoints:port3":[
          {
            "endpoint":"endpoint6",
            "host":"mem1",
            "parent_dport":"0000:c0:01.1",
            "depth":2,
            "decoders_committed":1
          }
        ]
      },
      {
        "port":"port2",
        "host":"pci0000:00",
        "depth":1,
        "decoders_committed":0,
        "nr_dports":2,
        "dports":[
          {
            "dport":"0000:00:01.3",
            "alias":"device:55",
            "id":2
          },
          {
            "dport":"0000:00:07.1",
            "alias":"device:5d",
            "id":113
          }
        ]
      },
      {
        "port":"port4",
        "host":"pci0000:20",
        "depth":1,
        "decoders_committed":0,
        "nr_dports":1,
        "dports":[
          {
            "dport":"0000:20:01.1",
            "alias":"device:d0",
            "id":0
          }
        ]
      }
    ]
  }
]
> So, devA is dependent on devB to create a common port, but devA loses
> that race after cxl_bus_rescan() has already run. Then EPROBE_DEFER is
> the right answer to trigger devA to try again.
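
The race described in the quoted text can be illustrated with a minimal
user-space sketch. This is not kernel code: sim_dev, sim_port, sim_probe()
and sim_probe_all() are hypothetical names made up for the illustration, and
the retry loop only approximates the driver core's deferred-probe list by
re-running pending probes whenever some other probe succeeds. The errno
values match the kernel's definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Same numeric values as the kernel's errno definitions. */
#define ENXIO        6
#define EPROBE_DEFER 517

/* Hypothetical stand-in for the shared intermediate port. */
struct sim_port {
	bool registered;
};

/* Hypothetical stand-in for a memdev waiting to attach an endpoint. */
struct sim_dev {
	const char *name;
	bool creates_port;	/* devB: its probe registers the shared port */
	bool bound;		/* probe succeeded */
	bool failed;		/* probe failed permanently, never retried */
	int attempts;		/* number of times probe was called */
};

/*
 * Probe one device. When the port is missing, 'defer' selects between
 * the old behaviour (-ENXIO) and the proposed one (-EPROBE_DEFER).
 */
static int sim_probe(struct sim_dev *dev, struct sim_port *port, bool defer)
{
	dev->attempts++;

	if (dev->creates_port) {
		port->registered = true;
		return 0;
	}
	if (!port->registered)
		return defer ? -EPROBE_DEFER : -ENXIO;
	return 0;
}

/*
 * Probe every device, retrying deferred ones as long as the previous
 * pass bound at least one device. Returns the number of bound devices.
 */
static size_t sim_probe_all(struct sim_dev *devs, size_t n,
			    struct sim_port *port, bool defer)
{
	size_t bound = 0;
	bool progress = true;

	assert(devs && port);

	while (progress) {
		progress = false;
		for (size_t i = 0; i < n; i++) {
			int rc;

			if (devs[i].bound || devs[i].failed)
				continue;

			rc = sim_probe(&devs[i], port, defer);
			if (rc == 0) {
				devs[i].bound = true;
				bound++;
				progress = true;
			} else if (rc != -EPROBE_DEFER) {
				/* Hard error: give up on this device. */
				devs[i].failed = true;
			}
			/* -EPROBE_DEFER: stays pending for the next pass. */
		}
	}
	return bound;
}
```

With devA probed before devB, the -ENXIO variant leaves devA unbound for
good, while the -EPROBE_DEFER variant binds it on the second pass once
devB has registered the port.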