Message-ID: <69813c31200a6_55fa1005a@dwillia2-mobl4.notmuch>
Date: Mon, 2 Feb 2026 16:07:13 -0800
From: <dan.j.williams@...el.com>
To: Li Ming <ming.li@...omail.com>, <dave@...olabs.net>,
<jonathan.cameron@...wei.com>, <dave.jiang@...el.com>,
<alison.schofield@...el.com>, <vishal.l.verma@...el.com>,
<ira.weiny@...el.com>, <dan.j.williams@...el.com>
CC: <linux-cxl@...r.kernel.org>, <linux-kernel@...r.kernel.org>, Li Ming
<ming.li@...omail.com>
Subject: Re: [PATCH 2/2] cxl/core: Hold grandparent port lock while dport
adding

Li Ming wrote:
> When the CXL subsystem adds a cxl port to a hierarchy, there is a small
> window where the new port becomes visible before it is bound to a
> driver. This happens because device_add() adds a device to the bus
> device list before bus_probe_device() binds it to a driver.
> So if two cxl memdevs are trying to add a dport to the same port via
> devm_cxl_enumerate_ports(), the second cxl memdev may observe the port
> and attempt to add a dport, but fail because the port has not yet been
> attached to the cxl port driver.
> The sequence is:
>
>   CPU 0                                  CPU 1
>   devm_cxl_enumerate_ports()
>     # port not found, add it
>     add_port_attach_ep()
>       # hold the parent port lock
>       # to add the new port
>       devm_cxl_create_port()
>         device_add()
>           # Add dev to bus devs list
>           bus_add_device()
>                                          devm_cxl_enumerate_ports()
>                                            # found the port
>                                            find_cxl_port_by_uport()
>                                            # hold port lock to add a dport
>                                            device_lock(the port)
>                                            find_or_add_dport()
>                                              cxl_port_add_dport()
>                                                return -ENXIO because port->dev.driver is NULL
>                                            device_unlock(the port)
>           bus_probe_device()
>             # hold the port lock
>             # for attaching
>             device_lock(the port)
>               attaching the new port
>             device_unlock(the port)
>
> To fix this race, require that dport addition holds the parent port lock
> of the target port. The CXL subsystem already requires holding the
> parent port lock while attaching a new port. Therefore, successfully
> acquiring the parent port lock guarantees that port attachment has
> completed.

Are you seeing this case fail permanently? The expectation is that the
memdev that loses the race iterates up the topology and retries.

So yes, you can lose this race once, but the expectation is that you
cannot lose it twice.
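
To make the "once, but not twice" expectation concrete, here is a rough
userspace sketch. It is a toy model, not the kernel code: struct toy_port
and the toy_*() helpers are hypothetical names invented for illustration,
and it simply assumes that the creating CPU finishes attaching the port
before the loser retries.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a cxl port; not the kernel's struct cxl_port. */
struct toy_port {
	bool driver_bound;	/* models port->dev.driver != NULL */
	int nr_dports;
};

/* Models dport addition failing while the port is visible but unbound. */
static int toy_add_dport(struct toy_port *port)
{
	if (!port->driver_bound)
		return -ENXIO;
	port->nr_dports++;
	return 0;
}

/* Models the creating CPU completing bus_probe_device(). */
static void toy_finish_probe(struct toy_port *port)
{
	port->driver_bound = true;
}

/*
 * Models the memdev that may lose the race. Returns the number of
 * attempts that were lost (0 or 1), or a negative errno if the retry
 * fails as well.
 */
static int toy_enumerate(struct toy_port *port)
{
	int lost = 0;
	int rc;

	assert(port != NULL);

	rc = toy_add_dport(port);
	if (rc == -ENXIO) {
		lost++;
		/* Assumption: attach has completed by the time we retry. */
		toy_finish_probe(port);
		rc = toy_add_dport(port);
	}

	return rc ? rc : lost;
}
```

If the retry could also observe an unbound port, this model would not
hold, which is why the question above is whether the failure is ever
permanent.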