[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <698275bca732a_55fa10097@dwillia2-mobl4.notmuch>
Date: Tue, 3 Feb 2026 14:25:00 -0800
From: <dan.j.williams@...el.com>
To: Li Ming <ming.li@...omail.com>, danjwilliams <dan.j.williams@...el.com>
CC: dave <dave@...olabs.net>, jonathan.cameron <jonathan.cameron@...wei.com>,
dave.jiang <dave.jiang@...el.com>, alison.schofield
<alison.schofield@...el.com>, vishal.l.verma <vishal.l.verma@...el.com>,
ira.weiny <ira.weiny@...el.com>, linux-cxl <linux-cxl@...r.kernel.org>,
linux-kernel <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 2/2] cxl/core: Hold grandparent port lock while dport
adding
Li Ming wrote:
[..]
> > >
> > > To fix this race, require that dport addition holds the parent port lock
> > > of the target port. The CXL subsystem already requires holding the
> > > parent port lock while attaching a new port. Therefore, successfully
> > > acquiring the parent port lock ganrantees that port attaching has
> > > completed.
> >
> > Are you seeing this case fail permanently? The expectation is that the
> > one that loses the race iterates up the topology and retries.
> >
> > So yes, you can lose this race once, but not twice is the expectation.
> >
> Hi Dan,
>
> My understanding is that would not trigger enumeration retry, because
> enumerating ports flow retries the enumeration only when
> find_or_add_dport() returns a -EAGAIN. but the port's driver checking
> in cxl_port_add_dport() returns a -ENXIO, so it makes
> devm_cxl_enumerate_ports() failure directly.
Ah, true, my mental model was still stuck in the old top-down dport
enumeration scheme.
So, yes, we do need to make sure that switch port creation does not race
port lookup. However, I think the scoped_guard() tends to make code less
readable and in this case hides the opportunity for more comments to
explain what is happening.
I also think, per that observation from Jonathan, that we can save the
cxl_bus_resan() violence by taking the CXL platform device lock.
So, please move the locking internal to find_or_add_dport(), so that
plain guard() can be used. Add comments for the fact that
devm_cxl_create_port() and the CXL platform init path need to be flushed
by taking the device lock. And explain why the device to lock is
different dependening on whether the parent_port is the cxl_root or a
descendant port.
Powered by blists - more mailing lists