Message-ID: <9c1954ed-8ae8-a029-6a37-2065c6addbc2@amd.com>
Date: Wed, 22 Jan 2025 09:26:35 +0000
From: Alejandro Lucero Palau <alucerop@....com>
To: Dan Williams <dan.j.williams@...el.com>, alejandro.lucero-palau@....com,
linux-cxl@...r.kernel.org, netdev@...r.kernel.org, edward.cree@....com,
davem@...emloft.net, kuba@...nel.org, pabeni@...hat.com,
edumazet@...gle.com, dave.jiang@...el.com
Subject: Re: [PATCH v9 15/27] cxl: define a driver interface for HPA free
space enumeration
On 1/21/25 23:44, Dan Williams wrote:
> Alejandro Lucero Palau wrote:
> [..]
>>>> So, I am not sure this code path has ever been tested as lockdep should
>>>> complain about the double acquisition.
>>>
>>> Oddly enough, it has been tested with two different drivers and with
>>> the kernel configuring lockdep.
>>>
>>> It is worth to investigate ...
>>>
>> Confirmed the double lock is not an issue. Maybe the code hidden in
>> those macros is checking if the current caller is the same one that the
>> current owner of the lock. I will check that or investigate further.
> Are you sure?
I'm sure it does not seem to be a problem ... with only
CONFIG_LOCKDEP_SUPPORT=y set, which is what a quick search of the kernel
config file showed me.
But the warning triggers as expected once the right options are enabled
under Kernel hacking -> Lock debugging -> *
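For readers following along, a minimal sketch of the options involved, assuming a recent kernel .config (the exact set of symbols varies by version and architecture, so treat this as illustrative rather than a complete list):

```
# Set by the architecture: only says that lockdep *can* be built.
# On its own it enables no checking at all.
CONFIG_LOCKDEP_SUPPORT=y

# Kernel hacking -> Lock Debugging: this is what turns the validator on.
# PROVE_LOCKING selects LOCKDEP, so the second line follows from the first.
CONFIG_PROVE_LOCKING=y
CONFIG_LOCKDEP=y
```

With only the first symbol present, a nested acquisition of the same lock runs without any report, which would be consistent with the behaviour described above.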
Moreover, my comment yesterday about checking current vs. owner does not
make sense, since that is one of the reasons for the check ...
Happy you spotted it. As I said, I think no special lock is needed for
the following code, but I'll double check before v10.
Thanks!
> This splat:
>
> ============================================
> WARNING: possible recursive locking detected
> 6.13.0-rc2+ #68 Tainted: G OE
> --------------------------------------------
> cat/1212 is trying to acquire lock:
> ffffffffc0591cf0 (cxl_region_rwsem){++++}-{4:4}, at: decoders_committed_show+0x2a/0x90 [cxl_core]
>
> but task is already holding lock:
> ffffffffc0591cf0 (cxl_region_rwsem){++++}-{4:4}, at: decoders_committed_show+0x1e/0x90 [cxl_core]
>
> other info that might help us debug this:
>  Possible unsafe locking scenario:
>
>        CPU0
>        ----
>   lock(cxl_region_rwsem);
>   lock(cxl_region_rwsem);
>
>  *** DEADLOCK ***
>
>
> ...results from this change:
>
> diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
> index 72950f631d49..9ebe9d46422b 100644
> --- a/drivers/cxl/core/port.c
> +++ b/drivers/cxl/core/port.c
> @@ -560,9 +560,11 @@ static ssize_t decoders_committed_show(struct device *dev,
>  	struct cxl_port *port = to_cxl_port(dev);
>  	int rc;
>
> +	down_read(&cxl_region_rwsem);
>  	down_read(&cxl_region_rwsem);
>  	rc = sysfs_emit(buf, "%d\n", cxl_num_decoders_committed(port));
>  	up_read(&cxl_region_rwsem);
> +	up_read(&cxl_region_rwsem);
>
>  	return rc;
>  }
>
> ...and "cat /sys/bus/cxl/devices/port*/decoders_committed".