[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <360de9fe-75eb-4327-a35a-8811fa2d1201@gmail.com>
Date: Wed, 3 Dec 2025 11:13:10 +0100
From: Paweł Mielimonka <pawel.mielimonka@...il.com>
To: Alison Schofield <alison.schofield@...el.com>,
Pawel Mielimonka <pawel.mielimonka@...itsu.com>
Cc: dan.j.williams@...el.com, Smita.KoralahalliChannabasappa@....com,
linux-cxl@...r.kernel.org, linux-kernel@...r.kernel.org, dave@...olabs.net,
jonathan.cameron@...wei.com, dave.jiang@...el.com, vishal.l.verma@...el.com,
ira.weiny@...el.com
Subject: [RFC PATCH v1 2/2] cxl/cli: enforce HPA-descending teardown order for
destroy-region
W dniu 3.12.2025 o 05:15, Alison Schofield pisze:
> On Tue, Nov 25, 2025 at 11:38:24PM +0900, Pawel Mielimonka wrote:
>> Implement destroy_multiple_regions() and bypass the generic
>> do_region_xable() path for ACTION_DESTROY. Regions are collected and
>> sorted by HPA, then destroyed from highest to lowest, stopping at the
>> first "matching after skipped" to provide user with better error log.
>> This prevents attempts on non-last regions and aligns destroy-region
>> with required decoder programming order.
> It would be useful to add a sample bad spew or what happens now
> on attempt to destroy out of order. Folks sometimes search on
> those strings.
In all cases in which I reproduced this misbehavior, in log i could find:
"write(3</sys/devices/platform/ACPI0017:00/root0/port1/endpoint2/decoder2.0/dpa_size>,
"0\n\0", 3) = -1 EBUSY (Device or resource busy)"
And the cli returned:
"cxl region: destroy_region: decoder 2.0: set_dpa_size failed: Device or
resource busy"
>
>> Signed-off-by: Pawel Mielimonka <pawel.mielimonka@...itsu.com>
>> ---
>> cxl/region.c | 60 +++++++++++++++++++++++++++++++++++++++++++++++++---
>> 1 file changed, 57 insertions(+), 3 deletions(-)
>>
>> diff --git a/cxl/region.c b/cxl/region.c
>> index 58765b3d..1bf1901a 100644
>> --- a/cxl/region.c
>> +++ b/cxl/region.c
>> @@ -950,6 +950,57 @@ static int destroy_region(struct cxl_region *region)
>> return cxl_region_delete(region);
>> }
>>
>> +static int destroy_multiple_regions(struct parsed_params *p,
>> + struct cxl_decoder *decoder,
>> + int *count)
>> +{
>> + struct cxl_region **list;
>> + int nr, rc, i;
>> + bool skipped = false;
>> +
>> + rc = collect_regions_sorted(decoder, NULL, &list, &nr);
>> + if (rc) {
>> + log_err(&rl, "failed to allocate region list: %s\n", strerror(-rc));
>> + return rc;
>> + }
>> +
>> + for (i = nr - 1; i >= 0; --i) {
>> + struct cxl_region *region = NULL;
>> +
> Here is where there is a difference btw 'all' and a decoder. 'All' gets
> passed as an argument and the filter function recognizes it. But for the
> by decoder option: "cxl destroy-region -f -d decoder0.2" argc=0 needs
> special handling
>
> Inserting this here worked for me:
> + /* If no region arguments provided, match all regions */
> + if (p->argc == 0)
> + region = list[i];
> +
>
> Then with argc == 0 this next loop is a no-op but that is OK because
> region is now assigned.
Thanks for the suggestion.
According to the manual, "cxl destroy-region -f -d decoder0.2" is not a
valid CLI invocation — a region number (or "all") is required:
SYNOPSIS
cxl destroy-region <region> [<options>]
So the case you describe would normally be written as
"cxl destroy-region all -f -d decoder0.2". If we want to accept the case
where no region argument is provided, it feels like this should be
handled earlier during parsing and the documentation should explicitly
state that "all" is assumed when <region> is omitted.
Given the current docs, I would lean towards reporting an error when
p->argc == 0, unless we decide to update the documentation to permit the
implicit "all" behavior.
I've also tested various cases with explicit region arguments, e.g.
"cxl destroy-region 1 2 3" with regions 0..7 present, and those paths
behave correctly.
Happy to adjust in whichever direction makes the most sense.
>
>> + for (int j = 0; j < p->argc; j++) {
>> + region = util_cxl_region_filter(list[i], p->argv[j]);
>> + if (region)
>> + break;
>> + }
>> +
>> + if (!region) {
>> + skipped = true;
>> + continue;
>> + }
>> +
>> + /* if current region matches filter, but previous didn't, destroying would
>> + * result in breaking HPA continuity
>> + */
> Use kernel comment style. See other samples in this file.
>
>
>> + if (skipped) {
>> + log_err(&rl, "failed to destroy %s: not a valid HPA suffix under %s\n",
> I'm not familiar w the usage of 'suffix' in this context.
> How about replace "not a valid HPA suffix under"
> with "out of order decoder reset"
I intended it to mean "the last region in HPA order". Your wording is
clearer,
so I'll adopt your suggestion in v2.
>
>> + cxl_region_get_devname(region),
>> + cxl_decoder_get_devname(decoder));
>> + rc = -EINVAL;
>> + break;
>> + }
>> +
>> + rc = destroy_region(region);
>> + if (rc) {
>> + log_err(&rl, "%s: failed: %s\n",
>> + cxl_region_get_devname(region), strerror(-rc));
>> + break;
>> + }
>> + ++(*count);
>> + }
>> + free(list);
>> + return rc;
>> +}
>> +
>> static int do_region_xable(struct cxl_region *region, enum region_actions action)
>> {
>> switch (action) {
>> @@ -957,8 +1008,6 @@ static int do_region_xable(struct cxl_region *region, enum region_actions action
>> return cxl_region_enable(region);
>> case ACTION_DISABLE:
>> return disable_region(region);
>> - case ACTION_DESTROY:
>> - return destroy_region(region);
>> default:
>> return -EINVAL;
>> }
>> @@ -1026,7 +1075,12 @@ static int region_action(int argc, const char **argv, struct cxl_ctx *ctx,
>> if (!util_cxl_decoder_filter(decoder,
>> param.root_decoder))
>> continue;
>> - rc = decoder_region_action(p, decoder, action, count);
>> +
>> + if (action == ACTION_DESTROY)
>> + rc = destroy_multiple_regions(p, decoder, count);
>> + else
>> + rc = decoder_region_action(p, decoder, action, count);
>> +
>> if (rc)
>> err_rc = rc;
>> }
>> --
>> 2.45.1.windows.1
>>
Powered by blists - more mailing lists