lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <c91fb7cd-13d5-4280-b2b6-c6ed37512ebd@arm.com>
Date: Fri, 17 Oct 2025 19:51:03 +0100
From: James Morse <james.morse@....com>
To: Gavin Shan <gshan@...hat.com>, linux-kernel@...r.kernel.org,
 linux-arm-kernel@...ts.infradead.org, linux-acpi@...r.kernel.org
Cc: D Scott Phillips OS <scott@...amperecomputing.com>,
 carl@...amperecomputing.com, lcherian@...vell.com,
 bobo.shaobowang@...wei.com, tan.shaopeng@...itsu.com,
 baolin.wang@...ux.alibaba.com, Jamie Iles <quic_jiles@...cinc.com>,
 Xin Hao <xhao@...ux.alibaba.com>, peternewman@...gle.com,
 dfustini@...libre.com, amitsinght@...vell.com,
 David Hildenbrand <david@...hat.com>, Dave Martin <dave.martin@....com>,
 Koba Ko <kobak@...dia.com>, Shanker Donthineni <sdonthineni@...dia.com>,
 fenghuay@...dia.com, baisheng.gao@...soc.com,
 Jonathan Cameron <jonathan.cameron@...wei.com>, Rob Herring
 <robh@...nel.org>, Rohit Mathew <rohit.mathew@....com>,
 Rafael Wysocki <rafael@...nel.org>, Len Brown <lenb@...nel.org>,
 Lorenzo Pieralisi <lpieralisi@...nel.org>, Hanjun Guo
 <guohanjun@...wei.com>, Sudeep Holla <sudeep.holla@....com>,
 Catalin Marinas <catalin.marinas@....com>, Will Deacon <will@...nel.org>,
 Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
 Danilo Krummrich <dakr@...nel.org>
Subject: Re: [PATCH v2 06/29] ACPI / MPAM: Parse the MPAM table

Hi Gavin,

On 03/10/2025 01:58, Gavin Shan wrote:
> On 9/11/25 6:42 AM, James Morse wrote:
>> Add code to parse the arm64 specific MPAM table, looking up the cache
>> level from the PPTT and feeding the end result into the MPAM driver.
>>
>> For now the MPAM hook mpam_ris_create() is stubbed out, but will update
>> the MPAM driver with optional discovered data.

>> diff --git a/drivers/acpi/arm64/mpam.c b/drivers/acpi/arm64/mpam.c
>> new file mode 100644
>> index 000000000000..fd9cfa143676
>> --- /dev/null
>> +++ b/drivers/acpi/arm64/mpam.c

>> +static bool acpi_mpam_register_irq(struct platform_device *pdev, int intid,
>> +                   u32 flags, int *irq,
>> +                   u32 processor_container_uid)
>> +{
>> +    int sense;
>> +
>> +    if (!intid)
>> +        return false;
>> +
>> +    if (FIELD_GET(ACPI_MPAM_MSC_IRQ_TYPE_MASK, flags) !=
>> +        ACPI_MPAM_MSC_IRQ_TYPE_WIRED)
>> +        return false;
>> +
>> +    sense = FIELD_GET(ACPI_MPAM_MSC_IRQ_MODE_MASK, flags);
>> +
>> +    if (16 <= intid && intid < 32 && processor_container_uid != GLOBAL_AFFINITY) {
>> +        pr_err_once("Partitioned interrupts not supported\n");
>> +        return false;
>> +    }
>> +
>> +    *irq = acpi_register_gsi(&pdev->dev, intid, sense, ACPI_ACTIVE_HIGH);
>> +    if (*irq <= 0) {
>> +        pr_err_once("Failed to register interrupt 0x%x with ACPI\n",
>> +                intid);
>> +        return false;
>> +    }
>> +
>> +    return true;
>> +}
> 
> 0 is allowed by acpi_register_gsi().
> 
>     if (*irq < 0) {
>         pr_err_once(...);
>         return false;
>     }

Really? I thought irq-zero was nonsense.
acpi_register_gsi() does this:
|        irq = irq_create_fwspec_mapping(&fwspec);
|        if (!irq)
|                return -EINVAL;
|
|        return irq;


>> +static int acpi_mpam_parse_resource(struct mpam_msc *msc,
>> +                    struct acpi_mpam_resource_node *res)
>> +{
>> +    int level, nid;
>> +    u32 cache_id;
>> +
>> +    switch (res->locator_type) {
>> +    case ACPI_MPAM_LOCATION_TYPE_PROCESSOR_CACHE:
>> +        cache_id = res->locator.cache_locator.cache_reference;
>> +        level = find_acpi_cache_level_from_id(cache_id);
>> +        if (level <= 0) {
>> +            pr_err_once("Bad level (%u) for cache with id %u\n", level, cache_id);
>> +            return -EINVAL;
>> +        }
>> +        return mpam_ris_create(msc, res->ris_index, MPAM_CLASS_CACHE,
>> +                       level, cache_id);
>> +    case ACPI_MPAM_LOCATION_TYPE_MEMORY:
>> +        nid = pxm_to_node(res->locator.memory_locator.proximity_domain);
>> +        if (nid == NUMA_NO_NODE)
>> +            nid = 0;
>> +        return mpam_ris_create(msc, res->ris_index, MPAM_CLASS_MEMORY,
>> +                       255, nid);
> 
> It's perhaps worthy a warning message when @nid is explicitly set to zero due to
> the bad proximity domain, something like below.
> 
>         if (nid == NUMA_NO_NODE) {
>             nid = 0;
>             if (num_possible_nodes() > 1) {
>                 pr_warn("Bad proximity domain %d, mapped to node 0\n",
>                     res->locator.memory_locator.proximity_domain);
>             }
>         }

This was to catch the case where you build the kernel without NUMA support - which
wouldn't be an error. But that returns 0 instead of NUMA_NO_NODE, so NUMA_NO_NODE only
occurs when there is a bug. I'l add this - but it'll be a pr_debug() as the message is
only of use to about 4 people!


>> +    default:
>> +        /* These get discovered later and treated as unknown */
>> +        return 0;
>> +    }
>> +}

>> +int acpi_mpam_count_msc(void)
>> +{
>> +    struct acpi_table_header *table __free(acpi_table) =
>> acpi_get_table_ret(ACPI_SIG_MPAM, 0);
>> +    char *table_end, *table_offset = (char *)(table + 1);
>> +    struct acpi_mpam_msc_node *tbl_msc;
>> +    int count = 0;
>> +
>> +    if (IS_ERR(table))
>> +        return 0;
>> +
>> +    if (table->revision < 1)
>> +        return 0;
>> +
>> +    table_end = (char *)table + table->length;
>> +
>> +    while (table_offset < table_end) {
>> +        tbl_msc = (struct acpi_mpam_msc_node *)table_offset;
>> +        if (!tbl_msc->mmio_size)
>> +            continue;
>> +
>> +        if (tbl_msc->length < sizeof(*tbl_msc))
>> +            return -EINVAL;
>> +        if (tbl_msc->length > table_end - table_offset)
>> +            return -EINVAL;
>> +        table_offset += tbl_msc->length;
>> +
>> +        count++;
>> +    }
>> +
>> +    return count;
>> +}

> acpi_mpam_count_msc() iterates the existing MSC node, which is part of acpi_mpam_parse().
> So the question is why we can't drop acpi_mpam_count_msc() and maintain a variable to
> count the existing MSC nodes in acpi_mpam_parse() ?

Once the platform device has been created, the driver's probe call can be called, and that
needs to know how many MSC there are going to be. Doing it like this means we don't depend
on the driver probe function not being called until we got to the end of the list.
(the comments about the initcall dependencies are already annoying - I prefer not to add
any that are specific to MPAM).

This also lets us catch non-backward compatible ACPI table changes, which has already
happened with PCC. acpi_mpam_parse() skips MSC with reserved fields set,
acpi_mpam_count_msc() does not - this means we are guaranteed the values will mismatch
if any of those reserved fields are set, and the driver will never try to touch the hardware.


Thanks,

James


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ