lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <75244728-bc16-4657-9cab-9e931399e766@arm.com>
Date: Wed, 12 Mar 2025 18:04:31 +0000
From: James Morse <james.morse@....com>
To: Amit Singh Tomar <amitsinght@...vell.com>, x86@...nel.org,
 linux-kernel@...r.kernel.org
Cc: Reinette Chatre <reinette.chatre@...el.com>,
 Thomas Gleixner <tglx@...utronix.de>, Ingo Molnar <mingo@...hat.com>,
 Borislav Petkov <bp@...en8.de>, H Peter Anvin <hpa@...or.com>,
 Babu Moger <Babu.Moger@....com>, shameerali.kolothum.thodi@...wei.com,
 D Scott Phillips OS <scott@...amperecomputing.com>,
 carl@...amperecomputing.com, lcherian@...vell.com,
 bobo.shaobowang@...wei.com, tan.shaopeng@...itsu.com,
 baolin.wang@...ux.alibaba.com, Jamie Iles <quic_jiles@...cinc.com>,
 Xin Hao <xhao@...ux.alibaba.com>, peternewman@...gle.com,
 dfustini@...libre.com, David Hildenbrand <david@...hat.com>,
 Rex Nie <rex.nie@...uarmicro.com>, Dave Martin <dave.martin@....com>,
 Koba Ko <kobak@...dia.com>, Shanker Donthineni <sdonthineni@...dia.com>,
 fenghuay@...dia.com
Subject: Re: [PATCH v7 37/49] x86/resctrl: Expand the width of dom_id by
 replacing mon_data_bits

Hi Amit,

On 07/03/2025 10:17, Amit Singh Tomar wrote:
>> MPAM platforms retrieve the cache-id property from the ACPI PPTT table.
>> The cache-id field is 32 bits wide. Under resctrl, the cache-id becomes
>> the domain-id, and is packed into the mon_data_bits union bitfield.
>> The width of cache-id in this field is 14 bits.
>>
>> Expanding the union would break 32bit x86 platforms as this union is
>> stored as the kernfs kn->priv pointer. This saved allocating memory
>> for the priv data storage.
>>
>> The firmware on MPAM platforms have used the PPTT cache-id field to
>> expose the interconnect's id for the cache, which is sparse and uses
>> more than 14 bits. Use of this id is to enable PCIe direct cache
>> injection hints. Using this feature with VFIO means the value provided
>> by the ACPI table should be exposed to user-space.
>>
>> To support cache-id values greater than 14 bits, convert the
>> mon_data_bits union to a structure. This is allocated for the default
>> control group when the kernfs event files are created, and free'd when
>> the monitor directory is rmdir'd when the domain goes offline.
>> All other control and monitor groups lookup the struct mon_data allocated
>> for the default control group, and use this.
>> This simplifies the lifecycle of this structure as the default control
>> group cannot be rmdir()d by user-space, so only needs to consider
>> domain-offline, which removes all the event files corresponding to a
>> domain while holding rdtgroup_mutex - which prevents concurrent
>> readers. mkdir_mondata_subdir_allrdtgrp() must special case the default
>> control group to ensure it is created first.


>> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/
>> rdtgroup.c
>> index aecd3fa734cd..443635d195f0 100644
>> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> @@ -3114,6 +3114,110 @@ static struct file_system_type rdt_fs_type = {
>>       .kill_sb        = rdt_kill_sb,
>>   };
>>   +/**
>> + * mon_get_default_kn_priv() - Get the mon_data priv data for this event from
>> + *                             the default control group.
>> + * Called when monitor event files are created for a domain.
>> + * When called with the default control group, the structure will be allocated.
>> + * This happens at mount time, before other control or monitor groups are
>> + * created.
>> + * This simplifies the lifetime management for rmdir() versus domain-offline
>> + * as the default control group lives forever, and only one group needs to be
>> + * special cased.
>> + *
>> + * @r:      The resource for the event type being created.
>> + * @d:        The domain for the event type being created.
>> + * @mevt:   The event type being created.
>> + * @rdtgrp: The rdtgroup for which the monitor file is being created,
>> + *          used to determine if this is the default control group.
>> + * @do_sum: Whether the SNC sub-numa node monitors are being created.
>> + */
>> +static struct mon_data *mon_get_default_kn_priv(struct rdt_resource *r,
>> +                        struct rdt_mon_domain *d,
>> +                        struct mon_evt *mevt,
>> +                        struct rdtgroup *rdtgrp,
>> +                        bool do_sum)
>> +{
>> +    struct kernfs_node *kn_dom, *kn_evt;
>> +    struct mon_data *priv;
>> +    bool snc_mode;
>> +    char name[32];
>> +
>> +    lockdep_assert_held(&rdtgroup_mutex);
>> +
>> +    snc_mode = r->mon_scope == RESCTRL_L3_NODE;
>> +    if (!do_sum)
>> +        sprintf(name, "mon_%s_%02d", r->name, snc_mode ? d->ci->id : d->hdr.id);

> This change triggered a minor report during compilation.
> 
> fs/resctrl/rdtgroup.c: In function ‘mon_get_default_kn_priv’:
> fs/resctrl/rdtgroup.c:2931:28: warning: format ‘%d’ expects argument of type ‘int’, but
> argument 4 has type ‘long unsigned int’ [-Wformat=]
>  2931 |   sprintf(name, "mon_%s_%02d", r->name, snc_mode ? d->ci->id : d->hdr.id);
>       |                         ~~~^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>       | |                                 |
>       | int                               long unsigned int
>       |                         %02ld

Heh, not yet its not! You must have rebased the MPAM tree on-top, its a patch in there
that causes this:
This is because of the device-tree folk want to make cache-id an unsigned long so they can
use the arm CPU's affinity id as a cache-id. That patch already has to cleanup this
pattern elsewhere in resctrl, I need to add this one to it.

That thing is a discussion for the DT folk to drive ... I think they could just as easily
use the CPU number - only it wouldn't be a hardware-derived value. (the upshot is
cache-ids could change over a firmware update - which I think is fine)


Thanks,

James

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ