[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <fd338526-9c89-409a-8bbd-13704b79a70a@arm.com>
Date: Mon, 19 Jan 2026 17:20:51 +0000
From: Ben Horgan <ben.horgan@....com>
To: Reinette Chatre <reinette.chatre@...el.com>
Cc: amitsinght@...vell.com, baisheng.gao@...soc.com,
baolin.wang@...ux.alibaba.com, carl@...amperecomputing.com,
dave.martin@....com, david@...nel.org, dfustini@...libre.com,
fenghuay@...dia.com, gshan@...hat.com, james.morse@....com,
jonathan.cameron@...wei.com, kobak@...dia.com, lcherian@...vell.com,
linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org,
peternewman@...gle.com, punit.agrawal@....qualcomm.com,
quic_jiles@...cinc.com, rohit.mathew@....com, scott@...amperecomputing.com,
sdonthineni@...dia.com, tan.shaopeng@...itsu.com, xhao@...ux.alibaba.com,
catalin.marinas@....com, will@...nel.org, corbet@....net, maz@...nel.org,
oupton@...nel.org, joey.gouly@....com, suzuki.poulose@....com,
kvmarm@...ts.linux.dev
Subject: Re: [PATCH v3 14/47] arm_mpam: resctrl: Add boilerplate cpuhp and
domain allocation
Hi Reinette,
On 1/13/26 16:49, Reinette Chatre wrote:
> Hi Ben,
>
> (Please note I am unfamiliar with this code so missing some context.)
>
> On 1/12/26 8:58 AM, Ben Horgan wrote:
>> +
>> +static struct mpam_resctrl_dom *
>> +mpam_resctrl_alloc_domain(unsigned int cpu, struct mpam_resctrl_res *res)
>> +{
>> + int err;
>> + struct mpam_resctrl_dom *dom;
>> + struct rdt_mon_domain *mon_d;
>> + struct rdt_ctrl_domain *ctrl_d;
>> + struct mpam_class *class = res->class;
>> + struct mpam_component *comp_iter, *ctrl_comp;
>> + struct rdt_resource *r = &res->resctrl_res;
>> +
>> + lockdep_assert_held(&domain_list_lock);
>> +
>> + ctrl_comp = NULL;
>> + guard(srcu)(&mpam_srcu);
>> + list_for_each_entry_srcu(comp_iter, &class->components, class_list,
>> + srcu_read_lock_held(&mpam_srcu)) {
>> + if (cpumask_test_cpu(cpu, &comp_iter->affinity)) {
>> + ctrl_comp = comp_iter;
>> + break;
>> + }
>> + }
>> +
>> + /* class has no component for this CPU */
>> + if (WARN_ON_ONCE(!ctrl_comp))
>> + return ERR_PTR(-EINVAL);
>> +
>> + dom = kzalloc_node(sizeof(*dom), GFP_KERNEL, cpu_to_node(cpu));
>> + if (!dom)
>> + return ERR_PTR(-ENOMEM);
>> +
>> + if (exposed_alloc_capable) {
>> + dom->ctrl_comp = ctrl_comp;
>> +
>> + ctrl_d = &dom->resctrl_ctrl_dom;
>> + mpam_resctrl_domain_hdr_init(cpu, ctrl_comp, &ctrl_d->hdr);
>> + ctrl_d->hdr.type = RESCTRL_CTRL_DOMAIN;
>> + /* TODO: this list should be sorted */
>> + list_add_tail_rcu(&ctrl_d->hdr.list, &r->ctrl_domains);
>> + err = resctrl_online_ctrl_domain(r, ctrl_d);
>> + if (err) {
>> + dom = ERR_PTR(err);
>> + goto offline_ctrl_domain;
>
> It should not be necessary to offline the control domain if attempt to
> online it failed but removing it from the ctrl_domains list is necessary. What
> happens to memory dom points to?
Yeah, that leaks the memory and the offline call is unnecessary.
>
>> + }
>> + } else {
>> + pr_debug("Skipped control domain online - no controls\n");
>> + }
>> +
>> + if (exposed_mon_capable) {
>> + mon_d = &dom->resctrl_mon_dom;
>> + mpam_resctrl_domain_hdr_init(cpu, ctrl_comp, &mon_d->hdr);
>> + mon_d->hdr.type = RESCTRL_MON_DOMAIN;
>> + /* TODO: this list should be sorted */
>> + list_add_tail_rcu(&mon_d->hdr.list, &r->mon_domains);
>> + err = resctrl_online_mon_domain(r, mon_d);
>> + if (err) {
>> + dom = ERR_PTR(err);
>> + goto offline_mon_hdr;
>> + }
>> + } else {
>> + pr_debug("Skipped monitor domain online - no monitors\n");
>> + }
>> +
>> + return dom;
>> +
>> +offline_mon_hdr:
>> + mpam_resctrl_offline_domain_hdr(cpu, &mon_d->hdr);
>> +offline_ctrl_domain:
>> + resctrl_offline_ctrl_domain(r, ctrl_d);
>> +
>> + return dom;
>
> This error path is unexpected to me. From what I can tell, if there is a problem
> initializing the monitor domain this flow will undo both monitor and control domain,
> even if initialization of control domain was successful. In this case:
> - Flow jumps to error path from within the if (exposed_mon_capable) block and proceeds
> to do control domain cleanup without considering whether control domain was initialized
> or not. That is, does not take exposed_alloc_capable into account
Yes.
> - Control domain cleanup seems to be partial, for example, should it remove domain from ctrl_domains list?
Indeed.
> - On failure there is dom = ERR_PTR(err) but I cannot see where this memory is freed in both
> the monitor and control domain error paths.
Yes, it's missing.
I've reworked the code to move the resctrl_online_*() calls further up
so there is less to do on error, added a kfree(dom) and made the
ctrl_mon cleanup after the monitor domain failure to be conditional on
exposed_alloc_capable.
>
>
>> +int mpam_resctrl_online_cpu(unsigned int cpu)
>> +{
>> + struct mpam_resctrl_res *res;
>> + enum resctrl_res_level rid;
>> +
>> + guard(mutex)(&domain_list_lock);
>> + for_each_mpam_resctrl_control(res, rid) {
>> + struct mpam_resctrl_dom *dom;
>> +
>> + if (!res->class)
>> + continue; // dummy_resource;
>> +
>> + dom = mpam_resctrl_get_domain_from_cpu(cpu, res);
>
> On success, should cpu be added to the respective headers' cpumask?
Yes, added.
>
>> + if (!dom)
>> + dom = mpam_resctrl_alloc_domain(cpu, res);
>> + if (IS_ERR(dom))
>> + return PTR_ERR(dom);
>> + }
>> +
>> + resctrl_online_cpu(cpu);
>> +
>> + return 0;
>> +}
>
>
> Reinette
>
Thanks,
Ben
Powered by blists - more mailing lists