Message-ID: <513b0698bab160228c642598ba6cc7abdad1b694.camel@surriel.com>
Date: Tue, 03 Jun 2025 16:08:37 -0400
From: Rik van Riel <riel@...riel.com>
To: Dave Hansen <dave.hansen@...el.com>, linux-kernel@...r.kernel.org
Cc: linux-mm@...ck.org, x86@...nel.org, kernel-team@...a.com,
dave.hansen@...ux.intel.com, luto@...nel.org, peterz@...radead.org,
tglx@...utronix.de, mingo@...hat.com, bp@...en8.de, hpa@...or.com,
nadav.amit@...il.com, Yu-cheng Yu <yu-cheng.yu@...el.com>
Subject: Re: [RFC v2 7/9] x86/mm: Introduce Remote Action Request
On Wed, 2025-05-21 at 09:38 -0700, Dave Hansen wrote:
>
> > +static void wait_for_done(unsigned long idx, int target_cpu)
> > +{
> > + u8 status;
> > + u8 *rar_actions = per_cpu(rar_action, target_cpu);
> > +
> > + status = READ_ONCE(rar_actions[idx]);
> > +
> > + while ((status != RAR_ACTION_OK) && (status != RAR_ACTION_FAIL)) {
>
> Should this be:
>
> while (status == RAR_ACTION_START) {
> ...
>
> ? That would more clearly link it to set_action_entry() and would also
> be shorter.
>
That is a very good question. The old RAR code
suggests there might be some intermediate state
while the target CPU is processing the RAR entry,
but the current documentation only shows
RAR_SUCCESS, RAR_PENDING, and RAR_FAILURE
as possible values.

Let's try with status == RAR_ACTION_PENDING.
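
Roughly, the loop would then look like the sketch below. This is a
userspace illustration only, not the patch itself: the status values,
the read_once_u8() stand-in for READ_ONCE(), and the max_spins bound
are all assumptions made so the example compiles and terminates on
its own.

```c
#include <stdint.h>

/* Hypothetical encodings; the real values come from the RAR spec. */
enum {
	RAR_ACTION_OK      = 0x00,
	RAR_ACTION_PENDING = 0x01,
	RAR_ACTION_FAIL    = 0x80,
};

/* Userspace stand-in for the kernel's READ_ONCE() on a u8. */
static inline uint8_t read_once_u8(const volatile uint8_t *p)
{
	return *p;
}

/*
 * Spin only while the entry is still pending, then return the last
 * status seen. max_spins is not in the patch; it is here so the
 * sketch cannot loop forever when nothing updates the entry.
 */
static uint8_t wait_for_done(const volatile uint8_t *actions,
			     unsigned long idx, unsigned long max_spins)
{
	uint8_t status = read_once_u8(&actions[idx]);

	while (status == RAR_ACTION_PENDING && max_spins--)
		status = read_once_u8(&actions[idx]);

	return status;
}
```

Any value other than the pending one ends the wait, so an unexpected
intermediate state would show up as a returned status rather than as
a hang.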
> >
> > +void rar_cpu_init(void)
> > +{
> > + u64 r;
> > + u8 *bitmap;
> > + int this_cpu = smp_processor_id();
> > +
> > + cpumask_clear(&per_cpu(rar_cpu_mask, this_cpu));
> > +
> > + rdmsrl(MSR_IA32_RAR_INFO, r);
> > + pr_info_once("RAR: support %lld payloads\n", r >> 32);
>
> Doesn't this need to get coordinated or checked against
> RAR_MAX_PAYLOADS?
I just added that in, and also applied all the cleanups
from your email.
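
The check amounts to something like the sketch below. It is only an
illustration: the RAR_MAX_PAYLOADS value and the rar_usable_payloads()
helper name are made up here, and it assumes the upper 32 bits of
MSR_IA32_RAR_INFO hold a plain payload count, as the pr_info_once()
above treats them.

```c
#include <stdint.h>

/* Hypothetical table size; the real constant is defined by the patch. */
#define RAR_MAX_PAYLOADS 64u

/*
 * Take the payload count from the upper 32 bits of the RAR_INFO value
 * and never use more entries than the kernel-side table can hold.
 */
static unsigned int rar_usable_payloads(uint64_t rar_info)
{
	uint64_t hw_payloads = rar_info >> 32;

	if (hw_payloads > RAR_MAX_PAYLOADS)
		return RAR_MAX_PAYLOADS;

	return (unsigned int)hw_payloads;
}
```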
>
> > + // reserved bits!!! r |= (RAR_VECTOR & 0xff);
>
> Is this just some cruft from testing?
>
I'm kind of guessing the old code might have used this
value to specify which IRQ vector to use for RAR, but
modern microcode hardcodes the RAR_VECTOR value.
> > + wrmsrl(MSR_IA32_RAR_CTRL, r);
> > +}
> > +
> > +/*
> > + * This is a modified version of smp_call_function_many() of kernel/smp.c,
> > + * without a function pointer, because the RAR handler is the ucode.
> > + */
>
> It doesn't look _that_ much like smp_call_function_many(). I don't see
> much that can be consolidated.
Agreed. It looks even less like it after some more
simplifications.
>
> > + /* No online cpus? We're done. */
> > + if (cpu >= nr_cpu_ids)
> > + return;
>
> This little idiom _is_ in smp_call_function_many_cond(). I wonder if it
> can be refactored out.
Removing the arch_send_rar_single_ipi fast path
gets rid of this code completely.

Once we cpumask_and() the mask with cpu_online_mask,
cpumask_weight() should come out as 0 if none of
the CPUs in the mask are online.
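
As a userspace illustration of that reasoning (not kernel code: a
uint64_t stands in for struct cpumask, and mask_weight() and
rar_target_count() are made-up names):

```c
#include <stdint.h>

/* Count set bits, standing in for cpumask_weight(). */
static unsigned int mask_weight(uint64_t mask)
{
	unsigned int n = 0;

	while (mask) {
		mask &= mask - 1;	/* clear the lowest set bit */
		n++;
	}
	return n;
}

/*
 * AND the requested CPUs with the online CPUs, as cpumask_and() would,
 * and return how many targets remain. A result of 0 means there is
 * nothing to send, so no separate "no online cpus" check is needed.
 */
static unsigned int rar_target_count(uint64_t requested, uint64_t online)
{
	return mask_weight(requested & online);
}
```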
Thank you for all the cleanup suggestions.
I've tried to address them all for v3.
--
All Rights Reversed.