Message-ID: <CAJZ5v0ireu4pOedLjMjK2NrLkq_2vySpdgEgGccQEiFC5=otWQ@mail.gmail.com>
Date: Mon, 15 Apr 2024 13:37:08 +0200
From: "Rafael J. Wysocki" <rafael@...nel.org>
To: Jonathan Cameron <Jonathan.Cameron@...wei.com>
Cc: Thomas Gleixner <tglx@...utronix.de>, "Russell King (Oracle)" <linux@...linux.org.uk>,
"Rafael J. Wysocki" <rafael@...nel.org>, linux-pm@...r.kernel.org, loongarch@...ts.linux.dev,
linux-acpi@...r.kernel.org, linux-arch@...r.kernel.org,
linux-kernel@...r.kernel.org, linux-arm-kernel@...ts.infradead.org,
kvmarm@...ts.linux.dev, x86@...nel.org, Miguel Luis <miguel.luis@...cle.com>,
James Morse <james.morse@....com>, Salil Mehta <salil.mehta@...wei.com>,
Jean-Philippe Brucker <jean-philippe@...aro.org>, Catalin Marinas <catalin.marinas@....com>,
Will Deacon <will@...nel.org>, linuxarm@...wei.com, justin.he@....com,
jianyong.wu@....com
Subject: Re: [PATCH v5 03/18] ACPI: processor: Register deferred CPUs from acpi_processor_get_info()

On Mon, Apr 15, 2024 at 10:46 AM Jonathan Cameron
<Jonathan.Cameron@...wei.com> wrote:
>
> On Sat, 13 Apr 2024 01:23:48 +0200
> Thomas Gleixner <tglx@...utronix.de> wrote:
>
> > Russell!
> >
> > On Fri, Apr 12 2024 at 22:52, Russell King (Oracle) wrote:
> > > On Fri, Apr 12, 2024 at 10:54:32PM +0200, Thomas Gleixner wrote:
> > >> > As for the cpu locking, I couldn't find anything in arch_register_cpu()
> > >> > that depends on the cpu_maps_update stuff nor needs the cpus_write_lock
> > >> > being taken - so I've no idea why the "make_present" case takes these
> > >> > locks.
> > >>
> > >> Anything which updates a CPU mask, e.g. cpu_present_mask, after early
> > >> boot must hold the appropriate write locks. Otherwise it would be
> > >> possible to online a CPU which just got marked present, but the
> > >> registration has not completed yet.
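
For illustration, the pattern being described looks roughly like the
following. A minimal sketch: the wrapper function is hypothetical, but
set_cpu_present() and the cpu_maps_update_begin()/cpu_maps_update_done()
pair are the actual helpers from include/linux/cpumask.h and
include/linux/cpu.h.

    #include <linux/cpu.h>
    #include <linux/cpumask.h>

    /*
     * Hypothetical sketch: marking a CPU present after early boot must
     * be serialized against hotplug operations, otherwise a concurrent
     * online attempt could see the CPU as present before its
     * registration has completed.
     */
    static void mark_cpu_present_serialized(unsigned int cpu)
    {
            cpu_maps_update_begin();  /* takes cpu_add_remove_lock */
            set_cpu_present(cpu, true);
            cpu_maps_update_done();
    }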
> > >
> > > Yes. As far as I've been able to determine, arch_register_cpu()
> > > doesn't manipulate any of the CPU masks. All it seems to be doing
> > > is initialising the struct cpu, registering the embedded struct
> > > device, and setting up the sysfs links to its NUMA node.
> > >
> > > There is nothing obvious in there which manipulates any CPU masks, and
> > > this is rather my fundamental point when I said "I couldn't find
> > > anything in arch_register_cpu() that depends on ...".
> > >
> > > If there is something, then comments in the code would be a useful aid
> > > because it's highly non-obvious where such a manipulation is located,
> > > and hence why the locks are necessary.
> >
> > acpi_processor_hotadd_init()
> > ...
> > acpi_map_cpu(pr->handle, pr->phys_id, pr->acpi_id, &pr->id);
> >
> > That ends up in fiddling with cpu_present_mask.
> >
> > I grant you that arch_register_cpu() is not, but it might rely on the
> > external locking too. I could not be bothered to figure that out.
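
A condensed view of that path, paraphrased rather than verbatim kernel
code (the exact locking has moved around between kernel versions):

    /*
     * acpi_processor_hotadd_init(pr)
     *   cpu_maps_update_begin();
     *   cpus_write_lock();
     *   acpi_map_cpu(pr->handle, pr->phys_id, pr->acpi_id, &pr->id);
     *     -> arch code picks a logical CPU id and eventually calls
     *        set_cpu_present(cpu, true), i.e. cpu_present_mask changes
     *   cpus_write_unlock();
     *   cpu_maps_update_done();
     */

So arch_register_cpu() itself may not touch the masks, but it runs under
the same locks as code that does.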
> >
> > >> Define "real hotplug" :)
> > >>
> > >> Real physical hotplug does not really exist. That's at least true for
> > >> x86, where the physical hotplug support was chased for a while, but
> > >> never ended up in production.
> > >>
> > >> Though virtualization happily jumped on it to hot add/remove CPUs
> > >> to/from a guest.
> > >>
> > >> There are limitations to this and we learned it the hard way on X86. At
> > >> the end we came up with the following restrictions:
> > >>
> > >> 1) All possible CPUs have to be advertised at boot time via firmware
> > >> (ACPI/DT/whatever) independent of them being present at boot time
> > >> or not.
> > >>
> > >> That guarantees proper sizing and ensures that associations
> > >> between hardware entities and software representations and the
> > >> resulting topology are stable for the lifetime of a system.
> > >>
> > >> It is really required to know the full topology of the system at
> > >> boot time especially with hybrid CPUs where some of the cores
> > >> have hyperthreading and the others do not.
> > >>
> > >>
> > >> 2) Hot add can only mark an already registered (possible) CPU
> > >> present. Adding non-registered CPUs after boot is not possible.
> > >>
> > >> The CPU must have been registered in #1 already to ensure that
> > >> the system topology does not suddenly change in an incompatible
> > >> way at run-time.
> > >>
> > >> The same restriction would apply to real physical hotplug. I don't think
> > >> that's any different for ARM64 or any other architecture.
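
A minimal sketch of what restriction #2 amounts to; the function is
hypothetical, while cpu_possible() and set_cpu_present() are the real
cpumask helpers:

    #include <linux/cpu.h>
    #include <linux/cpumask.h>
    #include <linux/errno.h>

    /*
     * Hypothetical: accept a hot-added CPU only if firmware advertised
     * it as possible at boot; the possible set must never grow at
     * run time.
     */
    static int hot_add_cpu_checked(unsigned int cpu)
    {
            if (!cpu_possible(cpu))
                    return -ENODEV; /* not registered at boot: reject */

            cpu_maps_update_begin();
            set_cpu_present(cpu, true);
            cpu_maps_update_done();
            return 0;
    }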
> > >
> > > This makes me wonder whether Arm64 has been barking up the wrong
> > > tree then, and whether the whole "present" vs "enabled" thing comes
> > > from a misunderstanding of what a CPU is.
> > >
> > > However, there is a big difference between the two. On x86, a processor
> > > is just a processor. On Arm64, a "processor" is a slice of the system
> > > (includes the interrupt controller, PMUs etc) and we must enumerate
> > > those even when the processor itself is not enabled. This is the whole
> > > reason there's a difference between "present" and "enabled" and why
> > > there's a difference between x86 cpu hotplug and arm64 cpu hotplug.
> > > The processor never actually goes away in arm64, it's just prevented
> > > from being used.
> >
> > It's the same on X86 at least in the physical world.
>
> There were public calls on this via the Linaro Open Discussions group,
> so I can talk a little about how we ended up here. Note that (in my
> opinion) there is zero chance of this changing - it took us well over
> a year to get to this conclusion. So if we ever want ARM vCPU HP
> we need to work within these constraints.
>
> The ARM architecture folk (the ones defining the ARM ARM, relevant ACPI
> specs etc, not the kernel maintainers) are determined that they want
> to retain the option to do real physical CPU hotplug in the future
> with all the necessary work around dynamic interrupt controller
> initialization, debug and many other messy corners.
That's OK, but the difference is not in the ACPI CPU enumeration/removal code.
> Thus anything defined had to be structured in a way that was 'different'
> from that.
Apparently, that's where things got confused.
> I don't mind the proposed flattening of the 2 paths if the ARM kernel
> maintainers are fine with it but it will remove the distinctions and
> we will need to be very careful with the CPU masks - we can't handle
> them the same as x86 does.
At the ACPI code level, there is no distinction.
A CPU that was not available before has just become available. The
platform firmware has notified the kernel about it and now
acpi_processor_add() runs. Why would it need to use different code
paths depending on what _STA bits were clear before?
Yes, there is some arch stuff to be called and that arch stuff should
figure out what to do to make things actually work.
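
To make that concrete, a hedged sketch: the helper below is
hypothetical, but the ACPI_STA_* bits are the real ones from
include/acpi/actypes.h.

    #include <acpi/acpi.h>

    /*
     * Hypothetical: after the notification, all the ACPI core needs to
     * know is the current _STA value; which bit (present or enabled)
     * was clear beforehand does not matter for this decision.
     */
    static bool processor_now_available(unsigned long long sta)
    {
            return (sta & ACPI_STA_DEVICE_PRESENT) &&
                   (sta & ACPI_STA_DEVICE_ENABLED);
    }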
> I'll get on with doing that, but do need input from Will / Catalin / James.
> There are some quirks that need calling out, as it's not quite as simple
> as it appears from a high level.
>
> Another part of that long discussion established that there is userspace
> (Android IIRC) in which the CPU present mask must include all CPUs
> at boot. To change that would be userspace ABI breakage so we can't
> do that. Hence the dance around adding yet another mask to allow the
> OS to understand which CPUs are 'present' but not possible to online.
>
> Flattening the two paths removes any distinction between calls that
> are for real hotplug and those that are for this online capable path.
Which calls exactly do you mean?
> As a side note, the indicating bit for these flows is defined in ACPI
> for x86 from ACPI 6.3 as a flag in the Processor Local APIC structure
> (the ARM64 definition is a cut and paste of that text). So someone
> is interested in this distinction on x86. I can't say who, but if
> you have a Mantis account you can easily follow the history, and it
> might be instructive: not everyone considers the current x86
> flow the right way to do it.
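
For reference, the kernel-side flag bits being referred to look like
this; ACPI_MADT_ENABLED and ACPI_MADT_ONLINE_CAPABLE are the real MADT
local-APIC flags from include/acpi/actbl2.h, while the helper itself is
hypothetical:

    #include <acpi/acpi.h>

    /*
     * Hypothetical: a MADT CPU entry is usable if the CPU is enabled
     * now, or disabled but marked online-capable (may be enabled
     * later). Entries with neither flag set are to be ignored.
     */
    static bool lapic_entry_usable(u32 flags)
    {
            return (flags & ACPI_MADT_ENABLED) ||
                   (flags & ACPI_MADT_ONLINE_CAPABLE);
    }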
So a physically absent processor is different from a physically
present processor that has not been enabled. No doubt about this.
That said, I'm still unsure why these two cases require two different
code paths in acpi_processor_add().