Message-ID: <e80bfa8cb9b74997a4214e531366c71d@huawei.com>
Date: Mon, 15 Apr 2024 11:51:30 +0000
From: Salil Mehta <salil.mehta@...wei.com>
To: Thomas Gleixner <tglx@...utronix.de>, "Russell King (Oracle)"
<linux@...linux.org.uk>, "Rafael J. Wysocki" <rafael@...nel.org>
CC: Jonathan Cameron <jonathan.cameron@...wei.com>, "linux-pm@...r.kernel.org"
<linux-pm@...r.kernel.org>, "loongarch@...ts.linux.dev"
<loongarch@...ts.linux.dev>, "linux-acpi@...r.kernel.org"
<linux-acpi@...r.kernel.org>, "linux-arch@...r.kernel.org"
<linux-arch@...r.kernel.org>, "linux-kernel@...r.kernel.org"
<linux-kernel@...r.kernel.org>, "linux-arm-kernel@...ts.infradead.org"
<linux-arm-kernel@...ts.infradead.org>, "kvmarm@...ts.linux.dev"
<kvmarm@...ts.linux.dev>, "x86@...nel.org" <x86@...nel.org>, Miguel Luis
<miguel.luis@...cle.com>, James Morse <james.morse@....com>, "Jean-Philippe
Brucker" <jean-philippe@...aro.org>, Catalin Marinas
<catalin.marinas@....com>, Will Deacon <will@...nel.org>, Linuxarm
<linuxarm@...wei.com>, "justin.he@....com" <justin.he@....com>,
"jianyong.wu@....com" <jianyong.wu@....com>
Subject: RE: [PATCH v5 03/18] ACPI: processor: Register deferred CPUs from
acpi_processor_get_info()
Hello,
> From: Thomas Gleixner <tglx@...utronix.de>
> Sent: Friday, April 12, 2024 9:55 PM
>
> On Fri, Apr 12 2024 at 21:16, Russell King (Oracle) wrote:
> > On Fri, Apr 12, 2024 at 08:30:40PM +0200, Rafael J. Wysocki wrote:
> >> Say acpi_map_cpu() / acpi_unmap_cpu() are turned into arch calls.
> >> What's the difference then? The locking, which should be fine if I'm
> >> not mistaken and need_hotplug_init that needs to be set if this code
> >> runs after the processor driver has loaded AFAICS.
> >
> > It is over this that I walked away from progressing this code, because
> > I don't think it's quite as simple as you make it out to be.
> >
> > Yes, acpi_map_cpu() and acpi_unmap_cpu() are already arch implemented
> > functions, so Arm64 can easily provide stubs for these that do nothing.
> > That never caused me any concern.
> >
> > What does cause me great concern though are the finer details. For
> > example, above you seem to drop the evaluation of _STA for the
> > "make_present" case - I've no idea whether that is something that
> > should be deleted or not (if it is something that can be deleted, then
> > why not delete it now?)
> >
> > As for the cpu locking, I couldn't find anything in
> > arch_register_cpu() that depends on the cpu_maps_update stuff nor
> > needs the cpus_write_lock being taken - so I've no idea why the
> > "make_present" case takes these locks.
>
> Anything which updates a CPU mask, e.g. cpu_present_mask, after early
> boot must hold the appropriate write locks. Otherwise it would be possible
> to online a CPU which just got marked present, but the registration has not
> completed yet.
>
> > Finally, the "pr->flags.need_hotplug_init = 1" thing... it's not
> > obvious that this is required - remember that with Arm64's "enabled"
> > toggling, the "processor" is a slice of the system and doesn't
> > actually go away - it's just "not enabled" for use.
> >
> > Again, as "processors" in Arm64 are slices of the system, they have to
> > be fully described in ACPI before the OS boots, and they will be
> > marked as being "present", which means they will be enumerated, and
> > the driver will be probed. Any processor that is not to be used will
> > not have its enabled bit set. It is my understanding that every
> > processor will result in the ACPI processor driver being bound to it
> > whether it's enabled or not.
> >
> > The difference between real hotplug and Arm64 hotplug is that real
> > hotplug makes stuff not-present (and thus unenumerable). Arm64 hotplug
> > makes stuff not-enabled which is still enumerable.
>
> Define "real hotplug" :)
>
> Real physical hotplug does not really exist. That's at least true for x86, where
> the physical hotplug support was chased for a while, but never ended up in
> production.
>
> Though virtualization happily jumped on it to hot add/remove CPUs to/from
> a guest.
>
> There are limitations to this and we learned it the hard way on X86. At the
> end we came up with the following restrictions:
>
> 1) All possible CPUs have to be advertised at boot time via firmware
> (ACPI/DT/whatever) independent of them being present at boot time
> or not.
>
> That guarantees proper sizing and ensures that associations
> between hardware entities and software representations and the
> resulting topology are stable for the lifetime of a system.
>
> It is really required to know the full topology of the system at
> boot time especially with hybrid CPUs where some of the cores
> have hyperthreading and the others do not.
>
>
> 2) Hot add can only mark an already registered (possible) CPU
> present. Adding non-registered CPUs after boot is not possible.
>
> The CPU must have been registered in #1 already to ensure that
> the system topology does not suddenly change in an incompatible
> way at run-time.
>
> The same restriction would apply to real physical hotplug. I don't think that's
> any different for ARM64 or any other architecture.
There is a difference:
1. The ARM architecture does not allow any processor to be NOT present. Because of this
restriction, all of its related per-CPU components must also be present and enumerated
at boot time (exposed by firmware and ACPI). This means all the enumerated processors
will be marked as 'present', but they might exist in a NOT enabled (_STA.Enabled=0)
state.
There is one clear difference, and please correct me if I'm wrong here: for x86, can the LAPIC
associated with an x86 core be brought online later, even after boot?
For the ARM architecture, processors and their corresponding per-CPU components, such as
redistributors, all need to be present and enumerated at boot time. Redistributors are part
of the ALWAYS-ON power domain.
2. Agreed regarding the topology. Are you suggesting that we must call arch_register_cpu()
at boot time for all the 'present' CPUs? Even if that's the case, we might still want to defer
registration of the CPU device (register_cpu() API) with the Linux device model. The latter is
what we are doing to hide/unhide the CPUs from the user while the _STA.Enabled bit is toggled
due to a CPU (un)plug action.
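To make the distinction in points 1 and 2 concrete, here is a rough userspace sketch of the state I have in mind. It is not kernel code: struct cpu_model, model_boot_arm64() and model_set_enabled() are hypothetical names invented for illustration, and the "registered" mask only stands in for whether the cpu device has been exposed through register_cpu(). It assumes every CPU described by firmware is possible and present from boot, and that a (un)plug action only toggles the enabled bit and the visibility of the cpu device.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace model of the CPU states discussed above.
 * None of these names exist in the kernel. */
enum { MODEL_NR_CPUS = 8 };

struct cpu_model {
	uint32_t possible;   /* CPUs advertised by firmware at boot */
	uint32_t present;    /* CPUs the OS treats as present */
	uint32_t enabled;    /* models the _STA.Enabled bit per CPU */
	uint32_t registered; /* cpu device visible to the user */
};

static bool model_test(uint32_t mask, unsigned int cpu)
{
	return cpu < MODEL_NR_CPUS && (mask & (UINT32_C(1) << cpu));
}

/* Arm64-style boot (assumption): every described CPU is possible and
 * present; only the enabled ones get a cpu device at this point. */
static void model_boot_arm64(struct cpu_model *m, uint32_t described,
			     uint32_t enabled_at_boot)
{
	m->possible = described;
	m->present = described;
	m->enabled = enabled_at_boot & described;
	m->registered = m->enabled;
}

/* A (un)plug action toggles only the enabled bit and the visibility of
 * the cpu device. The possible and present masks never change, so the
 * topology stays stable. Fails for a CPU not described at boot. */
static bool model_set_enabled(struct cpu_model *m, unsigned int cpu, bool on)
{
	uint32_t bit;

	if (!model_test(m->possible, cpu))
		return false;

	bit = UINT32_C(1) << cpu;
	if (on) {
		m->enabled |= bit;
		m->registered |= bit;   /* deferred device registration */
	} else {
		m->enabled &= ~bit;
		m->registered &= ~bit;  /* hide the device again */
	}
	return true;
}
```

In this model, a request to enable a CPU that firmware never described is rejected, which matches restriction #2 above, while an enable or disable of a described CPU leaves the present mask untouched.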
Best regards
Salil.
>
> Hope that helps.
>
> Thanks,
>
> tglx