[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAJZ5v0itoeayy6tgwq2SLhOMB8xCarSNTeE71RNqGEPsuuJtng@mail.gmail.com>
Date: Wed, 17 Apr 2024 18:19:37 +0200
From: "Rafael J. Wysocki" <rafael@...nel.org>
To: Salil Mehta <salil.mehta@...wei.com>
Cc: "Rafael J. Wysocki" <rafael@...nel.org>, Thomas Gleixner <tglx@...utronix.de>,
"Russell King (Oracle)" <linux@...linux.org.uk>, Jonathan Cameron <jonathan.cameron@...wei.com>,
"linux-pm@...r.kernel.org" <linux-pm@...r.kernel.org>,
"loongarch@...ts.linux.dev" <loongarch@...ts.linux.dev>,
"linux-acpi@...r.kernel.org" <linux-acpi@...r.kernel.org>,
"linux-arch@...r.kernel.org" <linux-arch@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"linux-arm-kernel@...ts.infradead.org" <linux-arm-kernel@...ts.infradead.org>,
"kvmarm@...ts.linux.dev" <kvmarm@...ts.linux.dev>, "x86@...nel.org" <x86@...nel.org>,
Miguel Luis <miguel.luis@...cle.com>, James Morse <james.morse@....com>,
Jean-Philippe Brucker <jean-philippe@...aro.org>, Catalin Marinas <catalin.marinas@....com>,
Will Deacon <will@...nel.org>, Linuxarm <linuxarm@...wei.com>,
"justin.he@....com" <justin.he@....com>, "jianyong.wu@....com" <jianyong.wu@....com>
Subject: Re: [PATCH v5 03/18] ACPI: processor: Register deferred CPUs from acpi_processor_get_info()
On Wed, Apr 17, 2024 at 5:01 PM Salil Mehta <salil.mehta@...wei.com> wrote:
>
> HI Rafael,
>
> > From: Rafael J. Wysocki <rafael@...nel.org>
> > Sent: Monday, April 15, 2024 5:39 PM
> >
> > On Mon, Apr 15, 2024 at 5:31 PM Salil Mehta <salil.mehta@...wei.com> wrote:
> > >
> > > > From: Rafael J. Wysocki <rafael@...nel.org>
> > > > Sent: Monday, April 15, 2024 1:51 PM
> > > >
> > > > On Mon, Apr 15, 2024 at 1:51 PM Salil Mehta
> > > > <salil.mehta@...wei.com>
> > > > wrote:
> > > > >
> >
> > [cut]
> >
> > > > > > Though virtualization happily jumped on it to hot add/remove
> > > > CPUs > > to/from a guest.
> > > > > >
> > > > > > There are limitations to this and we learned it the hard way
> > > > on > > X86. At the end we came up with the following restrictions:
> > > > > >
> > > > > > 1) All possible CPUs have to be advertised at boot time via firmware
> > > > > > (ACPI/DT/whatever) independent of them being present at boot time
> > > > > > or not.
> > > > > >
> > > > > > That guarantees proper sizing and ensures that associations
> > > > > > between hardware entities and software representations and the
> > > > > > resulting topology are stable for the lifetime of a system.
> > > > > >
> > > > > > It is really required to know the full topology of the system at
> > > > > > boot time especially with hybrid CPUs where some of the cores
> > > > > > have hyperthreading and the others do not.
> > > > > >
> > > > > >
> > > > > > 2) Hot add can only mark an already registered (possible) CPU
> > > > > > present. Adding non-registered CPUs after boot is not possible.
> > > > > >
> > > > > > The CPU must have been registered in #1 already to ensure that
> > > > > > the system topology does not suddenly change in an incompatible
> > > > > > way at run-time.
> > > > > >
> > > > > > The same restriction would apply to real physical hotplug. I
> > > > don't > > think that's any different for ARM64 or any other architecture.
> > > > >
> > > > >
> > > > > There is a difference:
> > > > >
> > > > > 1. ARM arch does not allows for any processor to be NOT present. Hence, because of
> > > > > this restriction any of its related per-cpu components must be
> > > > present > and enumerated at the boot time as well (exposed by
> > > > firmware and > ACPI). This means all the enumerated processors will
> > > > be marked as > 'present' but they might exist in NOT enabled (_STA.enabled=0) state.
> > > > >
> > > > > There was one clear difference and please correct me if I'm wrong
> > > > > here, for x86, the LAPIC associated with the x86 core can be brought online later even after boot?
> > > > >
> > > > > But for ARM Arch, processors and its corresponding per-cpu
> > > > components > like redistributors all need to be present and
> > > > enumerated during the > boot time. Redistributors are part of ALWAYS-ON power domain.
> > > >
> > > > OK
> > > >
> > > > So what exactly is the problem with this and what does
> > > > acpi_processor_add() have to do with it?
> > >
> > >
> > > For ARM Arch, during boot time, it should add processor as if no
> > > hotplug exists. But later, in context to the (fake) hotplug trigger
> > > from the virtualizer as a result of the CPU (un)plug action it should just
> > end up in registering the already present CPU with the Linux Driver Model.
> >
> > So let me repeat this last time: acpi_processor_add() cannot do that,
> > because (as defined today) it rejects CPUs with the "enabled" bit clear in _STA.
>
>
> I understand that now because you have placed a check recently. sorry for stretching
> it a bit but I wanted to clearly understand the reason for this behavior. Is it because,
>
> 1. It does not makes sense to add a disabled but present/functional processor or
> perhaps there are repercussions to support such a behavior?
Yes because it is unusable.
> Or
>
> 2. None of the existing processors need such a behavior?
>
>
>
> > > > Do you want the per-CPU structures etc. to be created from the
> > > > acpi_processor_add() path?
> > >
> > >
> > > I referred to the components related to ARM CPU Arch like redistributors etc.
> > > which will get initialized in context to Arch specific _init code not
> > > here. This i.e. acpi_processor_add() is arch agnostic code common to all architectures.
> > >
> > > [ A digression: You do have _weak functions which can be overridden to
> > > arch specific handling like arch_(un)map_cpu() etc. but we can't use
> > > those to defer initialize the CPU related components - ARM Arch
> > > constraint!]
> >
> > Not right now, but they can be added I suppose.
> >
> > >
> > > >
> > > > This plain won't work because acpi_processor_add(), as defined in
> > > > the mainline kernel today (and the Jonathan's patches don't change
> > > > that AFAICS), returns an error for processor devices with the
> > > > "enabled" bit clear in _STA (it returns an error if the "present"
> > > > bit is clear too, but that's obvious), so it only gets to calling
> > > > arch_register_cpu() if
> > > > *both* "present" and "enabled" _STA bits are set for the given
> > > > processor device.
> > >
> > >
> > > If you are referring to the _STA check in the XX_hot_add_init() then
> > > in the current kernel code it only checks for the
> > > ACPI_STA_DEVICE_PRESENT flag and not the ACPI_STA_DEVICE_ENABLED flag(?).
> >
> > No, I am not. I'm referring to this code in 6.9-rc4:
> >
> > static int acpi_processor_add(struct acpi_device *device,
> > const struct acpi_device_id *id) {
> > struct acpi_processor *pr;
> > struct device *dev;
> > int result = 0;
> >
> > if (!acpi_device_is_enabled(device))
> > return -ENODEV;
>
>
> Ahh, sorry, I missed this check as this has been added recently. Yes, now your
> logic of having common legs makes more sense.
>
>
> >
> > ...
> > }
> >
> > where acpi_device_is_enabled() is defined as follows:
> >
> > bool acpi_device_is_enabled(const struct acpi_device *adev) {
> > return adev->status.present && adev->status.enabled; }
>
>
> Got it.
>
>
> [digression note:]
> BTW, I'm wondering why we are checking adev->status.present
> as having adev->status.enabled as set and adev->status.present
> as unset would mean firmware has a BUG. If we really want to
> check this then we should rather flag a warning on detection
> of this condition?
Adding a warning would be fine with me.
> Either this:
> bool acpi_device_is_enabled(const struct acpi_device *adev) {
>
> if (!acpi_device_is_present(adev)) {
> if (adev->status.enabled)
> pr_debug("Device [%s] status inconsistent: Enabled but not Present\n",
> device->pnp.bus_id);
> return false;
> }
> return true;
> }
>
> Ideally this inconsistency should have been checked in acpi_bus_get_status()
> and above function should have been just,
Yes, it can be added there. It can even clear 'enabled' if 'present' is clear.
> file: drivers/acpi/scan.c
> bool acpi_device_is_enabled(const struct acpi_device *adev) {
> return !!adev->status.enabled; }
Sure.
> file: drivers/acpi/bus.c
> int acpi_bus_get_status(struct acpi_device *device)
> {
> [...]
>
> status = acpi_bus_get_status_handle(device->handle, &sta);
> if (ACPI_FAILURE(status))
> return -ENODEV;
>
> acpi_set_device_status(device, sta);
>
> if (device->status.functional && !device->status.present) {
> pr_debug("Device [%s] status [%08x]: functional but not present\n",
> device->pnp.bus_id, (u32)sta);
> }
>
> + if (device->status.enabled && !device->status.present) {
> + pr_debug("BUG: Device [%s] status [%08x]: enabled but not present\n",
> + device->pnp.bus_id, (u32)sta);
> + /* any specific handling here? */
> + }
>
> pr_debug("Device [%s] status [%08x]\n", device->pnp.bus_id, (u32)sta);
> return 0;
> }
>
> >
> > > The code being reviewed has changed
> > > exactly that behavior for 2 legs i.e. make-present and make-enabled legs.
> >
> > I'm not sure what you mean here, but the code above means that
> > acpi_processor_add) does not distinguish between CPU with the "present"
> > bit clear (in which case the "enabled" bit must also be clear as per the spec)
> > and CPUs with the "present" bit set and the "enabled" bit clear. These two
> > cases are handled in the same way.
> >
> > > I'm open to further address your point clearly.
> >
> > I hope that the above is clear enough.
>
>
> Yes, clear now. I missed the new check.
>
> >
> > > >
> > > > That, BTW, is why I keep saying that from the ACPI CPU enumeration
> > > > code perspective, there is no difference between "present" and
> > "enabled".
> > >
> > >
> > > Agreed but there is still a subtle difference. Enumeration happens
> > > once and for all the processors during the boot time. And as confirmed
> > > by yourself and Thomas as well that even in x86 arch all the
> > > processors will be discovered and their topology fixed during the boot
> > > time which is effectively the same behavior as in the ARM Arch. But
> > > ARM assumes those 'present' bits in the present masks to be set during
> > > the boot time which is not like x86(?). Hence, 'present cpu' Bits
> > > will always be equal to 'possible cpu' Bits. This is a constraint put by the
> > ARM maintainers and looks unshakable.
> >
> > Yes, there are differences between architectures, but the ACPI code is (or
> > at least should be) architecture-agnostic (as you said somewhere above).
> > So why does this matter for the ACPI code?
>
>
> It should not. There were few bits like overriding of arch_register_cpu() which
> was not allowed by ARM folks in 2020 when I floated the first prototype.
>
>
> > > > > 2. Agreed regarding the topology. Are you suggesting that we
> > > > must > call arch_register_cpu() during boot time for all the 'present' CPUs?
> > > > > Even if that's the case, we might still want to defer
> > > > registration of > the cpu device (register_cpu() API) with the
> > > > Linux device model. Later > is what we are doing to hide/unhide the
> > > > CPUs from the user while STA.Enabled Bit is toggled due to CPU (un)plug action.
> > > >
> > > > There are two ways to approach this IMV, and both seem to be valid
> > > > in principle.
> > > >
> > > > One would be to treat CPUs with the "enabled" bit clear as not
> > > > present and create all of the requisite data structures for them
> > > > when they become available (in analogy with the "real hot-add" case).
> > >
> > >
> > > Right. This one is out-of-scope for ARM Arch as we cannot defer any
> > > Arch specific sizing and initializations after boot i.e. when
> > > processor_add() gets called again later as a trigger of CPU plug action at the Virtualizer.
> > >
> > >
> > > >
> > > > The alternative one is to create all of the requisite data
> > > > structures for the CPUs that you find during boot, but register CPU
> > > > devices for those having the "enabled" _STA bit set.
> > >
> > >
> > > Correct. and we defer the registration for CPUs with online-capable
> > > Bit set in the ACPI MADT/GICC data structure. These CPUs basically
> > > form set of hot-pluggable CPUs on ARM.
> > >
> > >
> > > >
> > > > It looks like you have chosen the second approach, which is fine
> > > > with me (although personally, I would rather choose the first one),
> > > > but then your arch code needs to arrange for the requisite CPU data
> > > > structures etc. to be set up before acpi_processor_add() gets
> > > > called because, as per the above, the latter just rejects CPUs with the "enabled" _STA bit clear.
> > >
> > > Yes, correct. First one is not possible - at least for now and to have
> > > that it will require lot of rework especially at GIC. But there are
> > > many other arch components (like timers, PMUs, etc.) whose behavior
> > > needs to be specified somewhere in the architecture as well. All these are closely coupled with the ARM CPU architecture.
> > > (it's beyond this discussion and lets leave that to ARM folks).
> > >
> > > This patch-set has a change to deal with ACPI _STA.Enabled Bit accordingly.
> >
> > Well, I'm having a hard time with this.
> >
> > As far as CPU enumeration goes, if the "enabled" bit is clear in _STA, it does
> > not happen at all. Both on ARM and on x86.
>
> sure, I can see that now.
>
> >
> > Now tell me why there need to be two separate code paths calling
> > arch_register_cpu() in acpi_processor_add()?
>
>
> As mentioned above, in the first prototype I floated in the year 2020 any attempts
> to override the __weak call of arch_register_cpu() for ARM64 was discouraged.
> Though, the reasons might have changed now as some code has been moved.
>
> Once we are allowed to override the calls then there are many more possibilities
> which open up to simplify the code further.
Well, IMV this should just be an arch function with no __weak
defaults, because the default would probably be unusable in practice
anyway.
Powered by blists - more mailing lists