Message-ID: <CAO6a-98XFxbCnOMp5ARwPssjYomyNKWjT=WTk=z2+ZKyOAQ0jQ@mail.gmail.com>
Date: Mon, 27 Jan 2025 22:47:28 +0530
From: Vivek yadav <linux.ninja23@...il.com>
To: Sudeep Holla <sudeep.holla@....com>
Cc: Dhruva Gole <d-gole@...com>, linux-newbie@...r.kernel.org, linux-pm@...r.kernel.org,
daniel.lezcano@...aro.org, lpieralisi@...nel.org, krzk@...nel.org,
christian.loehle@....com, quic_sibis@...cinc.com, cristian.marussi@....com,
linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org,
vigneshr@...com, khilman@...com, sebin.francis@...com, khilman@...libre.com
Subject: Re: Fwd: ARM64: CPUIdle driver is not select any Idle state other
then WFI
Hi @Dhruva Gole,
Q.1. Does your CA-53 properly enter the CPUIdle state and come back out
of it?
Q.2. Can you share the output of the command
``cat /sys/devices/system/cpu/cpu*/cpuidle/state*/usage``?
Q.3. How frequently do the CPUs enter the custom state1 (the one other
than the standard WFI state)?
> Any further luck on this?
I am still facing some issues. This issue is not closed yet.
>
> idle-states {
>     entry-method = "psci";
>     cpu_ret_l: cpu-retention-l {
>         compatible = "arm,idle-state";
>         arm,psci-suspend-param = <0x00010001>;
>         local-timer-stop;
>         entry-latency-us = <55>;
>         exit-latency-us = <140>;
>         min-residency-us = <780>;
>     };
> };
>
> I am using the ``Menu governor`` with the ``psci_idle driver`` in its original form.
> After booting Linux I found that the CPUIdle core never enters
> the ``cpu-retention`` state.
> To check the time spent by the CPU in each state, I am using the command below.
>
> ``cat /sys/devices/system/cpu/cpu*/cpuidle/state*/time``
I have since changed the latency values in the DT node. The modified
node is shown below.
idle-states {
    entry-method = "psci";
    cpu_ret_l: cpu-retention-l {
        compatible = "arm,idle-state";
        arm,psci-suspend-param = <0x00000000>;
        local-timer-stop;
        entry-latency-us = <300000>;   /* 300 ms */
        exit-latency-us = <300000>;    /* 300 ms */
        min-residency-us = <1000000>;  /* 1 sec */
    };
};
With this node I can see that the CA-55 entered a sleep state (state1),
using the command
``cat /sys/devices/system/cpu/cpu*/cpuidle/state*/time``.
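To make the numbers easier to compare between runs, here is a minimal
sketch (not something I have run on my board) that collects the same
sysfs counters per CPU and per state. It assumes the standard cpuidle
sysfs files ``name``, ``usage`` and ``time`` (time in microseconds). The
function name ``read_cpuidle_stats`` and the ``root`` parameter are my own
choices for illustration.

```python
from pathlib import Path


def read_cpuidle_stats(root="/sys/devices/system/cpu"):
    """Collect cpuidle counters as {cpu: {state: {name, usage, time_us}}}.

    `root` is a parameter only so the function can be pointed at a copy
    of the sysfs tree; on a target it defaults to the real location.
    """
    stats = {}
    for state_dir in sorted(Path(root).glob("cpu[0-9]*/cpuidle/state[0-9]*")):
        cpu = state_dir.parent.parent.name  # e.g. "cpu0"
        try:
            entry = {
                "name": (state_dir / "name").read_text().strip(),
                "usage": int((state_dir / "usage").read_text()),
                "time_us": int((state_dir / "time").read_text()),
            }
        except (OSError, ValueError):
            # Skip states whose files are missing or unreadable.
            continue
        stats.setdefault(cpu, {})[state_dir.name] = entry
    return stats
```

Calling ``read_cpuidle_stats()`` twice, a few seconds apart, and comparing
the ``usage`` values should show how often each CPU entered state1 in
that interval.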
As you mentioned earlier, in a multicore system (2 or more cores) at
least one core keeps running and does not enter the sleep state. That
is what should happen in theory, and it is what other developers see.
In my case, after some time, both CPUs (CPU0 and CPU1) enter the sleep
state (state1), and the system console hangs.
My expectation is that if I type anything on the keyboard, the UART
interrupt should bring the CPUs out of the sleep state so that commands
get executed, or else some periodic timer should wake the CPU. Neither
is happening as of now.
You said we can safely remove ``local-timer-stop``. Does that mean the
local timers keep running for the CPUs and keep triggering interrupts?
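One rough way to check this from user space would be to compare the
per-CPU timer interrupt counts in ``/proc/interrupts`` over an interval.
Below is a minimal sketch under some assumptions: the timer interrupt is
listed with ``arch_timer`` as the last field of its line (this depends
on the platform), and the helper names ``timer_irq_counts`` and
``irq_delta`` are hypothetical. A delta of zero does not prove the timer
is stopped, since an idle tickless CPU may simply have nothing
programmed.

```python
def timer_irq_counts(text, irq_name="arch_timer"):
    """Parse /proc/interrupts text and return {"CPU0": count, ...}.

    Only lines whose last field equals `irq_name` are counted; the
    default name is an assumption about how the timer is listed.
    """
    lines = text.splitlines()
    if not lines:
        return {}
    cpus = lines[0].split()  # header row: CPU0 CPU1 ...
    counts = {}
    for line in lines[1:]:
        fields = line.split()
        if not fields or fields[-1] != irq_name:
            continue
        # fields[0] is the IRQ label ("11:"); per-CPU counts follow it.
        for cpu, value in zip(cpus, fields[1:1 + len(cpus)]):
            counts[cpu] = counts.get(cpu, 0) + int(value)
    return counts


def irq_delta(before, after):
    """Per-CPU difference between two timer_irq_counts() snapshots."""
    return {cpu: after[cpu] - before.get(cpu, 0) for cpu in after}
```

On the target, the text would come from reading ``/proc/interrupts``
twice with a short sleep in between, passing each snapshot to
``timer_irq_counts()`` and the two results to ``irq_delta()``.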
Any discussion on this topic will definitely help me.
Regards,
Vivek Yadav
On Thu, Dec 12, 2024 at 6:16 PM Sudeep Holla <sudeep.holla@....com> wrote:
>
> On Wed, Dec 11, 2024 at 08:04:28PM +0530, Dhruva Gole wrote:
> > On Dec 11, 2024 at 12:18:25 +0000, Sudeep Holla wrote:
> > > On Wed, Dec 11, 2024 at 11:20:52AM +0530, Dhruva Gole wrote:
> > [...]
> > > > >
> > > > >
> > > > > Hi @all,
> > > > >
> > > > > I am working on one custom SoC. Where I add one CPUIdle state for
> > > > > ``arm,cortex-a55`` processor.
> > > >
> > > > Any further luck on this?
> > > >
> > > > I have also been working on something similar[1] but on an A53 core on
> > > > TI-K3 AM62x processor.
> > >
> > > Does upstream DTS have support for this platform to understand it better ?
> > > Even reference to any complete DT file for the platform will help.
> >
> > Yes, you can ref to the AM625 (CPU layout) DT here:
> > https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/arch/arm64/boot/dts/ti/k3-am625.dtsi
> >
> > The board/starter kit DT is:
> > https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/arch/arm64/boot/dts/ti/k3-am625-sk.dts
> >
> > The patches for idle state are not upstream, and only exist in this
> > patch of mine here:
> > https://github.com/DhruvaG2000/v-linux/commit/0fd088d624276a2e72b8dc6660d261ab6d194f4b
> >
>
> "arm,psci-suspend-param" indicates that this idle state doesn't lose the
> cpu context which means timer doesn't stop. So adding "local-timer-stop"
> sounds completely wrong to me.
>
> > [...]
> > > > See this chunk in the kernel cpuidle driver:
> > > > if (broadcast && tick_broadcast_enter()) {
> > > >
> > > > When I dug deeper into tick_broadcast_enter it always returns something
> > > > non zero and hence in my case it was entering the if block and tried to
> > > > find a deepest state. Then the deepest state would always return WFI and
> > > > not the idle-state I had added.
> > > >
>
> It depends. If this is the last CPU and since you have marked the state with
> "local-timer-stop" and the system doesn't have any other timers to use as
> source of broadcast, it prevents one of the CPU entering that state. So you
> could be matching all the above conditions on your platform and hence you
> are observing the above.
>
> > > > What we found out was on our kernel we end up using
> > > >
> > > > kernel/time/tick-broadcast-hrtimer.c
> > > >
> > > > This always seems to be keeping at least 1 CPU busy and prevents idle.
> > > > If we remove the local-timer-stop it was helping us, but we still need
> > > > to dig into the full impact of what that entails and I am still
> > > > interested in finding out how so many other users of similar idle-state
> > > > implementation are able to do so without trouble.
> > > >
> > >
>
> As mentioned, adding "local-timer-stop" for a retention state seems
> plain wrong in my opinion as it contradicts the fact that context is
> retained.
>
> > > Interesting. So if the platform is functional removing local-timer-stop,
> > > I am bit confused. Either there is something else that is getting it out
> >
> > Yes it was interesting to us too, as to how the RCU didn't kick in and
> > system continued to function as though nothing was wrong.
> >
>
> It worked as if it was a state with context lost. So there might be some
> impact on the latency though, as the kernel assumed context lost and
> re-entered/resumed through resume entry point rather than where it called
> cpu_suspend() similar to wfi(). I mean only on the CPUs it was able to
> enter this state as one of the CPU will never enter this if there are no
> system timers to act as broadcast timer.
>
> Does your system not have the Arch timer memory mapped interface enabled and
> its interrupt wired to GIC (other than PPIs)? Look at Juno R2 as an example.
>
> > > from the idle state so, it should be fine and it could be just some
> >
> > It's probably UART keypresses or some userspace processes that get
> > scheduled that bring the CPUs back out of TF-A's cpu_standby.
>
> I doubt the CPU resume from suspend is based on some userspace event.
>
> > Is it possible that EL1 interrupts can bring EL3 out of WFI? If yes then
> > it explains the behaviour. The arch timer could also be continuing to
> > tick and bringing the CPUs out of ATF WFI.
> >
>
> Yes but that doesn't explain the behaviour. It could be just the timer
> event from the broadcast timer.
>
> > > misconfiguration.
> > >
> > > > Arm64 recommends to use arch_timer instead of external timers. Once we
> > > > enter el3, timer interrupts to el1 is blocked and hence it's equivalent
> > > > to local-timer-stop, so it does make sense to keep this property, but
> > > > then how are others able to enter idle-states for all plugged CPUs at
> > > > the same time?
> > > >
> > >
> > > Some systems have system timer that can take over as broadcast timer when
> > > CPUs enter deeper idle states where the local timers are stopped.
> >
> > In CPUIdle we're not really clock gating anything so the timer does keep
> > ticking. So in this particular case it might make sense to remove the
> > local-timer-stop property from the idle-state.
> >
>
> Correct, in your case it is retention state and hence local CPU timers
> keep ticking and you can safely drop that property. However if you add
> deeper idle states like CPU OFF with the power rail cut off, then you need
> some system timer to act as backup/broadcast timer so that all the CPUs
> can enter the state concurrently and wake up successfully.
>
> > However we're looking into taking this further and putting interconnect
> > and few other PLLs in bypass which could cause arch timer for eg. to
> > tick slower.
>
> I assume it will be present as another timer with the rate set appropriately.
>
> > In this case would it still make sense to omit the property?
>
> No, you should mark it as stopped even if it is running at slower rate
> as I am not sure if the local CPU timer support can handle rate change.
>
> > We may even have some usecases planned where we may turn OFF
> > the CPU once it is in TF-A cpu_standby/ WFI. What would be the right
> > approach in such scenarios?
> >
>
> As mentioned above, this will be separate state and all CPUs can use this
> if there is another system broadcast timer.
>
> > Could you provide any examples where the local-timer-stop property is
> > being used and an alternative timer can be configured once we enter the
> > idle-state where CPU CTX may be lost or clocks may be bypassed?
> > It would be great if you could share some example implementation if you're aware.
>
> As I mentioned, Juno R2 is an example. It was broken on R0 with some SoC
> errata(can't recall all the details as I looked at it almost a decade ago)
>
> --
> Regards,
> Sudeep