Date:   Fri, 21 Feb 2020 21:02:30 +0100
From:   Daniel Lezcano <daniel.lezcano@...aro.org>
To:     Dmitry Osipenko <digetx@...il.com>
Cc:     Thierry Reding <thierry.reding@...il.com>,
        Jonathan Hunter <jonathanh@...dia.com>,
        Peter De Schrijver <pdeschrijver@...dia.com>,
        "Rafael J. Wysocki" <rjw@...ysocki.net>,
        Michał Mirosław <mirq-linux@...e.qmqm.pl>,
        Jasper Korten <jja2000@...il.com>,
        David Heidelberg <david@...t.cz>,
        Peter Geis <pgwipeout@...il.com>, linux-pm@...r.kernel.org,
        linux-tegra@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v9 09/17] arm: tegra20: cpuidle: Handle case where
 secondary CPU hangs on entering LP2

On 21/02/2020 19:19, Dmitry Osipenko wrote:
> 21.02.2020 20:36, Daniel Lezcano wrote:
>> On Fri, Feb 21, 2020 at 07:56:51PM +0300, Dmitry Osipenko wrote:
>>> Hello Daniel,
>>>
>>> 21.02.2020 18:43, Daniel Lezcano wrote:
>>>> On Thu, Feb 13, 2020 at 02:51:26AM +0300, Dmitry Osipenko wrote:
>>>>> It is possible that something may go wrong with the secondary CPU. In that
>>>>> case it is much nicer to get a dump of the flow-controller state before
>>>>> hanging the machine.
>>>>>
>>>>> Acked-by: Peter De Schrijver <pdeschrijver@...dia.com>
>>>>> Tested-by: Peter Geis <pgwipeout@...il.com>
>>>>> Tested-by: Jasper Korten <jja2000@...il.com>
>>>>> Tested-by: David Heidelberg <david@...t.cz>
>>>>> Signed-off-by: Dmitry Osipenko <digetx@...il.com>
>>>>> ---
>>
>> [ ... ]
>>
>>>>> +static int tegra20_wait_for_secondary_cpu_parking(void)
>>>>> +{
>>>>> +	unsigned int retries = 3;
>>>>> +
>>>>> +	while (retries--) {
>>>>> +		ktime_t timeout = ktime_add_ms(ktime_get(), 500);
>>>>
>>>> Oops I missed this one. Do not use ktime_get() in this code path, use jiffies.
>>>
>>> Could you please explain what benefits jiffies have over the ktime_get()?
>>
>> ktime_get() is very slow, jiffies is updated every tick.
> 
> But how are jiffies supposed to be updated if interrupts are disabled?

Yeah, the other CPUs must not be idle in this case.

> Aren't jiffies actually slower than ktime_get() because jiffies are
> updated every 10/1ms (depending on CONFIG_HZ)?

They are not slower; they have a lower resolution, which is 10ms or 4ms.

Given the 500ms timeout, it is fine.
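
To put numbers on the resolution argument, here is a minimal userspace sketch of the tick arithmetic. The helpers ms_to_ticks() and tick_period_us() are hypothetical names made up for this illustration; they are not kernel APIs and only approximate what a millisecond-to-jiffies conversion does, assuming the tick rate is given by CONFIG_HZ.

```c
#include <assert.h>

/*
 * Hypothetical helper, not a kernel API: number of timer ticks needed to
 * cover at least `ms` milliseconds when the tick rate is `hz` (CONFIG_HZ).
 * Rounds up so the timeout is never shorter than requested.
 */
static unsigned long ms_to_ticks(unsigned long ms, unsigned long hz)
{
	assert(hz > 0);
	return (ms * hz + 999) / 1000;
}

/* Tick period in microseconds, i.e. the resolution of a tick counter. */
static unsigned long tick_period_us(unsigned long hz)
{
	assert(hz > 0);
	return 1000000UL / hz;
}
```

With a tick rate of 100 the 500ms timeout spans 50 ticks of 10ms each, and with a tick rate of 250 it spans 125 ticks of 4ms each, so the coarse resolution is small relative to the timeout.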

> We're kinda interested here in getting into the deep-idling state as quickly
> as possible. I checked how much time the busy-loop below takes and
> it takes ~40-150us on average, which is good enough.

ktime_get() gets a seq lock and it is very slow.

>>>>> +
>>>>> +		/*
>>>>> +		 * The primary CPU0 core shall wait for the secondaries
>>>>> +		 * shutdown in order to power-off CPU's cluster safely.
>>>>> +		 * The timeout value depends on the current CPU frequency,
>>>>> +		 * it takes about 40-150us on average and over 1000us in
>>>>> +		 * a worst case scenario.
>>>>> +		 */
>>>>> +		do {
>>>>> +			if (tegra_cpu_rail_off_ready())
>>>>> +				return 0;
>>>>> +
>>>>> +		} while (ktime_before(ktime_get(), timeout));
>>>>
>>>> So this loop will aggressively call tegra_cpu_rail_off_ready() and retry 3
>>>> times. The tegra_cpu_rail_off_ready() function can be called thousands of times
>>>> here but the function will hang for 1.5s :/
>>>>
>>>> I suggest something like:
>>>>
>>>> 	while (retries-- && !tegra_cpu_rail_off_ready())
>>>> 		udelay(100);
>>>>
>>>> So <retries> calls to tegra_cpu_rail_off_ready() and 100us x <retries> maximum
>>>> impact.
>>> But udelay() also results in the CPU spinning in a busy-loop, and thus,
>>> what's the difference?
>>
>> Busy looping instead of register reads, with all the hardware activity that they involve.
> 
> Please notice that this code runs only on an older Cortex-A9/A15, which
> doesn't support WFE for delaying, and thus, the CPU always busy-loops
> inside udelay().
> 
> What if I add cpu_relax() to the loop? Do you think it could
> have any positive effect?

I think udelay() has a call to cpu_relax().
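
For reference, the bounded polling loop suggested earlier in the thread can be modelled in userspace roughly as below. This is only a sketch under stated assumptions: poll_with_retries(), struct fake_cpu, fake_ready() and fake_delay() are hypothetical names invented for the illustration. The ready callback stands in for tegra_cpu_rail_off_ready() and the delay callback for udelay(100); neither is the real kernel function. Unlike the two-line loop quoted above, this variant returns a status so the caller can tell success from exhausted retries.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical userspace model of the suggested loop.
 * Returns 0 once ready() reports true, -1 if all retries are used up.
 * ready() is called at most `retries` times, with one delay() after
 * every unsuccessful poll.
 */
static int poll_with_retries(unsigned int retries,
			     bool (*ready)(void *ctx),
			     void (*delay)(void *ctx), void *ctx)
{
	assert(ready != NULL && delay != NULL);

	while (retries--) {
		if (ready(ctx))
			return 0;
		delay(ctx);
	}
	return -1;
}

/* Fake secondary CPU used to exercise the loop without hardware. */
struct fake_cpu {
	unsigned int polls;       /* how many times ready() was called */
	unsigned int delays;      /* how many times delay() was called */
	unsigned int ready_after; /* ready() turns true on this poll number */
};

static bool fake_ready(void *ctx)
{
	struct fake_cpu *cpu = ctx;

	return ++cpu->polls >= cpu->ready_after;
}

static void fake_delay(void *ctx)
{
	struct fake_cpu *cpu = ctx;

	cpu->delays++; /* a real caller would busy-wait here */
}
```

With 3 retries and a 100us delay, the worst case is 3 polls and about 300us of waiting, compared with up to 3 x 500ms = 1.5s for the ktime_get()-based loop in the patch.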




-- 
 <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog
