lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Date:   Fri, 22 Sep 2017 14:12:13 +0200
From:   Nicolas Ferre <nicolas.ferre@...rochip.com>
To:     Romain Izard <romain.izard.pro@...il.com>
CC:     Boris Brezillon <boris.brezillon@...e-electrons.com>,
        Michael Turquette <mturquette@...libre.com>,
        Stephen Boyd <sboyd@...eaurora.org>,
        Ludovic Desroches <ludovic.desroches@...rochip.com>,
        Jonathan Cameron <jic23@...nel.org>,
        Wenyou Yang <wenyou.yang@...el.com>,
        Josh Wu <rainyfeeling@...look.com>,
        David Woodhouse <dwmw2@...radead.org>,
        Brian Norris <computersforpeace@...il.com>,
        Marek Vasut <marek.vasut@...il.com>,
        Cyrille Pitchen <cyrille.pitchen@...ev4u.fr>,
        Thierry Reding <thierry.reding@...il.com>,
        Richard Genoud <richard.genoud@...il.com>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        Alan Stern <stern@...land.harvard.edu>,
        <linux-clk@...r.kernel.org>, LKML <linux-kernel@...r.kernel.org>,
        <linux-iio@...r.kernel.org>,
        linux-mtd <linux-mtd@...ts.infradead.org>,
        <linux-pwm@...r.kernel.org>, <linux-serial@...r.kernel.org>,
        <linux-usb@...r.kernel.org>,
        linux-arm-kernel <linux-arm-kernel@...ts.infradead.org>
Subject: Re: [PATCH v1 01/10] clk: at91: pmc: Wait for clocks when resuming

On 14/09/2017 at 18:15, Romain Izard wrote:
> 2017-09-13 14:15 GMT+02:00 Nicolas Ferre <nicolas.ferre@...rochip.com>:
>> On 08/09/2017 at 17:35, Romain Izard wrote:
>>> Wait for the syncronization of all clocks when resuming, not only the
>>> UPLL clock. Do not use regmap_read_poll_timeout, as it will call BUG()
>>> when interrupts are masked, which is the case in here.
>>>
>>> Signed-off-by: Romain Izard <romain.izard.pro@...il.com>
>>> ---
>>>  drivers/clk/at91/pmc.c | 24 ++++++++++++++++--------
>>>  1 file changed, 16 insertions(+), 8 deletions(-)
>>>
>>> diff --git a/drivers/clk/at91/pmc.c b/drivers/clk/at91/pmc.c
>>> index 775af473fe11..5c2b26de303e 100644
>>> --- a/drivers/clk/at91/pmc.c
>>> +++ b/drivers/clk/at91/pmc.c
>>> @@ -107,10 +107,20 @@ static int pmc_suspend(void)
>>>       return 0;
>>>  }
>>>
>>> +static bool pmc_ready(unsigned int mask)
>>> +{
>>> +     unsigned int status;
>>> +
>>> +     regmap_read(pmcreg, AT91_PMC_SR, &status);
>>> +
>>> +     return ((status & mask) == mask) ? 1 : 0;
>>> +}
>>> +
>>>  static void pmc_resume(void)
>>>  {
>>> -     int i, ret = 0;
>>> +     int i;
>>>       u32 tmp;
>>> +     u32 mask = AT91_PMC_MCKRDY | AT91_PMC_LOCKA;
>>>
>>>       regmap_read(pmcreg, AT91_PMC_MCKR, &tmp);
>>>       if (pmc_cache.mckr != tmp)
>>> @@ -134,13 +144,11 @@ static void pmc_resume(void)
>>>                            AT91_PMC_PCR_CMD);
>>>       }
>>>
>>> -     if (pmc_cache.uckr & AT91_PMC_UPLLEN) {
>>> -             ret = regmap_read_poll_timeout(pmcreg, AT91_PMC_SR, tmp,
>>> -                                            !(tmp & AT91_PMC_LOCKU),
>>> -                                            10, 5000);
>>> -             if (ret)
>>> -                     pr_crit("USB PLL didn't lock when resuming\n");
>>> -     }
>>> +     if (pmc_cache.uckr & AT91_PMC_UPLLEN)
>>> +             mask |= AT91_PMC_LOCKU;
>>> +
>>> +     while (!pmc_ready(mask))
>>> +             cpu_relax();
>>
>> Okay, but I would prefer to keep the timeout property in it. So we may
>> need to re-implement a timeout way-out here.
>>
> 
> We need to have a reference clock to measure the timeout delay. If we use
> the kernel's timekeeping, it relies on the clocks that we are configuring in
> this code. Moreover, my experience with the mainline code is that when
> something goes wrong, nothing will work. No oops or panic will be reported,
> the device will just stop working.
> 
> In my case, I had obvious failures (it just stopped working unless I removed
> USB wakeup or activated the console during suspend) but also very rare
> failures, that occurred in the bootloader. Those issues were detected when
> testing repeated suspend cycles for a night: the memory controller would
> never enter the self-refresh mode during the resume sequence.
> 
> This led me to question the bootloader's code first, and I set up 4 boards
> with the backup prototype code on v4.9 to verify that it was stable on
> suspend. I've reached 1.5 million sleep cycles over 3 weeks without
> failure, so this hinted towards the difference between the prototype and the
> backup code provided for v4.12 (which contained the patch that got in
> v4.13). Once I integrated this patch, I've run the v4.12 code for 2 weeks
> without issue as well.
> 
> In the end, I don't want to touch this code if I do not have to, as checking
> that it does not regress is really cumbersome.

The timeout was more for PLL like the one use for USB. I didn't want to
block only for USB PLL failure (which is kind of hypothetical, I admit).
Anyway, I understand your arguments and taking into account the
extensive tests that you've run, I agree with your approach. I'm adding
my Ack to the v2.

Thanks for having take the time to describe your debugging session: it's
valuable information for everybody.

Best regards,
-- 
Nicolas Ferre

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ