Message-ID: <CAOtvUMdkC+v0R39OADF3R7Qpk+NPWW+JH0S6SUNXoHM3v+19Vg@mail.gmail.com>
Date:   Thu, 7 Feb 2019 14:58:16 +0200
From:   Gilad Ben-Yossef <gilad@...yossef.com>
To:     Vincent Guittot <vincent.guittot@...aro.org>
Cc:     "Rafael J. Wysocki" <rjw@...ysocki.net>,
        Pavel Machek <pavel@....cz>, Len Brown <len.brown@...el.com>,
        "open list:THERMAL" <linux-pm@...r.kernel.org>,
        Linux Crypto Mailing List <linux-crypto@...r.kernel.org>,
        Linux kernel mailing list <linux-kernel@...r.kernel.org>,
        Ofir Drang <Ofir.Drang@....com>
Subject: Re: Regression due to "PM-runtime: Switch autosuspend over to using hrtimers"

On Thu, Feb 7, 2019 at 10:25 AM Gilad Ben-Yossef <gilad@...yossef.com> wrote:

> >
> > On Wed, 6 Feb 2019 at 17:40, Gilad Ben-Yossef <gilad@...yossef.com> wrote:
> > >
> > > Hi all,
> > >
> > > A regression was spotted in the ccree driver running on 32-bit Arm:
> > > a kernel panic during the crypto API self-test phase (panic
> > > messages included with this message), occurring in the PM resume
> > > callback, that did not happen before.
> > >
> > > I've bisected the change that caused this to commit 8234f6734c5d
> > > ("PM-runtime: Switch autosuspend over to using hrtimers").
> > >
> > > I'm still trying to figure out what is going on inside the callback,
> > > but as it was not happening before, I thought I'd give you a shout out
> > > to make you aware of this.
> >
> > Are you using autosuspend mode for this device?
> Yes.
>
>
> > Also, this happens in a platform-specific function, cc_init_hash_sram().
> > I can't see anything related to PM runtime and autosuspend in it.
>
> True. However, the function is called from the driver PM resume
> callback and before that commit it did not fail.
> My guess is that it is related to the timing of when the callback
> is called, probably some race condition the change exposed.
>

OK, I've found it. It was indeed a race condition in the ccree driver.
The resume callback relied on an initialization sequence that runs only
after autosuspend has been enabled for the device.
It was never a problem before because, with the lower-resolution timers,
we always got around to that initialization before autosuspend kicked in
and we had to resume. With your change we started losing that race... :-)
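
To illustrate the ordering problem described above, here is a minimal
user-space sketch in C. It is not the ccree code: every name in it
(fake_dev, sram_ready, fake_probe_buggy, fake_probe_fixed and so on) is
hypothetical, and it assumes that the resume is triggered by the next
access to the device during initialization. It only models the idea that
enabling autosuspend before the state the resume callback needs is ready
opens a window in which an early timer expiry makes resume fail.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a device; not the real ccree structures. */
struct fake_dev {
	bool autosuspend_enabled; /* runtime PM is allowed to suspend us */
	bool suspended;           /* device is currently suspended */
	bool sram_ready;          /* state the resume callback relies on */
	bool resume_failed;       /* resume ran before that state existed */
};

/* Models the autosuspend timer expiring. */
static void fake_timer_expire(struct fake_dev *d)
{
	if (d->autosuspend_enabled && !d->suspended)
		d->suspended = true;
}

/* Models the runtime-PM resume callback. */
static void fake_resume(struct fake_dev *d)
{
	if (!d->suspended)
		return;
	if (!d->sram_ready)
		d->resume_failed = true; /* a real driver might crash here */
	d->suspended = false;
}

/* Models the late init step; it touches the device, so it resumes it first. */
static void fake_late_init(struct fake_dev *d)
{
	fake_resume(d);
	d->sram_ready = true;
}

/* Buggy order: autosuspend is enabled before the late init step.
 * timer_wins says whether the timer fires inside that window. */
bool fake_probe_buggy(bool timer_wins)
{
	struct fake_dev d = {0};

	d.autosuspend_enabled = true;
	if (timer_wins)
		fake_timer_expire(&d);
	fake_late_init(&d);
	return !d.resume_failed;
}

/* Fixed order: finish init first, only then allow autosuspend. */
bool fake_probe_fixed(bool timer_wins)
{
	struct fake_dev d = {0};

	fake_late_init(&d);
	d.autosuspend_enabled = true;
	if (timer_wins)
		fake_timer_expire(&d);
	fake_resume(&d); /* the next access resumes the device */
	return !d.resume_failed;
}
```

In this model the buggy order only fails when the timer wins the race,
which matches why a coarser timer could hide the problem; the fixed order
succeeds regardless of when the timer fires.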

Sorry for the noise and thanks for your help!
Gilad


-- 
Gilad Ben-Yossef
Chief Coffee Drinker

values of β will give rise to dom!
