linux-kernel - Re: [PATCH 0/2] cpufreq/opp: rework regulator initialization

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20190208092348.uqkfgxwx2pe3lmpd@vireshk-i7>
Date:   Fri, 8 Feb 2019 14:53:48 +0530
From:   Viresh Kumar <viresh.kumar@...aro.org>
To:     Marek Szyprowski <m.szyprowski@...sung.com>
Cc:     linux-kernel@...r.kernel.org, linux-pm@...r.kernel.org,
        linux-samsung-soc@...r.kernel.org,
        "Rafael J . Wysocki" <rjw@...ysocki.net>,
        Nishanth Menon <nm@...com>, Stephen Boyd <sboyd@...nel.org>,
        Bartlomiej Zolnierkiewicz <b.zolnierkie@...sung.com>,
        Dave Gerlach <d-gerlach@...com>,
        Wolfram Sang <wsa@...-dreams.de>
Subject: Re: [PATCH 0/2] cpufreq/opp: rework regulator initialization

On 08-02-19, 10:15, Marek Szyprowski wrote:
> On 2019-02-08 09:55, Viresh Kumar wrote:
> > On 08-02-19, 09:12, Marek Szyprowski wrote:
> >> On 2019-02-08 07:49, Viresh Kumar wrote:
> >>> Why don't you get similar problem during suspend? I think you can get
> >>> it when the CPUs are offlined as I2C would have gone by then. The
> >>> cpufreq or OPP core can try and run some regulator or genpd or clk
> >>> calls to disable resources, etc. Even if doesn't happen, it certainly
> >>> can.
> >> CPUfreq is suspended very early during system suspend and thus it does
> >> nothing when CPUs are being offlined.
> >>> Also at resume the cpufreq core may try to change the frequency right
> >>> from ->init() on certain cases, though not everytime and so the
> >>> problem can come despite of this series.
> >> cpufreq is still in suspended state (it is being 'resume' very late in
> >> the system resume procedure), so if driver doesn't explicitly change any
> >> opp in ->init(), then cpufreq core waits until everything is resumed. To
> >> sum up, this seems to be fine, beside the issue with regulator
> >> initialization I've addressed in this patchset.
> > Yeah, the governors are suspended very soon, but any frequency change
> > starting from cpufreq core can still happen. There are at least two
> > points in cpufreq_online() where we may end up changing the frequency,
> > but that is conditional and may not be getting hit.
> 
> Then probably cpufreq core suspend should handle this.

Handle what ? That code is part of cpufreq_online() and needs to be
there only.

> >>> We guarantee that the resources are available during probe but not
> >>> during resume, that's where the problem is.
> >> Yes, so I've changed cpufreq-dt to the common approach, in which the
> >> driver keeps all needed resources for the whole lifetime of the device.
> > That's not what I was saying actually. I was saying that it should be
> > fine to do a I2C transfer during resume, else we will always have
> > problems and have to fix them with hacks like the one you proposed
> > where you acquire resources for all the possible CPUs. Maybe we can
> > fix it once and for all.
> 
> It is fine to do i2c transfers during cpufreq resume (see

By resume I meant system resume and the whole onlining process of
non-boot CPUs.

> drivers/base/power/main.c dpm_resume() function for exact call place).
> The problem is that such calls are not allowed in ->init(), which might
> be called very early from CPU hotplug path (CPUs are resumed in the
> first step of system resume procedure).

Right and that's where I think we can do something to fix it in a
proper way.

> What's wrong with my proposed fix? It is not that uncommon to gather all
> resources in probe() and keep them until remove() happens.

For cpufreq drivers, we must be doing most of the stuff in init/exit
only as far as possible. I am not saying your patch is bad, that is
the best we can do in such situations. But I don't like that we have
to get the resources for all CPUs, despite the fact that many of them
would be part of the same policy and hence share resources. The
problem was that we need to get sharing-cpus detail as well in probe
then, etc.

I was thinking about doing disable_nonboot_cpus() much earlier as the
userspace is already frozen by then.

@Rafael: Will that slowdown the suspend/resume process? Maybe not as
we are doing everything from a single CPU/thread anyways ?

-- 
viresh