linux-kernel - Re: [PATCH 0/2] cpufreq/opp: rework regulator initialization

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <b0a743d8-ab28-2da3-9cb2-f9bb369ec2dd@samsung.com>
Date:   Fri, 8 Feb 2019 11:02:48 +0100
From:   Marek Szyprowski <m.szyprowski@...sung.com>
To:     Viresh Kumar <viresh.kumar@...aro.org>
Cc:     linux-kernel@...r.kernel.org, linux-pm@...r.kernel.org,
        linux-samsung-soc@...r.kernel.org,
        "Rafael J . Wysocki" <rjw@...ysocki.net>,
        Nishanth Menon <nm@...com>, Stephen Boyd <sboyd@...nel.org>,
        Bartlomiej Zolnierkiewicz <b.zolnierkie@...sung.com>,
        Dave Gerlach <d-gerlach@...com>,
        Wolfram Sang <wsa@...-dreams.de>
Subject: Re: [PATCH 0/2] cpufreq/opp: rework regulator initialization

Hi Viresh,

On 2019-02-08 10:23, Viresh Kumar wrote:
> On 08-02-19, 10:15, Marek Szyprowski wrote:
>> On 2019-02-08 09:55, Viresh Kumar wrote:
>>> On 08-02-19, 09:12, Marek Szyprowski wrote:
>>>> On 2019-02-08 07:49, Viresh Kumar wrote:
>>>>> Why don't you get similar problem during suspend? I think you can get
>>>>> it when the CPUs are offlined as I2C would have gone by then. The
>>>>> cpufreq or OPP core can try and run some regulator or genpd or clk
>>>>> calls to disable resources, etc. Even if doesn't happen, it certainly
>>>>> can.
>>>> CPUfreq is suspended very early during system suspend and thus it does
>>>> nothing when CPUs are being offlined.
>>>>> Also at resume the cpufreq core may try to change the frequency right
>>>>> from ->init() on certain cases, though not everytime and so the
>>>>> problem can come despite of this series.
>>>> cpufreq is still in suspended state (it is being 'resume' very late in
>>>> the system resume procedure), so if driver doesn't explicitly change any
>>>> opp in ->init(), then cpufreq core waits until everything is resumed. To
>>>> sum up, this seems to be fine, beside the issue with regulator
>>>> initialization I've addressed in this patchset.
>>> Yeah, the governors are suspended very soon, but any frequency change
>>> starting from cpufreq core can still happen. There are at least two
>>> points in cpufreq_online() where we may end up changing the frequency,
>>> but that is conditional and may not be getting hit.
>> Then probably cpufreq core suspend should handle this.
> Handle what ? That code is part of cpufreq_online() and needs to be
> there only.

If got it right, it is a matter of handling
CPUFREQ_NEED_INITIAL_FREQ_CHECK flag. Maybe there should be additional
check if CPUfreq is not suspended?

>>>>> We guarantee that the resources are available during probe but not
>>>>> during resume, that's where the problem is.
>>>> Yes, so I've changed cpufreq-dt to the common approach, in which the
>>>> driver keeps all needed resources for the whole lifetime of the device.
>>> That's not what I was saying actually. I was saying that it should be
>>> fine to do a I2C transfer during resume, else we will always have
>>> problems and have to fix them with hacks like the one you proposed
>>> where you acquire resources for all the possible CPUs. Maybe we can
>>> fix it once and for all.
>> It is fine to do i2c transfers during cpufreq resume (see
> By resume I meant system resume and the whole onlining process of
> non-boot CPUs.

Right now those are 2 separate things in cpufreq core.

>> drivers/base/power/main.c dpm_resume() function for exact call place).
>> The problem is that such calls are not allowed in ->init(), which might
>> be called very early from CPU hotplug path (CPUs are resumed in the
>> first step of system resume procedure).
> Right and that's where I think we can do something to fix it in a
> proper way.
>
>> What's wrong with my proposed fix? It is not that uncommon to gather all
>> resources in probe() and keep them until remove() happens.
> For cpufreq drivers, we must be doing most of the stuff in init/exit
> only as far as possible. I am not saying your patch is bad, that is
> the best we can do in such situations. But I don't like that we have
> to get the resources for all CPUs, despite the fact that many of them
> would be part of the same policy and hence share resources. The
> problem was that we need to get sharing-cpus detail as well in probe
> then, etc.

Both resources in this case: clocks and regulators are refcounted by
their frameworks, so the cost of getting a few more references for them
is imho negligible.

> I was thinking about doing disable_nonboot_cpus() much earlier as the
> userspace is already frozen by then.
>
> @Rafael: Will that slowdown the suspend/resume process? Maybe not as
> we are doing everything from a single CPU/thread anyways ?

For some reasons drivers are handled partially asynchronously in
suspend/resume procedure and can do suspend and resume in parallel. I
don't think that limiting the whole suspend/resume process to a single
cpu is the best we can do.

Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland