lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <8b1bdb9f-3942-8af2-3d7c-c6e53a6392d4@arm.com>
Date:   Wed, 18 Jan 2023 11:55:01 +0000
From:   Lukasz Luba <lukasz.luba@....com>
To:     Vincent Wang3 <vincentwang3@...ovo.com>
Cc:     "rafael@...nel.org" <rafael@...nel.org>,
        "linux-pm@...r.kernel.org" <linux-pm@...r.kernel.org>,
        Vincent Wang <bhuwz@....com>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        Viresh Kumar <viresh.kumar@...aro.org>,
        "qperret@...gle.com" <qperret@...gle.com>,
        Todd Kjos <tkjos@...gle.com>
Subject: Re: [PATCH v2] cpufreq: Register with perf domain before


+Quentin and Todd

On 1/18/23 10:00, Vincent Wang3 wrote:
> Hi, Lukasz, Viresh
> 
> I found this issue on Android phone with kernel 5.15, and the governor is schedutil.
> With the pd_init issue, the rd->pd will be NULL and EAS doesn't work since rd->pd will be checked in function find_energy_efficient_cpu( ).

That's true, but that's not mainline issue. This discussion should
be moved to Android kernel bug tracking system.
I've seen probably similar issue in Android.

I have added Quentin and Todd who are more familiar with this design in
Android kernel.

> 
> I didn't notice the modification in schedutil at mainline, so I submitted this patch to fix the problem, sorry!
> 
> However, besides EAS will check rd->pd, I noticed that it's also used in the find_busiest_group() and update_cpu_capacity() functions at mainline, so I'm worried that may not be enough to circumvent it only in schedutil.

Those function indeed use the perf domain, but they don't use the
Energy Model pointer: rd->pd->em_pd, so they don't suffer.

> 
> 
> detail of the exact code path:
> in function register_cpufreq_notifier( ), the notifier_block init_cpu_capacity_notifier is registered with CPUFREQ_POLICY_NOTIFIER.
> During the initialization of a new policy, the notifier_call function init_cpu_capacity_callback( ) will be called,in which the work update_topology_flags_work will be scheduled.
> 
> for the kwork function update_topology_flags_workfn( ):
> update_topology_flags_workfn( ) -> rebuild_sched_domains( ) -> rebuild_sched_domains_locked( ) -> partition_sched_domains_locked( ) -> build_perf_domains( ) -> pd_init( )
> 
> 
> BRs
> Vincent
> 
> -----邮件原件-----
> 发件人: Lukasz Luba <lukasz.luba@....com>
> 发送时间: 2023年1月18日 17:24
> 收件人: Viresh Kumar <viresh.kumar@...aro.org>; Vincent Wang <bhuwz@....com>; Vincent Wang3 <vincentwang3@...ovo.com>
> 抄送: rafael@...nel.org; linux-pm@...r.kernel.org; linux-kernel@...r.kernel.org
> 主题: [External] Re: [PATCH v2] cpufreq: Register with perf domain before
> 
> Hi Viresh, Vincent
> 
> I'm surprised seeing this thread and thanks that you Viresh have answered, so it could go through my spam/junk filters.
> (I hope when answer to that domain it would change something).
> 
> On 1/18/23 08:49, Viresh Kumar wrote:
>> On 18-01-23, 12:47, Vincent Wang wrote:
>>> From: Vincent Wang <vincentwang3@...ovo.com>
>>>
>>> We found the following issue during kernel boot on android phone:
>>>
>>> [    1.325272][    T1] cpu cpu0: EM: created perf domain
>>> [    1.329317][    T1] cpu cpu4: EM: created perf domain
>>> [    1.337597][   T76] pd_init: no EM found for CPU7
>>> [    1.350849][    T1] cpu cpu7: EM: created perf domain
>>>
>>> pd init for cluster2 is executed in a kworker thread and is earlier
>>> than the perf domain creation for cluster2.
>>
>> Can you please give detail of the exact code path, for mainline kernel
>> ? I am not sure which kworker thread are you talking about here.
> 
> Please also tell us your cpufreq governor. The schedutil governor at mainline can handle those situations and we rebuild the perf domains here [1].
> 
>>
>>> pd_init() is called from the cpufreq notification of
>>> CPUFREQ_CREATE_POLICY in cpufreq_online(), which is earlier than that
>>> cpufreq_driver->register_em() is called.
> 
> Viresh, If that's an issue for other governors, than maybe we should address that. IMO the patch just relies on side-effect in arch_topology.c update_topology_flags_workfn(). That would be a workaround and dangerous if someone would change that arch_topology.c design. Shouldn't we come up with something reliable inside the cpufreq.c if there is a real issue?
> Even now, these two mechanisms:
> 1. in the schedutil [1]
> 2. in the update_topology_flags_workfn() are a bit leaky (in terms of design holes).
> 
> Regards,
> Lukasz
> 
> [1]
> https://apc01.safelinks.protection.outlook.com/?url=https%3A%2F%2Felixir.bootlin.com%2Flinux%2Flatest%2Fsource%2Fkernel%2Fsched%2Fcpufreq_schedutil.c%23L858&data=05%7C01%7Cvincentwang3%40lenovo.com%7Cc69b7dca4401474bd23808daf935bfd5%7C5c7d0b28bdf8410caa934df372b16203%7C0%7C0%7C638096306507392144%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=WTPpWDytIsbfdW920cGNiHhEs7udgrX0O78sAh8JV7I%3D&reserved=0
> 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ