Message-ID: <c8bca664-76cf-52d7-bd73-795b467c460b@linux.vnet.ibm.com>
Date: Sat, 5 Aug 2023 21:07:18 +0530
From: Shrikanth Hegde <sshegde@...ux.vnet.ibm.com>
To: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
Swapnil Sapkal <Swapnil.Sapkal@....com>
Cc: linux-kernel@...r.kernel.org, Ingo Molnar <mingo@...hat.com>,
Valentin Schneider <vschneid@...hat.com>,
Steven Rostedt <rostedt@...dmis.org>,
Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>,
Daniel Bristot de Oliveira <bristot@...hat.com>,
Vincent Guittot <vincent.guittot@...aro.org>,
Juri Lelli <juri.lelli@...hat.com>,
Aaron Lu <aaron.lu@...el.com>, x86@...nel.org,
Peter Zijlstra <peterz@...radead.org>,
Srikar Dronamraju <srikar@...ux.vnet.ibm.com>
Subject: Re: [RFC PATCH 1/1] sched: Extend cpu idle state for 1ms
On 8/4/23 1:42 AM, Mathieu Desnoyers wrote:
> On 8/3/23 01:53, Swapnil Sapkal wrote:
> [...]
>
> Those are interesting metrics. I still have no clue why it behaves that
> way though.
I was thinking this might be the case. Some workloads would benefit while
some would suffer, especially ones which favor latency over cache locality.
>
> More specifically: I also noticed that the number of migrations is
> heavily affected, and that select_task_rq behavior changes drastically.
> I'm unsure why though.
>
FWIU, load_balance() uses idle_cpu() to compute the number of idle CPUs in the
sched_domain. That may be getting confused by the 1ms-delay concept: the sched_domains
likely stay balanced because of it, and hence there are fewer migrations.
In select_task_rq_fair(), wake_affine_idle() will return prev_cpu since idle_cpu()
returns true more often, so the task gets woken on the same CPU as before instead of migrating.
On SMT systems there is a further gain from keeping the work on a single CPU of the core,
which lets the core run in ST mode (subject to utilization). Running in ST mode is faster
than running in SMT mode.
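
For reference, a condensed sketch of the two code paths I am referring to, paraphrased
from kernel/sched/fair.c around v6.5 (exact code differs slightly; note that
wake_affine_idle() actually goes through available_idle_cpu(), which wraps idle_cpu()):

/* update_sg_lb_stats(): per-group idle CPU accounting used by load_balance() */
if (!nr_running && idle_cpu(i))
        sgs->idle_cpus++;

/* wake_affine_idle(): prefer prev_cpu when it is (still reported as) idle */
static int wake_affine_idle(int this_cpu, int prev_cpu, int sync)
{
        /* Wakeup from an idle CPU: stay cache affine if possible */
        if (available_idle_cpu(this_cpu) && cpus_share_cache(this_cpu, prev_cpu))
                return available_idle_cpu(prev_cpu) ? prev_cpu : this_cpu;

        if (sync && cpu_rq(this_cpu)->nr_running == 1)
                return this_cpu;

        /* Extending the reported idle window makes this succeed more often */
        if (available_idle_cpu(prev_cpu))
                return prev_cpu;

        return nr_cpumask_bits;       /* no affine preference */
}

So if idle_cpu() keeps returning true for ~1ms after a CPU goes idle, both the
group-level idle count and the wake-affine decision are biased toward leaving
the task where it was.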
-------------------------------------------------------------------------------------------
Ran hackbench with perf stat on an SMT system. It indicates slightly higher ST-mode cycles,
and IPC improves slightly, making it faster.
baseline 6.5-rc1:
hackbench -pipe (50 groups)
Time: 0.67 ( Average of 50 runs)
94,432,028,029 instructions # 0.52 insn per cycle
168,130,543,309 cycles (% of total cycles)
1,162,153,934 PM_RUN_CYC_ST_MODE ( 0.70% )
613,018,646 PM_RUN_CYC_SMT2_MODE ( 0.35% )
166,358,778,832 PM_RUN_CYC_SMT4_MODE (98.95% )
With the latest patch in this series applied:
https://lore.kernel.org/lkml/447f756c-9c79-f801-8257-a97cc8256efe@efficios.com/#t
hackbench -pipe (50 groups)
Time: 0.62 ( Average of 50 runs)
92,078,390,150 instructions # 0.55 insn per cycle
159,368,662,574 cycles
1,330,653,107 PM_RUN_CYC_ST_MODE ( 0.83% )
656,950,636 PM_RUN_CYC_SMT2_MODE ( 0.41% )
157,384,470,123 PM_RUN_CYC_SMT4_MODE (98.75% )
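Spelling out the comparison from the raw numbers above: total cycles drop from
~168.1e9 to ~159.4e9 (about 5%), the ST-mode share of total cycles rises from
~0.70% to ~0.83%, and insn per cycle goes from 0.52 to 0.55, which lines up with
the ~7% lower average wall time (0.67s -> 0.62s).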
>>
>> Can you share your build config just in case I am missing something.
>
> Build config attached.
>
> Thanks,
>
> Mathieu
>
>>
>>>
>>> And using it now brings the hackbench wall time at 28s :)
>>>
>>> Thanks,
>>>
>>> Mathieu
>>>
>>>>
>>>>>>> struct task_struct *stop;
>>>>>>> unsigned long next_balance;
>>>>>>> struct mm_struct *prev_mm;
>>>>>
>>>
>> --
>> Thanks and regards,
>> Swapnil
>