lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ZQFj2hk4i1QAWEQp@chenyu5-mobl2.ccr.corp.intel.com>
Date:   Wed, 13 Sep 2023 15:25:14 +0800
From:   Chen Yu <yu.c.chen@...el.com>
To:     "Gautham R. Shenoy" <gautham.shenoy@....com>
CC:     Aaron Lu <aaron.lu@...el.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
        Ingo Molnar <mingo@...hat.com>,
        Vincent Guittot <vincent.guittot@...aro.org>,
        Juri Lelli <juri.lelli@...hat.com>,
        Tim Chen <tim.c.chen@...el.com>,
        Dietmar Eggemann <dietmar.eggemann@....com>,
        Steven Rostedt <rostedt@...dmis.org>,
        Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>,
        "Daniel Bristot de Oliveira" <bristot@...hat.com>,
        Valentin Schneider <vschneid@...hat.com>,
        "K Prateek Nayak" <kprateek.nayak@....com>,
        <linux-kernel@...r.kernel.org>
Subject: Re: [RFC PATCH 2/2] sched/fair: skip the cache hot CPU in
 select_idle_cpu()

Hi Gautham,

thanks for the review,

On 2023-09-13 at 11:52:14 +0530, Gautham R. Shenoy wrote:
> On Mon, Sep 11, 2023 at 04:40:02PM +0800, Chen Yu wrote:
> > Hi Aaron,
> > 
> > thanks for the review,
> > 
> > On 2023-09-11 at 15:26:29 +0800, Aaron Lu wrote:
> 
> [..snip..]
> 
> > > > @@ -6982,8 +6997,13 @@ static inline int find_idlest_cpu(struct sched_domain *sd, struct task_struct *p
> > > >  static inline int __select_idle_cpu(int cpu, struct task_struct *p)
> > > >  {
> > > >  	if ((available_idle_cpu(cpu) || sched_idle_cpu(cpu)) &&
> > > > -	    sched_cpu_cookie_match(cpu_rq(cpu), p))
> > > > +	    sched_cpu_cookie_match(cpu_rq(cpu), p)) {
> > > > +		if (sched_feat(SIS_CACHE) &&
> > > > +		    sched_clock_cpu(cpu) < cpu_rq(cpu)->cache_hot_timeout)
> > > > +			return -1;
> > > > +
> > > 
> > > Maybe introduce a new function that also considers rq->cache_hot_timeout,
> > > like available_idle_cpu_migrate() so that above and below logic can be
> > > simplified a bit?
> > > 
> > 
> > Yes, that would be simpler, I'll do in next version.
> > 
> > > I was thinking to simply add that rq->cache_hot_timeout check to
> > > available_idle_cpu() but then a long sleeping task could be forced to
> > > migrate if its prev_cpu happens to just deschedule a task that sets rq's
> > > cache_hot_timeout. I guess that's why you chose to only change the idle
> > > semantic in select_idle_cpu() but not in select_idle_sibling()?
> > >
> > 
> > Yes, sort of. And the reason I did not put this cache hot check in available_idle_cpu()
> > or idle_cpu() was mainly because these APIs are generic and could be invoked by select_idle_sibling().
> > If the task fall asleep and woken up quickly, its previous idle CPU will also be skipped,
> > thus no one could use this CPU within the cache hot period, including the cache-hot task
> > itself.
> 
> This happens even with this patch right? It is possible for a task p1
> whose avg sleep time is "t" to go to sleep which causes its CPU to go
> idle. When it wakes up after a time t' < t, the logic above skips the
> idle CPU because it is still cache-hot, despite the fact that it is
> cache hot for p1!
> 
Not sure if I understand correctly, in select_idle_sibling() p1's previous
CPU will be checked first, and that check does not involve cache-hot. So if
p1's previous CPU is idle, it will be picked, no?

        if (prev != target && cpus_share_cache(prev, target) &&
            (available_idle_cpu(prev) || sched_idle_cpu(prev)) &&
            asym_fits_cpu(task_util, util_min, util_max, prev))
                return prev;

Or do you mean, in select_idle_cpu(), we will re-check p1's previous
CPU but it is skipped due to cache-hot?

> Have you considered recording p1's identity in the
> rq->cache_hot_sleeper so that in select_task_rq_fair(), we can simply
> return the previous CPU if it is cache hot and the wakee is
> rq->cache_hot_sleeper, thus avoiding the whole select_idle_sibling
> scan.
> 

Yes this seems to be donable, and one problem would be, if there are
more than 2 dequeued tasks prefer the same (previous) CPU, which task
should be the rq->cache_hot_sleeper. And per Mathieu's feedback[1], we
want to deal with multiple dequeued tasks. If we record all of them,
it might be costly.

[1] https://lore.kernel.org/lkml/2a47ae82-b8cd-95db-9f48-82b3df0730f3@efficios.com/

thanks,
Chenyu

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ