lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Wed, 15 Apr 2020 11:34:08 +0800
From:   Aaron Lu <aaron.lwe@...il.com>
To:     Peter Zijlstra <peterz@...radead.org>,
        Vineeth Remanan Pillai <vpillai@...italocean.com>
Cc:     Nishanth Aravamudan <naravamudan@...italocean.com>,
        Julien Desfossez <jdesfossez@...italocean.com>,
        Tim Chen <tim.c.chen@...ux.intel.com>, mingo@...nel.org,
        tglx@...utronix.de, pjt@...gle.com, torvalds@...ux-foundation.org,
        Aaron Lu <aaron.lu@...ux.alibaba.com>,
        linux-kernel@...r.kernel.org, fweisbec@...il.com,
        keescook@...omium.org, kerrnel@...gle.com,
        Phil Auld <pauld@...hat.com>,
        Aubrey Li <aubrey.intel@...il.com>, aubrey.li@...ux.intel.com,
        Valentin Schneider <valentin.schneider@....com>,
        Mel Gorman <mgorman@...hsingularity.net>,
        Pawan Gupta <pawan.kumar.gupta@...ux.intel.com>,
        Paolo Bonzini <pbonzini@...hat.com>,
        Joel Fernandes <joelaf@...gle.com>, joel@...lfernandes.org
Subject: Re: [RFC PATCH 09/13] sched/fair: core wide vruntime comparison

On Tue, Apr 14, 2020 at 03:56:24PM +0200, Peter Zijlstra wrote:
> On Wed, Mar 04, 2020 at 04:59:59PM +0000, vpillai wrote:
> > From: Aaron Lu <aaron.lu@...ux.alibaba.com>
> > 
> > This patch provides a vruntime based way to compare two cfs task's
> > priority, be it on the same cpu or different threads of the same core.
> > 
> > When the two tasks are on the same CPU, we just need to find a common
> > cfs_rq both sched_entities are on and then do the comparison.
> > 
> > When the two tasks are on differen threads of the same core, the root
> > level sched_entities to which the two tasks belong will be used to do
> > the comparison.
> > 
> > An ugly illustration for the cross CPU case:
> > 
> >    cpu0         cpu1
> >  /   |  \     /   |  \
> > se1 se2 se3  se4 se5 se6
> >     /  \            /   \
> >   se21 se22       se61  se62
> > 
> > Assume CPU0 and CPU1 are smt siblings and task A's se is se21 while
> > task B's se is se61. To compare priority of task A and B, we compare
> > priority of se2 and se6. Whose vruntime is smaller, who wins.
> > 
> > To make this work, the root level se should have a common cfs_rq min
> > vuntime, which I call it the core cfs_rq min vruntime.
> > 
> > When we adjust the min_vruntime of rq->core, we need to propgate
> > that down the tree so as to not cause starvation of existing tasks
> > based on previous vruntime.
> 
> You forgot the time complexity analysis.

This is a mistake and the adjust should be needed only once when core
scheduling is initially enabled. It is an initialization thing and there
is no reason to do it in every invocation of coresched_adjust_vruntime().

Vineeth,

I think we have talked about this before and you agreed that it is
needed only once:
https://lore.kernel.org/lkml/20191012035503.GA113034@aaronlu/
https://lore.kernel.org/lkml/CANaguZBevMsQ_Zy1ozKn2Z5Uj6WBviC6UU+zpTQOVdDDLK6r2w@mail.gmail.com/

I'll see how to fix it, but feel free to beat me to it.
 
> > +static void coresched_adjust_vruntime(struct cfs_rq *cfs_rq, u64 delta)
> > +{
> > +	struct sched_entity *se, *next;
> > +
> > +	if (!cfs_rq)
> > +		return;
> > +
> > +	cfs_rq->min_vruntime -= delta;
> > +	rbtree_postorder_for_each_entry_safe(se, next,
> > +			&cfs_rq->tasks_timeline.rb_root, run_node) {
> 
> Which per this ^
> 
> > +		if (se->vruntime > delta)
> > +			se->vruntime -= delta;
> > +		if (se->my_q)
> > +			coresched_adjust_vruntime(se->my_q, delta);
> > +	}
> > +}
> 
> > @@ -511,6 +607,7 @@ static void update_min_vruntime(struct cfs_rq *cfs_rq)
> >  
> >  	/* ensure we never gain time by being placed backwards. */
> >  	cfs_rq->min_vruntime = max_vruntime(cfs_rq_min_vruntime(cfs_rq), vruntime);
> > +	update_core_cfs_rq_min_vruntime(cfs_rq);
> >  #ifndef CONFIG_64BIT
> >  	smp_wmb();
> >  	cfs_rq->min_vruntime_copy = cfs_rq->min_vruntime;
> 
> as called from here, is exceedingly important.
> 
> Worse, I don't think our post-order iteration is even O(n).
> 
> 
> All of this is exceedingly yuck.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ