linux-kernel - Re: [PATCH 03/15] sched/fair: Add lag based placement

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20231012223428.GP6307@noisy.programming.kicks-ass.net>
Date:   Fri, 13 Oct 2023 00:34:28 +0200
From:   Peter Zijlstra <peterz@...radead.org>
To:     Benjamin Segall <bsegall@...gle.com>
Cc:     mingo@...nel.org, vincent.guittot@...aro.org,
        linux-kernel@...r.kernel.org, juri.lelli@...hat.com,
        dietmar.eggemann@....com, rostedt@...dmis.org, mgorman@...e.de,
        bristot@...hat.com, corbet@....net, qyousef@...alina.io,
        chris.hyser@...cle.com, patrick.bellasi@...bug.net, pjt@...gle.com,
        pavel@....cz, qperret@...gle.com, tim.c.chen@...ux.intel.com,
        joshdon@...gle.com, timj@....org, kprateek.nayak@....com,
        yu.c.chen@...el.com, youssefesmat@...omium.org,
        joel@...lfernandes.org, efault@....de, tglx@...utronix.de
Subject: Re: [PATCH 03/15] sched/fair: Add lag based placement

On Thu, Oct 12, 2023 at 12:15:12PM -0700, Benjamin Segall wrote:
> Peter Zijlstra <peterz@...radead.org> writes:
> 
> > @@ -4853,49 +4872,119 @@ static void
> >  place_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int initial)
> >  {
> >  	u64 vruntime = avg_vruntime(cfs_rq);
> > +	s64 lag = 0;
> >  
> > -	/* sleeps up to a single latency don't count. */
> > -	if (!initial) {
> > -		unsigned long thresh;
> > +	/*
> > +	 * Due to how V is constructed as the weighted average of entities,
> > +	 * adding tasks with positive lag, or removing tasks with negative lag
> > +	 * will move 'time' backwards, this can screw around with the lag of
> > +	 * other tasks.
> > +	 *
> > +	 * EEVDF: placement strategy #1 / #2
> > +	 */
> 
> So the big problem with EEVDF #1 compared to #2/#3 and CFS (hacky though
> it is) is that it creates a significant perverse incentive to yield or
> spin until you see yourself be preempted, rather than just sleep (if you
> have any competition on the cpu). If you go to sleep immediately after
> doing work and happen to do so near the end of a slice (arguably what
> you _want_ to have happen overall), then you have to pay that negative
> lag in wakeup latency later, because it is maintained through any amount
> of sleep. (#1 or similar is good for reweight/migrate of course)
> 
> #2 in theory could be abused by micro-sleeping right before you are
> preempted, but that isn't something tasks can really predict, unlike
> seeing more "don't go to sleep, just spin, the latency numbers are so
> much better" nonsense.

Right, so I do have this:

  https://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git/commit/?h=sched/eevdf&id=344944e06f11da25b49328825ed15fedd63036d3

That allows tasks to sleep away the lag -- with all the gnarly bits that
sleep time has. And it reliably fixes the above. However, it also
depresses a bunch of other stuff. Never a free lunch etc.

It is so far the least horrible of the things I've tried.