Open Source and information security mailing list archives
Message-ID: <0befc9ed8979594d790a8d4fe7ff5c5534c61c3c.camel@gmx.de>
Date: Tue, 12 Nov 2024 15:23:38 +0100
From: Mike Galbraith <efault@....de>
To: Phil Auld <pauld@...hat.com>
Cc: Peter Zijlstra <peterz@...radead.org>, mingo@...hat.com, 
 juri.lelli@...hat.com, vincent.guittot@...aro.org,
 dietmar.eggemann@....com,  rostedt@...dmis.org, bsegall@...gle.com,
 mgorman@...e.de, vschneid@...hat.com,  linux-kernel@...r.kernel.org,
 kprateek.nayak@....com, wuyun.abel@...edance.com, 
 youssefesmat@...omium.org, tglx@...utronix.de
Subject: Re: [PATCH] sched/fair: Dequeue sched_delayed tasks when waking to
 a busy CPU

On Tue, 2024-11-12 at 07:41 -0500, Phil Auld wrote:
> On Tue, Nov 12, 2024 at 08:05:04AM +0100 Mike Galbraith wrote:
> >
> > Unfortunate change log place holder below aside, I think this patch may
> > need to be yanked as trading one not readily repeatable regression for
> > at least one that definitely is, and likely multiple others.
> >
> > (adds knob)
> >
>
> Yes, I was just coming here to reply. I have the results from the first
> version of the patch (I don't think the later one fundamentally changed
> enough that it will matter, but those results are still pending).
>
> Not entirely surprisingly, we've traded a ~10% rand write regression for
> a 5-10% rand read regression. This makes sense to me since the reads are
> more likely to be synchronous and thus be more buddy-like, benefiting
> from flipping back and forth on the same cpu.

Ok, that would seem to second "shoot it".

> I'd probably have to take the reads over the writes in such a trade off :)
>
> > tbench 8
> >
> > NO_MIGRATE_DELAYED    3613.49 MB/sec
> > MIGRATE_DELAYED       3145.59 MB/sec
> > NO_DELAY_DEQUEUE      3355.42 MB/sec
> >
> > First line is DELAY_DEQUEUE restoring pre-EEVDF tbench throughput as
> > I've mentioned it doing, but $subject promptly did away with that and
> > then some.
> >
>
> Yep, that's not pretty.

Yeah, not to mention annoying.

I get the "adds bounce cache pain" aspect, but not why pre-EEVDF
wouldn't have been just as heavily affected, given it had nothing
blocking high frequency migration (the eternal scheduler boogieman:).

Bottom line would appear to be that these survivors should be left
where they ended up, either due to LB or more likely bog standard
prev_cpu locality, for they are part and parcel of a progression.

> > I thought I might be able to do away with the reservation like side
> > effect of DELAY_DEQUEUE by borrowing h_nr_delayed from...
> >
> >      sched/eevdf: More PELT vs DELAYED_DEQUEUE
> >
> > ...for cgroups free test config, but Q/D poke at idle_cpu() helped not
> > at all.

We don't have to let sched_delayed tasks block SIS, though.  Rendering
them transparent in idle_cpu() did NOT wreck the progression, so
maaaybe it could help your regression.
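For the archives, a rough sketch of what "rendering them transparent"
might look like.  This is NOT the actual patch, just an illustration of
the idea: have idle_cpu() ignore CPUs whose only remaining enqueued
tasks are sched_delayed, using the h_nr_delayed count from the
"sched/eevdf: More PELT vs DELAYED_DEQUEUE" work mentioned above.  The
exact field placement (rq->cfs.h_nr_delayed) is assumed here, and the
cgroup case would need the hierarchical count to be right:

```c
/*
 * Hypothetical sketch: treat lingering sched_delayed tasks as
 * transparent for select_idle_sibling()'s idle_cpu() test, so a
 * delayed-dequeue leftover doesn't make the CPU look busy to SIS.
 */
int idle_cpu(int cpu)
{
	struct rq *rq = cpu_rq(cpu);

	if (rq->curr != rq->idle)
		return 0;

	/*
	 * Everything still enqueued is sched_delayed: nothing will
	 * actually run, so consider the CPU idle for wakeup placement.
	 * (h_nr_delayed per this thread's referenced PELT series.)
	 */
	if (rq->nr_running > rq->cfs.h_nr_delayed)
		return 0;

	if (rq->ttwu_pending)
		return 0;

	return 1;
}
```

The delayed task would then either be picked up where it sits or be
dequeued properly when its vruntime becomes eligible, rather than
repelling wakeups in the meantime.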

> I wonder if the last_wakee stuff could be leveraged here (an idle thought,
> so to speak). Haven't looked closely enough.

If you mean heuristics, the fewer of those we have, the better off we
are... they _always_ find a way to embed their teeth in your backside.

	-Mike
