Date:   Sun, 03 Oct 2021 16:52:45 +0200
From:   Mike Galbraith <efault@....de>
To:     Barry Song <21cnbao@...il.com>
Cc:     Mel Gorman <mgorman@...hsingularity.net>,
        Peter Zijlstra <peterz@...radead.org>,
        Ingo Molnar <mingo@...nel.org>,
        Vincent Guittot <vincent.guittot@...aro.org>,
        Valentin Schneider <valentin.schneider@....com>,
        Aubrey Li <aubrey.li@...ux.intel.com>,
        Barry Song <song.bao.hua@...ilicon.com>,
        Srikar Dronamraju <srikar@...ux.vnet.ibm.com>,
        LKML <linux-kernel@...r.kernel.org>
Subject: Re: wakeup_affine_weight() is b0rked - was Re: [PATCH 2/2]
 sched/fair: Scale wakeup granularity relative to nr_running

On Sun, 2021-10-03 at 20:34 +1300, Barry Song wrote:
> >
> > I looked into that crazy stacking depth...
> >
> > static int
> > wake_affine_weight(struct sched_domain *sd, struct task_struct *p,
> >                    int this_cpu, int prev_cpu, int sync)
> > {
> >         s64 this_eff_load, prev_eff_load;
> >         unsigned long task_load;
> >
> >         this_eff_load = cpu_load(cpu_rq(this_cpu));
> >                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^ the butler didit!
> >
> > That's pretty darn busted as it sits.  Between load updates, X, or any
> > other waker of many, can stack wakees to a ludicrous depth.  Tracing
> > kbuild vs firefox playing a youtube clip, I watched X stack 20 of the
> > zillion firefox minions while their previous CPUs all had 1 lousy task
> > running but a cpu_load() higher than the cpu_load() of X's CPU.  Most
> > of those prev_cpus were where X had left them when it migrated. Each
> > and every crazy depth migration was wake_affine_weight() deciding we
> > should pull based on crappy data.  As instantaneous load on the waker
> > CPU blew through the roof in my trace snapshot, its cpu_load() did
> > finally budge.. a tiny bit.. downward.  No idea where the stack would
> > have topped out, my tracing_off() limit was 20.
>
> Mike, I'm not quite sure I caught your point. It seems you mean X wakes up
> many firefoxes within a short period, so it pulls them to the CPU where X
> is running. Technically, those pulls should increase the cpu_load of X's CPU.
> But for some reason, the cpu_load of X's CPU is not increased in time,
> so a lot of firefoxes pile up on X's CPU, while at that time the load
> of the CPU each firefox was running on is still larger than that of X's CPU,
> even with all those firefoxes stacked on it?

It looked like this.

X-2211    [007] d...211  2327.810997: select_task_rq_fair: this_run/load:4:373 prev_run/load:4:373 waking firefox:4971 CPU7 ==> CPU7
X-2211    [007] d...211  2327.811004: select_task_rq_fair: this_run/load:5:373 prev_run/load:1:1029 waking QXcbEventQueue:4952 CPU0 ==> CPU7
X-2211    [007] d...211  2327.811010: select_task_rq_fair: this_run/load:6:373 prev_run/load:1:1528 waking QXcbEventQueue:3969 CPU5 ==> CPU7
X-2211    [007] d...211  2327.811015: select_task_rq_fair: this_run/load:7:373 prev_run/load:1:1029 waking evolution-alarm:3833 CPU0 ==> CPU7
X-2211    [007] d...211  2327.811021: select_task_rq_fair: this_run/load:8:373 prev_run/load:1:1528 waking QXcbEventQueue:3860 CPU5 ==> CPU7
X-2211    [007] d...211  2327.811026: select_task_rq_fair: this_run/load:8:373 prev_run/load:1:1528 waking QXcbEventQueue:3800 CPU5 ==> CPU7
X-2211    [007] d...211  2327.811032: select_task_rq_fair: this_run/load:9:373 prev_run/load:1:1528 waking xdg-desktop-por:3341 CPU5 ==> CPU7
X-2211    [007] d...211  2327.811037: select_task_rq_fair: this_run/load:10:373 prev_run/load:1:289 waking at-spi2-registr:3165 CPU4 ==> CPU7
X-2211    [007] d...211  2327.811042: select_task_rq_fair: this_run/load:11:373 prev_run/load:1:1029 waking ibus-ui-gtk3:2865 CPU0 ==> CPU0
X-2211    [007] d...211  2327.811049: select_task_rq_fair: this_run/load:11:373 prev_run/load:1:226 waking ibus-x11:2868 CPU2 ==> CPU2
X-2211    [007] d...211  2327.811054: select_task_rq_fair: this_run/load:11:373 prev_run/load:11:373 waking ibus-extension-:2866 CPU7 ==> CPU7
X-2211    [007] d...211  2327.811059: select_task_rq_fair: this_run/load:12:373 prev_run/load:1:289 waking QXcbEventQueue:2804 CPU4 ==> CPU7
X-2211    [007] d...211  2327.811063: select_task_rq_fair: this_run/load:13:373 prev_run/load:1:935 waking QXcbEventQueue:2756 CPU1 ==> CPU7
X-2211    [007] d...211  2327.811068: select_task_rq_fair: this_run/load:14:373 prev_run/load:1:1528 waking QXcbEventQueue:2753 CPU5 ==> CPU7
X-2211    [007] d...211  2327.811074: select_task_rq_fair: this_run/load:15:373 prev_run/load:1:1528 waking QXcbEventQueue:2741 CPU5 ==> CPU7
X-2211    [007] d...211  2327.811079: select_task_rq_fair: this_run/load:16:373 prev_run/load:1:1528 waking QXcbEventQueue:2730 CPU5 ==> CPU7
X-2211    [007] d...211  2327.811085: select_task_rq_fair: this_run/load:17:373 prev_run/load:1:5 waking QXcbEventQueue:2724 CPU0 ==> CPU0
X-2211    [007] d...211  2327.811090: select_task_rq_fair: this_run/load:17:373 prev_run/load:1:1010 waking QXcbEventQueue:2721 CPU6 ==> CPU7
X-2211    [007] d...211  2327.811096: select_task_rq_fair: this_run/load:18:373 prev_run/load:1:1528 waking QXcbEventQueue:2720 CPU5 ==> CPU7
X-2211    [007] d...211  2327.811101: select_task_rq_fair: this_run/load:19:373 prev_run/load:1:1528 waking QXcbEventQueue:2704 CPU5 ==> CPU7
X-2211    [007] d...211  2327.811105: select_task_rq_fair: this_run/load:20:373 prev_run/load:0:226 waking QXcbEventQueue:2705 CPU2 ==> CPU2
X-2211    [007] d...211  2327.811110: select_task_rq_fair: this_run/load:19:342 prev_run/load:1:1528 waking QXcbEventQueue:2695 CPU5 ==> CPU7
X-2211    [007] d...211  2327.811115: select_task_rq_fair: this_run/load:20:342 prev_run/load:1:1528 waking QXcbEventQueue:2694 CPU5 ==> CPU7
X-2211    [007] d...211  2327.811120: select_task_rq_fair: this_run/load:21:342 prev_run/load:1:1528 waking QXcbEventQueue:2679 CPU5 ==> CPU7

Legend: foo_run/load:foo->nr_running:cpu_load(foo)

Every migration to CPU7 in the above was due to wake_affine_weight()
seeing more or less static effective load numbers (the trace was wider,
showing which path was taken).

> I am wondering if this should be the responsibility of wake_wide()?

That's a good point.  I'm not so sure that would absolve use of what
appears to be stagnant state though.  If we hadn't gotten there, this
stack obviously wouldn't have happened.. but we did get there, and
state that was used did not reflect reality.  wake_wide() deflecting
this particular gaggle wouldn't have improved state accuracy one whit
for a subsequent wakeup, or?

	-Mike
