Message-ID: <20231011105329.GA17066@noisy.programming.kicks-ass.net>
Date:   Wed, 11 Oct 2023 12:53:29 +0200
From:   Peter Zijlstra <peterz@...radead.org>
To:     Ankit Jain <ankitja@...are.com>
Cc:     yury.norov@...il.com, andriy.shevchenko@...ux.intel.com,
        linux@...musvillemoes.dk, qyousef@...alina.io, pjt@...gle.com,
        joshdon@...gle.com, bristot@...hat.com, vschneid@...hat.com,
        linux-kernel@...r.kernel.org, namit@...are.com,
        amakhalov@...are.com, srinidhir@...are.com, vsirnapalli@...are.com,
        vbrahmajosyula@...are.com, akaher@...are.com,
        srivatsa@...il.mit.edu
Subject: Re: [PATCH RFC] cpumask: Randomly distribute the tasks within
 affinity mask

On Wed, Oct 11, 2023 at 12:49:25PM +0530, Ankit Jain wrote:
> commit 46a87b3851f0 ("sched/core: Distribute tasks within affinity masks")
> and commit 14e292f8d453 ("sched,rt: Use cpumask_any*_distribute()")
> introduced the logic to distribute tasks at initial wakeup on cpus
> where load balancing works poorly or is disabled entirely (isolated cpus).
> 
> There are cases in which the distribution of tasks
> that are spawned on isolcpus does not happen properly.
> In production deployments, the initial wakeup of tasks spawned from
> housekeeping cpus onto isolcpus [nohz_full cpus] happens on the first cpu
> within the isolcpus range instead of being distributed across the isolcpus.
> 
> Use of distribute_cpu_mask_prev by one process group clobbers
> the previous value stored by other groups, and vice versa.
> 
> When housekeeping cpus spawn multiple child tasks to wake up on
> isolcpus [nohz_full cpus], using cpusets.cpus/sched_setaffinity(),
> distribution is currently performed based on the per-cpu
> distribute_cpu_mask_prev counter.
> At the same time, the housekeeping cpus run per-cpu bound timer
> interrupts, rcu threads and other system/user tasks whose affinity
> is the housekeeping cpus. In a real-life environment, housekeeping
> cpus are far fewer and heavily loaded.
> So the distribute_cpu_mask_prev value left by these tasks affects
> the offset used for tasks being spawned to wake up on isolcpus, and
> thus most of those tasks end up waking up on the first cpu within
> the isolcpus set.
> 
> Steps to reproduce:
> Kernel cmdline parameters:
> isolcpus=2-5 skew_tick=1 nohz=on nohz_full=2-5
> rcu_nocbs=2-5 rcu_nocb_poll idle=poll irqaffinity=0-1
> 
> * pid=$(echo $$)
> * taskset -pc 0 $pid
> * cat loop-normal.c
> int main(void)
> {
>         while (1)
>                 ;
>         return 0;
> }
> * gcc -o loop-normal loop-normal.c
> * for i in {1..50}; do ./loop-normal & done
> * pids=$(ps -a | grep loop-normal | cut -d' ' -f5)
> * for i in $pids; do taskset -pc 2-5 $i ; done
> 
> Expected output:
> * All 50 “loop-normal” tasks should wake up on cpu2-5
> equally distributed.
> * ps -eLo cpuid,pid,tid,ppid,cls,psr,cls,cmd | grep "^    [2345]"
> 
> Actual output:
> * All 50 “loop-normal” tasks got woken up on cpu2 only

Your expectation is wrong. Things work as advertised.

Also, isolcpus is crap.

