Date:   Wed, 9 Mar 2022 10:55:54 +0530
From:   Srikar Dronamraju <srikar@...ux.vnet.ibm.com>
To:     K Prateek Nayak <kprateek.nayak@....com>
Cc:     peterz@...radead.org, aubrey.li@...ux.intel.com, efault@....de,
        gautham.shenoy@....com, linux-kernel@...r.kernel.org,
        mgorman@...hsingularity.net, mingo@...nel.org,
        song.bao.hua@...ilicon.com, valentin.schneider@....com,
        vincent.guittot@...aro.org
Subject: Re: [PATCH v6] sched/fair: Consider cpu affinity when allowing NUMA
 imbalance in find_idlest_group

* K Prateek Nayak <kprateek.nayak@....com> [2022-03-08 17:18:16]:

> Hello Srikar,
> 
> On 3/8/2022 2:59 PM, Srikar Dronamraju wrote:
> > [..snip..]


> >> @@ -9200,10 +9201,19 @@ find_idlest_group(struct sched_domain *sd, struct task_struct *p, int this_cpu)
> >>  			 * Otherwise, keep the task close to the wakeup source
> >>  			 * and improve locality if the number of running tasks
> >>  			 * would remain below threshold where an imbalance is
> >> -			 * allowed. If there is a real need of migration,
> >> -			 * periodic load balance will take care of it.
> >> +			 * allowed while accounting for the possibility the
> >> +			 * task is pinned to a subset of CPUs. If there is a
> >> +			 * real need of migration, periodic load balance will
> >> +			 * take care of it.
> >>  			 */
> >> -			if (allow_numa_imbalance(local_sgs.sum_nr_running + 1, sd->imb_numa_nr))
> >> +			imb = sd->imb_numa_nr;
> >> +			if (p->nr_cpus_allowed != num_online_cpus()) {

> > Again, repeating, is the problem only happening in the pinned case?
> Yes. We've tested stream with 8 and 16 stream threads on a Zen3 system
> with 16 LLCs and in both cases, with unbound runs, we've seen each
> Stream thread get a separate LLC and we didn't observe any stacking.

If the problem only happens in the pinned case, then it means that in the
unpinned case the load balancer is able to do the load balancing correctly
and quickly, but for some reason may not be able to do the same in the
pinned case. Without the patch, even in the unpinned case, the initial CPU
range spans more or less the same number of LLCs as the pinned case.
However, it is able to spread better.

I believe the problem could be in can_migrate_task() checking for
!cpumask_test_cpu(env->dst_cpu, p->cpus_ptr)

i.e. dst_cpu is doing a load balance on behalf of the entire LLC, but it
will only pull tasks that are allowed to run on it.
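
For reference, the affinity path I mean looks roughly like this in
can_migrate_task() in kernel/sched/fair.c (paraphrased and simplified from
v5.17-era mainline, with the NEWLY_IDLE/active-balance guards dropped; not a
verbatim quote):

	if (!cpumask_test_cpu(env->dst_cpu, p->cpus_ptr)) {
		int cpu;

		/* dst_cpu itself cannot take this task. */
		env->flags |= LBF_SOME_PINNED;

		/*
		 * Remember another CPU in the destination group that the
		 * task is allowed on, so a later balancing pass may retry
		 * the pull from there instead.
		 */
		for_each_cpu_and(cpu, env->dst_grpmask, env->cpus) {
			if (cpumask_test_cpu(cpu, p->cpus_ptr)) {
				env->flags |= LBF_DST_PINNED;
				env->new_dst_cpu = cpu;
				break;
			}
		}

		return 0;	/* do not migrate to dst_cpu */
	}

So when tasks are pinned, the CPU actually running the balance can refuse
every candidate task and only leave a hint behind, which would explain why
spreading is slower than in the unbound runs.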

> --
> Thanks and Regards,
> Prateek

-- 
Thanks and Regards
Srikar Dronamraju
