Message-ID: <xhsmh8rgyre8i.mognet@vschneid.remote.csb>
Date: Wed, 15 Feb 2023 18:10:53 +0000
From: Valentin Schneider <vschneid@...hat.com>
To: Sun Shouxin <sunshouxin@...natelecom.cn>, mingo@...hat.com,
peterz@...radead.org, juri.lelli@...hat.com,
vincent.guittot@...aro.org, dietmar.eggemann@....com,
rostedt@...dmis.org, bsegall@...gle.com, mgorman@...e.de,
bristot@...hat.com
Cc: linux-kernel@...r.kernel.org, huyd12@...natelecom.cn,
sunshouxin@...natelecom.cn
Subject: Re: [RESEND PATCH] sched: sd_llc_id initialized
On 14/02/23 17:54, Sun Shouxin wrote:
> In my test, I use isolcpus to isolate specific CPUs,
> and then I noticed two different outcomes when binding cores.
>
> For example, the NUMA topology is as follows,
> NUMA node0 CPU(s): 0-15,32-47
> NUMA node1 CPU(s): 16-31,48-63
>
> and the 'isolcpus' is as follows,
> isolcpus=14,15,30,31,46,47,62,63
>
> A task initially running on a non-isolated core belonging to NUMA node0
> was bound to an isolated core on NUMA node1, and then its cpu affinity
> was changed back to all cores; I noticed the task can be scheduled back
> to a non-isolated core on NUMA node0.
>
> 1.taskset -pc 0-13 3512 (task running on core 1)
> 2.taskset -pc 63 3512 (task running on isolated core 63)
> 3.taskset -pc 0-63 3512 (task running on core 1)
>
This is working as intended, no?
> In another case, a task initially running on a non-isolated core
> belonging to NUMA node1 was bound to an isolated core on NUMA node1,
> and then its cpu affinity was changed back to all cores;
> the task is not scheduled away and keeps running on the isolated core.
>
> 1.taskset -pc 16-29 3512 (task running on core 17)
> 2.taskset -pc 63 3512 (task running on isolated core 63)
> 3.taskset -pc 0-63 3512 (task still running on core 63
> and not scheduled out)
>
And this is also not wrong, since CPU63 is in the task's affinity mask.
That said, I can see that in this case we'd want the task to use other CPUs
if it makes sense wrt load balance.
However, since CPU63 is attached to a NULL sched_domain, AFAIA your
solution is at the mercy of the @prev and @target CPUs passed to
select_idle_sibling(). So this might only work if the waker is on a
non-isolated CPU.
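FWIW, one way to see that the isolated CPUs really do end up with a NULL
sched_domain is the sched debug interface (a sketch assuming
CONFIG_SCHED_DEBUG and a mounted debugfs; paths and output below are from
recent kernels and purely illustrative):

  # Non-isolated CPU: the usual domain hierarchy is present
  $ ls /sys/kernel/debug/sched/domains/cpu0/
  domain0  domain1  domain2

  # Isolated CPU (isolcpus=...,63): no domains at all
  $ ls /sys/kernel/debug/sched/domains/cpu63/
  $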
I don't think your patch is wrong, but I don't think it entirely fixes the
issue either. Unfortunately, due to isolated CPUs being attached to NULL
sched_domains, there isn't a magic solution as the majority of scheduler
decisions are based on these.
A safe bet would be to exclude isolated CPUs from the affinity of your
non-critical tasks. Things like TuneD [1] and/or cpusets could help.
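For the cpuset route, a rough sketch with cgroup v2 (assuming the cpuset
controller is available, the v2 hierarchy is mounted at /sys/fs/cgroup,
and using a made-up "housekeeping" group name; <pid> is a placeholder):

  # enable the cpuset controller for child groups
  echo +cpuset > /sys/fs/cgroup/cgroup.subtree_control

  # confine non-critical tasks to the non-isolated CPUs
  mkdir /sys/fs/cgroup/housekeeping
  echo 0-13,16-29,32-45,48-61 > /sys/fs/cgroup/housekeeping/cpuset.cpus
  echo <pid> > /sys/fs/cgroup/housekeeping/cgroup.procs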
[1]: https://github.com/redhat-performance/tuned