Message-ID: <21a18b1c-b5ae-410c-8d1f-3b63358b0e61@efficios.com>
Date: Mon, 2 Dec 2024 10:01:57 -0500
From: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
To: Gabriele Monaco <gmonaco@...hat.com>, Ingo Molnar <mingo@...hat.com>,
 Peter Zijlstra <peterz@...radead.org>, Juri Lelli <juri.lelli@...hat.com>,
 Vincent Guittot <vincent.guittot@...aro.org>,
 Dietmar Eggemann <dietmar.eggemann@....com>,
 Steven Rostedt <rostedt@...dmis.org>, Ben Segall <bsegall@...gle.com>,
 Mel Gorman <mgorman@...e.de>, Valentin Schneider <vschneid@...hat.com>,
 linux-kernel@...r.kernel.org
Subject: Re: [PATCH 1/2] sched: Optimise task_mm_cid_work duration

On 2024-12-02 09:56, Gabriele Monaco wrote:
> Hi Mathieu,
> 
> thanks for the quick reply.
> 
>> Thanks for looking into this. I understand that you are after
>> minimizing the
>> latency introduced by task_mm_cid_work on isolated cores. I think
>> we'll need
>> to think a bit harder, because the proposed solution does not work:
>>
>>    * for_each_cpu_from - iterate over CPUs present in @mask, from @cpu
>> to the end of @mask.
>>
>> cpu is uninitialized. So this is completely broken.
> 
> My bad, wrong macro. It should be for_each_cpu.
> 
>> Was this tested
>> against a workload that actually uses concurrency IDs to ensure it
>> does
>> not break the whole thing ? Did you run the rseq selftests ?
>>
> 
> I did run the stress-ng --rseq command for a while and didn't see any
> error reported, but it's probably not bulletproof. I'll use the
> selftests for the next iterations.
> 
>> Also, the mm_cidmask is a mask of concurrency IDs, not a mask of
>> CPUs. So
>> using it to iterate on CPUs is wrong.
>>
> 
> Mmh, I get it. During my tests I was definitely getting better results
> than with the mm_cpus_allowed mask, but I guess the test was broken, so
> it just doesn't count.
> Do you think using mm_cpus_allowed would make more sense, at the
> /risk/ of being a bit over-cautious?

mm_cpus_allowed can be updated dynamically by setting the CPU affinity
and by changing cpusets. If we change the iteration from all possible
CPUs to the allowed CPUs, then we also need to adapt the allowed-CPUs
updates to perform the associated mm_cid updates. This adds complexity.

I understand that you wish to offload this task_work to a non-isolated
(non-RT) CPU. If you do so, do you really care about the duration of
task_mm_cid_work enough to justify the added complexity in the
cpu affinity/cpusets updates?

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com

