Message-ID: <31fa089d-1f55-4bc7-9323-389fda4cadfa@efficios.com>
Date: Mon, 10 Mar 2025 11:50:23 -0400
From: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
To: Gabriele Monaco <gmonaco@...hat.com>
Cc: Ingo Molnar <mingo@...hat.org>, Shuah Khan <shuah@...nel.org>,
 linux-kernel@...r.kernel.org, Andrew Morton <akpm@...ux-foundation.org>,
 Ingo Molnar <mingo@...hat.com>, Peter Zijlstra <peterz@...radead.org>,
 "Paul E. McKenney" <paulmck@...nel.org>, linux-mm@...ck.org
Subject: Re: [PATCH v11 2/3] sched: Move task_mm_cid_work to mm work_struct

On 2025-03-10 10:46, Gabriele Monaco wrote:
> On Thu, 2025-02-27 at 16:33 +0100, Gabriele Monaco wrote:
>> Currently, the task_mm_cid_work function is called in a task work
>> triggered by a scheduler tick to frequently compact the mm_cids of each
>> process. This can delay the execution of the corresponding thread for
>> the entire duration of the function, negatively affecting the response
>> in case of real time tasks. In practice, we observe task_mm_cid_work
>> increasing the latency of 30-35us on a 128 cores system, this order of
>> magnitude is meaningful under PREEMPT_RT.
>>
>> Run the task_mm_cid_work in a new work_struct connected to the
>> mm_struct rather than in the task context before returning to
>> userspace.
>>
>> This work_struct is initialised with the mm and disabled before freeing
>> it. The queuing of the work happens while returning to userspace in
>> __rseq_handle_notify_resume, maintaining the checks to avoid running
>> more frequently than MM_CID_SCAN_DELAY.
>> To make sure this happens predictably also on long running tasks, we
>> trigger a call to __rseq_handle_notify_resume also from the scheduler
>> tick if the runtime exceeded a 100ms threshold.
>> [...]
>>
>> Fixes: 223baf9d17f2 ("sched: Fix performance regression introduced by mm_cid")
>> Signed-off-by: Gabriele Monaco <gmonaco@...hat.com>
> 
> Is this patch missing anything?
> 
> I refactored a bit to have it build in configurations without RSEQ
> and/or MM_CID (which was failing v10)

Found a small nit. Please fix and resend with my reviewed-by, and
that version will be ready for inclusion.

Thanks!

Mathieu


> 
> Thanks,
> Gabriele
> 


-- 
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com
