Message-ID: <82ea7cb508ee4cee6d2fde1158692d0cee97eb42.camel@redhat.com>
Date:   Mon, 28 Aug 2023 21:07:30 -0300
From:   Leonardo Brás <leobras@...hat.com>
To:     Marcelo Tosatti <mtosatti@...hat.com>
Cc:     Thomas Gleixner <tglx@...utronix.de>, linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH 2/4] swap: apply new local_schedule_work_on()
 interface

On Tue, 2023-08-08 at 16:39 -0300, Marcelo Tosatti wrote:
> On Sat, Jul 29, 2023 at 05:37:33AM -0300, Leonardo Bras wrote:
> > Make use of the new local_*lock_n*() and local_schedule_work_on() interface
> > to improve performance & latency on PREEMPT_RT kernels.
> > 
> > For functions that may be scheduled in a different cpu, replace
> > local_*lock*() by local_lock_n*(), and replace schedule_work_on() by
> > local_schedule_work_on(). The same happens for flush_work() and
> > local_flush_work().
> > 
> > This should bring no relevant performance impact on non-RT kernels:
> > For functions that may be scheduled in a different cpu, the local_*lock's
> > this_cpu_ptr() becomes a per_cpu_ptr(smp_processor_id()).
> > 
> > Signed-off-by: Leonardo Bras <leobras@...hat.com>
> > ---
> >  mm/swap.c | 18 +++++++++---------
> >  1 file changed, 9 insertions(+), 9 deletions(-)
> 
> Leo,
> 
> I think the interruptions should rather be removed for both
> CONFIG_PREEMPT_RT AND !CONFIG_PREEMPT_RT.
> 
> The impact of grabbing locks must be properly analyzed and not
> "rejected blindly".

Yes, I agree with the idea, but we have been seeing a lot of pushback on
these ideas lately, as the maintainers perceive them as a big change.

My idea here is to provide a general way to improve CPU isolation in
PREEMPT_RT scenarios, even though there is resistance in the !PREEMPT_RT case.

As I commented, PREEMPT_RT's local_lock hot paths already take spinlocks, and
yet they still use schedule_work_on() to have remote work done. This patch tries
to solve this by using spin_lock(remote_cpu_lock) instead, saving a lot of
cycles while reducing IPIs on the remote CPU.
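
To make the idea concrete, here is a minimal userspace sketch of it (not the
patch itself, and not kernel code). It assumes a pthread mutex as a stand-in
for the per-CPU spinlock that backs a local_lock on PREEMPT_RT. NR_CPUS,
struct pcp_slot, local_add(), drain_remote() and drain_all() are hypothetical
names made up for this illustration. The point is only that the caller drains
another CPU's data by taking that CPU's lock itself, instead of queueing work
to run on that CPU.

```c
#include <pthread.h>

#define NR_CPUS 4	/* hypothetical CPU count for this userspace sketch */

/*
 * Stand-in for a per-CPU structure guarded by a per-CPU lock.
 * Assumption: the mutex models the per-CPU spinlock behind a
 * local_lock on PREEMPT_RT.
 */
struct pcp_slot {
	pthread_mutex_t lock;
	int pending;
};

static struct pcp_slot slots[NR_CPUS];

static void slots_init(void)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		pthread_mutex_init(&slots[cpu].lock, NULL);
		slots[cpu].pending = 0;
	}
}

/* Local hot path: the owning CPU updates its slot under its own lock. */
static void local_add(int cpu, int n)
{
	pthread_mutex_lock(&slots[cpu].lock);
	slots[cpu].pending += n;
	pthread_mutex_unlock(&slots[cpu].lock);
}

/*
 * Remote drain: take the target CPU's lock from the caller's context,
 * rather than scheduling work on the target CPU and waiting for it.
 * Returns how many pending items were drained.
 */
static int drain_remote(int cpu)
{
	int drained;

	pthread_mutex_lock(&slots[cpu].lock);
	drained = slots[cpu].pending;
	slots[cpu].pending = 0;
	pthread_mutex_unlock(&slots[cpu].lock);

	return drained;
}

/* Drain every slot from the calling thread; no other CPU is interrupted. */
static int drain_all(void)
{
	int total = 0;

	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		total += drain_remote(cpu);

	return total;
}
```

In this sketch the cost moves to the caller: the target slot's owner keeps
running and only contends on the lock if it touches its slot at the same
moment.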

It looks like a simple solution that improves isolation and performance on
PREEMPT_RT with no visible drawbacks. I agree the interface is not ideal, and
for that I really need your help.

I understand that once this change is merged, we will have more precedent for
discussing performance in the !PREEMPT_RT case.

What do you think about that?

Thanks!
Leo

> 
> Example:
> 
> commit 01b44456a7aa7c3b24fa9db7d1714b208b8ef3d8
> Author: Mel Gorman <mgorman@...hsingularity.net>
> Date:   Fri Jun 24 13:54:23 2022 +0100
> 
>     mm/page_alloc: replace local_lock with normal spinlock
>     
>     struct per_cpu_pages is no longer strictly local as PCP lists can be
>     drained remotely using a lock for protection.  While the use of local_lock
>     works, it goes against the intent of local_lock which is for "pure CPU
>     local concurrency control mechanisms and not suited for inter-CPU
>     concurrency control" (Documentation/locking/locktypes.rst)
>     
>     local_lock protects against migration between when the percpu pointer is
>     accessed and the pcp->lock acquired.  The lock acquisition is a preemption
>     point so in the worst case, a task could migrate to another NUMA node and
>     accidentally allocate remote memory.  The main requirement is to pin the
>     task to a CPU that is suitable for PREEMPT_RT and !PREEMPT_RT.
> 
