Message-ID: <e055993f-9a1b-4c92-b093-7babed9ba58b@paulmck-laptop>
Date: Mon, 11 Mar 2024 12:19:59 -0700
From: "Paul E. McKenney" <paulmck@...nel.org>
To: Uladzislau Rezki <urezki@...il.com>
Cc: RCU <rcu@...r.kernel.org>, Neeraj Upadhyay <Neeraj.Upadhyay@....com>,
Boqun Feng <boqun.feng@...il.com>, Hillf Danton <hdanton@...a.com>,
Joel Fernandes <joel@...lfernandes.org>,
LKML <linux-kernel@...r.kernel.org>,
Oleksiy Avramchenko <oleksiy.avramchenko@...y.com>,
Frederic Weisbecker <frederic@...nel.org>
Subject: Re: [PATCH v6 0/6] Reduce synchronize_rcu() latency(v6)

On Mon, Mar 11, 2024 at 09:43:51AM +0100, Uladzislau Rezki wrote:
> On Fri, Mar 08, 2024 at 01:51:29PM -0800, Paul E. McKenney wrote:
> > On Fri, Mar 08, 2024 at 06:34:03PM +0100, Uladzislau Rezki (Sony) wrote:
> > > This is v6. It is based on Paul's "dev" branch:
> > >
> > > HEAD: f1bfe538c7970283040a7188a291aca9f18f0c42
> > >
> > > Please note that the patches should be applied from scratch,
> > > i.e. v5 has to be dropped from the "dev" branch.
> > >
> > > v5 -> v6:
> > > - Fix a race due to releasing a wait-head from the gp-kthread;
> > > - Use our own private workqueue with WQ_MEM_RECLAIM to have
> > > at least one execution context.
> > >
> > > v5: https://lore.kernel.org/lkml/20240220183115.74124-1-urezki@gmail.com/
> > > v4: https://lore.kernel.org/lkml/ZZ2bi5iPwXLgjB-f@google.com/T/
> > > v3: https://lore.kernel.org/lkml/cd45b0b5-f86b-43fb-a5f3-47d340cd4f9f@paulmck-laptop/T/
> > > v2: https://lore.kernel.org/all/20231030131254.488186-1-urezki@gmail.com/T/
> > > v1: https://lore.kernel.org/lkml/20231025140915.590390-1-urezki@gmail.com/T/
> >
> > Queued in place of your earlier series, thank you!
> >
> Thank you!
>
> >
> > Not urgent, but which rcutorture scenario should be pressed into service
> > testing this?
> >
> I tested with the setting '5*TREE01 5*TREE02 5*TREE03 5*TREE04'; apart
> from that, I used some private test cases. The
> rcutree.rcu_normal_wake_from_gp=1 boot parameter also has to be passed.
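>
> A sketch of what such a run can look like (illustrative only; the
> exact command line can of course vary):
>
> <snip>
> #! /usr/bin/env bash
>
> # Run the four TREE scenarios five times each, passing the boot
> # parameter that enables the normal-GP wakeup path.
> tools/testing/selftests/rcutorture/bin/kvm.sh --allcpus \
>     --configs "5*TREE01 5*TREE02 5*TREE03 5*TREE04" \
>     --bootargs "rcutree.rcu_normal_wake_from_gp=1" \
>     --trust-make
> <snip>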
>
> Also, "rcuscale" can be used to stress the "cur_ops->sync()" path:
>
> <snip>
> #! /usr/bin/env bash
>
> LOOPS=1
>
> # Each pass runs rcuscale with 200 writer and 220 reader kthreads,
> # with the normal-GP wakeup path enabled via the boot parameter.
> for (( i=0; i<$LOOPS; i++ )); do
>     tools/testing/selftests/rcutorture/bin/kvm.sh --memory 10G --torture rcuscale \
>         --allcpus \
>         --kconfig CONFIG_NR_CPUS=64 \
>         --kconfig CONFIG_RCU_NOCB_CPU=y \
>         --kconfig CONFIG_RCU_NOCB_CPU_DEFAULT_ALL=y \
>         --kconfig CONFIG_RCU_LAZY=n \
>         --bootargs "rcuscale.nwriters=200 rcuscale.nreaders=220 rcuscale.minruntime=50000 \
>             torture.disable_onoff_at_boot rcutree.rcu_normal_wake_from_gp=1" --trust-make
>     echo "Done $i"
> done
> <snip>

Very good, thank you!

Of those five options (TREE01, TREE02, TREE03, TREE04, and rcuscale),
which one should be changed so that my own testing automatically covers
the rcutree.rcu_normal_wake_from_gp=1 case? I would guess that we should
leave out TREE03, since it covers tall rcu_node trees. TREE01 looks
closest to the ChromeOS/Android use case, but you tell me!
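
If TREE01 does end up being the right choice, a minimal sketch of the
change (assuming we simply append the parameter to that scenario's
boot-argument file) might be:

<snip>
# Hypothetical one-liner: have TREE01 pass the parameter by default
# by appending it to the scenario's .boot file.
echo "rcutree.rcu_normal_wake_from_gp=1" >> \
	tools/testing/selftests/rcutorture/configs/rcu/TREE01.boot
<snip>

Assuming TREE01 stays in the default CFLIST, routine rcutorture runs
would then pick the parameter up automatically.
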
And it might be time to rework the test cases to better align with
the use cases. For example, I created TREE10 to cover Meta's fleet.
But ChromeOS and Android have relatively small numbers of CPUs, so it
should be possible to rework things a bit to make one of the existing
tests cover that case, while modifying other tests to cover any
situations that these changes would otherwise exclude.

Thoughts?

							Thanx, Paul