Message-ID: <e055993f-9a1b-4c92-b093-7babed9ba58b@paulmck-laptop>
Date: Mon, 11 Mar 2024 12:19:59 -0700
From: "Paul E. McKenney" <paulmck@...nel.org>
To: Uladzislau Rezki <urezki@...il.com>
Cc: RCU <rcu@...r.kernel.org>, Neeraj Upadhyay <Neeraj.Upadhyay@....com>,
Boqun Feng <boqun.feng@...il.com>, Hillf Danton <hdanton@...a.com>,
Joel Fernandes <joel@...lfernandes.org>,
LKML <linux-kernel@...r.kernel.org>,
Oleksiy Avramchenko <oleksiy.avramchenko@...y.com>,
Frederic Weisbecker <frederic@...nel.org>
Subject: Re: [PATCH v6 0/6] Reduce synchronize_rcu() latency(v6)

On Mon, Mar 11, 2024 at 09:43:51AM +0100, Uladzislau Rezki wrote:
> On Fri, Mar 08, 2024 at 01:51:29PM -0800, Paul E. McKenney wrote:
> > On Fri, Mar 08, 2024 at 06:34:03PM +0100, Uladzislau Rezki (Sony) wrote:
> > > This is v6. It is based on Paul's "dev" branch:
> > >
> > > HEAD: f1bfe538c7970283040a7188a291aca9f18f0c42
> > >
> > > Please note that the patches should be applied from scratch,
> > > i.e. v5 has to be dropped from the "dev" branch.
> > >
> > > v5 -> v6:
> > > - Fix a race due to releasing a wait-head from the gp-kthread;
> > > - Use our own private workqueue with WQ_MEM_RECLAIM to have
> > > at least one execution context.
> > >
> > > v5: https://lore.kernel.org/lkml/20240220183115.74124-1-urezki@gmail.com/
> > > v4: https://lore.kernel.org/lkml/ZZ2bi5iPwXLgjB-f@google.com/T/
> > > v3: https://lore.kernel.org/lkml/cd45b0b5-f86b-43fb-a5f3-47d340cd4f9f@paulmck-laptop/T/
> > > v2: https://lore.kernel.org/all/20231030131254.488186-1-urezki@gmail.com/T/
> > > v1: https://lore.kernel.org/lkml/20231025140915.590390-1-urezki@gmail.com/T/
> >
> > Queued in place of your earlier series, thank you!
> >
> Thank you!
>
> >
> > Not urgent, but which rcutorture scenario should be pressed into service
> > testing this?
> >
> I tested with the setting '5*TREE01 5*TREE02 5*TREE03 5*TREE04'; apart
> from that, I used some private test cases. The
> rcutree.rcu_normal_wake_from_gp=1 boot parameter also has to be passed.
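>
> A sketch of what such a run can look like (illustrative only; the
> exact command line can of course vary):
>
> <snip>
> #! /usr/bin/env bash
>
> # Run the four TREE scenarios five times each, passing the boot
> # parameter that enables the normal-GP wakeup path.
> tools/testing/selftests/rcutorture/bin/kvm.sh --allcpus \
>     --configs "5*TREE01 5*TREE02 5*TREE03 5*TREE04" \
>     --bootargs "rcutree.rcu_normal_wake_from_gp=1" \
>     --trust-make
> <snip>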
>
> Also, "rcuscale" can be used to stress the "cur_ops->sync()" path:
>
> <snip>
> #! /usr/bin/env bash
>
> LOOPS=1
>
> # Each pass runs rcuscale with 200 writer and 220 reader kthreads,
> # with the normal-GP wakeup path enabled via the boot parameter.
> for (( i=0; i<$LOOPS; i++ )); do
>     tools/testing/selftests/rcutorture/bin/kvm.sh --memory 10G --torture rcuscale \
>         --allcpus \
>         --kconfig CONFIG_NR_CPUS=64 \
>         --kconfig CONFIG_RCU_NOCB_CPU=y \
>         --kconfig CONFIG_RCU_NOCB_CPU_DEFAULT_ALL=y \
>         --kconfig CONFIG_RCU_LAZY=n \
>         --bootargs "rcuscale.nwriters=200 rcuscale.nreaders=220 rcuscale.minruntime=50000 \
>             torture.disable_onoff_at_boot rcutree.rcu_normal_wake_from_gp=1" --trust-make
>     echo "Done $i"
> done
> <snip>

Very good, thank you!

Of those five options (TREE01, TREE02, TREE03, TREE04, and rcuscale),
which one should be changed so that my own testing automatically covers
the rcutree.rcu_normal_wake_from_gp=1 case? I would guess that we should
leave out TREE03, since it covers tall rcu_node trees. TREE01 looks
closest to the ChromeOS/Android use case, but you tell me!
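
If TREE01 does end up being the right choice, a minimal sketch of the
change (assuming we simply append the parameter to that scenario's
boot-argument file) might be:

<snip>
# Hypothetical one-liner: have TREE01 pass the parameter by default
# by appending it to the scenario's .boot file.
echo "rcutree.rcu_normal_wake_from_gp=1" >> \
	tools/testing/selftests/rcutorture/configs/rcu/TREE01.boot
<snip>

Assuming TREE01 stays in the default CFLIST, routine rcutorture runs
would then pick the parameter up automatically.
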
And it might be time to rework the test cases to better align with
the use cases. For example, I created TREE10 to cover Meta's fleet.
But ChromeOS and Android have relatively small numbers of CPUs, so it
should be possible to rework things a bit to make one of the existing
tests cover that case, while modifying other tests to cover any
situations that these changes would otherwise exclude.

Thoughts?

							Thanx, Paul