Message-ID: <20181015144217.nu5cp5mxlboyjbre@linutronix.de>
Date: Mon, 15 Oct 2018 16:42:17 +0200
From: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
To: "Paul E. McKenney" <paulmck@...ux.ibm.com>
Cc: Tejun Heo <tj@...nel.org>, linux-kernel@...r.kernel.org,
Boqun Feng <boqun.feng@...il.com>,
Peter Zijlstra <peterz@...radead.org>,
"Aneesh Kumar K.V" <aneesh.kumar@...ux.vnet.ibm.com>,
tglx@...utronix.de, Steven Rostedt <rostedt@...dmis.org>,
Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
Lai Jiangshan <jiangshanlai@...il.com>
Subject: Re: [PATCH] rcu: Use cpus_read_lock() while looking at cpu_online_mask
On 2018-10-13 06:48:13 [-0700], Paul E. McKenney wrote:
>
> My concern would be that it would queue it by default for the current
> CPU, which would serialize the processing, losing the concurrency of
> grace-period initialization. But that was a long time ago, and perhaps
> workqueues have changed.
but the code here always uses the first CPU of a NUMA node, or did I
miss something?
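
Just so we mean the same thing, here is a minimal sketch of the pattern I
am talking about (not the actual RCU code; queue_node_work, my_wq and
my_work are placeholder names): hold cpus_read_lock() while looking at
cpu_online_mask and queue the work on an online CPU of the given NUMA node
via queue_work_on():

#include <linux/cpu.h>
#include <linux/cpumask.h>
#include <linux/topology.h>
#include <linux/workqueue.h>

static void queue_node_work(struct workqueue_struct *my_wq,
			    struct work_struct *my_work, int node)
{
	unsigned int cpu;

	cpus_read_lock();	/* keep cpu_online_mask stable while picking a CPU */
	cpu = cpumask_any_and(cpu_online_mask, cpumask_of_node(node));
	if (cpu < nr_cpu_ids)
		/* queue on an online CPU of this node, not on the current CPU */
		queue_work_on(cpu, my_wq, my_work);
	else
		/* no online CPU on this node, let the workqueue pick one */
		queue_work(my_wq, my_work);
	cpus_read_unlock();
}

Since each node's work is queued on a CPU of that node rather than on the
submitting CPU, the per-node processing can still run concurrently.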
> So, have you tried using rcuperf to test the
> update performance on a large system?
Something like this:
| tools/testing/selftests/rcutorture/bin/kvm.sh --torture rcuperf --configs TREE
| ----Start batch 1: Mon Oct 15 12:46:13 CEST 2018
| TREE 142: Starting build. …
| …
| Average grace-period duration: 19952.7 microseconds
| Minimum grace-period duration: 9004.51
| 50th percentile grace-period duration: 19998.3
| 90th percentile grace-period duration: 24994.4
| 99th percentile grace-period duration: 30002.1
| Maximum grace-period duration: 42998.2
| Grace periods: 6560 Batches: 209 Ratio: 31.3876
| Computed from rcuperf printk output.
| Completed in 27 vs. 1800
|
| Average grace-period duration: 18700 microseconds
| Minimum grace-period duration: 7069.2
| 50th percentile grace-period duration: 18987.5
| 90th percentile grace-period duration: 22997
| 99th percentile grace-period duration: 28944.7
| Maximum grace-period duration: 36994.5
| Grace periods: 6551 Batches: 209 Ratio: 31.3445
| Computed from rcuperf printk output.
| Completed in 27 vs. 1800
That was two runs with the patch applied; here are two runs without the
patch:
| Average grace-period duration: 19423.3 microseconds
| Minimum grace-period duration: 8006.93
| 50th percentile grace-period duration: 19002.8
| 90th percentile grace-period duration: 23997.5
| 99th percentile grace-period duration: 29995.7
| Maximum grace-period duration: 37997.9
| Grace periods: 6526 Batches: 208 Ratio: 31.375
| Computed from rcuperf printk output.
| Completed in 27 vs. 1800
|
| Average grace-period duration: 18822.4 microseconds
| Minimum grace-period duration: 8348.15
| 50th percentile grace-period duration: 18996.9
| 90th percentile grace-period duration: 23000
| 99th percentile grace-period duration: 27999.5
| Maximum grace-period duration: 39001.9
| Grace periods: 6540 Batches: 209 Ratio: 31.2919
| Computed from rcuperf printk output.
| Completed in 27 vs. 1800
I think the difference might come from cpufreq on the host. But in general,
is this what you have been asking for, or did you want to see something run
on the host (or an additional argument to the script)?
> If this change does not impact performance on an rcuperf test, why not
> send me a formal patch with Signed-off-by and commit log (including
> performance testing results)? I will then apply it, it will be exposed
> to 0day and eventually -next testing, and if no problems arise, it will go
> to mainline, perhaps as soon as the merge window after the upcoming one.
>
> Fair enough?
sure.
> Thanx, Paul
>
Sebastian