Message-ID: <CAGsJ_4yW72mktbWjRfE9ngXoq9oXBXyAd_TPjKBNdGiRSoh9LA@mail.gmail.com>
Date: Fri, 1 Oct 2021 23:32:18 +1300
From: Barry Song <21cnbao@...il.com>
To: Dietmar Eggemann <dietmar.eggemann@....com>,
LKML <linux-kernel@...r.kernel.org>,
Ingo Molnar <mingo@...hat.com>,
Peter Zijlstra <peterz@...radead.org>,
Vincent Guittot <vincent.guittot@...aro.org>
Cc: aubrey.li@...ux.intel.com, Borislav Petkov <bp@...en8.de>,
Daniel Bristot de Oliveira <bristot@...hat.com>,
Ben Segall <bsegall@...gle.com>,
Catalin Marinas <catalin.marinas@....com>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Guodong Xu <guodong.xu@...aro.org>,
"H. Peter Anvin" <hpa@...or.com>,
Jonathan Cameron <jonathan.cameron@...wei.com>,
Juri Lelli <juri.lelli@...hat.com>, lenb@...nel.org,
linux-acpi@...r.kernel.org, linux-arm-kernel@...ts.infradead.org,
Linuxarm <linuxarm@...wei.com>,
Mark Rutland <mark.rutland@....com>,
Mel Gorman <mgorman@...e.de>, msys.mizuma@...il.com,
prime.zeng@...ilicon.com, rjw@...ysocki.net,
Steven Rostedt <rostedt@...dmis.org>,
Barry Song <song.bao.hua@...ilicon.com>,
Sudeep Holla <sudeep.holla@....com>,
Thomas Gleixner <tglx@...utronix.de>, rafael@...nel.org,
Tim Chen <tim.c.chen@...ux.intel.com>,
Valentin Schneider <valentin.schneider@....com>,
will@...nel.org, x86@...nel.org, yangyicong@...wei.com
Subject: Re: [PATCH RESEND 0/3] Represent cluster topology and enable load
balance between clusters
Hi Vincent, Dietmar, Peter, Ingo,
Do you have any comments on this first series, which exposes the cluster
topology of ARM64 kunpeng 920 & x86 Jacobsville and, as the 1st stage,
supports load balancing only?
I would be very grateful for your comments so that things can move forward in
the right direction. I think Tim is also looking forward to bringing up
cluster support on Jacobsville.
Best Regards
Barry
On Fri, Sep 24, 2021 at 8:51 PM Barry Song <21cnbao@...il.com> wrote:
>
> From: Barry Song <song.bao.hua@...ilicon.com>
>
> ARM64 machines like kunpeng920 and x86 machines like Jacobsville have a
> level of hardware topology in which some CPU cores, typically 4 cores,
> share L3 tags or L2 cache.
>
> That means spreading tasks between clusters will bring more memory
> bandwidth and decrease cache contention, while packing tasks within a
> cluster might help decrease the latency of cache synchronization.
>
> We have three series to bring up the cluster-level scheduler in the kernel.
> This is the first series.
>
> 1st series (this one): make the kernel aware of clusters, expose clusters
> in the sysfs ABI, and add SCHED_CLUSTER, which enables load balancing
> between clusters to benefit many workloads.
> Testing shows this can boost performance significantly: for example, it
> improves SPECrate mcf by 25.1% on Jacobsville and mcf by 13.574% on
> kunpeng920.
>
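For readers who want to inspect the topology that the 1st series exposes, here is a minimal Python sketch that groups CPUs by cluster from sysfs. It assumes the per-CPU file is named topology/cluster_cpus_list, as I understand the ABI documentation touched by this series to describe it; please check Documentation/ABI/stable/sysfs-devices-system-cpu on your kernel. The helper names parse_cpu_list and read_clusters are hypothetical and are not part of the patches.

```python
from pathlib import Path


def parse_cpu_list(text):
    """Expand a kernel CPU list string such as "0-3,8" into a sorted list of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def read_clusters(sysfs_root="/sys/devices/system/cpu"):
    """Return the distinct clusters as a sorted list of CPU tuples.

    Assumption: each cpuN/topology directory has a cluster_cpus_list file.
    CPUs without that file (older kernels) are silently skipped.
    """
    clusters = set()
    for topo in Path(sysfs_root).glob("cpu[0-9]*/topology"):
        f = topo / "cluster_cpus_list"
        if f.is_file():
            clusters.add(tuple(parse_cpu_list(f.read_text())))
    return sorted(clusters)


if __name__ == "__main__":
    for cpus in read_clusters():
        print(cpus)
```

On a kernel without cluster support the script prints nothing, since the file is simply absent.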
> 2nd series (wake_affine): modify wake_affine so that the kernel selects
> CPUs within the cluster first, before scanning the whole LLC, so that we can
> benefit from the lower latency of cache coherence within a single cluster.
> This series is much trickier, so we would like to send it after the 1st
> series has built the base of cluster support. Prototype for 2nd series is here:
> https://op-lists.linaro.org/pipermail/linaro-open-discussions/2021-June/000219.html
>
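The "cluster first, then the whole LLC" idea described for the 2nd series can be illustrated with a toy Python model. This is only a sketch of the search order, not the kernel's wake_affine code or the linked prototype; select_idle_cpu, cluster_of and llc_cpus are hypothetical names, and the 4-CPUs-per-cluster layout is an assumption taken from the "typically 4 cores" remark.

```python
def select_idle_cpu(target, idle, cluster_of, llc_cpus):
    """Pick an idle CPU near `target`.

    Search order: idle CPUs in target's cluster, then any idle CPU in
    the LLC domain, and finally fall back to `target` itself.
    """
    # Pass 1: stay inside the cluster to keep cache-coherence latency low.
    for cpu in llc_cpus:
        if cluster_of[cpu] == cluster_of[target] and cpu in idle:
            return cpu
    # Pass 2: widen the scan to the whole LLC.
    for cpu in llc_cpus:
        if cpu in idle:
            return cpu
    # Nothing idle: keep the task on the original target.
    return target


if __name__ == "__main__":
    llc_cpus = list(range(8))                  # one LLC with 8 CPUs (assumed)
    cluster_of = {c: c // 4 for c in llc_cpus}  # 4 CPUs per cluster (assumed)
    print(select_idle_cpu(1, {2, 5}, cluster_of, llc_cpus))
    print(select_idle_cpu(1, {5}, cluster_of, llc_cpus))
```

In the first call an idle CPU exists in the target's own cluster and is preferred; in the second the search widens to the other cluster in the same LLC.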
> 3rd series: a sysctl from Tim Chen that permits users to enable or disable
> the cluster scheduler. Prototype here:
> Add run time sysctl to enable/disable cluster scheduling
> https://op-lists.linaro.org/pipermail/linaro-open-discussions/2021-July/000258.html
>
> This series is resent and rebased on 5.15-rc2.
>
> -V1:
> differences from RFC v6
> * removed the wake_affine path modification, which will be sent separately as the second series
> * cluster_id is obtained by detecting a valid ID before falling back to using the offset
> * lots of benchmark data from both x86 Jacobsville and ARM64 kunpeng920
>
> -RFC v6:
> https://lore.kernel.org/lkml/20210420001844.9116-1-song.bao.hua@hisilicon.com/
>
> Barry Song (1):
> scheduler: Add cluster scheduler level in core and related Kconfig for
> ARM64
>
> Jonathan Cameron (1):
> topology: Represent clusters of CPUs within a die
>
> Tim Chen (1):
> scheduler: Add cluster scheduler level for x86
>
> Documentation/ABI/stable/sysfs-devices-system-cpu | 15 +++++
> Documentation/admin-guide/cputopology.rst | 12 ++--
> arch/arm64/Kconfig | 7 +++
> arch/arm64/kernel/topology.c | 2 +
> arch/x86/Kconfig | 8 +++
> arch/x86/include/asm/smp.h | 7 +++
> arch/x86/include/asm/topology.h | 3 +
> arch/x86/kernel/cpu/cacheinfo.c | 1 +
> arch/x86/kernel/cpu/common.c | 3 +
> arch/x86/kernel/smpboot.c | 44 ++++++++++++++-
> drivers/acpi/pptt.c | 67 +++++++++++++++++++++++
> drivers/base/arch_topology.c | 14 +++++
> drivers/base/topology.c | 10 ++++
> include/linux/acpi.h | 5 ++
> include/linux/arch_topology.h | 5 ++
> include/linux/sched/topology.h | 7 +++
> include/linux/topology.h | 13 +++++
> kernel/sched/topology.c | 5 ++
> 18 files changed, 223 insertions(+), 5 deletions(-)
>
> --
> 1.8.3.1
>