Message-ID: <SL2PR06MB308268BDF646BFA0FC9A4590BD0C9@SL2PR06MB3082.apcprd06.prod.outlook.com>
Date: Fri, 11 Mar 2022 09:30:24 +0000
From: 王擎 <wangqing@...o.com>
To: Vincent Guittot <vincent.guittot@...aro.org>
CC: Catalin Marinas <Catalin.Marinas@....com>,
Will Deacon <will@...nel.org>,
Sudeep Holla <sudeep.holla@....com>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
"Rafael J. Wysocki" <rafael@...nel.org>,
Ingo Molnar <mingo@...hat.com>,
Peter Zijlstra <peterz@...radead.org>,
Juri Lelli <juri.lelli@...hat.com>,
Dietmar Eggemann <dietmar.eggemann@....com>,
Steven Rostedt <rostedt@...dmis.org>,
Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>,
Daniel Bristot de Oliveira <bristot@...hat.com>,
"linux-arm-kernel@...ts.infradead.org"
<linux-arm-kernel@...ts.infradead.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: RE: [PATCH] sched: topology: make cache topology separate from cpu
topology
>> >> >On Thu, 10 Mar 2022 at 13:59, Qing Wang <wangqing@...o.com> wrote:
>> >> >>
>> >> >> From: Wang Qing <wangqing@...o.com>
>> >> >>
>> >> >> On some architectures (e.g. ARM64), caches are organized as follows:
>> >> >> cluster: ****** cluster 0 ***** ****** cluster 1 *****
>> >> >> core: 0 1 2 3 4 5 6 7
>> >> (add cache level 1) c0 c1 c2 c3 c4 c5 c6 c7
>> >> >> cache(Leveln): **cache0** **cache1** **cache2** **cache3**
>> >> (add cache level 3) *************share level 3 cache ***************
>> >> >> sd_llc_id(current): 0 0 0 0 4 4 4 4
>> >> >> sd_llc_id(should be): 0 0 2 2 4 4 6 6
>> >> >>
>> >> Here, n is always 2 on ARM64, but other values are also possible.
>> >> core[0,1] form a complex (ARMv9) that shares an L2 cache; core[2,3] likewise.
>> >>
>> >> >> Caches and CPUs have different topologies; this causes cpus_share_cache()
>> >> >> to return the wrong value, which affects CPU load balancing.
>> >> >>
>> >> >What does your current scheduler topology look like?
>> >> >
>> >> >For CPU 0 to 3, do you have the below ?
>> >> >DIE [0 - 3] [4-7]
>> >> >MC [0] [1] [2] [3]
>> >>
>> >> The current scheduler topology is consistent with the CPU topology:
>> >> DIE [0-7]
>> >> MC [0-3] [4-7] (SD_SHARE_PKG_RESOURCES)
>> >> Most Android phones have this topology.
>> >> >
>> >> >But you would like something like below for cpu 0-1 instead ?
>> >> >DIE [0 - 3] [4-7]
>> >> >CLS [0 - 1] [2 - 3]
>> >> >MC [0] [1]
>> >> >
>> >> >with SD_SHARE_PKG_RESOURCES only set to MC level ?
>> >>
>> >> We don't change the current scheduler topology, but the
>> >> cache topology should be separated like below:
>> >
>> >The scheduler topology is not only the CPU topology but a mix of CPU and
>> >cache/memory topology.
>> >
>> >> [0-7] (shared level 3 cache )
>> >> [0-1] [2-3][4-5][6-7] (shared level 2 cache )
>> >
>> >So you don't care about the intermediate cluster level, which is even simpler.
>> >You have to modify the generic arch topology so that cpu_coregroup_mask
>> >returns the correct cpu mask directly.
>> >
>> >You will notice a llc_sibling field that is currently used by acpi but
>> >not DT to return llc cpu mask
>> >
>> cpu_topology[].llc_sibling describes the last-level cache of the whole system,
>> not that of the sched_domain.
>>
>> in the above cache topology, llc_sibling is 0xff([0-7]) , it describes
>
>If llc_sibling were 0xff ([0-7]) on your system, you would have only one level:
>MC[0-7]
Sorry, but I don't follow: why would llc_sibling being 0xff ([0-7]) mean MC[0-7]?
In our system (Android), llc_sibling is indeed 0xff ([0-7]), because all CPUs
share the LLC (L3), but we still have two levels:
DIE [0-7]
MC [0-3][4-7]
This makes sense: [0-3] are little cores and [4-7] are big cores, so we only
up-migrate when a task is misfit. We won't change it.
>
>> the L3 cache sibling, but sd_llc_id describes the maximum shared cache
>> in sd, which should be [0-1] instead of [0-3].
>
>sd_llc_id describes the last sched_domain with SD_SHARE_PKG_RESOURCES.
>If you want llc to be [0-3] make sure that the
>sched_domain_topology_level array returns the correct cpumask with
>this flag
Actually, we want sd_llc to be [0-1] [2-3], but if the MC domain doesn't have the
SD_SHARE_PKG_RESOURCES flag, sd_llc will be [0][1][2][3], which is not correct either.
So we must separate sd_llc from the sd topology; otherwise this requirement cannot be
met under the existing mechanism.
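
The constraint can be illustrated with a small user-space model. This is only
a sketch, not kernel code: every toy_* name and the 8-bit masks are assumptions
made up for illustration. The walk imitates taking the id from the highest
domain that carries SD_SHARE_PKG_RESOURCES, and the comparison imitates
cpus_share_cache().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_NR_CPUS 8
#define TOY_SHARE_PKG_RESOURCES 0x1u /* stand-in for SD_SHARE_PKG_RESOURCES */

/* One topology level: flags plus, per CPU, the mask of the domain it sits in. */
struct toy_level {
	unsigned int flags;
	uint8_t span[TOY_NR_CPUS];
};

/* Lowest CPU number set in mask, or -1 for an empty mask. */
static int toy_first_cpu(uint8_t mask)
{
	for (int cpu = 0; cpu < TOY_NR_CPUS; cpu++)
		if (mask & (1u << cpu))
			return cpu;
	return -1;
}

/* Build a level whose domains are consecutive groups of `width` CPUs. */
static struct toy_level toy_make_level(unsigned int flags, int width)
{
	struct toy_level lv = { .flags = flags };

	for (int cpu = 0; cpu < TOY_NR_CPUS; cpu++) {
		int first = cpu - cpu % width;

		lv.span[cpu] = (uint8_t)(((1u << width) - 1u) << first);
	}
	return lv;
}

/*
 * Walk the levels bottom-up and stop at the first one without the flag.
 * The id is the first CPU of the last flagged level, else the CPU itself.
 */
static int toy_llc_id(const struct toy_level *lv, size_t nr, int cpu)
{
	int id = cpu;

	for (size_t i = 0; i < nr; i++) {
		if (!(lv[i].flags & TOY_SHARE_PKG_RESOURCES))
			break;
		id = toy_first_cpu(lv[i].span[cpu]);
	}
	return id;
}

/* Two CPUs are treated as sharing cache iff their ids match. */
static int toy_cpus_share_cache(const struct toy_level *lv, size_t nr,
				int a, int b)
{
	return toy_llc_id(lv, nr, a) == toy_llc_id(lv, nr, b);
}

/* MC [0-3][4-7] under DIE [0-7], with and without the flag on MC. */
static void toy_demo(void)
{
	struct toy_level flagged[2] = {
		toy_make_level(TOY_SHARE_PKG_RESOURCES, 4), /* MC  */
		toy_make_level(0, 8),                       /* DIE */
	};
	struct toy_level unflagged[2] = {
		toy_make_level(0, 4),                       /* MC  */
		toy_make_level(0, 8),                       /* DIE */
	};

	/* Flag on MC: CPUs 1 and 2 look like cache siblings (id 0). */
	assert(toy_cpus_share_cache(flagged, 2, 1, 2));
	/* Flag off: CPUs 0 and 1 no longer look like siblings. */
	assert(!toy_cpus_share_cache(unflagged, 2, 0, 1));
}
```

In this model, flagging MC gives ids 0 0 0 0 4 4 4 4, and clearing the flag
gives 0 1 2 3 4 5 6 7. The wanted 0 0 2 2 4 4 6 6 only appears if a flagged
level of width 2 exists, which the MC [0-3][4-7] topology does not provide.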
Thanks,
Wang