Message-ID: <3e4d2594-f678-b77a-4883-0b893daf19f6@bytedance.com>
Date: Mon, 27 Jun 2022 18:13:23 +0800
From: Abel Wu <wuyun.abel@...edance.com>
To: Chen Yu <yu.c.chen@...el.com>
Cc: Peter Zijlstra <peterz@...radead.org>,
Mel Gorman <mgorman@...e.de>,
Vincent Guittot <vincent.guittot@...aro.org>,
Josh Don <joshdon@...gle.com>,
Tim Chen <tim.c.chen@...ux.intel.com>,
K Prateek Nayak <kprateek.nayak@....com>,
"Gautham R . Shenoy" <gautham.shenoy@....com>,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH v4 6/7] sched/fair: skip busy cores in SIS search
On 6/24/22 11:30 AM, Chen Yu wrote:
>> ...
>>>> @@ -9273,8 +9319,40 @@ find_idlest_group(struct sched_domain *sd, struct task_struct *p, int this_cpu)
>>>> static void sd_update_state(struct lb_env *env, struct sd_lb_stats *sds)
>>>> {
>>>> - if (sds->sd_state == sd_has_icpus && !test_idle_cpus(env->dst_cpu))
>>>> - set_idle_cpus(env->dst_cpu, true);
>>>> + struct sched_domain_shared *sd_smt_shared = env->sd->shared;
>>>> + enum sd_state new = sds->sd_state;
>>>> + int this = env->dst_cpu;
>>>> +
>>>> + /*
>>>> + * Parallel updates can hardly improve the accuracy of
>>>> + * the filter, and they add to the burden of cache
>>>> + * traffic.
>>>> + */
>>>> + if (cmpxchg(&sd_smt_shared->updating, 0, 1))
>>>> + return;
>>>> +
>>>> + /*
>>>> + * There is at least one unoccupied cpu available, so
>>>> + * propagate it to the filter to avoid a false negative,
>>>> + * which could otherwise lose track of some idle cpus
>>>> + * and thus degrade throughput.
>>>> + */
>>>> + if (new != sd_is_busy) {
>>>> + if (!test_idle_cpus(this))
>>>> + set_idle_cpus(this, true);
>>>> + } else {
>>>> + /*
>>>> + * Nothing changes so nothing to update or
>>>> + * propagate.
>>>> + */
>>>> + if (sd_smt_shared->state == sd_is_busy)
>>>> + goto out;
>>>> + }
>>>> +
>>>> + sd_update_icpus(this, sds->idle_cpu);
>>> I wonder if we could further enhance it to facilitate the idle CPU scan.
>>> For example, could we propagate the idle CPUs in the SMT domain to its parent
>>> domain in a hierarchical sequence, and finally to the LLC domain? If there is
>>
>> In fact, my first attempt was to cache the unoccupied cpus in the SMT
>> shared domain, but the overhead of the cpumask ops seemed like a major
>> stumbling block.
>>
>>> a cluster domain between the SMT and LLC domains, the cluster-domain idle CPU filter
>>> could benefit from this mechanism.
>>> https://lore.kernel.org/lkml/20220609120622.47724-3-yangyicong@hisilicon.com/
>>
>> Putting SIS into a hierarchical pattern is good for cache locality.
>> But I don't think a multi-level filter is appropriate, since it could
>> bring too much cache traffic to SIS,
> Could you please elaborate a little more on the cache traffic? The idea was not
> to save the unoccupied cpus in the SMT shared domain, but to store them in a
> middle-layer shared domain, say cluster->idle_cpus. This would reduce cache-write
> contention compared to writing to llc->idle_cpus directly, because a smaller set
> of CPUs shares the idle_cpus filter. Similarly, SIS could scan cluster->idle_cpus
> first, without having to query llc->idle_cpus. This looks like splitting a big
> lock into fine-grained smaller locks.
I'm afraid I didn't quite follow... Did you mean replacing the LLC filter
with multiple cluster filters? If so, I agree with you that the contention
would be reduced. But there are other concerns:

  a. Is it appropriate to fake an intermediate sched_domain when the
     cluster level isn't available? And how do we decide the proper
     size of the faked sched_domain?

  b. The SIS path might touch more cachelines (multiple cluster
     filters). I'm not sure how big the impact would be.

Anyway, this seems worth a try. :)
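
Just to make sure we mean the same thing, below is a rough sketch of how
I picture the wakeup scan with per-cluster filters. This is only for
illustration: cluster_idle_mask()/llc_idle_mask() are made-up helpers
here, not existing interfaces.

static int sis_scan_cluster_first(struct task_struct *p, int target)
{
	struct cpumask *cls = cluster_idle_mask(target);	/* hypothetical */
	struct cpumask *llc = llc_idle_mask(target);		/* hypothetical */
	int cpu;

	/*
	 * Cheap pass first: only the CPUs sharing this cluster write
	 * to @cls, so its cacheline bounces among a much smaller set
	 * than an LLC-wide filter would.
	 */
	for_each_cpu_wrap(cpu, cls, target) {
		if (cpumask_test_cpu(cpu, p->cpus_ptr) &&
		    available_idle_cpu(cpu))
			return cpu;
	}

	/*
	 * Fall back to the LLC-wide filter; this is the extra
	 * cacheline touched in concern (b) above.
	 */
	for_each_cpu_wrap(cpu, llc, target) {
		if (cpumask_test_cpu(cpu, p->cpus_ptr) &&
		    available_idle_cpu(cpu))
			return cpu;
	}

	return -1;
}

If that matches what you have in mind, then yes, the write side gets
cheaper, and concern (b) is mainly about the extra read on the wakeup
path.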
>> and it could be expected to be
>> a disaster for netperf/tbench or other workloads that suffer frequent
>> context switches.
>>
> So this overhead comes from the NEWLY_IDLE case?
>
Yes, I think it's the main cause driving the contention to new heights.
But it's also important to keep the filter fresh.
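
If the newly-idle balance really is the dominant writer, one thing that
might be worth measuring is throttling the update rate and accepting a
slightly stale filter. A rough, untested sketch on top of
sd_update_state() in this patch ("next_update" is a made-up field, just
for illustration):

static void sd_update_state(struct lb_env *env, struct sd_lb_stats *sds)
{
	struct sched_domain_shared *sd_smt_shared = env->sd->shared;

	/*
	 * Newly-idle balance can fire on every context switch under
	 * netperf/tbench, so only let it refresh the filter once per
	 * jiffy; periodic balance still updates unconditionally.
	 */
	if (env->idle == CPU_NEWLY_IDLE &&
	    time_before(jiffies, READ_ONCE(sd_smt_shared->next_update)))
		return;

	WRITE_ONCE(sd_smt_shared->next_update, jiffies + 1);

	/* ... the cmpxchg and filter update as in the patch ... */
}

The obvious downside is exactly the staleness mentioned above, so this
is only a tradeoff to evaluate, not something I'm convinced about.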
Thanks & BR,
Abel