Message-ID: <20211115104601.lwcwxoc67ftbf6uh@shindev>
Date: Mon, 15 Nov 2021 10:46:02 +0000
From: Shinichiro Kawasaki <shinichiro.kawasaki@....com>
To: Vincent Donnefort <vincent.donnefort@....com>
CC: "peterz@...radead.org" <peterz@...radead.org>,
"mingo@...hat.com" <mingo@...hat.com>,
"vincent.guittot@...aro.org" <vincent.guittot@...aro.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"dietmar.eggemann@....com" <dietmar.eggemann@....com>,
"valentin.schneider@....com" <valentin.schneider@....com>,
"jing-ting.wu@...iatek.com" <jing-ting.wu@...iatek.com>,
Damien Le Moal <Damien.LeMoal@....com>
Subject: Re: [PATCH] sched/core: Mitigate race
cpus_share_cache()/update_top_cache_domain()
On Nov 04, 2021 / 17:51, Vincent Donnefort wrote:
> Nothing protects the access to the per_cpu variable sd_llc_id. When testing
> the same CPU (i.e. this_cpu == that_cpu), a race condition exists with
> update_top_cache_domain(). One scenario being:
>
>               CPU1                                   CPU2
>   ==================================================================
>
>   per_cpu(sd_llc_id, CPUX) => 0
>                                     partition_sched_domains_locked()
>                                       detach_destroy_domains()
>   cpus_share_cache(CPUX, CPUX)        update_top_cache_domain(CPUX)
>     per_cpu(sd_llc_id, CPUX) => 0
>                                         per_cpu(sd_llc_id, CPUX) = CPUX
>     per_cpu(sd_llc_id, CPUX) => CPUX
>     return false
>
> ttwu_queue_cond() wouldn't catch smp_processor_id() == cpu and the result
> is a warning triggered from ttwu_queue_wakelist().
>
> Avoid such a race in cpus_share_cache() by always returning true when
> this_cpu == that_cpu.
>
> Fixes: 518cd6234178 ("sched: Only queue remote wakeups when crossing cache boundaries")
> Reported-by: Jing-Ting Wu <jing-ting.wu@...iatek.com>
> Signed-off-by: Vincent Donnefort <vincent.donnefort@....com>
> Reviewed-by: Valentin Schneider <valentin.schneider@....com>
>
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index f2611b9cf503..f5ca15cdcff4 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -3726,6 +3726,9 @@ void wake_up_if_idle(int cpu)
>
>  bool cpus_share_cache(int this_cpu, int that_cpu)
>  {
> +	if (this_cpu == that_cpu)
> +		return true;
> +
>  	return per_cpu(sd_llc_id, this_cpu) == per_cpu(sd_llc_id, that_cpu);
>  }
>
> --
> 2.25.1
>
Oh, this is exactly the same fix as I posted before [1]. It is a little bit sad
that my post did not get reviewed. Anyway, good to see the issue fixed. Thanks.
[1] https://lore.kernel.org/all/20211029005618.773579-1-shinichiro.kawasaki@wdc.com/
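
For illustration only, the interleaving described in the quoted commit message can be replayed deterministically in user space. This is a minimal sketch, not kernel code: the plain sd_llc_id array stands in for the per-CPU variable, and NR_CPUS, the between_reads hook, rewrite_llc_id_of_cpu2() and the _racy/_fixed function names are assumptions made up for this example. The hook simulates update_top_cache_domain() landing between the two reads.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4	/* hypothetical CPU count for this sketch */

/* Stand-in for the per-CPU sd_llc_id variable (not the kernel's per_cpu()). */
static int sd_llc_id[NR_CPUS];

/*
 * Hypothetical hook, called between the two reads, to simulate a
 * concurrent update_top_cache_domain() on another CPU.
 */
static void (*between_reads)(void);

/* Pre-patch behaviour: two unprotected reads, then a comparison. */
static bool cpus_share_cache_racy(int this_cpu, int that_cpu)
{
	int a = sd_llc_id[this_cpu];

	if (between_reads)
		between_reads();

	int b = sd_llc_id[that_cpu];

	return a == b;
}

/* Post-patch behaviour: a CPU always shares cache with itself. */
static bool cpus_share_cache_fixed(int this_cpu, int that_cpu)
{
	if (this_cpu == that_cpu)
		return true;

	return cpus_share_cache_racy(this_cpu, that_cpu);
}

/* Simulated writer: sets the LLC id of CPU 2 to the CPU number itself. */
static void rewrite_llc_id_of_cpu2(void)
{
	sd_llc_id[2] = 2;
}
```

With sd_llc_id[2] starting at 0 and between_reads set to rewrite_llc_id_of_cpu2, cpus_share_cache_racy(2, 2) compares 0 against 2 and returns false, while cpus_share_cache_fixed(2, 2) returns true without reading the array at all.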
--
Best Regards,
Shin'ichiro Kawasaki