Message-ID: <ZPVATM5xfmiFKWtk@chenyu5-mobl2>
Date:   Mon, 4 Sep 2023 10:26:20 +0800
From:   Chen Yu <yu.c.chen@...el.com>
To:     Yang Yingliang <yangyingliang@...wei.com>
CC:     <linux-kernel@...r.kernel.org>, <mingo@...hat.com>,
        <peterz@...radead.org>, <juri.lelli@...hat.com>,
        <vincent.guittot@...aro.org>, <dietmar.eggemann@....com>,
        <rostedt@...dmis.org>, <bsegall@...gle.com>, <mgorman@...e.de>,
        <bristot@...hat.com>, <vschneid@...hat.com>, <tglx@...utronix.de>
Subject: Re: [PATCH] sched/smt: fix unbalance sched_smt_present dec/inc

Hi Yingliang,

On 2023-09-02 at 15:46:09 +0800, Yang Yingliang wrote:
> I got the following warn report while doing stress test:
> 

May I know whether the test runs many deadline tasks while offlining CPUs,
so as to trigger the failing case where removing one CPU would take us below
the total allocated bandwidth?

> jump label: negative count!
> WARNING: CPU: 3 PID: 38 at kernel/jump_label.c:263 static_key_slow_try_dec+0x9d/0xb0
> Call Trace:
>  <TASK>
>  __static_key_slow_dec_cpuslocked+0x16/0x70
>  sched_cpu_deactivate+0x26e/0x2a0
>  cpuhp_invoke_callback+0x3ad/0x10d0
>  cpuhp_thread_fun+0x3f5/0x680
>  smpboot_thread_fn+0x56d/0x8d0
>  kthread+0x309/0x400
>  ret_from_fork+0x41/0x70
>  ret_from_fork_asm+0x1b/0x30
>  </TASK>
> 
> Becaus when cpuset_cpu_inactive() fails in sched_cpu_deactivate(),

s/Becaus/Because/

> the cpu offline failed, but sched_smt_present is decreased before
> calling sched_cpu_deactivate, it leads unbalance dec/inc, so fix

s/calling sched_cpu_deactivate/calling cpuset_cpu_inactive()/ ?

> it by increasing sched_smt_present in the error path.
> 
> Fixes: c5511d03ec09 ("sched/smt: Make sched_smt_present track topology")
> Signed-off-by: Yang Yingliang <yangyingliang@...wei.com>
> ---
>  kernel/sched/core.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 2299a5cfbfb9..b7ef2df36b75 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -9745,6 +9745,10 @@ int sched_cpu_deactivate(unsigned int cpu)
>  	sched_update_numa(cpu, false);
>  	ret = cpuset_cpu_inactive(cpu);
>  	if (ret) {
> +#ifdef CONFIG_SCHED_SMT
> +		if (cpumask_weight(cpu_smt_mask(cpu)) == 2)
> +			static_branch_inc_cpuslocked(&sched_smt_present);
> +#endif

While checking the code, it seems that core scheduling also misses this
error path; maybe we should also invoke sched_core_cpu_starting() to
restore the context. I'll check.
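
For reference, the dec/inc imbalance that the patch addresses can be
illustrated with a small user-space model. This is only a sketch under
simplifying assumptions: smt_present, model_activate(), model_deactivate()
and run_scenario() are hypothetical names, the counter is a plain int rather
than a static key, and the cpuset_cpu_inactive() failure is simulated with a
flag. It is not the kernel code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the count behind sched_smt_present. */
static int smt_present;

/* Models the activate path: count up when the core has two siblings. */
static void model_activate(int smt_weight)
{
	if (smt_weight == 2)
		smt_present++;
}

/*
 * Models the deactivate path: the count is dropped first, then the
 * (simulated) cpuset step may fail. With 'rollback' set, the error
 * path restores the count, as the patch proposes.
 */
static int model_deactivate(int smt_weight, bool cpuset_fails, bool rollback)
{
	if (smt_weight == 2)
		smt_present--;

	if (cpuset_fails) {
		if (rollback && smt_weight == 2)
			smt_present++;
		return -1;	/* offline failed, CPU stays active */
	}
	return 0;
}

/* One activate, one failed offline, then one successful offline. */
static int run_scenario(bool rollback)
{
	smt_present = 0;
	model_activate(2);
	model_deactivate(2, true, rollback);	/* fails, CPU still active */
	model_deactivate(2, false, rollback);	/* succeeds later */
	return smt_present;
}
```

In this model, run_scenario(false) ends below zero, which corresponds to
the situation where the kernel prints the "negative count" warning instead
of letting the count go negative, while run_scenario(true) ends at zero.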

Other than the typos above, it looks good to me,

Reviewed-by: Chen Yu <yu.c.chen@...el.com>

thanks,
Chenyu
