Message-ID: <a5d04748-b34b-3b92-fb1d-bf85c2019cc3@linux.alibaba.com>
Date:   Wed, 1 Jun 2022 13:54:33 +0800
From:   Tianchen Ding <dtcccc@...ux.alibaba.com>
To:     Valentin Schneider <vschneid@...hat.com>,
        Mel Gorman <mgorman@...e.de>
Cc:     Ingo Molnar <mingo@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Juri Lelli <juri.lelli@...hat.com>,
        Vincent Guittot <vincent.guittot@...aro.org>,
        Dietmar Eggemann <dietmar.eggemann@....com>,
        Steven Rostedt <rostedt@...dmis.org>,
        Ben Segall <bsegall@...gle.com>,
        Daniel Bristot de Oliveira <bristot@...hat.com>,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2] sched: Queue task on wakelist in the same llc if the
 wakee cpu is idle

On 2022/5/31 23:56, Valentin Schneider wrote:

> Thanks!
> 
> So I'm thinking we could first make that into
> 
> 	if ((wake_flags & WF_ON_CPU) && !cpu_rq(cpu)->nr_running)
> 
> Then building on this, we can generalize using the wakelist to any remote
> idle CPU (which on paper isn't as much as a clear win as just WF_ON_CPU,
> depending on how deeply idle the CPU is...)
> 
> We need the cpu != this_cpu check, as that's currently served by the
> WF_ON_CPU check (AFAIU we can only observe p->on_cpu in there for remote
> tasks).
> 
> ---
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 66c4e5922fe1..60038743f2f1 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -3830,13 +3830,20 @@ static inline bool ttwu_queue_cond(int cpu, int wake_flags)
>   	if (!cpus_share_cache(smp_processor_id(), cpu))
>   		return true;
>   
> +	if (cpu == smp_processor_id())
> +		return false;
> +
>   	/*
>   	 * If the task is descheduling and the only running task on the
>   	 * CPU then use the wakelist to offload the task activation to
>   	 * the soon-to-be-idle CPU as the current CPU is likely busy.
>   	 * nr_running is checked to avoid unnecessary task stacking.
> +	 *
> +	 * Note that we can only get here with (wakee) p->on_rq=0,
> +	 * p->on_cpu can be whatever, we've done the dequeue, so
> +	 * the wakee has been accounted out of ->nr_running
>   	 */
> -	if ((wake_flags & WF_ON_CPU) && cpu_rq(cpu)->nr_running <= 1)
> +	if (!cpu_rq(cpu)->nr_running)
>   		return true;
>   
>   	return false;

Hi Valentin. I ran a simple UnixBench test (Pipe-based Context
Switching) on my x86 machine using all threads (104).

              old      patch1                      patch1+patch2
score       7825.4     7500-8000 (mostly ~7500)    9061.6

patch1: use !cpu_rq(cpu)->nr_running instead of cpu_rq(cpu)->nr_running <= 1
patch2: drop the WF_ON_CPU check
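
To make the three measured configurations easier to compare, here is a
minimal user-space sketch of the part of ttwu_queue_cond() under
discussion. It is a simplified model, not the kernel code: struct wake_ctx,
its fields, and the queue_cond_*() names are hypothetical, the WF_ON_CPU
value is illustrative only, and it assumes "patch1+patch2" corresponds to
the diff quoted above (including the cpu == smp_processor_id() check).

```c
#include <stdbool.h>

/* Illustrative flag value for this sketch only; not copied from kernel headers. */
#define WF_ON_CPU 0x40

/*
 * Hypothetical snapshot of the state the real checks read through
 * cpus_share_cache(), smp_processor_id() and cpu_rq(cpu)->nr_running.
 */
struct wake_ctx {
	bool shares_cache;       /* waker and wakee CPU are in the same LLC */
	bool is_this_cpu;        /* wakee CPU is the CPU doing the wakeup */
	unsigned int nr_running; /* runnable tasks on the wakee CPU */
	int wake_flags;
};

/* old: use the wakelist for a descheduling task if at most one task runs. */
static inline bool queue_cond_old(const struct wake_ctx *c)
{
	if (!c->shares_cache)
		return true;
	return (c->wake_flags & WF_ON_CPU) && c->nr_running <= 1;
}

/* patch1: same as old, but require the wakee CPU to have no running task. */
static inline bool queue_cond_patch1(const struct wake_ctx *c)
{
	if (!c->shares_cache)
		return true;
	return (c->wake_flags & WF_ON_CPU) && c->nr_running == 0;
}

/* patch1+patch2: drop WF_ON_CPU, so any remote idle CPU qualifies. */
static inline bool queue_cond_patch1_patch2(const struct wake_ctx *c)
{
	if (!c->shares_cache)
		return true;
	if (c->is_this_cpu)
		return false;
	return c->nr_running == 0;
}
```

In this model, a remote idle CPU in the same LLC woken without WF_ON_CPU
uses the wakelist only under patch1+patch2, while a WF_ON_CPU wakeup with
nr_running == 1 uses it only under the old condition.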

The score with patch1 is not stable. I ran the test many times and the
score fluctuates between about 7500 and 8000 (more often near 7500).

patch1 puts a stricter limit on using the wakelist, but it may cause a
performance regression.

It seems that using the wakelist appropriately can improve wakeup
performance, but using it too much may cause more IPIs. It's a trade-off
in how strict ttwu_queue_cond() is.

Anyhow, I think patch2 should be a pure improvement. What do you think?
