Date:   Thu, 25 Nov 2021 20:46:44 +0800
From:   Yicong Yang <yangyicong@...ilicon.com>
To:     Mel Gorman <mgorman@...e.de>
CC:     <yangyicong@...ilicon.com>, <mingo@...hat.com>,
        <peterz@...radead.org>, <juri.lelli@...hat.com>,
        <vincent.guittot@...aro.org>, <linux-kernel@...r.kernel.org>,
        <dietmar.eggemann@....com>, <rostedt@...dmis.org>,
        <bsegall@...gle.com>, <bristot@...hat.com>,
        <song.bao.hua@...ilicon.com>, <prime.zeng@...wei.com>,
        <linuxarm@...wei.com>, <21cnbao@...il.com>,
        "shenyang (M)" <shenyang39@...wei.com>
Subject: Re: [PATCH] sched/fair: Clear target from cpus to scan in
 select_idle_cpu

On 2021/11/25 19:17, Mel Gorman wrote:
> On Wed, Nov 24, 2021 at 04:54:01PM +0800, Yicong Yang wrote:
>> Commit 56498cfb045d noticed that "When select_idle_cpu starts scanning for
>> an idle CPU, it starts with a target CPU that has already been checked
>> by select_idle_sibling. This patch starts with the next CPU instead."
>> It only changed the scanning start CPU to target + 1 but still left
>> the target in the scanning cpumask. The target still has a chance to be
>> checked in the last turn. Fix this by clearing the target from the CPUs
>> to scan.
>>
>> Fixes: 56498cfb045d ("sched/fair: Avoid a second scan of target in select_idle_cpu")
>> Signed-off-by: Yicong Yang <yangyicong@...ilicon.com>
> 
> Did you check the performance of this? When I tried something like this
> in a different context, I found that the cost of clearing the bit was
> more expensive than simply using target + 1. For the target to be
> rescanned, the whole mask would have to be scanned as no other CPUs are
> idle which is the unlikely case. By clearing the bit, a cost is always
> incurred even if the first CPU scanned is idle.
> 

Not yet; the change came from reading the code. I've launched some tests and we'll see the results tomorrow.

We traced the scanning here, and it seems that the case where the whole LLC is
scanned without finding an idle CPU is not negligible. On a 4-NUMA, 128-core
Kunpeng 920 server tested with MySQL, there is a ~1% probability of not finding
an idle CPU when the number of sysbench threads is 128. The probability will
increase as the load increases.
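
To make the trade-off concrete, here is a minimal userspace sketch of the two
scanning variants. It is not the kernel code: the names scan_for_idle,
scan_keep_target and scan_clear_target, and the use of a plain 64-bit mask, are
assumptions made for illustration. The real select_idle_cpu() works on a struct
cpumask and also limits the scan depth, which is left out here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical userspace model: one bit per CPU, at most 64 CPUs. */
#define NR_CPUS_MAX 64

/*
 * Walk the CPUs set in scan_mask, starting at `start` and wrapping around
 * (similar in spirit to the kernel's for_each_cpu_wrap()). Returns the first
 * CPU that is also set in idle_mask, or -1 if none is found. *visited
 * receives the number of CPUs that were actually checked.
 */
static int scan_for_idle(uint64_t scan_mask, uint64_t idle_mask,
                         int nr_cpus, int start, int *visited)
{
    assert(nr_cpus > 0 && nr_cpus <= NR_CPUS_MAX);
    *visited = 0;

    for (int i = 0; i < nr_cpus; i++) {
        int cpu = (start + i) % nr_cpus;

        if (!(scan_mask & (UINT64_C(1) << cpu)))
            continue;               /* CPU not in the set to scan */
        (*visited)++;
        if (idle_mask & (UINT64_C(1) << cpu))
            return cpu;             /* first idle CPU wins */
    }
    return -1;
}

/* Start at target + 1, but leave target in the mask (current behaviour). */
static int scan_keep_target(uint64_t llc_mask, uint64_t idle_mask,
                            int nr_cpus, int target, int *visited)
{
    return scan_for_idle(llc_mask, idle_mask, nr_cpus, target + 1, visited);
}

/* Start at target + 1 and also clear target from the mask (the patch). */
static int scan_clear_target(uint64_t llc_mask, uint64_t idle_mask,
                             int nr_cpus, int target, int *visited)
{
    uint64_t mask = llc_mask & ~(UINT64_C(1) << target);

    return scan_for_idle(mask, idle_mask, nr_cpus, target + 1, visited);
}
```

In this model the two variants behave identically whenever some other CPU is
idle. They only differ when the whole mask is scanned without success: the
first variant then checks target once more at the very end, while the second
pays for the extra mask update on every call.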
