linux-kernel - Re: [PATCH] Add busy loop polling for idle SMT


Message-ID: <e31e9b58-591b-c538-ccd1-5864e586ad02@linux.alibaba.com>
Date:   Fri, 19 Nov 2021 11:19:03 +0800
From:   Peng Wang <rocking@...ux.alibaba.com>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     mingo@...hat.com, juri.lelli@...hat.com,
        vincent.guittot@...aro.org, dietmar.eggemann@....com,
        rostedt@...dmis.org, bsegall@...gle.com, mgorman@...e.de,
        bristot@...hat.com, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] Add busy loop polling for idle SMT

On 17/11/2021 18:58, Peter Zijlstra wrote:
> On Tue, Nov 16, 2021 at 07:51:35PM +0800, Peng Wang wrote:
>> Now we have cpu_idle_force_poll which uses cpu_relax() waiting for
>> an arriving IPI, while sometimes busy loop on idle cpu is also
>> useful to provide consistent pipeline interference for hardware SMT.
>>
>> When hardware SMT is enabled, the switching between idle and
>> busy state of one cpu will cause performance fluctuation of
>> other sibling cpus on the same core.
>>
>> In a pay-for-execution-time scenario, cloud service providers prefer
>> stable performance data so they can set a stable price for the same
>> workload. Different execution times for the same workload, caused by the
>> idle or busy state of sibling SMT cpus, produce different bills, which
>> is confusing for customers.
>>
>> Since there is no dynamic CPU time scaling based on SMT pipeline interference,
>> to coordinate sibling SMT noise no matter whether they are idle or not,
>> busy loop in idle state can provide approximately consistent pipeline interference.
>>
>> For example, a workload computing tangent and cotangent finishes in 9071ms when
>> sibling SMT cpus are idle, and in 13299ms when sibling SMT cpus are running another workload.
>> This generates a 32% performance fluctuation.
>>
>> SMT idle polling makes things slower, but we can set a bigger cpu quota to make up
>> for the deficiency. It also increases power consumption by 2.2%, which is acceptable.
>>
>> There may be other possible solutions, but each has its own problem:
>> a) disable hardware SMT, which leaves half of the SMT cpus unused and raises hardware cost.
>> b) busy loop in a userspace thread, but then the reported cpu usage is confusing.
>>
>> We propose this patch to discuss the performance fluctuation problem related to SMT
>> pipeline interference, and any comments are welcome.
> 
> I think you missed April Fools' Day by a wide margin.
> 
> Lowering performance and increasing power usage is a direct

Sibling noise depends on the workload. When pursuing performance
stability, we have to decide which performance figure to keep:

a) the worst case, with constant sibling noise
b) the best case, monopolizing a whole core by disabling SMT or using
core scheduling, at the cost of wasting some logical CPUs
c) a number between the worst and the best, which is hard to choose

That's where lowering performance comes from.
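
To put numbers on the gap between the best and the worst case, here is a
small sketch using only the two timings quoted in the patch description
above (9071ms with idle siblings, 13299ms with busy siblings). The variable
names are illustrative only. It is an assumption, not something the patch
states, that the quoted 32% is the gap expressed relative to the slower run;
the arithmetic below is consistent with that reading.

```python
# Timings quoted in the patch description (milliseconds).
idle_sibling_ms = 9071    # sibling SMT cpus idle: the best case
busy_sibling_ms = 13299   # sibling SMT cpus busy: the worst case

gap_ms = busy_sibling_ms - idle_sibling_ms

# Gap as a share of the slower (worst-case) run.
gap_vs_worst = 100 * gap_ms / busy_sibling_ms
# Gap as a share of the faster (best-case) run.
gap_vs_best = 100 * gap_ms / idle_sibling_ms

print(f"gap: {gap_ms} ms")
print(f"relative to worst case: {gap_vs_worst:.1f}%")
print(f"relative to best case:  {gap_vs_best:.1f}%")
```

Measured against the best case, the same gap is noticeably larger, so the
baseline matters when comparing options a), b) and c).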

> contradiction to sanity. It also doesn't really work as advertised,
> if the siblings are competing for AVX resources the performance is a
> *lot* lower than when an AVX task is competing against a spinner like
> this.
> 

Yes, idle SMT busy loop polling can only approximate the pipeline
interference of normal instructions.

For AVX workloads, we noticed an approach that modifies CPU time
accounting [1]. Do you think combining the two could lead to a feasible
solution, or do you have any better ideas?

[1] https://www.usenix.org/conference/atc21/presentation/gottschlag