Date:   Wed, 6 Sep 2017 19:19:53 +0800
From:   qiaozhou <qiaozhou@...micro.com>
To:     Vikram Mulukutla <markivx@...eaurora.org>,
        Will Deacon <will.deacon@....com>
CC:     Thomas Gleixner <tglx@...utronix.de>,
        John Stultz <john.stultz@...aro.org>, <sboyd@...eaurora.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Wang Wilbur <wilburwang@...micro.com>,
        Marc Zyngier <marc.zyngier@....com>,
        Peter Zijlstra <peterz@...radead.org>,
        <linux-kernel-owner@...r.kernel.org>, <sudeep.holla@....com>
Subject: Re: [Question]: try to fix contention between expire_timers and
 try_to_del_timer_sync



On 2017-08-29 07:12, Vikram Mulukutla wrote:
> 
> Well here's something interesting. I tried a different platform and 
> found that
> the workaround doesn't help much at all, similar to Qiao's observation 
> on his b.L
> chipset. Something to do with the WFE implementation or event-stream?

Hi Vikram,

I ran some experiments, varying the DDR controller (and DDR RAM)
frequency and the CCI frequency. The results are below:

cpu2: a53, 832MHz, cpu7: a73, 1.75GHz
cci: 832M
dclk: DDR controller clock (data rate = 4 * dclk)
With cpu_relax bodging patch:
==============================================================
dclk   | cpu2 time | cpu2 counter | cpu7 time | cpu7 counter |
=======|===========|==============|===========|==============|
78M    |       8906|         55438|         13|       4015789|
156M   |       5964|         75109|          4|       8229050|
500M   |        102|       5984783|          1|       6400885|
600M   |         16|       6233601|          1|       6504718|
==============================================================
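
For illustration, a minimal sketch that derives the data rate from
"data rate = 4 * dclk" and the cpu2/cpu7 counter ratio for each row of
the table above. The names are hypothetical, and reading the counter
columns as per-CPU progress during the test is an assumption; their
units are not stated here.

```python
# Rows copied from the cci = 832M table: dclk (M) -> (cpu2 counter, cpu7 counter).
# ASSUMPTION: "counter" is treated as per-CPU progress during the test run.
CCI_832M = {
    78: (55438, 4015789),
    156: (75109, 8229050),
    500: (5984783, 6400885),
    600: (6233601, 6504718),
}


def summarize(table):
    """Return {dclk: (data_rate, cpu2/cpu7 counter ratio)} for one table."""
    return {
        dclk: (4 * dclk, cpu2 / cpu7)  # data rate = 4 * dclk, as stated above
        for dclk, (cpu2, cpu7) in table.items()
    }


summary = summarize(CCI_832M)
for dclk, (rate, ratio) in summary.items():
    print(f"dclk={dclk}M data_rate={rate}M cpu2/cpu7={ratio:.3f}")
```

A ratio near 1 would mean both cores make similar progress. With these
rows it is on the order of 0.01 at 78M and 156M and above 0.9 at 500M
and 600M.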

I suspect that the global exclusive monitor in the DDR controller plays
an important part. When the DDR frequency is high enough, it seems to
handle the exclusive requests efficiently and fairly.

If the CCI frequency is reduced, the result for the little core drops
sharply again.

cpu2: a53, 832MHz, cpu7: a73, 1.75GHz
cci: 416M
dclk: DDR controller clock (data rate = 4 * dclk)
With cpu_relax bodging patch:
==============================================================
dclk   | cpu2 time | cpu2 counter | cpu7 time | cpu7 counter |
=======|===========|==============|===========|==============|
78M    |       8837|         10596|         11|       3873635|
156M   |      17597|         10211|          4|       6513493|
500M   |      10888|         13214|          2|       8916396|
600M   |       8934|         15842|          2|       9394124|
==============================================================
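
As a rough way to quantify the drop on the little core, the sketch
below computes the fraction of the cpu2 counter that remains when cci
goes from 832M to 416M, using the values from the two tables. The names
are hypothetical, and treating the counter as a measure of cpu2's
progress is an assumption.

```python
# cpu2 counter at each dclk (M), copied from the two tables:
# (value at cci = 832M, value at cci = 416M).
# ASSUMPTION: a larger counter means cpu2 made more progress in the test.
CPU2_COUNTER = {
    78: (55438, 10596),
    156: (75109, 10211),
    500: (5984783, 13214),
    600: (6233601, 15842),
}


def retained_fraction(counters):
    """Return {dclk: counter at cci=416M / counter at cci=832M}."""
    return {dclk: low / high for dclk, (high, low) in counters.items()}


retained = retained_fraction(CPU2_COUNTER)
for dclk, frac in retained.items():
    print(f"dclk={dclk}M cpu2 counter retained at cci=416M: {frac:.4f}")
```

With these values, cpu2 keeps less than 1% of its counter at dclk 500M
and 600M once cci is halved, so the gain from a faster DDR clock
disappears.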

I guess the result on your other platform might be related to the DDR
frequency too.

Best Regards
Qiao
