Message-ID: <CAE4VaGD6wwyLUyBWNTjKmk1Lnu034WnO8XGWSLg+kqYoZPGawA@mail.gmail.com>
Date:	Thu, 28 Jul 2016 23:48:30 +0200
From:	Jirka Hladky <jhladky@...hat.com>
To:	Peter Zijlstra <peterz@...radead.org>
Cc:	linux-kernel <linux-kernel@...r.kernel.org>,
	Kamil Kolakowski <kkolakow@...hat.com>,
	Ingo Molnar <mingo@...hat.com>,
	Jean-Pierre Lozi <jplozi@...ce.fr>,
	Alexandra Fedorova <sasha@....ubc.ca>,
	Baptiste Lepers <baptiste.lepers@...il.com>,
	Lauro Venancio <lvenanci@...hat.com>
Subject: Re: Kernel v4.7-rc5 - performance degradation upto 40% after
 disabling and re-enabling a core

Hi Peter,

I have an update regarding the performance degradation after disabling
and re-enabling a core.

It turns out that the lu.C.x results vary considerably between runs, so
the test has to be repeated several times and the mean real time used
to get reliable results.
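
A minimal sketch of such a repeated-run measurement, in Python. The
function names and the default of 10 runs are illustrative assumptions;
this is not the harness actually used for the numbers in this mail.

```python
import statistics
import subprocess
import time


def time_runs(cmd, runs=10):
    """Run cmd `runs` times; return each run's wall-clock time in seconds."""
    times = []
    for _ in range(runs):
        start = time.perf_counter()
        # check=True makes a failed benchmark run raise instead of being timed.
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
        times.append(time.perf_counter() - start)
    return times


def summarize(times):
    """Return (mean, sample standard deviation) of the measured times."""
    mean = statistics.mean(times)
    # The sample standard deviation needs at least two data points.
    stdev = statistics.stdev(times) if len(times) > 1 else 0.0
    return mean, stdev
```

Calling summarize(time_runs(["./lu.C.x"])) would give the mean and the
spread over ten runs; a large standard deviation relative to the mean
is the sign that a single run cannot be trusted.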

There is NO regression on the following CPUs:

4x Xeon(R) CPU E5-4610 v2 @ 2.30GHz
4x Xeon(R) CPU E5-2690 v3 @ 2.60GHz

but there is a regression (a slowdown by a factor of about 6) on

AMD Opteron(TM) Processor 6272

Kernel 4.7.0-0.rc7.git0.1.el7.x86_64

real_time to run the ./lu.C.x benchmark (mean of 10 runs):

Right after boot: 273 seconds
After disabling and enabling a core: 1702 seconds!
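
The two means work out to a slowdown of roughly 6.2x (1702 / 273). The
sketch below shows one way to reproduce the disable/re-enable step and
compute that ratio. It assumes the core is toggled through the standard
sysfs CPU hotplug file cpu<N>/online, which this mail does not state;
the function names and the choice of CPU are hypothetical, and writing
to the real sysfs path needs root.

```python
from pathlib import Path


def set_cpu_online(cpu, online, sysfs_root="/sys/devices/system/cpu"):
    """Write 1 (online) or 0 (offline) to cpu<N>/online under sysfs_root."""
    path = Path(sysfs_root) / f"cpu{cpu}" / "online"
    path.write_text("1" if online else "0")


def toggle_cpu(cpu, sysfs_root="/sys/devices/system/cpu"):
    """Take one CPU offline and bring it straight back online."""
    set_cpu_online(cpu, False, sysfs_root)
    set_cpu_online(cpu, True, sysfs_root)


def slowdown(before_s, after_s):
    """Ratio of the mean run time after the toggle to the mean before it."""
    return after_s / before_s
```

With the means reported above, slowdown(273, 1702) is about 6.23.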

So you were right that it's related to COD-like technology:

> The Opteron 6272, which they use, is an Interlagos, that has something
> similar in that each package contains two nodes.

Lauro Venancio is now working on a fix.

Jirka


On Tue, Jul 12, 2016 at 11:04 AM, Jirka Hladky <jhladky@...hat.com> wrote:
> Hi Peter,
>
> have you a chance to look into this? Is there anything I can do to
> help you to fix it?
>
> Thanks a lot!
> Jirka
>
>
> On Wed, Jun 29, 2016 at 11:58 AM, Peter Zijlstra <peterz@...radead.org> wrote:
>> On Wed, Jun 29, 2016 at 11:47:56AM +0200, Jirka Hladky wrote:
>>> Hi Peter,
>>>
>>> I think Cluster on Die technology was introduced in Haswell generation. The
>>> server I'm using is equipped with 4x Intel E5-4610 v2 (Ivy Bridge). I have
>>> double checked the BIOS and there is no cluster on die setting.
>>
>> Oh right, that's E5v3..
>>
>>> The authors of the paper have reported the issue on AMD Bulldozer CPU which
>>> also does not have COD technology.
>>
>> The Opteron 6272, which they use, is an Interlagos, that has something
>> similar in that each package contains two nodes.
>>
>> And their patch touches exactly that part of the x86 topo setup, the
>> match_die() && !same_node() condition, IOW same package, different node.
>>
>> That's not a path an Intel chip would trigger without COD support.
