Message-ID: <cf7439c4-f72c-a145-5a65-84ae15c5d96f@intel.com>
Date:   Tue, 12 Sep 2023 15:10:53 -0700
From:   Reinette Chatre <reinette.chatre@...el.com>
To:     Ilpo Järvinen <ilpo.jarvinen@...ux.intel.com>,
        Shuah Khan <shuah@...nel.org>,
        Shuah Khan <skhan@...uxfoundation.org>,
        <linux-kselftest@...r.kernel.org>,
        Maciej Wieczór-Retman 
        <maciej.wieczor-retman@...el.com>
CC:     <linux-kernel@...r.kernel.org>,
        Shaopeng Tan <tan.shaopeng@...fujitsu.com>,
        <stable@...r.kernel.org>
Subject: Re: [PATCH 5/5] selftests/resctrl: Reduce failures due to outliers in
 MBA/MBM tests

Hi Ilpo,

On 9/11/2023 4:19 AM, Ilpo Järvinen wrote:
> 5% difference upper bound for success is a bit on the low side for the

"a bit on the low side" is very vague.

> MBA and MBM tests. Some platforms produce outliers that are slightly
> above that, typically 6-7%.
> 
> Relaxing the MBA/MBM success bound to 8% removes most of the failures
> due those frequent outliers.

This description needs more context on what issue is being solved here.
What does the % difference represent? How was the new percentage determined?

Did you investigate why there are differences between platforms? From
what I understand these tests measure memory bandwidth using perf and
resctrl and then compare the difference. Are there interesting things
about the platforms on which the difference is higher than 5%? Could
those be systems with multiple sockets (and thus multiple PMUs that need
to be set up, reset, and read)? Can the reading of the counters be improved
instead of relaxing the success criteria? A quick comparison between
get_mem_bw_imc() and get_mem_bw_resctrl() makes me think that a difference
is not surprising ... note how the PMU counters are started and reset
(potentially on multiple sockets) at every iteration while the resctrl
counters keep rolling and new values are just subtracted from previous.
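
To illustrate why the two styles of counting can disagree, here is a toy
sketch. It is not the selftest code: measure_start_stop(), measure_rolling(),
diff_percent(), the "tick" model, and the gap between iterations are all
hypothetical and only assume that some traffic occurs while a counter is
being reset and restarted.

```c
/*
 * Toy model (all names hypothetical): traffic[i] is the number of bytes
 * transferred during tick i. Each iteration spans ticks_per_iter measured
 * ticks followed by gap_ticks of overhead before the next iteration.
 */

/*
 * Start/stop style: the counter is reset at the start of every iteration
 * and read at its end, so traffic during the gap is never counted.
 */
unsigned long measure_start_stop(const unsigned long *traffic, int iterations,
				 int ticks_per_iter, int gap_ticks)
{
	int stride = ticks_per_iter + gap_ticks;
	unsigned long total = 0;

	for (int i = 0; i < iterations; i++) {
		unsigned long counter = 0;	/* reset */

		for (int t = 0; t < ticks_per_iter; t++)
			counter += traffic[i * stride + t];
		total += counter;		/* read */
	}
	return total;
}

/*
 * Rolling style: the counter is never reset. Each iteration reports the
 * current value minus the previous one, so gap traffic is included.
 */
unsigned long measure_rolling(const unsigned long *traffic, int iterations,
			      int ticks_per_iter, int gap_ticks)
{
	int stride = ticks_per_iter + gap_ticks;
	unsigned long counter = 0, prev = 0, total = 0;

	for (int i = 0; i < iterations; i++) {
		for (int t = 0; t < stride; t++)
			counter += traffic[i * stride + t];
		total += counter - prev;	/* delta since last read */
		prev = counter;
	}
	return total;
}

/* Absolute difference as a truncated integer percentage of ref. */
int diff_percent(unsigned long val, unsigned long ref)
{
	unsigned long d = val > ref ? val - ref : ref - val;

	return (int)(d * 100 / ref);
}
```

In this model, with a constant 100 bytes per tick, two iterations of nine
measured ticks, and a one-tick gap, the start/stop style counts 1800 bytes
while the rolling style counts 2000. Whether the real difference looks
anything like this depends on how long the setup, reset, and read actually
take on a given platform.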

Reinette
