lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <cf59f587-9ca2-4f0d-b412-69b559acbabb@intel.com>
Date: Fri, 29 Mar 2024 09:37:12 -0700
From: Reinette Chatre <reinette.chatre@...el.com>
To: Tony Luck <tony.luck@...el.com>
CC: <linux-doc@...r.kernel.org>, Fenghua Yu <fenghua.yu@...el.com>, "Peter
 Newman" <peternewman@...gle.com>, James Morse <james.morse@....com>, "Babu
 Moger" <babu.moger@....com>, Drew Fustini <dfustini@...libre.com>,
	<x86@...nel.org>, <linux-kernel@...r.kernel.org>, <patches@...ts.linux.dev>
Subject: Re: [PATCH] Documentation/x86: Document resctrl bandwidth control
 units are MiB

Hi Tony,

On 3/29/2024 8:31 AM, Tony Luck wrote:
> On Thu, Mar 28, 2024 at 06:01:33PM -0700, Reinette Chatre wrote:
>> Hi Tony,
>>
>> On 3/22/2024 11:20 AM, Tony Luck wrote:
>>> The memory bandwidth software controller uses 2^20 units rather than
>>> 10^6. See mbm_bw_count() which computes bandwidth using the "SZ_1M"
>>> Linux define for 0x00100000.
>>>
>>> Update the documentation to use MiB when describing this feature.
>>> It's too late to fix the mount option "mba_MBps" as that is now an
>>> established user interface.
>>
>> I see that this is merged already but I do not think this is correct.
> 
> I was surprised that Ingo merged it without giving folks a chance to
> comment.
> 
>> Shouldn't the implementation be fixed instead? Looking at the implementation
>> the intent appears to be clear that the goal is to have bandwidth be
>> MBps .... that is when looking from documentation to the define
>> (MBA_MAX_MBPS) to the comments of the function you reference above
>> mbm_bw_count(). For example, "...and delta bandwidth in MBps ..."
>> and "...maintain values in MBps..."
> 
> Difficult to be sure of intent. But in general when people talk about
> "megabytes" in the context of memory they mean 2^20. Storage capacity
> on computers was originally in 2^20 units until the marketing teams
> at disk drive manufacturers realized they could print numbers 4.8% bigger
> on their products by using SI unit 10^6 Mega prefix (rising to 7.3% with
> Giga and 10% with Tera).

This is not so obvious to me. I hear what you are saying about storage
capacity but the topic here is memory bandwidth and here I find the custom
to be that MB/s means 10^6 bytes per second. That is looking from how DDR
bandwidth is documented to how benchmarks like
https://github.com/intel/memory-bandwidth-benchmarks report the data, to
what wikipedia says in https://en.wikipedia.org/wiki/Memory_bandwidth.

I also took a sample of what the perf side of things may look like
and, for example, when looking at;
tools/perf/pmu-events/arch/x86/sapphirerapids/spr-metrics.json
I understand that the custom for bandwidth is MB/s. For example:

    {
        "BriefDescription": "DDR memory read bandwidth (MB/sec)",
        "MetricExpr": "UNC_M_CAS_COUNT.RD * 64 / 1e6 / duration_time",
        "MetricName": "memory_bandwidth_read",
        "ScaleUnit": "1MB/s"
    },

> 
> It is clear that the code uses 2^20 as it converts from bytes using
> a right shift by 20.

Right. This appears to be a bug.

> 
> Fixing the code would change the legacy API. Folks with a schemata
> file that sets a limit of 5000 MB/s would find their applications
> throttled by an addtional 4.8% on upgrading to a kernel with this
> "fix".

Wouldn't this be the right thing to do though? If the user sets a limit
of 5000MB/s then I believe the expectation is that it should not exceed
that bandwidth.

>> To me this change creates significant confusion since it now contradicts
>> with the source code and comments I reference above. Not to mention the
>> discrepancy with user documentation.
>>
>> If you believe that this should be MiB then should the
>> source and comments not also be changed to reflect that? Or alternatively,
>> why not just fix mbm_bw_count() to support the documentation and what
>> it appears to be intended to do. If users have been using the interface
>> expecting MBps then this seems more like a needed bugfix than 
>> a needed documentation change.
> 
> I agree that the comments need to be fixed. I will spin up a patch.

It is not just the comments but the constants and variables also. All point
to MBps.

Also note that there remain several examples in the resctrl doc (see
section "Memory bandwidth Allocation and monitoring") that continue to
use GB/s. Are you really proposing that all should be changed to GiB/s?

In addition to all above this change would fragment resctrl even more
with AMD's memory bandwidth control clearly being in GB/s.

I continue to find this change to be the source of confusion and diverges
resctrl from the custom.

Reinette

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ