Date:   Wed, 8 Nov 2023 13:49:32 -0800
From:   Reinette Chatre <reinette.chatre@...el.com>
To:     Tony Luck <tony.luck@...el.com>
CC:     Fenghua Yu <fenghua.yu@...el.com>,
        Peter Newman <peternewman@...gle.com>,
        Jonathan Corbet <corbet@....net>,
        Shuah Khan <skhan@...uxfoundation.org>, <x86@...nel.org>,
        Shaopeng Tan <tan.shaopeng@...itsu.com>,
        James Morse <james.morse@....com>,
        Jamie Iles <quic_jiles@...cinc.com>,
        Babu Moger <babu.moger@....com>,
        Randy Dunlap <rdunlap@...radead.org>,
        <linux-kernel@...r.kernel.org>, <linux-doc@...r.kernel.org>,
        <patches@...ts.linux.dev>
Subject: Re: [PATCH v3] x86/resctrl: mba_MBps: Fall back to total b/w if local
 b/w unavailable

Hi Tony,

On 11/7/2023 1:15 PM, Tony Luck wrote:
> On Fri, Nov 03, 2023 at 02:43:15PM -0700, Reinette Chatre wrote:
>> On 10/26/2023 1:02 PM, Tony Luck wrote:
>>> If local bandwidth measurement is not available, do not give up on
>>> providing the "mba_MBps" feedback option completely; make the code fall
>>> back to using total bandwidth.
>>
>> It is interesting to me that the "fall back" is essentially a drop-in
>> replacement without any adjustments to the data/algorithm.
> 
> The algorithm is, by necessity, very simple. Essentially "if measured
> bandwidth is above desired target, apply one step extra throttling.
> Reverse when bandwidth is below desired level." I'm not sure what tweaks
> are possible.
> 
>> Can these measurements be considered equivalent? Could a user now perhaps
>> want to experiment by disabling local bandwidth measurement to explore
>> whether the system behaves differently when using total memory bandwidth?
>> What would make a user choose one over the other (apart from when the user
>> is forced to by system capability)?
> 
> This may be interesting. I dug around in the e-mail archives to see if
> there was any discussion on why "local" was picked as the feedback
> measurement rather than "total". But I couldn't find anything.
> 
> Thinking about it now, "total" feels like a better choice. Why would
> you not care about off-package memory bandwidth? In pathological cases
> all the memory traffic might be going off package, but the existing
> mba_MBps algorithm would *reduce* the amount of throttling, eventually
> to zero.
> 
> Maybe an additional mount option "mba_MBps_total" so the user can pick
> total instead of local?

Is this something for which a remount is required? Can it not perhaps be
changed at runtime?

> 
>>>
>>> Signed-off-by: Tony Luck <tony.luck@...el.com>
>>> ---
>>> Change since v2:
>>>
>>> Babu doesn't like the global variable. So here's a version without it.
>>>
>>> Note that my preference is still the v2 version. But as I tell newbies
>>> to Linux "Your job isn't to get YOUR patch upstream. Your job is to get
>>> the problem fixed.".  So taking my own advice I don't really mind
>>> whether v2 or v3 is applied.
>>>
>>>  arch/x86/kernel/cpu/resctrl/monitor.c  | 43 ++++++++++++++++++--------
>>>  arch/x86/kernel/cpu/resctrl/rdtgroup.c |  2 +-
>>>  2 files changed, 31 insertions(+), 14 deletions(-)
>>>
>>> diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
>>> index f136ac046851..29e86310677d 100644
>>> --- a/arch/x86/kernel/cpu/resctrl/monitor.c
>>> +++ b/arch/x86/kernel/cpu/resctrl/monitor.c
>>> @@ -418,6 +418,20 @@ static int __mon_event_count(u32 rmid, struct rmid_read *rr)
>>>  	return 0;
>>>  }
>>>  
>>> +/*
>>> + * For legacy compatibility use the local memory bandwidth to drive
>>> + * the mba_MBps feedback control loop. But on platforms that do not
>>> + * provide the local event fall back to use the total bandwidth event
>>> + * instead.
>>> + */
>>> +static enum resctrl_event_id pick_mba_mbps_event(void)
>>> +{
>>> +	if (is_mbm_local_enabled())
>>> +		return QOS_L3_MBM_LOCAL_EVENT_ID;
>>> +
>>> +	return QOS_L3_MBM_TOTAL_EVENT_ID;
>>> +}
>>
>> Can there be a WARN here to catch the unlikely event that
>> !is_mbm_total_enabled()?
>> This may mean the caller (in update_mba_bw()) needs to move
>> to code protected by is_mbm_enabled().
> 
> All this code is under the protection of the check at mount time
> done by supports_mba_mbps()
> 
> static bool supports_mba_mbps(void)
> {
>         struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_MBA].r_resctrl;
> 
>         return (is_mbm_enabled() &&
>                 r->alloc_capable && is_mba_linear());
> }
> 
> Adding even more run-time checks seems overkill.

Refactoring the code into a function, but then implicitly assuming and
requiring that the function only be called in specific flows on systems with
a particular environment, does not sound appealing to me.

Another alternative, since only one caller of this function remains,
is to remove this function and instead open code it within update_mba_bw(),
replacing the is_mbm_enabled() call.

Reinette
