Message-ID: <nchqtx6dp3e2pcp64j7fdeyauarl4funonwkjx3nn5zztgpfw2@2xpb44k52ke6>
Date:   Wed, 20 Sep 2023 08:06:09 +0200
From:   Maciej Wieczór-Retman 
        <maciej.wieczor-retman@...el.com>
To:     "Luck, Tony" <tony.luck@...el.com>
CC:     "Shaopeng Tan (Fujitsu)" <tan.shaopeng@...itsu.com>,
        "Yu, Fenghua" <fenghua.yu@...el.com>,
        "Chatre, Reinette" <reinette.chatre@...el.com>,
        Peter Newman <peternewman@...gle.com>,
        Jonathan Corbet <corbet@....net>,
        Shuah Khan <skhan@...uxfoundation.org>,
        "x86@...nel.org" <x86@...nel.org>,
        James Morse <james.morse@....com>,
        Jamie Iles <quic_jiles@...cinc.com>,
        "Babu Moger" <babu.moger@....com>,
        Randy Dunlap <rdunlap@...radead.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "linux-doc@...r.kernel.org" <linux-doc@...r.kernel.org>,
        "patches@...ts.linux.dev" <patches@...ts.linux.dev>
Subject: Re: [PATCH v5 8/8] selftests/resctrl: Adjust effective L3 cache size
 when SNC enabled

Hi, thanks for the reply.

On 2023-09-19 at 14:36:06 +0000, Luck, Tony wrote:
>> On a system that has SNC disabled the function reports both "node_cpus"
>> and "cache_cpus" equal to 56. In this case snc_ways() returns "2". It is
>> the same on a system with SNC enabled that reports the previously mentioned
>> variables to be different by a factor of two (36 and 72).
>
>> Is it possible for node_cpus and cache_cpus to not be multiples of each
>> other? (as in for example cache_cpus being 10 and node_cpus being 21?).
>> If not I'd suggest using "==" instead of ">=".
>
>Some CPUs may be offline when the test is run. E.g. with one CPU offline on SNC
>node 0, you'd see node_cpus = 35 and cache_cpus = 71. But with one CPU offline
>on node 1, you'd have node_cpus = 36, cache_cpus = 71.

Okay, thanks, good to know. On systems with SNC disabled those two
numbers should be equal even if some CPUs are offline, right? I was
mostly concerned that the previous version returned the same value
whether SNC was enabled with 2 nodes or disabled.

>> If yes then I guess something like this could work? :
>
>+     if (node_cpus >= cache_cpus)
>+             return 1;
>+     else if (2 * node_cpus >= cache_cpus)
>+             return 2;
>+     else if (4 * node_cpus >= cache_cpus)
>+             return 4;
>
>This returns "4" for the 36 71 case. But should still be "2".
>
>>> PS. I did my tests on two Intel Ice Lakes.
>
>Perhaps easier to play with the algorithm in user code?
>
>#include <stdio.h>
>#include <stdlib.h>
>
>static int snc(int node_cpus, int cache_cpus)
>{
>     if (node_cpus >= cache_cpus)
>             return 1;
>     else if (2 * node_cpus >= cache_cpus)
>             return 2;
>     else if (4 * node_cpus >= cache_cpus)
>             return 4;
>     return -1;
>}
>
>int main(int argc, char **argv)
>{
>        printf("%d\n", snc(atoi(argv[1]), atoi(argv[2])));
>
>        return 0;
>}

My previous understanding was that the ">=" comparison implied that
node_cpus could somehow get larger. So I assumed that keeping it that
way would be sufficient, but now I can see that isn't the case.
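
For illustration, here is a minimal sketch of one way to tolerate a small
number of offline CPUs: round cache_cpus / node_cpus to the nearest integer
instead of walking the ">=" chain. With the quoted chain, node_cpus = 35 and
cache_cpus = 71 yields 4. The name snc_rounded() is hypothetical and not
part of the patch, and accepting only 1, 2 and 4 as valid results is an
assumption carried over from the quoted snc() helper.

```c
/*
 * Hypothetical helper, not taken from the patch: estimate the number of
 * SNC nodes per L3 cache by rounding cache_cpus / node_cpus to the
 * nearest integer, so a few offline CPUs do not change the result.
 */
int snc_rounded(int node_cpus, int cache_cpus)
{
	int ratio;

	/* Reject counts that cannot come from an online node or cache. */
	if (node_cpus <= 0 || cache_cpus <= 0)
		return -1;

	/* Integer division rounded to nearest. */
	ratio = (cache_cpus + node_cpus / 2) / node_cpus;

	/* Assumption: only SNC off (1), SNC 2 and SNC 4 are expected. */
	if (ratio == 1 || ratio == 2 || ratio == 4)
		return ratio;

	return -1;
}
```

With this sketch both snc_rounded(35, 71) and snc_rounded(36, 71) give 2,
and snc_rounded(56, 56) gives 1. It would still misreport the case where
all CPUs of another SNC node are offline, since the two counts then look
the same as with SNC disabled.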

>
>N.B. it's probably not possible to handle the case where somebody took ALL the CPUs in SNC
>node 1 offline (or SNC nodes 1,2,3 for the SNC 4 case).
>
>I think it reasonable that the code handle some simple "small number of CPUs offline" cases.
>But don't worry too much about cases where the user has done something extreme.
>
>-Tony

What about outputting this value to userspace from resctrl? The ratio is
already saved in the snc_nodes_per_l3_cache variable. That would help
avoid these difficult cases where some CPUs are offline, which could
cause snc_ways() to return a wrong value. Or are there some pitfalls
to that approach?

-- 
Kind regards
Maciej Wieczór-Retman
