Message-ID: <CAP-5=fXRphB0gU6CxAuj9Fy40sbwub23RbLLo=5LEY=-_D=3+g@mail.gmail.com>
Date: Wed, 6 Apr 2022 17:35:24 -0700
From: Ian Rogers <irogers@...gle.com>
To: Athira Rajeev <atrajeev@...ux.vnet.ibm.com>
Cc: acme@...nel.org, jolsa@...nel.org, disgoel@...ux.vnet.ibm.com,
mpe@...erman.id.au, linux-perf-users@...r.kernel.org,
linuxppc-dev@...ts.ozlabs.org, maddy@...ux.vnet.ibm.com,
rnsastry@...ux.ibm.com, kjain@...ux.ibm.com,
linux-kernel@...r.kernel.org, srikar@...ux.vnet.ibm.com
Subject: Re: [PATCH v2 0/4] Fix perf bench numa, futex and epoll to work with
machines having #CPUs > 1K
On Wed, Apr 6, 2022 at 10:51 AM Athira Rajeev
<atrajeev@...ux.vnet.ibm.com> wrote:
>
> The perf benchmarks in the numa, futex and epoll collections fail on
> system configurations with more than 1024 CPUs. These benchmarks use
> "sched_getaffinity" and "sched_setaffinity" in the code to work with
> affinity.
>
> Example snippet from numa benchmark:
> <<>>
> perf: bench/numa.c:302: bind_to_node: Assertion `!(ret)' failed.
> Aborted (core dumped)
> <<>>
>
> The bind_to_node function uses "sched_getaffinity" to save the cpumask.
> This fails with EINVAL because the default cpu_set_t mask size in glibc
> is 1024 bits.
>
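
(Not from the series, just a minimal stand-alone sketch of the failing
pattern for reference: glibc's cpu_set_t is a fixed 1024-bit mask, so on
a machine whose possible-CPU mask is wider the call is rejected with
EINVAL.)

#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>

int main(void)
{
	cpu_set_t mask;	/* fixed 1024-bit mask in glibc */

	/* When the kernel's possible-CPU mask is wider than 1024 bits,
	 * this fails with EINVAL because sizeof(mask) is too small. */
	if (sched_getaffinity(0, sizeof(mask), &mask)) {
		perror("sched_getaffinity");
		return 1;
	}
	return 0;
}
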
> Similarly, the futex and epoll benchmarks use sched_setaffinity during
> pthread_create to set the affinity. Since it returns EINVAL on such
> system configurations, the benchmarks don't run.
>
> To overcome this 1024-CPU mask size limitation of cpu_set_t, change
> the mask size using the CPU_*_S macros, i.e., use CPU_ALLOC to
> allocate the cpumask, CPU_ALLOC_SIZE to get its size, and CPU_SET_S
> to set a mask bit.
>
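
(A minimal stand-alone sketch of the sized-mask approach described
above, not the series' code; sysconf(_SC_NPROCESSORS_CONF) stands in
here for however the benchmarks obtain the possible-CPU count, and the
error handling is illustrative only.)

#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void)
{
	long nr_cpus = sysconf(_SC_NPROCESSORS_CONF);	/* possible CPUs */
	cpu_set_t *mask = CPU_ALLOC(nr_cpus);	/* heap mask sized for nr_cpus */
	size_t size = CPU_ALLOC_SIZE(nr_cpus);	/* byte size passed to syscalls */

	if (!mask)
		return 1;

	CPU_ZERO_S(size, mask);
	CPU_SET_S(0, size, mask);	/* e.g. bind to CPU 0 */

	if (sched_setaffinity(0, size, mask)) {
		perror("sched_setaffinity");
		CPU_FREE(mask);
		return 1;
	}

	CPU_FREE(mask);
	return 0;
}
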
> Fix all the relevant places in the code to use a mask size that is
> large enough to represent the number of possible CPUs in the system.
>
> Fix the parse_setup_cpu_list function in the numa bench to check
> whether an input CPU is online before binding a task to that CPU.
> This fixes failures where, even though the CPU number is within the
> maximum CPU count, the CPU itself is offline. In that case,
> sched_setaffinity fails when the cpumask has that CPU's bit set.
>
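
(The series routes this online check through code added in
tools/perf/util/header.c, per the diffstat below; the following
stand-alone sketch of reading the sysfs "online" file is only a
hypothetical illustration, not that helper.)

#include <stdio.h>

/* Hypothetical illustration: report whether a CPU is online by reading
 * /sys/devices/system/cpu/cpuN/online. cpu0 typically has no such file
 * and is treated as always online. */
static int cpu_is_online(int cpu)
{
	char path[128];
	FILE *f;
	int online = 0;

	snprintf(path, sizeof(path),
		 "/sys/devices/system/cpu/cpu%d/online", cpu);
	f = fopen(path, "r");
	if (!f)
		return 1;
	if (fscanf(f, "%d", &online) != 1)
		online = 0;
	fclose(f);
	return online;
}
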
> Patch 1 and Patch 2 fix the perf bench futex and perf bench epoll
> benchmarks. Patch 3 and Patch 4 fix the perf bench numa benchmark.
>
> Athira Rajeev (4):
> tools/perf: Fix perf bench futex to correct usage of affinity for
> machines with #CPUs > 1K
> tools/perf: Fix perf bench epoll to correct usage of affinity for
> machines with #CPUs > 1K
> tools/perf: Fix perf numa bench to fix usage of affinity for machines
> with #CPUs > 1K
> tools/perf: Fix perf bench numa testcase to check if CPU used to bind
> task is online
>
> Changelog:
> From v1 -> v2:
> Addressed review comment from Ian Rogers to do
> CPU_FREE in a cleaner way.
> Added Tested-by from Disha Goel

The whole set:

Acked-by: Ian Rogers <irogers@...gle.com>

Thanks,
Ian

> tools/perf/bench/epoll-ctl.c | 25 ++++--
> tools/perf/bench/epoll-wait.c | 25 ++++--
> tools/perf/bench/futex-hash.c | 26 ++++--
> tools/perf/bench/futex-lock-pi.c | 21 +++--
> tools/perf/bench/futex-requeue.c | 21 +++--
> tools/perf/bench/futex-wake-parallel.c | 21 +++--
> tools/perf/bench/futex-wake.c | 22 ++++--
> tools/perf/bench/numa.c | 105 ++++++++++++++++++-------
> tools/perf/util/header.c | 43 ++++++++++
> tools/perf/util/header.h | 1 +
> 10 files changed, 242 insertions(+), 68 deletions(-)
>
> --
> 2.35.1
>