linux-kernel - Re: [PATCH v1 5/8] perf test: Tag parallel failing shell tests with "(exclusive)"

Message-ID: <CAP-5=fWep2dV4-1tzVDQ8z-Ud7tmnw4JBKqkpLoN=nRbbMpxVg@mail.gmail.com>
Date: Fri, 11 Oct 2024 09:52:10 -0700
From: Ian Rogers <irogers@...gle.com>
To: James Clark <james.clark@...aro.org>
Cc: Peter Zijlstra <peterz@...radead.org>, Ingo Molnar <mingo@...hat.com>, 
	Arnaldo Carvalho de Melo <acme@...nel.org>, Namhyung Kim <namhyung@...nel.org>, 
	Mark Rutland <mark.rutland@....com>, 
	Alexander Shishkin <alexander.shishkin@...ux.intel.com>, Jiri Olsa <jolsa@...nel.org>, 
	Adrian Hunter <adrian.hunter@...el.com>, Kan Liang <kan.liang@...ux.intel.com>, 
	Howard Chu <howardchu95@...il.com>, Athira Jajeev <atrajeev@...ux.vnet.ibm.com>, 
	Michael Petlan <mpetlan@...hat.com>, Veronika Molnarova <vmolnaro@...hat.com>, 
	Dapeng Mi <dapeng1.mi@...ux.intel.com>, Thomas Richter <tmricht@...ux.ibm.com>, 
	Ilya Leoshkevich <iii@...ux.ibm.com>, Colin Ian King <colin.i.king@...il.com>, 
	Weilin Wang <weilin.wang@...el.com>, Andi Kleen <ak@...ux.intel.com>, linux-kernel@...r.kernel.org, 
	linux-perf-users@...r.kernel.org
Subject: Re: [PATCH v1 5/8] perf test: Tag parallel failing shell tests with "(exclusive)"

On Fri, Oct 11, 2024 at 3:29 AM James Clark <james.clark@...aro.org> wrote:
>
>
>
> On 11/10/2024 11:01 am, James Clark wrote:
> >
> >
> > On 11/10/2024 8:35 am, Ian Rogers wrote:
> >> Some shell tests compete for resources and so can't run with other
> >> tests, tag such tests.  The "(exclusive)" stems from shared/exclusive
> >> to describe how the tests run as if holding a lock.
> >>
> >> Signed-off-by: Ian Rogers <irogers@...gle.com>
> >> ---
> >>   tools/perf/tests/shell/perftool-testsuite_report.sh | 2 +-
> >>   tools/perf/tests/shell/record.sh                    | 2 +-
> >>   tools/perf/tests/shell/record_lbr.sh                | 2 +-
> >>   tools/perf/tests/shell/record_offcpu.sh             | 2 +-
> >>   tools/perf/tests/shell/stat_all_pmu.sh              | 2 +-
> >>   tools/perf/tests/shell/test_intel_pt.sh             | 2 +-
> >>   tools/perf/tests/shell/test_stat_intel_tpebs.sh     | 2 +-
> >>   7 files changed, 7 insertions(+), 7 deletions(-)
> >>
> >
> > The following ones would also need to be marked as exclusive, not sure
> > if you can include those here or you want me to send a patch:
> >
> >   tools/perf/tests/shell/coresight/asm_pure_loop.sh
> >   tools/perf/tests/shell/coresight/memcpy_thread_16k_10.sh
> >   tools/perf/tests/shell/coresight/thread_loop_check_tid_10.sh
> >   tools/perf/tests/shell/coresight/thread_loop_check_tid_2.sh
> >   tools/perf/tests/shell/coresight/unroll_loop_thread_10.sh
> >   tools/perf/tests/shell/test_arm_coresight.sh
> >   tools/perf/tests/shell/test_arm_coresight_disasm.sh
> >   tools/perf/tests/shell/test_arm_spe.sh

I'll add these to v2 with your Suggested-by tag. Thanks.

> > In theory all tests using probes would also need to be exclusive because
> > they install and delete probes globally. In practice I don't think I saw
> > any failures, whether that's just luck or because of some skips I'm not
> > sure.
> >
> > And this one fails consistently in parallel mode on Arm:
> >
> >    22: Number of exit events of a simple workload
> >      : FAILED!

This looks like it could be a real issue. I believe the test is doing
uid filtering:
https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/tests/task-exit.c?h=perf-tools-next#n49
uid filtering scans /proc looking for processes of the given uid. This
is inherently racy with processes exiting, and we'd be better off using
a BPF filter to drop samples with the wrong uid: same effect, but no
racy /proc scan. I've seen the racy /proc scan cause termination
issues, so that is possibly what you are seeing.
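
To make the race concrete, here is a minimal sketch of a uid scan over
/proc. It is not the perf code: the function name pids_owned_by and the
proc_root parameter are made up for illustration, and it assumes that
ownership of the /proc/<pid> directory identifies the process's uid.

```python
import os

def pids_owned_by(uid, proc_root="/proc"):
    """Return the PIDs under proc_root whose directory is owned by uid.

    Hypothetical helper for illustration only; proc_root is a parameter
    so the scan can be pointed at a fake directory tree.
    """
    pids = []
    for name in os.listdir(proc_root):
        if not name.isdigit():
            continue  # skip non-process entries such as "meminfo"
        try:
            st = os.stat(os.path.join(proc_root, name))
        except (FileNotFoundError, ProcessLookupError):
            # The process exited between listdir() and stat().
            # This window is the race described above.
            continue
        if st.st_uid == uid:
            pids.append(int(name))
    return sorted(pids)
```

Even with the error handling, any PID in the returned list may already
have exited by the time the caller uses it, which is why filtering
samples by uid at the source avoids the problem rather than narrowing
the window.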

It could also be that tweaking the retry count will fix things:
https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/tests/task-exit.c?h=perf-tools-next#n134

Anyway, for now I think it is expedient to mark the test as exclusive.

> > But it's a C test so I assume there isn't an exclusive mechanism to skip
> > it? It doesn't look like it should be affected though, so maybe we could
> > leave it failing as a real bug.
> >
>
> Oh I see it says in the cover letter it can be set for C tests. But can
> that be done through all the existing TEST_CASE() etc macros?

Currently only whole suites can be exclusive. We could add macros for
exclusive C tests, but my preference would be to make the test work
non-exclusive. I'll make it possible for individual test cases to be
exclusive and mark this one.
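
As a rough sketch of the shared/exclusive idea from the patch
description, a harness could split tests on the "(exclusive)" tag in
their description. The function name split_by_exclusive and the sample
descriptions below are hypothetical, and this is not how the perf test
harness is implemented.

```python
def split_by_exclusive(descriptions):
    """Partition test descriptions on the "(exclusive)" tag.

    Returns (shared, exclusive). Hypothetical helper for illustration.
    """
    shared, exclusive = [], []
    for desc in descriptions:
        if "(exclusive)" in desc:
            exclusive.append(desc)
        else:
            shared.append(desc)
    return shared, exclusive


# Made-up descriptions, for illustration only.
shared, exclusive = split_by_exclusive([
    "sample parallel-safe test",
    "sample resource-heavy test (exclusive)",
])
```

The shared list could then be run concurrently, with the exclusive list
run one test at a time afterwards, as if each exclusive test held the
lock alone.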

Thanks,
Ian