Message-ID: <CAP-5=fUmrnOUOBWcypD=Q7bCSQ3HTnicRXhr8nmSRqcbZv7Mmw@mail.gmail.com>
Date: Wed, 28 May 2025 11:02:44 -0700
From: Ian Rogers <irogers@...gle.com>
To: Namhyung Kim <namhyung@...nel.org>
Cc: Peter Zijlstra <peterz@...radead.org>, Ingo Molnar <mingo@...hat.com>,
Arnaldo Carvalho de Melo <acme@...nel.org>, Mark Rutland <mark.rutland@....com>,
Alexander Shishkin <alexander.shishkin@...ux.intel.com>, Jiri Olsa <jolsa@...nel.org>,
Adrian Hunter <adrian.hunter@...el.com>, Kan Liang <kan.liang@...ux.intel.com>,
Miguel Ojeda <ojeda@...nel.org>, Alex Gaynor <alex.gaynor@...il.com>,
Boqun Feng <boqun.feng@...il.com>, Gary Guo <gary@...yguo.net>,
Björn Roy Baron <bjorn3_gh@...tonmail.com>,
Benno Lossin <benno.lossin@...ton.me>, Andreas Hindborg <a.hindborg@...nel.org>,
Alice Ryhl <aliceryhl@...gle.com>, Trevor Gross <tmgross@...ch.edu>,
Danilo Krummrich <dakr@...nel.org>, Jiapeng Chong <jiapeng.chong@...ux.alibaba.com>,
James Clark <james.clark@...aro.org>, Howard Chu <howardchu95@...il.com>,
Weilin Wang <weilin.wang@...el.com>, Stephen Brennan <stephen.s.brennan@...cle.com>,
Andi Kleen <ak@...ux.intel.com>, Dmitry Vyukov <dvyukov@...gle.com>,
linux-perf-users@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 4/7] perf intel-tpebs: Avoid race when evlist is being deleted
On Wed, May 28, 2025 at 10:53 AM Namhyung Kim <namhyung@...nel.org> wrote:
>
> Hi Ian,
>
> On Tue, May 27, 2025 at 08:26:34PM -0700, Ian Rogers wrote:
> > Reading through the evsel->evlist may seg fault if a sample arrives
> > when the evlist is being deleted. Detect this case and ignore samples
> > arriving when the evlist is being deleted.
> >
> > Fixes: bcfab08db7fb ("perf intel-tpebs: Filter non-workload samples")
> > Signed-off-by: Ian Rogers <irogers@...gle.com>
> > ---
> > tools/perf/util/intel-tpebs.c | 12 ++++++++++--
> > 1 file changed, 10 insertions(+), 2 deletions(-)
> >
> > diff --git a/tools/perf/util/intel-tpebs.c b/tools/perf/util/intel-tpebs.c
> > index 4ad4bc118ea5..3b92ebf5c112 100644
> > --- a/tools/perf/util/intel-tpebs.c
> > +++ b/tools/perf/util/intel-tpebs.c
> > @@ -162,9 +162,17 @@ static bool is_child_pid(pid_t parent, pid_t child)
> >
> >  static bool should_ignore_sample(const struct perf_sample *sample, const struct tpebs_retire_lat *t)
> >  {
> > -	pid_t workload_pid = t->evsel->evlist->workload.pid;
> > -	pid_t sample_pid = sample->pid;
> > +	pid_t workload_pid, sample_pid = sample->pid;
> >
> > +	/*
> > +	 * During evlist__purge the evlist will be removed prior to the
> > +	 * evsel__exit calling evsel__tpebs_close and taking the
> > +	 * tpebs_mtx. Avoid a segfault by ignoring samples in this case.
> > +	 */
> > +	if (t->evsel->evlist == NULL)
> > +		return true;
> > +
> > +	workload_pid = t->evsel->evlist->workload.pid;
>
> I'm curious if there's a chance of TOCTOU race. It'd certainly help
> the segfault but would this code prevent it completely?
Good point. I think the race window is already small: I only see the
crash when running with sanitizers.

Thinking about the evlist problem: when a destructor (evlist__delete)
runs, it is generally assumed that the code is single threaded, and in
C++ clang's -Wthread-safety ignores destructors for this reason
(annoying imo, as it hides bugs). I don't see a good way to solve that
for the evlist and evsel in the TPEBS case without reference
counting. Adding reference counts to evlist and evsel would be doable,
as we could use reference count checking, but it would be a large and
invasive change. Wdyt?
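
To make the reference counting idea concrete, here is a rough sketch. It
is not existing perf code: every demo_* name is hypothetical, the lock
stands in for whatever would protect the evsel->evlist pointer, and the
is_child_pid() check is left out. The sampling thread takes a reference
under the lock, so the evlist cannot be freed between the NULL check and
the read of the workload pid:

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical stand-ins, not the real perf structures. */
struct demo_evlist {
	atomic_int refcnt;
	pid_t workload_pid;
};

struct demo_evsel {
	pthread_mutex_t lock;		/* protects the evlist pointer */
	struct demo_evlist *evlist;	/* NULL once deletion has started */
};

static struct demo_evlist *demo_evlist__new(pid_t workload_pid)
{
	struct demo_evlist *evlist = malloc(sizeof(*evlist));

	if (evlist) {
		atomic_init(&evlist->refcnt, 1);	/* reference owned by the evsel */
		evlist->workload_pid = workload_pid;
	}
	return evlist;
}

/* Drop one reference; the last put frees the evlist. */
static void demo_evlist__put(struct demo_evlist *evlist)
{
	int old;

	if (!evlist)
		return;
	old = atomic_fetch_sub(&evlist->refcnt, 1);
	assert(old > 0);
	if (old == 1)
		free(evlist);
}

/* Check and take a reference in one critical section (no TOCTOU window). */
static struct demo_evlist *demo_evsel__get_evlist(struct demo_evsel *evsel)
{
	struct demo_evlist *evlist;

	pthread_mutex_lock(&evsel->lock);
	evlist = evsel->evlist;
	if (evlist)
		atomic_fetch_add(&evlist->refcnt, 1);
	pthread_mutex_unlock(&evsel->lock);
	return evlist;
}

/* Deletion path: detach the evlist first, then drop the evsel's reference. */
static void demo_evsel__detach(struct demo_evsel *evsel)
{
	struct demo_evlist *evlist;

	pthread_mutex_lock(&evsel->lock);
	evlist = evsel->evlist;
	evsel->evlist = NULL;
	pthread_mutex_unlock(&evsel->lock);
	demo_evlist__put(evlist);
}

/* Simplified filter: ignore samples once detached or from another pid. */
static bool demo_should_ignore_sample(struct demo_evsel *evsel, pid_t sample_pid)
{
	struct demo_evlist *evlist = demo_evsel__get_evlist(evsel);
	bool ignore;

	if (!evlist)
		return true;
	ignore = evlist->workload_pid >= 0 && evlist->workload_pid != sample_pid;
	demo_evlist__put(evlist);
	return ignore;
}
```

Even in this toy form, every reader of evsel->evlist has to go through a
get/put pair, which is why doing the same for the real evlist and evsel
would touch a lot of code.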
Thanks,
Ian
> Thanks,
> Namhyung
>
>
> >  	if (workload_pid < 0 || workload_pid == sample_pid)
> >  		return false;
> >
> > --
> > 2.49.0.1238.gf8c92423fb-goog
> >