Message-ID: <Y0awHa8oS5yal5M9@hirez.programming.kicks-ass.net>
Date: Wed, 12 Oct 2022 14:16:29 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: Ravi Bangoria <ravi.bangoria@....com>
Cc: acme@...nel.org, alexander.shishkin@...ux.intel.com,
    jolsa@...hat.com, namhyung@...nel.org, songliubraving@...com,
    eranian@...gle.com, ak@...ux.intel.com, mark.rutland@....com,
    frederic@...nel.org, maddy@...ux.ibm.com, irogers@...gle.com,
    will@...nel.org, robh@...nel.org, mingo@...hat.com,
    catalin.marinas@....com, ndesaulniers@...gle.com,
    srw@...dewatkins.net, linux-arm-kernel@...ts.infradead.org,
    linux-perf-users@...r.kernel.org, linuxppc-dev@...ts.ozlabs.org,
    linux-s390@...r.kernel.org, linux-kernel@...r.kernel.org,
    sandipan.das@....com, ananth.narayan@....com, kim.phillips@....com,
    santosh.shukla@....com
Subject: Re: [PATCH v2] perf: Rewrite core context handling

On Wed, Oct 12, 2022 at 02:09:00PM +0530, Ravi Bangoria wrote:
> > @@ -3366,6 +3370,14 @@ static void perf_event_sync_stat(struct
> > }
> > }
> >
> > +#define list_for_each_entry_double(pos1, pos2, head1, head2, member) \
> > +        for (pos1 = list_first_entry(head1, typeof(*pos1), member), \
> > +             pos2 = list_first_entry(head2, typeof(*pos2), member); \
> > +             !list_entry_is_head(pos1, head1, member) && \
> > +             !list_entry_is_head(pos2, head2, member); \
> > +             pos1 = list_next_entry(pos1, member), \
> > +             pos2 = list_next_entry(pos2, member))
> > +
> > static void perf_event_swap_task_ctx_data(struct perf_event_context *prev_ctx,
> > struct perf_event_context *next_ctx)
> > {
> > @@ -3374,16 +3386,9 @@ static void perf_event_swap_task_ctx_dat
> > if (!prev_ctx->nr_task_data)
> > return;
> >
> > - prev_epc = list_first_entry(&prev_ctx->pmu_ctx_list,
> > - struct perf_event_pmu_context,
> > - pmu_ctx_entry);
> > - next_epc = list_first_entry(&next_ctx->pmu_ctx_list,
> > - struct perf_event_pmu_context,
> > - pmu_ctx_entry);
> > -
> > - while (&prev_epc->pmu_ctx_entry != &prev_ctx->pmu_ctx_list &&
> > - &next_epc->pmu_ctx_entry != &next_ctx->pmu_ctx_list) {
> > -
> > + list_for_each_entry_double(prev_epc, next_epc,
> > + &prev_ctx->pmu_ctx_list, &next_ctx->pmu_ctx_list,
> > + pmu_ctx_entry) {
>
> There are more places which can use list_for_each_entry_double().
> I'll fix those.

I've gone and renamed it to double_list_for_each_entry(), but yeah, I
didn't look too hard for other users.

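For readers following the thread, the lockstep iteration that the macro above performs can be sketched in userspace C. This is only an illustration under stated assumptions: the list helpers below are simplified stand-ins for the kernel's <linux/list.h>, `struct item`, `list_init()` and `swap_paired_data()` are hypothetical names rather than anything from the patch, and `__typeof__` is a GCC/Clang extension.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace stand-ins for the kernel's <linux/list.h> helpers. */
struct list_head {
	struct list_head *next, *prev;
};

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

#define list_first_entry(head, type, member) \
	container_of((head)->next, type, member)

#define list_next_entry(pos, member) \
	container_of((pos)->member.next, __typeof__(*(pos)), member)

#define list_entry_is_head(pos, head, member) \
	(&(pos)->member == (head))

/* Walk two lists in lockstep; stop as soon as either one is exhausted. */
#define double_list_for_each_entry(pos1, pos2, head1, head2, member)	\
	for (pos1 = list_first_entry(head1, __typeof__(*pos1), member),	\
	     pos2 = list_first_entry(head2, __typeof__(*pos2), member);	\
	     !list_entry_is_head(pos1, head1, member) &&			\
	     !list_entry_is_head(pos2, head2, member);			\
	     pos1 = list_next_entry(pos1, member),				\
	     pos2 = list_next_entry(pos2, member))

/* Hypothetical helper: make @head an empty circular list. */
static void list_init(struct list_head *head)
{
	head->next = head->prev = head;
}

/* Append @node at the tail of the list anchored at @head. */
static void list_add_tail(struct list_head *node, struct list_head *head)
{
	assert(node && head);
	node->prev = head->prev;
	node->next = head;
	head->prev->next = node;
	head->prev = node;
}

/* Hypothetical stand-in for struct perf_event_pmu_context. */
struct item {
	int data;
	struct list_head entry;
};

/* Swap .data between paired entries; return the number of pairs visited. */
static int swap_paired_data(struct list_head *a, struct list_head *b)
{
	struct item *pa, *pb;
	int pairs = 0;

	double_list_for_each_entry(pa, pb, a, b, entry) {
		int tmp = pa->data;

		pa->data = pb->data;
		pb->data = tmp;
		pairs++;
	}
	return pairs;
}
```

Because the loop condition checks both cursors, lists of unequal length are handled without special cases: only the entries that have a partner in the other list are visited, and the surplus entries of the longer list are left untouched.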
> > @@ -4859,7 +4879,14 @@ static void put_pmu_ctx(struct perf_even
> > if (epc->ctx) {
> > struct perf_event_context *ctx = epc->ctx;
> >
> > - // XXX ctx->mutex
> > + /*
> > + * XXX
> > + *
> > + * lockdep_assert_held(&ctx->mutex);
> > + *
> > + * can't because of the call-site in _free_event()/put_event()
> > + * which isn't always called under ctx->mutex.
> > + */
>
> Yes. I came across the same and could not figure out how to solve
> this. So I just kept the XXX as is.

Yeah, I can sorta fix it, but it's ugly so there we are.

> >
> > WARN_ON_ONCE(list_empty(&epc->pmu_ctx_entry));
> > raw_spin_lock_irqsave(&ctx->lock, flags);
> > @@ -12657,6 +12675,13 @@ perf_event_create_kernel_counter(struct
> > goto err_unlock;
> > }
> >
> > + pmu_ctx = find_get_pmu_context(pmu, ctx, event);
> > + if (IS_ERR(pmu_ctx)) {
> > + err = PTR_ERR(pmu_ctx);
> > + goto err_unlock;
> > + }
> > + event->pmu_ctx = pmu_ctx;
>
> We should call find_get_pmu_context() with ctx->mutex held and thus
> above perf_event_create_kernel_counter() change. Is my understanding
> correct?

That's the intent yeah. But due to not always holding ctx->mutex over
put_pmu_ctx() this might be moot. I'm almost through auditing epc usage
and I think ctx->lock is sufficient, fingers crossed.

> > +
> > if (!task) {
> > /*
> > * Check if the @cpu we're creating an event for is online.
> > @@ -12998,7 +13022,7 @@ void perf_event_free_task(struct task_st
> > struct perf_event_context *ctx;
> > struct perf_event *event, *tmp;
> >
> > - ctx = rcu_dereference(task->perf_event_ctxp);
> > + ctx = rcu_access_pointer(task->perf_event_ctxp);
>
> We dereference ctx pointer but with mutex and lock held. And thus
> rcu_access_pointer() is sufficient. Is my understanding correct?

We do not in fact hold ctx->lock here IIRC; but this is a NULL test, if
it is !NULL we know we have a reference on it and are good.