[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <aEAau+v4qBQSt13s@e129823.arm.com>
Date: Wed, 4 Jun 2025 11:06:51 +0100
From: Yeoreum Yun <yeoreum.yun@....com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Leo Yan <leo.yan@....com>, mingo@...hat.com, mingo@...nel.org,
acme@...nel.org, namhyung@...nel.org, mark.rutland@....com,
alexander.shishkin@...ux.intel.com, jolsa@...nel.org,
irogers@...gle.com, adrian.hunter@...el.com,
kan.liang@...ux.intel.com, linux-perf-users@...r.kernel.org,
linux-kernel@...r.kernel.org, David Wang <00107082@....com>
Subject: Re: [PATCH 1/1] perf/core: fix dangling cgroup pointer in cpuctx
Hi Peter,
> On Tue, Jun 03, 2025 at 04:44:14PM +0200, Peter Zijlstra wrote:
> > On Tue, Jun 03, 2025 at 03:00:40PM +0100, Leo Yan wrote:
> >
> > > > + if (event->state > PERF_EVENT_STATE_OFF)
> > > > + perf_cgroup_event_disable(event, ctx);
> > > > +
> > >
> > > As we discussed, seems to me, the issue is caused by an ambigous state
> > > machine transition:
> > >
> > > When a PMU event state is PERF_EVENT_STATE_EXIT, the current code does
> > > not transite the state to PERF_EVENT_STATE_OFF. As a result, the
> > > list_del_event() function skips to clean up cgroup pointer for non OFF
> > > states. This is different from the code prior to the commit a3c3c6667,
> > > which transits states EXIT -> INACTIVE -> OFF.
> >
> > Right.
> >
> > > My suggestion is not reliable. Roughly read code, except for the
> > > PERF_EVENT_STATE_EXIT case, I think other error cases should also clean
> > > up the cgroup pointer. The reason is I don't see other places to
> > > clean up the cgroup pointer for these error cases:
> > >
> > > PERF_EVENT_STATE_REVOKED
> > > PERF_EVENT_STATE_DEAD
> >
> > Those should be done here; on the first transition into these states.
> >
> > > Only in the PERF_EVENT_STATE_ERROR state, we don't need to cleanup
> > > cgroup as this has already been handled in merge_sched_in().
> > >
> > > So a correct condition would be:
> > >
> > > if (event->state > PERF_EVENT_STATE_OFF ||
> > > event->state <= PERF_EVENT_STATE_EXIT)
> > > perf_cgroup_event_disable(event, ctx);
> >
> > I'm too tired to get my head straight. I'll look tomorrow.
>
>
> Right; had a sleep. Lets do this :-)
>
>
> So the normal states are:
>
> ACTIVE ---.
> ^ |
> | |
> sched_{in,out}() |
> | |
> v |
> ,---> INACTIVE <--+
> | |
> | {dis,en}able()
> sched_in() |
> | OFF <--+
> | |
> `---> ERROR ---'
>
> That is:
>
> sched_in: INACTIVE -> {ACTIVE,ERROR}
> sched_out: ACTIVE -> INACTIVE
> disable: {ACTIVE,INACTIVE} -> OFF
> enable: {OFF,ERROR} -> INACTIVE
>
> Where OFF/ERROR are 'disable' and have this perf_cgroup_event_disable()
> called.
>
> Then we have {EXIT,REVOKED,DEAD} states which are various shades of
> defunct events:
>
> - EXIT means task that the even was assigned to died, but child events
> still live, and further children can still be created. But the event
> itself will never be active again. It can only transition to
> {REVOKED,DEAD};
I have a slight quetions. after parent event set EXIT,
Does EXIT event should be inherited?
for example
parent task(0, ...) -> parent_event(0, parent_event:NULL)
` child_task(1, parent:0) -> child_event(1, parent_event:0)
` child_task(2, parent:1) -> child_event(2, parent_event:0)
In this case when parent task(0) is exited,
parent->event will be set as EXIT state.
But suppose the child_task(2) try to fork (child_task3) and
inherit the event (create child_event(3, parent_event:0),
and at the fork, forking can observe the parent event state as "EXIT".
In thie situation why child_event(3, parent_event:0) should be created for
child_task(3)?
> - REVOKED means the PMU the event was associated with is gone; all
> functionality is stopped but the event is still alive. Can only
> transition to DEAD;
>
> - DEAD event really is DYING tearing down state and freeing bits.
>
> And now we have the sitation that __perf_remove_from_context() can do:
>
> {ACTIVE,INACTIVE,OFF,ERROR} -> {OFF,EXIT,REVOKED,DEAD}
>
> Where the {OFF,ERROR} -> * transition already have
> perf_cgroup_event_disable(), but the {ACTIVE,INACTIVE} -> * part has
> not.
>
> The condition:
>
> if (event->state > PERF_EVENT_STATE_OFF)
>
> captures exactly that {ACTIVE,INACTIVE} that still needs the cgroup
> disable. Every other state is already a disabled state.
>
>
> Agreed?
>
> Also, I should probably stick most of all this in a comment somewhere.
--
Sincerely,
Yeoreum Yun
Powered by blists - more mailing lists