lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <aEAau+v4qBQSt13s@e129823.arm.com>
Date: Wed, 4 Jun 2025 11:06:51 +0100
From: Yeoreum Yun <yeoreum.yun@....com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Leo Yan <leo.yan@....com>, mingo@...hat.com, mingo@...nel.org,
	acme@...nel.org, namhyung@...nel.org, mark.rutland@....com,
	alexander.shishkin@...ux.intel.com, jolsa@...nel.org,
	irogers@...gle.com, adrian.hunter@...el.com,
	kan.liang@...ux.intel.com, linux-perf-users@...r.kernel.org,
	linux-kernel@...r.kernel.org, David Wang <00107082@....com>
Subject: Re: [PATCH 1/1] perf/core: fix dangling cgroup pointer in cpuctx

Hi Peter,

> On Tue, Jun 03, 2025 at 04:44:14PM +0200, Peter Zijlstra wrote:
> > On Tue, Jun 03, 2025 at 03:00:40PM +0100, Leo Yan wrote:
> >
> > > > +	if (event->state > PERF_EVENT_STATE_OFF)
> > > > +		perf_cgroup_event_disable(event, ctx);
> > > > +
> > >
> > > As we discussed, seems to me, the issue is caused by an ambigous state
> > > machine transition:
> > >
> > > When a PMU event state is PERF_EVENT_STATE_EXIT, the current code does
> > > not transite the state to PERF_EVENT_STATE_OFF. As a result, the
> > > list_del_event() function skips to clean up cgroup pointer for non OFF
> > > states. This is different from the code prior to the commit a3c3c6667,
> > > which transits states EXIT -> INACTIVE -> OFF.
> >
> > Right.
> >
> > > My suggestion is not reliable. Roughly read code, except for the
> > > PERF_EVENT_STATE_EXIT case, I think other error cases should also clean
> > > up the cgroup pointer.  The reason is I don't see other places to
> > > clean up the cgroup pointer for these error cases:
> > >
> > >   PERF_EVENT_STATE_REVOKED
> > >   PERF_EVENT_STATE_DEAD
> >
> > Those should be done here; on the first transition into these states.
> >
> > > Only in the PERF_EVENT_STATE_ERROR state, we don't need to cleanup
> > > cgroup as this has already been handled in merge_sched_in().
> > >
> > > So a correct condition would be:
> > >
> > >     if (event->state > PERF_EVENT_STATE_OFF ||
> > >         event->state <= PERF_EVENT_STATE_EXIT)
> > >         perf_cgroup_event_disable(event, ctx);
> >
> > I'm too tired to get my head straight. I'll look tomorrow.
>
>
> Right; had a sleep. Lets do this :-)
>
>
> So the normal states are:
>
> 	    ACTIVE    ---.
> 	       ^         |
> 	       |         |
>        sched_{in,out}()  |
> 	       |         |
> 	       v         |
>       ,---> INACTIVE  <--+
>       |                  |
>       |            {dis,en}able()
>   sched_in()             |
>       |       OFF     <--+
>       |                  |
>       `--->  ERROR    ---'
>
> That is:
>
> sched_in:	INACTIVE          -> {ACTIVE,ERROR}
> sched_out:	ACTIVE            -> INACTIVE
> disable:	{ACTIVE,INACTIVE} -> OFF
> enable:         {OFF,ERROR}       -> INACTIVE
>
> Where OFF/ERROR are 'disable' and have this perf_cgroup_event_disable()
> called.
>
> Then we have {EXIT,REVOKED,DEAD} states which are various shades of
> defunct events:
>
>  - EXIT means task that the even was assigned to died, but child events
>    still live, and further children can still be created. But the event
>    itself will never be active again. It can only transition to
>    {REVOKED,DEAD};

I have a slight quetions. after parent event set EXIT,
Does EXIT event should be inherited?

for example

   parent task(0, ...) -> parent_event(0, parent_event:NULL)
     ` child_task(1, parent:0) -> child_event(1, parent_event:0)
         ` child_task(2, parent:1) -> child_event(2, parent_event:0)

In this case when parent task(0) is exited,
parent->event will be set as EXIT state.

But suppose the child_task(2) try to fork (child_task3) and
inherit the event (create child_event(3, parent_event:0),
and at the fork, forking can observe the parent event state as "EXIT".
In thie situation why child_event(3, parent_event:0) should be created for
child_task(3)?

>  - REVOKED means the PMU the event was associated with is gone; all
>    functionality is stopped but the event is still alive. Can only
>    transition to DEAD;
>
>  - DEAD event really is DYING tearing down state and freeing bits.
>
> And now we have the sitation that __perf_remove_from_context() can do:
>
>   {ACTIVE,INACTIVE,OFF,ERROR} -> {OFF,EXIT,REVOKED,DEAD}
>
> Where the {OFF,ERROR} -> * transition already have
> perf_cgroup_event_disable(), but the {ACTIVE,INACTIVE} -> * part has
> not.
>
> The condition:
>
>   if (event->state > PERF_EVENT_STATE_OFF)
>
> captures exactly that {ACTIVE,INACTIVE} that still needs the cgroup
> disable. Every other state is already a disabled state.
>
>
> Agreed?
>
> Also, I should probably stick most of all this in a comment somewhere.

--
Sincerely,
Yeoreum Yun

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ