lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ZhU2YwettB6i6AMp@localhost.localdomain>
Date: Tue, 9 Apr 2024 14:36:51 +0200
From: Frederic Weisbecker <frederic@...nel.org>
To: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
Cc: linux-perf-users@...r.kernel.org, linux-kernel@...r.kernel.org,
	Adrian Hunter <adrian.hunter@...el.com>,
	Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
	Arnaldo Carvalho de Melo <acme@...nel.org>,
	Ian Rogers <irogers@...gle.com>, Ingo Molnar <mingo@...hat.com>,
	Jiri Olsa <jolsa@...nel.org>, Marco Elver <elver@...gle.com>,
	Mark Rutland <mark.rutland@....com>,
	Namhyung Kim <namhyung@...nel.org>,
	Peter Zijlstra <peterz@...radead.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Arnaldo Carvalho de Melo <acme@...hat.com>
Subject: Re: [PATCH v3 2/4] perf: Enqueue SIGTRAP always via task_work.

Le Tue, Apr 09, 2024 at 10:57:32AM +0200, Sebastian Andrzej Siewior a écrit :
> > >  static void
> > >  perf_event_exit_event(struct perf_event *event, struct perf_event_context *ctx)
> > >  {
> > > @@ -13088,6 +13084,18 @@ perf_event_exit_event(struct perf_event *event, struct perf_event_context *ctx)
> > >  		 * Kick perf_poll() for is_event_hup();
> > >  		 */
> > >  		perf_event_wakeup(parent_event);
> > > +		/*
> > > +		 * Cancel pending task_work and update counters if it has not
> > > +		 * yet been delivered to userland. free_event() expects the
> > > +		 * reference counter at one and keeping the event around until
> > > +		 * the task returns to userland can be a unexpected if there is
> > > +		 * no signal handler registered.
> > > +		 */
> > > +		if (event->pending_work &&
> > > +		    task_work_cancel_match(current, task_work_cb_match, event)) {
> > > +			put_event(event);
> > > +			local_dec(&event->ctx->nr_pending);
> > > +		}
> > 
> > So exiting task, privileged exec and also exit on exec call into this before
> > releasing the children.
> > 
> > And parents rely on put_event() from file close + the task work.
> > 
> > But what about remote release of children on file close?
> > See perf_event_release_kernel() directly calling free_event() on them.
> 
> Interesting things you are presenting. I had events popping up at random
> even after the task decided that it won't go back to userland to handle
> it so letting it free looked like the only option…
> 
> > One possible fix is to avoid the reference count game around task work
> > and flush them on free_event().
> > 
> > See here:
> > 
> > https://lore.kernel.org/all/202403310406.TPrIela8-lkp@intel.com/T/#m63c28147d8ac06b21c64d7784d49f892e06c0e50
> 
> That wake_up() within preempt_disable() section breaks on RT.

Ah, but the wake-up still wants to go inside recursion protection somehow or
it could generate task_work loop again due to tracepoint events...

Although... the wake up occurs only when the event is dead after all...

> How do we go on from here?

I'd tend to think you need my patchset first because the problems it
fixes were not easily visible as long as there was an irq work to take
care of things most of the time. But once you rely on task_work only then
these become a real problem. Especially the sync against perf_release().

Thanks.

> 
> > Thanks.
> 
> Sebastian

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ