Message-ID: <Z1F6_cC4bRvcN56T@pavilion.home>
Date: Thu, 5 Dec 2024 11:05:49 +0100
From: Frederic Weisbecker <frederic@...nel.org>
To: Oleg Nesterov <oleg@...hat.com>
Cc: Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
"Lai, Yi" <yi1.lai@...ux.intel.com>,
linux-perf-users@...r.kernel.org, linux-kernel@...r.kernel.org,
Adrian Hunter <adrian.hunter@...el.com>,
Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
Arnaldo Carvalho de Melo <acme@...nel.org>,
Daniel Bristot de Oliveira <bristot@...nel.org>,
Ian Rogers <irogers@...gle.com>, Ingo Molnar <mingo@...hat.com>,
Jiri Olsa <jolsa@...nel.org>, Kan Liang <kan.liang@...ux.intel.com>,
Marco Elver <elver@...gle.com>, Mark Rutland <mark.rutland@....com>,
Namhyung Kim <namhyung@...nel.org>,
Peter Zijlstra <peterz@...radead.org>,
Thomas Gleixner <tglx@...utronix.de>,
Arnaldo Carvalho de Melo <acme@...hat.com>, yi1.lai@...el.com,
syzkaller-bugs@...glegroups.com
Subject: Re: [PATCH v4 2/6] perf: Enqueue SIGTRAP always via task_work.
On Thu, Dec 05, 2024 at 10:20:16AM +0100, Oleg Nesterov wrote:
> > Looking at task_work, it seems that most enqueues happen to the current task.
> > AFAICT, only io_uring() does remote enqueue. Would it make sense to have a light
> > version of task_work that is only ever used by current? This would be a very
> > simple flavour with easy queue and cancellation without locking/atomics/RmW
> > operations.
>
> Perhaps, but we also need to avoid the races with task_work_cancel() from
> another task. I mean, if a task T does task_work_add_light(work), it can race
> with task_work_cancel(T, ...) which can change T->task_works on another CPU.
I was thinking about two different lists.
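For illustration, a minimal sketch of what such a current-only flavour could
look like (field and function names are hypothetical, the notification side is
left out, and no locking/atomics/RmW is needed because only current ever
touches the new list):

/*
 * Minimal sketch only, not tested. A second list that is only ever
 * touched by the owning task, so add/cancel need no locking, atomics
 * or RmW: a remote task_work_cancel() only walks ->task_works and
 * never sees this list. ->task_works_local would be a new (hypothetical)
 * field in struct task_struct.
 */
#include <linux/sched.h>
#include <linux/task_work.h>

static void task_work_add_local(struct callback_head *work)
{
	/* Plain stores: only current reads or writes this list. */
	work->next = current->task_works_local;
	current->task_works_local = work;
	/* Notification (e.g. TWA_RESUME style) left out of the sketch. */
}

static bool task_work_cancel_local(struct callback_head *work)
{
	struct callback_head **pprev = &current->task_works_local;

	/* Unlink @work if it is still pending; no concurrency to worry about. */
	for (; *pprev; pprev = &(*pprev)->next) {
		if (*pprev == work) {
			*pprev = work->next;
			return true;
		}
	}
	return false;
}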
>
>
> OK, I can't suggest a good solution for this problem, so I won't argue with
> the patch from Sebastian. Unfortunately, with this change we can never restore
> the fifo ordering. And note that the current ordering is not lifo, it is
> "unpredictable". Plus the extra lock/xchg for each work...
Good point, it's unpredictable due to the extraction before execution.
Another alternative is to maintain another head that points to the
head of the executing list. This way we can have task_work_cancel_current()
that completely cancels the work. That was my initial proposal here and it
avoids the lock/xchg for each work:
https://lore.kernel.org/all/Zx-B0wK3xqRQsCOS@localhost.localdomain/
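To illustrate the idea (this is only a rough sketch, not what the patch above
literally does: the names are made up and synchronization against remote
task_work_add()/task_work_cancel() is deliberately elided):

/*
 * Rough illustration only. task_work_run() would publish the batch it
 * has just detached through a second head (here ->task_works_run, a
 * hypothetical field), so a cancel issued by current can still unlink
 * a work that has been extracted but not yet executed.
 */
static bool task_work_cancel_current(struct callback_head *work)
{
	struct callback_head **pprev;

	/* Queued, not yet extracted by task_work_run(). */
	for (pprev = &current->task_works; *pprev; pprev = &(*pprev)->next) {
		if (*pprev == work) {
			*pprev = work->next;
			return true;
		}
	}

	/* Extracted by task_work_run(), not yet executed. */
	for (pprev = &current->task_works_run; *pprev; pprev = &(*pprev)->next) {
		if (*pprev == work) {
			*pprev = work->next;
			return true;
		}
	}
	return false;
}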
>
> Hmm. I just noticed that task_work_run() needs a simple fix:
>
> --- x/kernel/task_work.c
> +++ x/kernel/task_work.c
> @@ -235,7 +235,7 @@
> raw_spin_unlock_irq(&task->pi_lock);
>
> do {
> - next = work->next;
> + next = READ_ONCE(work->next);
> work->func(work);
> work = next;
> cond_resched();
>
> Perhaps it makes sense before the patch from Sebastian even if that patch
> removes this do/while loop ?
Hmm, can work->next be modified concurrently here?
Thanks.
>
> Oleg.
>