Message-ID: <bc485622-2e69-46e6-b95f-c1b4868e8d53@paulmck-laptop>
Date: Fri, 19 Jan 2024 06:37:26 -0800
From: "Paul E. McKenney" <paulmck@...nel.org>
To: Chen Zhongjin <chenzhongjin@...wei.com>
Cc: linux-kernel@...r.kernel.org, linux-trace-kernel@...r.kernel.org,
yangjihong1@...wei.com, naveen.n.rao@...ux.ibm.com,
anil.s.keshavamurthy@...el.com, davem@...emloft.net,
mhiramat@...nel.org, akpm@...ux-foundation.org, tglx@...utronix.de,
peterz@...radead.org, pmladek@...e.com, dianders@...omium.org,
npiggin@...il.com, mpe@...erman.id.au, jkl820.git@...il.com,
juerg.haefliger@...onical.com, rick.p.edgecombe@...el.com,
eric.devolder@...cle.com, mic@...ikod.net
Subject: Re: [PATCH v2] kprobes: Use synchronize_rcu_tasks_rude in
kprobe_optimizer
On Thu, Jan 18, 2024 at 02:18:42AM +0000, Chen Zhongjin wrote:
> There is a deadlock scenario in kprobe_optimizer():
>
> pid A                           pid B                   pid C
> kprobe_optimizer()              do_exit()               perf_kprobe_init()
> mutex_lock(&kprobe_mutex)       exit_tasks_rcu_start()  mutex_lock(&kprobe_mutex)
> synchronize_rcu_tasks()         zap_pid_ns_processes()  // waiting kprobe_mutex
> // waiting tasks_rcu_exit_srcu  kernel_wait4()
>                                 // waiting pid C exit
>
> To avoid this deadlock loop, use synchronize_rcu_tasks_rude() in kprobe_optimizer()
> rather than synchronize_rcu_tasks(). synchronize_rcu_tasks_rude() can also promise
> that all preempted tasks have scheduled, but it will not wait tasks_rcu_exit_srcu.
>
> Fixes: a30b85df7d59 ("kprobes: Use synchronize_rcu_tasks() for optprobe with CONFIG_PREEMPT=y")
> Signed-off-by: Chen Zhongjin <chenzhongjin@...wei.com>
Just so you know, your email ends up in gmail's spam folder. :-/
> ---
> v1 -> v2: Add Fixes tag
> ---
> arch/Kconfig | 2 +-
> kernel/kprobes.c | 2 +-
> 2 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/arch/Kconfig b/arch/Kconfig
> index f4b210ab0612..dc6a18854017 100644
> --- a/arch/Kconfig
> +++ b/arch/Kconfig
> @@ -104,7 +104,7 @@ config STATIC_CALL_SELFTEST
> config OPTPROBES
> def_bool y
> depends on KPROBES && HAVE_OPTPROBES
> - select TASKS_RCU if PREEMPTION
> + select TASKS_RUDE_RCU
>
> config KPROBES_ON_FTRACE
> def_bool y
> diff --git a/kernel/kprobes.c b/kernel/kprobes.c
> index d5a0ee40bf66..09056ae50c58 100644
> --- a/kernel/kprobes.c
> +++ b/kernel/kprobes.c
> @@ -623,7 +623,7 @@ static void kprobe_optimizer(struct work_struct *work)
> * Note that on non-preemptive kernel, this is transparently converted
> * to synchronoze_sched() to wait for all interrupts to have completed.
> */
> - synchronize_rcu_tasks();
> + synchronize_rcu_tasks_rude();
Again, that comment reads in full as follows:
	/*
	 * Step 2: Wait for quiesence period to ensure all potentially
	 * preempted tasks to have normally scheduled. Because optprobe
	 * may modify multiple instructions, there is a chance that Nth
	 * instruction is preempted. In that case, such tasks can return
	 * to 2nd-Nth byte of jump instruction. This wait is for avoiding it.
	 * Note that on non-preemptive kernel, this is transparently converted
	 * to synchronoze_sched() to wait for all interrupts to have completed.
	 */
Please note well that first sentence.
Unless that first sentence no longer holds, this patch cannot work
because synchronize_rcu_tasks_rude() will not (repeat, NOT) wait for
preempted tasks.
So how to safely break this deadlock? Reproducing Chen Zhongjin's
diagram:
pid A                           pid B                   pid C
kprobe_optimizer()              do_exit()               perf_kprobe_init()
mutex_lock(&kprobe_mutex)       exit_tasks_rcu_start()  mutex_lock(&kprobe_mutex)
synchronize_rcu_tasks()         zap_pid_ns_processes()  // waiting kprobe_mutex
// waiting tasks_rcu_exit_srcu  kernel_wait4()
                                // waiting pid C exit
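For anyone following along, the cycle in that diagram can be written down
as a wait-for graph. What follows is only a user-space sketch of the
reasoning, not kernel code: the enum, the array, and task_is_deadlocked()
are hypothetical names made up for illustration, and the model assumes
each task blocks on at most one other task, as in the diagram.

```c
#include <stdbool.h>

/* Hypothetical labels for the three tasks in the diagram. */
enum task { PID_A, PID_B, PID_C, NR_TASKS };

/*
 * waits_for[t] is the task that t is blocked on, or -1 if t can run.
 * Illustrative model only; these edges are read off the diagram.
 */
static const int deadlock_waits_for[NR_TASKS] = {
	/* A: synchronize_rcu_tasks() waits on tasks_rcu_exit_srcu,
	 * which B entered via exit_tasks_rcu_start(). */
	[PID_A] = PID_B,
	/* B: kernel_wait4() waits for pid C to exit. */
	[PID_B] = PID_C,
	/* C: mutex_lock(&kprobe_mutex) waits for A to release it. */
	[PID_C] = PID_A,
};

/*
 * Follow the wait-for edges from 'start'.  Return true if the chain
 * leads back to 'start', meaning that 'start' is part of a cycle.
 */
static bool task_is_deadlocked(const int *waits_for, int start)
{
	int cur = waits_for[start];
	int steps;

	/* At most NR_TASKS hops are needed to come back around. */
	for (steps = 0; steps < NR_TASKS && cur >= 0; steps++) {
		if (cur == start)
			return true;
		cur = waits_for[cur];
	}
	return false;
}
```

With the edges above, task_is_deadlocked() returns true for all three
tasks. Setting the PID_A entry to -1, which models a grace period that
does not wait on the voluntarily blocked pid B, makes it return false
for all three.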
We need to stop synchronize_rcu_tasks() from waiting on tasks like
pid B that are voluntarily blocked. One way to do that is to replace
SRCU with a set of per-CPU lists. Then exit_tasks_rcu_start() adds the
current task to this list and does ...
OK, this is getting a bit involved. If you would like to follow along,
please feel free to look here:
https://docs.google.com/document/d/1MEHHs5qbbZBzhN8dGP17pt-d87WptFJ2ZQcqS221d9I/edit?usp=sharing
Thanx, Paul
> /* Step 3: Optimize kprobes after quiesence period */
> do_optimize_kprobes();
> --
> 2.25.1
>