Date:   Wed, 15 Feb 2023 14:30:36 +0100
From:   Petr Mladek <pmladek@...e.com>
To:     Josh Poimboeuf <jpoimboe@...nel.org>
Cc:     live-patching@...r.kernel.org, linux-kernel@...r.kernel.org,
        Seth Forshee <sforshee@...italocean.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Song Liu <song@...nel.org>,
        Mark Rutland <mark.rutland@....com>,
        Joe Lawrence <joe.lawrence@...hat.com>,
        Miroslav Benes <mbenes@...e.cz>,
        Jiri Kosina <jikos@...nel.org>, Ingo Molnar <mingo@...hat.com>
Subject: Re: [PATCH 2/3] livepatch,sched: Add livepatch task switching to
 cond_resched()

On Thu 2023-02-09 11:17:48, Josh Poimboeuf wrote:
> There have been reports [1][2] of live patches failing to complete
> within a reasonable amount of time due to CPU-bound kthreads.
> 
> Fix it by patching tasks in cond_resched().
> 
> There are four different flavors of cond_resched(), depending on the
> kernel configuration.  Hook into all of them.
> 
> A more elegant solution might be to use a preempt notifier.  However,
> non-ORC unwinders can't unwind a preempted task reliably.
> 
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -2085,20 +2090,25 @@ static __always_inline int _cond_resched(void)
>  	return dynamic_cond_resched();
>  }
>  
> -#else
> +#else /* !CONFIG_PREEMPTION */
>  
>  static inline int _cond_resched(void)
>  {
> +	klp_sched_try_switch();
>  	return __cond_resched();

My only concern is whether it might cause any performance problems.

On one hand, cond_resched() is used in code paths that are slow on
their own. Also klp_sched_try_switch() will do nothing most of the time.

On the other hand, cond_resched() is typically called in loops.
A single iteration might be fast; the code is slow only because
there are many iterations. Repeating the same failing check on
every iteration might prolong the run time significantly.
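
To make the concern concrete, here is an illustrative sketch of the
kind of CPU-bound kthread loop in question (not from the patch; the
worker function is made up):

#include <linux/kthread.h>
#include <linux/sched.h>

static int example_kthread_fn(void *data)
{
	while (!kthread_should_stop()) {
		/* hypothetical per-iteration work, cheap on its own */
		example_do_one_unit_of_work(data);

		/*
		 * Called once per iteration. With the patch this also
		 * runs klp_sched_try_switch(), so any overhead added
		 * here is multiplied by the iteration count.
		 */
		cond_resched();
	}
	return 0;
}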

An idea is to try the switch only when the task really got
rescheduled. Something like:

static inline int _cond_resched(void)
{
	int scheduled;

	scheduled = __cond_resched();
	/* Try the klp switch only after a real reschedule. */
	if (scheduled)
		klp_sched_try_switch();

	return scheduled;
}
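
(__cond_resched() returns 1 only when it really rescheduled, so the
switch would be attempted only after a real schedule.)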

But it would make it less reliable/predictable. Also it would not
work in configurations where cond_resched() is always a nop.

I am probably being too careful. We might keep it simple until any
real-life problems are reported.

> --- a/kernel/livepatch/transition.c
> +++ b/kernel/livepatch/transition.c
> @@ -76,6 +96,8 @@ static void klp_complete_transition(void)
>  		 klp_transition_patch->mod->name,
>  		 klp_target_state == KLP_PATCHED ? "patching" : "unpatching");
>  
> +	klp_cond_resched_disable();
> +

Nit: Strictly speaking, this is not needed when klp_complete_transition()
     is called from klp_cancel_transition(). In that case,
     klp_cond_resched_enable() has not been called, so the disable
     call might be moved into klp_try_complete_transition().


A more important thing, thinking out loud:

We need to make sure that no task is in the middle of
__klp_sched_try_switch() when we modify anything that is used there.

We seem to be on the safe side in klp_complete_transition(). We get
there only after all tasks have TIF_PATCH_PENDING cleared, in which
case __klp_sched_try_switch() just returns. Also it calls
klp_synchronize_transition() so that all tasks finish the critical
section in __klp_sched_try_switch() before any new transition starts.

But that is not the case in klp_reverse_transition(). It modifies
klp_target_state while __klp_sched_try_switch() might be in the
middle of klp_check_stack(), which might then give a wrong result.
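
Roughly the interleaving I have in mind (a simplified sketch, not
exact code):

	CPU0 (klp_reverse_transition)	CPU1 (__klp_sched_try_switch)

					/* sees TIF_PATCH_PENDING set */
					klp_check_stack(task, ...);
					/* still checking against the
					 * old klp_target_state */
	klp_target_state =
		!klp_target_state;
					/* the result of the stack check
					 * no longer matches the new
					 * target state */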

klp_reverse_transition() already solves a similar race with
klp_update_patch_state() by clearing all TIF_PATCH_PENDING flags
and calling klp_synchronize_transition(). We just need to do it
earlier. Something like:

--- a/kernel/livepatch/transition.c
+++ b/kernel/livepatch/transition.c
@@ -642,14 +642,11 @@ void klp_reverse_transition(void)
 		 klp_target_state == KLP_PATCHED ? "patching to unpatching" :
 						   "unpatching to patching");
 
-	klp_transition_patch->enabled = !klp_transition_patch->enabled;
-
-	klp_target_state = !klp_target_state;
 
 	/*
 	 * Clear all TIF_PATCH_PENDING flags to prevent races caused by
-	 * klp_update_patch_state() running in parallel with
-	 * klp_start_transition().
+	 * klp_update_patch_state() and __klp_sched_try_switch()
+	 * running in parallel.
 	 */
 	read_lock(&tasklist_lock);
 	for_each_process_thread(g, task)
@@ -659,9 +656,16 @@ void klp_reverse_transition(void)
 	for_each_possible_cpu(cpu)
 		clear_tsk_thread_flag(idle_task(cpu), TIF_PATCH_PENDING);
 
-	/* Let any remaining calls to klp_update_patch_state() complete */
+	/*
+	 * Make sure that both klp_update_patch_state() and
+	 * __klp_sched_try_switch() see the TIF_PATCH_PENDING cleared.
+	 */
 	klp_synchronize_transition();
 
+	klp_transition_patch->enabled = !klp_transition_patch->enabled;
+
+	klp_target_state = !klp_target_state;
+
 	klp_start_transition();
 }
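
The idea is that klp_synchronize_transition() then works as a barrier:
any task that entered __klp_sched_try_switch() while TIF_PATCH_PENDING
was still set is guaranteed to have left the critical section before
klp_target_state and the enabled flag get flipped.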
 


>  	if (klp_transition_patch->replace && klp_target_state == KLP_PATCHED) {
>  		klp_unpatch_replaced_patches(klp_transition_patch);
>  		klp_discard_nops(klp_transition_patch);


Otherwise, the patch looks fine. I looked at it primarily from
the livepatching side.

Best Regards,
Petr
