linux-kernel - Re: [PATCH v3 13/15] livepatch: change to a per-task consistency model

Message-ID: <20161221212505.dbxeddu2skmjmwiq@treble>
Date:   Wed, 21 Dec 2016 15:25:05 -0600
From:   Josh Poimboeuf <jpoimboe@...hat.com>
To:     Petr Mladek <pmladek@...e.com>
Cc:     Jessica Yu <jeyu@...hat.com>, Jiri Kosina <jikos@...nel.org>,
        Miroslav Benes <mbenes@...e.cz>, linux-kernel@...r.kernel.org,
        live-patching@...r.kernel.org,
        Michael Ellerman <mpe@...erman.id.au>,
        Heiko Carstens <heiko.carstens@...ibm.com>, x86@...nel.org,
        linuxppc-dev@...ts.ozlabs.org, linux-s390@...r.kernel.org,
        Vojtech Pavlik <vojtech@...e.com>, Jiri Slaby <jslaby@...e.cz>,
        Chris J Arges <chris.j.arges@...onical.com>,
        Andy Lutomirski <luto@...nel.org>,
        Ingo Molnar <mingo@...nel.org>,
        Peter Zijlstra <peterz@...radead.org>
Subject: Re: [PATCH v3 13/15] livepatch: change to a per-task consistency
 model

On Tue, Dec 20, 2016 at 06:32:46PM +0100, Petr Mladek wrote:
> On Thu 2016-12-08 12:08:38, Josh Poimboeuf wrote:
> > Change livepatch to use a basic per-task consistency model.  This is the
> > foundation which will eventually enable us to patch those ~10% of
> > security patches which change function or data semantics.  This is the
> > biggest remaining piece needed to make livepatch more generally useful.
> > 
> > [1] https://lkml.kernel.org/r/20141107140458.GA21774@suse.cz
> > 
> > Signed-off-by: Josh Poimboeuf <jpoimboe@...hat.com>
> > ---
> > diff --git a/Documentation/livepatch/livepatch.txt b/Documentation/livepatch/livepatch.txt
> > index 6c43f6e..f87e742 100644
> > --- a/Documentation/livepatch/livepatch.txt
> > +++ b/Documentation/livepatch/livepatch.txt
> 
> I like the description.
> 
> Just a note that we will also need to review the section about
> limitations. But I am not sure that we want to do it in this patch.
> It might open a long discussion on its own.
> 
> > diff --git a/include/linux/livepatch.h b/include/linux/livepatch.h
> > index 1a5a93c..8e06fe5 100644
> > --- a/include/linux/livepatch.h
> > +++ b/include/linux/livepatch.h
> > @@ -28,18 +28,40 @@
> >  
> >  #include <asm/livepatch.h>
> >  
> > +/* task patch states */
> > +#define KLP_UNDEFINED	-1
> > +#define KLP_UNPATCHED	 0
> > +#define KLP_PATCHED	 1
> > +
> >  /**
> >   * struct klp_func - function structure for live patching
> >   * @old_name:	name of the function to be patched
> >   * @new_func:	pointer to the patched function code
> >   * @old_sympos: a hint indicating which symbol position the old function
> >   *		can be found (optional)
> > + * @immediate:  patch the func immediately, bypassing backtrace safety checks
> 
> There are more checks possible. I would use the same description
> as for klp_object.

Agreed.

> >   * @old_addr:	the address of the function being patched
> >   * @kobj:	kobject for sysfs resources
> >   * @stack_node:	list node for klp_ops func_stack list
> >   * @old_size:	size of the old function
> >   * @new_size:	size of the new function
> >   * @patched:	the func has been added to the klp_ops list
> > + * @transition:	the func is currently being applied or reverted
> > + *
> > @@ -86,6 +110,7 @@ struct klp_object {
> >   * struct klp_patch - patch structure for live patching
> >   * @mod:	reference to the live patch module
> >   * @objs:	object entries for kernel objects to be patched
> > + * @immediate:  patch all funcs immediately, bypassing safety mechanisms
> >   * @list:	list node for global list of registered patches
> >   * @kobj:	kobject for sysfs resources
> >   * @enabled:	the patch is enabled (but operation may be incomplete)
> 
> [...]
> 
> > diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
> > index fc160c6..22c0c01 100644
> > --- a/kernel/livepatch/core.c
> > +++ b/kernel/livepatch/core.c
> > @@ -424,7 +477,10 @@ static ssize_t enabled_store(struct kobject *kobj, struct kobj_attribute *attr,
> >  		goto err;
> >  	}
> >  
> > -	if (enabled) {
> > +	if (patch == klp_transition_patch) {
> > +		klp_reverse_transition();
> > +		mod_delayed_work(system_wq, &klp_transition_work, 0);
> 
> I would put this mod_delayed_work() into klp_reverse_transition().
> Also I would put that schedule_delayed_work() into
> klp_try_complete_transition().
> 
> If I did not miss anything, it will allow us to move the
> klp_transition_work code to transition.c where it logically
> belongs.

Makes sense, I'll see if I can move all the klp_transition_work code to
transition.c.

> > +	} else if (enabled) {
> >  		ret = __klp_enable_patch(patch);
> >  		if (ret)
> >  			goto err;
> 
> [...]
> 
> > diff --git a/kernel/livepatch/patch.c b/kernel/livepatch/patch.c
> > index 5efa262..e79ebb5 100644
> > --- a/kernel/livepatch/patch.c
> > +++ b/kernel/livepatch/patch.c
> > @@ -29,6 +29,7 @@
> >  #include <linux/bug.h>
> >  #include <linux/printk.h>
> >  #include "patch.h"
> > +#include "transition.h"
> >  
> >  static LIST_HEAD(klp_ops);
> >  
> > @@ -54,15 +55,53 @@ static void notrace klp_ftrace_handler(unsigned long ip,
> >  {
> >  	struct klp_ops *ops;
> >  	struct klp_func *func;
> > +	int patch_state;
> >  
> >  	ops = container_of(fops, struct klp_ops, fops);
> >  
> >  	rcu_read_lock();
> > +
> >  	func = list_first_or_null_rcu(&ops->func_stack, struct klp_func,
> >  				      stack_node);
> > -	if (WARN_ON_ONCE(!func))
> > +
> > +	if (!func)
> >  		goto unlock;
> 
> Why did you remove the WARN_ON_ONCE(), please?
> 
> We still add the function on the stack before registering
> the ftrace handler. Also we unregister the ftrace handler
> before removing the last entry from the stack.
> 
> AFAIK, unregister_ftrace_function() calls synchronize_rcu()
> to make sure that no-one is inside the handler once finished.
> Mirek knows more about it.

Hm, this is news to me.  Mirek, please share :-)

> If this is not true, we have a problem. For example,
> we call kfree(ops) after unregister_ftrace_function();

Agreed.

> BTW: I thought that this change was really needed because of
> klp_try_complete_transition(). But I think that the WARN
> could and should stay after all. See below.
> 
> 
> > +	/*
> > +	 * Enforce the order of the ops->func_stack and func->transition reads.
> > +	 * The corresponding write barrier is in __klp_enable_patch().
> > +	 */
> > +	smp_rmb();
> > +
> > +	if (unlikely(func->transition)) {
> > +
> > +		/*
> > +		 * Enforce the order of the func->transition and
> > +		 * current->patch_state reads.  Otherwise we could read an
> > +		 * out-of-date task state and pick the wrong function.  The
> > +		 * corresponding write barriers are in klp_init_transition()
> > +		 * and __klp_disable_patch().
> > +		 */
> > +		smp_rmb();
> > +
> > +		patch_state = current->patch_state;
> > +
> > +		WARN_ON_ONCE(patch_state == KLP_UNDEFINED);
> > +
> > +		if (patch_state == KLP_UNPATCHED) {
> > +			/*
> > +			 * Use the previously patched version of the function.
> > +			 * If no previous patches exist, use the original
> > +			 * function.
> 
> s/use the original/continue with the original/  ?

Ok.
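
For readers following the handler logic quoted above, here is a minimal
userspace sketch of how the implementation is chosen per task.  It is not
the kernel code: struct model_func, model_pick_func() and the integer
version numbers are hypothetical stand-ins, and the func_stack is modeled
as a plain array with the newest entry first.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace copies of the task patch states from the quoted header. */
enum { KLP_UNDEFINED = -1, KLP_UNPATCHED = 0, KLP_PATCHED = 1 };

/* Hypothetical, simplified model of one func_stack entry. */
struct model_func {
	int version;		/* stands in for func->new_func; 0 is reserved */
	bool transition;	/* func is being applied or reverted */
};

/*
 * Pick the version a task should run.  stack[0] models the top of
 * ops->func_stack.  Returns 0 when the task should continue with the
 * original, unpatched function.
 */
static int model_pick_func(const struct model_func *stack, size_t n,
			   int patch_state)
{
	if (n == 0)
		return 0;

	if (stack[0].transition) {
		/* The kernel handler only warns here; the model asserts. */
		assert(patch_state != KLP_UNDEFINED);

		if (patch_state == KLP_UNPATCHED) {
			/* Skip the entry in transition, use what is below. */
			if (n == 1)
				return 0;
			return stack[1].version;
		}
	}

	return stack[0].version;
}
```

With a stack of { version 2, in transition } over { version 1 }, a
KLP_PATCHED task gets version 2 and a KLP_UNPATCHED task gets version 1.
Once the transition flag is cleared, every task gets version 2 regardless
of its patch state.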

> > +			 */
> > +			func = list_entry_rcu(func->stack_node.next,
> > +					      struct klp_func, stack_node);
> > +
> > +			if (&func->stack_node == &ops->func_stack)
> > +				goto unlock;
> > +		}
> > +	}
> > +
> >  	klp_arch_set_pc(regs, (unsigned long)func->new_func);
> >  unlock:
> >  	rcu_read_unlock();
> > @@ -211,3 +250,12 @@ int klp_patch_object(struct klp_object *obj)
> >  
> >  	return 0;
> >  }
> > +
> > +void klp_unpatch_objects(struct klp_patch *patch)
> > +{
> > +	struct klp_object *obj;
> > +
> > +	klp_for_each_object(patch, obj)
> > +		if (obj->patched)
> > +			klp_unpatch_object(obj);
> > +}
> > --- /dev/null
> > +++ b/kernel/livepatch/transition.c
> > @@ -0,0 +1,479 @@
> > +/*
> > + * transition.c - Kernel Live Patching transition functions
> > + *
> > + * Copyright (C) 2015-2016 Josh Poimboeuf <jpoimboe@...hat.com>
> > + *
> > + * This program is free software; you can redistribute it and/or
> > + * modify it under the terms of the GNU General Public License
> > + * as published by the Free Software Foundation; either version 2
> > + * of the License, or (at your option) any later version.
> > + *
> > + * This program is distributed in the hope that it will be useful,
> > + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> > + * GNU General Public License for more details.
> > + *
> > + * You should have received a copy of the GNU General Public License
> > + * along with this program; if not, see <http://www.gnu.org/licenses/>.
> > + */
> > +
> > +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
> > +
> > +#include <linux/cpu.h>
> > +#include <linux/stacktrace.h>
> > +#include "patch.h"
> > +#include "transition.h"
> > +#include "../sched/sched.h"
> 
> Is this acceptable for the scheduler guys? 

I discussed the use of task_rq_lock() with Peter Zijlstra on IRC and he
seemed to think it was ok.  Peter, please speak up if you disagree :-)

> > +#define MAX_STACK_ENTRIES 100
> > +
> > +struct klp_patch *klp_transition_patch;
> > +
> > +static int klp_target_state = KLP_UNDEFINED;
> > +
> > +/* called from copy_process() during fork */
> > +void klp_copy_process(struct task_struct *child)
> > +{
> > +	child->patch_state = current->patch_state;
> > +
> > +	/* TIF_PATCH_PENDING gets copied in setup_thread_stack() */
> > +}
> > +
> > +/*
> > + * klp_update_patch_state() - change the patched state of a task
> > + * @task:	The task to change
> > + *
> > + * Switches the patched state of the task to the set of functions in the target
> > + * patch state.
> > + */
> 
> Please, add here some warning. Something like:
> 
>  * This function must never be called in parallel with
>  * klp_ftrace_handler(). Otherwise, the handler might make random
>  * decisions and break the consistency.
>  *
>  * In other words, call this function only from the @task itself
>  * or make sure that it is not running.

Yeah, I'll add a comment here.  This goes back to our discussion from
last time:

  https://lkml.kernel.org/r/20160504172517.tdatoj2nlkqwyd4g@treble

> > +void klp_update_patch_state(struct task_struct *task)
> > +{
> > +	/*
> > +	 * The synchronize_rcu() call in klp_try_complete_transition() ensures
> > +	 * this critical section completes before the global patch transition
> > +	 * is considered complete so we don't have spurious patch_state updates
> > +	 * afterwards.
> > +	 */
> > +	rcu_read_lock();
> > +
> > +	/*
> > +	 * This test_and_clear_tsk_thread_flag() call also serves as a read
> > +	 * barrier to enforce the order of the TIF_PATCH_PENDING and
> > +	 * klp_target_state reads.  The corresponding write barriers are in
> > +	 * __klp_disable_patch() and klp_reverse_transition().
> > +	 */
> > +	if (test_and_clear_tsk_thread_flag(task, TIF_PATCH_PENDING))
> > +		task->patch_state = READ_ONCE(klp_target_state);
> > +
> > +	rcu_read_unlock();
> > +}
> > +
> > +/*
> > + * Initialize the global target patch state and all tasks to the initial patch
> > + * state, and initialize all function transition states to true in preparation
> > + * for patching or unpatching.
> > + */
> > +void klp_init_transition(struct klp_patch *patch, int state)
> > +{
> > +	struct task_struct *g, *task;
> > +	unsigned int cpu;
> > +	struct klp_object *obj;
> > +	struct klp_func *func;
> > +	int initial_state = !state;
> > +
> > +	WARN_ON_ONCE(klp_target_state != KLP_UNDEFINED);
> > +
> > +	klp_transition_patch = patch;
> > +
> > +	/*
> > +	 * Set the global target patch state which tasks will switch to.  This
> > +	 * has no effect until the TIF_PATCH_PENDING flags get set later.
> > +	 */
> > +	klp_target_state = state;
> > +
> > +	/*
> > +	 * If the patch can be applied or reverted immediately, skip the
> > +	 * per-task transitions.
> > +	 */
> > +	if (patch->immediate)
> > +		return;
> > +
> > +	/*
> > +	 * Initialize all tasks to the initial patch state to prepare them for
> > +	 * switching to the target state.
> > +	 */
> > +	read_lock(&tasklist_lock);
> > +	for_each_process_thread(g, task) {
> > +		WARN_ON_ONCE(task->patch_state != KLP_UNDEFINED);
> > +		task->patch_state = initial_state;
> > +	}
> > +	read_unlock(&tasklist_lock);
> > +
> > +	/*
> > +	 * Ditto for the idle "swapper" tasks.
> > +	 */
> > +	get_online_cpus();
> > +	for_each_online_cpu(cpu) {
> > +		task = idle_task(cpu);
> > +		WARN_ON_ONCE(task->patch_state != KLP_UNDEFINED);
> > +		task->patch_state = initial_state;
> > +	}
> > +	put_online_cpus();
> 
> We allow to add/remove CPUs here. I am afraid that we will also need
> to add a cpu coming/going handler that will set the task->patch_state
> the right way. We must not set the klp_target_state until all ftrace
> handlers are ready.

What if we instead just change the above to use for_each_possible_cpu()?
We could do the same in klp_complete_transition().
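
To illustrate the gap being discussed, here is a small userspace model.
struct model_cpu and both helpers are hypothetical; real code would use
for_each_possible_cpu() and idle_task().  The model assumes a CPU that is
possible but offline when the transition is initialized and that comes
online afterwards.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { KLP_UNDEFINED = -1, KLP_UNPATCHED = 0, KLP_PATCHED = 1 };

/* Hypothetical model of one CPU and its idle task's patch state. */
struct model_cpu {
	bool possible;
	bool online;
	int idle_patch_state;
};

/*
 * Initialize idle tasks to initial_state.  With all_possible == false only
 * online CPUs are visited, as in the quoted patch.  With all_possible ==
 * true every possible CPU is visited, as in the alternative suggested above.
 */
static void model_init_idle_tasks(struct model_cpu *cpus, size_t n,
				  int initial_state, bool all_possible)
{
	size_t i;

	assert(cpus != NULL);

	for (i = 0; i < n; i++) {
		bool visit = all_possible ? cpus[i].possible : cpus[i].online;

		if (visit)
			cpus[i].idle_patch_state = initial_state;
	}
}

/* Count online CPUs whose idle task still has an undefined patch state. */
static size_t model_count_undefined_online(const struct model_cpu *cpus,
					   size_t n)
{
	size_t i, count = 0;

	for (i = 0; i < n; i++)
		if (cpus[i].online && cpus[i].idle_patch_state == KLP_UNDEFINED)
			count++;

	return count;
}
```

In this model, initializing only the online CPUs and then onlining the
second CPU leaves one idle task in KLP_UNDEFINED, while visiting all
possible CPUs leaves none.  Whether that is sufficient in the kernel still
depends on the hotplug details raised in the review.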

> > +	/*
> > +	 * Enforce the order of the task->patch_state initializations and the
> > +	 * func->transition updates to ensure that, in the enable path,
> > +	 * klp_ftrace_handler() doesn't see a func in transition with a
> > +	 * task->patch_state of KLP_UNDEFINED.
> > +	 */
> > +	smp_wmb();
> > +
> > +	/*
> > +	 * Set the func transition states so klp_ftrace_handler() will know to
> > +	 * switch to the transition logic.
> > +	 *
> > +	 * When patching, the funcs aren't yet in the func_stack and will be
> > +	 * made visible to the ftrace handler shortly by the calls to
> > +	 * klp_patch_object().
> > +	 *
> > +	 * When unpatching, the funcs are already in the func_stack and so are
> > +	 * already visible to the ftrace handler.
> > +	 */
> > +	klp_for_each_object(patch, obj)
> > +		klp_for_each_func(obj, func)
> > +			func->transition = true;
> > +}
> > +
> > +/*
> > + * Start the transition to the specified target patch state so tasks can begin
> > + * switching to it.
> > + */
> > +void klp_start_transition(void)
> > +{
> > +	struct task_struct *g, *task;
> > +	unsigned int cpu;
> > +
> > +	WARN_ON_ONCE(klp_target_state == KLP_UNDEFINED);
> > +
> > +	pr_notice("'%s': %s...\n", klp_transition_patch->mod->name,
> > +		  klp_target_state == KLP_PATCHED ? "patching" : "unpatching");
> > +
> > +	/*
> > +	 * If the patch can be applied or reverted immediately, skip the
> > +	 * per-task transitions.
> > +	 */
> > +	if (klp_transition_patch->immediate)
> > +		return;
> > +
> > +	/*
> > +	 * Mark all normal tasks as needing a patch state update.  As they pass
> > +	 * through the syscall barrier they'll switch over to the target state
> > +	 * (unless we switch them in klp_try_complete_transition() first).
> > +	 */
> > +	read_lock(&tasklist_lock);
> > +	for_each_process_thread(g, task)
> > +		set_tsk_thread_flag(task, TIF_PATCH_PENDING);
> 
> This is also called from klp_reverse_transition(). We should set it
> only when the task needs migration. Also we should clear it when
> the task is in the right state already.
> 
> It is not only an optimization. It actually solves a race between
> klp_complete_transition() and klp_update_patch_state(), see below.

I agree about the race, but if I did:

	for_each_process_thread(g, task) {
		if (task->patch_state != klp_target_state)
			set_tsk_thread_flag(task, TIF_PATCH_PENDING);
		else
			clear_tsk_thread_flag(task, TIF_PATCH_PENDING);
	}

It would still leave a small window where TIF_PATCH_PENDING gets set for
an already patched task, if klp_update_patch_state() is running at the
same time.

See below for another solution.

> > +	read_unlock(&tasklist_lock);
> > +
> > +	/*
> > +	 * Ditto for the idle "swapper" tasks, though they never cross the
> > +	 * syscall barrier.  Instead they switch over in cpu_idle_loop().
> > +	 */
> > +	get_online_cpus();
> > +	for_each_online_cpu(cpu)
> > +		set_tsk_thread_flag(idle_task(cpu), TIF_PATCH_PENDING);
> > +	put_online_cpus();
> 
> Also this stage needs to be somehow handled by CPU coming/going
> handlers.

Here I think we could automatically switch any offline CPUs' idle tasks.
And something similar in klp_try_complete_transition().

> > +}
> > +
> > +/*
> > + * The transition to the target patch state is complete.  Clean up the data
> > + * structures.
> > + */
> > +void klp_complete_transition(void)
> > +{
> > +	struct klp_object *obj;
> > +	struct klp_func *func;
> > +	struct task_struct *g, *task;
> > +	unsigned int cpu;
> > +
> > +	if (klp_transition_patch->immediate)
> > +		goto done;
> > +
> > +	klp_for_each_object(klp_transition_patch, obj)
> > +		klp_for_each_func(obj, func)
> > +			func->transition = false;
> 
> We should call synchronize_rcu() here. Otherwise, there
> might be a race, see below:
> 
> CPU1					CPU2
> 
> klp_ftrace_handler()
>   if (unlikely(func->transition))
> 	// still true
> 
> 					klp_complete_transition()
> 					  func->transition = false;
> 					  task->patch_state =
> 					      KLP_UNDEFINED;
> 
>      patch_state = current->patch_state;
> 
>      WARN_ON(patch_state == KLP_UNDEFINED);
> 
> BANG!: We print the warning.

This shouldn't be possible because klp_try_complete_transition() calls
synchronize_rcu() before calling klp_complete_transition().  So by the
time klp_complete_transition() is called, the ftrace handler can no
longer see the affected func.  See the comment for synchronize_rcu() in
klp_try_complete_transition().

> Note that smp_wmb() is enough in klp_init_transition()
> but it is not enough here. We need to wait longer once
> someone might be inside the if (true) code.
> 
> > +	read_lock(&tasklist_lock);
> > +	for_each_process_thread(g, task) {
> > +		clear_tsk_thread_flag(task, TIF_PATCH_PENDING);
> > +		task->patch_state = KLP_UNDEFINED;
> > +	}
> > +	read_unlock(&tasklist_lock);
> > +
> > +	get_online_cpus();
> > +	for_each_online_cpu(cpu) {
> > +		task = idle_task(cpu);
> > +		clear_tsk_thread_flag(task, TIF_PATCH_PENDING);
> 
> If TIF_PATCH_PENDING flag is set here it means that
> klp_update_patch_state() might get triggered and it might
> put wrong value into task->patch_state.
> 
> We must make sure that all tasks have this cleared before
> calling this function. This is another reason why
> klp_init_transition() should set the flag only when
> transition is needed.
> 
> We should only check the state here.
> 
> It still might make sense to clear it when it is set wrongly.
> But the question is if it is really safe to continue. I am
> afraid that it is not. It would mean that the consistency
> model is broken and we are in strange state.

As I mentioned above, with your proposal I think there could still be a
task with a spuriously set TIF_PATCH_PENDING at this point.

Maybe instead we should clear all the TIF_PATCH_PENDING flags before the
synchronize_rcu() in klp_try_complete_transition().

> > +		task->patch_state = KLP_UNDEFINED;
> > +	}
> > +	put_online_cpus();
> > +
> > +done:
> > +	klp_target_state = KLP_UNDEFINED;
> > +	klp_transition_patch = NULL;
> > +}
> 
> [...]
> 
> > +
> > +/*
> > + * Try to switch all remaining tasks to the target patch state by walking the
> > + * stacks of sleeping tasks and looking for any to-be-patched or
> > + * to-be-unpatched functions.  If such functions are found, the task can't be
> > + * switched yet.
> > + *
> > + * If any tasks are still stuck in the initial patch state, schedule a retry.
> > + */
> > +bool klp_try_complete_transition(void)
> > +{
> > +	unsigned int cpu;
> > +	struct task_struct *g, *task;
> > +	bool complete = true;
> > +
> > +	WARN_ON_ONCE(klp_target_state == KLP_UNDEFINED);
> > +
> > +	/*
> > +	 * If the patch can be applied or reverted immediately, skip the
> > +	 * per-task transitions.
> > +	 */
> > +	if (klp_transition_patch->immediate)
> > +		goto success;
> > +
> > +	/*
> > +	 * Try to switch the tasks to the target patch state by walking their
> > +	 * stacks and looking for any to-be-patched or to-be-unpatched
> > +	 * functions.  If such functions are found on a stack, or if the stack
> > +	 * is deemed unreliable, the task can't be switched yet.
> > +	 *
> > +	 * Usually this will transition most (or all) of the tasks on a system
> > +	 * unless the patch includes changes to a very common function.
> > +	 */
> > +	read_lock(&tasklist_lock);
> > +	for_each_process_thread(g, task)
> > +		if (!klp_try_switch_task(task))
> > +			complete = false;
> > +	read_unlock(&tasklist_lock);
> > +
> > +	/*
> > +	 * Ditto for the idle "swapper" tasks.
> > +	 */
> > +	get_online_cpus();
> > +	for_each_online_cpu(cpu)
> > +		if (!klp_try_switch_task(idle_task(cpu)))
> > +			complete = false;
> > +	put_online_cpus();
> > +
> > +	/*
> > +	 * Some tasks weren't able to be switched over.  Try again later and/or
> > +	 * wait for other methods like syscall barrier switching.
> > +	 */
> > +	if (!complete)
> > +		return false;
> > +
> > +success:
> > +
> > +	/*
> > +	 * When unpatching, all tasks have transitioned to KLP_UNPATCHED so we
> > +	 * can now remove the new functions from the func_stack.
> > +	 */
> > +	if (klp_target_state == KLP_UNPATCHED)
> > +		klp_unpatch_objects(klp_transition_patch);
> > +
> > +	/*
> > +	 * Wait for all RCU read-side critical sections to complete.
> > +	 *
> > +	 * This has two purposes:
> > +	 *
> > +	 * 1) Ensure all existing critical sections in klp_update_patch_state()
> > +	 *    complete, so task->patch_state won't be unexpectedly updated
> > +	 *    later.
> 
> We should not be here if anyone still might be in klp_update_patch_state().

Depends on our discussion about conditionally setting TIF_PATCH_PENDING.

> 
> > +	 *
> > +	 * 2) When unpatching, don't allow any existing instances of
> > +	 *    klp_ftrace_handler() to access any obsolete funcs before we reset
> > +	 *    the func transition states to false.  Otherwise the handler may
> > +	 *    see the deleted "new" func, see that it's not in transition, and
> > +	 *    wrongly pick the new version of the function.
> > +	 */
> 
> This makes sense but it took me a long time to understand. I wonder if
> this might be better:
> 
> 	/*
> 	 * Make sure that the function is removed from ops->func_stack
> 	 * before we clear func->transition. Otherwise the handler may
> 	 * pick the wrong version.
> 	 */

Sounds good.

> And I would call this only when the patch is being removed
> 
> 	if (klp_target_state == KLP_UNPATCHED)
> 		synchronize_rcu();

Depends on our discussion about conditionally setting TIF_PATCH_PENDING.

> I think that this was the reason to remove WARN_ON_ONCE(!func)
> in klp_ftrace_handler(). But this is not related. If this was
> the last entry in the list, we removed the ftrace_handler
> before removing the last entry. And unregister_ftrace_function()
> calls synchronize_rcu() to prevent calling the handler later.
> 
> 
> > +	synchronize_rcu();
> > +
> > +	pr_notice("'%s': %s complete\n", klp_transition_patch->mod->name,
> > +		  klp_target_state == KLP_PATCHED ? "patching" : "unpatching");
> > +
> > +	/* we're done, now cleanup the data structures */
> > +	klp_complete_transition();
> > +
> > +	return true;
> > +}
> > +
> > +/*
> > + * This function can be called in the middle of an existing transition to
> > + * reverse the direction of the target patch state.  This can be done to
> > + * effectively cancel an existing enable or disable operation if there are any
> > + * tasks which are stuck in the initial patch state.
> > + */
> > +void klp_reverse_transition(void)
> > +{
> > +	klp_transition_patch->enabled = !klp_transition_patch->enabled;
> > +
> > +	klp_target_state = !klp_target_state;
> > +
> > +	/*
> > +	 * Enforce the order of the write to klp_target_state above and the
> > +	 * TIF_PATCH_PENDING writes in klp_start_transition() to ensure that
> > +	 * klp_update_patch_state() doesn't set a wrong task->patch_state.
> > +	 */
> > +	smp_wmb();
> 
> I would call synchronize_rcu() here to make sure that
> klp_update_patch_state() calls will not set
> an outdated task->patch_state.
> 
> Note that smp_wmb() is not enough. We do not check TIF_PATCH_PENDING
> in klp_try_switch_task(). There is a tiny race:
> 
> CPU1					CPU2
> 
> klp_update_patch_state()
> 
> 	if (test_and_clear(task, TIF))
> 	     READ_ONCE(klp_target_state);
> 
> 					mutex_lock(klp_lock);
> 
> 					klp_reverse_transition()
> 					  klp_target_state =
> 					      !klp_target_state;
> 
> 					  klp_start_transition()
> 
> 					mutex_unlock(klp_lock);
> 
> 					 <switch to another process>
> 
> 					 klp_transition_work_fn()
> 					   mutex_lock(klp_lock);
> 					   klp_try_complete_transition()
> 					     klp_try_switch_task()
> 					       if (task->patch_state ==
> 						   klp_target_state)
> 						  return true;
> 
> 	    task->patch_state = <outdated_value>;
> 
> 	 klp_ftrace_handler()
> 
> BANG: klp_ftrace_handler() will use wrong implementation according
>       to the outdated task->patch_state. At the same time,
>       klp_transition() is not blocked by the task because it thinks
>       that it has a correct state.

Good find!
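
The interleaving in the diagram above can be replayed deterministically in
a single-threaded userspace model.  This is only a sketch: struct
model_task and the model_*() helpers are hypothetical, and
klp_update_patch_state() is split into two halves so the other CPU's steps
can be placed between its read of the target state and its write of
task->patch_state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { KLP_UNDEFINED = -1, KLP_UNPATCHED = 0, KLP_PATCHED = 1 };

/* Hypothetical model of the per-task fields involved in the race. */
struct model_task {
	int patch_state;
	bool pending;		/* stands in for TIF_PATCH_PENDING */
};

static int model_target_state = KLP_UNDEFINED;

/* CPU1, first half: test-and-clear the flag and read the target state. */
static int model_update_begin(struct model_task *t)
{
	if (!t->pending)
		return KLP_UNDEFINED;
	t->pending = false;
	return model_target_state;
}

/* CPU1, second half: store the value that was read earlier. */
static void model_update_finish(struct model_task *t, int loaded)
{
	if (loaded != KLP_UNDEFINED)
		t->patch_state = loaded;
}

/* CPU2: flip the target and mark every task pending again. */
static void model_reverse(struct model_task *tasks, size_t n)
{
	size_t i;

	model_target_state = !model_target_state;
	for (i = 0; i < n; i++)
		tasks[i].pending = true;
}

/* CPU2: the early-return check from the diagram. */
static bool model_try_switch(const struct model_task *t)
{
	return t->patch_state == model_target_state;
}

/* Returns true if the task was reported as switched but ends up stale. */
static bool model_replay_race(void)
{
	struct model_task t = { KLP_UNPATCHED, true };
	int loaded;
	bool reported_done;

	model_target_state = KLP_PATCHED;

	loaded = model_update_begin(&t);	/* CPU1 reads the old target */
	assert(loaded == KLP_PATCHED);
	model_reverse(&t, 1);			/* CPU2 reverses the transition */
	reported_done = model_try_switch(&t);	/* CPU2 sees a matching state */
	model_update_finish(&t, loaded);	/* CPU1 writes the stale value */

	return reported_done && t.patch_state != model_target_state;
}
```

In this model the task is reported as already being in the target state
and is then overwritten with the outdated value, which is the situation
the diagram describes.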

> > +
> > +	klp_start_transition();
> > +}
> > +
> > diff --git a/samples/livepatch/livepatch-sample.c b/samples/livepatch/livepatch-sample.c
> > index e34f871..bb61c65 100644
> > --- a/samples/livepatch/livepatch-sample.c
> > +++ b/samples/livepatch/livepatch-sample.c
> > @@ -17,6 +17,8 @@
> >   * along with this program; if not, see <http://www.gnu.org/licenses/>.
> >   */
> >  
> > +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
> > +
> >  #include <linux/module.h>
> >  #include <linux/kernel.h>
> >  #include <linux/livepatch.h>
> > @@ -69,6 +71,11 @@ static int livepatch_init(void)
> >  {
> >  	int ret;
> >  
> > +	if (!klp_have_reliable_stack() && !patch.immediate) {
> > +		pr_notice("disabling consistency model!\n");
> > +		patch.immediate = true;
> > +	}
> 
> I am scared to have this in the sample module. It makes sense
> to use the consistency model even for immediate patches because
> it allows them to be removed. But this must not be used for patches
> that really require the consistency model. We should add
> a big fat warning at least.

I did this so that the sample module would still work for non-x86_64
arches, for which there's currently no way to patch kthreads.

Notice I did add a warning:

  pr_notice("disabling consistency model!\n");

Is the warning not fat enough?

> > +
> >  	ret = klp_register_patch(&patch);
> >  	if (ret)
> >  		return ret;
> 
> I like the patch. All the problems that I found look solvable.
> I think that we are on the right track.

Thank you for the excellent review!

-- 
Josh