linux-kernel - Re: [PATCH v4 13/15] livepatch: change to a per-task consistency model

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <alpine.LSU.2.20.1702031711120.28531@pobox.suse.cz>
Date:   Fri, 3 Feb 2017 17:21:41 +0100 (CET)
From:   Miroslav Benes <mbenes@...e.cz>
To:     Petr Mladek <pmladek@...e.com>
cc:     Josh Poimboeuf <jpoimboe@...hat.com>, Jessica Yu <jeyu@...hat.com>,
        Jiri Kosina <jikos@...nel.org>, linux-kernel@...r.kernel.org,
        live-patching@...r.kernel.org,
        Michael Ellerman <mpe@...erman.id.au>,
        Heiko Carstens <heiko.carstens@...ibm.com>, x86@...nel.org,
        linuxppc-dev@...ts.ozlabs.org, linux-s390@...r.kernel.org,
        Vojtech Pavlik <vojtech@...e.com>, Jiri Slaby <jslaby@...e.cz>,
        Chris J Arges <chris.j.arges@...onical.com>,
        Andy Lutomirski <luto@...nel.org>,
        Ingo Molnar <mingo@...nel.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Kamalesh Babulal <kamalesh@...ux.vnet.ibm.com>,
        Balbir Singh <bsingharora@...il.com>
Subject: Re: [PATCH v4 13/15] livepatch: change to a per-task consistency
 model

On Thu, 2 Feb 2017, Petr Mladek wrote:

> > diff --git a/Documentation/livepatch/livepatch.txt b/Documentation/livepatch/livepatch.txt
> > index 7f04e13..fb00d66 100644
> > --- a/Documentation/livepatch/livepatch.txt
> > +++ b/Documentation/livepatch/livepatch.txt
> 
> > +  In that case, arches without
> > +   HAVE_RELIABLE_STACKTRACE would still be able to use the
> > +   non-stack-checking parts of the consistency model:
> > +
> > +   a) patching user tasks when they cross the kernel/user space
> > +      boundary; and
> > +
> > +   b) patching kthreads and idle tasks at their designated patch points.
> > +
> > +   This option isn't as good as option 1 because it requires signaling
> > +   most of the tasks to patch them.
> 
> Kthreads need to be waken because most of them ignore signals.
> 
> And idle tasks might need to be explicitely scheduled if there
> is too big load. Mirek knows more about this.

Yes, and we've already discussed it with Josh. The plan is not to do 
anything now, because described situations should be rare and/or 
theoretical only. It should be on our TODO lists though.
 
> Well, I am not sure if you want to get into such details.

I would not bother about it.
 
[...]

> > +/*
> > + * Start the transition to the specified target patch state so tasks can begin
> > + * switching to it.
> > + */
> > +void klp_start_transition(void)
> > +{
> > +	struct task_struct *g, *task;
> > +	unsigned int cpu;
> > +
> > +	WARN_ON_ONCE(klp_target_state == KLP_UNDEFINED);
> > +
> > +	pr_notice("'%s': %s...\n", klp_transition_patch->mod->name,
> > +		  klp_target_state == KLP_PATCHED ? "patching" : "unpatching");
> > +
> > +	/*
> > +	 * If the patch can be applied or reverted immediately, skip the
> > +	 * per-task transitions.
> > +	 */
> > +	if (klp_transition_patch->immediate)
> 
> We should call klp_try_complete_transition() here. Otherwise, it will
> never be called and the transition will never get completed.
> 
> Alternative solution would be to move klp_try_complete_transition()
> from klp_start_transition() and explicitely call it from
> __klp_disable_patch() and klp_enable_patch(). It would actually
> solve one issue with klp_revert_patch(), see below.
> 
> I kind of like the alternative solution. I hope that it was not
> me who suggested to move klp_try_complete_transition() into
> klp_start_transtion().

[...]

> Hmm, we should not call klp_try_complete_transition() when
> klp_start_transition() is called from here. I can't find a safe
> way to cancel klp_transition_work() when we own klp_mutex.
> It smells with a possible deadlock.
> 
> I suggest to move move klp_try_complete_transition() outside
> klp_start_transition() and explicitely call it from
>  __klp_disable_patch() and __klp_enabled_patch().
> This would fix also the problem with immediate patches, see
> klp_start_transition().

I agree. I think the best would be to move klp_try_complete_transition() 
out from klp_start_transition() and call it explicitly. This would solve 
the immediate problem for free.

I agree we should not call klp_try_complete_transition() from 
klp_reverse_transition() and leave it entirely to our delayed work. We 
discussed this in the past and the counter argument was that explicit call 
to klp_try_complete_transition() could make the process a bit faster. 
Possibly true, but reversion is really slow path and I would not care 
about speed at all. I think it is cleaner and perhaps safer.

Thanks,
Miroslav