Message-ID: <20150221183005.GB8406@gmail.com>
Date: Sat, 21 Feb 2015 19:30:05 +0100
From: Ingo Molnar <mingo@...nel.org>
To: Josh Poimboeuf <jpoimboe@...hat.com>
Cc: Vojtech Pavlik <vojtech@...e.com>, Jiri Kosina <jkosina@...e.cz>,
Peter Zijlstra <peterz@...radead.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Ingo Molnar <mingo@...hat.com>,
Seth Jennings <sjenning@...hat.com>,
linux-kernel@...r.kernel.org,
Linus Torvalds <torvalds@...ux-foundation.org>
Subject: Re: live patching design (was: Re: [PATCH 1/3] sched: add
sched_task_call())

* Josh Poimboeuf <jpoimboe@...hat.com> wrote:

> On Fri, Feb 20, 2015 at 10:46:13PM +0100, Vojtech Pavlik wrote:
> > On Fri, Feb 20, 2015 at 08:49:01PM +0100, Ingo Molnar wrote:
> >
> > > I.e. it's in essence the strong stop-all atomic
> > > patching model of 'kpatch', combined with the
> > > reliable avoidance of kernel stacks that 'kgraft'
> > > uses.
> >
> > > That should be the starting point, because it's the
> > > most reliable method.
> >
> > In the consistency models discussion, this was marked
> > the "LEAVE_KERNEL+SWITCH_KERNEL" model. It's indeed the
> > strongest model of all, but also comes at the highest
> > cost in terms of impact on running tasks. It's so high
> > (the interruption may be seconds or more) that it was
> > deemed not worth implementing.
>
> Yeah, this is way too disruptive to the user.
>
> Even the comparatively tiny latency caused by kpatch's
> use of stop_machine() was considered unacceptable by
> some.

Unreliable, non-robust patching is even more disruptive...

What I think makes it fragile in the long term is that we
combine two non-robust, unlikely conditions: the chance that
a task just happens to be executing a patched function, and
the chance that the debug information is unreliable.

For example, the code patching used by tracing got debugged
to a fair degree because we rely on that patching for
actual tracing functionality. Even with that relatively
robust usage model we had our crises ...

I just don't see how a stack-backtrace-based live patching
method can become robust in the long run.

> Plus a lot of processes would see EINTR, causing more
> havoc.

Parking threads safely in user mode does not require the
propagation of syscall interruption to user-space.

(It does have some other requirements, such as making all
syscalls interruptible by a 'special' signalling method
that only live patching triggers - even syscalls that are
uninterruptible under the normal ABI, such as sys_sync().)

On the other hand, if it's too slow, people will work on
improving signal propagation latencies: making syscalls
more readily interruptible and more seamlessly restartable
has various other advantages beyond live kernel patching.

I.e. it's a win-win scenario and will improve various areas
of the kernel in terms of syscall interruptibility
latencies.

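As a user-space analogy only (this is not the live patching
mechanism discussed above, and the helper names
read_across_signal(), on_sigusr1() and sleep_ms() are made up
for illustration), the sketch below shows the existing POSIX
behaviour that the "no EINTR propagation" argument leans on.
With SA_RESTART set, a blocking read() that is interrupted by
a signal is restarted by the kernel, so the caller never sees
the interruption. It assumes a Linux/POSIX system where read()
on a pipe is a restartable syscall.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Counts handler invocations so a caller can see the signal did land. */
static volatile sig_atomic_t handler_runs;

static void on_sigusr1(int sig)
{
	(void)sig;
	handler_runs++;
}

static void sleep_ms(long ms)
{
	struct timespec ts = { ms / 1000, (ms % 1000) * 1000000L };

	nanosleep(&ts, NULL);
}

/*
 * Block in read() on an empty pipe while a child first sends SIGUSR1
 * and only later writes one byte.
 *
 * Returns 1 if read() delivered the byte, 0 if read() failed with
 * EINTR, and -1 on any other error.
 */
static int read_across_signal(int restart)
{
	struct sigaction sa;
	int fds[2];
	pid_t child;
	char byte = 0;
	ssize_t n;
	int saved_errno;

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = on_sigusr1;
	sigemptyset(&sa.sa_mask);
	sa.sa_flags = restart ? SA_RESTART : 0;
	if (sigaction(SIGUSR1, &sa, NULL) < 0 || pipe(fds) < 0)
		return -1;

	child = fork();
	if (child < 0)
		return -1;
	if (child == 0) {
		close(fds[0]);
		sleep_ms(200);		/* let the parent block in read() */
		kill(getppid(), SIGUSR1);
		sleep_ms(200);		/* pipe stays empty while the signal lands */
		if (write(fds[1], "x", 1) != 1)
			_exit(1);
		_exit(0);
	}

	close(fds[1]);
	n = read(fds[0], &byte, 1);
	saved_errno = errno;
	close(fds[0]);
	waitpid(child, NULL, 0);

	if (n == 1 && byte == 'x')
		return 1;
	return (n < 0 && saved_errno == EINTR) ? 0 : -1;
}
```

Called from a main() as read_across_signal(1), this is
expected to return 1: the handler runs, the read() is
restarted and later completes normally. With
read_across_signal(0) the read() typically fails with EINTR
instead, although that case depends on the parent already
being blocked in read() when the signal arrives.
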
Thanks,
Ingo