Message-ID: <Y9FfenH/p3qzRlar@do-x1extreme>
Date:   Wed, 25 Jan 2023 10:57:30 -0600
From:   Seth Forshee <sforshee@...nel.org>
To:     Petr Mladek <pmladek@...e.com>
Cc:     Jason Wang <jasowang@...hat.com>,
        "Michael S. Tsirkin" <mst@...hat.com>,
        Jiri Kosina <jikos@...nel.org>,
        Miroslav Benes <mbenes@...e.cz>,
        Joe Lawrence <joe.lawrence@...hat.com>,
        Josh Poimboeuf <jpoimboe@...nel.org>,
        virtualization@...ts.linux-foundation.org, kvm@...r.kernel.org,
        netdev@...r.kernel.org, live-patching@...r.kernel.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH 2/2] vhost: check for pending livepatches from vhost
 worker kthreads

On Wed, Jan 25, 2023 at 12:34:26PM +0100, Petr Mladek wrote:
> On Tue 2023-01-24 11:21:39, Seth Forshee wrote:
> > On Tue, Jan 24, 2023 at 03:17:43PM +0100, Petr Mladek wrote:
> > > On Fri 2023-01-20 16:12:22, Seth Forshee (DigitalOcean) wrote:
> > > > Livepatch relies on stack checking of sleeping tasks to switch kthreads,
> > > > so a busy kthread can block a livepatch transition indefinitely. We've
> > > > seen this happen fairly often with busy vhost kthreads.
> > > 
> > > To be precise, it would be "indefinitely" only when the kthread never
> > > sleeps.
> > > 
> > > But yes. I believe that the problem is real. It might be almost
> > > impossible to livepatch some busy kthreads, especially when they
> > > have a dedicated CPU.
> > > 
> > > 
> > > > Add a check to call klp_switch_current() from vhost_worker() when a
> > > > livepatch is pending. In testing this allowed vhost kthreads to switch
> > > > immediately when they had previously blocked livepatch transitions for
> > > > long periods of time.
> > > > 
> > > > Signed-off-by: Seth Forshee (DigitalOcean) <sforshee@...nel.org>
> > > > ---
> > > >  drivers/vhost/vhost.c | 4 ++++
> > > >  1 file changed, 4 insertions(+)
> > > > 
> > > > diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
> > > > index cbe72bfd2f1f..d8624f1f2d64 100644
> > > > --- a/drivers/vhost/vhost.c
> > > > +++ b/drivers/vhost/vhost.c
> > > > @@ -366,6 +367,9 @@ static int vhost_worker(void *data)
> > > >  			if (need_resched())
> > > >  				schedule();
> > > >  		}
> > > > +
> > > > +		if (unlikely(klp_patch_pending(current)))
> > > > +			klp_switch_current();
> > > 
> > > I suggest using the following instead:
> > > 
> > > 		if (unlikely(klp_patch_pending(current)))
> > > 			klp_update_patch_state(current);
> > > 
> > > We already use this in do_idle(). The reason is basically the same.
> > > It is almost impossible to livepatch the idle task when a CPU is
> > > very idle.
> > > 
> > > klp_update_patch_state(current) does not check the stack.
> > > It switches the task immediately.
> > > 
> > > It should be safe because the kthread never leaves vhost_worker().
> > > It means that the same kthread could never re-enter this function
> > > and use the new code.
> > 
> > My knowledge of livepatching internals is fairly limited, so I'll accept
> > it if you say that it's safe to do it this way. But let me ask about one
> > scenario.
> > 
> > Let's say that a livepatch is loaded which replaces vhost_worker(). New
> > vhost worker threads are started which use the replacement function. Now
> > if the patch is disabled, these new worker threads would be switched
> > despite still running the code from the patch module, correct? Could the
> > module then be unloaded, freeing the memory containing the code these
> > kthreads are executing?
> 
> Great catch! Yes, this might theoretically happen.
> 
> The above scenario would require calling klp_update_patch_state() from
> the code in the livepatch module. It is not possible at the moment because
> this function is not exported for modules.

vhost can be built as a module, so in order to call
klp_update_patch_state() from vhost_worker() it would have to be
exported to modules.
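
Roughly, I'm thinking of something like this for a v2 (just a sketch;
I'm assuming placing the EXPORT_SYMBOL_GPL() next to the definition in
kernel/livepatch/transition.c is acceptable):

	/* kernel/livepatch/transition.c: make it callable from modules */
	EXPORT_SYMBOL_GPL(klp_update_patch_state);

	/* drivers/vhost/vhost.c: in the vhost_worker() main loop */
	if (unlikely(klp_patch_pending(current)))
		klp_update_patch_state(current);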

> Hmm, the same problem might arise when we livepatch a function that calls
> another function that calls klp_update_patch_state(). But in this case
> it would be kthread() from kernel/kthread.c. It would affect any
> running kthread. I doubt that anyone would seriously think about
> livepatching this function.

Yes, there are clearly certain functions that are not safe/practical to
patch, and authors need to know what they are doing. Most kthread main()
functions probably qualify as impractical at best, at least without a
strategy to restart relevant kthreads.

But a livepatch transition will normally stall when such a function is
patched while a relevant kthread is running (unless the patch is forced),
so a patch author who made this mistake should notice quickly.
vhost_worker() would behave differently.

> A good enough solution might be to document this. Livepatches cannot
> be created blindly anyway. There are other situations where livepatching
> is tricky or not possible at all.

I can add this if you like. Is Documentation/livepatch/livepatch.rst the
right place for this?

> Crazy idea. We could even prevent this problem technically. A solution
> would be to increment a per-process counter in klp_ftrace_handler() when a
> function is redirected. And klp_update_patch_state() might refuse
> the migration when this counter is not zero. But it would require
> a trampoline on the return path to decrement the counter.
> I am not sure if this is worth the complexity.
> 
> On the other hand, this counter might actually remove the need
> for the reliable backtrace. It is possible that I am missing something
> or that the return trampoline is not easy/possible to implement.

I agree this should work for unpatching, and even for patching a
function which is already patched.

But maybe I'm misunderstanding: it would only work for those cases,
wouldn't it? The original, unpatched functions would not increment the
counter, so you would not know whether tasks still had those functions
on their call stacks.
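
To illustrate what I mean (purely a sketch; the field name below is
invented, not actual kernel API):

	/* klp_ftrace_handler() would bump a per-task count when it redirects
	 * a call to the new function; a return trampoline would decrement it. */
	current->klp_redirected_calls++;	/* invented field */

	/* klp_update_patch_state() could then refuse an immediate switch: */
	if (current->klp_redirected_calls)
		return;

	/* But a call into an *original*, not-yet-patched function never goes
	 * through klp_ftrace_handler(), so the count stays zero even while
	 * that function is still on the task's stack. */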

> Back to the original problem. I still consider calling
> klp_update_patch_state(current) in vhost_worker() safe.

Okay, I can send a v2 which does this, so long as it's okay to export
klp_update_patch_state() to modules.

Thanks,
Seth
