netdev - Re: [PATCH 2/2] vhost: check for pending livepatches from vhost worker kthreads

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <Y9JoxAHLplZoVPea@alley>
Date:   Thu, 26 Jan 2023 12:49:24 +0100
From:   Petr Mladek <pmladek@...e.com>
To:     Seth Forshee <sforshee@...nel.org>
Cc:     Jason Wang <jasowang@...hat.com>,
        "Michael S. Tsirkin" <mst@...hat.com>,
        Jiri Kosina <jikos@...nel.org>,
        Miroslav Benes <mbenes@...e.cz>,
        Joe Lawrence <joe.lawrence@...hat.com>,
        Josh Poimboeuf <jpoimboe@...nel.org>,
        virtualization@...ts.linux-foundation.org, kvm@...r.kernel.org,
        netdev@...r.kernel.org, live-patching@...r.kernel.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH 2/2] vhost: check for pending livepatches from vhost
 worker kthreads

On Thu 2023-01-26 12:16:36, Petr Mladek wrote:
> On Wed 2023-01-25 10:57:30, Seth Forshee wrote:
> > On Wed, Jan 25, 2023 at 12:34:26PM +0100, Petr Mladek wrote:
> > > On Tue 2023-01-24 11:21:39, Seth Forshee wrote:
> > > > On Tue, Jan 24, 2023 at 03:17:43PM +0100, Petr Mladek wrote:
> > > > > On Fri 2023-01-20 16:12:22, Seth Forshee (DigitalOcean) wrote:
> > > > > > Livepatch relies on stack checking of sleeping tasks to switch kthreads,
> > > > > > so a busy kthread can block a livepatch transition indefinitely. We've
> > > > > > seen this happen fairly often with busy vhost kthreads.
> > > > > 
> > > > > > --- a/drivers/vhost/vhost.c
> > > > > > +++ b/drivers/vhost/vhost.c
> > > > > > @@ -366,6 +367,9 @@ static int vhost_worker(void *data)
> > > > > >  			if (need_resched())
> > > > > >  				schedule();
> > > > > >  		}
> > > > > > +
> > > > > > +		if (unlikely(klp_patch_pending(current)))
> > > > > > +			klp_switch_current();
> > > > > 
> > > > > I suggest to use the following intead:
> > > > > 
> > > > > 		if (unlikely(klp_patch_pending(current)))
> > > > > 			klp_update_patch_state(current);
> > > > > 
> > > > > We already use this in do_idle(). The reason is basically the same.
> > > > > It is almost impossible to livepatch the idle task when a CPU is
> > > > > very idle.
> > > > > 
> > > > Let's say that a livepatch is loaded which replaces vhost_worker(). New
> > > > vhost worker threads are started which use the replacement function. Now
> > > > if the patch is disabled, these new worker threads would be switched
> > > > despite still running the code from the patch module, correct? Could the
> > > > module then be unloaded, freeing the memory containing the code these
> > > > kthreads are executing?
> > > 
> > > Hmm, the same problem might be when we livepatch a function that calls
> > > another function that calls klp_update_patch_state(). But in this case
> > > it would be kthread() from kernel/kthread.c. It would affect any
> > > running kthread. I doubt that anyone would seriously think about
> > > livepatching this function.

And I missed something. klp_update_patch_state_safe(), proposed below,
would not cover the above scenario.

It might be possible to add something similar to kthread()
function. I think that it is the only "livepatchable" function
that might call vhost_worker(). We could block
klp_update_patch_state() for the entire kthread when the kthread()
function is called from a livepatch.

Well, it is all just the best effort. The reference counting in
the ftrace handler would be more reliable. But it would require
adding the trampoline on the return.

> /**
>  * klp_update_patch_state_safe() - do not update the path state when
>  *	called from a livepatch.
>  * @task: task_struct to be updated
>  * @calller_addr: address of the function which  calls this one
>  *
>  * Do not update the patch set when called from a livepatch.
>  * It would allow to remove the livepatch module even when
>  * the code still might be in use.
>  */
> void klp_update_patch_state_safe(struct task_struct *task, void *caller_addr)
> {
> 	static bool checked;
> 	static bool safe;
> 
> 	if (unlikely(!checked)) {
> 		struct module *mod;
> 
> 		preempt_disable();
> 		mod = __module_address(caller_addr);
> 		if (!mod || !is_livepatch_module(mod))
> 			safe = true;
> 		checked = true;
> 		preempt_enable();
> 	}
> 
> 	if (safe)
> 		klp_update_patch_state(task);
> }
> 
> and use in vhost_worker()
> 
> 		if (unlikely(klp_patch_pending(current)))
> 			klp_update_patch_state_safe(current, vhost_worker);
> 
> Even better might be to get the caller address using some compiler
> macro. I guess that it should be possible.
> 
> And even better would be to detect this at the compile time. But
> I do not know how to do so.
> 
> > Okay, I can send a v2 which does this, so long as it's okay to export
> > klp_update_patch_state() to modules.
> 
> It would be acceptable for me if we added a warning above the function
> definition and into the livepatch documentation.

I would probably go this way after all. Still thinking...

Best Regards,
Petr