netdev - Re: [PATCH v2 0/3] vhost_task: Fix a bug where KVM wakes an exited task

Message-ID: <aMxIMADtzYrJg6Pb@google.com>
Date: Thu, 18 Sep 2025 10:58:08 -0700
From: Sean Christopherson <seanjc@...gle.com>
To: "Michael S. Tsirkin" <mst@...hat.com>
Cc: Sebastian Andrzej Siewior <bigeasy@...utronix.de>, Paolo Bonzini <pbonzini@...hat.com>, 
	Jason Wang <jasowang@...hat.com>, kvm@...r.kernel.org, virtualization@...ts.linux.dev, 
	netdev@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 0/3] vhost_task: Fix a bug where KVM wakes an exited task

On Thu, Sep 18, 2025, Michael S. Tsirkin wrote:
> On Thu, Sep 18, 2025 at 09:52:19AM -0700, Sean Christopherson wrote:
> > On Thu, Sep 18, 2025, Michael S. Tsirkin wrote:
> > > On Thu, Sep 18, 2025 at 09:04:07AM -0700, Sean Christopherson wrote:
> > > > On Thu, Sep 18, 2025, Sebastian Andrzej Siewior wrote:
> > > > > On 2025-09-18 11:09:05 [-0400], Michael S. Tsirkin wrote:
> > > > > > So how about switching to this approach then?
> > > > > > Instead of piling up fixes like we seem to do now ...
> > > > 
> > > > I don't have a strong preference for 6.17, beyond landing a fix of some kind.
> > > > I think there are three options for 6.17, in order of "least likely to break
> > > > something":
> > > > 
> > > >  1. Sebastian's get_task_struct() fix
> > > 
> > > 
> > > I am just a bit apprehensive that we don't create a situation
> > > where we leak the task struct somehow, given the limited
> > > testing time. Can you help me get convinced that risk is 0?
> > 
> > I doubt it; I share similar concerns about lack of testing.  So I guess
> > thinking about this again, #2 is probably safer since it'd only impact KVM?
> 
> I can't say I understand completely how we get that state though?
> Why did the warning trigger if it's not a UAF?

It's purely a flaw in the sanity check itself due to the ordering in vhost_task_fn().

As is, vhost_task_fn() marks the task KILLED before invoking ->handle_sigkill(),
i.e. before vhost_worker_killed() is guaranteed to complete, and thus before
worker->killed is set.  As a result, vhost can keep waking workers that have
KILLED set, but haven't actually exited.  That's perfectly fine as UAF won't
occur until do_exit() is called, and that won't happen until ->handle_sigkill()
completes.
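
To make the window concrete, here is a rough user-space model of that ordering.
It is only a sketch, not the kernel code: struct model_task, model_vhost_wake(),
model_handle_sigkill() and model_task_fn_exit_path() are hypothetical stand-ins,
and it assumes vhost stops issuing wakes once worker->killed is observed.

```c
#include <stdbool.h>

/* Hypothetical user-space model of the ordering; not actual kernel code. */
struct model_task {
	bool killed_flag;        /* stands in for the KILLED flag on the task */
	bool worker_killed;      /* stands in for worker->killed */
	bool exited;             /* true once the modeled do_exit() has run */
	int  wakes_while_killed; /* wakes that saw KILLED on a still-live task */
	bool uaf;                /* set if a wake ever targets an exited task */
};

/* Models vhost waking the worker; returns false if vhost declines to wake. */
static bool model_vhost_wake(struct model_task *t)
{
	if (t->worker_killed)    /* assumption: vhost stops waking at this point */
		return false;
	if (t->exited)
		t->uaf = true;   /* would be a real use-after-free */
	else if (t->killed_flag)
		t->wakes_while_killed++; /* what the sanity check trips on */
	return true;
}

/* Models ->handle_sigkill(): worker->killed is only set in here. */
static void model_handle_sigkill(struct model_task *t)
{
	model_vhost_wake(t);     /* a wake can still race in before the store */
	t->worker_killed = true;
}

/* Models the exit path ordering described above. */
static void model_task_fn_exit_path(struct model_task *t)
{
	t->killed_flag = true;   /* 1: mark the task KILLED */
	model_vhost_wake(t);     /* wake racing between steps 1 and 2 */
	model_handle_sigkill(t); /* 2: worker->killed gets set */
	t->exited = true;        /* 3: modeled do_exit(), only after step 2 */
}
```

Running model_task_fn_exit_path() on a zeroed struct model_task records wakes
with KILLED set but never sets uaf, i.e. the check fires without the task
having exited.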

> > > >  2. This series, without the KILLED sanity check in __vhost_task_wake()
> > > >  3. This series, with my fixup (with which syzbot was happy)
>