lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ae8412474f793ff59d5567bc721dc012d6ee0199.camel@infradead.org>
Date: Sat, 03 Aug 2024 09:35:56 +0100
From: David Woodhouse <dwmw2@...radead.org>
To: Peter Xu <peterx@...hat.com>
Cc: Carsten Stollmaier <stollmc@...zon.com>, Sean Christopherson
 <seanjc@...gle.com>, Paolo Bonzini <pbonzini@...hat.com>, Thomas Gleixner
 <tglx@...utronix.de>, Ingo Molnar <mingo@...hat.com>, Borislav Petkov
 <bp@...en8.de>, Dave Hansen <dave.hansen@...ux.intel.com>, x86@...nel.org,
 "H. Peter Anvin" <hpa@...or.com>, nh-open-source@...zon.com, Sebastian
 Biemueller <sbiemue@...zon.de>, kvm@...r.kernel.org,
 linux-kernel@...r.kernel.org,  Matthew Wilcox <willy@...radead.org>, Andrew
 Morton <akpm@...ux-foundation.org>, "linux-mm@...ck.org"
 <linux-mm@...ck.org>
Subject: Re: [PATCH] KVM: x86: Use gfn_to_pfn_cache for steal_time

On Fri, 2024-08-02 at 18:40 -0400, Peter Xu wrote:
> On Fri, Aug 02, 2024 at 01:03:16PM +0100, David Woodhouse wrote:
> > An alternative workaround (which perhaps we should *also* consider)
> > looked like this (plus some suitable code comment, of course):
> > 
> > --- a/arch/x86/mm/fault.c
> > +++ b/arch/x86/mm/fault.c
> > @@ -1304,6 +1304,8 @@ void do_user_addr_fault(struct pt_regs *regs,
> >          */
> >         if (user_mode(regs))
> >                 flags |= FAULT_FLAG_USER;
> > +       else
> > +               flags &= ~FAULT_FLAG_INTERRUPTIBLE;
> >  
...
> Instead of "interruptible exception" or the original patch (which might
> still be worthwhile, though?  I didn't follow much on kvm and the new gpc
> cache, but looks still nicer than get/put user from initial glance), above

Yes, I definitely think we want the GPC conversion anyway. That's why I
suggested it to Carsten, to resolve our *immediate* problem while we
continue to ponder the general case.

> looks like the easier and complete solution to me.  For "completeness", I
> mean I am not sure how many other copy_to/from_user() code in kvm can hit
> this, so looks like still possible to hit outside steal time page?

Right. It theoretically applies to *any* user access. It's just that
anything other than *guest* pages is slightly less likely to be backed
by userfaultfd.

> I thought only the slow fault path was involved in INTERRUPTIBLE thing and
> that was the plan, but I guess I overlooked how the default value could
> affect copy to/from user invoked from KVM as well..
> 
> With above patch to drop FAULT_FLAG_INTERRUPTIBLE for !user, KVM can still
> opt-in INTERRUPTIBLE anywhere by leveraging hva_to_pfn[_slow]() API, which
> is "INTERRUPTIBLE"-ready with a boolean the caller can set. But the caller
> will need to be able to process KVM_PFN_ERR_SIGPENDING.

Right. I think converting kvm_{read,write}_guest() and friends to do
that and be interruptible might make sense?

The patch snippet above obviously only fixes it for x86 and would need
to be done across the board. Unless we do this one instead, abusing the
knowledge that uffd is the only thing which honours
FAULT_FLAG_INTERRUPTIBLE?

--- a/fs/userfaultfd.c
+++ b/fs/userfaultfd.c
@@ -351,7 +351,7 @@ static inline bool userfaultfd_must_wait(struct userfaultfd_ctx *ctx,
 
 static inline unsigned int userfaultfd_get_blocking_state(unsigned int flags)
 {
-       if (flags & FAULT_FLAG_INTERRUPTIBLE)
+       if ((flags & FAULT_FLAG_INTERRUPTIBLE) && (flags & FAULT_FLAG_USER))
                return TASK_INTERRUPTIBLE;
 
        if (flags & FAULT_FLAG_KILLABLE)

I still quite like the idea of *optional* interruptible exceptions, as
seen in my proof of concept. Perhaps we wouldn't want the read(2) and
write(2) system calls to use them, but there are plenty of other system
calls which could be interruptible instead of blocking.

Right now, even the simple case of a trivial SIGINT handler which does
some minor cleanup before exiting, makes it a non-fatal signal so the
kernel blocks and waits for ever.


Download attachment "smime.p7s" of type "application/pkcs7-signature" (5965 bytes)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ