linux-kernel - Re: [PATCH v3 3/3] vfio-pci: Invalidate mmaps and block MMIO access on disabled memory

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <b30fdc87-31e7-cb45-dab3-c37322ce05b1@nvidia.com>
Date:   Mon, 25 May 2020 17:46:34 -0700
From:   John Hubbard <jhubbard@...dia.com>
To:     Jason Gunthorpe <jgg@...pe.ca>
CC:     Peter Xu <peterx@...hat.com>,
        Alex Williamson <alex.williamson@...hat.com>,
        <kvm@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
        <cohuck@...hat.com>, <cai@....pw>,
        Andrea Arcangeli <aarcange@...hat.com>
Subject: Re: [PATCH v3 3/3] vfio-pci: Invalidate mmaps and block MMIO access
 on disabled memory

On 2020-05-25 17:37, Jason Gunthorpe wrote:
...
>> commit 318b275fbca1ab9ec0862de71420e0e92c3d1aa7
>> Author: Gleb Natapov <gleb@...hat.com>
>> Date:   Tue Mar 22 16:30:51 2011 -0700
>>
>>      mm: allow GUP to fail instead of waiting on a page
>>
>>      GUP user may want to try to acquire a reference to a page if it is already
>>      in memory, but not if IO, to bring it in, is needed.  For example KVM may
>>      tell vcpu to schedule another guest process if current one is trying to
>>      access swapped out page.  Meanwhile, the page will be swapped in and the
>>      guest process, that depends on it, will be able to run again.
>>
>>      This patch adds FAULT_FLAG_RETRY_NOWAIT (suggested by Linus) and
>>      FOLL_NOWAIT follow_page flags.  FAULT_FLAG_RETRY_NOWAIT, when used in
>>      conjunction with VM_FAULT_ALLOW_RETRY, indicates to handle_mm_fault that
>>      it shouldn't drop mmap_sem and wait on a page, but return VM_FAULT_RETRY
>>      instead.
> 
> So, from kvm's perspective it was to avoid excessively long blocking in
> common paths when it could rejoin the completed IO by somehow waiting
> on a page itself?


Or perhaps some variation on that, such as just retrying with an intervening
schedule() call. It's not clear just from that commit.

> 
> It all seems like it should not be used unless the page is going to go
> to IO?


That's my conclusion so far, yes.


> 
> Certainly there is no reason to optimize the fringe case of vfio
> sleeping if there is and incorrect concurrnent attempt to disable the
> a BAR.


Definitely agree with that position.

> 
>> diff --git a/include/linux/mm.h b/include/linux/mm.h
>> index 8429d5aa31e44..e32e8e52a57ac 100644
>> +++ b/include/linux/mm.h
>> @@ -430,6 +430,15 @@ extern pgprot_t protection_map[16];
>>    * continuous faults with flags (b).  We should always try to detect pending
>>    * signals before a retry to make sure the continuous page faults can still be
>>    * interrupted if necessary.
>> + *
>> + * About @FAULT_FLAG_RETRY_NOWAIT: this is intended for callers who would like
>> + * to acquire a page, but only if the page is already in memory. If, on the
>> + * other hand, the page requires IO in order to bring it into memory, then fault
>> + * handlers will immediately return VM_FAULT_RETRY ("don't wait"), while leaving
>> + * mmap_lock held ("don't drop mmap_lock"). For example, this is useful for
>> + * virtual machines that have multiple guests running: if guest A attempts
>> + * get_user_pages() on a swapped out page, another guest can be scheduled while
>> + * waiting for IO to swap in guest A's page.
>>    */
>>   #define FAULT_FLAG_WRITE                       0x01
>>   #define FAULT_FLAG_MKWRITE                     0x02
> 
> It seems reasonable but people might complain about the kvm
> specifics of the explanation.
> 
> It might be better to explain how the caller is supposed to know when
> it is OK to try GUP again and expect success, as it seems to me this
> is really about externalizing the sleep for page wait?
> 

OK, good point. The example was helpful in the commit log, but not quite
appropriate in mm.h, yes.

thanks,
-- 
John Hubbard
NVIDIA