linux-kernel - Re: [PATCH RESEND v6 00/16] mm: Page fault enhancements

Message-ID: <6d8ed084-0740-cee1-663e-a78a2faee432@redhat.com>
Date:   Sun, 8 Mar 2020 13:12:34 +0100
From:   David Hildenbrand <david@...hat.com>
To:     Peter Xu <peterx@...hat.com>
Cc:     linux-mm@...ck.org, linux-kernel@...r.kernel.org,
        Andrea Arcangeli <aarcange@...hat.com>,
        Martin Cracauer <cracauer@...s.org>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Mike Rapoport <rppt@...ux.vnet.ibm.com>,
        "Kirill A . Shutemov" <kirill@...temov.name>,
        Johannes Weiner <hannes@...xchg.org>,
        "Dr . David Alan Gilbert" <dgilbert@...hat.com>,
        Bobby Powers <bobbypowers@...il.com>,
        Maya Gokhale <gokhale2@...l.gov>,
        Jerome Glisse <jglisse@...hat.com>,
        Mike Kravetz <mike.kravetz@...cle.com>,
        Matthew Wilcox <willy@...radead.org>,
        Marty McFadden <mcfadden8@...l.gov>,
        Mel Gorman <mgorman@...e.de>, Hugh Dickins <hughd@...gle.com>,
        Brian Geffon <bgeffon@...gle.com>,
        Denis Plotnikov <dplotnikov@...tuozzo.com>,
        Pavel Emelyanov <xemul@...tuozzo.com>
Subject: Re: [PATCH RESEND v6 00/16] mm: Page fault enhancements

[...]

> Yes, IIUC the race can happen like this in your below test:
> 
>      main thread          uffd thread             discard thread
>      ===========          ===========             ==============
>      access page
>        uffd page fault
>          wait for page
>                           UFFDIO_ZEROCOPY
>                             put a page P there
>                                                   MADV_DONTNEED on P
>                             wakeup main thread
>          return from fault
>        page still missing
>        uffd page fault again
>        (without ALLOW_RETRY)
>        --> SIGBUS.

Exactly!
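
For reference, a minimal sketch of the two operations that race in the
diagram above. It assumes the page-placing step corresponds to
UFFDIO_ZEROPAGE (UFFDIO_COPY would behave the same way for this purpose);
the helper names place_zero_page() and discard_page() are made up for
illustration and are not taken from the test being discussed.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/userfaultfd.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/mman.h>

/* uffd thread: resolve a missing fault by placing a zero page.
 * Returns 0 on success, -1 with errno set on failure. */
int place_zero_page(int uffd, void *addr, size_t len)
{
    struct uffdio_zeropage zp;

    assert(len > 0);                 /* precondition: non-empty range */
    memset(&zp, 0, sizeof(zp));
    zp.range.start = (uintptr_t)addr;
    zp.range.len = len;
    zp.mode = 0;                     /* default mode: also wake the faulting thread */
    return ioctl(uffd, UFFDIO_ZEROPAGE, &zp);
}

/* discard thread: drop the page again. On a private anonymous mapping
 * the next access finds no page and faults anew. */
int discard_page(void *addr, size_t len)
{
    assert(len > 0);                 /* precondition: non-empty range */
    return madvise(addr, len, MADV_DONTNEED);
}
```

If discard_page() runs between the ioctl placing the page and the main
thread returning from its fault, the main thread finds the page missing
again, which is the window shown in the diagram.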

>> Can we please have a way to identify that this "feature" is available?
>> I'd appreciate a new read-only UFFD_FEAT_ , so we can detect this from
>> user space easily and use concurrent discards without crashing our applications.
> 
> I'm not sure how others think about it, but to me this still falls
> into the bucket of "solving an existing problem" rather than a
> feature.  Also note that this should change the behavior for the page
> fault logic in general, rather than an uffd-only change. So I'm also
> not sure whether UFFD_FEAT_* suits here even if we want it.

So, are we planning on backporting this to stable kernels?

Imagine using this in QEMU/KVM to allow discards (e.g., balloon
inflation) while postcopy is active. You certainly don't want random
guest crashes. So either we treat this as a fix (and backport) or as a
change in behavior/feature.
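
To illustrate what detection from user space could look like, here is a
rough sketch of querying the feature mask via UFFDIO_API. The existing
feature bits are named UFFD_FEATURE_* in <linux/userfaultfd.h>; no bit
exists for the behavior discussed here, so the sketch takes the bit to
test as a caller-supplied parameter rather than assuming one. The helper
names uffd_query_features() and uffd_has_feature() are hypothetical.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <linux/userfaultfd.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Fills *features with the feature mask reported by the kernel.
 * Returns 0 on success, -1 if userfaultfd is unavailable or not permitted. */
int uffd_query_features(uint64_t *features)
{
    struct uffdio_api api;
    int fd, ret;

    assert(features != NULL);
    fd = (int)syscall(SYS_userfaultfd, O_CLOEXEC | O_NONBLOCK);
    if (fd < 0)
        return -1;

    memset(&api, 0, sizeof(api));
    api.api = UFFD_API;
    api.features = 0;    /* request nothing; the kernel writes back what it supports */

    ret = ioctl(fd, UFFDIO_API, &api);
    close(fd);
    if (ret < 0)
        return -1;

    *features = api.features;
    return 0;
}

/* Returns 1 if every bit in 'bit' is set in 'features', 0 otherwise. */
int uffd_has_feature(uint64_t features, uint64_t bit)
{
    return (features & bit) == bit;
}
```

An application would call uffd_query_features() once at startup and only
enable concurrent discards if the relevant bit (whatever it ends up being
called) is reported.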

[...]

>>
>> 2. What will happen if I don't place a page on a pagefault, but only do a UFFDIO_WAKE?
>>    For now we were able to trigger a signal this way.
> 
> If I'm not mistaken the UFFDIO_WAKE will directly trigger the sigbus
> even without the help of the MADV_DONTNEED race.

Yes, that's the current way of injecting a SIGBUS instead of resolving
the pagefault. And AFAICS, you're changing that behavior. (I am not
aware of a user; there could be use cases, but somehow it's strange to
get a signal when accessing memory that is mapped READ|WRITE and is also
shown as such in e.g. /proc/$PID/maps.) So IMHO, only the new
behavior really makes sense.
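
For clarity, "only do a UFFDIO_WAKE" means roughly the following: the
handler wakes the faulting thread without having placed a page via
UFFDIO_COPY or UFFDIO_ZEROPAGE first. This is only a sketch; the helper
name uffd_wake_only() is made up for illustration.

```c
#include <assert.h>
#include <linux/userfaultfd.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/ioctl.h>

/* Wake threads blocked on faults in [addr, addr + len) without resolving
 * the fault. Returns 0 on success, -1 with errno set on failure. */
int uffd_wake_only(int uffd, void *addr, size_t len)
{
    struct uffdio_range range = {
        .start = (uintptr_t)addr,
        .len = len,
    };

    assert(len > 0);    /* precondition: non-empty range */
    return ioctl(uffd, UFFDIO_WAKE, &range);
}
```

The woken thread retries the access, finds the page still missing, and
faults again; per the discussion above, that second fault is where the
signal currently comes from.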

> 
>> If the behavior is changed, can
>>    we make this configurable via a UFFD_FEAT?
> 
> I still think that could be overkill, but I'll leave the
> discussion to the experts.

I'll be happy to hear what Andrea et al. think. At least I really want
to see the new behavior - and if it's not a fix, then I want some way to
detect if a kernel has this new (fixed?) behavior.

Thanks a lot for this work!

-- 
Thanks,

David / dhildenb