linux-kernel - Re: [PATCH] mm/userfaultfd: provide unmasked address on page-fault

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <f5ea62e1-cf21-6bd8-37f8-1a0f9637402c@redhat.com>
Date:   Sat, 9 Oct 2021 09:59:35 +0200
From:   David Hildenbrand <david@...hat.com>
To:     Nadav Amit <nadav.amit@...il.com>
Cc:     Andrew Morton <akpm@...ux-foundation.org>,
        Peter Xu <peterx@...hat.com>,
        LKML <linux-kernel@...r.kernel.org>,
        Linux-MM <linux-mm@...ck.org>,
        Andrea Arcangeli <aarcange@...hat.com>,
        Mike Rapoport <rppt@...ux.vnet.ibm.com>,
        Jan Kara <jack@...e.cz>, stable@...r.kernel.org
Subject: Re: [PATCH] mm/userfaultfd: provide unmasked address on page-fault

On 09.10.21 00:02, Nadav Amit wrote:
> 
> 
>> On Oct 8, 2021, at 1:05 AM, David Hildenbrand <david@...hat.com> wrote:
>>
>> On 08.10.21 01:50, Nadav Amit wrote:
>>> From: Nadav Amit <namit@...are.com>
>>> Userfaultfd is supposed to provide the full address (i.e., unmasked) of
>>> the faulting access back to userspace. However, that is not the case for
>>> quite some time.
>>> Even running "userfaultfd_demo" from the userfaultfd man page provides
>>> the wrong output (and contradicts the man page). Notice that
>>> "UFFD_EVENT_PAGEFAULT event" shows the masked address.
>>> 	Address returned by mmap() = 0x7fc5e30b3000
>>> 	fault_handler_thread():
>>> 	    poll() returns: nready = 1; POLLIN = 1; POLLERR = 0
>>> 	    UFFD_EVENT_PAGEFAULT event: flags = 0; address = 7fc5e30b3000
>>> 		(uffdio_copy.copy returned 4096)
>>> 	Read address 0x7fc5e30b300f in main(): A
>>> 	Read address 0x7fc5e30b340f in main(): A
>>> 	Read address 0x7fc5e30b380f in main(): A
>>> 	Read address 0x7fc5e30b3c0f in main(): A
>>> Add a new "real_address" field to vmf to hold the unmasked address. It
>>> is possible to keep the unmasked address in the existing address field
>>> (and mask whenever necessary) instead, but this is likely to cause
>>> backporting problems of this patch.
>>
>> Can we be sure that no existing users will rely on this behavior that has been the case since end of 2016 IIRC, one year after UFFD was upstreamed?
> 
> Let me to blow off your mind: how do you be sure that the current behavior does not make applications to misbehave? It might cause performance issues as it did for me or hidden correctness issues.
> 

Fair point, but now we can speculate what's more likely:

Having an app rely on >4 year old kernel behavior just after the feature 
was released or having and app rely on kernel behavior that was the case 
for the last 4 years?

<offtopic>
Someone once told me about the unwritten way to remove things from the 
kernel. 1) Silently break it upstream 2) Wait 2 kernel releases 3) 
Propose removal of the feature because it's broken and nobody complained.
<\offtopic>

You might ask "why does David even care?", here is why:

For the records, I *do* have a prototype from last year that breaks with 
this new behavior as far as I can tell: using uffd in the context of 
virtio-balloon in QEMU. I just pushed the latest state to a !private 
github tree:
   https://github.com/davidhildenbrand/qemu/tree/virtio-balloon-uffd

In that code, I made sure that I'm only dealing with 4k pages (because 
that's the only thing virtio-balloon really can deal with), and during 
the debugging I figured that the kernel always returns 4k aligned page 
fault addresses, so I didn't care about masking. I'll reuse the 
unmodified fault address for UFFDIO_ZEROPAGE()/UFFDIO_COPY()/... which 
should then fail because:

"
EINVAL The start or the len field of the ufdio_range structure
               was not a multiple of the system page size; or len was
               zero; or the specified range was otherwise invalid.
"

If I'm too lazy to read all documentation, I'm quite sure that there are 
other people that don't. I don't care to much if this patch breaks that 
prototype, it's just a prototype after all, but I am concerned that we 
might break other users in a similar way.

>> I do wonder what the official ABI nowadays is, because man pages aren't necessarily the source of truth.
> 
> Documentation/admin-guide/mm/userfaultfd.rst says: "You get the address of the access that triggered the missing page
> event”.
> 
> So it is a bug.

The least thing I would expect in the patch description is a better 
motivation ("who cares and why" -- I know you have a better motivation 
that making the doc correct :) ) and a discussion on the chances of this 
actually breaking other apps (see my example).

I'd sleep better if we'd glue the changed behavior to a new feature 
flag, but that's just my 2 cents.

-- 
Thanks,

David / dhildenb