linux-kernel - Re: [PATCH v2 3/7] mm: Introduce FAULT_FLAG

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <df8cc623-617f-f14f-d23e-312dbdf05d2e@redhat.com>
Date:   Fri, 6 Sep 2019 14:53:13 +0200
From:   David Hildenbrand <david@...hat.com>
To:     Peter Xu <peterx@...hat.com>
Cc:     linux-mm@...ck.org, linux-kernel@...r.kernel.org,
        Hugh Dickins <hughd@...gle.com>,
        Maya Gokhale <gokhale2@...l.gov>,
        Jerome Glisse <jglisse@...hat.com>,
        Pavel Emelyanov <xemul@...tuozzo.com>,
        Johannes Weiner <hannes@...xchg.org>,
        Martin Cracauer <cracauer@...s.org>,
        Marty McFadden <mcfadden8@...l.gov>, Shaohua Li <shli@...com>,
        Andrea Arcangeli <aarcange@...hat.com>,
        Mike Kravetz <mike.kravetz@...cle.com>,
        Denis Plotnikov <dplotnikov@...tuozzo.com>,
        Mike Rapoport <rppt@...ux.vnet.ibm.com>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Mel Gorman <mgorman@...e.de>,
        "Kirill A . Shutemov" <kirill@...temov.name>,
        "Dr . David Alan Gilbert" <dgilbert@...hat.com>
Subject: Re: [PATCH v2 3/7] mm: Introduce FAULT_FLAG_INTERRUPTIBLE

On 06.09.19 14:38, Peter Xu wrote:
> On Fri, Sep 06, 2019 at 11:02:22AM +0200, David Hildenbrand wrote:
>> On 05.09.19 12:15, Peter Xu wrote:
>>> handle_userfaultfd() is currently the only one place in the kernel
>>> page fault procedures that can respond to non-fatal userspace signals.
>>> It was trying to detect such an allowance by checking against USER &
>>> KILLABLE flags, which was "un-official".
>>>
>>> In this patch, we introduced a new flag (FAULT_FLAG_INTERRUPTIBLE) to
>>> show that the fault handler allows the fault procedure to respond even
>>> to non-fatal signals.  Meanwhile, add this new flag to the default
>>> fault flags so that all the page fault handlers can benefit from the
>>> new flag.  With that, replacing the userfault check to this one.
>>>
>>> Since the line is getting even longer, clean up the fault flags a bit
>>> too to ease TTY users.
>>>
>>> Although we've got a new flag and applied it, we shouldn't have any
>>> functional change with this patch so far.
>>>
>>> Suggested-by: Linus Torvalds <torvalds@...ux-foundation.org>
>>> Signed-off-by: Peter Xu <peterx@...hat.com>
>>> ---
>>>  fs/userfaultfd.c   |  4 +---
>>>  include/linux/mm.h | 39 ++++++++++++++++++++++++++++-----------
>>>  2 files changed, 29 insertions(+), 14 deletions(-)
>>>
>>> diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
>>> index ccbdbd62f0d8..4a8ad2dc2b6f 100644
>>> --- a/fs/userfaultfd.c
>>> +++ b/fs/userfaultfd.c
>>> @@ -462,9 +462,7 @@ vm_fault_t handle_userfault(struct vm_fault *vmf, unsigned long reason)
>>>  	uwq.ctx = ctx;
>>>  	uwq.waken = false;
>>>  
>>> -	return_to_userland =
>>> -		(vmf->flags & (FAULT_FLAG_USER|FAULT_FLAG_KILLABLE)) ==
>>> -		(FAULT_FLAG_USER|FAULT_FLAG_KILLABLE);
>>> +	return_to_userland = vmf->flags & FAULT_FLAG_INTERRUPTIBLE;
>>>  	blocking_state = return_to_userland ? TASK_INTERRUPTIBLE :
>>>  			 TASK_KILLABLE;
>>>  
>>> diff --git a/include/linux/mm.h b/include/linux/mm.h
>>> index 57fb5c535f8e..53ec7abb8472 100644
>>> --- a/include/linux/mm.h
>>> +++ b/include/linux/mm.h
>>> @@ -383,22 +383,38 @@ extern unsigned int kobjsize(const void *objp);
>>>   */
>>>  extern pgprot_t protection_map[16];
>>>  
>>> -#define FAULT_FLAG_WRITE	0x01	/* Fault was a write access */
>>> -#define FAULT_FLAG_MKWRITE	0x02	/* Fault was mkwrite of existing pte */
>>> -#define FAULT_FLAG_ALLOW_RETRY	0x04	/* Retry fault if blocking */
>>> -#define FAULT_FLAG_RETRY_NOWAIT	0x08	/* Don't drop mmap_sem and wait when retrying */
>>> -#define FAULT_FLAG_KILLABLE	0x10	/* The fault task is in SIGKILL killable region */
>>> -#define FAULT_FLAG_TRIED	0x20	/* Second try */
>>> -#define FAULT_FLAG_USER		0x40	/* The fault originated in userspace */
>>> -#define FAULT_FLAG_REMOTE	0x80	/* faulting for non current tsk/mm */
>>> -#define FAULT_FLAG_INSTRUCTION  0x100	/* The fault was during an instruction fetch */
>>> +/**
>>> + * Fault flag definitions.
>>> + *
>>> + * @FAULT_FLAG_WRITE: Fault was a write fault.
>>> + * @FAULT_FLAG_MKWRITE: Fault was mkwrite of existing PTE.
>>> + * @FAULT_FLAG_ALLOW_RETRY: Allow to retry the fault if blocked.
>>> + * @FAULT_FLAG_RETRY_NOWAIT: Don't drop mmap_sem and wait when retrying.
>>> + * @FAULT_FLAG_KILLABLE: The fault task is in SIGKILL killable region.
>>> + * @FAULT_FLAG_TRIED: The fault has been tried once.
>>> + * @FAULT_FLAG_USER: The fault originated in userspace.
>>> + * @FAULT_FLAG_REMOTE: The fault is not for current task/mm.
>>> + * @FAULT_FLAG_INSTRUCTION: The fault was during an instruction fetch.
>>> + * @FAULT_FLAG_INTERRUPTIBLE: The fault can be interrupted by non-fatal signals.
>>> + */
>>> +#define FAULT_FLAG_WRITE			0x01
>>> +#define FAULT_FLAG_MKWRITE			0x02
>>> +#define FAULT_FLAG_ALLOW_RETRY			0x04
>>> +#define FAULT_FLAG_RETRY_NOWAIT			0x08
>>> +#define FAULT_FLAG_KILLABLE			0x10
>>> +#define FAULT_FLAG_TRIED			0x20
>>> +#define FAULT_FLAG_USER				0x40
>>> +#define FAULT_FLAG_REMOTE			0x80
>>> +#define FAULT_FLAG_INSTRUCTION  		0x100
>>> +#define FAULT_FLAG_INTERRUPTIBLE		0x200
>>>  
>>
>> I'd probably split off the unrelated doc changes. Just a matter of taste.
> 
> The thing is that it's not really a document change but only a format
> change (when I wanted to add the new macro it's easily getting out of
> 80 chars so I simply reformatted all the rest to make them look
> similar).  I'm afraid that could be too trivial to change the format
> as a single patch, but I can do it if anyone else also thinks it
> proper.
> 
>>
>>>  /*
>>>   * The default fault flags that should be used by most of the
>>>   * arch-specific page fault handlers.
>>>   */
>>>  #define FAULT_FLAG_DEFAULT  (FAULT_FLAG_ALLOW_RETRY | \
>>> -			     FAULT_FLAG_KILLABLE)
>>> +			     FAULT_FLAG_KILLABLE | \
>>> +			     FAULT_FLAG_INTERRUPTIBLE)
>>
>> So by default, all faults are marked interruptible, also
>> !FAULT_FLAG_USER. I assume the trick right now is that
>> handle_userfault() will indeed only be called on user faults and the
>> flag is used nowhere else ;)
> 
> Sorry if this is confusing, but FAULT_FLAG_DEFAULT is just a macro to
> make the patchset easier so we define this initial flags for most of
> the archs (say, there can be some arch that does not use this default
> value, but the fact is most archs are indeed using the same flags
> hence we define it here now).
> 
> And, userfaultfd can also handle kernel faults.  For FAULT_FLAG_USER,
> it will be set if the fault comes from userspace (in
> do_user_addr_fault()).
> 
Got it, sounds sane to me then.

>>
>> Would it make sense to name it FAULT_FLAG_USER_INTERRUPTIBLE, to stress
>> that the flag only applies to user faults? (or am I missing something
>> and this could also apply to !user faults somewhen in the future?
> 
> As mentioned above, uffd can handle kernel faults.  And, for what I
> understand, it's not really directly related to user fault or not at
> all, instead its more or less match with TASK_{INTERRUPTIBLE|KILLABLE}
> on what kind of signals we care about during the fault processing.  So
> it seems to me that it's two different things.
> 
>>
>> (I am no expert on the fault paths yet, so sorry for the silly questions)
> 
> (I only hope that I'm not providing silly answers. :)

I highly doubt it :) Thanks!

-- 

Thanks,

David / dhildenb