Message-ID: <874jo270sg.fsf@nvidia.com>
Date:   Wed, 24 May 2023 15:16:06 +1000
From:   Alistair Popple <apopple@...dia.com>
To:     Hugh Dickins <hughd@...gle.com>
Cc:     Qi Zheng <qi.zheng@...ux.dev>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Mike Kravetz <mike.kravetz@...cle.com>,
        Mike Rapoport <rppt@...nel.org>,
        "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
        Matthew Wilcox <willy@...radead.org>,
        David Hildenbrand <david@...hat.com>,
        Suren Baghdasaryan <surenb@...gle.com>,
        Qi Zheng <zhengqi.arch@...edance.com>,
        Yang Shi <shy828301@...il.com>,
        Mel Gorman <mgorman@...hsingularity.net>,
        Peter Xu <peterx@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Will Deacon <will@...nel.org>, Yu Zhao <yuzhao@...gle.com>,
        Ralph Campbell <rcampbell@...dia.com>,
        Ira Weiny <ira.weiny@...el.com>,
        Steven Price <steven.price@....com>,
        SeongJae Park <sj@...nel.org>,
        Naoya Horiguchi <naoya.horiguchi@....com>,
        Christophe Leroy <christophe.leroy@...roup.eu>,
        Zack Rusin <zackr@...are.com>, Jason Gunthorpe <jgg@...pe.ca>,
        Axel Rasmussen <axelrasmussen@...gle.com>,
        Anshuman Khandual <anshuman.khandual@....com>,
        Pasha Tatashin <pasha.tatashin@...een.com>,
        Miaohe Lin <linmiaohe@...wei.com>,
        Minchan Kim <minchan@...nel.org>,
        Christoph Hellwig <hch@...radead.org>,
        Song Liu <song@...nel.org>,
        Thomas Hellstrom <thomas.hellstrom@...ux.intel.com>,
        linux-kernel@...r.kernel.org, linux-mm@...ck.org
Subject: Re: [PATCH 13/31] mm/hmm: retry if pte_offset_map() fails


Hugh Dickins <hughd@...gle.com> writes:

> On Tue, 23 May 2023, Qi Zheng wrote:
>> On 2023/5/23 10:39, Alistair Popple wrote:
>> > Qi Zheng <qi.zheng@...ux.dev> writes:
>> >> On 2023/5/22 13:05, Hugh Dickins wrote:
>> >>> hmm_vma_walk_pmd() is called through mm_walk, but already has a goto
>> >>> again loop of its own, so take part in that if pte_offset_map() fails.
>> >>> Signed-off-by: Hugh Dickins <hughd@...gle.com>
>> >>> ---
>> >>>    mm/hmm.c | 2 ++
>> >>>    1 file changed, 2 insertions(+)
>> >>> diff --git a/mm/hmm.c b/mm/hmm.c
>> >>> index e23043345615..b1a9159d7c92 100644
>> >>> --- a/mm/hmm.c
>> >>> +++ b/mm/hmm.c
>> >>> @@ -381,6 +381,8 @@ static int hmm_vma_walk_pmd(pmd_t *pmdp,
>> >>>  	}
>> >>>  
>> >>>  	ptep = pte_offset_map(pmdp, addr);
>> >>> +	if (!ptep)
>> >>> +		goto again;
>> >>>  	for (; addr < end; addr += PAGE_SIZE, ptep++, hmm_pfns++) {
>> >>>  		int r;
>> >>>  
>> >>
>> >> I haven't read the entire patch set yet, but taking a note here.
>> >> The hmm_vma_handle_pte() will unmap pte and then call
>> >> migration_entry_wait() to remap pte, so this may fail, we need to
>> >> handle this case like below:
>> > 
>> > I don't see a problem here. Sure, hmm_vma_handle_pte() might return
>> > -EBUSY but that will get returned up to hmm_range_fault() which will
>> > retry the whole thing again and presumably fail when looking at the PMD.
>> 
>> Yeah. There is no problem with this and the modification to
>> migration_entry_wait() can be simplified. My previous thought was that
>> we can finish the retry logic in hmm_vma_walk_pmd() without handing it
>> over to the caller. :)
>
> Okay, Alistair has resolved this one, thanks, I agree; but what is
> "the modification to migration_entry_wait()" that you refer to there?
>
> I don't think there's any need to make it a bool, it's normal for there
> to be races on entry to migration_entry_wait(), and we're used to just
> returning to caller (and back up to userspace) when it does not wait.

Agreed. I didn't spot any places where returning to the caller without
actually waiting would cause looping. I assume any retries or refaults
will find the cleared PMD and fault/error out in some other manner
anyway.
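
For anyone following along, the two retry levels discussed above can be
sketched in plain userspace C. This is only an illustration under stated
assumptions, not kernel code: every mock_* name and the struct below are
hypothetical stand-ins for pte_offset_map(), hmm_vma_handle_pte(),
hmm_vma_walk_pmd() and hmm_range_fault(), and the "page table" is just an
int array. A failed map is retried locally with goto again, while -EBUSY
from the per-pte handler is passed up so the outer caller restarts the
whole walk.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a PMD whose page table can briefly go away. */
struct mock_pmd {
	int *ptes;               /* backing "page table" entries */
	size_t nr_ptes;          /* number of entries in ptes */
	int transient_failures;  /* map attempts that fail before one succeeds */
	bool busy_once;          /* handler returns -EBUSY on its first call */
};

/* Models pte_offset_map(): returns NULL while the table is unavailable. */
static int *mock_pte_offset_map(struct mock_pmd *pmd)
{
	if (pmd->transient_failures > 0) {
		pmd->transient_failures--;
		return NULL;
	}
	return pmd->ptes;
}

/* Models hmm_vma_handle_pte(): may ask the caller to start over. */
static int mock_handle_pte(struct mock_pmd *pmd, int pte, int *out)
{
	if (pmd->busy_once) {
		pmd->busy_once = false;
		return -EBUSY;
	}
	*out = pte;
	return 0;
}

/* Models hmm_vma_walk_pmd(): retry locally if the map fails. */
static int mock_walk_pmd(struct mock_pmd *pmd, int *out)
{
	int *ptep;
	size_t i;

	/* Without a backing table the local retry below would never end. */
	assert(pmd->ptes != NULL);
again:
	ptep = mock_pte_offset_map(pmd);
	if (!ptep)
		goto again;

	for (i = 0; i < pmd->nr_ptes; i++) {
		int r = mock_handle_pte(pmd, ptep[i], &out[i]);

		if (r)
			return r;	/* -EBUSY is handed to the caller */
	}
	return 0;
}

/* Models hmm_range_fault(): restart the whole walk on -EBUSY. */
static int mock_range_fault(struct mock_pmd *pmd, int *out, int *walks)
{
	int ret;

	do {
		(*walks)++;
		ret = mock_walk_pmd(pmd, out);
	} while (ret == -EBUSY);

	return ret;
}
```

In the mock, the local loop terminates because the failure counter runs
down; in the real function the again label re-reads the PMD, so a retry
sees whatever replaced the page table and takes a different path.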

hmm_range_fault() is the only place that might have been a bit special,
but it looks fine to me so:

Reviewed-by: Alistair Popple <apopple@...dia.com>

> Hugh