Message-ID: <9bd6e57e-2c77-33f9-f9ea-7916b20ee6a5@redhat.com>
Date: Wed, 30 Nov 2022 11:21:40 +0100
From: David Hildenbrand <david@...hat.com>
To: Peter Xu <peterx@...hat.com>, linux-mm@...ck.org,
linux-kernel@...r.kernel.org
Cc: James Houghton <jthoughton@...gle.com>,
Jann Horn <jannh@...gle.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Andrea Arcangeli <aarcange@...hat.com>,
Rik van Riel <riel@...riel.com>,
Nadav Amit <nadav.amit@...il.com>,
Miaohe Lin <linmiaohe@...wei.com>,
Muchun Song <songmuchun@...edance.com>,
Mike Kravetz <mike.kravetz@...cle.com>
Subject: Re: [PATCH 03/10] mm/hugetlb: Document huge_pte_offset usage

On 29.11.22 20:35, Peter Xu wrote:
> huge_pte_offset() is potentially a pgtable walker, looking up pte_t* for a
> hugetlb address.
>
> Normally, it's always safe to walk a generic pgtable as long as we're with
> the mmap lock held for either read or write, because that guarantees the
> pgtable pages will always be valid during the process.
With the addition that it's only safe to walk within VMA ranges while
holding the mmap lock in read mode. It's not safe to walk outside VMA
ranges.

I assume the point here is that we're walking within a known hugetlbfs
VMA, so I'm just adding this for completeness :)
>
> But that's not true for hugetlbfs, especially for shared mappings:
> hugetlbfs can have its pgtable freed by pmd unsharing, which means that
> even with the mmap lock held for the current mm, the PMD pgtable page can
> still go away from under us if pmd unsharing is possible during the walk.
>
> So we have two ways to make it safe even for a shared mapping:
>
> (1) If we're with the hugetlb vma lock held for either read/write, it's
> okay because pmd unshare cannot happen at all.
>
> (2) If we're with the i_mmap_rwsem lock held for either read/write, it's
> okay because even if pmd unshare can happen, the pgtable page cannot
> be freed from under us.
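
To make the rules above a bit more concrete, here is a rough sketch of
how I read them, written as a toy userspace predicate. It is not kernel
code: struct walk_ctx, its fields and walk_is_safe() are all made-up
names for illustration. It only models walks done under the mmap lock
within a VMA; that baseline is my assumption, and the sketch says
nothing about walkers that rely on other locks alone.

```c
#include <stdbool.h>

/*
 * Toy userspace model of the locking rules discussed in this thread.
 * NOT kernel code: every name below is hypothetical.
 */
struct walk_ctx {
	bool mmap_lock_held;       /* mmap lock held, read or write */
	bool addr_in_vma;          /* address lies within a known VMA */
	bool pmd_sharing_possible; /* e.g. a shared hugetlbfs mapping */
	bool vma_lock_held;        /* hugetlb vma lock, read or write */
	bool i_mmap_rwsem_held;    /* i_mmap_rwsem, read or write */
};

/* Returns true if the walk satisfies the rules as stated above. */
static bool walk_is_safe(const struct walk_ctx *c)
{
	/* Baseline assumed here: mmap lock held, walking inside a VMA. */
	if (!c->mmap_lock_held || !c->addr_in_vma)
		return false;

	/* Without pmd sharing, the baseline is sufficient. */
	if (!c->pmd_sharing_possible)
		return true;

	/*
	 * (1) vma lock held: pmd unshare cannot happen at all.
	 * (2) i_mmap_rwsem held: the pgtable page cannot be freed.
	 */
	return c->vma_lock_held || c->i_mmap_rwsem_held;
}
```

So for a shared mapping, the mmap lock alone would return false here,
while taking either of the two locks flips the result to true.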
>
> Document it.
>
> Signed-off-by: Peter Xu <peterx@...hat.com>
In general, I like that documentation. Let's see if we can figure out
what to do with the i_mmap_rwsem.
--
Thanks,
David / dhildenb