Message-ID: <0d2cb431-bd43-7064-4311-ab541f11fbf8@redhat.com>
Date: Wed, 1 Sep 2021 18:10:55 +0200
From: David Hildenbrand <david@...hat.com>
To: Jason Gunthorpe <jgg@...pe.ca>
Cc: Qi Zheng <zhengqi.arch@...edance.com>, akpm@...ux-foundation.org,
tglx@...utronix.de, hannes@...xchg.org, mhocko@...nel.org,
vdavydov.dev@...il.com, kirill.shutemov@...ux.intel.com,
mika.penttila@...tfour.com, linux-doc@...r.kernel.org,
linux-kernel@...r.kernel.org, linux-mm@...ck.org,
songmuchun@...edance.com
Subject: Re: [PATCH v2 0/9] Free user PTE page table pages
On 01.09.21 18:07, Jason Gunthorpe wrote:
> On Wed, Sep 01, 2021 at 02:32:08PM +0200, David Hildenbrand wrote:
>
>> b) pmd_trans_unstable_or_pte_try_get() and friends are really ugly.
>
> I suspect the good API here is really more like:
That was exactly my first idea and I tried to rework the code for
roughly 2 days and failed.
Especially in pagefault logic, we temporarily unmap/unlock to map/lock
again later and don't want the page table to just vanish.
I think I met similar cases when allocating a page table and not wanting
it to vanish and not wanting to map/lock it. But I don't recall all the
corner cases: it didn't work for me.
>
> ptep = pte_try_map(pmdp, &pmd_value)
> if (!ptep) {
> // pmd_value is guaranteed to not be a PTE table pointer.
> if (pmd_XXX(pmd_value))
> }
>
> Ie the core code will do whatever stuff, including the THP data race
> avoidance, to either return the next level page table or the value of
> a pmd that is not a next level page table. Callers are much clearer in
> this way.
>
> Eg this is a fairly representative sample user:
>
> static int smaps_pte_range(pmd_t *pmd, unsigned long addr, unsigned long end,
> struct mm_walk *walk)
> {
> if (pmd_trans_unstable(pmd))
> goto out;
> pte = pte_offset_map_lock(vma->vm_mm, pmd, addr, &ptl);
>
> And it is obviously pretty easy to integrate any refcount into
> pte_try_map and pte_unmap as in my other email.
It didn't work when I tried.
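
For readers following the thread, the API shape quoted above can be
modelled in userspace roughly as below. This is only a sketch under
stated assumptions, not kernel code and not what the patch series
implements: every toy_* name is hypothetical, the "PMD" is a plain
struct, a single struct copy stands in for the THP race avoidance, and
the reference count is assumed to cover populated entries plus
transient users. It does not model the fault-path case described above,
where the table is unmapped/unlocked and must survive until it is
mapped again.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_PTRS 8 /* entries per toy PTE table (arbitrary) */

/* Hypothetical stand-ins; none of these match the kernel's types. */
enum toy_pmd_kind { TOY_PMD_NONE, TOY_PMD_HUGE, TOY_PMD_TABLE };

struct toy_pte_table {
    int refcount;               /* populated entries + transient users */
    bool freed;                 /* set once the last reference is dropped */
    unsigned long pte[TOY_PTRS];
};

struct toy_pmd {
    enum toy_pmd_kind kind;
    struct toy_pte_table *table; /* valid only for TOY_PMD_TABLE */
};

/*
 * Snapshot the PMD once. If it points to a PTE table, take a reference
 * and return the table. Otherwise return NULL; *pmd_value then holds a
 * value that is guaranteed not to be a table pointer.
 */
struct toy_pte_table *toy_pte_try_map(struct toy_pmd *pmdp,
                                      struct toy_pmd *pmd_value)
{
    *pmd_value = *pmdp;
    if (pmd_value->kind != TOY_PMD_TABLE)
        return NULL;
    pmd_value->table->refcount++;
    return pmd_value->table;
}

/* Drop one reference; the last one detaches the table from the PMD. */
void toy_pte_unmap(struct toy_pmd *pmdp, struct toy_pte_table *table)
{
    assert(table->refcount > 0);
    if (--table->refcount == 0) {
        pmdp->kind = TOY_PMD_NONE;
        pmdp->table = NULL;
        table->freed = true;
    }
}

/* Sample caller, loosely shaped like the smaps walker quoted above. */
int toy_count_present(struct toy_pmd *pmdp)
{
    struct toy_pmd pmd_value;
    struct toy_pte_table *ptep = toy_pte_try_map(pmdp, &pmd_value);
    int present = 0;

    if (!ptep) /* no table: either a huge mapping or nothing at all */
        return pmd_value.kind == TOY_PMD_HUGE ? TOY_PTRS : 0;

    for (size_t i = 0; i < TOY_PTRS; i++)
        if (ptep->pte[i] != 0)
            present++;

    toy_pte_unmap(pmdp, ptep);
    return present;
}
```

In this model the caller never inspects the PMD before mapping: it
either gets a referenced table or a snapshot it can classify.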
--
Thanks,
David / dhildenb