Date:   Wed, 16 Dec 2020 14:49:36 -0800
From:   Mike Kravetz <mike.kravetz@...cle.com>
To:     Oscar Salvador <osalvador@...e.de>
Cc:     Muchun Song <songmuchun@...edance.com>, corbet@....net,
        tglx@...utronix.de, mingo@...hat.com, bp@...en8.de, x86@...nel.org,
        hpa@...or.com, dave.hansen@...ux.intel.com, luto@...nel.org,
        peterz@...radead.org, viro@...iv.linux.org.uk,
        akpm@...ux-foundation.org, paulmck@...nel.org,
        mchehab+huawei@...nel.org, pawan.kumar.gupta@...ux.intel.com,
        rdunlap@...radead.org, oneukum@...e.com, anshuman.khandual@....com,
        jroedel@...e.de, almasrymina@...gle.com, rientjes@...gle.com,
        willy@...radead.org, mhocko@...e.com, song.bao.hua@...ilicon.com,
        david@...hat.com, duanxiongchun@...edance.com,
        linux-doc@...r.kernel.org, linux-kernel@...r.kernel.org,
        linux-mm@...ck.org, linux-fsdevel@...r.kernel.org
Subject: Re: [PATCH v9 03/11] mm/hugetlb: Free the vmemmap pages associated
 with each HugeTLB page

On 12/16/20 2:25 PM, Oscar Salvador wrote:
> On Wed, Dec 16, 2020 at 02:08:30PM -0800, Mike Kravetz wrote:
>>> + * vmemmap_rmap_walk - walk vmemmap page table
>>> +
>>> +static void vmemmap_pte_range(pmd_t *pmd, unsigned long addr,
>>> +			      unsigned long end, struct vmemmap_rmap_walk *walk)
>>> +{
>>> +	pte_t *pte;
>>> +
>>> +	pte = pte_offset_kernel(pmd, addr);
>>> +	do {
>>> +		BUG_ON(pte_none(*pte));
>>> +
>>> +		if (!walk->reuse)
>>> +			walk->reuse = pte_page(pte[VMEMMAP_TAIL_PAGE_REUSE]);
>>
>> It may be just me, but I don't like the pte[-1] here.  It certainly does work
>> as designed because we want to remap all pages in the range to the page before
>> the range (at offset -1).  But, we do not really validate this 'reuse' page.
>> There is the BUG_ON(pte_none(*pte)) as a sanity check, but we do nothing similar
>> for pte[-1].  Based on the usage for HugeTLB pages, we can be confident that
>> pte[-1] is actually a pte.  In discussions with Oscar, you mentioned another
>> possible use for these routines.
> 
> Without giving it much of a thought, I guess we could duplicate the
> BUG_ON for the pte outside the loop, and add a new one for pte[-1].
> Also, since walk->reuse seems to not change once it is set, we can take
> it outside the loop? e.g:
> 
> 	pte_t *pte;
> 
> 	pte = pte_offset_kernel(pmd, addr);
> 	BUG_ON(pte_none(*pte));
> 	BUG_ON(pte_none(pte[VMEMMAP_TAIL_PAGE_REUSE]));
> 	walk->reuse = pte_page(pte[VMEMMAP_TAIL_PAGE_REUSE]);
> 	do {
> 		....
> 	} while...
> 
> Or I am not sure whether we want to keep it inside the loop in case
> future cases change walk->reuse during the operation.
> But to be honest, I do not think it is realistic to anticipate all possible
> future uses of this, so I would rather keep it simple for now.

I was thinking about possibly passing the 'reuse' address as another parameter
to vmemmap_remap_reuse().  We could add this addr to the vmemmap_rmap_walk
struct and set walk->reuse when we get to the pte for that address.  Of
course this would imply that the addr would need to be part of the range.
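
As a rough illustration of that idea, here is a user-space model, not kernel
code.  The names model_pte, model_walk and model_pte_range are hypothetical
stand-ins, and a 4K page size is assumed.  The walk only sets walk->reuse
when it reaches the entry for reuse_addr, so the reuse entry passes the same
present check as every other entry in the range:

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SHIFT 12			/* assumption: 4K pages */
#define PAGE_SIZE  (1UL << PAGE_SHIFT)

/* Hypothetical stand-in for a pte: a frame number and a present bit. */
struct model_pte {
	unsigned long pfn;
	int present;
};

/* Hypothetical walk state; reuse_addr is the extra parameter. */
struct model_walk {
	unsigned long reuse_addr;
	const struct model_pte *reuse;	/* NULL until the walk reaches reuse_addr */
	unsigned long nr_remapped;
};

/*
 * Walk [addr, end), where pte[0] is the entry for addr.  Returns 0 on
 * success, or -1 if an entry is not present or if a page would have to be
 * remapped before the reuse entry has been seen.
 */
static int model_pte_range(struct model_pte *pte, unsigned long addr,
			   unsigned long end, struct model_walk *walk)
{
	assert(pte && walk);

	for (; addr < end; addr += PAGE_SIZE, pte++) {
		if (!pte->present)
			return -1;
		if (addr == walk->reuse_addr) {
			/* Reached through the walk itself, so already validated. */
			walk->reuse = pte;
			continue;
		}
		if (!walk->reuse)
			return -1;
		/* Point this entry at the frame of the reuse page. */
		pte->pfn = walk->reuse->pfn;
		walk->nr_remapped++;
	}
	return 0;
}
```

In this model, a reuse_addr that is not inside [addr, end) makes the walk
fail instead of silently reading an entry outside the range.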

Ideally, we would walk the page table to get to the reuse page.  My concern
was not explicitly about adding the BUG_ON.  In more general use, *pte could
be the first entry on a pte page.  And, then pte[-1] may not even be a pte.
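
To make the boundary case concrete, here is a small user-space sketch, again
not kernel code.  It assumes 4K pages and 512 entries per pte page (the
x86-64 values), and model_pte_index() and model_prev_pte() are hypothetical
helpers.  When the index of addr within its pte page is 0, there is no
previous entry in that page, so pte[-1] would point outside it:

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SHIFT   12		/* assumption: 4K pages */
#define PTRS_PER_PTE 512	/* assumption: 512 entries per pte page */

/* Index of addr's entry within its pte page. */
static unsigned long model_pte_index(unsigned long addr)
{
	return (addr >> PAGE_SHIFT) & (PTRS_PER_PTE - 1);
}

/*
 * Return the entry before addr's entry, or NULL when addr maps to the
 * first entry of the pte page and no such entry exists in this table.
 */
static const unsigned long *model_prev_pte(const unsigned long *table,
					   unsigned long addr)
{
	unsigned long idx = model_pte_index(addr);

	assert(table != NULL);
	return idx ? &table[idx - 1] : NULL;
}
```

With these assumed values, an addr of 0x200000 has index 0 and gets NULL
back, while 0x201000 has index 1 and gets a pointer to the first entry.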

Again, I don't think this matters for the current HugeTLB use case.  I am just
a little concerned about what happens if this code is put to use for other
purposes.
-- 
Mike Kravetz
