lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <44efc25e-525b-9e51-60e4-da20deb25ded@oracle.com>
Date:   Thu, 19 Nov 2020 15:21:32 -0800
From:   Mike Kravetz <mike.kravetz@...cle.com>
To:     Muchun Song <songmuchun@...edance.com>,
        Oscar Salvador <osalvador@...e.de>
Cc:     Jonathan Corbet <corbet@....net>,
        Thomas Gleixner <tglx@...utronix.de>, mingo@...hat.com,
        bp@...en8.de, x86@...nel.org, hpa@...or.com,
        dave.hansen@...ux.intel.com, luto@...nel.org,
        Peter Zijlstra <peterz@...radead.org>, viro@...iv.linux.org.uk,
        Andrew Morton <akpm@...ux-foundation.org>, paulmck@...nel.org,
        mchehab+huawei@...nel.org, pawan.kumar.gupta@...ux.intel.com,
        Randy Dunlap <rdunlap@...radead.org>, oneukum@...e.com,
        anshuman.khandual@....com, jroedel@...e.de,
        Mina Almasry <almasrymina@...gle.com>,
        David Rientjes <rientjes@...gle.com>,
        Matthew Wilcox <willy@...radead.org>,
        Michal Hocko <mhocko@...e.com>,
        Xiongchun duan <duanxiongchun@...edance.com>,
        linux-doc@...r.kernel.org, LKML <linux-kernel@...r.kernel.org>,
        Linux Memory Management List <linux-mm@...ck.org>,
        linux-fsdevel <linux-fsdevel@...r.kernel.org>
Subject: Re: [External] Re: [PATCH v4 05/21] mm/hugetlb: Introduce pgtable
 allocation/freeing helpers

On 11/18/20 10:17 PM, Muchun Song wrote:
> On Tue, Nov 17, 2020 at 11:06 PM Oscar Salvador <osalvador@...e.de> wrote:
>>
>> On Fri, Nov 13, 2020 at 06:59:36PM +0800, Muchun Song wrote:
>>> +#define page_huge_pte(page)          ((page)->pmd_huge_pte)
>>
>> Seems you do not need this one anymore.
>>
>>> +void vmemmap_pgtable_free(struct page *page)
>>> +{
>>> +     struct page *pte_page, *t_page;
>>> +
>>> +     list_for_each_entry_safe(pte_page, t_page, &page->lru, lru) {
>>> +             list_del(&pte_page->lru);
>>> +             pte_free_kernel(&init_mm, page_to_virt(pte_page));
>>> +     }
>>> +}
>>> +
>>> +int vmemmap_pgtable_prealloc(struct hstate *h, struct page *page)
>>> +{
>>> +     unsigned int nr = pgtable_pages_to_prealloc_per_hpage(h);
>>> +
>>> +     /* Store preallocated pages on huge page lru list */
>>> +     INIT_LIST_HEAD(&page->lru);
>>> +
>>> +     while (nr--) {
>>> +             pte_t *pte_p;
>>> +
>>> +             pte_p = pte_alloc_one_kernel(&init_mm);
>>> +             if (!pte_p)
>>> +                     goto out;
>>> +             list_add(&virt_to_page(pte_p)->lru, &page->lru);
>>> +     }
>>
>> Definetely this looks better and easier to handle.
>> Btw, did you explore Matthew's hint about instead of allocating a new page,
>> using one of the ones you are going to free to store the ptes?
>> I am not sure whether it is feasible at all though.
> 
> Hi Oscar and Matthew,
> 
> I have started an investigation about this. Finally, I think that it
> may not be feasible. If we use a vmemmap page frame as a
> page table when we split the PMD table firstly, in this stage,
> we need to set 512 pte entry to the vmemmap page frame. If
> someone reads the tail struct page struct of the HugeTLB,
> it can get the arbitrary value (I am not sure it actually exists,
> maybe the memory compaction module can do this). So on
> the safe side, I think that allocating a new page is a good
> choice.

Thanks for looking into this.

If I understand correctly, the issue is that you need the pte page to set
up the new mappings.  In your current code, this is done before removing
the pages of struct pages.  This keeps everything 'consistent' as things
are remapped.

If you want to use one of the 'pages of struct pages' for the new pte
page, then there will be a period of time when things are inconsistent.
Before setting up the mapping, some code could potentially access that
pages of struct pages.

I tend to agree that allocating allocating a new page is the safest thing
to do here.  Or, perhaps someone can think of a way make this safe.
-- 
Mike Kravetz

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ