lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <6b1533f7-69c6-6f19-fc93-c69750caaecc@redhat.com>
Date:   Fri, 20 Nov 2020 10:27:05 +0100
From:   David Hildenbrand <david@...hat.com>
To:     Michal Hocko <mhocko@...e.com>,
        Muchun Song <songmuchun@...edance.com>
Cc:     corbet@....net, mike.kravetz@...cle.com, tglx@...utronix.de,
        mingo@...hat.com, bp@...en8.de, x86@...nel.org, hpa@...or.com,
        dave.hansen@...ux.intel.com, luto@...nel.org, peterz@...radead.org,
        viro@...iv.linux.org.uk, akpm@...ux-foundation.org,
        paulmck@...nel.org, mchehab+huawei@...nel.org,
        pawan.kumar.gupta@...ux.intel.com, rdunlap@...radead.org,
        oneukum@...e.com, anshuman.khandual@....com, jroedel@...e.de,
        almasrymina@...gle.com, rientjes@...gle.com, willy@...radead.org,
        osalvador@...e.de, song.bao.hua@...ilicon.com,
        duanxiongchun@...edance.com, linux-doc@...r.kernel.org,
        linux-kernel@...r.kernel.org, linux-mm@...ck.org,
        linux-fsdevel@...r.kernel.org
Subject: Re: [PATCH v5 00/21] Free some vmemmap pages of hugetlb page

On 20.11.20 09:42, Michal Hocko wrote:
> On Fri 20-11-20 14:43:04, Muchun Song wrote:
> [...]
> 
> Thanks for improving the cover letter and providing some numbers. I have
> only glanced through the patchset because I didn't really have more time
> to dive depply into them.
> 
> Overall it looks promissing. To summarize. I would prefer to not have
> the feature enablement controlled by compile time option and the kernel
> command line option should be opt-in. I also do not like that freeing
> the pool can trigger the oom killer or even shut the system down if no
> oom victim is eligible.
> 
> One thing that I didn't really get to think hard about is what is the
> effect of vmemmap manipulation wrt pfn walkers. pfn_to_page can be
> invalid when racing with the split. How do we enforce that this won't
> blow up?

I have the same concerns - the sections are online the whole time and 
anybody with pfn_to_online_page() can grab them

I think we have similar issues with memory offlining when removing the 
vmemmap, it's just very hard to trigger and we can easily protect by 
grabbing the memhotplug lock. I once discussed with Dan using rcu to 
protect the SECTION_IS_ONLINE bit, to make sure anybody who did a 
pfn_to_online_page() stopped using the page. Of course, such an approach 
is not easy to use in this context where the sections stay online the 
whole time ... we would have to protect vmemmap table entries using rcu 
or similar, which can get quite ugly.

To keep things easy, maybe simply never allow to free these hugetlb 
pages again for now? If they were reserved during boot and the vmemmap 
condensed, then just let them stick around for all eternity.

Once we have a safe approach on how to modify an online vmemmap, we can 
enable this freeing, and eventually also dynamically manage vmemmaps for 
runtime-allocated huge pages.

-- 
Thanks,

David / dhildenb

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ