Message-ID: <YC0fIhEHRDOVzK8U@dhcp22.suse.cz>
Date:   Wed, 17 Feb 2021 14:50:26 +0100
From:   Michal Hocko <mhocko@...e.com>
To:     David Hildenbrand <david@...hat.com>
Cc:     Oscar Salvador <osalvador@...e.de>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Mike Kravetz <mike.kravetz@...cle.com>,
        Muchun Song <songmuchun@...edance.com>, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH 1/2] mm: Make alloc_contig_range handle free hugetlb pages

On Wed 17-02-21 14:36:47, David Hildenbrand wrote:
> On 17.02.21 14:30, Michal Hocko wrote:
> > On Wed 17-02-21 11:08:15, Oscar Salvador wrote:
> > > Free hugetlb pages are tricky to handle: so that no userspace application
> > > notices disruption, we need to replace the current free hugepage with
> > > a new one.
> > > 
> > > In order to do that, a new function called alloc_and_dissolve_huge_page
> > > is introduced.
> > > This function will first try to get a new fresh hugetlb page, and if it
> > > succeeds, it will dissolve the old one.
> > > 
> > > With regard to the allocation, since we do not know whether the old page
> > > was allocated on a specific node on request, the node the old page belongs
> > > to will be tried first, and then we will fall back to all nodes containing
> > > memory (N_MEMORY).
> > 
> > I do not think falling back to a different zone is ok. If it is, then
> > this really requires very good reasoning. alloc_contig_range is an
> > optimistic allocation interface at best and it shouldn't break careful
> > node-aware preallocation done by the administrator.
> 
> What does memory offlining do when migrating in-use hugetlbfs pages? Does it
> always keep the node?

No, it will break the node pool. The reasoning behind that is that
offlining is an explicit request from userspace and it is expected
to break affinities because it is a destructive action from the memory
capacity point of view. It is impossible to preserve the former affinity
while you are cutting the memory out from under its user.

> I think keeping the node is the easiest/simplest approach for now.
> 
> > 
> > > Note that gigantic hugetlb pages are fenced off since there is a cyclic
> > > dependency between them and alloc_contig_range.
> > 
> > Why do we need/want to do all this in the first place?
> 
> cma and virtio-mem (especially on ZONE_MOVABLE) really want to handle
> hugetlbfs pages.

Do we have any real-life examples? Or does this fall more into the
"let's optimize an existing implementation" category?
-- 
Michal Hocko
SUSE Labs
