lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20180823105130.GB14924@techadventures.net>
Date:   Thu, 23 Aug 2018 12:51:30 +0200
From:   Oscar Salvador <osalvador@...hadventures.net>
To:     Andrew Morton <akpm@...ux-foundation.org>
Cc:     Michal Hocko <mhocko@...nel.org>, tglx@...utronix.de,
        joe@...ches.com, arnd@...db.de, gregkh@...uxfoundation.org,
        linux-kernel@...r.kernel.org, linux-mm@...ck.org,
        Oscar Salvador <osalvador@...e.de>
Subject: Re: [PATCH] mm: Fix comment for NODEMASK_ALLOC

On Tue, Aug 21, 2018 at 01:51:59PM -0700, Andrew Morton wrote:
> On Tue, 21 Aug 2018 14:30:24 +0200 Oscar Salvador <osalvador@...hadventures.net> wrote:
> 
> > On Tue, Aug 21, 2018 at 02:17:34PM +0200, Michal Hocko wrote:
> > > We do have CONFIG_NODES_SHIFT=10 in our SLES kernels for quite some
> > > time (around SLE11-SP3 AFAICS).
> > > 
> > > Anyway, isn't NODES_ALLOC over engineered a bit? Does actually even do
> > > larger than 1024 NUMA nodes? This would be 128B and from a quick glance
> > > it seems that none of those functions are called in deep stacks. I
> > > haven't gone through all of them but a patch which checks them all and
> > > removes NODES_ALLOC would be quite nice IMHO.
> > 
> > No, maximum we can get is 1024 NUMA nodes.
> > I checked this when writing another patch [1], and since having gone
> > through all archs Kconfigs, CONFIG_NODES_SHIFT=10 is the limit.
> > 
> > NODEMASK_ALLOC gets only called from:
> > 
> > - unregister_mem_sect_under_nodes() (not anymore after [1])
> > - __nr_hugepages_store_common (This does not seem to have a deep stack, we could use a normal nodemask_t)
> > 
> > But is also used for NODEMASK_SCRATCH (mainly used for mempolicy):
> > 
> > struct nodemask_scratch {
> > 	nodemask_t	mask1;
> > 	nodemask_t	mask2;
> > };
> > 
> > that would make 256 bytes in case CONFIG_NODES_SHIFT=10.
> 
> And that sole site could use an open-coded kmalloc.

It is not really one single place, but four:

- do_set_mempolicy()
- do_mbind()
- kernel_migrate_pages()
- mpol_shared_policy_init()

They get called in:

- do_set_mempolicy()
	- From set_mempolicy syscall
	- From numa_policy_init()
	- From numa_default_policy()

	* All above do not look like they have a deep stack, so it should
	  be possible to get rid of NODEMASK_SCRATCH there.

- do_mbind
	- From mbind syscall

	* Should be feasible here as well.

- kernel_migrate_pages()

	- From migrate_pages syscall
	
	* Again, this should be doable.

- mpol_shared_policy_init()

	- From hugetlbfs_alloc_inode()
	- shmem_get_inode()
	
	* Seems doable for hugetlbfs_alloc_inode as well. 
	  I only got to check hugetlbfs_alloc_inode, because shmem_get_inode


So it seems that this can be done in most of the places.
The only tricky function might be mpol_shared_policy_init because of shmem_get_inode.
But in that case, we could use an open-coded kmalloc there.

Thanks
-- 
Oscar Salvador
SUSE L3

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ