lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <Pine.LNX.4.64.0611031353570.16486@schroedinger.engr.sgi.com>
Date:	Fri, 3 Nov 2006 14:10:01 -0800 (PST)
From:	Christoph Lameter <clameter@....com>
To:	Andrew Morton <akpm@...l.org>
cc:	linux-kernel@...r.kernel.org
Subject: Re: Avoid allocating during interleave from almost full nodes

On Fri, 3 Nov 2006, Andrew Morton wrote:

> > The exact node we choose during interleave does not matter much if we are
> > under memory pressure since the allocations will be redirected anyways
> > after we have overallocated a single node.
> 
> Am not clear on what that means.

If we currently overallocate a node then we fall back to other nodes along 
the zonelist. We will not be able to allocate on the intended node and 
the next interleave node becomes the nearest node with enough memory.

> > This patch checks for the amount of free pages on a node. If it is lower
> > than a predefined limit (in /proc/sys/kernel/min_interleave_ratio) then
> 
> You mean /proc/sys/vm

Right.

> > we avoid allocating from that node. We keep a bitmap of full nodes
> > that is cleared every 2 seconds when draining the pagesets for
> > node 0.
> 
> Wall time is a bogus concept in the VM.  Can we please stop relying upon it?

We use the same 2 second pulse to drain slab caches, and the page 
allocators per cpu caches. The slab draining has been around forever. Its 
relying on jiffies and not on wall time.

> > not matter though since we always can fall back to operating without
> > full_interleave_nodes. As a result of the racyness we may uselessly
> > skip a node or retest a node.
> 
> This design relies upon nodes having certain amounts of free memory.  This
> concept is bogus.  Because it treats clean pagecache which hasn't been used
> since last Saturday as "in use".  It is not in use.

It relies on free pages, not on in use pages. The attempt is to bypass 
expensive reclaim as long as we can find free memory on other nodes.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ