Date:	Wed,  8 Jun 2016 16:35:37 +0200
From:	Lukasz Odzioba <lukasz.odzioba@...el.com>
To:	linux-kernel@...r.kernel.org, linux-mm@...ck.org,
	akpm@...ux-foundation.org, kirill.shutemov@...ux.intel.com,
	mhocko@...e.com, aarcange@...hat.com, vdavydov@...allels.com,
	mingli199x@...com, minchan@...nel.org
Cc:	dave.hansen@...el.com, lukasz.anaczkowski@...el.com,
	lukasz.odzioba@...el.com
Subject: [PATCH 1/1] mm/swap.c: flush lru_add pvecs on compound page arrival

When an application does not exit cleanly (e.g. on SIGTERM) we might
end up with some pages in lru_add_pvec, which is ok. With THP
enabled, huge pages may also end up on the per-cpu lru_add_pvecs.
On systems with a lot of processors we end up with quite a lot
of memory pending addition to the LRU - in the worst case
up to CPUS * PAGE_SIZE * PAGEVEC_SIZE (with PAGE_SIZE being the huge
page size when THP is in use), which on a machine with 200+ CPUs
means GBs in practice.

We are able to reproduce this problem with the following program:

#include <string.h>
#include <sys/mman.h>

int main(void)
{
	size_t size = 55 * 1000 * 1000; // smaller than MEM/CPUS
	void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
		MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p != MAP_FAILED)
		memset(p, 0, size); // touch every page so it is really allocated
	//munmap(p, size); // uncomment to make the problem go away
	return 0;
}

Running it leaves a significant amount of memory on the pvecs.
This memory is not reclaimed when we hit OOM, so when we run the
above program in a loop:
	$ for i in `seq 100`; do ./a.out; done
many processes (95% in my case) are killed by the OOM killer.

This patch flushes lru_add_pvecs on compound page arrival making
the problem less severe - kill rate drops to 0%.

Suggested-by: Michal Hocko <mhocko@...e.com>
Tested-by: Lukasz Odzioba <lukasz.odzioba@...el.com>
Signed-off-by: Lukasz Odzioba <lukasz.odzioba@...el.com>
---
 mm/swap.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/mm/swap.c b/mm/swap.c
index 9591614..3fe4f18 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -391,9 +391,8 @@ static void __lru_cache_add(struct page *page)
 	struct pagevec *pvec = &get_cpu_var(lru_add_pvec);
 
 	get_page(page);
-	if (!pagevec_space(pvec))
+	if (!pagevec_add(pvec, page) || PageCompound(page))
 		__pagevec_lru_add(pvec);
-	pagevec_add(pvec, page);
 	put_cpu_var(lru_add_pvec);
 }
 
-- 
1.8.3.1
