Date:	Sat, 12 Dec 2015 12:00:32 -0500
From:	Johannes Weiner <hannes@...xchg.org>
To:	Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>
Cc:	mhocko@...nel.org, linux-mm@...ck.org,
	linux-kernel@...r.kernel.org, torvalds@...ux-foundation.org,
	rientjes@...gle.com, oleg@...hat.com, kwalker@...hat.com,
	cl@...ux.com, akpm@...ux-foundation.org, vdavydov@...allels.com,
	skozina@...hat.com, mgorman@...e.de, riel@...hat.com,
	arekm@...en.pl
Subject: Re: [PATCH v4] mm,oom: Add memory allocation watchdog kernel thread.

On Sun, Dec 13, 2015 at 12:33:04AM +0900, Tetsuo Handa wrote:
> +Currently, when something went wrong inside memory allocation request,
> +the system will stall with either 100% CPU usage (if memory allocating
> +tasks are doing busy loop) or 0% CPU usage (if memory allocating tasks
> +are waiting for file data to be flushed to storage).
> +But /proc/sys/kernel/hung_task_warnings is not helpful because memory
> +allocating tasks unlikely sleep in uninterruptible state for
> +/proc/sys/kernel/hung_task_timeout_secs seconds.

Yes, this is very annoying. Other tasks in the system get dumped out
as they are blocked for too long, but not the allocating task itself
as it's busy looping.

That being said, I'm not entirely sure why we need a daemon to do this,
which then requires us to duplicate allocation state to task_struct.
There is no scenario where the allocating task is not moving at all
anymore, right? So can't we dump the allocation state from within the
allocator and leave the rest to the hung task detector?

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 05ef7fb..fbfc581 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3004,6 +3004,7 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 	enum migrate_mode migration_mode = MIGRATE_ASYNC;
 	bool deferred_compaction = false;
 	int contended_compaction = COMPACT_CONTENDED_NONE;
+	unsigned int nr_retries = 0;
 
 	/*
 	 * In the slowpath, we sanity check order to avoid ever trying to
@@ -3033,6 +3034,9 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 		goto nopage;
 
 retry:
+	if (++nr_retries % 1000 == 0)
+		warn_alloc_failed(gfp_mask, order, "Potential GFP deadlock\n");
+
 	if (gfp_mask & __GFP_KSWAPD_RECLAIM)
 		wake_all_kswapds(order, ac);
 
Basing it on nr_retries alone might be too crude and take too long
when each cycle spends time waiting for IO. However, if that is a
problem we can make it time-based instead, like your memalloc_timer,
to catch tasks that spend too much time in a single alloc attempt.

> +		start_memalloc_timer(alloc_mask, order);
>  		page = __alloc_pages_slowpath(alloc_mask, order, &ac);
> +		stop_memalloc_timer(alloc_mask);
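
For illustration, the time-based variant could look roughly like the
userspace sketch below. It is not kernel code and not part of either
patch: struct alloc_stall, alloc_stall_init(), alloc_stall_check(),
simulate_slowpath() and the 10 second interval are all hypothetical names
and values picked for the example. The idea is only that the retry loop
compares the current time against a deadline instead of counting retries,
so a loop whose cycles block on IO still warns at a bounded interval.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical interval; the mail does not propose a specific value. */
#define STALL_WARN_INTERVAL_MS 10000UL

/* Hypothetical per-allocation state, kept on the allocator's stack. */
struct alloc_stall {
	unsigned long start_ms;     /* when the slowpath was entered */
	unsigned long next_warn_ms; /* earliest time for the next warning */
};

static void alloc_stall_init(struct alloc_stall *s, unsigned long now_ms)
{
	s->start_ms = now_ms;
	s->next_warn_ms = now_ms + STALL_WARN_INTERVAL_MS;
}

/*
 * Return true if a warning is due at now_ms, and rearm the deadline.
 * Plain comparison is used for brevity; kernel code would compare
 * jiffies with time_after() to stay correct across wraparound.
 */
static bool alloc_stall_check(struct alloc_stall *s, unsigned long now_ms)
{
	if (now_ms < s->next_warn_ms)
		return false;
	s->next_warn_ms = now_ms + STALL_WARN_INTERVAL_MS;
	return true;
}

/*
 * Simulate a retry loop in which every cycle takes cycle_ms
 * (e.g. waiting for IO). Returns the number of warnings emitted.
 */
static unsigned int simulate_slowpath(unsigned long cycle_ms,
				      unsigned int cycles)
{
	struct alloc_stall s;
	unsigned int warnings = 0;
	unsigned int i;

	alloc_stall_init(&s, 0);
	for (i = 1; i <= cycles; i++) {
		unsigned long now_ms = i * cycle_ms;

		if (alloc_stall_check(&s, now_ms)) {
			printf("Potential GFP deadlock: stalled for %lu ms\n",
			       now_ms - s.start_ms);
			warnings++;
		}
	}
	return warnings;
}
```

With 4 second cycles, a retry counter checked every 1000 iterations would
stay silent for over an hour, while the deadline check above warns on the
first retry that finishes past each 10 second mark.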