linux-kernel - RE: [PATCH RESEND v2 1/1] fix a dead loop when in heavy low memory

Message-ID: <BA6F50564D52C24884F9840E07E32DEC2A971B2D@CDSMSX102.ccr.corp.intel.com>
Date:	Tue, 29 Dec 2015 04:58:11 +0000
From:	"Zhang, Tianfei" <tianfei.zhang@...el.com>
To:	David Rientjes <rientjes@...gle.com>
CC:	"gregkh@...uxfoundation.org" <gregkh@...uxfoundation.org>,
	"mhocko@...e.com" <mhocko@...e.com>,
	"arve@...roid.com" <arve@...roid.com>,
	"anton.vorontsov@...aro.org" <anton.vorontsov@...aro.org>,
	"kirill.shutemov@...ux.intel.com" <kirill.shutemov@...ux.intel.com>,
	"riandrews@...roid.com" <riandrews@...roid.com>,
	"devel@...verdev.osuosl.org" <devel@...verdev.osuosl.org>,
	"Wu, Fengguang" <fengguang.wu@...el.com>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: RE: [PATCH RESEND v2 1/1] fix a dead loop when in heavy low memory

> However, it appears that the same process, dTi-lm, is still chosen for oom kill
> because lowmem_deathpending_timeout has expired.
> 
> So this looks like a problem if the constantly chosen process cannot exit.
> It would have been helpful to have the stack of pid 27289 in the log to see
> where it was stuck.  But I think it may be unrelated to
> lowmem_deathpending_timeout itself.  We'd be better off selecting a
> different process to kill with something like this:
> 
> diff --git a/drivers/staging/android/lowmemorykiller.c b/drivers/staging/android/lowmemorykiller.c
> --- a/drivers/staging/android/lowmemorykiller.c
> +++ b/drivers/staging/android/lowmemorykiller.c
> @@ -128,11 +128,15 @@ static unsigned long lowmem_scan(struct shrinker *s, struct shrink_control *sc)
>  		if (!p)
>  			continue;
> 
> -		if (test_tsk_thread_flag(p, TIF_MEMDIE) &&
> -		    time_before_eq(jiffies, lowmem_deathpending_timeout)) {
> -			task_unlock(p);
> -			rcu_read_unlock();
> -			return 0;
> +		if (test_tsk_thread_flag(p, TIF_MEMDIE)) {
> +			if (time_before_eq(jiffies,
> +					   lowmem_deathpending_timeout)) {
> +				task_unlock(p);
> +				rcu_read_unlock();
> +				return 0;
> +			}
> +			/* Need to select a different process to kill */
> +			continue;
>  		}
>  		oom_score_adj = p->signal->oom_score_adj;
>  		if (oom_score_adj < min_score_adj) {
> 
> But we need more information.  Please make sure that
> lowmem_debug_level is 1, try to get a complete kernel log, and if possible
> please try to capture the stack of the process that can't exit (use
> /proc/<pid>/stack) before trying the above patch.

Hi Rientjes:
I re-ran the monkey stress test with your patch applied, and it behaves better than the current mainline code.

The kernel log is fairly large, more than 10 MB, so I am sending it to you directly.

Best
tianfei
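
For reference, the selection behaviour of the quoted patch can be modelled in user space. This is only a minimal sketch, not the driver code: struct sim_task, tick_before_eq() and select_victim() are hypothetical names, the task list is a plain array rather than the kernel's process list, and the ranking rule (highest oom_score_adj, then largest resident set) is a simplified assumption about how lowmem_scan() picks its victim. It shows the two outcomes for a task that already carries TIF_MEMDIE: back off while the deadline has not passed, and skip that task in favour of another once it has.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-task state that lowmem_scan() consults. */
struct sim_task {
	int pid;
	bool memdie;         /* models test_tsk_thread_flag(p, TIF_MEMDIE) */
	short oom_score_adj; /* models p->signal->oom_score_adj */
	long rss_pages;      /* models the task's resident set size */
};

/* Wraparound-tolerant "a <= b" for tick counters, in the spirit of time_before_eq(). */
static bool tick_before_eq(unsigned long a, unsigned long b)
{
	return (long)(b - a) >= 0;
}

/*
 * Returns the task to kill, or NULL when the scan should back off because
 * an earlier victim is still within its grace period (or nothing qualifies).
 */
static const struct sim_task *select_victim(const struct sim_task *tasks,
					     size_t n, unsigned long now,
					     unsigned long deadline,
					     short min_score_adj)
{
	const struct sim_task *selected = NULL;
	size_t i;

	assert(tasks != NULL || n == 0);

	for (i = 0; i < n; i++) {
		const struct sim_task *p = &tasks[i];

		if (p->memdie) {
			/* Victim is still expected to exit: do nothing for now. */
			if (tick_before_eq(now, deadline))
				return NULL;
			/* Deadline passed: need to select a different process. */
			continue;
		}
		if (p->oom_score_adj < min_score_adj)
			continue;
		if (p->rss_pages <= 0)
			continue;
		if (selected) {
			/* Assumed ranking: higher oom_score_adj wins, then larger RSS. */
			if (p->oom_score_adj < selected->oom_score_adj)
				continue;
			if (p->oom_score_adj == selected->oom_score_adj &&
			    p->rss_pages <= selected->rss_pages)
				continue;
		}
		selected = p;
	}
	return selected;
}
```

Without the "continue" branch, a task that is marked TIF_MEMDIE but cannot exit would pass every other check and be selected again on each scan once the deadline expired, which matches the repeated selection of the same process described in the quoted text.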




