lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20170125184548.GB32041@dhcp22.suse.cz>
Date:   Wed, 25 Jan 2017 19:45:49 +0100
From:   Michal Hocko <mhocko@...nel.org>
To:     Johannes Weiner <hannes@...xchg.org>
Cc:     Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>,
        linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v6] mm: Add memory allocation watchdog kernel thread.

On Wed 25-01-17 13:11:50, Johannes Weiner wrote:
[...]
> >From 6420cae52cac8167bd5fb19f45feed2d540bc11d Mon Sep 17 00:00:00 2001
> From: Johannes Weiner <hannes@...xchg.org>
> Date: Wed, 25 Jan 2017 12:57:20 -0500
> Subject: [PATCH] mm: page_alloc: __GFP_NOWARN shouldn't suppress stall
>  warnings
> 
> __GFP_NOWARN, which is usually added to avoid warnings from callsites
> that expect to fail and have fallbacks, currently also suppresses
> allocation stall warnings. These trigger when an allocation is stuck
> inside the allocator for 10 seconds or longer.
> 
> But there is no class of allocations that can get legitimately stuck
> in the allocator for this long. This always indicates a problem.
> 
> Always emit stall warnings. Restrict __GFP_NOWARN to alloc failures.

Tetsuo has already suggested something like this and I didn't really
like it because it makes the semantic of the flag confusing. The mask
says to not warn while the kernel log might contain an allocation splat.
You are right that stalling for 10s seconds means a problem on its own
but on the other hand I can imagine somebody might really want to have
clean logs and the last thing we want is to have another gfp flag for
that purpose.

I also do not think that this change would make a big difference because
most allocations simply use this flag along with __GFP_NORETRY or
GFP_NOWAIT resp GFP_ATOMIC. Have we ever seen a stall with this
allocation requests?

I haven't nacked Tetsuo's patch AFAIR and will not nack this one either
I just do not think we should tweak __GFP_NOWARN.
 
> Signed-off-by: Johannes Weiner <hannes@...xchg.org>
> ---
>  mm/page_alloc.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index f3e0c69a97b7..7ce051d1d575 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -3704,7 +3704,7 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
>  
>  	/* Make sure we know about allocations which stall for too long */
>  	if (time_after(jiffies, alloc_start + stall_timeout)) {
> -		warn_alloc(gfp_mask,
> +		warn_alloc(gfp_mask & ~__GFP_NOWARN,
>  			"page allocation stalls for %ums, order:%u",
>  			jiffies_to_msecs(jiffies-alloc_start), order);
>  		stall_timeout += 10 * HZ;
> -- 
> 2.11.0

-- 
Michal Hocko
SUSE Labs

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ