linux-kernel - Re: [PATCH] mm: avoid livelock on !__GFP

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-ID: <20111101123608.GD25123@suse.de>
Date:	Tue, 1 Nov 2011 12:36:08 +0000
From:	Mel Gorman <mgorman@...e.de>
To:	Colin Cross <ccross@...roid.com>
Cc:	David Rientjes <rientjes@...gle.com>, linux-kernel@...r.kernel.org,
	Andrew Morton <akpm@...ux-foundation.org>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>,
	Andrea Arcangeli <aarcange@...hat.com>, linux-mm@...ck.org
Subject: Re: [PATCH] mm: avoid livelock on !__GFP_FS allocations

On Wed, Oct 26, 2011 at 12:22:14AM -0700, Colin Cross wrote:
> On Wed, Oct 26, 2011 at 12:10 AM, David Rientjes <rientjes@...gle.com> wrote:
> > On Tue, 25 Oct 2011, Colin Cross wrote:
> >
> >> > gfp_allowed_mask is initialized to GFP_BOOT_MASK to start so that __GFP_FS
> >> > is never allowed before the slab allocator is completely initialized, so
> >> > you've now implicitly made all early boot allocations to be __GFP_NORETRY
> >> > even though they may not pass it.
> >>
> >> Only before interrupts are enabled, and then isn't it vulnerable to
> >> the same livelock?  Interrupts are off, single cpu, kswapd can't run.
> >> If an allocation ever failed, which seems unlikely, why would retrying
> >> help?
> >>
> >
> > If you want to claim gfp_allowed_mask as a pm-only entity, then I see no
> > problem with this approach.  However, if gfp_allowed_mask would be allowed
> > to temporarily change after init for another purpose then it would make
> > sense to retry because another allocation with __GFP_FS on another cpu or
> > kswapd could start making progress could allow for future memory freeing.
> >
> > The suggestion to add a hook directly into a pm-interface was so that we
> > could isolate it only to suspend and, to me, is the most maintainable
> > solution.
> >
> 
> pm_restrict_gfp_mask seems to claim gfp_allowed_mask as owned by pm at runtime:
> "gfp_allowed_mask also should only be modified with pm_mutex held,
> unless the suspend/hibernate code is guaranteed not to run in parallel
> with that modification"
> 
> I think we've wrapped around to Mel's original patch, which adds a
> pm_suspending() helper that is implemented next to
> pm_restrict_gfp_mask.  His patch puts the check inside
> !did_some_progress instead of should_alloc_retry, which I prefer as it
> at least keeps trying until reclaim isn't working.  Pekka was trying
> to avoid adding pm-specific checks into the allocator, which is why I
> stuck to the symptom (__GFP_FS is clear) rather than the cause (PM).
> 

Right now, I'm still no seeing a problem with the pm_suspending() check
as it's made for a corner-case situation in a very slow path that is
self-documenting. This thread has died somewhat and there is still no
fix merged. Is someone cooking up a patch they would prefer as an
alternative? If not, I'm going to resubmit the fix based on
pm_suspending.

-- 
Mel Gorman
SUSE Labs
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/