Message-ID: <20150630115353.GB6812@suse.de>
Date: Tue, 30 Jun 2015 12:53:53 +0100
From: Mel Gorman <mgorman@...e.de>
To: Ingo Molnar <mingo@...nel.org>
Cc: Xishi Qiu <qiuxishi@...wei.com>,
Andrew Morton <akpm@...ux-foundation.org>,
"H. Peter Anvin" <hpa@...or.com>,
"Luck, Tony" <tony.luck@...el.com>,
Hanjun Guo <guohanjun@...wei.com>,
Xiexiuqi <xiexiuqi@...wei.com>, leon@...n.nu,
Kamezawa Hiroyuki <kamezawa.hiroyu@...fujitsu.com>,
Dave Hansen <dave.hansen@...el.com>,
Naoya Horiguchi <n-horiguchi@...jp.nec.com>,
Vlastimil Babka <vbabka@...e.cz>,
Linux MM <linux-mm@...ck.org>,
LKML <linux-kernel@...r.kernel.org>
Subject: Re: [RFC v2 PATCH 0/8] mm: mirrored memory support for page buddy allocations

On Tue, Jun 30, 2015 at 12:46:54PM +0200, Ingo Molnar wrote:
>
> * Mel Gorman <mgorman@...e.de> wrote:
>
> > [...]
> >
> > Basically, overall I feel this series is the wrong approach but not knowing who
> > the users are makes it much harder to judge. I strongly suspect that if
> > mirrored memory is to be properly used then it needs to be available before the
> > page allocator is even active. Once active, there needs to be controlled access
> > for allocation requests that are really critical to mirror and not just all
> > kernel allocations. None of that would use a MIGRATE_TYPE approach. It would be
> > alterations to the bootmem allocator and access to an explicit reserve that is
> > not accounted for as "free memory" and accessed via an explicit GFP flag.
>
> So I think the main goal is to avoid kernel crashes when a #MC memory fault
> arrives on a piece of memory that is owned by the kernel.
>
Sounds logical. In that case, bootmem awareness would be crucial.
Enabling support in just the page allocator is too late.
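
As a rough illustration of what bootmem awareness could mean, the following
is a minimal userspace sketch, not kernel code. The struct early_region type
and the early_alloc_from() and early_alloc_sketch() helpers are hypothetical
names invented for this example. It assumes the early allocator knows which
physical ranges are mirrored, tries those first, and falls back to ordinary
memory only when the mirrored ranges are exhausted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical early-boot memory range; not a real kernel structure. */
struct early_region {
	unsigned long base;	/* start address, assumed non-zero */
	unsigned long size;	/* total bytes in the range */
	unsigned long used;	/* bytes already handed out */
	bool mirrored;		/* true if backed by mirrored memory */
};

/* Bump-allocate from the first region of the requested kind with room. */
static unsigned long early_alloc_from(struct early_region *regions, size_t nr,
				      unsigned long size, bool want_mirrored)
{
	for (size_t i = 0; i < nr; i++) {
		struct early_region *r = &regions[i];

		assert(r->used <= r->size);
		if (r->mirrored != want_mirrored)
			continue;
		if (r->size - r->used >= size) {
			unsigned long addr = r->base + r->used;

			r->used += size;
			return addr;
		}
	}
	return 0;	/* 0 means failure in this sketch */
}

/* Prefer mirrored memory; fall back to ordinary memory if it is full. */
static unsigned long early_alloc_sketch(struct early_region *regions,
					size_t nr, unsigned long size)
{
	unsigned long addr = early_alloc_from(regions, nr, size, true);

	if (!addr)
		addr = early_alloc_from(regions, nr, size, false);
	return addr;
}
```

The fallback is a design choice of the sketch; a stricter policy could fail
the allocation instead of silently using unmirrored memory.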
> In that sense 'protecting' all kernel allocations is natural: we don't know how to
> recover from faults that affect kernel memory.
>
That potentially spends all mirrored memory on allocations that do not need
that sort of guarantee. For example, an MC on memory backing the inode cache
is potentially recoverable as long as the inodes were not dirty. That's a
minor detail, as the kernel could later protect only MIGRATE_UNMOVABLE
requests instead of all kernel allocations if a fatal MC in kernel space
could be distinguished from a non-fatal one.

Bootmem awareness is much more important either way. If that was addressed
then potentially a MIGRATE_UNMOVABLE_MIRROR type could be created that
is only used for MIGRATE_UNMOVABLE allocations and never for user-space.
That misses MIGRATE_RECLAIMABLE, so if that is required then we need
something else that both preserves fragmentation avoidance and avoids
introducing loads of new migratetypes.

Reclaim-related issues could be partially avoided by forbidding use from
userspace and accounting for the size of MIGRATE_UNMOVABLE_MIRROR during
watermark checks.
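
The watermark accounting can be sketched as follows. This is a simplified
userspace model under stated assumptions, not the kernel's watermark code:
struct zone_sketch, its fields and watermark_ok_sketch() are hypothetical
names. The idea shown is only that free pages sitting in the mirrored
reserve are not counted as free for callers that may not use them.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified zone state; not the kernel's struct zone. */
struct zone_sketch {
	unsigned long free_pages;	 /* all free pages, mirror included */
	unsigned long free_mirror_pages; /* free pages in the mirrored reserve */
	unsigned long watermark;	 /* pages that must stay free */
};

/*
 * Return true if an allocation of (1 << order) pages may proceed.
 * Callers that may not use the mirrored reserve do not get to count
 * its free pages towards the watermark.
 */
static bool watermark_ok_sketch(const struct zone_sketch *z,
				unsigned int order, bool may_use_mirror)
{
	unsigned long usable = z->free_pages;
	unsigned long request = 1UL << order;

	assert(z->free_mirror_pages <= z->free_pages);
	if (!may_use_mirror)
		usable -= z->free_mirror_pages;

	/* Ordered so the unsigned subtraction cannot underflow. */
	return usable >= request && usable - request >= z->watermark;
}
```

With 1000 free pages, 600 of them mirrored, and a watermark of 500, an
ordinary order-0 request fails the check while a request that is allowed
to use the reserve passes.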
> We do know how to recover from faults that affect user-space memory alone.
>
> So if a mechanism is in place that prioritizes 3 groups of allocators:
>
> - non-recoverable memory (kernel allocations mostly)
>
So bootmem at the very least, followed by MIGRATE_UNMOVABLE requests, whether
they are accounted for by zones or MIGRATE_TYPES.
> - high priority user memory (critical apps that must never fail)
>
This one is problematic with a MIGRATE_TYPE-based approach such as the one in
this series. If a high-priority application requires memory and
MIGRATE_MIRROR is full then some of it must be reclaimed. With a MIGRATE_TYPE
approach, the kernel may reclaim a lot of unnecessary memory trying to free
some MIGRATE_MIRROR memory, with no guarantee of success. From userspace it
will look like unnecessary thrashing, and it will be difficult to diagnose
because reclaim stats are per-zone.

Dealing with this needs either a zone-based approach or a lot of surgery
to reclaim, similar to what the node-based LRU series does when it skips
pages because the caller requires lowmem pages.
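
A toy model of the reclaim problem, assuming a single LRU that mixes
mirrored and ordinary pages. This is a userspace sketch only;
struct reclaim_result and reclaim_sketch() are hypothetical names and the
model ignores everything real reclaim does apart from counting.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Counters for one hypothetical reclaim pass. */
struct reclaim_result {
	size_t scanned;		/* pages looked at */
	size_t freed_mirror;	/* mirrored pages freed (what was wanted) */
	size_t freed_other;	/* ordinary pages freed as collateral */
};

/*
 * Walk a mixed LRU until 'target' mirrored pages are freed or the list
 * ends. is_mirror[i] says whether page i is in mirrored memory.
 * skip_other == false models reclaim that cannot tell the pages apart
 * and frees everything it scans; skip_other == true models reclaim
 * that skips pages the caller cannot use.
 */
static struct reclaim_result reclaim_sketch(const bool *is_mirror,
					    size_t nr_pages, size_t target,
					    bool skip_other)
{
	struct reclaim_result r = { 0, 0, 0 };

	for (size_t i = 0; i < nr_pages && r.freed_mirror < target; i++) {
		r.scanned++;
		if (is_mirror[i])
			r.freed_mirror++;
		else if (!skip_other)
			r.freed_other++;
	}
	return r;
}
```

In both modes the scan cost is the same, and when the list holds fewer
mirrored pages than the target the pass ends without meeting it. Skipping
only avoids freeing the ordinary pages; it does not make the mirrored ones
easier to find.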
--
Mel Gorman
SUSE Labs