linux-kernel - Re: [PATCH 15/18] sched: Set preferred NUMA node based on number of private faults

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20130731101041.GQ2296@suse.de>
Date:	Wed, 31 Jul 2013 11:10:41 +0100
From:	Mel Gorman <mgorman@...e.de>
To:	Peter Zijlstra <peterz@...radead.org>
Cc:	Srikar Dronamraju <srikar@...ux.vnet.ibm.com>,
	Ingo Molnar <mingo@...nel.org>,
	Andrea Arcangeli <aarcange@...hat.com>,
	Johannes Weiner <hannes@...xchg.org>,
	Linux-MM <linux-mm@...ck.org>,
	LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 15/18] sched: Set preferred NUMA node based on number of
 private faults

On Wed, Jul 31, 2013 at 11:34:37AM +0200, Peter Zijlstra wrote:
> On Wed, Jul 31, 2013 at 10:29:38AM +0100, Mel Gorman wrote:
> > > Hurmph I just stumbled upon this PMD 'trick' and I'm not at all sure I
> > > like it. If an application would pre-fault/initialize its memory with
> > > the main thread we'll collapse it into a PMDs and forever thereafter (by
> > > virtue of do_pmd_numa_page()) they'll all stay the same. Resulting in
> > > PMD granularity.
> > > 
> > 
> > Potentially yes. When that PMD trick was introduced it was because the cost
> > of faults was very high due to a high scanning rate. The trick mitigated
> > worse-case scenarios until faults were properly accounted for and the scan
> > rates were better controlled. As these *should* be addressed by the series
> > I think I will be adding a patch to kick away this PMD crutch and see how
> > it looks in profiles.
> 
> I've been thinking on this a bit and I think we should split these and
> thp pages when we get shared faults from different nodes on them and
> refuse thp collapses when the pages are on different nodes.
> 

Agreed, I reached the same conclusion when thinking about THP false sharing
just before I went on holiday. The first prototype patch was a bit messy
and performed very badly so "Handle false sharing of THP" was chucked onto
the TODO pile to worry about when I got back. It also collided a little with
the PMD handling of base pages which is another reason to get rid of that.

> With the exception that when we introduce the interleave mempolicies we
> should define 'different node' as being outside of the interleave mask.

Understood.

-- 
Mel Gorman
SUSE Labs
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/