linux-kernel - Re: [PATCH 2/4] mm: Send one IPI per CPU to TLB flush all entries after unmapping pages

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-ID: <20150611152520.GI26425@suse.de>
Date:	Thu, 11 Jun 2015 16:25:20 +0100
From:	Mel Gorman <mgorman@...e.de>
To:	Ingo Molnar <mingo@...nel.org>
Cc:	Andrew Morton <akpm@...ux-foundation.org>,
	Rik van Riel <riel@...hat.com>,
	Hugh Dickins <hughd@...gle.com>,
	Minchan Kim <minchan@...nel.org>,
	Dave Hansen <dave.hansen@...el.com>,
	Andi Kleen <andi@...stfloor.org>,
	H Peter Anvin <hpa@...or.com>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Linux-MM <linux-mm@...ck.org>,
	LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 2/4] mm: Send one IPI per CPU to TLB flush all entries
 after unmapping pages

On Thu, Jun 11, 2015 at 05:02:51PM +0200, Ingo Molnar wrote:
> 
> * Mel Gorman <mgorman@...e.de> wrote:
> 
> > > In the full-flushing case (v6 without patch 4) the batching limit is 
> > > 'infinite', we'll batch as long as possible, right?
> > 
> > No because we must flush before pages are freed so the maximum batching is 
> > related to SWAP_CLUSTER_MAX. If we free a page before the flush then in theory 
> > the page can be reallocated and a stale TLB entry can allow access to unrelated 
> > data. It would be almost impossible to trigger corruption this way but it's a 
> > concern.
> 
> Well, could we say double SWAP_CLUSTER_MAX to further reduce the IPI rate?
> 

We could but it's a suprisingly subtle change. The impacts I can think
of are;

1. LRU lock hold times increase slightly because more pages are being
   isolated
2. There are slight timing changes due to more pages having to be
   processed before they are freed. There is a slight risk that more
   pages than are necessary get reclaimed but I doubt it'll be
   measurable
3. There is a risk that too_many_isolated checks will be easier to
   trigger resulting in a HZ/10 stall
4. The rotation rate of active->inactive is slightly faster but there
   should be fewer rotations before the lists get balanced so it
   shouldn't matter.
5. More pages are reclaimed in a single pass if zone_reclaim_mode is
   active but that thing sucks hard when it's enabled no matter what
6. More pages are isolated for compaction so page hold times there
   are longer while they are being copied

There might be others. To be honest, I'm struggling to think of any serious
problems such a change would cause. The biggest risk is issue 3 but I expect
that hitting that requires that the system is already getting badly hammered.
The main downside is that it affects all page reclaim activity, not just
the mapped pages which are triggering the IPIs. I'll add a patch to the
series that alters SWAP_CLUSTER_MAX with the intent to further reduce
IPIs and see what falls out and see if any other VM person complains.

-- 
Mel Gorman
SUSE Labs
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/