Message-ID: <20131024122646.GB2402@suse.de>
Date:	Thu, 24 Oct 2013 13:26:46 +0100
From:	Mel Gorman <mgorman@...e.de>
To:	Ingo Molnar <mingo@...nel.org>
Cc:	Srikar Dronamraju <srikar@...ux.vnet.ibm.com>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Rik van Riel <riel@...hat.com>,
	Tom Weber <l_linux-kernel@...l2news.4t2.com>,
	Andrea Arcangeli <aarcange@...hat.com>,
	Johannes Weiner <hannes@...xchg.org>,
	Linux-MM <linux-mm@...ck.org>,
	LKML <linux-kernel@...r.kernel.org>
Subject: Automatic NUMA balancing patches for tip-urgent/stable

On Mon, Oct 07, 2013 at 11:28:38AM +0100, Mel Gorman wrote:
> This series has roughly the same goals as previous versions despite the
> size. It reduces overhead of automatic balancing through scan rate reduction
> and the avoidance of TLB flushes. It selects a preferred node and moves tasks
> towards their memory as well as moving memory toward their task. It handles
> shared pages and groups related tasks together. Some problems such as shared
> page interleaving and properly dealing with processes that are larger than
> a node are being deferred. This version should be ready for wider testing
> in -tip.
> 

Hi Ingo,

Off-list we talked with Peter about the fact that automatic NUMA
balancing as merged in 3.10, 3.11 and, shortly, 3.12 may corrupt
userspace memory. There is one LKML report on this that I'm aware of --
https://lkml.org/lkml/2013/7/31/647 which I promptly forgot to follow up
on properly. The user-visible effect is that pages get filled with zeros
with results such as null pointer exceptions in JVMs. It is fairly difficult
to trigger but it became much easier to trigger during the development of
the series "Basic scheduler support for automatic NUMA balancing" which
is how it was discovered and finally fixed.

In that series I tagged patches 2-9 for -stable as these patches addressed
the problem for me. I did not call it out as clearly as I should have
and did not realise the cc: stable tags were stripped. Worse, as it was
close to the release and the bug is relatively old I was ok with waiting
until 3.12 came out and then treating it as a -stable backport. It has been
highlighted that this is the wrong attitude and we should consider merging
the fixes now and backporting to -stable sooner rather than later.

The most important patches are 

mm: Wait for THP migrations to complete during NUMA hinting fault
mm: Prevent parallel splits during THP migration
mm: Close races between THP migration and PMD numa clearing

but on their own they will cause conflicts with tricky fixups and -stable
would differ from mainline in annoying ways. Patches 2-9 have been heavily
tested in isolation so I'm reasonably confident they fix the problem and are
-stable material. While strictly speaking not all the patches are required
for the fix, the -stable kernels would then be directly comparable with
3.13 when the full NUMA balancing series is applied. If I rework them at
this point then I'll also have to retest, delaying things until next week.

Please consider queueing patches 2-9 for 3.12 via -urgent if it is not
too late, and preserve the cc: stable tags so Greg will pick them up
automatically.

Thanks

-- 
Mel Gorman
SUSE Labs
