Message-Id: <1396350948-29910-40-git-send-email-luis.henriques@canonical.com>
Date: Tue, 1 Apr 2014 12:14:03 +0100
From: Luis Henriques <luis.henriques@...onical.com>
To: linux-kernel@...r.kernel.org, stable@...r.kernel.org,
kernel-team@...ts.ubuntu.com
Cc: Vlastimil Babka <vbabka@...e.cz>,
Daniel Borkmann <dborkman@...hat.com>,
Thomas Hellstrom <thellstrom@...are.com>,
John David Anglin <dave.anglin@...l.net>,
HATAYAMA Daisuke <d.hatayama@...fujitsu.com>,
Konstantin Khlebnikov <khlebnikov@...nvz.org>,
Carsten Otte <cotte@...ibm.com>,
Jared Hulbert <jaredeh@...il.com>,
"Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
Andrea Arcangeli <aarcange@...hat.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Luis Henriques <luis.henriques@...onical.com>
Subject: [PATCH 3.11 039/144] mm: include VM_MIXEDMAP flag in the VM_SPECIAL list to avoid m(un)locking
3.11.10.7 -stable review patch. If anyone has any objections, please let me know.
------------------
From: Vlastimil Babka <vbabka@...e.cz>
commit 9050d7eba40b3d79551668f54e68fd6f51945ef3 upstream.
Daniel Borkmann reported a VM_BUG_ON assertion failing:
------------[ cut here ]------------
kernel BUG at mm/mlock.c:528!
invalid opcode: 0000 [#1] SMP
Modules linked in: ccm arc4 iwldvm [...]
video
CPU: 3 PID: 2266 Comm: netsniff-ng Not tainted 3.14.0-rc2+ #8
Hardware name: LENOVO 2429BP3/2429BP3, BIOS G4ET37WW (1.12 ) 05/29/2012
task: ffff8801f87f9820 ti: ffff88002cb44000 task.ti: ffff88002cb44000
RIP: 0010:[<ffffffff81171ad0>] [<ffffffff81171ad0>] munlock_vma_pages_range+0x2e0/0x2f0
Call Trace:
do_munmap+0x18f/0x3b0
vm_munmap+0x41/0x60
SyS_munmap+0x22/0x30
system_call_fastpath+0x1a/0x1f
RIP munlock_vma_pages_range+0x2e0/0x2f0
---[ end trace a0088dcf07ae10f2 ]---
because munlock_vma_pages_range() thinks it's unexpectedly in the middle
of a THP page. This can be reproduced with the default config on kernels
since 3.11. A reproducer is available in the kernel's networking selftest
directory [1]: run ./psock_tpacket.
The problem is that an order=2 compound page (allocated by
alloc_one_pg_vec_page()) is part of the munlocked VM_MIXEDMAP vma (mapped
by packet_mmap()), where it is mistaken for a THP page and assumed to be
order=9.
The checks for THP in munlock came with commit ff6a6da60b89 ("mm:
accelerate munlock() treatment of THP pages"), i.e. since 3.9, but did
not trigger a bug. They just made munlock_vma_pages_range() skip ahead to
the next 512-pages-aligned page whenever it encountered the head page of
such a compound page. This is not a problem for vma's where mlocking has
no effect anyway, but it can distort the accounting.
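To illustrate the skip described above, here is a minimal user-space
sketch of the stepping arithmetic. It is not the kernel code: the helper
names pages_in_order() and page_increment() are made up for this example,
and the formula is only modelled on the "advance to the next compound
boundary" step that munlock uses.

```c
#include <assert.h>

/* Hypothetical helper: number of base pages in a compound page of 'order'. */
static unsigned long pages_in_order(unsigned int order)
{
    assert(order < 8 * sizeof(unsigned long));
    return 1UL << order;
}

/*
 * Hypothetical helper: how many base pages to advance from page index 'idx'
 * to reach the next boundary of a compound page of the given order.
 * With mask = pages - 1 this is 1 + (~idx & mask).
 */
static unsigned long page_increment(unsigned long idx, unsigned int order)
{
    unsigned long mask = pages_in_order(order) - 1;

    return 1 + (~idx & mask);
}
```

For a head page at index 4, page_increment(4, 9) is 508, so the walk jumps
to index 512, while the real order=2 page only covers 4 pages
(page_increment(4, 2) is 4). Everything between index 8 and 511 is never
visited.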
Since commit 7225522bb429 ("mm: munlock: batch non-THP page isolation
and munlock+putback using pagevec") this can trigger a VM_BUG_ON in the
PageTransHuge() check.
This patch fixes the issue by adding VM_MIXEDMAP flag to VM_SPECIAL, a
list of flags that make vma's non-mlockable and non-mergeable. The
reasoning is that VM_MIXEDMAP vma's are similar to VM_PFNMAP, which is
already on the VM_SPECIAL list, and both are intended for non-LRU pages
where mlocking makes no sense anyway. A related LKML discussion can be
found in [2].
[1] tools/testing/selftests/net/psock_tpacket
[2] https://lkml.org/lkml/2014/1/10/427
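The effect of widening the mask can be sketched in user space as follows.
This is a simplified model, not the kernel implementation: the DEMO_*
names and bit values are placeholders chosen for the example (the real
flags are defined in include/linux/mm.h), and demo_can_mlock() is a
hypothetical stand-in for the kind of "vm_flags & VM_SPECIAL" test that
makes a vma non-mlockable.

```c
#include <assert.h>
#include <stdbool.h>

/* Placeholder bit values for illustration only, not the kernel's. */
#define DEMO_VM_IO         (1UL << 0)
#define DEMO_VM_DONTEXPAND (1UL << 1)
#define DEMO_VM_PFNMAP     (1UL << 2)
#define DEMO_VM_MIXEDMAP   (1UL << 3)

/* The mask before and after this patch, expressed with the placeholders. */
#define DEMO_VM_SPECIAL_OLD (DEMO_VM_IO | DEMO_VM_DONTEXPAND | DEMO_VM_PFNMAP)
#define DEMO_VM_SPECIAL_NEW (DEMO_VM_SPECIAL_OLD | DEMO_VM_MIXEDMAP)

/* Hypothetical check: a vma is mlockable only if no "special" bit is set. */
static bool demo_can_mlock(unsigned long vm_flags, unsigned long special)
{
    return (vm_flags & special) == 0;
}

/* Shows how a VM_MIXEDMAP-only vma is treated under each mask. */
static void demo_self_check(void)
{
    unsigned long mixedmap_vma = DEMO_VM_MIXEDMAP;

    assert(demo_can_mlock(mixedmap_vma, DEMO_VM_SPECIAL_OLD));
    assert(!demo_can_mlock(mixedmap_vma, DEMO_VM_SPECIAL_NEW));
}
```

With the old mask a VM_MIXEDMAP-only vma passes the check and goes through
the m(un)lock paths; with the new mask it is rejected in the same way as a
VM_PFNMAP vma.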
Signed-off-by: Vlastimil Babka <vbabka@...e.cz>
Signed-off-by: Daniel Borkmann <dborkman@...hat.com>
Reported-by: Daniel Borkmann <dborkman@...hat.com>
Tested-by: Daniel Borkmann <dborkman@...hat.com>
Cc: Thomas Hellstrom <thellstrom@...are.com>
Cc: John David Anglin <dave.anglin@...l.net>
Cc: HATAYAMA Daisuke <d.hatayama@...fujitsu.com>
Cc: Konstantin Khlebnikov <khlebnikov@...nvz.org>
Cc: Carsten Otte <cotte@...ibm.com>
Cc: Jared Hulbert <jaredeh@...il.com>
Tested-by: Hannes Frederic Sowa <hannes@...essinduktion.org>
Cc: Kirill A. Shutemov <kirill.shutemov@...ux.intel.com>
Acked-by: Rik van Riel <riel@...hat.com>
Cc: Andrea Arcangeli <aarcange@...hat.com>
Signed-off-by: Andrew Morton <akpm@...ux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@...ux-foundation.org>
Signed-off-by: Luis Henriques <luis.henriques@...onical.com>
---
include/linux/mm.h | 2 +-
mm/huge_memory.c | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/include/linux/mm.h b/include/linux/mm.h
index a6154b1..e99f9fb 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -155,7 +155,7 @@ extern unsigned int kobjsize(const void *objp);
* Special vmas that are non-mergable, non-mlock()able.
* Note: mm/huge_memory.c VM_NO_THP depends on this definition.
*/
-#define VM_SPECIAL (VM_IO | VM_DONTEXPAND | VM_PFNMAP)
+#define VM_SPECIAL (VM_IO | VM_DONTEXPAND | VM_PFNMAP | VM_MIXEDMAP)
/*
* mapping from the currently active vm_flags protection bits (the
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 46a02bf..2eb6e38 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1919,7 +1919,7 @@ out:
return ret;
}
-#define VM_NO_THP (VM_SPECIAL|VM_MIXEDMAP|VM_HUGETLB|VM_SHARED|VM_MAYSHARE)
+#define VM_NO_THP (VM_SPECIAL | VM_HUGETLB | VM_SHARED | VM_MAYSHARE)
int hugepage_madvise(struct vm_area_struct *vma,
unsigned long *vm_flags, int advice)
--
1.9.1