lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [day] [month] [year] [list]
Date:   Wed, 27 Mar 2019 10:51:01 -0400
From:   Qian Cai <cai@....pw>
To:     akpm@...ux-foundation.org
Cc:     catalin.marinas@....com, cl@...ux.com, mhocko@...nel.org,
        willy@...radead.org, penberg@...nel.org, rientjes@...gle.com,
        iamjoonsoo.kim@....com, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org, Qian Cai <cai@....pw>
Subject: [PATCH v5] kmemleak: survive in a low-memory situation

Kmemleak could quickly fail to allocate an object structure and then
disable itself below in a low-memory situation. For example, running a
mmap() workload triggering swapping and OOM. This is especially
problematic for running things like LTP testsuite where one OOM test
case would disable the whole kmemleak and render the rest of test cases
without kmemleak watching for leaking.

Kmemleak allocation could fail even though the tracked memory is
succeeded. Hence, it could still try to start a direct reclaim if it is
not executed in an atomic context (spinlock, irq-handler etc), or a
high-priority allocation in an atomic context as a last-ditch effort.
Since kmemleak is a debug feature, it is unlikely to be used in
production that memory resources is scarce where direct reclaim or
high-priority atomic allocations should not be granted lightly.

Unless there is a brave soul to reimplement the kmemleak to embed it's
metadata into the tracked memory itself in a foreseeable future, this
provides a good balance between enabling kmemleak in a low-memory
situation and not introducing too much hackiness into the existing
code for now. Another approach is to fail back the original allocation
once kmemleak_alloc() failed, but there are too many call sites to
deal with which makes it error-prone.

kmemleak: Cannot allocate a kmemleak_object structure
kmemleak: Kernel memory leak detector disabled
kmemleak: Automatic memory scanning thread ended
RIP: 0010:__alloc_pages_nodemask+0x242a/0x2ab0
Call Trace:
 alloc_pages_current+0xdb/0x1c0
 allocate_slab+0x4d9/0x930
 new_slab+0x46/0x70
 ___slab_alloc+0x5d3/0x9c0
 __slab_alloc+0x12/0x20
 kmem_cache_alloc+0x30a/0x360
 create_object+0x96/0x9a0
 kmemleak_alloc+0x71/0xa0
 kmem_cache_alloc+0x254/0x360
 mempool_alloc_slab+0x3f/0x60
 mempool_alloc+0x120/0x329
 bio_alloc_bioset+0x1a8/0x510
 get_swap_bio+0x107/0x470
 __swap_writepage+0xab4/0x1650
 swap_writepage+0x86/0xe0

Signed-off-by: Qian Cai <cai@....pw>
---

v5: Move everything into gfp_kmemleak_mask().
    Use PREEMPT_COUNT to catch irq unsafe spinlocks held.
v4: Update the commit log.
    Fix a typo in comments per Christ.
    Consolidate the allocation.
v3: Update the commit log.
    Simplify the code inspired by graph_trace_open() from ftrace.
v2: Remove the needless checking for NULL objects in slab_post_alloc_hook()
    per Catalin.

 mm/kmemleak.c | 29 ++++++++++++++++++++++++-----
 1 file changed, 24 insertions(+), 5 deletions(-)

diff --git a/mm/kmemleak.c b/mm/kmemleak.c
index a2d894d3de07..98f874990553 100644
--- a/mm/kmemleak.c
+++ b/mm/kmemleak.c
@@ -124,11 +124,6 @@
 
 #define BYTES_PER_POINTER	sizeof(void *)
 
-/* GFP bitmask for kmemleak internal allocations */
-#define gfp_kmemleak_mask(gfp)	(((gfp) & (GFP_KERNEL | GFP_ATOMIC)) | \
-				 __GFP_NORETRY | __GFP_NOMEMALLOC | \
-				 __GFP_NOWARN | __GFP_NOFAIL)
-
 /* scanning area inside a memory block */
 struct kmemleak_scan_area {
 	struct hlist_node node;
@@ -315,6 +310,30 @@ static void kmemleak_disable(void);
 		pr_warn(fmt, ##__VA_ARGS__);		\
 } while (0)
 
+/* GFP bitmask for kmemleak internal allocations */
+static inline gfp_t gfp_kmemleak_mask(gfp_t gfp)
+{
+	gfp = (gfp & (GFP_KERNEL | GFP_ATOMIC)) | __GFP_NORETRY |
+		__GFP_NOMEMALLOC | __GFP_NOWARN | __GFP_NOFAIL;
+
+/*
+ * PREEMPT_COUNT is set by either PREEMPT or DEBUG_ATOMIC_SLEEP which is
+ * normally found in a debug kernel just like kmemleak. Otherwise, it won't be
+ * able to catch irq unsafe spinlocks held.
+ */
+#ifdef CONFIG_PREEMPT_COUNT
+	/*
+	 * The tracked memory was allocated successful, if the kmemleak object
+	 * failed to allocate for some reasons, it ends up with the whole
+	 * kmemleak disabled, so try it harder.
+	 */
+	gfp |= ((in_atomic() || irqs_disabled()) ? GFP_ATOMIC :
+		__GFP_DIRECT_RECLAIM);
+#endif
+
+	return gfp;
+}
+
 static void warn_or_seq_hex_dump(struct seq_file *seq, int prefix_type,
 				 int rowsize, int groupsize, const void *buf,
 				 size_t len, bool ascii)
-- 
2.17.2 (Apple Git-113)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ