lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <Z5eX4BjErq8FsNIa@arm.com>
Date: Mon, 27 Jan 2025 14:27:44 +0000
From: Catalin Marinas <catalin.marinas@....com>
To: Matthieu Baerts <matttbe@...nel.org>
Cc: torvalds@...ux-foundation.org, davem@...emloft.net,
	netdev@...r.kernel.org, linux-kernel@...r.kernel.org,
	pabeni@...hat.com, Guo Weikang <guoweikang.kernel@...il.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Jakub Kicinski <kuba@...nel.org>, Andrea Righi <arighi@...dia.com>,
	Patrick Wang <patrick.wang.shcn@...il.com>
Subject: Re: [GIT PULL] Networking for v6.13-rc7

On Thu, Jan 23, 2025 at 06:11:16PM +0100, Matthieu Baerts wrote:
> On 21/01/2025 21:46, Catalin Marinas wrote:
> > On Tue, Jan 21, 2025 at 07:42:18AM -0800, Jakub Kicinski wrote:
> >> On Tue, 21 Jan 2025 11:09:20 +0000 Catalin Marinas wrote:
> >>>>> Hmm, I don't think this would make any difference as kmemleak does scan
> >>>>> the memblock allocations as long as they have a correspondent VA in the
> >>>>> linear map.
> >>>>>
> >>>>> BTW, is NUMA enabled or disabled in your .config?  
> >>>>
> >>>> It's pretty much kernel/configs/debug.config, with virtme-ng, booted
> >>>> with 4 CPUs. LMK if you can't repro with that, I can provide exact
> >>>> cmdline.  
> >>>
> >>> Please do. I haven't tried to reproduce it yet on x86 as I don't have
> >>> any non-arm hardware around. It did not trigger on arm64. I think
> >>> virtme-ng may work with qemu. Anyway, I'll be off from tomorrow until
> >>> the end of the week, so more likely to try it next week.
> >>
> >> vng -b -f tools/testing/selftests/net/config -f kernel/configs/debug.config
> >>
> >> vng -r arch/x86_64/boot/bzImage --cpus 4 --user root -v --network loop
> > 
> > Great, thanks. I managed to reproduce it
> 
> Thank you for investigating this issue!
> 
> Please note that on our side with MPTCP, I can only reproduce this issue
> locally, but not from our CI on GitHub Actions. The main difference is
> the kernel (6.8 on the CI, 6.12 here) and the fact our CI is launching
> virtme-ng from a VM. The rest should be the same.

It won't show up in 6.8 as kmemleak did not report per-cpu allocation
leaks. But even with the latest kernel, it's probabilistic, some data
somewhere may look like a pointer and not be reported (I couldn't
reproduce it on arm64).

It turns out to be a false positive. The __percpu pointers are
referenced from node_data[]. The latter is populated in
alloc_node_data() and kmemleak registers the pg_data_t object from the
memblock allocation. However, due to an incorrect pfn range check
introduced by commit 84c326299191 ("mm: kmemleak: check physical address
when scan"), we ignore the node_data[0] allocation. Some printks in
alloc_node_data() show:

	nd_pa = 0x3ffda140
	nd_size = 0x4ec0
	min_low_pfn = 0x0
	max_low_pfn = 0x3ffdf
	nd_pa + nd_size == 0x3ffdf000

So the "PHYS_PFN(nd_pa + nd_size) >= max_low_pfn" check in kmemleak is
true and the whole pg_data_t object ignored (not scanned). The __percpu
pointers won't be detected. The fix is simple:

diff --git a/mm/kmemleak.c b/mm/kmemleak.c
index 820ba3b5cbfc..bb7d61fc4da3 100644
--- a/mm/kmemleak.c
+++ b/mm/kmemleak.c
@@ -1689,7 +1689,7 @@ static void kmemleak_scan(void)
 			unsigned long phys = object->pointer;

 			if (PHYS_PFN(phys) < min_low_pfn ||
-			    PHYS_PFN(phys + object->size) >= max_low_pfn)
+			    PHYS_PFN(phys + object->size) > max_low_pfn)
 				__paint_it(object, KMEMLEAK_BLACK);
 		}

I'll post this as a proper patch and I found some minor things to clean
up in kmemleak in the meantime.

> > (after hacking vng to allow x86_64 as non-host architecture).
> 
> Do not hesitate to report this issue + hack on vng's bug tracker :)

Done ;)

https://github.com/arighi/virtme-ng/issues/223

-- 
Catalin

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ