Message-ID: <Z5eX4BjErq8FsNIa@arm.com>
Date: Mon, 27 Jan 2025 14:27:44 +0000
From: Catalin Marinas <catalin.marinas@....com>
To: Matthieu Baerts <matttbe@...nel.org>
Cc: torvalds@...ux-foundation.org, davem@...emloft.net,
netdev@...r.kernel.org, linux-kernel@...r.kernel.org,
pabeni@...hat.com, Guo Weikang <guoweikang.kernel@...il.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Jakub Kicinski <kuba@...nel.org>, Andrea Righi <arighi@...dia.com>,
Patrick Wang <patrick.wang.shcn@...il.com>
Subject: Re: [GIT PULL] Networking for v6.13-rc7
On Thu, Jan 23, 2025 at 06:11:16PM +0100, Matthieu Baerts wrote:
> On 21/01/2025 21:46, Catalin Marinas wrote:
> > On Tue, Jan 21, 2025 at 07:42:18AM -0800, Jakub Kicinski wrote:
> >> On Tue, 21 Jan 2025 11:09:20 +0000 Catalin Marinas wrote:
> >>>>> Hmm, I don't think this would make any difference as kmemleak does scan
> >>>>> the memblock allocations as long as they have a corresponding VA in the
> >>>>> linear map.
> >>>>>
> >>>>> BTW, is NUMA enabled or disabled in your .config?
> >>>>
> >>>> It's pretty much kernel/configs/debug.config, with virtme-ng, booted
> >>>> with 4 CPUs. LMK if you can't repro with that, I can provide exact
> >>>> cmdline.
> >>>
> >>> Please do. I haven't tried to reproduce it yet on x86 as I don't have
> >>> any non-arm hardware around. It did not trigger on arm64. I think
> >>> virtme-ng may work with qemu. Anyway, I'll be off from tomorrow until
> >>> the end of the week, so more likely to try it next week.
> >>
> >> vng -b -f tools/testing/selftests/net/config -f kernel/configs/debug.config
> >>
> >> vng -r arch/x86_64/boot/bzImage --cpus 4 --user root -v --network loop
> >
> > Great, thanks. I managed to reproduce it
>
> Thank you for investigating this issue!
>
> Please note that on our side with MPTCP, I can only reproduce this issue
> locally, but not from our CI on GitHub Actions. The main difference is
> the kernel (6.8 on the CI, 6.12 here) and the fact our CI is launching
> virtme-ng from a VM. The rest should be the same.

It won't show up in 6.8 as kmemleak did not report per-cpu allocation
leaks back then. But even with the latest kernel it's probabilistic:
some data somewhere may happen to look like a pointer into the
allocation, in which case it is not reported (I couldn't reproduce it
on arm64).
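
To illustrate why (a rough sketch of my own with made-up names, not the
actual kmemleak scanner code): anything in scanned memory that happens
to look like a pointer into a tracked object makes that object appear
referenced.

	/*
	 * Sketch only: if any word-sized value in a scanned block falls
	 * inside the object's address range, the object is considered
	 * referenced and is never reported as a leak.
	 */
	static bool looks_referenced(const unsigned long *mem, size_t nwords,
				     unsigned long obj_start,
				     unsigned long obj_end)
	{
		size_t i;

		for (i = 0; i < nwords; i++)
			if (mem[i] >= obj_start && mem[i] < obj_end)
				return true;
		return false;
	}
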
It turns out to be a false positive. The __percpu pointers are
referenced from node_data[]. The latter is populated in
alloc_node_data() and kmemleak registers the pg_data_t object from the
memblock allocation. However, due to an incorrect pfn range check
introduced by commit 84c326299191 ("mm: kmemleak: check physical address
when scan"), we ignore the node_data[0] allocation. Some printks in
alloc_node_data() show:
nd_pa = 0x3ffda140
nd_size = 0x4ec0
min_low_pfn = 0x0
max_low_pfn = 0x3ffdf
nd_pa + nd_size == 0x3ffdf000
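
With 4K pages, PHYS_PFN() is just a shift right by PAGE_SHIFT (12), so:

	PHYS_PFN(nd_pa + nd_size) = 0x3ffdf000 >> 12 = 0x3ffdf
	max_low_pfn               = 0x3ffdf
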
So the "PHYS_PFN(nd_pa + nd_size) >= max_low_pfn" check in kmemleak is
true and the whole pg_data_t object ignored (not scanned). The __percpu
pointers won't be detected. The fix is simple:
diff --git a/mm/kmemleak.c b/mm/kmemleak.c
index 820ba3b5cbfc..bb7d61fc4da3 100644
--- a/mm/kmemleak.c
+++ b/mm/kmemleak.c
@@ -1689,7 +1689,7 @@ static void kmemleak_scan(void)
 			unsigned long phys = object->pointer;
 
 			if (PHYS_PFN(phys) < min_low_pfn ||
-			    PHYS_PFN(phys + object->size) >= max_low_pfn)
+			    PHYS_PFN(phys + object->size) > max_low_pfn)
 				__paint_it(object, KMEMLEAK_BLACK);
 		}
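
Put differently (just a sketch of the intended semantics, with a
made-up helper name, not code I intend to submit): max_low_pfn acts as
an exclusive upper bound here, so an object that ends exactly at
max_low_pfn << PAGE_SHIFT, like node_data[0] above, is still entirely
within the linear map and has to be scanned:

	/*
	 * Sketch only: treat [min_low_pfn, max_low_pfn) as a half-open
	 * pfn range. node_data[0] ends exactly at max_low_pfn, so it is
	 * in range and must not be painted black.
	 */
	static bool phys_object_in_lowmem(unsigned long phys, size_t size)
	{
		return PHYS_PFN(phys) >= min_low_pfn &&
		       PHYS_PFN(phys + size) <= max_low_pfn;
	}
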
I'll post this as a proper patch; I also found some minor things to
clean up in kmemleak in the meantime.
> > (after hacking vng to allow x86_64 as a non-host architecture).
>
> Do not hesitate to report this issue + hack on vng's bug tracker :)
Done ;)
https://github.com/arighi/virtme-ng/issues/223
--
Catalin