Message-ID: <8392dd6b-1c20-4f7b-b879-6e940f2dcbc0@kernel.org>
Date: Mon, 27 Jan 2025 18:39:11 +0100
From: Matthieu Baerts <matttbe@...nel.org>
To: Catalin Marinas <catalin.marinas@....com>
Cc: torvalds@...ux-foundation.org, davem@...emloft.net,
netdev@...r.kernel.org, linux-kernel@...r.kernel.org, pabeni@...hat.com,
Guo Weikang <guoweikang.kernel@...il.com>,
Andrew Morton <akpm@...ux-foundation.org>, Jakub Kicinski <kuba@...nel.org>,
Andrea Righi <arighi@...dia.com>, Patrick Wang <patrick.wang.shcn@...il.com>
Subject: Re: [GIT PULL] Networking for v6.13-rc7

Hi Catalin,

On 27/01/2025 15:27, Catalin Marinas wrote:
> On Thu, Jan 23, 2025 at 06:11:16PM +0100, Matthieu Baerts wrote:
>> On 21/01/2025 21:46, Catalin Marinas wrote:
>>> On Tue, Jan 21, 2025 at 07:42:18AM -0800, Jakub Kicinski wrote:
>>>> On Tue, 21 Jan 2025 11:09:20 +0000 Catalin Marinas wrote:
>>>>>>> Hmm, I don't think this would make any difference as kmemleak does scan
>>>>>>> the memblock allocations as long as they have a correspondent VA in the
>>>>>>> linear map.
>>>>>>>
>>>>>>> BTW, is NUMA enabled or disabled in your .config?
>>>>>>
>>>>>> It's pretty much kernel/configs/debug.config, with virtme-ng, booted
>>>>>> with 4 CPUs. LMK if you can't repro with that, I can provide exact
>>>>>> cmdline.
>>>>>
>>>>> Please do. I haven't tried to reproduce it yet on x86 as I don't have
>>>>> any non-arm hardware around. It did not trigger on arm64. I think
>>>>> virtme-ng may work with qemu. Anyway, I'll be off from tomorrow until
>>>>> the end of the week, so more likely to try it next week.
>>>>
>>>> vng -b -f tools/testing/selftests/net/config -f kernel/configs/debug.config
>>>>
>>>> vng -r arch/x86_64/boot/bzImage --cpus 4 --user root -v --network loop
>>>
>>> Great, thanks. I managed to reproduce it
>>
>> Thank you for investigating this issue!
>>
>> Please note that on our side with MPTCP, I can only reproduce this issue
>> locally, but not from our CI on GitHub Actions. The main difference is
>> the kernel (6.8 on the CI, 6.12 here) and the fact our CI is launching
>> virtme-ng from a VM. The rest should be the same.
>
> It won't show up in 6.8 as kmemleak did not report per-cpu allocation
> leaks. But even with the latest kernel, it's probabilistic, some data
> somewhere may look like a pointer and not be reported (I couldn't
> reproduce it on arm64).

My bad, I was talking about the host kernel, or to be precise, the
kernel starting the test VM, which itself runs a >=v6.13-rc7 kernel.

> It turns out to be a false positive. The __percpu pointers are
> referenced from node_data[]. The latter is populated in
> alloc_node_data() and kmemleak registers the pg_data_t object from the
> memblock allocation. However, due to an incorrect pfn range check
> introduced by commit 84c326299191 ("mm: kmemleak: check physical address
> when scan"), we ignore the node_data[0] allocation. Some printks in
> alloc_node_data() show:
>
> nd_pa = 0x3ffda140
> nd_size = 0x4ec0
> min_low_pfn = 0x0
> max_low_pfn = 0x3ffdf
> nd_pa + nd_size == 0x3ffdf000
>
> So the "PHYS_PFN(nd_pa + nd_size) >= max_low_pfn" check in kmemleak is
> true and the whole pg_data_t object ignored (not scanned). The __percpu
> pointers won't be detected. The fix is simple:
>
> diff --git a/mm/kmemleak.c b/mm/kmemleak.c
> index 820ba3b5cbfc..bb7d61fc4da3 100644
> --- a/mm/kmemleak.c
> +++ b/mm/kmemleak.c
> @@ -1689,7 +1689,7 @@ static void kmemleak_scan(void)
> unsigned long phys = object->pointer;
>
> if (PHYS_PFN(phys) < min_low_pfn ||
> - PHYS_PFN(phys + object->size) >= max_low_pfn)
> + PHYS_PFN(phys + object->size) > max_low_pfn)
> __paint_it(object, KMEMLEAK_BLACK);
> }

Thank you for looking into that and providing a fix so quickly!

I confirm that it fixes the issue on my side! Just in case you want a
Tested-by tag:

Tested-by: Matthieu Baerts (NGI0) <matttbe@...nel.org>
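
For anyone else hitting this and curious about the boundary condition,
here is a small stand-alone user-space sketch (my own illustration, not
kernel code) that replays the check with the values printed above. I am
assuming PHYS_PFN() reduces to a plain shift by PAGE_SHIFT (12) here:

#include <stdio.h>

#define PAGE_SHIFT      12
#define PHYS_PFN(x)     ((unsigned long)((x) >> PAGE_SHIFT))

int main(void)
{
        /* Values from the printks in alloc_node_data() quoted above. */
        unsigned long nd_pa = 0x3ffda140;
        unsigned long nd_size = 0x4ec0;
        unsigned long min_low_pfn = 0x0;
        unsigned long max_low_pfn = 0x3ffdf;
        unsigned long end_pfn = PHYS_PFN(nd_pa + nd_size); /* 0x3ffdf */

        /* Old check: the object ends exactly at max_low_pfn, so ">="
         * wrongly treats it as out of range, kmemleak paints it black
         * and never scans the node_data[0] pg_data_t. */
        printf("old check ignores the object: %d\n",
               PHYS_PFN(nd_pa) < min_low_pfn || end_pfn >= max_low_pfn);

        /* Fixed check: ">" keeps the object, so it gets scanned and the
         * __percpu pointers referenced from node_data[] are found. */
        printf("new check ignores the object: %d\n",
               PHYS_PFN(nd_pa) < min_low_pfn || end_pfn > max_low_pfn);

        return 0;
}

If I am reading it right, the first check evaluates to 1 (object
ignored) and the fixed one to 0 (object scanned), which matches the
one-character change in the diff.
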
> I'll post this as a proper patch and I found some minor things to clean
> up in kmemleak in the meantime.
Great!
>>> (after hacking vng to allow x86_64 as non-host architecture).
>>
>> Do not hesitate to report this issue + hack on vng's bug tracker :)
>
> Done ;)
>
> https://github.com/arighi/virtme-ng/issues/223
Thank you! :)

Cheers,
Matt
--
Sponsored by the NGI0 Core fund.