Message-ID: <CAADnVQ+o4kE84u05kCgDui-hdk2BK=9vvAOpktiTsRThYRK+Pw@mail.gmail.com>
Date: Wed, 22 Oct 2025 18:39:31 -0700
From: Alexei Starovoitov <alexei.starovoitov@...il.com>
To: Andrew Morton <akpm@...ux-foundation.org>, Vlastimil Babka <vbabka@...e.cz>, 
	Harry Yoo <harry.yoo@...cle.com>, Michal Hocko <mhocko@...e.com>, 
	Shakeel Butt <shakeel.butt@...ux.dev>
Cc: Eric Biggers <ebiggers@...nel.org>, 
	Aleksei Nikiforov <aleksei.nikiforov@...ux.ibm.com>, Alexander Potapenko <glider@...gle.com>, 
	Marco Elver <elver@...gle.com>, Dmitry Vyukov <dvyukov@...gle.com>, 
	kasan-dev <kasan-dev@...glegroups.com>, linux-mm <linux-mm@...ck.org>, 
	LKML <linux-kernel@...r.kernel.org>, Ilya Leoshkevich <iii@...ux.ibm.com>, 
	Alexei Starovoitov <ast@...nel.org>
Subject: Re: [PATCH] mm/kmsan: Fix kmsan kmalloc hook when no stack depots are
 allocated yet

On Wed, Oct 22, 2025 at 2:36 PM Andrew Morton <akpm@...ux-foundation.org> wrote:
>
> On Tue, 21 Oct 2025 20:02:13 -0700 Eric Biggers <ebiggers@...nel.org> wrote:
>
> > On Fri, Oct 10, 2025 at 10:07:04AM +0200, Aleksei Nikiforov wrote:
> > > On 10/9/25 05:31, Andrew Morton wrote:
> > > > On Tue, 30 Sep 2025 13:56:01 +0200 Aleksei Nikiforov <aleksei.nikiforov@...ux.ibm.com> wrote:
> > > >
> > > > > If no stack depot is allocated yet,
> > > > > due to masking out __GFP_RECLAIM flags
> > > > > kmsan called from kmalloc cannot allocate stack depot.
> > > > > kmsan fails to record origin and report issues.
> > > > >
> > > > > Reusing flags from kmalloc without modifying them should be safe for kmsan.
> > > > > For example, such chain of calls is possible:
> > > > > test_uninit_kmalloc -> kmalloc -> __kmalloc_cache_noprof ->
> > > > > slab_alloc_node -> slab_post_alloc_hook ->
> > > > > kmsan_slab_alloc -> kmsan_internal_poison_memory.
> > > > >
> > > > > Only when it is called in a context without flags present
> > > > > should __GFP_RECLAIM flags be masked.
> > > > >
> > > > > With this change all kmsan tests start working reliably.
> > > >
> > > > I'm not seeing reports of "hey, kmsan is broken", so I assume this
> > > > failure only occurs under special circumstances?
> > >
> > > Hi,
> > >
> > > kmsan might report fewer issues than it detects due to not allocating stack
> > > depots and not reporting issues without stack depots. Lack of reports may go
> > > unnoticed, that's why you don't get reports of kmsan being broken.
> >
> > Yes, KMSAN seems to be at least partially broken currently.  Besides the
> > fact that the kmsan KUnit test is currently failing (which I reported at
> > https://lore.kernel.org/r/20250911175145.GA1376@sol), I've confirmed
> > that the poly1305 KUnit test causes a KMSAN warning with Aleksei's patch
> > applied but does not cause a warning without it.  The warning did get
> > reached via syzbot somehow
> > (https://lore.kernel.org/r/751b3d80293a6f599bb07770afcef24f623c7da0.1761026343.git.xiaopei01@kylinos.cn/),
> > so KMSAN must still work in some cases.  But it didn't work for me.
>
> OK, thanks, I pasted the above para into the changelog to help people
> understand the impact of this.
>
> > (That particular warning in the architecture-optimized Poly1305 code is
> > actually a false positive due to memory being initialized by assembly
> > code.  But that's besides the point.  The point is that I should have
> > seen the warning earlier, but I didn't.  And Aleksei's patch seems to
> > fix KMSAN to work reliably.  It also fixes the kmsan KUnit test.)
> >
> > I don't really know this code, but I can at least give:
> >
> > Tested-by: Eric Biggers <ebiggers@...nel.org>
> >
> > If you want to add a Fixes commit I think it is either 97769a53f117e2 or
> > 8c57b687e8331.  Earlier I had confirmed that reverting those commits
> > fixed the kmsan test too
> > (https://lore.kernel.org/r/20250911192953.GG1376@sol).
>
> Both commits affect the same kernel version so either should be good
> for a Fixes target.
>
> I'll add a cc:stable to this and shall stage it for 6.18-rcX.
>
> The current state is below - if people want to suggest alterations,
> please go for it.

Thanks for cc-ing and for extra context.

>
>
> From: Aleksei Nikiforov <aleksei.nikiforov@...ux.ibm.com>
> Subject: mm/kmsan: fix kmsan kmalloc hook when no stack depots are allocated yet
> Date: Tue, 30 Sep 2025 13:56:01 +0200
>
> If no stack depot is allocated yet, due to masking out __GFP_RECLAIM
> flags kmsan called from kmalloc cannot allocate stack depot.  kmsan
> fails to record origin and report issues.  This may result in KMSAN
> failing to report issues.
>
> Reusing flags from kmalloc without modifying them should be safe for kmsan.
> For example, such chain of calls is possible:
> test_uninit_kmalloc -> kmalloc -> __kmalloc_cache_noprof ->
> slab_alloc_node -> slab_post_alloc_hook ->
> kmsan_slab_alloc -> kmsan_internal_poison_memory.
>
> Only when it is called in a context without flags present should
> __GFP_RECLAIM flags be masked.

I see. So this is a combination of gfpflags_allow_spinning()
and old kmsan code.
We hit this issue a few times already.
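
To make the interaction concrete, here is a minimal userspace sketch.
The bit values are made up for illustration (the real ones are in
include/linux/gfp_types.h), and allow_spinning() and old_kmsan_mask()
are hypothetical stand-ins for gfpflags_allow_spinning() and the
masking removed from kmsan_save_stack_with_flags(), as I understand
them:

```c
#include <stdbool.h>

/* Illustrative bit values only; the kernel's real values differ. */
typedef unsigned int gfp_t;
#define __GFP_IO             0x1u
#define __GFP_FS             0x2u
#define __GFP_DIRECT_RECLAIM 0x4u
#define __GFP_KSWAPD_RECLAIM 0x8u
#define __GFP_RECLAIM (__GFP_DIRECT_RECLAIM | __GFP_KSWAPD_RECLAIM)
#define GFP_KERNEL    (__GFP_RECLAIM | __GFP_IO | __GFP_FS)

/*
 * Stand-in for gfpflags_allow_spinning(): spinning is treated as
 * allowed only if at least one reclaim bit is set.
 */
static bool allow_spinning(gfp_t flags)
{
	return (flags & __GFP_RECLAIM) != 0;
}

/*
 * Stand-in for the old kmsan behavior: clear both reclaim bits
 * ("Don't sleep.") before handing the flags to the stack depot.
 */
static gfp_t old_kmsan_mask(gfp_t flags)
{
	return flags & ~(__GFP_DIRECT_RECLAIM | __GFP_KSWAPD_RECLAIM);
}
```

In this model allow_spinning(GFP_KERNEL) is true, while
allow_spinning(old_kmsan_mask(GFP_KERNEL)) is false: the mask that was
meant to say "don't sleep" ends up looking like a request from a
context that must not spin at all.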

I feel the further we go the more a new __GFP_xxx flag could be justified,
but Michal is strongly against it.
This particular issue actually might tilt it in favor of Michal's position,
since fixing kmsan is the right thing to do.

The fix itself makes sense to me. No better ideas so far.

What's puzzling is that it took 9 months to discover it?!
and allegedly Eric is seeing it by running kmsan selftest,
but Alexander couldn't repro it initially?
Looks like there is a gap in kmsan test coverage.
People that care about kmsan should really step up.

> With this change all kmsan tests start working reliably.
>
> Eric reported:
>
> : Yes, KMSAN seems to be at least partially broken currently.  Besides the
> : fact that the kmsan KUnit test is currently failing (which I reported at
> : https://lore.kernel.org/r/20250911175145.GA1376@sol), I've confirmed that
> : the poly1305 KUnit test causes a KMSAN warning with Aleksei's patch
> : applied but does not cause a warning without it.  The warning did get
> : reached via syzbot somehow
> : (https://lore.kernel.org/r/751b3d80293a6f599bb07770afcef24f623c7da0.1761026343.git.xiaopei01@kylinos.cn/),
> : so KMSAN must still work in some cases.  But it didn't work for me.
>
> Link: https://lkml.kernel.org/r/20250930115600.709776-2-aleksei.nikiforov@linux.ibm.com
> Link: https://lkml.kernel.org/r/20251022030213.GA35717@sol
> Fixes: 97769a53f117 ("mm, bpf: Introduce try_alloc_pages() for opportunistic page allocation")
> Signed-off-by: Aleksei Nikiforov <aleksei.nikiforov@...ux.ibm.com>
> Reviewed-by: Alexander Potapenko <glider@...gle.com>
> Tested-by: Eric Biggers <ebiggers@...nel.org>
> Cc: Dmitriy Vyukov <dvyukov@...gle.com>
> Cc: Ilya Leoshkevich <iii@...ux.ibm.com>
> Cc: Marco Elver <elver@...gle.com>
> Cc: <stable@...r.kernel.org>
> Signed-off-by: Andrew Morton <akpm@...ux-foundation.org>
> ---
>
>  mm/kmsan/core.c   |    3 ---
>  mm/kmsan/hooks.c  |    6 ++++--
>  mm/kmsan/shadow.c |    2 +-
>  3 files changed, 5 insertions(+), 6 deletions(-)
>
> --- a/mm/kmsan/core.c~mm-kmsan-fix-kmsan-kmalloc-hook-when-no-stack-depots-are-allocated-yet
> +++ a/mm/kmsan/core.c
> @@ -72,9 +72,6 @@ depot_stack_handle_t kmsan_save_stack_wi
>
>         nr_entries = stack_trace_save(entries, KMSAN_STACK_DEPTH, 0);
>
> -       /* Don't sleep. */
> -       flags &= ~(__GFP_DIRECT_RECLAIM | __GFP_KSWAPD_RECLAIM);
> -
>         handle = stack_depot_save(entries, nr_entries, flags);
>         return stack_depot_set_extra_bits(handle, extra);
>  }
> --- a/mm/kmsan/hooks.c~mm-kmsan-fix-kmsan-kmalloc-hook-when-no-stack-depots-are-allocated-yet
> +++ a/mm/kmsan/hooks.c
> @@ -84,7 +84,8 @@ void kmsan_slab_free(struct kmem_cache *
>         if (s->ctor)
>                 return;
>         kmsan_enter_runtime();
> -       kmsan_internal_poison_memory(object, s->object_size, GFP_KERNEL,
> +       kmsan_internal_poison_memory(object, s->object_size,
> +                                    GFP_KERNEL & ~(__GFP_RECLAIM),
>                                      KMSAN_POISON_CHECK | KMSAN_POISON_FREE);
>         kmsan_leave_runtime();
>  }
> @@ -114,7 +115,8 @@ void kmsan_kfree_large(const void *ptr)
>         kmsan_enter_runtime();
>         page = virt_to_head_page((void *)ptr);
>         KMSAN_WARN_ON(ptr != page_address(page));
> -       kmsan_internal_poison_memory((void *)ptr, page_size(page), GFP_KERNEL,
> +       kmsan_internal_poison_memory((void *)ptr, page_size(page),
> +                                    GFP_KERNEL & ~(__GFP_RECLAIM),
>                                      KMSAN_POISON_CHECK | KMSAN_POISON_FREE);
>         kmsan_leave_runtime();
>  }
> --- a/mm/kmsan/shadow.c~mm-kmsan-fix-kmsan-kmalloc-hook-when-no-stack-depots-are-allocated-yet
> +++ a/mm/kmsan/shadow.c
> @@ -208,7 +208,7 @@ void kmsan_free_page(struct page *page,
>                 return;
>         kmsan_enter_runtime();
>         kmsan_internal_poison_memory(page_address(page), page_size(page),
> -                                    GFP_KERNEL,
> +                                    GFP_KERNEL & ~(__GFP_RECLAIM),
>                                      KMSAN_POISON_CHECK | KMSAN_POISON_FREE);
>         kmsan_leave_runtime();
>  }
> _
>
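
For readers following the diff, here is a toy userspace model of the
before/after behavior. Everything in it is an assumption made for
illustration: the flag values are invented, struct toy_depot and
toy_depot_save() are hypothetical and far simpler than the real stack
depot, and the only rule modeled is the one from the changelog (the
first pool cannot be set up when the reclaim bits are masked out).

```c
#include <stdbool.h>

/* Illustrative bit values only; the kernel's real values differ. */
typedef unsigned int gfp_t;
#define __GFP_IO             0x1u
#define __GFP_FS             0x2u
#define __GFP_DIRECT_RECLAIM 0x4u
#define __GFP_KSWAPD_RECLAIM 0x8u
#define __GFP_RECLAIM (__GFP_DIRECT_RECLAIM | __GFP_KSWAPD_RECLAIM)
#define GFP_KERNEL    (__GFP_RECLAIM | __GFP_IO | __GFP_FS)

/* Hypothetical depot: one flag for "a pool exists", one handle counter. */
struct toy_depot {
	bool pool_ready;
	unsigned int next_handle;
};

/*
 * Toy stand-in for stack_depot_save(). Returns 0 on failure.
 * Assumption: the first pool can only be created when the flags
 * carry at least one reclaim bit.
 */
static unsigned int toy_depot_save(struct toy_depot *d, gfp_t flags)
{
	if (!d->pool_ready) {
		if (!(flags & __GFP_RECLAIM))
			return 0;	/* no pool yet and not allowed to make one */
		d->pool_ready = true;
	}
	return ++d->next_handle;
}

/* Before the patch: reclaim bits are always cleared first. */
static unsigned int save_stack_old(struct toy_depot *d, gfp_t flags)
{
	flags &= ~__GFP_RECLAIM;
	return toy_depot_save(d, flags);
}

/* After the patch: the caller's flags are passed through unchanged. */
static unsigned int save_stack_new(struct toy_depot *d, gfp_t flags)
{
	return toy_depot_save(d, flags);
}
```

On an empty depot, save_stack_old(&d, GFP_KERNEL) returns 0, so no
origin is recorded, while save_stack_new(&d, GFP_KERNEL) returns a
valid handle. Once a pool exists, the free-path callers that now pass
GFP_KERNEL & ~(__GFP_RECLAIM) still get a handle in this model.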
