linux-kernel - Re: Crash on armv7-a using KASAN


Message-ID: <ZxDh9biUbf9W8gNN@J2N7QTR9R3>
Date: Thu, 17 Oct 2024 11:09:59 +0100
From: Mark Rutland <mark.rutland@....com>
To: Linus Walleij <linus.walleij@...aro.org>,
	Ard Biesheuvel <ardb@...nel.org>
Cc: Clement LE GOFFIC <clement.legoffic@...s.st.com>,
	Russell King <linux@...linux.org.uk>,
	"Russell King (Oracle)" <rmk+kernel@...linux.org.uk>,
	Kees Cook <kees@...nel.org>,
	AngeloGioacchino Del Regno <angelogioacchino.delregno@...labora.com>,
	Mark Brown <broonie@...nel.org>,
	linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org,
	linux-stm32@...md-mailman.stormreply.com,
	Antonio Borneo <antonio.borneo@...s.st.com>
Subject: Re: Crash on armv7-a using KASAN

On Wed, Oct 16, 2024 at 09:00:22PM +0200, Linus Walleij wrote:
> On Wed, Oct 16, 2024 at 10:55 AM Mark Rutland <mark.rutland@....com> wrote:
> 
> > I believe that's necessary for the lazy TLB switch, at least for SMP:
> >
> >         // CPU 0                        // CPU 1
> >
> >         << switches to task X's mm >>
> >
> >                                         << creates kthread task Y >>
> >                                         << maps task Y's new stack >>
> >                                         << maps task Y's new shadow >>
> >
> >                                         // Y switched out
> >                                         context_switch(..., Y, ..., ...);
> >
> >         // Switch from X to Y
> >         context_switch(..., X, Y, ...) {
> >                 // prev = X
> >                 // next = Y
> >
> >                 if (!next->mm) {
> >                         // Y has no mm
> >                         // No switch_mm() here
> >                         // ... so no check_vmalloc_seq()
> >                 } else {
> >                         // not taken
> >                 }
> >
> >                 ...
> >
> >                 // X's mm still lacks Y's stack + shadow here
> >
> >                 switch_to(prev, next, prev);
> >         }
> >
> > ... so probably worth a comment that we're faulting in the new
> > stack+shadow for lazy tlb when switching to a task with no mm?
> 
> Switching to a task with no mm == switching to a kernel daemon.

That's a common assumption, but it's not always true:

* A kernel thread can have an mm: see kthread_use_mm() and
  kthread_unuse_mm().

* A user thread can lose its mm while exiting: see how do_exit() calls
  exit_mm(), and how the task remains preemptible for a while
  thereafter.

... so we really do just mean "a task with no mm".
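
To make the distinction concrete, here is a rough user-space sketch (not
kernel code). The toy_* types and TOY_PF_KTHREAD are hypothetical
stand-ins for the kernel's task_struct, mm_struct and PF_KTHREAD; the
point is only that the lazy-switch condition is "mm is NULL", which is
independent of whether the task is a kernel thread:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: these mimic, but are not, the kernel's definitions. */
#define TOY_PF_KTHREAD 0x1u	/* hypothetical stand-in for PF_KTHREAD */

struct toy_mm {
	int id;
};

struct toy_task {
	unsigned int flags;
	struct toy_mm *mm;	/* NULL means "a task with no mm" */
};

/* The condition that matters for the lazy switch: no mm, nothing else. */
static bool toy_is_lazy_switch(const struct toy_task *next)
{
	assert(next != NULL);
	return next->mm == NULL;
}

/* The tempting shortcut: treating "kernel thread" as "has no mm". */
static bool toy_is_kthread(const struct toy_task *t)
{
	assert(t != NULL);
	return (t->flags & TOY_PF_KTHREAD) != 0;
}
```

In this model a kthread that has borrowed an mm (as with
kthread_use_mm()) is a kthread but not a lazy switch, and a user task
past exit_mm() is a lazy switch but not a kthread.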

> And those only use the kernel memory and relies on that always
> being mapped in any previous mm context, right.

A task with no mm only uses kernel memory. Anything it uses must be
mapped in init_mm, but *might* not have been copied into every other mm,
and hence might not be in the previous mm context as per the example
above.

> But where do we put that comment? In kernel/sched/core.c
> context_switch()?

I was trying to suggest we update the existing comment in switch_to() to
be more explicit. e.g. expand the existing comment:

	@
	@ Do a dummy read from the new stack while running from the old one so
	@ that we can rely on do_translation_fault() to fix up any stale PMD
	@ entries covering the vmalloc region.
	@

... with:

	@
	@ For a non-lazy mm switch, check_vmalloc_seq() has ensured that
	@ the active mm's page tables have mappings for the prev
	@ task's stack and the next task's stack.
	@
	@ For a lazy mm switch, the active mm's page tables have mappings
	@ for the prev task's stack but might not have mappings for the
	@ next task's stack. Do a dummy read from the new stack while
	@ running from the old stack so that we can rely on
	@ do_translation_fault() to fix up any stale PMD entries
	@ covering the vmalloc region.
	@
	@
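
The two cases in that comment can be modelled in user space roughly as
below. This is a simplified sketch, not the arch/arm implementation:
every model_* name is hypothetical, the page tables are reduced to an
array of "PMD valid" flags, and the fault fix-up is reduced to copying
one flag from the init_mm model.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MODEL_NR_SLOTS 8	/* hypothetical: a few vmalloc PMD slots */

struct model_mm {
	unsigned int vmalloc_seq;
	bool pmd_valid[MODEL_NR_SLOTS];
};

struct model_task {
	struct model_mm *mm;	/* NULL: a task with no mm */
	int stack_slot;
};

static struct model_mm model_init_mm;
static unsigned int model_fault_count;

/* Map a new vmalloc'd stack: only the init_mm model is updated eagerly. */
static void model_map_stack(struct model_task *t, int slot)
{
	assert(slot >= 0 && slot < MODEL_NR_SLOTS);
	t->stack_slot = slot;
	model_init_mm.pmd_valid[slot] = true;
	model_init_mm.vmalloc_seq++;
}

/* Stand-in for check_vmalloc_seq(): resync a stale mm from init_mm. */
static void model_check_vmalloc_seq(struct model_mm *mm)
{
	if (mm->vmalloc_seq == model_init_mm.vmalloc_seq)
		return;
	memcpy(mm->pmd_valid, model_init_mm.pmd_valid, sizeof(mm->pmd_valid));
	mm->vmalloc_seq = model_init_mm.vmalloc_seq;
}

/* Stand-in for the dummy read: a miss "faults" and copies one entry. */
static void model_dummy_read(struct model_mm *active, int slot)
{
	if (!active->pmd_valid[slot]) {
		model_fault_count++;
		active->pmd_valid[slot] = model_init_mm.pmd_valid[slot];
	}
}

/* Returns the mm that is active after the switch. */
static struct model_mm *model_context_switch(struct model_mm *active,
					     struct model_task *next)
{
	if (next->mm) {
		/* Non-lazy: the switch_mm() path resyncs the next mm. */
		model_check_vmalloc_seq(next->mm);
		active = next->mm;
	}
	/* Lazy: we keep running on the previous mm, which may be stale. */
	model_dummy_read(active, next->stack_slot);
	return active;
}
```

In this model a lazy switch onto a freshly mapped stack takes exactly
one fix-up fault, while a non-lazy switch takes none because the resync
has already copied the entry.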

Ard, does that sound good to you?

Mark.