Message-ID: <Z_0lSxPcw4WW1wAP@gmail.com>
Date: Mon, 14 Apr 2025 17:10:03 +0200
From: Ingo Molnar <mingo@...nel.org>
To: Peter Zijlstra <peterz@...radead.org>
Cc: syzbot <syzbot+c2537ce72a879a38113e@...kaller.appspotmail.com>,
	riel@...riel.com, bp@...en8.de, dave.hansen@...ux.intel.com,
	hpa@...or.com, linux-kernel@...r.kernel.org,
	linux-next@...r.kernel.org, luto@...nel.org, mingo@...hat.com,
	sfr@...b.auug.org.au, syzkaller-bugs@...glegroups.com,
	tglx@...utronix.de, x86@...nel.org
Subject: Re: [syzbot] [kernel?] linux-next test error: WARNING in
 switch_mm_irqs_off


* Peter Zijlstra <peterz@...radead.org> wrote:

> > Call Trace:
> >  <TASK>
> >  unuse_temporary_mm+0x9f/0x100 arch/x86/mm/tlb.c:1038
> >  __text_poke+0x7b6/0xb40 arch/x86/kernel/alternative.c:2214
> >  text_poke arch/x86/kernel/alternative.c:2257 [inline]
> >  smp_text_poke_batch_finish+0x3e7/0x12c0 arch/x86/kernel/alternative.c:2565
> >  arch_jump_label_transform_apply+0x1c/0x30 arch/x86/kernel/jump_label.c:146
> >  static_key_disable_cpuslocked+0xd2/0x1c0 kernel/jump_label.c:240
> >  static_key_disable+0x1a/0x20 kernel/jump_label.c:248
> >  once_deferred+0x70/0xb0 lib/once.c:20
> >  process_one_work kernel/workqueue.c:3238 [inline]
> >  process_scheduled_works+0xac3/0x18e0 kernel/workqueue.c:3319
> >  worker_thread+0x870/0xd50 kernel/workqueue.c:3400
> >  kthread+0x7b7/0x940 kernel/kthread.c:464
> >  ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:153
> >  ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245
> >  </TASK>
> 
> So I can reproduce, and I think I see what happens, except I'm
> confused as to why the recently merged patches show this.
> 
> AFAIU what happens is that unuse_temporary_mm() clears the 
> mm_cpumask() for the current CPU, while switch_mm_irqs_off() then 
> checks that the mm_cpumask() bit is set for the current CPU.
> 
> This behaviour hasn't really changed since 209954cbc7d0 ("x86/mm/tlb: 
> Update mm_cpumask lazily") introduced both.
> 
> I'm not entirely sure what the best way forward is.. we can simply 
> delete the warning, or make use_temporary_mm() tag the special MMs 
> somehow and exclude them from the warning.

So, mm_cpumask is basically tracking which CPUs the MM ran on, and 
this gets cleared lazily whenever there's an opportune time, but not 
during context switches (for shared cacheline performance reasons), 
right?

So why do we need to clear the mm_cpumask in unuse_temporary_mm() to 
begin with:

	/* Clear the cpumask, to indicate no TLB flushing is needed anywhere */
	cpumask_clear_cpu(smp_processor_id(), mm_cpumask(this_cpu_read(cpu_tlbstate.loaded_mm)));

What TLB flushing are we worried about here? Nothing much should 
trigger any TLB flushing for text_poke_mm AFAICS?
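
To make the sequence Peter describes above concrete, here is a toy
user-space model of it. This is only a sketch, not kernel code: every
name in it (toy_mm, toy_switch_mm, toy_unuse_temporary_mm, toy_demo) is
hypothetical, a plain bitmask stands in for mm_cpumask(), and applying
the check to the outgoing MM is an assumption of the model rather than
a statement about what switch_mm_irqs_off() actually does.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for an mm and its mm_cpumask(): one bit per CPU. */
struct toy_mm {
	const char *name;
	unsigned long cpumask;
};

/* Hypothetical stand-in for cpu_tlbstate.loaded_mm on a single CPU. */
static struct toy_mm *toy_loaded_mm;
static int toy_warnings;

/*
 * Models the check as described in the thread: when switching away,
 * the current CPU's bit is expected to be set in the mask.
 * ASSUMPTION: the toy checks the outgoing (currently loaded) mm.
 */
static void toy_switch_mm(struct toy_mm *next, int cpu)
{
	struct toy_mm *prev = toy_loaded_mm;

	if (prev && !(prev->cpumask & (1UL << cpu))) {
		toy_warnings++;
		printf("WARNING: cpu %d not set in cpumask of %s\n",
		       cpu, prev->name);
	}

	next->cpumask |= 1UL << cpu;
	toy_loaded_mm = next;
}

/* Models the ordering in question: clear the bit first, then switch. */
static void toy_unuse_temporary_mm(struct toy_mm *prev_mm, int cpu)
{
	toy_loaded_mm->cpumask &= ~(1UL << cpu);
	toy_switch_mm(prev_mm, cpu);
}

/* Runs one use/unuse cycle and returns how many warnings fired. */
int toy_demo(void)
{
	struct toy_mm user = { "user_mm", 0 };
	struct toy_mm poke = { "text_poke_mm", 0 };
	int cpu = 0;

	toy_warnings = 0;
	toy_loaded_mm = NULL;

	toy_switch_mm(&user, cpu);		/* normal mm is loaded */
	toy_switch_mm(&poke, cpu);		/* "use" the temporary mm */
	toy_unuse_temporary_mm(&user, cpu);	/* clear bit, then switch back */

	return toy_warnings;
}
```

Calling toy_demo() from a main() trips the toy check once, on the
switch back from text_poke_mm. Dropping the mask-clearing line in
toy_unuse_temporary_mm() makes the toy warning go away, which is the
same trade-off as the question above: either the clear is not needed,
or the check has to know about temporary MMs.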

Thanks,

	Ingo
