lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <Y9EDD3PBNkxlZjTZ@FVFF77S0Q05N>
Date:   Wed, 25 Jan 2023 10:23:11 +0000
From:   Mark Rutland <mark.rutland@....com>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     mingo@...nel.org, will@...nel.org, boqun.feng@...il.com,
        tglx@...utronix.de, bp@...en8.de, dave.hansen@...ux.intel.com,
        x86@...nel.org, hpa@...or.com, seanjc@...gle.com,
        pbonzini@...hat.com, jgross@...e.com, srivatsa@...il.mit.edu,
        amakhalov@...are.com, pv-drivers@...are.com, rostedt@...dmis.org,
        mhiramat@...nel.org, wanpengli@...cent.com, vkuznets@...hat.com,
        boris.ostrovsky@...cle.com, rafael@...nel.org,
        daniel.lezcano@...aro.org, juri.lelli@...hat.com,
        vincent.guittot@...aro.org, dietmar.eggemann@....com,
        bsegall@...gle.com, mgorman@...e.de, bristot@...hat.com,
        vschneid@...hat.com, linux-kernel@...r.kernel.org,
        kvm@...r.kernel.org, virtualization@...ts.linux-foundation.org,
        linux-trace-kernel@...r.kernel.org, linux-pm@...r.kernel.org
Subject: Re: [PATCH 0/6] A few cpuidle vs rcu fixes

On Wed, Jan 25, 2023 at 10:40:17AM +0100, Peter Zijlstra wrote:
> On Wed, Jan 25, 2023 at 10:35:16AM +0100, Peter Zijlstra wrote:
> > tip/sched/core contains the following patch addressing this:
> > 
> > ---
> > commit 9aedeaed6fc6fe8452b9b8225e95cc2b8631ff91
> > Author: Peter Zijlstra <peterz@...radead.org>
> > Date:   Thu Jan 12 20:43:49 2023 +0100
> > 
> >     tracing, hardirq: No moar _rcuidle() tracing
> >     
> >     Robot reported that trace_hardirqs_{on,off}() tickle the forbidden
> >     _rcuidle() tracepoint through local_irq_{en,dis}able().
> >     
> >     For 'sane' configs, these calls will only happen with RCU enabled and
> >     as such can use the regular tracepoint. This also means it's possible
> >     to trace them from NMI context again.
> >     
> >     Reported-by: kernel test robot <lkp@...el.com>
> >     Signed-off-by: Peter Zijlstra (Intel) <peterz@...radead.org>
> >     Signed-off-by: Ingo Molnar <mingo@...nel.org>
> >     Link: https://lore.kernel.org/r/20230112195541.477416709@infradead.org
> > 
> > diff --git a/kernel/trace/trace_preemptirq.c b/kernel/trace/trace_preemptirq.c
> > index 629f2854e12b..f992444a0b1f 100644
> > --- a/kernel/trace/trace_preemptirq.c
> > +++ b/kernel/trace/trace_preemptirq.c
> > @@ -19,6 +19,20 @@
> >  /* Per-cpu variable to prevent redundant calls when IRQs already off */
> >  static DEFINE_PER_CPU(int, tracing_irq_cpu);
> >  
> > +/*
> > + * Use regular trace points on architectures that implement noinstr
> > + * tooling: these calls will only happen with RCU enabled, which can
> > + * use a regular tracepoint.
> > + *
> > + * On older architectures, use the rcuidle tracing methods (which
> > + * aren't NMI-safe - so exclude NMI contexts):
> > + */
> > +#ifdef CONFIG_ARCH_WANTS_NO_INSTR
> > +#define trace(point)	trace_##point
> > +#else
> > +#define trace(point)	if (!in_nmi()) trace_##point##_rcuidle
> > +#endif
> > +
> >  /*
> >   * Like trace_hardirqs_on() but without the lockdep invocation. This is
> >   * used in the low level entry code where the ordering vs. RCU is important
> 
> For some reason I missed the trace_preempt_{on,off} things, so that then
> gets the below on top or so.
> 
> diff --git a/kernel/trace/trace_preemptirq.c b/kernel/trace/trace_preemptirq.c
> index f992444a0b1f..ea96b41c8838 100644
> --- a/kernel/trace/trace_preemptirq.c
> +++ b/kernel/trace/trace_preemptirq.c
> @@ -100,15 +100,13 @@ NOKPROBE_SYMBOL(trace_hardirqs_off);
>  
>  void trace_preempt_on(unsigned long a0, unsigned long a1)
>  {
> -	if (!in_nmi())
> -		trace_preempt_enable_rcuidle(a0, a1);
> +	trace(preempt_enable)(a0, a1);
>  	tracer_preempt_on(a0, a1);
>  }
>  
>  void trace_preempt_off(unsigned long a0, unsigned long a1)
>  {
> -	if (!in_nmi())
> -		trace_preempt_disable_rcuidle(a0, a1);
> +	trace(preempt_disable)(a0, a1);
>  	tracer_preempt_off(a0, a1);
>  }
>  #endif

I've tested this fixlet atop this series (itself atop tip/sched/core) with a
full-fat ftrace config and the ftrace selftests, and that all runs cleanly.

FWIW, if you spin this as a patch:

Tested-by: Mark Rutland <mark.rutland@....com>

Without the fixlet I get splats on both arm64 and x86_64, e.g.

On arm64:

| ------------[ cut here ]------------
| WARNING: CPU: 1 PID: 1162 at include/trace/events/preemptirq.h:55 trace_preempt_on+0x68/0x70
| Modules linked in:
| CPU: 1 PID: 1162 Comm: ftracetest Not tainted 6.2.0-rc1-00100-g1066815869f5 #1
| Hardware name: linux,dummy-virt (DT)
| pstate: 40400005 (nZcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
| pc : trace_preempt_on+0x68/0x70
| lr : preempt_count_sub+0xb4/0xf0
| sp : ffff80000e04ba70
| x29: ffff80000e04ba70 x28: ffff80000ade09e8 x27: ffff800009d0c960
| x26: ffff000007c97e10 x25: ffff000007c97e00 x24: ffff80000adf1410
| x23: ffff80000e04bb80 x22: ffff80000a9aabb0 x21: ffff8000080a5cf4
| x20: ffff8000080a5cf4 x19: 0000000000000001 x18: 0000000000000000
| x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
| x14: 0000000000000028 x13: 0000000000004320 x12: 0000000000000361
| x11: 0000000000000361 x10: 0000000000040361 x9 : ffff8000080a5cf4
| x8 : ffff80000ae42a18 x7 : 0000000000000000 x6 : 0000000000000001
| x5 : ffff80000a8f14e8 x4 : 0000000000000001 x3 : 0000000000000000
| x2 : 0000000000000007 x1 : ffff8000080a5cf4 x0 : ffff8000080a5cf4
| Call trace:
|  trace_preempt_on+0x68/0x70
|  preempt_count_sub+0xb4/0xf0
|  percpu_up_read.constprop.0+0xc4/0x180
|  cpus_read_unlock+0x18/0x24
|  static_key_enable+0x2c/0x40
|  tracepoint_add_func+0x330/0x3dc
|  tracepoint_probe_register+0x74/0xc0
|  trace_event_reg+0x8c/0xa0
|  __ftrace_event_enable_disable+0x174/0x4d0
|  __ftrace_set_clr_event_nolock+0xe0/0x150
|  ftrace_set_clr_event+0x90/0x13c
|  ftrace_event_write+0xd4/0x120
|  vfs_write+0xcc/0x2f0
|  ksys_write+0x78/0x110
|  __arm64_sys_write+0x24/0x30
|  invoke_syscall+0x50/0x120
|  el0_svc_common.constprop.0+0x68/0x124
|  do_el0_svc+0x40/0xbc
|  el0_svc+0x48/0xc0
|  el0t_64_sync_handler+0xf4/0x120
|  el0t_64_sync+0x190/0x194
| irq event stamp: 69662
| hardirqs last  enabled at (69661): [<ffff8000092d63f4>] _raw_spin_unlock_irqrestore+0x80/0xa0
| hardirqs last disabled at (69662): [<ffff8000092c4d64>] el1_dbg+0x24/0x90
| softirqs last  enabled at (69564): [<ffff800008010b08>] __do_softirq+0x448/0x5bc
| softirqs last disabled at (69555): [<ffff800008017288>] ____do_softirq+0x18/0x24
| ---[ end trace 0000000000000000 ]---
| ------------[ cut here ]------------
| WARNING: CPU: 1 PID: 1162 at include/trace/events/preemptirq.h:51 trace_preempt_off+0x68/0xb0
| Modules linked in:
| CPU: 1 PID: 1162 Comm: ftracetest Tainted: G        W          6.2.0-rc1-00100-g1066815869f5 #1
| Hardware name: linux,dummy-virt (DT)
| pstate: 40400005 (nZcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
| pc : trace_preempt_off+0x68/0xb0
| lr : preempt_count_add+0xa0/0xc0
| sp : ffff80000e04ba80
| x29: ffff80000e04ba80 x28: ffff80000ade09e8 x27: ffff800009d0c970
| x26: ffff000007c97e90 x25: ffff000007c97e80 x24: ffff80000adf13c8
| x23: ffff80000e04bb80 x22: ffff80000a9aabb0 x21: ffff80000b8c8d18
| x20: ffff8000080a5c68 x19: ffff8000080a5c68 x18: 0000000000000000
| x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
| x14: 0000000000000028 x13: 0000000000004320 x12: 0000000000000361
| x11: 0000000000000361 x10: 0000000000040361 x9 : ffff8000080a5c68
| x8 : ffff80000ae42a18 x7 : 0000000000000000 x6 : 0000000000000003
| x5 : 000000003a9e6115 x4 : 0000000000000001 x3 : 0000000000000000
| x2 : 0000000000000007 x1 : ffff8000080a5c68 x0 : ffff8000080a5c68
| Call trace:
|  trace_preempt_off+0x68/0xb0
|  preempt_count_add+0xa0/0xc0
|  percpu_up_read.constprop.0+0x38/0x180
|  cpus_read_unlock+0x18/0x24
|  static_key_enable+0x2c/0x40
|  tracepoint_add_func+0x330/0x3dc
|  tracepoint_probe_register+0x74/0xc0
|  trace_event_reg+0x8c/0xa0
|  __ftrace_event_enable_disable+0x174/0x4d0
|  __ftrace_set_clr_event_nolock+0xe0/0x150
|  ftrace_set_clr_event+0x90/0x13c
|  ftrace_event_write+0xd4/0x120
|  vfs_write+0xcc/0x2f0
|  ksys_write+0x78/0x110
|  __arm64_sys_write+0x24/0x30
|  invoke_syscall+0x50/0x120
|  el0_svc_common.constprop.0+0x68/0x124
|  do_el0_svc+0x40/0xbc
|  el0_svc+0x48/0xc0
|  el0t_64_sync_handler+0xf4/0x120
|  el0t_64_sync+0x190/0x194
| irq event stamp: 69686
| hardirqs last  enabled at (69685): [<ffff8000092c4db8>] el1_dbg+0x78/0x90
| hardirqs last disabled at (69686): [<ffff8000092c4d64>] el1_dbg+0x24/0x90
| softirqs last  enabled at (69670): [<ffff800008010b08>] __do_softirq+0x448/0x5bc
| softirqs last disabled at (69665): [<ffff800008017288>] ____do_softirq+0x18/0x24
| ---[ end trace 0000000000000000 ]---

On x86_64:

| ------------[ cut here ]------------
| WARNING: CPU: 0 PID: 1083 at include/trace/events/preemptirq.h:55 trace_preempt_on+0x31/0x40
| Modules linked in:
| CPU: 0 PID: 1083 Comm: ftracetest Not tainted 6.2.0-rc1-00100-g1066815869f5 #3
| Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014
| RIP: 0010:trace_preempt_on+0x31/0x40
| Code: 1d d6 45 4c a9 00 00 f0 00 74 05 e9 29 11 00 00 cc 90 e9 22 11 00 00 65 8b 05 07 d6 45 4c 89 c0 48 0f a3 05 c1 af b1 01 73 e1 <0f> 0b eb dd 66 66 2e 0f 1f 84 00 00 00 00 00 90 90 90 0
| RSP: 0018:ffffb374c0ce7ba8 EFLAGS: 00010247
| RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000
| RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffffffffb49e9c1d
| RBP: 0000000000000000 R08: 000000004ea10906 R09: 000000005e02388f
| R10: 00000000f5e02388 R11: 00000000d269a5f5 R12: ffff98620005c780
| R13: ffffffffb3a363d0 R14: 0000000000000a37 R15: 0000000000000a38
| FS:  00007fc812d836a0(0000) GS:ffff98623bc00000(0000) knlGS:0000000000000000
| CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
| CR2: 0000000000642e58 CR3: 00000001039c6005 CR4: 0000000000370ef0
| DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
| DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
| Call Trace:
|  <TASK>
|  preempt_count_sub+0xa3/0xe0
|  _raw_spin_unlock+0x2d/0x50
|  ? trace_preempt_on+0x17/0x40
|  __text_poke+0x326/0x4c0
|  ? trace_preempt_on+0x17/0x40
|  text_poke_bp_batch+0x7f/0x360
|  text_poke_finish+0x1f/0x30
|  arch_jump_label_transform_apply+0x1c/0x30
|  static_key_enable_cpuslocked+0x65/0xa0
|  static_key_enable+0x1a/0x20
|  tracepoint_add_func+0x32c/0x430
|  ? __pfx_trace_event_raw_event_preemptirq_template+0x10/0x10
|  tracepoint_probe_register+0x78/0xb0
|  ? __pfx_trace_event_raw_event_preemptirq_template+0x10/0x10
|  __ftrace_event_enable_disable+0x180/0x250
|  __ftrace_set_clr_event_nolock+0xe3/0x130
|  ftrace_set_clr_event+0x74/0xf0
|  ftrace_event_write+0xdd/0x110
|  vfs_write+0xee/0x510
|  ksys_write+0x75/0x100
|  do_syscall_64+0x3e/0x90
|  entry_SYSCALL_64_after_hwframe+0x72/0xdc
| RIP: 0033:0x7fc812d09103
| Code: 8b 7c 24 08 89 c5 e8 c5 ff ff ff 89 ef 89 44 24 08 e8 81 bc 02 00 8b 44 24 08 48 83 c4 10 5d c3 48 63 ff b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 76 10 48 8b 15 4e fd 05 00 f7 d8 64 3
| RSP: 002b:00007ffcb12972f8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
| RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007fc812d09103
| RDX: 0000000000000004 RSI: 0000000000645050 RDI: 0000000000000001
| RBP: 0000000000645050 R08: fefefefefefefeff R09: ffffffff00000000
| R10: 0000000000000001 R11: 0000000000000246 R12: 0000000000000004
| R13: 00007fc812d83690 R14: 0000000000000001 R15: 0000000000000000
|  </TASK>
| irq event stamp: 57597
| hardirqs last  enabled at (57609): [<ffffffffb3b1384e>] __up_console_sem+0x5e/0x80
| hardirqs last disabled at (57620): [<ffffffffb3b13833>] __up_console_sem+0x43/0x80
| softirqs last  enabled at (57556): [<ffffffffb49eae94>] __do_softirq+0x354/0x4d7
| softirqs last disabled at (57543): [<ffffffffb3a92ad7>] irq_exit_rcu+0xc7/0x140
| ---[ end trace 0000000000000000 ]---
| ------------[ cut here ]------------
| WARNING: CPU: 0 PID: 1083 at include/trace/events/preemptirq.h:51 trace_preempt_off+0x31/0x40
| Modules linked in:
| CPU: 0 PID: 1083 Comm: ftracetest Tainted: G        W          6.2.0-rc1-00100-g1066815869f5 #3
| Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014
| RIP: 0010:trace_preempt_off+0x31/0x40
| Code: cd d5 45 4c a9 00 00 f0 00 74 05 e9 f9 11 00 00 cc 90 e9 f2 11 00 00 65 8b 05 b7 d5 45 4c 89 c0 48 0f a3 05 71 af b1 01 73 e1 <0f> 0b eb dd 66 2e 0f 1f 84 00 00 00 00 00 90 90 90 90 0
| RSP: 0018:ffffb374c0ce7c18 EFLAGS: 00010247
| RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000001
| RDX: 0000000000000000 RSI: ffffffffb3b74639 RDI: ffffffffb3b74639
| RBP: ffffffffb56faa10 R08: ffffffffb56faa10 R09: 000000005e02388f
| R10: 00000000f5e02388 R11: 00000000d269a5f5 R12: ffffffffb3a36d20
| R13: 0000000000000000 R14: 0000000000000000 R15: ffff9862038a0780
| FS:  00007fc812d836a0(0000) GS:ffff98623bc00000(0000) knlGS:0000000000000000
| CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
| CR2: 0000000000642e58 CR3: 00000001039c6005 CR4: 0000000000370ef0
| DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
| DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
| Call Trace:
|  <TASK>
|  on_each_cpu_cond_mask+0x29/0x80
|  text_poke_bp_batch+0xe7/0x360
|  text_poke_finish+0x1f/0x30
|  arch_jump_label_transform_apply+0x1c/0x30
|  static_key_enable_cpuslocked+0x65/0xa0
|  static_key_enable+0x1a/0x20
|  tracepoint_add_func+0x32c/0x430
|  ? __pfx_trace_event_raw_event_preemptirq_template+0x10/0x10
|  tracepoint_probe_register+0x78/0xb0
|  ? __pfx_trace_event_raw_event_preemptirq_template+0x10/0x10
|  __ftrace_event_enable_disable+0x180/0x250
|  __ftrace_set_clr_event_nolock+0xe3/0x130
|  ftrace_set_clr_event+0x74/0xf0
|  ftrace_event_write+0xdd/0x110
|  vfs_write+0xee/0x510
|  ksys_write+0x75/0x100
|  do_syscall_64+0x3e/0x90
|  entry_SYSCALL_64_after_hwframe+0x72/0xdc
| RIP: 0033:0x7fc812d09103
| Code: 8b 7c 24 08 89 c5 e8 c5 ff ff ff 89 ef 89 44 24 08 e8 81 bc 02 00 8b 44 24 08 48 83 c4 10 5d c3 48 63 ff b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 76 10 48 8b 15 4e fd 05 00 f7 d8 64 3
| RSP: 002b:00007ffcb12972f8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
| RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007fc812d09103
| RDX: 0000000000000004 RSI: 0000000000645050 RDI: 0000000000000001
| RBP: 0000000000645050 R08: fefefefefefefeff R09: ffffffff00000000
| R10: 0000000000000001 R11: 0000000000000246 R12: 0000000000000004
| R13: 00007fc812d83690 R14: 0000000000000001 R15: 0000000000000000
|  </TASK>
| irq event stamp: 58293
| hardirqs last  enabled at (58303): [<ffffffffb3b1384e>] __up_console_sem+0x5e/0x80
| hardirqs last disabled at (58314): [<ffffffffb3b13833>] __up_console_sem+0x43/0x80
| softirqs last  enabled at (57820): [<ffffffffb49eae94>] __do_softirq+0x354/0x4d7
| softirqs last disabled at (57811): [<ffffffffb3a92ad7>] irq_exit_rcu+0xc7/0x140
| ---[ end trace 0000000000000000 ]---

Thanks,
Mark.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ