lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Fri, 13 Nov 2020 09:23:39 -0800
From:   Andy Lutomirski <luto@...nel.org>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     Paul McKenney <paulmck@...nel.org>, eupm90@...il.com,
        Thomas Gleixner <tglx@...utronix.de>,
        LKML <linux-kernel@...r.kernel.org>, X86 ML <x86@...nel.org>
Subject: Re: #PF from NMI

On Fri, Nov 13, 2020 at 4:53 AM Peter Zijlstra <peterz@...radead.org> wrote:
>
> Hi
>
> Eugenio reported
> (https://bugzilla.kernel.org/attachment.cgi?id=293659&action=edit):
>
> [  139.226723] ------------[ cut here ]------------
> [  139.226724] WARNING: CPU: 9 PID: 2290 at kernel/rcu/tree.c:932 __rcu_irq_enter_check_tick+0x84/0xd0
> [  139.226725] Modules linked in: xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 nft_compat nft_counter nft_chain_nat nf_e
> [  139.226746] CPU: 9 PID: 2290 Comm: perf Tainted: G S                5.10.0-rc2+ #1
> [  139.226746] Hardware name: Dell Inc. PowerEdge R430/0CN7X8, BIOS 2.6.0 10/31/2017
> [  139.226747] RIP: 0010:__rcu_irq_enter_check_tick+0x84/0xd0
> [  139.226747] Code: 75 cc 48 8b 7b 18 e8 4b c9 84 00 80 bb 05 01 00 00 00 74 09 80 bb 06 01 00 00 00 74 11 48 8b 7b 18 c6 07 00 0f 1f 40 00
> [  139.226748] RSP: 0000:ffffbdb38586b8c8 EFLAGS: 00010006
> [  139.226749] RAX: 0000000080110000 RBX: ffff9cdc2f92b9c0 RCX: ffffffff9f600fb7
> [  139.226749] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffbdb38586b938
> [  139.226749] RBP: 000000000000000e R08: 0000000000000000 R09: 0000000000000000
> [  139.226750] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
> [  139.226750] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
> [  139.226750] FS:  00007fb0c5a628c0(0000) GS:ffff9cdc2f900000(0000) knlGS:0000000000000000
> [  139.226751] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  139.226751] CR2: 000000000000000e CR3: 000000010e0ea001 CR4: 00000000003706a0
> [  139.226752] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [  139.226752] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [  139.226752] Call Trace:
> [  139.226753]  irqentry_enter+0x25/0x40
> [  139.226753]  exc_page_fault+0x38/0x4c0
> [  139.226753]  asm_exc_page_fault+0x1e/0x30
> [  139.226753] RIP: 0010:__get_user_nocheck_8+0x6/0x13
> [  139.226754] Code: 01 ca c3 90 0f 01 cb 0f ae e8 0f b7 10 31 c0 0f 01 ca c3 90 0f 01 cb 0f ae e8 8b 10 31 c0 0f 01 ca c3 66 90 0f 01 cb 0f
> [  139.226754] RSP: 0000:ffffbdb38586b9e8 EFLAGS: 00050087
> [  139.226755] RAX: 000000000000000e RBX: ffffbdb38586ba30 RCX: 00007fb0c586f0d2
> [  139.226756] RDX: 0000000000000080 RSI: 000000000000007f RDI: 00000000ffffdff0
> [  139.226756] RBP: ffffbdb38586bf58 R08: 00000000bffffff0 R09: ffffffffa05a0460
> [  139.226756] R10: 00ffffffffffeff0 R11: 00007fffffffeff0 R12: 000000000000000e
> [  139.226757] R13: ffff9cd4daa017c0 R14: 0000000000000000 R15: 000000000000007f
> [  139.226757]  ? kvmclock_cpu_down_prep+0x20/0x20 [kvm]
> [  139.226757]  perf_callchain_user+0xf4/0x280
> [  139.226758]  get_perf_callchain+0x17b/0x1b0
> [  139.226758]  perf_callchain+0x71/0x80
> [  139.226758]  setup_pebs_fixed_sample_data+0x2a8/0x360
> [  139.226759]  intel_pmu_drain_pebs_nhm+0x477/0x5d0
> [  139.226759]  ? get_page_from_freelist+0x117d/0x1390
> [  139.226759]  ? page_counter_cancel+0x1f/0x30
> [  139.226760]  ? page_counter_uncharge+0x1d/0x30
> [  139.226760]  ? drain_stock.isra.54+0x64/0xb0
> [  139.226760]  ? get_page_from_freelist+0x117d/0x1390
> [  139.226761]  handle_pmi_common+0xc3/0x2d0
> [  139.226761]  ? alloc_set_pte+0xd3/0x5e0
> [  139.226761]  ? filemap_map_pages+0x2d2/0x430
> [  139.226762]  ? xfs_filemap_map_pages+0x44/0x60 [xfs]
> [  139.226762]  ? xfs_iunlock+0x86/0xd0 [xfs]
> [  139.226762]  ? handle_mm_fault+0x145b/0x1650
> [  139.226762]  intel_pmu_handle_irq+0xc9/0x1c0
> [  139.226763]  perf_event_nmi_handler+0x24/0x40
> [  139.226763]  nmi_handle+0x55/0x130
> [  139.226763]  default_do_nmi+0x49/0x100
> [  139.226764]  exc_nmi+0x121/0x150
> [  139.226764]  asm_exc_nmi+0x8e/0xd7
> [  139.226764] RIP: 0033:0x7fb0c586f0d2
> [  139.226765] Code: 48 89 44 24 30 48 8b 83 00 03 00 00 48 85 c0 0f 84 d3 00 00 00 48 8b 7c 24 20 8b 8b fc 02 00 00 8b 54 24 38 23 93 f8 08
> [  139.226766] RSP: 002b:00007ffcf13cde60 EFLAGS: 00010246
> [  139.226766] RAX: 00007fb0c309c2b0 RBX: 00007fb0c5a73500 RCX: 0000000000000006
> [  139.226766] RDX: 0000000000000000 RSI: 000000000000000a RDI: 00000000fc2c9fd0
> [  139.226767] RBP: 000000000000000e R08: 00007ffcf13cdf90 R09: 00007fb0c5a8f410
> [  139.226767] R10: 000000000000002b R11: 0000000000000246 R12: 0000000000000000
> [  139.226768] R13: 0000000000000001 R14: 00007fb0c32a1fbc R15: 00007fb0c5a69cd0
>
> Which is a #PF from NMI context, which is perfectly fine. However
> __rcu_irq_enter_check_tick() is triggering WARN.
>
> AFAICT the right thing is to simply remove the warn like so.
>
> ---
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 430ba58d8bfe..9bda92d8b914 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -928,8 +928,8 @@ void __rcu_irq_enter_check_tick(void)
>  {
>         struct rcu_data *rdp = this_cpu_ptr(&rcu_data);
>
> -        // Enabling the tick is unsafe in NMI handlers.
> -       if (WARN_ON_ONCE(in_nmi()))
> +       // if we're here from NMI, there's nothing to do.
> +       if (in_nmi())
>                 return;
>
>         RCU_LOCKDEP_WARN(rcu_dynticks_curr_cpu_in_eqs(),

Reviewed-by: Andy Lutomirski <luto@...nel.org>

with the following caveat that has nothing to do with NMI: the rest of
irqentry_enter() has tracing calls in it.  Does anything prevent
infinite recursion if one of those tracing calls causes a page fault?

--Andy

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ