Date:   Mon, 4 May 2020 13:40:42 -0400
From:   Steven Rostedt <rostedt@...dmis.org>
To:     Joerg Roedel <jroedel@...e.de>
Cc:     Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
        linux-kernel <linux-kernel@...r.kernel.org>,
        Ingo Molnar <mingo@...nel.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Peter Zijlstra <peterz@...radead.org>,
        Borislav Petkov <bp@...en8.de>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Shile Zhang <shile.zhang@...ux.alibaba.com>,
        Andy Lutomirski <luto@...capital.net>,
        "Rafael J. Wysocki" <rafael.j.wysocki@...el.com>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        Tzvetomir Stoyanov <tz.stoyanov@...il.com>
Subject: Re: [PATCH] percpu: Sync vmalloc mappings in pcpu_alloc() and
 free_percpu()

On Mon, 4 May 2020 17:12:36 +0200
Joerg Roedel <jroedel@...e.de> wrote:

> On Thu, Apr 30, 2020 at 10:39:19PM -0400, Steven Rostedt wrote:
> > What's so damn special about alloc_percpu()? It's definitely not a fast
> > path. And it's not used often.  
> 
> Okay, I fixed it in the percpu code. It is definitely not a nice
> solution, but having to call vmalloc_sync_mappings/unmappings() is not a
> nice solution at any place in the code. Here is the patch which fixes
> this issue for me. I am also not sure what to put in the Fixes tag, as
> it is related to tracing code accessing per-cpu data from the page-fault
> handler, and I am not sure when this was introduced. Maybe someone else
> can provide a meaningful Fixes or stable tag.
> 
> I also have an idea in mind how to make this all more robust and get rid
> of the vmalloc_sync_mappings/unmappings() interface, will show more when
> I know it works the way I think it does.
> 
>

Seems that your patch caused a lockdep splat on my box:

 ========================================================
 WARNING: possible irq lock inversion dependency detected
 5.7.0-rc3-test+ #249 Not tainted
 --------------------------------------------------------
 swapper/4/0 just changed the state of lock:
 ffff9a580fdd75a0 (&ndev->lock){++.-}-{2:2}, at: mld_ifc_timer_expire+0x3c/0x350
 but this lock took another, SOFTIRQ-unsafe lock in the past:
  (pgd_lock){+.+.}-{2:2}
 
 
 and interrupts could create inverse lock ordering between them.
 
 
 other info that might help us debug this:
  Possible interrupt unsafe locking scenario:
 
        CPU0                    CPU1
        ----                    ----
   lock(pgd_lock);
                                local_irq_disable();
                                lock(&ndev->lock);
                                lock(pgd_lock);
   <Interrupt>
     lock(&ndev->lock);
 
  *** DEADLOCK ***
 
 1 lock held by swapper/4/0:
  #0: ffff9a581ab05e70 ((&idev->mc_ifc_timer)){+.-.}-{0:0}, at: call_timer_fn+0x5/0x2f0
 
 the shortest dependencies between 2nd lock and 1st lock:
  -> (pgd_lock){+.+.}-{2:2} {
     HARDIRQ-ON-W at:
                       lock_acquire+0xda/0x3d0
                       _raw_spin_lock+0x2f/0x40
                       sync_global_pgds_l4+0x77/0x180
                       pcpu_alloc+0x1fd/0x7b0
                       __kmem_cache_create+0x358/0x540
                       create_cache+0xe1/0x1f0
                       kmem_cache_create_usercopy+0x1a5/0x270
                       kmem_cache_create+0x12/0x20
                       acpi_os_create_cache+0x18/0x30
                       acpi_ut_create_caches+0x47/0xab
                       acpi_ut_init_globals+0xa/0x21a
                       acpi_initialize_subsystem+0x30/0xa5
                       acpi_early_init+0x62/0xd6
                       start_kernel+0x797/0x86a
                       secondary_startup_64+0xa4/0xb0
     SOFTIRQ-ON-W at:
                       lock_acquire+0xda/0x3d0
                       _raw_spin_lock+0x2f/0x40
                       sync_global_pgds_l4+0x77/0x180
                       pcpu_alloc+0x1fd/0x7b0
                       __kmem_cache_create+0x358/0x540
                       create_cache+0xe1/0x1f0
                       kmem_cache_create_usercopy+0x1a5/0x270
                       kmem_cache_create+0x12/0x20
                       acpi_os_create_cache+0x18/0x30
                       acpi_ut_create_caches+0x47/0xab
                       acpi_ut_init_globals+0xa/0x21a
                       acpi_initialize_subsystem+0x30/0xa5
                       acpi_early_init+0x62/0xd6
                       start_kernel+0x797/0x86a
                       secondary_startup_64+0xa4/0xb0
     INITIAL USE at:
   }
   ... key      at: [<ffffffffb96340b8>] pgd_lock+0x18/0x40
   ... acquired at:
    _raw_spin_lock+0x2f/0x40
    sync_global_pgds_l4+0x77/0x180
    pcpu_alloc+0x1fd/0x7b0
    fib_nh_common_init+0x53/0x110
    fib6_nh_init+0x10c/0x700
    ip6_route_info_create+0x344/0x440
    ip6_route_add+0x18/0x90
    addrconf_prefix_route.isra.48+0x17b/0x210
    addrconf_notify+0x743/0x8c0
    notifier_call_chain+0x47/0x70
    __dev_notify_flags+0x9d/0x150
    dev_change_flags+0x48/0x60
    do_setlink+0x39d/0x1080
    rtnl_setlink+0x116/0x190
    rtnetlink_rcv_msg+0x188/0x4b0
    netlink_rcv_skb+0x75/0x140
    netlink_unicast+0x1ae/0x280
    netlink_sendmsg+0x253/0x490
    sock_sendmsg+0x5b/0x60
    __sys_sendto+0x12c/0x190
    __x64_sys_sendto+0x24/0x30
    do_syscall_64+0x60/0x230
    entry_SYSCALL_64_after_hwframe+0x49/0xb3
 
 -> (&ndev->lock){++.-}-{2:2} {
    HARDIRQ-ON-W at:
                     lock_acquire+0xda/0x3d0
                     _raw_write_lock_bh+0x34/0x40
                     ipv6_mc_init_dev+0x19/0xc0
                     ipv6_add_dev+0x2e5/0x490
                     addrconf_init+0x7f/0x250
                     inet6_init+0x1c3/0x373
                     do_one_initcall+0x70/0x340
                     kernel_init_freeable+0x249/0x2ca
                     kernel_init+0xa/0x10a
                     ret_from_fork+0x3a/0x50
    HARDIRQ-ON-R at:
                     lock_acquire+0xda/0x3d0
                     _raw_read_lock_bh+0x37/0x50
                     addrconf_dad_work+0xc6/0x560
                     process_one_work+0x25e/0x5c0
                     worker_thread+0x30/0x380
                     kthread+0x139/0x160
                     ret_from_fork+0x3a/0x50
    IN-SOFTIRQ-R at:
                     lock_acquire+0xda/0x3d0
                     _raw_read_lock_bh+0x37/0x50
                     mld_ifc_timer_expire+0x3c/0x350
                     call_timer_fn+0xa5/0x2f0
                     run_timer_softirq+0x1dd/0x580
                     __do_softirq+0xf8/0x4be
                     irq_exit+0xf1/0x100
                     smp_apic_timer_interrupt+0xd0/0x2a0
                     apic_timer_interrupt+0xf/0x20
                     cpuidle_enter_state+0xcd/0x440
                     cpuidle_enter+0x29/0x40
                     do_idle+0x24a/0x290
                     cpu_startup_entry+0x19/0x20
                     start_secondary+0x195/0x1e0
                     secondary_startup_64+0xa4/0xb0
    INITIAL USE at:
                    lock_acquire+0xda/0x3d0
                    _raw_write_lock_bh+0x34/0x40
                    ipv6_mc_init_dev+0x19/0xc0
                    ipv6_add_dev+0x2e5/0x490
                    addrconf_init+0x7f/0x250
                    inet6_init+0x1c3/0x373
                    do_one_initcall+0x70/0x340
                    kernel_init_freeable+0x249/0x2ca
                    kernel_init+0xa/0x10a
                    ret_from_fork+0x3a/0x50
  }
  ... key      at: [<ffffffffbaf727f0>] __key.78650+0x0/0x10
  ... acquired at:
    mark_lock+0x22e/0x740
    __lock_acquire+0x9e1/0x1c30
    lock_acquire+0xda/0x3d0
    _raw_read_lock_bh+0x37/0x50
    mld_ifc_timer_expire+0x3c/0x350
    call_timer_fn+0xa5/0x2f0
    run_timer_softirq+0x1dd/0x580
    __do_softirq+0xf8/0x4be
    irq_exit+0xf1/0x100
    smp_apic_timer_interrupt+0xd0/0x2a0
    apic_timer_interrupt+0xf/0x20
    cpuidle_enter_state+0xcd/0x440
    cpuidle_enter+0x29/0x40
    do_idle+0x24a/0x290
    cpu_startup_entry+0x19/0x20
    start_secondary+0x195/0x1e0
    secondary_startup_64+0xa4/0xb0
 
 
 stack backtrace:
 CPU: 4 PID: 0 Comm: swapper/4 Not tainted 5.7.0-rc3-test+ #249
 Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01 v03.03 07/14/2016
 Call Trace:
  <IRQ>
  dump_stack+0x8f/0xd0
  check_usage_forwards.cold.61+0x1e/0x27
  mark_lock+0x22e/0x740
  ? check_usage_backwards+0x1e0/0x1e0
  __lock_acquire+0x9e1/0x1c30
  lock_acquire+0xda/0x3d0
  ? mld_ifc_timer_expire+0x3c/0x350
  ? mld_dad_timer_expire+0xb0/0xb0
  ? mld_dad_timer_expire+0xb0/0xb0
  _raw_read_lock_bh+0x37/0x50
  ? mld_ifc_timer_expire+0x3c/0x350
  mld_ifc_timer_expire+0x3c/0x350
  ? mld_dad_timer_expire+0xb0/0xb0
  ? mld_dad_timer_expire+0xb0/0xb0
  call_timer_fn+0xa5/0x2f0
  ? mld_dad_timer_expire+0xb0/0xb0
  run_timer_softirq+0x1dd/0x580
  __do_softirq+0xf8/0x4be
  irq_exit+0xf1/0x100
  smp_apic_timer_interrupt+0xd0/0x2a0
  apic_timer_interrupt+0xf/0x20
  </IRQ>
 RIP: 0010:cpuidle_enter_state+0xcd/0x440
 Code: 80 7c 24 13 00 74 17 9c 58 0f 1f 44 00 00 f6 c4 02 0f 85 0c 03 00 00 31 ff e8 6f 35 8b ff e8 1a 52 92 ff fb 66 0f 1f 44 00 00 <85> ed 0f 88 74 02 00 00 48 63 c5 4c 8b 3c 24 4c 2b 7c 24 08 48 8d
 RSP: 0018:ffff9a581981fe70 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13
 RAX: 0000000000e2cf41 RBX: ffff9a581ab37400 RCX: 0000000000000000
 RDX: ffff9a581982d100 RSI: 0000000000000006 RDI: ffff9a581982d100
 RBP: 0000000000000004 R08: 0000000000000001 R09: 0000000000000000
 R10: 0000000000000000 R11: 0000000000000000 R12: ffffffffb96f14c0
 R13: ffffffffb96f1678 R14: 0000000000000004 R15: 0000000000000004
  cpuidle_enter+0x29/0x40
  do_idle+0x24a/0x290
  cpu_startup_entry+0x19/0x20
  start_secondary+0x195/0x1e0
  secondary_startup_64+0xa4/0xb0
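
For readers less familiar with lockdep reports, the scenario in the splat
above can be sketched with a toy model. This is only an illustration under
stated assumptions, not how lockdep is implemented: the class ToyLockdep and
its methods are hypothetical names, only softirq state is tracked (no hardirq
or read/write distinction), and the four acquisitions are a simplified
reading of the traces in the report.

```python
from collections import defaultdict


class ToyLockdep:
    """Toy model of a softirq lock-inversion check (not lockdep's real algorithm)."""

    def __init__(self):
        self.used_in_softirq = set()   # locks ever acquired from softirq context
        self.softirq_unsafe = set()    # locks ever acquired with softirqs enabled
        self.deps = defaultdict(set)   # held lock -> locks acquired while it was held

    def acquire(self, lock, held=(), in_softirq=False, softirqs_enabled=True):
        """Record one acquisition and return every inversion known so far."""
        if in_softirq:
            self.used_in_softirq.add(lock)
        elif softirqs_enabled:
            self.softirq_unsafe.add(lock)
        for outer in held:
            self.deps[outer].add(lock)
        return self.inversions()

    def reachable(self, start):
        """All locks that can end up nested, directly or indirectly, under start."""
        seen, stack = set(), [start]
        while stack:
            for inner in self.deps.get(stack.pop(), ()):
                if inner not in seen:
                    seen.add(inner)
                    stack.append(inner)
        return seen

    def inversions(self):
        """Pairs (softirq-used lock, softirq-unsafe lock nested under it)."""
        return sorted(
            (outer, inner)
            for outer in self.used_in_softirq
            for inner in self.reachable(outer) & self.softirq_unsafe
        )


ld = ToyLockdep()
# pcpu_alloc() -> sync_global_pgds_l4(): plain spin_lock, softirqs enabled.
r1 = ld.acquire("pgd_lock")
# write_lock_bh(&ndev->lock): process context, softirqs disabled.
r2 = ld.acquire("&ndev->lock", softirqs_enabled=False)
# pgd_lock taken while &ndev->lock is held (the dependency the report
# says was recorded in the past).
r3 = ld.acquire("pgd_lock", held=["&ndev->lock"], softirqs_enabled=False)
# mld_ifc_timer_expire(): read_lock_bh(&ndev->lock) from the timer softirq.
report = ld.acquire("&ndev->lock", in_softirq=True)
print(report)
```

In this model the first three acquisitions report nothing. Only the last
one, which marks &ndev->lock as used in softirq context, yields the pair
(&ndev->lock, pgd_lock). That mirrors the "just changed the state of lock"
wording in the report: the warning fires when the usage state changes, not
when the deadlock actually happens.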


-- Steve
