lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <597352bb-6afa-4fa4-a5ee-1f0aa14e61be@redhat.com>
Date:   Fri, 1 Dec 2023 17:22:26 -0500
From:   Waiman Long <longman@...hat.com>
To:     Catalin Marinas <catalin.marinas@....com>, linux-mm@...ck.org
Cc:     linux-kernel@...r.kernel.org,
        Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [PATCH] kmemleak: Avoid RCU stalls when freeing metadata for
 per-CPU pointers

On 12/1/23 14:08, Catalin Marinas wrote:
> On systems with large number of CPUs, the following soft lockup splat
> might sometimes happen:
>
> [ 2656.001617] watchdog: BUG: soft lockup - CPU#364 stuck for 21s! [ksoftirqd/364:2206]
>    :
> [ 2656.141194] RIP: 0010:_raw_spin_unlock_irqrestore+0x3d/0x70
>    :
>   2656.241214] Call Trace:
> [ 2656.243971]  <IRQ>
> [ 2656.246237]  ? show_trace_log_lvl+0x1c4/0x2df
> [ 2656.251152]  ? show_trace_log_lvl+0x1c4/0x2df
> [ 2656.256066]  ? kmemleak_free_percpu+0x11f/0x1f0
> [ 2656.261173]  ? watchdog_timer_fn+0x379/0x470
> [ 2656.265984]  ? __pfx_watchdog_timer_fn+0x10/0x10
> [ 2656.271179]  ? __hrtimer_run_queues+0x5f3/0xd00
> [ 2656.276283]  ? __pfx___hrtimer_run_queues+0x10/0x10
> [ 2656.281783]  ? ktime_get_update_offsets_now+0x95/0x2c0
> [ 2656.287573]  ? ktime_get_update_offsets_now+0xdd/0x2c0
> [ 2656.293380]  ? hrtimer_interrupt+0x2e9/0x780
> [ 2656.298221]  ? __sysvec_apic_timer_interrupt+0x184/0x640
> [ 2656.304211]  ? sysvec_apic_timer_interrupt+0x8e/0xc0
> [ 2656.309807]  </IRQ>
> [ 2656.312169]  <TASK>
> [ 2656.326110]  kmemleak_free_percpu+0x11f/0x1f0
> [ 2656.331015]  free_percpu.part.0+0x1b/0xe70
> [ 2656.335635]  free_vfsmnt+0xb9/0x100
> [ 2656.339567]  rcu_do_batch+0x3c8/0xe30
> [ 2656.363693]  rcu_core+0x3de/0x5a0
> [ 2656.367433]  __do_softirq+0x2d0/0x9a8
> [ 2656.381119]  run_ksoftirqd+0x36/0x60
> [ 2656.385145]  smpboot_thread_fn+0x556/0x910
> [ 2656.394971]  kthread+0x2a4/0x350
> [ 2656.402826]  ret_from_fork+0x29/0x50
> [ 2656.406861]  </TASK>
>
> The issue is caused by kmemleak registering each per_cpu_ptr()
> corresponding to the __percpu pointer. This is unnecessary since such
> individual per-CPU pointers are not tracked anyway. Create a new
> object_percpu_tree_root rbtree that stores a single __percpu pointer
> together with an OBJECT_PERCPU flag for the kmemleak metadata. Scanning
> needs to be done for all per_cpu_ptr() pointers with a cond_resched()
> between each CPU iteration to avoid RCU stalls.
>
> Signed-off-by: Catalin Marinas <catalin.marinas@....com>
> Reported-by: Waiman Long <longman@...hat.com>
> Link: https://lore.kernel.org/r/20231127194153.289626-1-longman@redhat.com
> Cc: Andrew Morton <akpm@...ux-foundation.org>
> ---
>
> The only difference from the inlined patch I posted previously is some updated
> comments to include the new object tree.
>
>   mm/kmemleak.c | 178 +++++++++++++++++++++++++++-----------------------
>   1 file changed, 97 insertions(+), 81 deletions(-)
>
> diff --git a/mm/kmemleak.c b/mm/kmemleak.c
> index 1eacca03bedd..eb6cdc3e9af2 100644
> --- a/mm/kmemleak.c
> +++ b/mm/kmemleak.c
> @@ -14,17 +14,15 @@
>    * The following locks and mutexes are used by kmemleak:
>    *
>    * - kmemleak_lock (raw_spinlock_t): protects the object_list as well as
> - *   del_state modifications and accesses to the object_tree_root (or
> - *   object_phys_tree_root). The object_list is the main list holding the
> - *   metadata (struct kmemleak_object) for the allocated memory blocks.
> - *   The object_tree_root and object_phys_tree_root are red
> - *   black trees used to look-up metadata based on a pointer to the
> - *   corresponding memory block. The object_phys_tree_root is for objects
> - *   allocated with physical address. The kmemleak_object structures are
> - *   added to the object_list and object_tree_root (or object_phys_tree_root)
> - *   in the create_object() function called from the kmemleak_alloc() (or
> - *   kmemleak_alloc_phys()) callback and removed in delete_object() called from
> - *   the kmemleak_free() callback
> + *   del_state modifications and accesses to the object trees
> + *   (object_tree_root, object_phys_tree_root, object_percpu_tree_root). The
> + *   object_list is the main list holding the metadata (struct
> + *   kmemleak_object) for the allocated memory blocks. The object trees are
> + *   red black trees used to look-up metadata based on a pointer to the
> + *   corresponding memory block. The kmemleak_object structures are added to
> + *   the object_list and the object tree root in the create_object() function
> + *   called from the kmemleak_alloc() (or kmemleak_alloc_phys()) callback and
> + *   removed in delete_object() called from the kmemleak_free() callback

Just a minor nit. For completeness, should we mention 
kmemleak_alloc_percpu() and kmemleak_free_percpu() here?

Anyway, I won't mind if you want to keep it as it is.

Reviewed-by: Waiman Long <longman@...hat.com>


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ