Message-ID: <CAJD7tka+DMQfD2MYWD03cc39cxCAbbEOOfYUi1CH5f6FkVbKMw@mail.gmail.com>
Date: Thu, 27 Jun 2024 03:36:01 -0700
From: Yosry Ahmed <yosryahmed@...gle.com>
To: Jesper Dangaard Brouer <hawk@...nel.org>
Cc: "Christoph Lameter (Ampere)" <cl@...ux.com>, Shakeel Butt <shakeel.butt@...ux.dev>, tj@...nel.org, 
	cgroups@...r.kernel.org, hannes@...xchg.org, lizefan.x@...edance.com, 
	longman@...hat.com, kernel-team@...udflare.com, linux-mm@...ck.org, 
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH V2] cgroup/rstat: Avoid thundering herd problem by kswapd
 across NUMA nodes

On Thu, Jun 27, 2024 at 2:21 AM Jesper Dangaard Brouer <hawk@...nel.org> wrote:
>
>
> On 27/06/2024 00.07, Yosry Ahmed wrote:
> > On Wed, Jun 26, 2024 at 2:35 PM Jesper Dangaard Brouer <hawk@...nel.org> wrote:
> >>
> >> On 26/06/2024 00.59, Yosry Ahmed wrote:
> >>> On Tue, Jun 25, 2024 at 3:35 PM Christoph Lameter (Ampere) <cl@...ux.com> wrote:
> >>>>
> >>>> On Tue, 25 Jun 2024, Yosry Ahmed wrote:
> >>>>
> [...]
> >>
> >> I implemented a variant using completions as Yosry asked for:
> >>
> >> [V3] https://lore.kernel.org/all/171943668946.1638606.1320095353103578332.stgit@firesoul/
> >
> > Thanks! I will take a look at this a little bit later. I am wondering
> > if you could verify if that solution fixes the problem with kswapd
> > flushing?
>
> I will deploy V3 on some production metals and report back in that thread.
>
> For this patch V2, I already have some results that show it solves the
> kswapd lock contention. Attaching grafana screenshot comparing two
> machines without/with this V2 patch. Green (16m1253) without patch, and
> Yellow line (16m1254) with patched kernel.  These machines have 12 NUMA
> nodes and thus 12 kswapd threads, and CPU time is summed for these threads.

Thanks for the data! I look forward to seeing whether v3 also fixes the
problem. I think it should, especially with the timeout, but let's see
:)

>
> Zooming in with perf record we can also see the lock contention is gone.
>   - sudo perf record -g -p $(pgrep -d, kswapd) -F 499 sleep 60
>   - sudo perf report --no-children --call-graph graph,0.01,callee --sort symbol
>
>
> On a machine (16m1254) with this V2 patch:
>
>   Samples: 7K of event 'cycles:P', Event count (approx.): 61228473670
>     Overhead  Symbol
>   +    8.28%  [k] mem_cgroup_css_rstat_flush
>   +    6.69%  [k] xfs_perag_get_tag
>   +    6.51%  [k] radix_tree_next_chunk
>   +    5.09%  [k] queued_spin_lock_slowpath
>   +    3.94%  [k] srso_alias_safe_ret
>   +    3.62%  [k] srso_alias_return_thunk
>   +    3.11%  [k] super_cache_count
>   +    2.96%  [k] mem_cgroup_iter
>   +    2.95%  [k] down_read_trylock
>   +    2.48%  [k] shrink_lruvec
>   +    2.12%  [k] isolate_lru_folios
>   +    1.76%  [k] dentry_lru_isolate
>   +    1.74%  [k] radix_tree_gang_lookup_tag
>
>
> On a machine (16m1253) without patch:
>
>   Samples: 65K of event 'cycles:P', Event count (approx.): 492125554022
>     Overhead  Symbol
>   +   55.84%  [k] queued_spin_lock_slowpath
>     - 55.80% queued_spin_lock_slowpath
>        + 53.10% __cgroup_rstat_lock
>        + 2.63% evict
>   +    7.06%  [k] mem_cgroup_css_rstat_flush
>   +    2.07%  [k] page_vma_mapped_walk
>   +    1.76%  [k] invalid_folio_referenced_vma
>   +    1.72%  [k] srso_alias_safe_ret
>   +    1.37%  [k] shrink_lruvec
>   +    1.23%  [k] srso_alias_return_thunk
>   +    1.17%  [k] down_read_trylock
>   +    0.98%  [k] perf_adjust_freq_unthr_context
>   +    0.97%  [k] super_cache_count
>
> I think this (clearly) shows that the patch works and eliminates kswapd
> lock contention.
>
> --Jesper
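
As a rough cross-check of the two perf reports quoted above, the overhead
percentages can be converted into approximate cycle counts for
queued_spin_lock_slowpath. The sketch below uses only the event counts and
percentages from those reports; the helper name is hypothetical. The two
profiles come from different machines and perf labels the event counts as
approximate, so the ratio is indicative only. The patched figure also
covers all spinlock slow-path time, not just the rstat lock.

```python
# Figures copied from the two perf reports quoted above:
# (approximate event count, queued_spin_lock_slowpath overhead in percent)
PATCHED = (61_228_473_670, 5.09)      # 16m1254, with the V2 patch
UNPATCHED = (492_125_554_022, 55.84)  # 16m1253, without the patch


def slowpath_cycles(event_count: int, overhead_pct: float) -> float:
    """Approximate cycles attributed to one symbol (hypothetical helper)."""
    return event_count * overhead_pct / 100.0


patched_cycles = slowpath_cycles(*PATCHED)
unpatched_cycles = slowpath_cycles(*UNPATCHED)
ratio = unpatched_cycles / patched_cycles

print(f"unpatched: {unpatched_cycles:.3e} cycles in queued_spin_lock_slowpath")
print(f"patched:   {patched_cycles:.3e} cycles in queued_spin_lock_slowpath")
print(f"ratio:     {ratio:.0f}x")
```

Under these assumptions, the unpatched machine spends close to two orders
of magnitude more sampled cycles in the spinlock slow path over the same
60-second recording.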
