Message-ID: <aa5499cd-7947-39a5-fc17-bd277be25764@yandex-team.ru>
Date: Sun, 24 Nov 2019 18:49:12 +0300
From: Konstantin Khlebnikov <khlebnikov@...dex-team.ru>
To: Alex Shi <alex.shi@...ux.alibaba.com>, cgroups@...r.kernel.org,
linux-kernel@...r.kernel.org, linux-mm@...ck.org,
akpm@...ux-foundation.org, mgorman@...hsingularity.net,
tj@...nel.org, hughd@...gle.com, daniel.m.jordan@...cle.com,
yang.shi@...ux.alibaba.com, willy@...radead.org,
shakeelb@...gle.com, hannes@...xchg.org
Subject: Re: [PATCH v4 0/9] per lruvec lru_lock for memcg
On 19/11/2019 15.23, Alex Shi wrote:
> Hi all,
>
> This patchset moves lru_lock into lruvec, giving each lruvec its own
> lru_lock, and thus one lru_lock per memcg per node.
>
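> A minimal sketch of the core change (the field and helper names here
> are illustrative, not necessarily the exact ones in the patches): the
> lock moves from pgdat into struct lruvec, so LRU operations take only
> the lock of the lruvec the page belongs to:
>
> 	struct lruvec {
> 		struct list_head	lists[NR_LRU_LISTS];
> 		/* per-lruvec lock, replacing pgdat->lru_lock */
> 		spinlock_t		lru_lock;
> 		/* ... existing fields ... */
> 	};
>
> 	/* lock the lru_lock of the lruvec this page belongs to */
> 	static struct lruvec *lock_page_lruvec_irq(struct page *page)
> 	{
> 		struct lruvec *lruvec;
>
> 		lruvec = mem_cgroup_page_lruvec(page, page_pgdat(page));
> 		spin_lock_irq(&lruvec->lru_lock);
> 		return lruvec;
> 	}
>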
> Following Daniel Jordan's suggestion, I ran 64 'dd' tasks in 32
> containers on my 2-socket * 8-core * HT box with the modified case:
> https://git.kernel.org/pub/scm/linux/kernel/git/wfg/vm-scalability.git/tree/case-lru-file-readtwice
>
> With the change above, this lru_lock-sensitive test improved by 17% in
> the multiple-containers scenario, with no performance loss without
> mem_cgroup.
Splitting lru_lock isn't the only option for solving this lock contention,
and it doesn't help if all of this happens within one cgroup.
I think better batching could solve more problems with less overhead:
larger per-cpu vectors, or queues for each NUMA node or even for each
lruvec. These would pre-sort and aggregate pages, so the actual
modification under lru_lock becomes much cheaper and more fine-grained.
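
Roughly like this (a sketch only; sort_batch_by_node() is a hypothetical
helper standing in for any pass that groups pages from the same node
together):

	/* drain a batch, taking each node's lru_lock once per run of pages */
	static void drain_sorted_batch(struct pagevec *pvec)
	{
		pg_data_t *locked = NULL;
		int i;

		/* hypothetical: order pages so same-node pages are adjacent */
		sort_batch_by_node(pvec);

		for (i = 0; i < pagevec_count(pvec); i++) {
			struct page *page = pvec->pages[i];
			pg_data_t *pgdat = page_pgdat(page);

			/* relock only when crossing a node boundary */
			if (pgdat != locked) {
				if (locked)
					spin_unlock_irq(&locked->lru_lock);
				spin_lock_irq(&pgdat->lru_lock);
				locked = pgdat;
			}
			add_page_to_lru_list(page,
					mem_cgroup_page_lruvec(page, pgdat),
					page_lru(page));
		}
		if (locked)
			spin_unlock_irq(&locked->lru_lock);
		pagevec_reinit(pvec);
	}

pagevec_lru_move_fn() already relocks when it crosses a node boundary;
sorting the batch first makes those runs longer, so each lock acquisition
is amortized over more pages.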
>
> Thanks to Hugh Dickins and Konstantin Khlebnikov, who both brought up
> the same idea 7 years ago. Considering my testing results and the fact
> that Google uses a similar approach internally, I now believe this
> feature clearly benefits multi-container users.
>
> So I'd like to introduce it here.
>
> Thanks for all the comments from Hugh Dickins, Konstantin Khlebnikov,
> Daniel Jordan, Johannes Weiner, Mel Gorman, Shakeel Butt, Rong Chen,
> Fengguang Wu, Yun Wang, and others.
>
> v4:
> a, fix the page->mem_cgroup dereferencing issue, thanks to Johannes Weiner
> b, remove the irqsave flags changes, thanks to Matthew Wilcox
> c, merge/split patches for easier understanding and bisection
>
> v3: rebase on linux-next, and fold the relock fix patch into the introducing patch
>
> v2: work around a performance regression and fix some functional issues
>
> v1: initial version; aim testing shows a 5% performance increase
>
>
> Alex Shi (9):
> mm/swap: fix uninitialized compiler warning
> mm/huge_memory: fix uninitialized compiler warning
> mm/lru: replace pgdat lru_lock with lruvec lock
> mm/mlock: only change the lru_lock iff page's lruvec is different
> mm/swap: only change the lru_lock iff page's lruvec is different
> mm/vmscan: only change the lru_lock iff page's lruvec is different
> mm/pgdat: remove pgdat lru_lock
> mm/lru: likely enhancement
> mm/lru: revise the comments of lru_lock
>
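> The three "only change the lru_lock iff page's lruvec is different"
> patches share roughly this relock pattern (a sketch; the helper name
> is illustrative):
>
> 	/* switch locks only when the page belongs to a different lruvec */
> 	static struct lruvec *relock_page_lruvec_irq(struct page *page,
> 						     struct lruvec *locked)
> 	{
> 		struct lruvec *lruvec;
>
> 		lruvec = mem_cgroup_page_lruvec(page, page_pgdat(page));
> 		if (likely(locked == lruvec))
> 			return lruvec;
>
> 		if (locked)
> 			spin_unlock_irq(&locked->lru_lock);
> 		spin_lock_irq(&lruvec->lru_lock);
> 		return lruvec;
> 	}
>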
> Documentation/admin-guide/cgroup-v1/memcg_test.rst | 15 +----
> Documentation/admin-guide/cgroup-v1/memory.rst | 6 +-
> Documentation/trace/events-kmem.rst | 2 +-
> Documentation/vm/unevictable-lru.rst | 22 +++----
> include/linux/memcontrol.h | 68 ++++++++++++++++++++
> include/linux/mm_types.h | 2 +-
> include/linux/mmzone.h | 5 +-
> mm/compaction.c | 67 +++++++++++++------
> mm/filemap.c | 4 +-
> mm/huge_memory.c | 17 ++---
> mm/memcontrol.c | 75 +++++++++++++++++-----
> mm/mlock.c | 27 ++++----
> mm/mmzone.c | 1 +
> mm/page_alloc.c | 1 -
> mm/page_idle.c | 5 +-
> mm/rmap.c | 2 +-
> mm/swap.c | 74 +++++++++------------
> mm/vmscan.c | 74 ++++++++++-----------
> 18 files changed, 287 insertions(+), 180 deletions(-)
>