Message-ID: <ZWoaCtcrciXpTEPH@slm.duckdns.org>
Date: Fri, 1 Dec 2023 07:38:18 -1000
From: Tejun Heo <tj@...nel.org>
To: Waiman Long <longman@...hat.com>
Cc: Zefan Li <lizefan.x@...edance.com>,
Johannes Weiner <hannes@...xchg.org>, cgroups@...r.kernel.org,
linux-kernel@...r.kernel.org, Joe Mario <jmario@...hat.com>,
Sebastian Jug <sejug@...hat.com>,
Yosry Ahmed <yosryahmed@...gle.com>
Subject: Re: [PATCH-cgroup v5 2/2] cgroup: Avoid false cacheline sharing of
read mostly rstat_cpu

On Thu, Nov 30, 2023 at 03:43:27PM -0500, Waiman Long wrote:
> The rstat_cpu and rstat_css_list fields of the cgroup structure are
> read-mostly variables. However, they may share a cacheline with the
> subsequent rstat_flush_next and *bstat fields, which can be updated
> frequently. That false sharing slows down cgroup_rstat_cpu(), which
> is called frequently in the rstat code. Add a CACHELINE_PADDING()
> line between them to avoid false cacheline sharing.
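>
> As an illustrative sketch of the resulting layout (the struct and
> field names below are simplified, not the exact struct cgroup
> definition in include/linux/cgroup-defs.h), the padding keeps the
> read-mostly fields on their own cacheline, away from the frequently
> written ones:
>
>	struct cgroup_like {
>		/* read-mostly: dereferenced on every cgroup_rstat_cpu() call */
>		struct cgroup_rstat_cpu __percpu *rstat_cpu;
>		struct list_head rstat_css_list;
>
>		/*
>		 * CACHELINE_PADDING() inserts an empty, cacheline-aligned
>		 * member, pushing the fields below onto a new cacheline so
>		 * that frequent writes to them do not invalidate the line
>		 * holding the read-mostly fields above.
>		 */
>		CACHELINE_PADDING(_pad_);
>
>		/* updated frequently during rstat flushes */
>		struct cgroup *rstat_flush_next;
>		struct cgroup_base_stat last_bstat;
>	};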
>
> A parallel kernel build on a 2-socket x86-64 server was used as the
> benchmark for measuring the lock hold time. Below are the lock hold
> time frequency distributions before and after the patch:
>
> Hold time    Before patch    After patch
> ---------    ------------    -----------
>  0-1  us        9,928,562      9,820,428
>  1-5  us          110,151         50,935
>  5-10 us              270             93
> 10-15 us              273            146
> 15-20 us              135             76
> 20-25 us                0              2
> 25-30 us                1              0
>
> It can be seen that the patch further pushes the lock hold time towards
> the lower end.
>
> Signed-off-by: Waiman Long <longman@...hat.com>
Applied to cgroup/for-6.8.

Thanks.
--
tejun