Date:   Fri, 16 Jun 2023 16:38:05 -0400
From:   Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
To:     Andrew Morton <akpm@...ux-foundation.org>
Cc:     Peter Zijlstra <peterz@...radead.org>,
        linux-kernel@...r.kernel.org,
        kernel test robot <yujie.liu@...el.com>,
        Aaron Lu <aaron.lu@...el.com>,
        Olivier Dion <odion@...icios.com>, michael.christie@...cle.com,
        Feng Tang <feng.tang@...el.com>,
        John Hubbard <jhubbard@...dia.com>,
        Jason Gunthorpe <jgg@...dia.com>, Peter Xu <peterx@...hat.com>,
        linux-mm@...ck.org
Subject: Re: [PATCH] mm: Move mm_count into its own cache line

On 6/16/23 16:16, Andrew Morton wrote:
> On Mon, 15 May 2023 10:35:36 -0400 Mathieu Desnoyers <mathieu.desnoyers@...icios.com> wrote:
> 
>> The mm_struct mm_count field is frequently updated by mmgrab/mmdrop
>> performed by context switch. This causes false-sharing for surrounding
>> mm_struct fields which are read-mostly.
>>
>> This has been observed on a 2-socket/112-core/224-CPU Intel Sapphire
>> Rapids server running hackbench, and by the kernel test robot's
>> will-it-scale testcase.
>>
>> Move the mm_count field into its own cache line to prevent false-sharing
>> with other mm_struct fields.
>>
>> Move mm_count to the first field of mm_struct to minimize the amount of
>> padding required: rather than adding padding before and after the
>> mm_count field, padding is only added after mm_count.
>>
>> Note that I noticed this odd comment in mm_struct:
>>
>> commit 2e3025434a6b ("mm: relocate 'write_protect_seq' in struct mm_struct")
>>
>>                  /*
>>                   * With some kernel config, the current mmap_lock's offset
>>                   * inside 'mm_struct' is at 0x120, which is very optimal, as
>>                   * its two hot fields 'count' and 'owner' sit in 2 different
>>                   * cachelines,  and when mmap_lock is highly contended, both
>>                   * of the 2 fields will be accessed frequently, current layout
>>                   * will help to reduce cache bouncing.
>>                   *
>>                   * So please be careful with adding new fields before
>>                   * mmap_lock, which can easily push the 2 fields into one
>>                   * cacheline.
>>                   */
>>                  struct rw_semaphore mmap_lock;
>>
>> This comment is rather odd for a few reasons:
>>
>> - It requires anyone adding or removing mm_struct fields to carefully
>>    consider the field alignment of _other_ fields,
>> - It expresses the wish to keep an "optimal" alignment for a specific
>>    kernel config.
>>
>> I suspect that the author of this comment may want to revisit this topic
>> and perhaps introduce a split-struct approach for struct rw_semaphore,
>> if the need is to place various fields of this structure in different
>> cache lines.
>>
>> ...
>>
>> --- a/include/linux/mm_types.h
>> +++ b/include/linux/mm_types.h
>> @@ -583,6 +583,21 @@ struct mm_cid {
>>   struct kioctx_table;
>>   struct mm_struct {
>>   	struct {
>> +		/*
>> +		 * Fields which are often written to are placed in a separate
>> +		 * cache line.
>> +		 */
>> +		struct {
>> +			/**
>> +			 * @mm_count: The number of references to &struct
>> +			 * mm_struct (@mm_users count as 1).
>> +			 *
>> +			 * Use mmgrab()/mmdrop() to modify. When this drops to
>> +			 * 0, the &struct mm_struct is freed.
>> +			 */
>> +			atomic_t mm_count;
>> +		} ____cacheline_aligned_in_smp;
>> +
> 
> Why add the anonymous struct?
> 
> 	atomic_t mm_count ____cacheline_aligned_in_smp;
> 
> would suffice?

The anonymous struct is needed to ensure that mm_count is alone in its 
own cache line.

An aligned attribute applied to an integer field only aligns that 
field's offset within the structure; it does not change the field's 
size. An aligned attribute applied to a structure aligns its offset 
within the containing structure _and_ extends its size with trailing 
padding.

That trailing padding is what adds padding _after_ mm_count. Alignment 
on a structure requires that the structure's size be extended with 
padding up to a multiple of the required alignment; otherwise, the 
elements of an array of that structure could not all start at the 
required alignment.
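
As a quick userspace illustration (nothing kernel-specific here), the 
following shows that an aligned attribute on a structure pads its 
sizeof up to the alignment, so arrays of it stay correctly aligned:

#include <stdio.h>

/* A 4-byte payload carried in a struct aligned to 32 bytes. */
struct padded {
        int a;
} __attribute__((aligned(32)));

int main(void)
{
        struct padded arr[2];

        /* Prints "align=32 size=32": the int is followed by 28
         * bytes of trailing padding. */
        printf("align=%zu size=%zu\n",
                        _Alignof(struct padded), sizeof(struct padded));
        /* Prints "delta=32": consecutive array elements are one
         * full 32-byte block apart. */
        printf("delta=%zu\n",
                        (size_t)((char *)&arr[1] - (char *)&arr[0]));
        return 0;
}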

> 
> Secondly, the ____cacheline_aligned_in_smp doesn't actually do
> anything?  mm_count is at offset 0 which is cacheline aligned anyway.
> The next field (mm_mt) will share a cacheline with mm_count.

If applied to the integer field, I agree that it would not do 
anything. Applied to the anonymous structure, however, it takes care 
of adding padding _after_ the mm_count field, which is exactly what we 
want here, as the sketch below shows.
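
Here is a minimal userspace sketch of the pattern (assuming a 64-byte 
cache line; on SMP kernels, ____cacheline_aligned_in_smp expands to 
__attribute__((aligned(SMP_CACHE_BYTES)))):

#include <stddef.h>
#include <stdio.h>

#define CACHE_LINE 64   /* stand-in for SMP_CACHE_BYTES */

struct fake_mm {
        /* Hot, frequently-written field, alone in its cache line. */
        struct {
                int mm_count;
        } __attribute__((aligned(CACHE_LINE)));
        /* Stand-in for the next, read-mostly field (e.g. mm_mt). */
        long next_field;
};

int main(void)
{
        /* Prints "mm_count at 0, next field at 64": the padded
         * anonymous struct pushes the next field one full cache
         * line away, so it no longer false-shares with mm_count. */
        printf("mm_count at %zu, next field at %zu\n",
                        offsetof(struct fake_mm, mm_count),
                        offsetof(struct fake_mm, next_field));
        return 0;
}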

> 
> If the plan is to put mm_count in "its own" cacheline then padding will
> be needed?

It's taken care of by the anonymous structure trick. Here is a quick 
example showing the difference between an alignment attribute applied 
to an integer field vs. to an anonymous structure:

#include <stdio.h>

struct s1 {
         int a __attribute__((aligned(32)));
         int b;
};

struct s2 {
         int a;
         int b __attribute__((aligned(32)));
};

struct s3 {
         struct {
                 int a;
         } __attribute__((aligned(32)));
         int b;
};

int main(void)
{
         struct s1 t1;
         struct s2 t2;
         struct s3 t3;

         /* Print the offset of each field from the start of its struct. */
         printf("%lu %lu\n", (unsigned long)&t1.a - (unsigned long)&t1,
                         (unsigned long)&t1.b - (unsigned long)&t1);
         printf("%lu %lu\n", (unsigned long)&t2.a - (unsigned long)&t2,
                         (unsigned long)&t2.b - (unsigned long)&t2);
         printf("%lu %lu\n", (unsigned long)&t3.a - (unsigned long)&t3,
                         (unsigned long)&t3.b - (unsigned long)&t3);
         return 0;
}

Result:

0 4
0 32
0 32
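
For completeness, sizeof shows the same effect (assuming the usual 
GCC/Clang layout on x86-64): sizeof(struct s1) is 32, because the 
field attribute raises the whole struct's alignment and pads its total 
size, yet b still shares the first 32-byte block with a, whereas 
sizeof(struct s3) is 64, because the padded anonymous struct pushes b 
out to offset 32.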

Applying the aligned attribute to the integer field would require 
explicit padding, because it does not add padding after the field.

Applying the aligned attribute to the _following_ field would work, 
but it's rather odd and error-prone.

Applying the aligned attribute to an anonymous structure clearly 
documents the intent, and adds the padding that guarantees this 
variable is alone in its cache line.

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com
