linux-kernel - Re: [mm/gup] 57efa1fe59: will-it-scale.per_thread

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20210604070411.GA8221@shbuild999.sh.intel.com>
Date:   Fri, 4 Jun 2021 15:04:11 +0800
From:   Feng Tang <feng.tang@...el.com>
To:     Linus Torvalds <torvalds@...ux-foundation.org>
Cc:     kernel test robot <oliver.sang@...el.com>,
        Jason Gunthorpe <jgg@...dia.com>,
        John Hubbard <jhubbard@...dia.com>, Jan Kara <jack@...e.cz>,
        Peter Xu <peterx@...hat.com>,
        Andrea Arcangeli <aarcange@...hat.com>,
        "Aneesh Kumar K.V" <aneesh.kumar@...ux.ibm.com>,
        Christoph Hellwig <hch@....de>,
        Hugh Dickins <hughd@...gle.com>, Jann Horn <jannh@...gle.com>,
        Kirill Shutemov <kirill@...temov.name>,
        Kirill Tkhai <ktkhai@...tuozzo.com>,
        Leon Romanovsky <leonro@...dia.com>,
        Michal Hocko <mhocko@...e.com>,
        Oleg Nesterov <oleg@...hat.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        LKML <linux-kernel@...r.kernel.org>, lkp@...ts.01.org,
        kernel test robot <lkp@...el.com>,
        "Huang, Ying" <ying.huang@...el.com>, zhengjun.xing@...el.com
Subject: Re: [mm/gup] 57efa1fe59: will-it-scale.per_thread_ops -9.2%
 regression

Hi Linus,

Sorry for the late response.

On Mon, May 24, 2021 at 05:11:37PM -1000, Linus Torvalds wrote:
> On Mon, May 24, 2021 at 5:00 PM kernel test robot <oliver.sang@...el.com> wrote:
> >
> > FYI, we noticed a -9.2% regression of will-it-scale.per_thread_ops due to commit:
> > commit: 57efa1fe5957694fa541c9062de0a127f0b9acb0 ("mm/gup: prevent gup_fast from racing with COW during fork")
> 
> Hmm. This looks like one of those "random fluctuations" things.
> 
> It would be good to hear if other test-cases also bisect to the same
> thing, but this report already says:
> 
> > In addition to that, the commit also has significant impact on the following tests:
> >
> > +------------------+---------------------------------------------------------------------------------+
> > | testcase: change | will-it-scale: will-it-scale.per_thread_ops 3.7% improvement                    |
> 
> which does kind of reinforce that "this benchmark gives unstable numbers".
> 
> The perf data doesn't even mention any of the GUP paths, and on the
> pure fork path the biggest impact would be:
> 
>  (a) maybe "struct mm_struct" changed in size or had a different cache layout

Yes, this seems to be the cause of the regression.

The test case is many thread are doing map/unmap at the same time,
so the process's rw_semaphore 'mmap_lock' is highly contended.

Before the patch (with 0day's kconfig), the mmap_lock is separated
into 2 cachelines, the 'count' is in one line, and the other members
sit in the next line, so it luckily avoid some cache bouncing. After
the patch, the 'mmap_lock' is pushed into one cacheline, which may
cause the regression.

Below is the pahole info:

- before the patch

	spinlock_t         page_table_lock;      /*   116     4 */
	struct rw_semaphore mmap_lock;           /*   120    40 */
	/* --- cacheline 2 boundary (128 bytes) was 32 bytes ago --- */
	struct list_head   mmlist;               /*   160    16 */
	long unsigned int  hiwater_rss;          /*   176     8 */

- after the patch

	spinlock_t         page_table_lock;      /*   124     4 */
	/* --- cacheline 2 boundary (128 bytes) --- */
	struct rw_semaphore mmap_lock;           /*   128    40 */
	struct list_head   mmlist;               /*   168    16 */
	long unsigned int  hiwater_rss;          /*   184     8 */

perf c2c log can also confirm this.

Thanks,
Feng

>  (b) two added (nonatomic) increment operations in the fork path due
> to the seqcount
> 
> and I'm not seeing what would cause that 9% change. Obviously cache
> placement has done it before.
> 
> If somebody else sees something that I'm missing, please holler. But
> I'll ignore this as "noise" otherwise.
> 
>             Linus