Message-ID: <20180314084844.GP4043@hirez.programming.kicks-ass.net>
Date: Wed, 14 Mar 2018 09:48:44 +0100
From: Peter Zijlstra <peterz@...radead.org>
To: Laurent Dufour <ldufour@...ux.vnet.ibm.com>
Cc: paulmck@...ux.vnet.ibm.com, akpm@...ux-foundation.org,
kirill@...temov.name, ak@...ux.intel.com, mhocko@...nel.org,
dave@...olabs.net, jack@...e.cz,
Matthew Wilcox <willy@...radead.org>, benh@...nel.crashing.org,
mpe@...erman.id.au, paulus@...ba.org,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>, hpa@...or.com,
Will Deacon <will.deacon@....com>,
Sergey Senozhatsky <sergey.senozhatsky@...il.com>,
Andrea Arcangeli <aarcange@...hat.com>,
Alexei Starovoitov <alexei.starovoitov@...il.com>,
kemi.wang@...el.com, sergey.senozhatsky.work@...il.com,
Daniel Jordan <daniel.m.jordan@...cle.com>,
linux-kernel@...r.kernel.org, linux-mm@...ck.org,
haren@...ux.vnet.ibm.com, khandual@...ux.vnet.ibm.com,
npiggin@...il.com, bsingharora@...il.com,
Tim Chen <tim.c.chen@...ux.intel.com>,
linuxppc-dev@...ts.ozlabs.org, x86@...nel.org
Subject: Re: [PATCH v9 17/24] mm: Protect mm_rb tree with a rwlock

On Tue, Mar 13, 2018 at 06:59:47PM +0100, Laurent Dufour wrote:
> This change is inspired by Peter's proposal patch [1], which protected
> the VMA using SRCU. Unfortunately, SRCU does not scale well in that
> particular case, and it introduces major performance degradation due to
> excessive scheduling operations.

Do you happen to have a little more detail on that?

> diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
> index 34fde7111e88..28c763ea1036 100644
> --- a/include/linux/mm_types.h
> +++ b/include/linux/mm_types.h
> @@ -335,6 +335,7 @@ struct vm_area_struct {
>  	struct vm_userfaultfd_ctx vm_userfaultfd_ctx;
>  #ifdef CONFIG_SPECULATIVE_PAGE_FAULT
>  	seqcount_t vm_sequence;
> +	atomic_t vm_ref_count;		/* see vma_get(), vma_put() */
>  #endif
>  } __randomize_layout;
> 
> @@ -353,6 +354,9 @@ struct kioctx_table;
>  struct mm_struct {
>  	struct vm_area_struct *mmap;		/* list of VMAs */
>  	struct rb_root mm_rb;
> +#ifdef CONFIG_SPECULATIVE_PAGE_FAULT
> +	rwlock_t mm_rb_lock;
> +#endif
>  	u32 vmacache_seqnum;			/* per-thread vmacache */
>  #ifdef CONFIG_MMU
>  	unsigned long (*get_unmapped_area) (struct file *filp,
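
For anyone following along, the pattern those two hunks set up is roughly
the below. This is a sketch only; apart from the field names in the hunks
above and find_vma() itself, every name here (the helpers, the free path)
is my assumption, not the actual patch code:

/* Sketch: pin a VMA found under the new rwlock so it cannot be freed
 * while a speculative fault uses it. Helper names are assumed. */
static struct vm_area_struct *vma_get(struct mm_struct *mm,
				      unsigned long addr)
{
	struct vm_area_struct *vma;

	read_lock(&mm->mm_rb_lock);		/* shared-cacheline RMW */
	vma = find_vma(mm, addr);		/* rb-tree walk */
	if (vma)
		atomic_inc(&vma->vm_ref_count);	/* another shared RMW */
	read_unlock(&mm->mm_rb_lock);

	return vma;
}

static void vma_put(struct vm_area_struct *vma)
{
	if (atomic_dec_and_test(&vma->vm_ref_count))
		kmem_cache_free(vm_area_cachep, vma);	/* assumed free path */
}
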
When I tried this, it simply traded contention on mmap_sem for
contention on these two cachelines.

This was for the concurrent fault benchmark, where mmap_sem is only ever
acquired for reading (so no blocking ever happens), and the bottleneck
was really pure cacheline access.

Only by using RCU can you avoid that thrashing.
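
The difference being that read_lock() and atomic_inc() are both atomic
read-modify-writes to cachelines shared by every faulting CPU, while
rcu_read_lock() touches only the current task. The read side then looks
something like this, where find_vma_rcu() and do_speculative_fault() are
hypothetical names, not existing kernel API:

/* Sketch: the RCU read side writes no shared state at all. */
static int speculative_fault(struct mm_struct *mm, unsigned long addr,
			     unsigned int flags)
{
	struct vm_area_struct *vma;
	int ret = VM_FAULT_RETRY;

	rcu_read_lock();			/* per-task only, no shared write */
	vma = find_vma_rcu(mm, addr);		/* hypothetical lockless lookup */
	if (vma)
		ret = do_speculative_fault(vma, addr, flags); /* vm_sequence check */
	rcu_read_unlock();

	return ret;
}
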
Also note that if your database allocates one giant mapping, it'll be
_one_ VMA and that vm_ref_count gets _very_ hot indeed.
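
Something like the below is all it takes; from then on every fault, on
every CPU, hits the same vm_ref_count cacheline (illustrative userspace
snippet, sizes made up):

#include <sys/mman.h>

	/* One mapping for the whole working set == a single VMA. */
	void *base = mmap(NULL, 1UL << 40, PROT_READ | PROT_WRITE,
			  MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE,
			  -1, 0);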