Message-ID: <hcwwxvl4bzyejjtdmrzwvwfyejzi2so2kke2b5yls3z2o67gou@67hxetrsr5ec>
Date: Wed, 13 Nov 2024 15:53:54 +0100
From: Mateusz Guzik <mjguzik@...il.com>
To: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
Cc: Suren Baghdasaryan <surenb@...gle.com>, akpm@...ux-foundation.org,
willy@...radead.org, liam.howlett@...cle.com, mhocko@...e.com, vbabka@...e.cz,
hannes@...xchg.org, oliver.sang@...el.com, mgorman@...hsingularity.net,
david@...hat.com, peterx@...hat.com, oleg@...hat.com, dave@...olabs.net,
paulmck@...nel.org, brauner@...nel.org, dhowells@...hat.com, hdanton@...a.com,
hughd@...gle.com, minchan@...gle.com, jannh@...gle.com, shakeel.butt@...ux.dev,
souravpanda@...gle.com, pasha.tatashin@...een.com, linux-mm@...ck.org,
linux-kernel@...r.kernel.org, kernel-team@...roid.com
Subject: Re: [PATCH v2 2/5] mm: move per-vma lock into vm_area_struct
On Wed, Nov 13, 2024 at 02:28:16PM +0000, Lorenzo Stoakes wrote:
> On Tue, Nov 12, 2024 at 11:46:32AM -0800, Suren Baghdasaryan wrote:
> > Back when per-vma locks were introduced, vm_lock was moved out of
> > vm_area_struct in [1] because of a performance regression caused by
> > false cacheline sharing. Recent investigation [2] revealed that the
> > regression is limited to the rather old Broadwell microarchitecture,
> > and even there it can be mitigated by disabling adjacent cacheline
> > prefetching, see [3].
>
> I don't see a motivating reason for why we want to do this. We increase
> memory usage here, which is not good; the later lock optimisation
> mitigates it, but why wouldn't we just do the lock optimisations and use
> less memory overall?
>
Where would you put the lock in that case, though?

With the patchset the lock stays within the affected vma, so there are no
false-sharing woes with other instances of the same struct.

If you make the locks separately allocated and packed, they false-share
between the different vmas using them (in fact this is what currently
happens). If you instead pad each lock to avoid that, that's 64 bytes per
object, the majority of which is empty space. A rough sketch of the three
layouts follows below.
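
Something like this (simplified and illustrative only, not the actual
kernel definitions; take the struct contents and cache names with a grain
of salt):

/* Option 1 (this patchset): lock embedded in the vma itself, so it
 * can only share cachelines with fields of its own vma. */
struct vm_area_struct {
	/* ... existing fields ... */
	struct vma_lock vm_lock;	/* no separate allocation */
};

/* Option 2 (roughly the current state): locks allocated from their
 * own slab and packed tightly, so locks of unrelated vmas can land
 * in the same cacheline and false-share. */
vma->vm_lock = kmem_cache_alloc(vma_lock_cachep, GFP_KERNEL);

/* Option 3: keep the separate allocation but pad each lock out to a
 * full cacheline. No false sharing, but 64 bytes per object, most
 * of it empty space. */
struct vma_lock_padded {
	struct vma_lock lock;
} ____cacheline_aligned_in_smp;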