Message-ID: <CAJuCfpH_VZq99=vGQGJ+evVg5wMPKGsjyawgHnOeoKhtEiAi6w@mail.gmail.com>
Date: Wed, 11 Jan 2023 08:28:49 -0800
From: Suren Baghdasaryan <surenb@...gle.com>
To: David Laight <David.Laight@...lab.com>
Cc: Ingo Molnar <mingo@...nel.org>, Michal Hocko <mhocko@...e.com>,
"michel@...pinasse.org" <michel@...pinasse.org>,
"joelaf@...gle.com" <joelaf@...gle.com>,
"songliubraving@...com" <songliubraving@...com>,
"leewalsh@...gle.com" <leewalsh@...gle.com>,
"david@...hat.com" <david@...hat.com>,
"peterz@...radead.org" <peterz@...radead.org>,
"bigeasy@...utronix.de" <bigeasy@...utronix.de>,
"peterx@...hat.com" <peterx@...hat.com>,
"dhowells@...hat.com" <dhowells@...hat.com>,
"linux-mm@...ck.org" <linux-mm@...ck.org>,
"edumazet@...gle.com" <edumazet@...gle.com>,
"jglisse@...gle.com" <jglisse@...gle.com>,
"punit.agrawal@...edance.com" <punit.agrawal@...edance.com>,
"arjunroy@...gle.com" <arjunroy@...gle.com>,
"minchan@...gle.com" <minchan@...gle.com>,
"x86@...nel.org" <x86@...nel.org>,
"hughd@...gle.com" <hughd@...gle.com>,
"willy@...radead.org" <willy@...radead.org>,
"gurua@...gle.com" <gurua@...gle.com>,
"laurent.dufour@...ibm.com" <laurent.dufour@...ibm.com>,
"linux-arm-kernel@...ts.infradead.org"
<linux-arm-kernel@...ts.infradead.org>,
"rientjes@...gle.com" <rientjes@...gle.com>,
"axelrasmussen@...gle.com" <axelrasmussen@...gle.com>,
"kernel-team@...roid.com" <kernel-team@...roid.com>,
"soheil@...gle.com" <soheil@...gle.com>,
"paulmck@...nel.org" <paulmck@...nel.org>,
"jannh@...gle.com" <jannh@...gle.com>,
"liam.howlett@...cle.com" <liam.howlett@...cle.com>,
"shakeelb@...gle.com" <shakeelb@...gle.com>,
"luto@...nel.org" <luto@...nel.org>,
"gthelen@...gle.com" <gthelen@...gle.com>,
"ldufour@...ux.ibm.com" <ldufour@...ux.ibm.com>,
"vbabka@...e.cz" <vbabka@...e.cz>,
"posk@...gle.com" <posk@...gle.com>,
"lstoakes@...il.com" <lstoakes@...il.com>,
"peterjung1337@...il.com" <peterjung1337@...il.com>,
"linuxppc-dev@...ts.ozlabs.org" <linuxppc-dev@...ts.ozlabs.org>,
"kent.overstreet@...ux.dev" <kent.overstreet@...ux.dev>,
"hughlynch@...gle.com" <hughlynch@...gle.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"hannes@...xchg.org" <hannes@...xchg.org>,
"akpm@...ux-foundation.org" <akpm@...ux-foundation.org>,
"tatashin@...gle.com" <tatashin@...gle.com>
Subject: Re: [PATCH 08/41] mm: introduce CONFIG_PER_VMA_LOCK
On Wed, Jan 11, 2023 at 2:03 AM David Laight <David.Laight@...lab.com> wrote:
>
> From: Ingo Molnar
> > Sent: 11 January 2023 09:54
> >
> > * Michal Hocko <mhocko@...e.com> wrote:
> >
> > > On Tue 10-01-23 16:44:42, Suren Baghdasaryan wrote:
> > > > On Tue, Jan 10, 2023 at 4:39 PM Davidlohr Bueso <dave@...olabs.net> wrote:
> > > > >
> > > > > On Mon, 09 Jan 2023, Suren Baghdasaryan wrote:
> > > > >
> > > > > >This configuration variable will be used to build the support for VMA
> > > > > >locking during page fault handling.
> > > > > >
> > > > > >This is enabled by default on supported architectures with SMP and MMU
> > > > > >set.
> > > > > >
> > > > > >The architecture support is needed since the page fault handler is called
> > > > > >from the architecture's page faulting code which needs modifications to
> > > > > >handle faults under VMA lock.
> > > > >
> > > > > I don't think that per-vma locking should be something that is user-configurable.
> > > > > It should just be dependent on the arch. So maybe just remove CONFIG_PER_VMA_LOCK?
> > > >
> > > > Thanks for the suggestion! I would be happy to make that change if
> > > > there are no objections. I think the only pushback might have been the
> > > > vma size increase but with the latest optimization in the last patch
> > > > maybe that's less of an issue?
> > >
> > > Has vma size ever been a real problem? Sure there might be a lot of those
> > > but your patch increases it by rwsem (without the last patch) which is
> > > something like 40B on top of 136B vma so we are talking about 176B in
> > > total which even with wild mapcount limits shouldn't really be
> > > prohibitive. With a default map count limit we are talking about 2M
> > > increase at most (per address space).
> > >
> > > Or are you aware of any specific usecases where vma size is a real
> > > problem?
Well, when fixing the cacheline bouncing problem in the initial design
I was adding 44 bytes to the 152-byte vm_area_struct (CONFIG_NUMA
enabled), pushing it just above 192 bytes, while allocating these
structures from a cache-aligned slab (keeping the lock in a separate
cacheline to prevent cacheline bouncing). That would use a whole 256
bytes per VMA and it did make me nervous. The current design, with no
need to cache-align vm_area_structs and with the 44-byte overhead
trimmed down to 16 bytes, seems much more palatable.
> >
> > 40 bytes for the rwsem, plus the patch also adds a 32-bit sequence counter:
> >
> > + int vm_lock_seq;
> > + struct rw_semaphore lock;
> >
> > So it's +44 bytes.
Correct.
>
> Depending on whether vm_lock_seq lands in a padding hole or not,
> it will be 40 or 48 bytes.
>
> But if these structures are allocated individually (not in an array)
> then it depends on how many items kmalloc() fits into a page (or 2, 4).
Yep. Depends on how we arrange the fields.
Anyhow. Sounds like the overhead of the current design is small enough
to remove CONFIG_PER_VMA_LOCK and let it depend only on architecture
support?
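For reference, dropping the user-visible option could look roughly like this in Kconfig (a sketch only; ARCH_SUPPORTS_PER_VMA_LOCK is an assumed per-architecture opt-in symbol, and the exact dependency list would need review):

```kconfig
config ARCH_SUPPORTS_PER_VMA_LOCK
        def_bool n

config PER_VMA_LOCK
        def_bool y
        # no user prompt: enabled whenever the architecture opts in
        depends on ARCH_SUPPORTS_PER_VMA_LOCK && MMU && SMP
```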
Thanks,
Suren.
>
> David
>
> -
> Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
> Registration No: 1397386 (Wales)
>