Message-ID: <0dd5029f-d464-4c59-aac9-4b3e9d0a3438@lucifer.local>
Date: Thu, 30 Oct 2025 10:04:31 +0000
From: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
To: Jason Gunthorpe <jgg@...pe.ca>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
        Muchun Song <muchun.song@...ux.dev>,
        Oscar Salvador <osalvador@...e.de>,
        David Hildenbrand <david@...hat.com>,
        "Liam R . Howlett" <Liam.Howlett@...cle.com>,
        Vlastimil Babka <vbabka@...e.cz>, Mike Rapoport <rppt@...nel.org>,
        Suren Baghdasaryan <surenb@...gle.com>, Michal Hocko <mhocko@...e.com>,
        Axel Rasmussen <axelrasmussen@...gle.com>,
        Yuanchu Xie <yuanchu@...gle.com>, Wei Xu <weixugc@...gle.com>,
        Peter Xu <peterx@...hat.com>, Ingo Molnar <mingo@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Juri Lelli <juri.lelli@...hat.com>,
        Vincent Guittot <vincent.guittot@...aro.org>,
        Dietmar Eggemann <dietmar.eggemann@....com>,
        Steven Rostedt <rostedt@...dmis.org>, Ben Segall <bsegall@...gle.com>,
        Mel Gorman <mgorman@...e.de>, Valentin Schneider <vschneid@...hat.com>,
        Kees Cook <kees@...nel.org>, Matthew Wilcox <willy@...radead.org>,
        John Hubbard <jhubbard@...dia.com>, Leon Romanovsky <leon@...nel.org>,
        Zi Yan <ziy@...dia.com>, Baolin Wang <baolin.wang@...ux.alibaba.com>,
        Nico Pache <npache@...hat.com>, Ryan Roberts <ryan.roberts@....com>,
        Dev Jain <dev.jain@....com>, Barry Song <baohua@...nel.org>,
        Lance Yang <lance.yang@...ux.dev>, Xu Xin <xu.xin16@....com.cn>,
        Chengming Zhou <chengming.zhou@...ux.dev>,
        Jann Horn <jannh@...gle.com>, Matthew Brost <matthew.brost@...el.com>,
        Joshua Hahn <joshua.hahnjy@...il.com>, Rakie Kim <rakie.kim@...com>,
        Byungchul Park <byungchul@...com>, Gregory Price <gourry@...rry.net>,
        Ying Huang <ying.huang@...ux.alibaba.com>,
        Alistair Popple <apopple@...dia.com>, Pedro Falcato <pfalcato@...e.de>,
        Shakeel Butt <shakeel.butt@...ux.dev>,
        David Rientjes <rientjes@...gle.com>, Rik van Riel <riel@...riel.com>,
        Harry Yoo <harry.yoo@...cle.com>,
        Kemeng Shi <shikemeng@...weicloud.com>,
        Kairui Song <kasong@...cent.com>, Nhat Pham <nphamcs@...il.com>,
        Baoquan He <bhe@...hat.com>, Chris Li <chrisl@...nel.org>,
        Johannes Weiner <hannes@...xchg.org>,
        Qi Zheng <zhengqi.arch@...edance.com>, linux-kernel@...r.kernel.org,
        linux-fsdevel@...r.kernel.org, linux-mm@...ck.org
Subject: Re: [PATCH 4/4] mm: introduce and use VMA flag test helpers
On Wed, Oct 29, 2025 at 04:22:14PM -0300, Jason Gunthorpe wrote:
> On Wed, Oct 29, 2025 at 05:49:38PM +0000, Lorenzo Stoakes wrote:
> > We introduce vma_flags_test() and vma_test() (the latter operating on a
> > VMA, the former on a pointer to a vma_flags_t value).
> >
> > It's useful to have both, as many functions modify a local VMA flags
> > variable before setting the VMA flags to this value.
>
> Hmm, sure would be nice to not have this inconsistency though.
Yes!
>
> It is a bit wordy but with the C preprocessor we can make this work:
>
> struct vm_flags_t {DECLARE_BITMAP(..)};
>
> void func(..)
> {
>    struct vm_flags_t flags = OR_VMA_FLAGS(VMA_READ_BIT, VMA_WRITE_BIT);
>
>    flags = vm_flags_or(flags, OR_VMA_FLAGS(VMA_MAYREAD_BIT, VMA_MAYWRITE_BIT));
> }
>
> Where OR_VMA_FLAGS ORs together its __VA_ARGS__ and returns a struct vm_flags_t.
>
> Would that be interesting? Eliminate the inconsistency?
>
> eg
>
> https://stackoverflow.com/questions/77244843/c-macro-to-bitwise-or-together-a-variable-number-of-arguments-lightweight-solut
>
> Or other similar solutions.

Well, this would make things more succinct than doing, e.g.:

	vma_flags_set(&flags, VMA_READ_BIT);
	vma_flags_set(&flags, VMA_WRITE_BIT);

But the reason for this separation is more that we also need to do other
operations, like testing for bits against local flags.

It may also just be sensible to drop vma_test(), since I've named the VMA flags
field vma->flags, which is kinda neat and makes this not so painful to do:

	if (vma_flags_test(&vma->flags, VMA_READ_BIT)) {
	}
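
For illustration only, here is a minimal userspace sketch of why a helper that
takes a pointer to the flags value covers both cases (a local variable and
vma->flags). The bit numbers, NUM_VMA_FLAG_BITS, the struct layout and
vma_init_rw() are all assumptions made for this sketch and are not taken from
the series:

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical bit numbers and bitmap size, for illustration only. */
enum { VMA_READ_BIT, VMA_WRITE_BIT, VMA_EXEC_BIT };
#define NUM_VMA_FLAG_BITS 64

#define BITS_PER_LONG  (sizeof(unsigned long) * CHAR_BIT)
#define VMA_FLAG_WORDS ((NUM_VMA_FLAG_BITS + BITS_PER_LONG - 1) / BITS_PER_LONG)

/* Userspace stand-in for the opaque bitmap-backed flags type. */
typedef struct { unsigned long bits[VMA_FLAG_WORDS]; } vma_flags_t;

/* Stand-in for the real VMA; only the flags field matters here. */
struct vm_area_struct { vma_flags_t flags; };

static inline void vma_flags_set(vma_flags_t *flags, unsigned int bit)
{
	flags->bits[bit / BITS_PER_LONG] |= 1UL << (bit % BITS_PER_LONG);
}

static inline bool vma_flags_test(const vma_flags_t *flags, unsigned int bit)
{
	return flags->bits[bit / BITS_PER_LONG] & (1UL << (bit % BITS_PER_LONG));
}

/* Build the flags in a local variable first, then assign them to the VMA. */
static inline void vma_init_rw(struct vm_area_struct *vma)
{
	vma_flags_t flags = { { 0 } };

	vma_flags_set(&flags, VMA_READ_BIT);
	vma_flags_set(&flags, VMA_WRITE_BIT);
	vma->flags = flags;
}
```

The same vma_flags_test() then works on either &flags or &vma->flags, which is
what would make a separate vma_test() wrapper optional.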

Another note - I do hope to drop the _BIT suffix at some point. But it felt
egregious to do so _now_, since VM_READ and VMA_READ are so close it'd be
_super_ easy to mistake the two.

The sparse stuff will flag that up (no pun intended), but I didn't want to make
that kind of error _too_ easy to achieve.

Buuut I'm guessing you're actually thinking more of getting rid of
vm_flags_word_[and, any, all](), all of which take VM_xxx parameters.
>
> The compiler is pretty smart so this would all fold away to very
> few instructions.
>
Well, I'm not sure, hopefully. Maybe I need to test this and see exactly what
it comes up with.
I mean you could in theory have:
vma_flags_any(&vma->flags, OR_VMA_FLAGS(VMA_PFNMAP_BIT, VMA_SEALED_BIT))
Where OR_VMA_FLAGS() generates a vma_flags_t which can then be bitmap_or()'d.
It'd then have to be something like (say for 64 bit flags on a 32-bit system):
	unsigned long val[2] = {};

	__set_bit(VMA_PFNMAP_BIT, val);
	__set_bit(VMA_SEALED_BIT, val);
	/* ...assuming dst can be src also... */
	bitmap_or(val, ACCESS_PRIVATE(&vma->flags, __vma_flags), val,
		  2 * BITS_PER_LONG);
	return !bitmap_empty(val, 2 * BITS_PER_LONG);
And... I can't really see the compiler finding a way to make that efficient,
esp. given e.g. bitmap_or() -> __bitmap_or() for the non-small number-of-bits
case.
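
One way to check this empirically is a small userspace model that can be
compiled with optimization and inspected. The sketch below is an assumption
throughout: the bit numbers, the 128-bit size, the or_vma_flags_array() helper
and the compound-literal form of OR_VMA_FLAGS() are invented for illustration
and are not from the series. Note that it tests intersection word by word,
which is what an "any of these bits" check needs:

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bit numbers; VMA_SEALED_BIT is deliberately past the first word. */
enum { VMA_READ_BIT, VMA_WRITE_BIT, VMA_PFNMAP_BIT = 10, VMA_SEALED_BIT = 70 };
#define NUM_VMA_FLAG_BITS 128

#define BITS_PER_LONG  (sizeof(unsigned long) * CHAR_BIT)
#define VMA_FLAG_WORDS ((NUM_VMA_FLAG_BITS + BITS_PER_LONG - 1) / BITS_PER_LONG)

typedef struct { unsigned long bits[VMA_FLAG_WORDS]; } vma_flags_t;

/* Set every bit number listed in bits[0..n) and return the result by value. */
static inline vma_flags_t or_vma_flags_array(const unsigned int *bits, size_t n)
{
	vma_flags_t flags = { { 0 } };

	for (size_t i = 0; i < n; i++)
		flags.bits[bits[i] / BITS_PER_LONG] |= 1UL << (bits[i] % BITS_PER_LONG);
	return flags;
}

/* Pack __VA_ARGS__ into a compound-literal array and count its elements. */
#define OR_VMA_FLAGS(...)						\
	or_vma_flags_array((const unsigned int[]){ __VA_ARGS__ },	\
		sizeof((const unsigned int[]){ __VA_ARGS__ }) / sizeof(unsigned int))

/* True if any bit set in mask is also set in *flags. */
static inline bool vma_flags_any(const vma_flags_t *flags, vma_flags_t mask)
{
	for (size_t i = 0; i < VMA_FLAG_WORDS; i++)
		if (flags->bits[i] & mask.bits[i])
			return true;
	return false;
}

/* Non-static so the generated code is easy to find in the compiler output. */
bool vma_is_pfnmap_or_sealed(const vma_flags_t *flags)
{
	return vma_flags_any(flags, OR_VMA_FLAGS(VMA_PFNMAP_BIT, VMA_SEALED_BIT));
}
```

Looking at the code generated for vma_is_pfnmap_or_sealed() at -O2 would show
whether the mask construction folds to constants in this simplified form. It
says nothing about the kernel's out-of-line __bitmap_or() path, which is the
real concern above.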
I feel like we're going to need the 'special first word' stuff permanently for
performance reasons.
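
To make the 'special first word' idea concrete, here is a rough userspace
sketch. The helper names echo vm_flags_word_[and, any, all]() mentioned above,
but their signatures, the two-word layout and the VM_xxx values used here are
assumptions for illustration only:

```c
#include <stdbool.h>

/* Illustrative legacy-style masks; all of them live in the first word. */
#define VM_READ   0x001UL
#define VM_WRITE  0x002UL
#define VM_PFNMAP 0x400UL

/* Stand-in for the bitmap-backed flags type, fixed at two words here. */
typedef struct { unsigned long bits[2]; } vma_flags_t;

/* Return the first word masked by the given VM_xxx bits. */
static inline unsigned long vm_flags_word_and(const vma_flags_t *flags,
					      unsigned long mask)
{
	return flags->bits[0] & mask;
}

/* True if any of the VM_xxx bits in mask are set. */
static inline bool vm_flags_word_any(const vma_flags_t *flags, unsigned long mask)
{
	return vm_flags_word_and(flags, mask) != 0;
}

/* True only if every VM_xxx bit in mask is set. */
static inline bool vm_flags_word_all(const vma_flags_t *flags, unsigned long mask)
{
	return vm_flags_word_and(flags, mask) == mask;
}
```

Each test here is a single load and AND against a compile-time constant, with
no temporary bitmap to build, which is the property the word-based helpers are
meant to preserve.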

At the same time, I can definitely look into what the compiler actually
generates, or even look at improving the bitmap stuff if it's inefficient. But I
think that should be a future project :)

> Then everything only works with _BIT and we don't have the special
> first word situation.
In any case we still need to maintain the word stuff for legacy purposes at
least to handle the existing vm_flags_*() interfaces until the work is complete.
I think it's reasonable either way to treat this as iterative - if we can find
efficient ways to do this stuff with _BIT only then let's do that, but for now I
think it's reasonable to have the various compromises to make the initial
conversion easier.

Of course we need to try to get things as right as we can early on, but we also
don't want to get stuck in analysis paralysis.
>
> Jason
Cheers, Lorenzo