Message-ID: <9e3e1f9d238c01bdeacb165501483ab666a766cd@linux.dev>
Date: Mon, 21 Apr 2025 08:02:01 +0000
From: "Lance Yang" <lance.yang@...ux.dev>
To: "David Hildenbrand" <david@...hat.com>
Cc: mingzhe.yang@...com, willy@...radead.org, ziy@...dia.com,
mhocko@...e.com, vbabka@...e.cz, surenb@...gle.com, linux-mm@...ck.org,
jackmanb@...gle.com, hannes@...xchg.org, linux-kernel@...r.kernel.org,
akpm@...ux-foundation.org, "Lance Yang" <ioworker0@...il.com>
Subject: Re: [PATCH 1/1] mm/rmap: optimize MM-ID mapcount handling with union
April 21, 2025 at 3:40 PM, "David Hildenbrand" <david@...hat.com> wrote:
>
> >
> > >
> > > Are we sure the compiler cannot optimize that itself?
> > >
> > > On x86-64 I get with gcc 14.2.1:
> > >
> > > ; folio->_mm_id_mapcount[0] = -1;
> > > 3f2f: 48 c7 42 60 ff ff ff ff    movq $-0x1, 0x60(%rdx)
> > >
> > > Which should be a quadword (64bit) setting, so exactly what you want to achieve.
> > >
> >
> > Yeah, the compiler should be as smart as we expect it to be.
> >
> > However, it seems that gcc 4.8.5 doesn't behave as expected
> > with the -O2 optimization level on the x86-64 test machine.
> >
> > struct folio_array {
> >     int _mm_id_mapcount[2];
> > };
> >
> > void init_array(struct folio_array *f) {
> >     f->_mm_id_mapcount[0] = -1;
> >     f->_mm_id_mapcount[1] = -1;
> > }
> >
> > 0000000000000000 <init_array>:
> >    0: c7 07 ff ff ff ff       movl $0xffffffff,(%rdi)
> >    6: c7 47 04 ff ff ff ff    movl $0xffffffff,0x4(%rdi)
> >    d: c3                      retq
> >
> > ---
> >
> > struct folio_union {
> >     union {
> >         int _mm_id_mapcount[2];
> >         unsigned long _mm_id_mapcounts;
> >     };
> > };
> >
> > void init_union(struct folio_union *f) {
> >     f->_mm_id_mapcounts = -1UL;
> > }
> >
> > 0000000000000010 <init_union>:
> >   10: 48 c7 07 ff ff ff ff    movq $0xffffffffffffffff,(%rdi)
> >   17: c3                      retq
> >
> > Hmm... I'm not sure if it's valuable for those compilers that
> > are not very new.
> >
>
> Yeah, we shouldn't care about performance with rusty old compilers, especially if the gain would likely not even be measurable.
Ah, nice to know that ;)
>
> Note that even Linux has required at least GCC 5.1 since 2021. GCC seems to implement this optimization starting with 7.1 (at least when playing with Compiler Explorer).
Thanks for the details. Let’s just drop it - no measurable gain.
Thanks,
Lance
>
> -- Cheers,
>
> David / dhildenb
>
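For anyone who wants to reproduce the comparison from this thread, here is a minimal sketch that puts both variants in one file. The structs and the two init functions are the ones quoted above; inits_match() is a hypothetical helper added only for illustration, and the whole thing assumes an LP64 target such as x86-64 Linux, where unsigned long is 64 bits wide.

```c
#include <assert.h>
#include <string.h>

struct folio_array {
    int _mm_id_mapcount[2];
};

struct folio_union {
    union {
        int _mm_id_mapcount[2];
        unsigned long _mm_id_mapcounts;
    };
};

/* Two 32-bit stores; newer GCC may merge them into one 64-bit store. */
void init_array(struct folio_array *f)
{
    f->_mm_id_mapcount[0] = -1;
    f->_mm_id_mapcount[1] = -1;
}

/* One explicit 64-bit store through the union member. */
void init_union(struct folio_union *f)
{
    f->_mm_id_mapcounts = -1UL;
}

/*
 * Hypothetical helper (not from the patch): returns 1 if both
 * initializers leave identical bytes in memory, 0 otherwise.
 */
int inits_match(void)
{
    struct folio_array a;
    struct folio_union u;

    /* Assumption: LP64, so unsigned long covers both ints exactly. */
    assert(sizeof(unsigned long) == 2 * sizeof(int));

    memset(&a, 0, sizeof(a));
    memset(&u, 0, sizeof(u));
    init_array(&a);
    init_union(&u);

    return memcmp(&a, &u._mm_id_mapcount, sizeof(a)) == 0;
}
```

Building this with "gcc -O2 -c" and running "objdump -d" on the object file should show whether a given compiler version merges the two stores in init_array() on its own. The helper only confirms that the resulting memory contents are the same either way.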