Message-ID: <CAHVum0cynwp5Phx=v2LV33Hsa8viq0jpVLh0Q_ZtpUZVy6Lm9w@mail.gmail.com>
Date: Mon, 28 Mar 2022 12:13:18 -0700
From: Vipin Sharma <vipinsh@...gle.com>
To: Paolo Bonzini <pbonzini@...hat.com>
Cc: David Matlack <dmatlack@...gle.com>,
Sean Christopherson <seanjc@...gle.com>,
Vitaly Kuznetsov <vkuznets@...hat.com>,
Wanpeng Li <wanpengli@...cent.com>,
Jim Mattson <jmattson@...gle.com>,
Joerg Roedel <joro@...tes.org>, kvm list <kvm@...r.kernel.org>,
LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] KVM: x86/mmu: Speed up slot_rmap_walk_next for sparsely
populated rmaps

Thank you, David and Paolo, for checking this patch carefully. In
hindsight, I should have explicitly called out the addition of
"noinline" in my patch email.
On Sun, Mar 27, 2022 at 3:41 AM Paolo Bonzini <pbonzini@...hat.com> wrote:
>
> On 3/26/22 01:31, Vipin Sharma wrote:
> >>> -static void slot_rmap_walk_next(struct slot_rmap_walk_iterator *iterator)
> >>> +static noinline void
> >>
> >> What is the reason to add noinline?
> >
> > My understanding is that since this function is called from
> > __always_inline functions, noinline will keep gcc from inlining
> > slot_rmap_walk_next into them and generate smaller code.
> >
>
> Iterators are written in such a way that it's way more beneficial to
> inline them. After inlining, compilers replace the aggregates (in this
> case, struct slot_rmap_walk_iterator) with one variable per field and
> that in turn enables a lot of optimizations, so the iterators should
> actually be always_inline if anything.
>
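That clarifies it, thanks. To check my understanding of the scalar
replacement you describe, here is a rough toy sketch (hypothetical
code and names, not the real slot_rmap_walk_iterator from mmu.c):

struct toy_iter {
        unsigned long gfn;
        unsigned long end_gfn;
};

static inline void toy_iter_next(struct toy_iter *it)
{
        it->gfn++;      /* advance the cursor */
}

static unsigned long toy_walk(unsigned long start, unsigned long end)
{
        struct toy_iter it = { .gfn = start, .end_gfn = end };
        unsigned long sum = 0;

        /*
         * With toy_iter_next() inlined, the compiler can break the
         * struct into one variable per field and keep gfn/end_gfn in
         * registers for the whole loop.  Marked noinline, "it" has
         * to live in memory so its address can be passed to the
         * out-of-line call.
         */
        while (it.gfn < it.end_gfn) {
                sum += it.gfn;
                toy_iter_next(&it);
        }
        return sum;
}
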
> For the same reason I'd guess the effect on the generated code should be
> small (next time please include the output of "size mmu.o"), but should
> still be there. I'll do a quick check of the generated code and apply
> the patch.
Yeah, I should have added the "size mmu.o" output. Here is what I found:
$ size arch/x86/kvm/mmu/mmu.o

Without noinline:
   text    data     bss     dec     hex filename
  89938   15793      72  105803   19d4b arch/x86/kvm/mmu/mmu.o

With noinline:
   text    data     bss     dec     hex filename
  90058   15793      72  105923   19dc3 arch/x86/kvm/mmu/mmu.o
With noinline, increase in size = 120 bytes.
Curiously, I also checked the file size with the "ls -l" command:

File size:
Without noinline: 1394272 bytes
With noinline:    1381216 bytes

With noinline, decrease in size = 13056 bytes.
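One way I can think of to chase this "ls -l" vs "size" discrepancy (a
guess on my part, I have not done this comparison yet) is to look at
per-section sizes, since the object file also carries debug info,
symbol tables and relocations that "size" does not count in
text/data/bss:

size -A arch/x86/kvm/mmu/mmu.o     # System V style, lists every section
readelf -S arch/x86/kvm/mmu/mmu.o  # full ELF section table
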
I also disassembled mmu.o via "objdump -d" and found the following.

Total lines in the generated assembly:
Without noinline: 23438
With noinline:    23393

With noinline, decrease in assembly = 45 lines.
I can see in the assembly that there are multiple "call" instructions
in the "with noinline" object file, which is expected, and it has
fewer lines of code than "without noinline". I am not sure why the
size command shows an increase in the text segment for "with
noinline", or what to infer from all of this data.
Thanks
Vipin