Message-ID: <a6c48912-494b-793e-d741-db2d634588d5@redhat.com>
Date: Tue, 12 Sep 2017 21:55:56 +0200
From: Paolo Bonzini <pbonzini@...hat.com>
To: Peter Feiner <pfeiner@...gle.com>,
Jim Mattson <jmattson@...gle.com>
Cc: LKML <linux-kernel@...r.kernel.org>,
kvm list <kvm@...r.kernel.org>,
David Hildenbrand <david@...hat.com>
Subject: Re: [PATCH] KVM: MMU: speedup update_permission_bitmask

On 12/09/2017 18:48, Peter Feiner wrote:
>>>
>>> Because update_permission_bitmask is actually the top item in the profile
>>> for nested vmexits, this speeds up an L2->L1 vmexit by about ten thousand
>>> clock cycles, or up to 30%:
>
> This is a great improvement! Why not take it a step further and
> compute the whole table once at module init time and be done with it?
> There are only 5 extra input bits (nx, ept, smep, smap, wp),

4 actually, nx could be ignored (because unlike WP, the bit is reserved
when nx is disabled). It is only handled for clarity.

> so the
> whole table would only take up (1 << 5) * 16 = 512 bytes. Moreover, if
> you had 32 VMs on the host, you'd actually save memory!

Indeed; my thought was to write a script or something to generate the
tables at compile time, but doing it at module init time would be clever
and easier.
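
A minimal sketch of the "compute the whole table once at init time" idea
follows. It only illustrates the layout discussed above: 16 one-byte
entries per configuration, indexed by the 5 configuration bits, giving
(1 << 5) * 16 = 512 bytes. The names perm_table, perm_table_init,
perm_lookup, the CFG_* bit assignments and compute_entry are all
hypothetical; in particular compute_entry is a placeholder and does not
reproduce the real update_permission_bitmask logic.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical encoding of the five configuration inputs named in the
 * thread. The bit positions are an assumption made for this sketch. */
enum {
	CFG_WP   = 1u << 0,
	CFG_SMAP = 1u << 1,
	CFG_SMEP = 1u << 2,
	CFG_EPT  = 1u << 3,
	CFG_NX   = 1u << 4,
};

#define CFG_BITS     5
#define CFG_COUNT    (1u << CFG_BITS)
#define PERM_ENTRIES 16	/* one byte per entry, as in the size estimate */

/* One 16-byte permission array per configuration: 32 * 16 = 512 bytes. */
static uint8_t perm_table[CFG_COUNT][PERM_ENTRIES];

/* Placeholder for the per-entry computation. This is NOT the KVM logic;
 * it only gives each (cfg, idx) pair a deterministic value. */
static uint8_t compute_entry(unsigned int cfg, unsigned int idx)
{
	return (uint8_t)((cfg * 31u + idx * 7u) & 0xffu);
}

/* Fill every configuration once, e.g. from a module init hook. */
static void perm_table_init(void)
{
	for (unsigned int cfg = 0; cfg < CFG_COUNT; cfg++)
		for (unsigned int idx = 0; idx < PERM_ENTRIES; idx++)
			perm_table[cfg][idx] = compute_entry(cfg, idx);
}

/* At run time, selecting a configuration is a single index operation. */
static const uint8_t *perm_lookup(unsigned int cfg)
{
	assert(cfg < CFG_COUNT);
	return perm_table[cfg];
}
```

With this layout each MMU context would only need to keep the pointer
returned by perm_lookup() instead of its own copy of the 16 bytes. If nx
were dropped as an input, as suggested above, CFG_BITS would become 4 and
the table would shrink to (1 << 4) * 16 = 256 bytes.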

That said, the generated code for the function, right now, is pretty
good. If it saved 1000 clock cycles per nested vmexit it would be very
convincing, but if it were 50 or even 100 a bit less so.

Paolo