[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <4FC62C33.7080509@redhat.com>
Date: Wed, 30 May 2012 17:18:27 +0300
From: Avi Kivity <avi@...hat.com>
To: Thomas Gleixner <tglx@...utronix.de>
CC: "H. Peter Anvin" <hpa@...or.com>,
"Michael S. Tsirkin" <mst@...hat.com>, kvm@...r.kernel.org,
Marcelo Tosatti <mtosatti@...hat.com>,
Ingo Molnar <mingo@...hat.com>, x86@...nel.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH] kvm: optimize ISR lookups
On 05/24/2012 01:00 AM, Thomas Gleixner wrote:
> Thought more about that.
>
> We have a clear distinction between HW accessed data and software
> accessed data.
>
> If I look at TPR then it is special cased already and it does:
>
> case APIC_TASKPRI:
> report_tpr_access(apic, false|true);
> /* fall thru */
>
> And the fall through is using the general accessor for all not special
> cased registers.
>
> So all you have to do is
>
> case APIC_TASKPRI:
> report_tpr_access(apic, false|true);
> + return access_mapped_reg(...);
>
> Instead of the fall through.
>
> So there is no synchronizing back and forth problem simply because you
> already have a special case for that register.
>
> I know you'll argue that the tpr reporting is a special hack for
> windows guests, at least that's what the changelog tells.
>
> But even if we have a few more registers accessed by hardware and if
> they do not require a special casing, I really doubt that the overhead
> of special casing those few regs will be worse than not having the
> obvious optimization in place.
>
> And looking deeper it's a total non issue. The apic mapping is 4k. The
> register stride is strictly 0x10. That makes a total of 256 possible
> registers.
>
> So now you have two possibilites:
>
> 1) Create a 256 bit == 64byte bitfield to select the one or the other
> representation.
>
> The overhead of checking the bit is not going to be observable.
>
> 2) Create a 256 function pointer array == 2k resp. 1k (64 / 32bit)
>
> That's not a lot of memory even if you have to maintain two
> separate variants for read and write, but it allows you to get rid
> of the already horribly compiled switch case in apic_read/write and
> you'll get the optional stuff like report_tpr_access() w/o extra
> conditionals just for free.
>
> An extra goodie is that you can catch any access to a non existing
> register which you now just silently ignore. And that allows you
> to react on any future hardware oddities without adding a single
> runtime conditional.
>
> This is stricly x86 and x86 is way better at dealing with indirect
> calls than with the mess gcc creates compiling those switch case
> constructs.
>
> So I'd go for that and rather spend the memory and the time in
> setting up the function pointers on init/ioctl than dealing with
> the inconsistency of HW/SW representation with magic hacks.
>
I like the bitmap version, it seems very lightweight. But by itself it
doesn't allow us to use bitmap_weight (or the other bitmap accessors),
unless you assume beforehand that those registers will never be in the
hardware-layout region.
(you also need extra code for copying the APIC state to and from
userspace; right now we just memcpy the APIC page)
--
error compiling committee.c: too many arguments to function
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists