Message-ID: <56E841CC.4090806@amd.com>
Date: Wed, 16 Mar 2016 00:09:32 +0700
From: Suravee Suthikulpanit <Suravee.Suthikulpanit@....com>
To: Paolo Bonzini <pbonzini@...hat.com>, <rkrcmar@...hat.com>,
<joro@...tes.org>, <bp@...en8.de>, <gleb@...nel.org>,
<alex.williamson@...hat.com>
CC: <kvm@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
<wei@...hat.com>, <sherry.hurwitz@....com>
Subject: Re: [PART1 RFC v2 05/10] KVM: x86: Detect and Initialize AVIC support
Hi,
On 03/07/2016 11:41 PM, Paolo Bonzini wrote:
> On 04/03/2016 21:46, Suravee Suthikulpanit wrote:
> > [....]
>> +/* Note: This structure is per VM */
>> +struct svm_vm_data {
>> + atomic_t count;
>> + u32 ldr_mode;
>> + u32 avic_max_vcpu_id;
>> + u32 avic_tag;
>> +
>> + struct page *avic_log_ait_page;
>> + struct page *avic_phy_ait_page;
>
> You can put these directly in kvm_arch. Do not use abbreviations:
>
> struct page *avic_logical_apic_id_table_page;
> struct page *avic_physical_apic_id_table_page;
>
Actually, the reason I would like to introduce this separate
processor-specific structure is that I find it easier to manage these
processor-specific variables and data structures this way. If we add all
of them directly to kvm_arch, which is shared between SVM and VMX, it
becomes more difficult to tell which fields are used by which code base.
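To illustrate the layout I have in mind, here is a minimal userspace sketch (not the patch code). It assumes an opaque arch_data pointer in the shared structure, as in the kvm->arch.arch_data access in the patch. All model_* names are hypothetical and only stand in for the real kernel types.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the shared kvm_arch: it carries only an
 * opaque pointer, so SVM-only fields stay out of the common struct. */
struct model_kvm_arch {
	void *arch_data;
};

/* Hypothetical stand-in for the per-VM SVM data in the patch. */
struct model_svm_vm_data {
	unsigned int ldr_mode;
	unsigned int avic_max_vcpu_id;
	unsigned int avic_tag;
};

/* Allocate zeroed vendor data and hang it off the shared struct.
 * Returns 0 on success, -1 on allocation failure. */
int model_svm_vm_init(struct model_kvm_arch *arch)
{
	assert(arch != NULL);
	arch->arch_data = calloc(1, sizeof(struct model_svm_vm_data));
	return arch->arch_data ? 0 : -1;
}

/* Single accessor: only SVM code would call this, which keeps it
 * obvious which fields belong to which vendor. */
struct model_svm_vm_data *model_to_svm_vm_data(struct model_kvm_arch *arch)
{
	return arch->arch_data;
}

/* Free the vendor data and clear the pointer. */
void model_svm_vm_destroy(struct model_kvm_arch *arch)
{
	free(arch->arch_data);
	arch->arch_data = NULL;
}
```

The trade-off is one extra pointer dereference and an allocation per VM, in exchange for keeping the shared structure free of vendor-specific members.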
>> [...]
>> + memcpy(vapic_bkpg, svm->in_kernel_lapic_regs, PAGE_SIZE);
>> + svm->vcpu.arch.apic->regs = vapic_bkpg;
>
> Can you explain the flipping logic, and why you cannot just use the
> existing apic.regs?
Please see "explanation 1" below.
>> [...]
>> +static struct svm_avic_phy_ait_entry *
>> +avic_get_phy_ait_entry(struct kvm_vcpu *vcpu, int index)
>> +{
>> + [.....]
>> +}
>> +
>> +struct svm_avic_log_ait_entry *
>> +avic_get_log_ait_entry(struct kvm_vcpu *vcpu, u8 mda, bool is_flat)
>> +{
>> + [.....]
>> +}
>
> Instead of these functions, create a complete function to handle APIC_ID
> and APIC_LDR writes. Then use kmap/kunmap instead of page_address.
>
OK. May I ask why we are against using page_address? I have seen it
used in several places in the code.
>> [...]
>> +static int avic_alloc_bk_page(struct vcpu_svm *svm, int id)
>> +{
>> + int ret = 0, i;
>> + bool realloc = false;
>> + struct kvm_vcpu *vcpu;
>> + struct kvm *kvm = svm->vcpu.kvm;
>> + struct svm_vm_data *vm_data = kvm->arch.arch_data;
>> +
>> + mutex_lock(&kvm->slots_lock);
>> +
>> + /* Check if we have already allocated vAPIC backing
>> + * page for this vCPU. If not, we need to realloc
>> + * a new one and re-assign all other vCPU.
>> + */
>> + if (kvm->arch.apic_access_page_done &&
>> + (id > vm_data->avic_max_vcpu_id)) {
>> + kvm_for_each_vcpu(i, vcpu, kvm)
>> + avic_unalloc_bk_page(vcpu);
>> +
>> + __x86_set_memory_region(kvm, APIC_ACCESS_PAGE_PRIVATE_MEMSLOT,
>> + 0, 0);
>> + realloc = true;
>> + vm_data->avic_max_vcpu_id = 0;
>> + }
>> +
>> + /*
>> + * We are allocating vAPIC backing page
>> + * upto the max vCPU ID
>> + */
>> + if (id >= vm_data->avic_max_vcpu_id) {
>> + ret = __x86_set_memory_region(kvm,
>> + APIC_ACCESS_PAGE_PRIVATE_MEMSLOT,
>> + APIC_DEFAULT_PHYS_BASE,
>> + PAGE_SIZE * (id + 1));
>
> Why is this necessary? The APIC access page is a peculiarity of Intel
> processors (and the special memslot for only needs to map 0xfee00000 to
> 0xfee00fff; after that there is the MSI area).
>
Please see "explanation 1" below.
>> [...]
>> + if (ret)
>> + goto out;
>> +
>> + vm_data->avic_max_vcpu_id = id;
>> + }
>> +
>> + /* Reinit vAPIC backing page for exisinting vcpus */
>> + if (realloc)
>> + kvm_for_each_vcpu(i, vcpu, kvm)
>> + avic_init_bk_page(vcpu);
>
> Why is this necessary?
Explanation 1:
The current lapic regs page is allocated using get_zeroed_page(), which
can be paged out. If I use these pages as AVIC backing pages, it seems
to slow the VM down quite a bit due to a lot of page faults.
Currently, the AVIC backing pages are acquired from
__x86_set_memory_region() with APIC_ACCESS_PAGE_PRIVATE_MEMSLOT, which
maps the pages at address 0xfee00000 and above for the VM to use. I
mostly borrowed this from the VMX implementation in
alloc_apic_access_page().
However, the memslot size must be specified when calling
__x86_set_memory_region(), and I can't seem to figure out where I can
get the number of vcpus at the time the VM is created. Therefore, I have
to track vcpu creation and re-acquire a larger memslot every time
vcpu_create() is called.
I am not sure if this is the right approach. Any suggestions for this
part would be appreciated.
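To make the growth logic easier to follow, here is a simplified userspace model of what avic_alloc_bk_page() does on each vcpu creation. This is only a sketch: the model_* names and MODEL_PAGE_SIZE are hypothetical, the memslot is reduced to a size field, and the locking and per-vcpu re-initialization are left out.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_PAGE_SIZE 4096UL

/* Hypothetical per-VM state, loosely mirroring the fields in the patch. */
struct model_vm {
	int slot_present;          /* plays the role of apic_access_page_done */
	unsigned int max_vcpu_id;  /* plays the role of avic_max_vcpu_id */
	size_t slot_size;          /* size of the modeled private memslot */
	unsigned int reallocs;     /* how often the slot was torn down */
};

/* Model one vcpu creation. Returns 1 if the existing slot had to be
 * dropped and re-created with a larger size, 0 otherwise. */
int model_vcpu_create(struct model_vm *vm, unsigned int id)
{
	int realloc = 0;

	assert(vm != NULL);

	/* A slot exists but is too small for this vcpu id: drop it. */
	if (vm->slot_present && id > vm->max_vcpu_id) {
		vm->slot_size = 0;
		vm->max_vcpu_id = 0;
		vm->reallocs++;
		realloc = 1;
	}

	/* (Re)create the slot so it covers one page per id in 0..id. */
	if (id >= vm->max_vcpu_id) {
		vm->slot_size = MODEL_PAGE_SIZE * (id + 1);
		vm->max_vcpu_id = id;
		vm->slot_present = 1;
	}

	return realloc;
}
```

In this model, creating vcpus with ids 0, 1, 3, 2 in that order drops and re-creates the slot twice (for ids 1 and 3). If the maximum number of vcpus were known at VM creation time, a single allocation would be enough.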
Thanks,
Suravee