[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <f56599a6-77ac-e1ef-4843-51167b1284b3@linux.intel.com>
Date: Fri, 19 Jul 2019 16:27:04 +0800
From: Lu Baolu <baolu.lu@...ux.intel.com>
To: Alex Williamson <alex.williamson@...hat.com>
Cc: baolu.lu@...ux.intel.com, Joerg Roedel <joro@...tes.org>,
David Woodhouse <dwmw2@...radead.org>, kevin.tian@...el.com,
ashok.raj@...el.com, linux-kernel@...r.kernel.org,
iommu@...ts.linux-foundation.org, cai@....pw,
jacob.jun.pan@...el.com
Subject: Re: [PATCH v2 7/7] iommu/vt-d: Consolidate domain_init() to avoid
duplication
Hi Alex,
On 7/19/19 7:16 AM, Alex Williamson wrote:
> On Wed, 12 Jun 2019 08:28:51 +0800
> Lu Baolu <baolu.lu@...ux.intel.com> wrote:
>
>> The domain_init() and md_domain_init() do almost the same job.
>> Consolidate them to avoid duplication.
>>
>> Signed-off-by: Lu Baolu <baolu.lu@...ux.intel.com>
>> ---
>> drivers/iommu/intel-iommu.c | 123 +++++++++++-------------------------
>> 1 file changed, 36 insertions(+), 87 deletions(-)
>>
>> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
>> index 5215dcd535a1..b8c6cf1d5f90 100644
>> --- a/drivers/iommu/intel-iommu.c
>> +++ b/drivers/iommu/intel-iommu.c
>> @@ -1825,63 +1825,6 @@ static inline int guestwidth_to_adjustwidth(int gaw)
>> return agaw;
>> }
>>
>> -static int domain_init(struct dmar_domain *domain, struct intel_iommu *iommu,
>> - int guest_width)
>> -{
>> - int adjust_width, agaw;
>> - unsigned long sagaw;
>> - int err;
>> -
>> - init_iova_domain(&domain->iovad, VTD_PAGE_SIZE, IOVA_START_PFN);
>> -
>> - err = init_iova_flush_queue(&domain->iovad,
>> - iommu_flush_iova, iova_entry_free);
>> - if (err)
>> - return err;
>> -
>> - domain_reserve_special_ranges(domain);
>> -
>> - /* calculate AGAW */
>> - if (guest_width > cap_mgaw(iommu->cap))
>> - guest_width = cap_mgaw(iommu->cap);
>> - domain->gaw = guest_width;
>> - adjust_width = guestwidth_to_adjustwidth(guest_width);
>> - agaw = width_to_agaw(adjust_width);
>> - sagaw = cap_sagaw(iommu->cap);
>> - if (!test_bit(agaw, &sagaw)) {
>> - /* hardware doesn't support it, choose a bigger one */
>> - pr_debug("Hardware doesn't support agaw %d\n", agaw);
>> - agaw = find_next_bit(&sagaw, 5, agaw);
>> - if (agaw >= 5)
>> - return -ENODEV;
>> - }
>> - domain->agaw = agaw;
>> -
>> - if (ecap_coherent(iommu->ecap))
>> - domain->iommu_coherency = 1;
>> - else
>> - domain->iommu_coherency = 0;
>> -
>> - if (ecap_sc_support(iommu->ecap))
>> - domain->iommu_snooping = 1;
>> - else
>> - domain->iommu_snooping = 0;
>> -
>> - if (intel_iommu_superpage)
>> - domain->iommu_superpage = fls(cap_super_page_val(iommu->cap));
>> - else
>> - domain->iommu_superpage = 0;
>> -
>> - domain->nid = iommu->node;
>> -
>> - /* always allocate the top pgd */
>> - domain->pgd = (struct dma_pte *)alloc_pgtable_page(domain->nid);
>> - if (!domain->pgd)
>> - return -ENOMEM;
>> - __iommu_flush_cache(iommu, domain->pgd, PAGE_SIZE);
>> - return 0;
>> -}
>> -
>> static void domain_exit(struct dmar_domain *domain)
>> {
>> struct page *freelist;
>> @@ -2563,6 +2506,31 @@ static int get_last_alias(struct pci_dev *pdev, u16 alias, void *opaque)
>> return 0;
>> }
>>
>> +static int domain_init(struct dmar_domain *domain, int guest_width)
>> +{
>> + int adjust_width;
>> +
>> + init_iova_domain(&domain->iovad, VTD_PAGE_SIZE, IOVA_START_PFN);
>> + domain_reserve_special_ranges(domain);
>> +
>> + /* calculate AGAW */
>> + domain->gaw = guest_width;
>> + adjust_width = guestwidth_to_adjustwidth(guest_width);
>> + domain->agaw = width_to_agaw(adjust_width);
>
>
> How do we justify that domain->agaw is nothing like it was previously
> here? I spent some more time working on the failure to boot that I
> thought was caused by 4ec066c7b147, but there are so many breakages and
> fixes in Joerg's x86/vt-d branch that I think my bisect zero'd in on
> the wrong one. Instead I cherry-picked every commit from Joerg's tree
> and matched Fixes patches to their original commit, which led me to
> this patch, mainline commit 123b2ffc376e. The issue I'm seeing is that
> we call domain_context_mapping_one() and we are in this section:
>
> struct dma_pte *pgd = domain->pgd;
> int agaw;
>
> context_set_domain_id(context, did);
>
> if (translation != CONTEXT_TT_PASS_THROUGH) {
> /*
> * Skip top levels of page tables for iommu which has
> * less agaw than default. Unnecessary for PT mode.
> */
> for (agaw = domain->agaw; agaw > iommu->agaw; agaw--) {
> ret = -ENOMEM;
> pgd = phys_to_virt(dma_pte_addr(pgd));
> if (!dma_pte_present(pgd))
> goto out_unlock;
> }
>
> Prior to this commit, we had domain->agaw=1 and iommu->agaw=1, so we
> don't enter the loop. With this commit, we have domain->agaw=3,
> iommu->agaw=1 with pgd->val=0!
>
iommu->agaw presents the level of page table which is by default
supported by the @iommu. domain->agaw presents the level of page
table which is used by the @domain.
agaw = 1: 3-level page table
agaw = 2: 4-level page table
agaw = 3: 5-level page table
The case here is that @iommu only supports 3-level page table, but the
@domain was set to use 5-level page table. So we must skip level 4 and 5
page tables of the @domain.
This code in the loop looks odd to me. It will always goto to unlock and
leave pgd->val==0. How about below change? (not tested yet!)
diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 412f18aba501..98d6878cd29d 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -2020,11 +2020,15 @@ static int domain_context_mapping_one(struct
dmar_domain *domain,
* Skip top levels of page tables for iommu
which has
* less agaw than default. Unnecessary for PT mode.
*/
- for (agaw = domain->agaw; agaw > iommu->agaw;
agaw--) {
- ret = -ENOMEM;
- pgd = phys_to_virt(dma_pte_addr(pgd));
- if (!dma_pte_present(pgd))
- goto out_unlock;
+ while (iommu->agaw < domain->agaw) {
+ struct dma_pte *pte;
+
+ pte = domain->pgd;
+ if (dma_pte_present(pte)) {
+ domain->pgd =
phys_to_virt(dma_pte_addr(pte));
+ free_pgtable_page(pte);
+ }
+ domain->agaw--;
}
info = iommu_support_dev_iotlb(domain, iommu,
bus, devfn);
> I don't really follow how the setting of these fields above is
> equivalent to what they were previously or if they're supposed to be
> updated lazily, but the current behavior is non-functional. Commit
> 123b2ffc376e can be reverted with only a bit of offset, which brings
> Linus' tree back into working operation for me. Should we revert or is
> there an obvious fix here? Thanks,
This commit is not the root cause of this issue as far as I can see, it
only triggers above loop to get entered.
Best regards,
Baolu
Powered by blists - more mailing lists