Message-ID: <BY5PR12MB3764CB9BBC42426B67537563B3349@BY5PR12MB3764.namprd12.prod.outlook.com>
Date: Fri, 11 Jun 2021 18:30:25 +0000
From: Krishna Reddy <vdumpa@...dia.com>
To: Will Deacon <will@...nel.org>, Ashish Mhetre <amhetre@...dia.com>
CC: "joro@...tes.org" <joro@...tes.org>,
"robin.murphy@....com" <robin.murphy@....com>,
"iommu@...ts.linux-foundation.org" <iommu@...ts.linux-foundation.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"linux-arm-kernel@...ts.infradead.org"
<linux-arm-kernel@...ts.infradead.org>
Subject: RE: [PATCH 1/2] iommu: Fix race condition during default domain
allocation
> > + mutex_lock(&group->mutex);
> > iommu_alloc_default_domain(group, dev);
> > + mutex_unlock(&group->mutex);
>
> It feels wrong to serialise this for everybody just to cater for systems with
> aliasing SIDs between devices.
Serialization is limited to devices in the same group. Unless devices share a SID, they wouldn't be in the same group.
> Can you provide some more information about exactly what the h/w
> configuration is, and the callstack which exhibits the race, please?
The failure is an after-effect and shows up as a page fault. I don't have a failure call stack here. Ashish traced it through print messages and can provide them.
From the print messages, the following was observed in the page fault case:
Device1: iommu_probe_device() --> iommu_alloc_default_domain() --> iommu_group_alloc_default_domain() --> __iommu_attach_device(group->default_domain)
Device2: iommu_probe_device() --> iommu_alloc_default_domain() --> iommu_group_alloc_default_domain() --> __iommu_attach_device(group->default_domain)
Both devices (with the same SID) enter iommu_group_alloc_default_domain(), and each ends up attached to a different group->default_domain,
because the second one overwrites group->default_domain after the first has attached to the domain it created.
The SMMU is set up to use the first domain for the context page table, whereas all DMA map/unmap requests from the second device
are performed on a domain that the SMMU does not use for context translations. IOVA accesses from the second device (not mapped in the first domain) then lead to page faults.
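To make the interleaving concrete, here is a rough single-threaded user-space model that replays the sequence above step by step. It is only a sketch, not kernel code: struct domain, struct group, struct device_state and the check()/alloc()/attach()/replay() helpers are hypothetical stand-ins, and the "first attach sets the SMMU context" rule is an assumption taken from the description above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; not the kernel's iommu_domain / iommu_group. */
struct domain { int id; };

struct group {
    struct domain *default_domain; /* models group->default_domain */
    struct domain *smmu_context;   /* assumption: first attached domain is
                                      the one the SMMU translates with */
};

struct device_state {
    struct domain own;       /* domain this device would allocate */
    bool saw_no_domain;      /* result of the "default domain present?" check */
    struct domain *attached; /* domain used for this device's map/unmap */
};

/* Step 1: the check made before allocating a default domain. */
static void check(struct group *g, struct device_state *d)
{
    d->saw_no_domain = (g->default_domain == NULL);
}

/* Step 2: allocate and publish a domain if the earlier check found none. */
static void alloc(struct group *g, struct device_state *d)
{
    if (d->saw_no_domain)
        g->default_domain = &d->own;
}

/* Step 3: attach to whatever group->default_domain points to right now. */
static void attach(struct group *g, struct device_state *d)
{
    d->attached = g->default_domain;
    if (g->smmu_context == NULL)
        g->smmu_context = g->default_domain;
}

/*
 * Replay two probes of devices that share a SID. With serialized == true,
 * each device's check+alloc pair runs as one unit, as it would under
 * group->mutex; otherwise both checks happen before either allocation.
 * Returns true when both devices end up on the domain the SMMU uses.
 */
static bool replay(bool serialized)
{
    struct group g = { NULL, NULL };
    struct device_state d1 = { { 1 }, false, NULL };
    struct device_state d2 = { { 2 }, false, NULL };

    if (serialized) {
        check(&g, &d1); alloc(&g, &d1);
        check(&g, &d2); alloc(&g, &d2); /* sees d1's domain, allocates nothing */
        attach(&g, &d1);
        attach(&g, &d2);
    } else {
        check(&g, &d1);
        check(&g, &d2);                 /* both see no default domain */
        alloc(&g, &d1); attach(&g, &d1);
        alloc(&g, &d2); attach(&g, &d2); /* overwrites group->default_domain */
    }
    return d1.attached == g.smmu_context && d2.attached == g.smmu_context;
}
```

In this model replay(false) returns false, because the second device maps into a domain the SMMU context does not use, while replay(true) returns true, because the second device finds the existing default domain and attaches to it.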
-KR