Message-ID: <c7c23d49-bd44-a78c-bb83-de665737a5f8@arm.com>
Date: Thu, 23 Jan 2020 14:55:37 +0000
From: Robin Murphy <robin.murphy@....com>
To: Lu Baolu <baolu.lu@...ux.intel.com>, Joerg Roedel <joro@...tes.org>
Cc: Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Bjorn Helgaas <bhelgaas@...gle.com>, ashok.raj@...el.com,
jacob.jun.pan@...el.com, kevin.tian@...el.com,
Christoph Hellwig <hch@....de>,
iommu@...ts.linux-foundation.org, linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH 3/4] iommu: Preallocate iommu group when probing
devices
On 22/01/2020 5:39 am, Lu Baolu wrote:
> Hi Robin,
>
> On 1/21/20 8:45 PM, Robin Murphy wrote:
>> On 19/01/2020 6:29 am, Lu Baolu wrote:
>>> Hi Joerg,
>>>
>>> On 1/17/20 6:21 PM, Joerg Roedel wrote:
>>>> On Wed, Jan 01, 2020 at 01:26:47PM +0800, Lu Baolu wrote:
>>>>> This splits iommu group allocation from adding devices. This makes
>>>>> it possible to determine the default domain type for each group once
>>>>> all devices belonging to the group are known.
>>>>
>>>> I think it's better to keep group allocation as it is and just defer
>>>> default domain allocation after each device is in its group. But take
>>>
>>> I tried deferring default domain allocation, but it doesn't seem possible.
>>>
>>> The call path of adding devices into their groups:
>>>
>>> iommu_probe_device
>>> -> ops->add_device(dev)
>>> -> (iommu vendor driver) iommu_group_get_for_dev(dev)
>>>
>>> After this returns, the vendor driver fetches the group's default
>>> domain and applies dma_ops according to the domain type. If we defer
>>> the domain allocation, the driver gets a NULL default domain and
>>> panics.
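>>>
>>> A minimal sketch of the problem (example_add_device() and
>>> example_set_dma_ops() are made-up stand-ins for a vendor driver's
>>> ops->add_device path, not real functions):
>>>
>>>     #include <linux/err.h>
>>>     #include <linux/iommu.h>
>>>
>>>     static int example_add_device(struct device *dev)
>>>     {
>>>         struct iommu_group *group = iommu_group_get_for_dev(dev);
>>>         struct iommu_domain *domain;
>>>
>>>         if (IS_ERR(group))
>>>             return PTR_ERR(group);
>>>
>>>         /* The driver expects the group's default domain to exist
>>>          * once iommu_group_get_for_dev() has returned... */
>>>         domain = iommu_get_domain_for_dev(dev);
>>>
>>>         /* ...so if allocation were deferred, domain would be NULL
>>>          * here and this dereference would oops. */
>>>         if (domain->type == IOMMU_DOMAIN_DMA)
>>>             example_set_dma_ops(dev);
>>>
>>>         iommu_group_put(group);
>>>         return 0;
>>>     }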
>>>
>>> Any suggestions?
>>
>> https://lore.kernel.org/linux-iommu/6dbbfc10-3247-744c-ae8d-443a336e0c50@linux.intel.com/
>>
>>
>> Haven't we been here before? ;)
>>
>> Since we can't (safely or reasonably) change a group's default domain
>> after ops->add_device() has returned, and in general it gets
>> impractical to evaluate "all devices in a group" once you look beyond
>> &pci_bus_type (or consider hotplug as mentioned), then AFAICS there's
>> no reasonable way to get away from the default domain type being
>> defined by the first device to attach.
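>>
>> Roughly what the allocation point looks like today (a condensed
>> paraphrase of iommu_group_get_for_dev(), not verbatim mainline):
>>
>>     if (!group->default_domain) {
>>         /* First device in the group: its arrival fixes the
>>          * default domain type for everyone who follows. */
>>         dom = __iommu_domain_alloc(dev->bus, iommu_def_domain_type);
>>         if (!dom && iommu_def_domain_type != IOMMU_DOMAIN_DMA)
>>             dom = __iommu_domain_alloc(dev->bus, IOMMU_DOMAIN_DMA);
>>         group->default_domain = dom;
>>     }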
>
> Yes, agreed.
>
>> But in practice it's hardly a problem anyway - if every device in a
>> given group requests the same domain type then it doesn't matter which
>> comes first, and if they don't then we ultimately end up with an
>> impossible set of constraints, so are doomed to do the 'wrong' thing
>> regardless.
>
> The third case is, for example, three devices A, B and C in a group.
> The first device, A, is neutral about which default domain type is
> used, so the iommu framework will use the static default. But device B
> requires a specific type which differs from that default. Currently
> this is handled in the vendor iommu driver, and one motivation of this
> patch set is to handle it in the generic layer.
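>
> A sketch of how the generic layer could resolve that once the whole
> group is known - the function names here are hypothetical, and the
> group iteration is modelled loosely on the iommu.c internals:
>
>     static int example_resolve_group_type(struct iommu_group *group)
>     {
>         struct group_device *gdev;
>         int type = 0;   /* 0 == no preference */
>
>         list_for_each_entry(gdev, &group->devices, list) {
>             int t = example_dev_def_domain_type(gdev->dev);
>
>             if (!t)
>                 continue;       /* neutral, like device A */
>             if (type && t != type)
>                 return -EINVAL; /* conflicting requirements */
>             type = t;           /* B's requirement wins */
>         }
>
>         return type ? type : iommu_def_domain_type;
>     }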
Yes, I wasn't explicitly considering that particular case, but it falls
out much the same way. Given that multi-device groups
*should* be relatively rare, for the user override it seems reasonable
to expect the user to see when devices get grouped and specify all of
them to achieve the desired result; the trusted/untrusted attribute
definitely shouldn't differ within any given group; and
opportunistically replacing passthrough domains with translation domains
for DMA-limited devices can only ever be a best-effort thing without
consistent results, since at best that still comes down to which driver
probed and called dma_set_mask() first.
Platform-specific exceptions like in device_def_domain_type() probably
do want to stay in the individual drivers, but rolling that up into
default domain allocation would be neat, and functionally no worse than
the existing process.
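
A hypothetical shape for that rollup - a per-device def_domain_type
callback on struct iommu_ops is an assumption here, not something that
exists today:

    /* Hypothetical addition to struct iommu_ops; returning 0 means
     * "no platform-specific requirement for this device". */
    int (*def_domain_type)(struct device *dev);

    /* ...consulted by generic default domain allocation: */
    static int example_pick_type(const struct iommu_ops *ops,
                                 struct device *dev)
    {
        int type = ops->def_domain_type ?
                   ops->def_domain_type(dev) : 0;

        return type ? type : iommu_def_domain_type;
    }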
In principle we could fairly easily delay allocating a group's default
domain until the first driver bind event. It wouldn't help universally -
in the absolute worst case, device B might only be created at all by
device A's driver probing - and it might need careful coordination in
areas like the bus->dma_configure() flow, but it could at least help
accommodate the more common PCI case.
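
One way the deferral could look - a sketch only, where hooking
BUS_NOTIFY_BIND_DRIVER and the example_alloc_default_domain() helper
are both assumptions:

    #include <linux/device.h>
    #include <linux/iommu.h>
    #include <linux/notifier.h>

    static int example_bind_notifier(struct notifier_block *nb,
                                     unsigned long action, void *data)
    {
        struct device *dev = data;
        struct iommu_group *group;

        if (action != BUS_NOTIFY_BIND_DRIVER)
            return 0;

        group = iommu_group_get(dev);
        if (!group)
            return 0;

        /* First driver bind in the group: every current member is
         * known by now, so the default domain type can be decided. */
        if (!group->default_domain)
            example_alloc_default_domain(group);

        iommu_group_put(group);
        return 0;
    }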
Robin.