lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <3337df6c-f800-4610-8689-fbd4b4a5d07a@collabora.com>
Date: Fri, 20 Jun 2025 16:42:02 +0200
From: Benjamin Gaignard <benjamin.gaignard@...labora.com>
To: Jason Gunthorpe <jgg@...pe.ca>
Cc: joro@...tes.org, will@...nel.org, robin.murphy@....com, robh@...nel.org,
 krzk+dt@...nel.org, conor+dt@...nel.org, heiko@...ech.de,
 nicolas.dufresne@...labora.com, iommu@...ts.linux.dev,
 devicetree@...r.kernel.org, linux-kernel@...r.kernel.org,
 linux-arm-kernel@...ts.infradead.org, linux-rockchip@...ts.infradead.org,
 kernel@...labora.com
Subject: Re: [PATCH v3 3/5] iommu: Add verisilicon IOMMU driver


Le 20/06/2025 à 15:52, Benjamin Gaignard a écrit :
>
> Le 20/06/2025 à 14:05, Jason Gunthorpe a écrit :
>> On Fri, Jun 20, 2025 at 10:57:49AM +0200, Benjamin Gaignard wrote:
>>> Le 19/06/2025 à 18:59, Jason Gunthorpe a écrit :
>>>> On Thu, Jun 19, 2025 at 06:27:52PM +0200, Benjamin Gaignard wrote:
>>>>> Le 19/06/2025 à 15:47, Jason Gunthorpe a écrit :
>>>>>> Ugh. This ignores the gfp flags that are passed into map because you
>>>>>> have to force atomic due to the spinlock that shouldn't be there :(
>>>>>> This means it does not set GFP_KERNEL_ACCOUNT when required. It 
>>>>>> would
>>>>>> be better to continue to use the passed in GFP flags but override 
>>>>>> them
>>>>>> to atomic mode.
>>>>> I will add a gfp_t parameter and use it like that:
>>>>> page_table = iommu_alloc_pages_sz(gfp | GFP_ATOMIC | GFP_DMA32, 
>>>>> SPAGE_SIZE);
>>>> This will or together GFP_ATOMIC and GFP_KERNEL, I don't think that
>>>> works..
>>> I have test it, I don't see conflicts or errors. What worries you ?
>> Just looking at the bitmaps I'm not sure? Did you run with lockdep?
>
> Yes and it complains about that.
> I see that sun50i driver have more less the same struct than my driver
> but doesn't use lock. I will try see if I can remove the lock.

I have replace the two spinlock by a mutex in vsi_iommu structure.
It seems it works well and lockdep doesn't complain anymore.

>
>>>>>>> +static int vsi_iommu_attach_device(struct iommu_domain *domain,
>>>>>>> +                   struct device *dev)
>>>>>>> +{
>>>>>>> +    struct vsi_iommu *iommu = dev_iommu_priv_get(dev);
>>>>>>> +    struct vsi_iommu_domain *vsi_domain = to_vsi_domain(domain);
>>>>>>> +    unsigned long flags;
>>>>>>> +    int ret;
>>>>>>> +
>>>>>>> +    if (WARN_ON(!iommu))
>>>>>>> +        return -ENODEV;
>>>>>>> +
>>>>>>> +    /* iommu already attached */
>>>>>>> +    if (iommu->domain == domain)
>>>>>>> +        return 0;
>>>>>>> +
>>>>>>> +    ret = vsi_iommu_identity_attach(&vsi_identity_domain, dev);
>>>>>>> +    if (ret)
>>>>>>> +        return ret;
>>>>>> Hurm, this is actually quite bad, now that it is clear the HW is 
>>>>>> in an
>>>>>> identity mode it is actually a security problem for VFIO to 
>>>>>> switch the
>>>>>> translation to identity during attach_device. I'd really prefer new
>>>>>> drivers don't make this mistake.
>>>>>>
>>>>>> It seems the main thing motivating this is the fact a linked list 
>>>>>> has
>>>>>> only a single iommu->node so you can't attach the iommu to both the
>>>>>> new/old domain and atomically update the page table base.
>>>>>>
>>>>>> Is it possible for the HW to do a blocking behavior? That would 
>>>>>> be an
>>>>>> easy fix.. You should always be able to force this by allocating a
>>>>>> shared top page table level during probe time and making it entirely
>>>>>> empty while staying always in the paging mode. Maybe there is a less
>>>>>> expensive way.
>>>>>>
>>>>>> Otherwise you probably have work more like the other drivers and
>>>>>> allocate a struct for each attachment so you can have the iommu
>>>>>> attached two domains during the switch over and never drop to an
>>>>>> identity mode.
>>>>> I will remove the switch to identity domain and it will works fine.
>>>> You'll loose invalidations..
>>>>
>>>> Maybe the easiest thing is to squish vsi_iommu_enable() and reorganize
>>>> it so that the spinlock is held across the register programming and
>>>> then you can atomically under the lock change the registers and change
>>>> the linked list. The register write cannot fail so this is fine.
>>>>
>>>> This should probably also flush the iotlb inside the lock.
>>> I will try to summarize:
>>> in vsi_iommu_attach_device() I should:
>>> - take the lock
>>> - do nothing if the domain is the same.
>>> - if iommu is used (ie powered up):
>>>    - update the registers to enable the iommu
>>>    - flush
>>>    - update the link list
>>> - update iommu->domain
>>> - release the lock
>> That sounds believable, yes.. Though can you do the "powered up" checks
>> inside the spinlock - are they sleeping? Can they be done before the
>> spinlock?
>>
>>> in vsi_iommu_identity_attach() I should:
>>> - take the lock
>>> - do nothing if the domain is the same.
>>> - if iommu is used (ie powered up):
>>>    - disable the iommu
>>>    - remove the node from the list
>>> - update iommu->domain
>>> - release the lock
>> And maybe flush too? How does the caching hw work at this point? You
>> can't have stale entries in the cache when you switch back to an
>> enabled/translating configuration. So either the HW flushes implicitly
>> or you need to add a flush somewhere..
>
> I do not have the documentation for the hardware but it seems that
> enable/disable are enough to do the job.
>
>
>>> vsi_iommu_suspend() and vsi_iommu_resume() will also have to take 
>>> the lock
>>> before calling vsi_iommu_disable() and vsi_iommu_enable().
>> Yes, if they use iommu->domain that seems good
>>
>> If the above locking is a problem then I'd use the group mutex instead
>> during resume/suspend. The attach functions are already called with
>> the group mutex held.
>
> Does group mutex is also called when using vsi_iommu_map or 
> vsi_iommu_unmap ?
>
>>
>>> Do I have to switch to identity domain in vsi_iommu_attach_device()
>>> before applying the requested domain ?
>> No, that is what creates the security problem. What you want is to
>> update the HW registers in a way that the HW just changes hitlessly
>> from one translation to another, then flush the cache.
>>
>> Jason

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ